William M. Briggs

Statistician to the Stars!


What Probably Isn’t: Heat Waves and Nine Feet Tall Men: Part I

Probability is screwy, and we statisticians do a horrible, rotten job of teaching it. The first thing students learn in normal statistics classes is about “measures of central tendency” or some such thing. The idea of what probability means and why anybody would have the slightest interest in “central tendency” is never broached. As a consequence, students leave statistics classes with a bunch of half-remembered formulas and no clear idea of what probability is.

This is unfortunate, because it allows educated men like Rolling Stone’s Bill McKibben to write the following:

June broke or tied 3,215 high-temperature records across the United States. That followed the warmest May on record for the Northern Hemisphere — the 327th consecutive month in which the temperature of the entire globe exceeded the 20th-century average, the odds of which occurring by simple chance were 3.7 x 10^-99, a number considerably larger than the number of stars in the universe.[see note at bottom of page]

Poor man! Poor readers! McKibben actually believes he has said something of interest; he has worked himself into a lather over these numbers and goes on to say things like “the seriousness of our predicament”. McKibben figures that such a small number can only mean that we are doomed—unless, of course, massive amounts of money are taken from this country’s citizens and given to its politicians to apply as they see fit.

Now over the last week I tried to explain, via two examples, just what probability is and what it isn’t, and why numbers like McKibben’s aren’t of the slightest interest. See this post about global warming and this one about nine feet tall men. And if you find yourself disagreeing with me, read this one about foundations. You must at least read the first two posts because I assume them below.

What Probability Is

Suppose I let the symbol Q stand for “There are no men taller than nine feet,” and the expression D = “I observe a man 8.979 feet tall.” Let’s take this equation, or as some readers prefer to say, expression:

     (1) Pr(D | Q)

and try to solve it.

Equation (1) is a matter of logic. It is just the same as Lewis Carroll’s French speaking cats: We know that if R = “All cats are creatures understanding French and some chickens are cats” that the proposition F = “Some chickens are creatures understanding French” is true; that is Pr(F | R) = 1. And this is so even if nobody ever, not ever never, in no possible world in no possible time, never never never measures or observes or sees or posits on genetical arguments any cats understanding French. It is true even if we learn tomorrow from God Himself that He has decreed that it is a logical and physical impossibility that any cat could understand French. F given R is true and that is that: and it is true because, again, logic only makes statements about the connections between propositions. Logic is mute on the propositions themselves.
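For readers who like to see such things checked by machine, here is a minimal sketch (mine, not Carroll's) that hunts for a counterexample to the French-speaking-cats argument. It builds every small imaginary world of creatures, each of which may or may not be a cat, a chicken, or a French speaker, and looks for one in which both premises hold and the conclusion fails. The function name and the size limit are hypothetical choices, and a search over small worlds illustrates validity rather than proving it for worlds of every size.

```python
from itertools import product

def syllogism_is_valid(max_size=4):
    """Search every small 'world' for a counterexample to the argument."""
    # Each creature is a triple of booleans: (is_cat, is_chicken, understands_french).
    kinds = list(product([False, True], repeat=3))
    for size in range(1, max_size + 1):
        for world in product(kinds, repeat=size):
            # Premise 1: all cats are creatures understanding French.
            all_cats_french = all(fr for cat, _, fr in world if cat)
            # Premise 2: some chickens are cats.
            some_chickens_cats = any(cat and ch for cat, ch, _ in world)
            # Conclusion: some chickens are creatures understanding French.
            some_chickens_french = any(ch and fr for _, ch, fr in world)
            # A counterexample has true premises and a false conclusion.
            if all_cats_french and some_chickens_cats and not some_chickens_french:
                return False
    return True

print(syllogism_is_valid())
```

Notice that the search never asks whether any real cat understands French. It only asks whether the conclusion can fail while the premises stand.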

All logic, which is to say all probability, because it is solely interested in the connection between expressions, must regard propositions as fixed. In any given equation, we cannot add or subtract from these expressions: we must leave them as they are: they are not to be touched: they are sacrosanct: they exist as they are and are carved out of uncuttable stone: we are forbidden upon pain of death to manipulate them in any way. For I testify unto every man that heareth the words of these theorems, If any man shall add unto these propositions, God shall add unto him the plagues that are written in Greenpeace press releases: And if any man shall take away from the words of these propositions, God shall take away his part out of the Book of Life. I am not sure how much more of a dire warning I can issue. Don’t touch Q or D!

Equation (1) says that assuming Q is true, assuming, that is, that there are no men taller than 9 feet, that it is true that there are no men taller than 9 feet, that it is impossible there are men taller than 9 feet, that God himself has willed that there are no men taller than 9 feet, that in any possible world there cannot be men taller than 9 feet, that it is just a fact, immovable, imperturbable, irrevocable that no man can be taller than 9 feet—even if we want one to be, even if we can imagine it to be so, even if real men are actually observed to be taller than 9 feet, even if you yourself are 9’1″—given, as I say, all that, what is the chance you see a man a quarter-inch short of 9 feet?

Well, on reading D to mean seeing a man shorter than 9 feet, (1) is certain, i.e. Pr(D|Q) = 1; or on reading D to mean seeing a man precisely 8.979 feet—the actual writing of D after all, and we know we should not touch D—the best we can say is 0 < Pr(D|Q) < 1 because we have no information on how heights are distributed; all we know is that heights are contingent, meaning it is not certain (given the information we have) that all men must be precisely 8.979 feet. And therefore all we can say is “I don’t know.”

We must judge equation (1) as written! Not as we imagine it to be written, or how it might be written differently if we change the meaning of Q and D. Or about how we feel about Q and D. How it is written and nothing else.

It’s kind of funny, but if we turn probability into math there wouldn’t be the slightest interest or confusion. Suppose instead Q = “X < 9” and D = “X = 8.979” where X is just some number unrelated to any physical real thing. Then Pr(D | Q) no longer seems mysterious. In this case it’s hard to see where to add bits about, “In my opinion, we might see X larger than 9” or “I would suspect that if X did equal 8.979 then X will be greater than 9.” Indeed, if anybody did announce the latter, you would regard him as eccentric. You’d say to him, “Listen, pal. These are just numbers. They don’t mean anything. And by assumption, no number can be greater than 9. So you are speaking out of your hat.”

Or change them again: Q = “Just half of all winged blue cats who understand French are taller than 9 feet” and D = “Observe a winged blue cat who understands French standing 8.979 feet”. Once again, we are not tempted to change Q and D and we interpret them as written.

Today’s lesson: don’t touch the propositions!

In Part II: McKibben’s Fantasy


If there were only 3.7 x 10^-99 stars in the universe, there would not even be 1 star. 3.7 x 10^-99 is of course less than 1.

Men Nine Feet Tall And Bayes Theorem

The OFloinn put up a most readable and recommended essay When is Weather Really “Climate”? and in one of the comments a reader named Gyan in part said:

Many economists and radical empiricists claim to reduce the whole of rationality to the Bayes’ Theorem. But John Derbyshire in his popular book on Riemann Hypothesis provides a curious counterexample.
Suppose you have a proposition that no man is more than nine feet tall. Then you find a man just a quarter inch short of nine feet.
Should your confidence in the proposition increase or not?
By Bayes’ it seems it should but common sense tells me that it should decrease.

I admit to flying somewhat blind here, because I don’t have Derbyshire’s book and can’t read his example; nevertheless, nothing ventured etc.

The economists and radical empiricists are partly right: it’s Bayes all the way, but only in the logical sense, i.e. the sense in which Bayes describes the probabilistic relationship between propositions, just as traditional logic only describes the logical relationship between propositions. About the origin of the propositions, and of fundamental truth, about which propositions are worthy of entertaining and which not, Bayes and logic are silent. In other words, radical empiricism is false, as is just-plain empiricism, and most of what economists say is best left unsaid. But of these things, another day. On to the example!

For ease of writing, let Q = “No man is more than nine feet tall,” and let D (for data) = “You find a man just a quarter inch short of nine feet.” These are two propositions and we can use Bayes, i.e. extended logic, to say something about their relationship. For instance, we can ask

     (1) Pr( D | Q )

or we can ask

     (2) Pr( Q | D ).

These probabilities are not the same, and are rarely the same for any two propositions; and unless you are clear about which you mean, you can easily mix them up.

Equation (1) is easily solvable. It says given that we know, or accept as true, that no man can be taller than nine feet, what is the probability of seeing a man less than nine feet, specifically a man a quarter inch shorter than nine feet. The answer is, in this interpretation, 1, or 100%. Of course it is! We have just said that it is a fact that no man can be taller; and here is a man who is indeed not taller.

This interpretation is not the same as F = “Any man a quarter inch shy of nine feet”. That would be

     (3) Pr( F | Q )

and to answer it fully would require we know more about the distribution of heights (F is about any old man; D is about a man). What we do know about heights is this: we know, via deduction, they are greater than zero feet, and, by assumption, they are less than nine feet. Therefore, the best we could say about (3) is that its probability is between 0 and 1. Now you might be tempted to say it is closer to 0 than to 1, but that is because you are implicitly adding information to Q, to the right-hand-side. That is, you might add information to Q about your experience with real heights of real men, experience which suggests a decreasing probability for very high heights. If you say (3) is closer to 0 than to 1, you are actually answering

     (4) Pr( F | Q & My experience about actual heights)

which you can see is not (3) and is therefore not an answer to (3).

Now turn the question around and answer (2): this is the chance that no man is taller than nine feet given we have seen one just shy of that number. The answer feels like it will be close to 0, but again that is because we are not strictly answering (2)—the strict answer to (2) is unknown, or perhaps just between 0 and 1 if we assume the contingent nature of these events. But what we really think we are answering is

     (5) Pr( Q | D & My experience about actual heights),

and that seems to make (5) close to 0. Let’s call E = “My experience about actual heights.”

What about Bayes’s theorem? Well, it’s easy to work out that (5) is equal to (via Bayes’s theorem):

     (5′) Pr( Q | D & E) = Pr( D | Q & E )Pr( Q | E )/Pr( D | E ).

This “updates” our belief in Q from Pr(Q | E) to Pr( Q | D & E) based on observing our “data” D. About the exact value of Pr(Q | E), I don’t know (here’s another point where we depart from economists and empiricists: Bayes does not claim all probabilities are quantifiable). As long as E doesn’t contain information contradictory to Q, such that Q is false given E, then we’re okay. In my mind, using my E, Pr(Q | E) is high, close to 1 (my E says I don’t know of any man taller than nine feet).

That leaves us Pr( D | Q & E ) and Pr(D | E) to figure out. We can attack Pr(D|E) directly or it turns out that Pr(D|E) = Pr(D|Q&E)Pr(Q|E) + Pr(D|not-Q&E)Pr(not-Q|E). The first part is just a repeat of the numerator, and “not-Q” means “it is false that no man is more than nine feet tall.” Let’s be lazy and answer Pr(D|E) directly: this is the probability of seeing a man 8′ 11.75″ given E. Pr(D|E) might be close to 0. But then so will Pr( D | Q & E ).

We already assumed Pr(Q|E) was “large”, so that if Pr( D | Q & E ) < Pr(D|E) then Pr(Q|D&E) < Pr(Q|E), i.e. our belief in Q shrinks after seeing D. But if Pr( D | Q & E ) > Pr(D|E) then Pr(Q|D&E) > Pr(Q|E) and our belief in Q increases after seeing D. Whether “Pr( D | Q & E ) < Pr(D|E)” or “Pr( D | Q & E ) > Pr(D|E)” is true depends entirely on E, which since it is so fuzzy makes this problem difficult and (sometimes) seemingly against intuition.
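A small numerical sketch may help here. Every number in it is a hypothetical placeholder; nothing above pins down Pr(Q | E) or either likelihood, and the function name is my own. It computes (5′) using the expansion of Pr(D | E) given earlier, once with an E that makes near-nine-footers likelier when taller men exist, and once with an E that makes them likelier when nine feet is a hard cap.

```python
def posterior_q(prior_q, like_d_given_q, like_d_given_not_q):
    """Return Pr(Q | D & E) from Pr(Q | E), Pr(D | Q & E) and Pr(D | not-Q & E)."""
    # Total probability: Pr(D|E) = Pr(D|Q&E)Pr(Q|E) + Pr(D|not-Q&E)Pr(not-Q|E)
    pr_d = like_d_given_q * prior_q + like_d_given_not_q * (1 - prior_q)
    # Bayes's theorem, equation (5') in the text.
    return like_d_given_q * prior_q / pr_d

prior = 0.95  # hypothetical Pr(Q | E): "high, close to 1"

# Case A (hypothetical E): D is likelier under not-Q, so Pr(D|Q&E) < Pr(D|E).
shrinks = posterior_q(prior, like_d_given_q=1e-6, like_d_given_not_q=1e-4)

# Case B (hypothetical E): D is likelier under Q, so Pr(D|Q&E) > Pr(D|E).
grows = posterior_q(prior, like_d_given_q=1e-4, like_d_given_not_q=1e-6)

print(f"prior {prior}, case A posterior {shrinks:.4f}, case B posterior {grows:.4f}")
```

The same theorem and the same D give opposite movements in our belief in Q. The only thing that changed was E.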

The Decline And Increase Of Mainstream Religions In The USA: With Pictures!

The Episcopals, like many other old guard protesting religions, are in trouble. Jay Akasie tells us that at this year’s annual conclave, the Episcopalian leadership had their most serious discussions about “whether to develop funeral rites for dogs and cats, and whether to ratify resolutions condemning genetically modified foods.” Also “the approval of transgender ordination.” All were approved, naturally.

They also issued a sort of “apology to Native Americans for exposing them to Christianity”, and began a chat about how, using “blunt modern language and with politically correct intent” to re-write the “church’s historic” (but now embarrassing) Book of Common Prayer. And, oh yes, they approved a rite to bless so-called same-sex unions. The only thing missing was a statement condemning global warming.

But that’s because, as Ross Douthat reminds us, they did that already in 2006, when the group said they “valued ‘the stewardship of the earth’ too highly to reproduce themselves.”

The good news is that Episcopalians who sleep in on Sunday are in no danger of missing the weekly sermon. All they need do is switch on NBC, ABC, CBS, MSNBC, CNN, or PBS or subscribe to the DNC newsletter and there it will be.

Evidently, this is what most Episcopalians, Methodists, Lutherans, and other WASPy religion members have done, because fewer and fewer of them are showing up in the pews. The following picture illustrates this; this is the per 1000 citizen church members (i.e. normalized by the USA population):

And here the same, unnormalized by population:

I only put up those religions with the most folks. All data is from the Association of Religion Data Archives; they’re about five years behind, and it’s not clear how accurate are the counts. In particular, if there’s one thing Protestants like to do it is protest; they’re forever splitting and splintering and creating new branches. There are five American branches of Anglicanism, for instance (with the most conservative actually growing). This fracturing is why the data for Methodists and Presbyterians is choppy. There are twenty active and eighteen inactive Presbyterian groups, with Methodists about the same. The US Census provided the population data.

All the mainline denominations are plummeting, but the Catholics are holding their own, the Assembly of God and other similar pentecostal denominations are increasing, and while the Baptists have begun declining, the rate is not as alarming as it is for the others (see Southern Baptist Baptisms at Lowest in Decades). Catholics have even been increasing of late; much of this comes from immigration from countries which have historically adopted a conquering white nation’s language which is not English, and in congregations which are traditionalist. Mormons (not shown) are also on the rise; there are about six million of them at present. And the Amish and Mennonites (also not shown) aren’t doing poorly.

In other words, those denominations which are roughly “conservative” are strengthening, while those which are roughly “liberal” or “progressive” are weakening. And it doesn’t take a keen eye to see when the trouble started. With your finger, draw a vertical line at 1960 or so on each of these plots, and then allow yourself a slight “Ah.”

To see how things might go in the future, examine these two pictures:

This is the number of churches: all are in decline, except the Assembly of God, which also saw the highest growth rate in members, and except the Southern Baptists, which now has buildings with fewer people in them. The Catholic and Episcopal rate of decrease may be termed strategic, but the Presbyterians and Methodists are in full retreat. Hold these figures in your mind while looking at this:

This is the number of clergy. The increase in Assembly of God makes sense: more churches and more members need more preachers. The Southern Baptists are filling more churches with more preachers, but those churches are just holding steady. That becomes the ten-year prediction for both these denominations: Assembly of God increasing apace, and the Baptists treading water.

The increase in clergy for Methodists and Presbyterians makes no sense—who is paying these people?—but at least these denominations are saving funds by closing buildings. But the increase makes least sense for the Episcopalians, who are losing members while trying to hold on to real estate and while hiring new clergy. Expensive business, that (plus see the original link: the church is spending money suing its own membership for splitting). The predictions are a gentle, gentlemanly decline for the Methodists and Presbyterians, who in ten years will still be with us, but whose suicides will be pleasant, sedate affairs. Look for more of their churches to be converted into lofts for hipsters.

The prediction for Episcopalians is more grim. A decade from now, there will be a million or fewer of them; and there is even a reasonable probability there will be none of them, at least in an official sense. The church could very well break down into separate churches, its various members being absorbed elsewhere.

The outlook for Catholics is sparkling. That bump you saw in clergy is no artifact (perhaps the exact number is; no warranty on the data, but evidence elsewhere suggests the priesthood is reviving). The increase in members will continue with immigration and because while some Catholics rebel when it comes to birth control, few do so when it comes to killing off inconvenient fetuses. Plus with the gradual return to more traditional services, and given what we have seen, membership will likely increase.

Check back with me ten years from now to see how well I did.

Certainty & Uncertainty: Logical Probability & Statistics

Since we have spent the weekend with these matters, I thought it appropriate to include the first part of the Introduction to the new book I’ve been working on. It is only slightly similar to the old book.

There are things we know with certainty. These things are true or false given some premises or evidence or just because upon reflection they are obviously true or false. There are many more things about which we are uncertain. These things too are more or less probable given certain premises and evidence. And there are still more things of which nobody can ever specify the uncertainty. These things are nonsensical or paradoxical.

The truth, falsity, or in-betweenness of any proposition can only be known with respect to stated evidence or premises. We know that given the premises “All men are mortal” and “Socrates is a man” that the proposition “Socrates is mortal” is true. Given other premises the same proposition may be true, false, or in-between, which is to say merely probable. Swapping the first premise with “No men are mortal” changes the truth of the final proposition to falsity. Exchange it with “Most men are mortal” and the truth or falsity of the final proposition can no longer be ascertained, though its probability can. The probability of “Socrates is mortal” given “All men are mortal and Socrates is a man” is 1; just as the probability of the same proposition given “No men are mortal, etc.” is 0. And the probability of the same proposition given “Most men, etc.” is greater than 0.5 but less than 1. This, incidentally, shows that probability is often an interval. This result only follows because we tacitly include a premise about the definition of the English word most; here its definition means “a majority but not all.”

This move is perfectly acceptable, even if unfamiliar. Consider you are supplying the argument with many tacit assumptions, such as definitions for all, men, and so forth, along with premises about how the words All men, etc. in sequence are turned into English with a definite meaning, about how men are discrete individuals, and so on. Thus if you quibble with my definition of most, you are free to substitute your own, as long as you make it clear just what definition you hold. As we shall see, debates about the probability of a proposition are really about the list and meanings of premises.

Change the first premise to “All men are moral” (notice the absence of t) and then nothing can be said about the proposition “Socrates is mortal.” A man’s morality has no bearing on his mortality (though it might affect his immortality). The proposition given these premises is clearly not true, and just as clearly not false. It also has no probability because there is no evidence in the premises that is probative to the proposition before us. The probability the proposition is true given these premises is undefined.

This also should not be strange. Consider any proposition you like, such as “Jack can lift 100 pounds.” What is the probability it is true? There is none because no premises have been supplied. Suppose I offer as a premise “2 < 4” and re-ask the question. Still no answer because this mathematical fact has no bearing on the proposition.

Or perhaps this tacit premise has suggested itself, “Jack can either lift 100 pounds or he cannot”; therefore, given this premise the probability the proposition is true is 0.5. This is false. It is a deducible truth of logic that adding any truth to a list of premises, or to a proposition, does not change the logical status of that proposition. So if we prefix “No men, etc.” with “T & No men, etc.”, where T is any truth, the proposition “S. is mortal” given these premises remains false. The premise “Jack can either lift 100 pounds or he cannot” is a truth; it is a tautology and tautologies are always true. It is just as true as “Jack either has cancer or he doesn’t.” That being so, this latter truth can be substituted with the first truth, where it is now obviously unrelated to the proposition.

One other tacit premise lurks: this is that the proposition about Jack’s muscle power suggests contingency. Contingent propositions are events which we know, via a multitude of paths, are not necessary truths or falsities. This “premise” is actually a host of premises which lead us to conclude that we cannot (it is impossible to) find a formal proof that makes the proposition about Jack a necessary truth or falsity (this sentence is the conclusion to that argument). But accepting this premise, or premises, does not buy us much, for given this premise, or premises, the probability of the proposition is greater than 0 and less than 1, and that is the best we can do. This merely says, in quantifiable terms, that the proposition is not a truth or falsity.

Logic, of which probability is a branch, is concerned only with the connections between premises and conclusions and not with the premises or conclusions themselves; and this is so whether we discuss their veracity or origin. Thus when we say, given “All men, etc.” the probability that “S. is mortal” is 1, we are not casting judgment on the premise “All men are mortal” nor are we concerned (at this point) where the “conclusion” “Socrates is mortal” arose. The premise “All men are mortal” indeed appears false—but only because we implicitly add premises such as Benjamin Franklin’s which encapsulate all human experience about man’s limited stay on the planet. Since these premises did not appear in the original argument, we are not free to put them there, at least not when considering the argument as it stands, or for “argument’s sake”, or when demonstrating logical principles. The proposition which is the conclusion was also supplied to us, and we must be ever careful to keep it and not exchange it for another; at least not without being clear that a modification is being made.

From this we can conclude that Dr. Dodgson was right when he wrote that given, “All cats are creatures understanding French” and “Some chickens are cats” that the proposition “Some chickens are creatures understanding French” is true—and deduced to be true at that. But then all probabilities, just as all statements of logic, are deduced. It makes not a whit of difference that given the premise “Nobody ever observed a cat understanding French” Dodgson’s first premise was false. What matters is that the argument as it stands leads to a valid conclusion, that we deduce the probability of its conclusion (relative to the premises) as 1. And thus from the premises “Half of all Martians wear hats” and “George is a Martian” we judge the probability of the proposition “George wears a hat” to be 0.5, and no other number. Note that the first premise contains the tacit premise that the number of Martians is divisible by 2, unless we allow the colloquialism that half means “about half.” If so, then the probability of the conclusion is about 0.5. The quantification cannot be made more precise than the language.
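The counting behind these numbers can be sketched in a few lines. This is an illustration under stated assumptions, not a general method: the population size n is a hypothetical value the premises do not supply, the helper name is mine, and I am reading “k of n have the property, and George is one of the n” as giving probability k/n. “Half” fixes k exactly; “most”, defined as a majority but not all, only confines k to a range, which is why the answer is an interval.

```python
from fractions import Fraction

def probability_bounds(n, allowed_counts):
    """Lowest and highest k/n over every count k the premise allows."""
    probs = [Fraction(k, n) for k in allowed_counts]
    return min(probs), max(probs)

n = 1000  # hypothetical population size; must be even for "half" to be exact

# "Half of all Martians wear hats": exactly n/2 of them do.
half = probability_bounds(n, [n // 2])

# "Most men are mortal", reading most as "a majority but not all".
most = probability_bounds(n, range(n // 2 + 1, n))

print("half:", half)
print("most:", most)
```

For any even n the “half” bounds coincide at 1/2, while the “most” bounds stay strictly between 1/2 and 1, creeping toward those ends as n grows.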

More On The 1 in 1.6 Million Heat Wave Chance

Yesterday we looked at NCDC’s claim that the 13-month stretch of “above-normal” temperatures had only a 1 in 1.6 million chance of occurring. Let’s today clarify the criticism.

The NCDC had a list of premises, or evidence, or assumptions, or some model which they assumed true. Given that model (call it the Simple Model), they deduced there was a 1 in 1.6 million chance of 13-in-a-row months of “above-normal” temperatures. This probability, given that model, was true. It was correct. It was right. It was valid. Everybody in the world should believe it. There was nothing wrong with it. Finis.
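As a rough check on the arithmetic, and assuming (this post does not spell it out) that the Simple Model treats each month as independently having a one-in-three chance of landing in the “above-normal” third, the deduction is a one-liner. The variable names are mine.

```python
from fractions import Fraction

# Assumption, not stated in this post: under the Simple Model each month
# independently has a 1-in-3 chance of being in the "above-normal" third.
p_month = Fraction(1, 3)
months = 13

# Probability of 13 "above-normal" months in a row, given that model.
p_streak = p_month ** months

print(p_streak)      # exact fraction
print(1 / p_streak)  # the "1 in this many" form
```

Under that assumption the result is 1 in 1,594,323, which rounds to the quoted 1 in 1.6 million.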

However, the intimation by the NCDC and many other folks was that because this probability—the true probability—was so small, that therefore the Simple Model was false. And that therefore rampant, tipping-point, deadly, grant-inducing, oh-my-this-is-it global climate disruption on an unprecedented scale never heretofore seen was true. That is, because given the Simple Model the probability was small, therefore the Simple Model was false and another model true. The other model is Global Warming.

This is what is known as backward thinking. Or wrong thinking. Or false thinking. Or strange thinking. Or just plain silly thinking: but then scientists, too, get the giggles, and there’s only so long you can compile climate records before going a little stir crazy, so we mustn’t be too upset.

Now something caused the temperatures in those 13 months to take the values that they did. Some string of physics, chemistry, topography, whatever. Call this whatever the True Model; and call it that because that is what it is: it is the true cause of the temperature. Given the True Model, then, the probability of the temperature taking the values it did was 1—100%. We can only add of course.

The Global Warming model is a rival model held by many to be unquestionable (which is not to say true). Why not ask: given the Global Warming model, what is the probability of 13-in-a-row “above-normal” temperatures? Nobody did ask, but let’s pretend somebody did. There will be some answer, some probability. Save this and set it aside. This probability will also be true, correct, right, assuming we believe the Global Warming model is true.

Yet there also exist many other rival models besides the Global Warming and Simple Models. We can ask, for each of these Rival Models, what is the probability of seeing 13-in-a-row “above-normal” temperatures? Well, there will be some answer for each. And each of those answers will be true, correct, sans reproche. They will be right.

Now collect all those different probabilities together—the Simple Model probability, the Global Warming probability, each of the Rival Model probabilities, and so on—and do you know what we have?

A great, whopping pile of nothing.

What we have are a bunch of probabilities that aren’t the slightest use to us. Get rid of them. Consider them no more. They will do us no good. And why should they? All they are, are a group of true probabilities, each calculated assuming a different model was true.

But we want to know which model is true! The probabilities are mute on this question, silent as the tomb. We ask these probabilities to tell us which model is true (or closest to the True Model) but answer comes there none. Actually, the answer will be, “Why ask me? I’m just a valid probability calculated assuming my model was true. I have no idea whether my model, or any other model, is true.”

Here is what we should ask: Given we have seen 13-in-a-row “above-normal” temperatures, and given my understanding of all the rival models, what is the probability that any of these rival models is true?
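Here is a minimal sketch of what answering that question involves. Every number in it is a hypothetical placeholder except the Simple Model's 1 in 1.6 million, the model names beyond those two are invented, and the sketch assumes the listed models are the only candidates and that exactly one of them is true. The point is only that the probability of the data given each model is an input, and that it must be combined with what we believed about each model beforehand.

```python
def model_posteriors(priors, likelihoods):
    """Pr(model | data) for each model, from Pr(model) and Pr(data | model)."""
    # Joint probability of each model and the data.
    joint = {m: priors[m] * likelihoods[m] for m in priors}
    # Pr(data), summed over all the rival models considered.
    total = sum(joint.values())
    return {m: joint[m] / total for m in joint}

# Hypothetical prior beliefs in each model, given our understanding of them.
priors = {"Simple": 0.2, "Global Warming": 0.4, "Rival A": 0.4}

# Pr(13-in-a-row | model): the Simple value is the quoted 1 in 1.6 million;
# the other two are made up for illustration.
likelihoods = {"Simple": 1 / 1.6e6, "Global Warming": 1e-3, "Rival A": 5e-3}

post = model_posteriors(priors, likelihoods)
for name, p in post.items():
    print(f"{name}: {p:.6f}")
```

Change the priors or add another rival and the answers move, which is as it should be: the question was always about the models, given the data and everything else we know.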

So if somebody tried to answer that question with an “I don’t know. But I do know that if I assume the Simple Model is true, the probability of seeing the data is this-and-such” you would be right to find that person a comfortable chair and to lecture him gently on the advantages of decaffeinated coffee.

Last thrust: assume the Simple Model is the best model there is. Once more, the probability of seeing the data we saw is small. But so what? Rare things happen all the time (see yesterday’s example). People win the lottery, which has a smaller probability than seeing the temperatures we saw. If the Simple Model is the best we have, then all we can say is that we have seen a rare event. And this should be cheering news! Especially if you did not enjoy 13 months in-a-row of “above-normal” temperatures. For we have just learned that such events are rare, and that things almost certainly return to “normal.”


© 2014 William M. Briggs
