Probability Isn’t “Fair”: An Answer To Senn; Part III

Varying degrees of “fairness.”

A quick reminder that we’re trying to unpack the meaning of the “is fair” in the proposition “This die is fair,” and trying to deduce the probability this proposition is true given (and only given) the evidence “This die has been rolled five times and showed five ’6′s.” See the previous installment for why.

“Is fair” can take one of several definitions. Our predicament arises from not being clear which, and by mixing versions at different stages of the problem.

Meaning 1: In any finite number of tosses, the proportion of observed tosses will match the probabilities deduced from the first example; i.e., the observed proportions will show 1/6 ’1′s, 1/6 ’2′s, and so on, or whatever is closest to these if the number of tosses is not divisible by six.

Assuming Meaning 1, and given our evidence, we deduce the probability the proposition is true is 0; it is false. If the proposition were true, we should have seen some combination of five numbers with one missing (e.g. ’6′, ’3′, ’5′, ’1′, ’4′); the missing could have been any number between ’1′ and ’6.’ (I keep the quotes around the outcomes to help us recall these are labels and not numbers.)

Meaning 2: In any finite number of tosses, the proportion of observed tosses will approximately equal the probabilities deduced from the first example; i.e., the proportions will approximately show 1/6 ’1′s, 1/6 ’2′s, and so on.

Assuming Meaning 2, and given our evidence, we deduce the probability the proposition is true is not calculable. The probability is unknown—because “approximately” is not defined. If “approximately” means (and I do not jest) “Leave me alone, I’m tired of playing dice” then the proposition is true, because the observed frequencies are more than close enough for somebody who doesn’t give a damn about dice. If you fail to appreciate this example, you are in for tough times ahead; so pause here and make sure this sinks in.

If “approximately” means “not varying more than 5% from” then the proposition is deduced to be false because, of course, the observed proportions have differed by more than 5%. But if “approximately” means “not varying more than 90% from” then the proposition is deduced to be true, because the observed variations are within this bound.
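To make the dependence on "approximately" concrete, here is a minimal sketch of Meaning 2 with an explicit tolerance. It assumes that "not varying more than 5% from" means the largest absolute gap, over the six labels, between an observed proportion and the deduced 1/6. That reading, and the names `max_departure` and `approximately_fair`, are mine and are not part of the original argument.

```python
from collections import Counter
from fractions import Fraction

FACES = ["1", "2", "3", "4", "5", "6"]  # labels, not numbers
DEDUCED = Fraction(1, 6)                # deduced probability for each label


def max_departure(rolls):
    """Largest absolute gap between an observed proportion and 1/6."""
    counts = Counter(rolls)
    n = len(rolls)
    return max(abs(Fraction(counts[face], n) - DEDUCED) for face in FACES)


def approximately_fair(rolls, tolerance):
    """Meaning 2, with 'approximately' pinned down as a numeric tolerance (an assumption)."""
    return max_departure(rolls) <= tolerance


evidence = ["6"] * 5  # five rolls, five '6's

print(max_departure(evidence))                          # 5/6
print(approximately_fair(evidence, Fraction(5, 100)))   # False
print(approximately_fair(evidence, Fraction(90, 100)))  # True
```

The same evidence gives opposite verdicts, and the only thing that changed was the tolerance somebody chose.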

Who gets to decide what “approximately” means? Well, you do; as does Senn; as do I. Fights start over things like this. What is the one and only definition of “approximately”? There isn’t one! It depends on the situation. As we saw, for some it could mean “Leave me alone”, for others, say casinos, it would be much tighter.

Think this ambiguity bad? It’s even worse than this.

Meaning 3: In any finite number of tosses greater than or equal to 6, the proportion of observed tosses will equal the probabilities deduced from the first question; i.e., the proportions will be 1/6 ’1′s, 1/6 ’2′s, and so on, or whatever is closest to that if the number of tosses is not divisible by six.

Given this and our evidence, the proposition is not true or false (1 or 0) but somewhere in between because we haven’t yet reached the limit of 6 tosses. Kind of. If the die were tossed just one more time (for 6 in total), then there is no way the observed proportions could equal the deduced probabilities. The proposition would then be false. But the die hasn’t been tossed just one more time. It could be tossed 100 more times. Who knows? But we still have the feeling, based on the observations, that the future tosses won’t bring the final proportions in line with the deduced probabilities (I keep repeating deduced to remind us these are not subjective guesses nor are they estimates).

Our evidence and assumed definition aren’t proof the proposition is false, especially if we consider it with respect to Meaning 4, which is the same as 3 but with “approximately” put in the usual place. Nor is the proposition true. But we also don’t seem in a strong position to quantify the probability. Nothing in the world wrong with that. Not all probability is quantifiable. See the original series for why.

If we insist on asking the original question, we’re left trying to understand what “is fair” could mean. We need to settle on a definite, unambiguous meaning before we can progress. And then even if we do we’re going to be left with all kinds of nagging questions about real dice.

Real dice have weight distributed unevenly. There’s no way to create perfect balance. We can prove this easily: displaying the numbers, which are of different shape, creates an imbalance, however minuscule. It might be possible to engineer a die down to the level of a quark, so that each side is precisely the same number of quarks across, and that the mass of the die is uniform at the Planck scale (except for the surface where the displays are). In practice, for macro-scale dice, this is impossible. But maybe some physicist will figure it out for some tiny thing. Even then, he won’t be sure that the strings which comprise the quarks are the “same length” everywhere and uniformly (if that even makes sense to say).

But even supposing we have this toy, we have the problem of tossing it. How? Onto what kind of surface? From what height? How much spin? With what downward force? In what gravitational field? After all, if we want to discuss tossing a “fair” die, all these things have to be considered. Tossing is part of “fairness.”

It is at this point it dawns on us that we’re on a fool’s errand. If the die were perfect, as we imagine (and as a logical die is in effect), and if the environmental conditions and forces were known precisely, then we’d know—before tossing—exactly what the outcome would be. Indeed, if the forces did not vary, the die would land the same way each time.

Point is, just by our knowledge of physics we know that any real die and its tossing environment isn’t “fair” in any complete physical sense. There’s no point to the original question. No real die (or its tosses) is “fair” in this sense. The proposition is contingent.

We’re asking the wrong question. What we really want to know is if the die is “fair enough”, and to answer that requires, as above, knowing what decisions we want to make regarding the die.

What we can do is to deduce the probability of seeing any arrangement of observations, either before seeing any observations whatsoever, or conditional on our initial knowledge of six-sided (logical) objects supplemented by a set of observations specified by the evidence. (We do this using Bayes’s theorem: see the next Parts.)

In other words, we can then make statements like this, “Given our evidence about six-sided objects and the old observations, the probability of seeing departures of future observed proportions at least as great as X% from the deduced probabilities is Y.” If Y exceeds a threshold, then we act as if the die is not “fair”, but if it is less than this threshold, we say it is. The threshold varies depending on the application. For the person sick to death of dice, X is unimportant and Y is quite low. Casinos want a small X and large Y for obvious reasons.

We’ll never have 100% certainty that any real die is “fair” in this (final) sense that Y = 0 (for vanishingly small X), because we knew before we started that the question dealt with a contingent matter, and we are never 100% certain of contingent matters (though we can be 1 – ε certain).

And you’ll notice that nowhere did we confuse the observed proportions (i.e. the relative frequencies) with probabilities. We knew the probabilities and used them to discern whether the relative frequencies were in line with them; this is what we meant by “fairness.”

We have proved what we set out to show. That we don’t, at least for the kinds of examples that Senn provided, need two kinds of probability. The one kind—probability as logic—was enough.

Yet there is still more to understand. Stick around!

Update We could also form statements like this: “Given our evidence and old observations, in the next n throws, there is probability Y of seeing X ’1′s” and so forth. In other words, this and the previous example are predictions, statements of uncertainty of the future (or of that which is as yet unseen).
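A statement of this predictive form can be sketched in a few lines. The sketch below is only one way to get there, and it leans on an assumption the text above does not make: it starts from a uniform Dirichlet weighting over the six labels, which yields a beta-binomial predictive probability for the count of any one label in the next n throws. The function name `predictive_count_prob` and its parameters are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function, via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def predictive_count_prob(k, n, seen_label, seen_total, n_states=6, prior=1.0):
    """Probability of seeing a given label exactly k times in the next n throws.

    ASSUMPTION: a uniform Dirichlet(prior, ..., prior) starting point over the
    labels, updated by the old observations (beta-binomial predictive).
    """
    a = prior + seen_label                                  # weight for this label
    b = prior * (n_states - 1) + (seen_total - seen_label)  # weight for all other labels
    return comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))


# Old observations: five throws, five '6's, so zero '1's.
n_next = 10
probs_of_ones = [predictive_count_prob(k, n_next, seen_label=0, seen_total=5)
                 for k in range(n_next + 1)]

for k, y in enumerate(probs_of_ones):
    print(f"next {n_next} throws, X = {k} '1's: Y = {y:.4f}")
```

Under these assumptions the probability that the very next throw shows a '6' works out to 6/11, not 1/6 and not 1: the old observations move the prediction without settling it.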

There Is Only One Kind Of Probability: An Answer To Senn; Part II

What are the chances a green die will land on top?

Read Part I. Some of this material is explained in detail in this series.

Just after the introduction, Senn starts his argument by claiming an “important distinction between two types of probabilities: direct and inverse.”

The distinction is simply explained by an example. The probability that five rolls of a fair die will show five sixes is an example of a direct probability—it is a probability from model to data. The probability that a die is fair given that it has been rolled five times to show five sixes is an inverse probability: it is a probability from data to model.

If we accept this distinction and example as written, we are already lost; all the standard confusions are there.

If probability is all of one sort, then there is no distinction between “direct” and “inverse” kinds. Our candidate is logical probability, in which, as in just-plain-logic, there is only evidence (equivalently, premises), a proposition to be considered with respect to that evidence, and a probability this proposition is true deduced from the evidence.

Let’s begin by rewriting the examples. The evidence is what? Trouble starts with the words “fair die”. This is taken to mean that we have a real, physical, tangible object which must, when tossed, result in an equal chance of any side face up. This is asserted and not proved. It is a dictate. It sets in the mind a view of an actual die, of the kind that cannot (or at least does not) exist. Once this die is imagined, objections immediately arise: what if it isn’t “fair”? Can real dice be “fair”? What about imperfections? The confusion between asserting a probability and wondering whether the asserted probability equals the “real” probability, i.e. the long-run frequency of tosses, is already ineradicable. It becomes impossible to keep in mind what the real question is.

Start over rewriting all as a logical argument. “We have a six-sided (logical) object, just one side of which is labeled ’1′, just one side of which is labeled ’2′, and so on up to ’6′, which when tossed must show just one of these sides.” No physical, real die is implied, though because of the ubiquity of dice-like examples, people usually think one is. So if you find yourself unable to imagine a logical, i.e. non-physical, six-sided object, change it to a six-state Martian bleen, a device which is activated by tentacle and displays each time it is activated on a screen one and only one of the figures (translated into English) ’1′, ’2′, etc. There is no hint—as in no hint—of the workings of this device. All—as in all—we know is that the device when activated can show one of ’1′ through ’6′; how it does so is a mystery.

I stress again (and again) that since there are no Martians, there are no bleens. Any imperfections we imagine in a bleen are our own creations and are not part of the evidence supplied. The key to LPB is that we must—as in must—use only the evidence supplied, and all of it, in our deductions of probability. What is not directly implied from the given evidence must—as in, well, you get the idea—be ignored.

Now using the statistical syllogism (which itself can be deduced from simpler principles), we deduce the probability a ’6′ shows on one activation of a bleen, just as we can deduce the probability of five ’6′ activations. Or we can deduce anything which can happen in any (for now stick to finite) number of activations.
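The deduction can be written out by plain counting. This is a sketch, assuming (as the evidence gives no reason to favor any label or any ordering) that every possible sequence of activations gets equal weight; the helper name `prob_of` is mine.

```python
from fractions import Fraction
from itertools import product

STATES = ["1", "2", "3", "4", "5", "6"]  # what the bleen can display


def prob_of(event, n_activations):
    """Fraction of equally weighted sequences for which `event` holds."""
    outcomes = list(product(STATES, repeat=n_activations))
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


# One activation showing '6'.
print(prob_of(lambda o: o == ("6",), 1))               # 1/6

# Five activations, all showing '6'.
print(prob_of(lambda o: all(s == "6" for s in o), 5))  # 1/7776
```

Any other proposition about a finite number of activations can be handled the same way by changing the `event` function, for instance "at least one '6' in five activations."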

We are done with the first example, which ends at a conditional probability; i.e. a probability deduced from given, fixed evidence. All probability is likewise conditional. If you think not, see the series linked above for examples, or see Part III tomorrow for more on this.

Notice that I do not use the word “model”. It isn’t needed. Not here, and in far fewer cases than usually thought.

Senn’s second (“inverse”) example is also confusing. This asks the probability the following proposition is true: “This die is ‘fair’.” The only written evidence is “This die has been rolled five times and has showed five ’6′s.” That we are dealing with a real, physical die is implied from the words, but it is never stated. But suppose this is wrong and Senn meant a logical die or a bleen: then where would we be?

Right where we started. If this is the logical “die” or bleen, then we start by knowing the chance each number is displayed is 1/6. We end there, too. We have deduced “fairness.”

So we must be talking of a physical, real-life die. Our task is to interpret this proposition with regard to the given observations.

This evidence is easy and means just what it says: five rolls, five ’6′s of some real die. The proposition is less clear. The subject makes sense: “This die” means some real, actual physical die. The difficulty is with the verb: “is fair.”

Ah, fairness. From youth we are told that there is nothing finer! Indeed, fairness is so fine that we discuss it next time.


Bayes Is More Than Probably Right: An Answer To Senn; Part I

Stephen Senn very kindly answered a post I wrote on p-values (Unsignificant Statistics: Or Die P-Value, Die Die Die) by sending me his “You May Believe You Are a Bayesian But You Are Probably Wrong” (in Rationality, Markets and Morals).

Since I will be teaching at Cornell these two weeks, and the topics are the same, I will use part of this time to answer his paper in depth.

It would be best to start here: Subjective Versus Objective Bayes (Versus Frequentism): Part I, since that series explains matters in greater detail.

Probability

Senn went wrong before he even began, with his title: “You May Believe You Are a Bayesian But You Are Probably Wrong.” If you are only “probably wrong” about your belief then you also might be right. And if you were certainly wrong, then we would have a proof which says so. A proof is a string of deductions, i.e. a valid and sound argument, which begins with obviously true premises (agreed to by all) and ends at a proposition we must believe—even if we don’t want to.

Senn does not have, nor does he claim to have, a proof which shows being a Bayesian is certainly wrong. It is only his best guess that this philosophy is wrong. Probably wrong. So here we are, already at probability. What could Senn mean by his probabilistic statement “probably wrong”? (Besides the pun, I mean.) It can’t be any kind of frequentist statement, as in “I’ve collected a ‘random sample’ of Bayesian philosophies, itself embedded in an infinite sequence of such philosophies, and the mean of this sample (considering errors in theory equal to zero) tends towards zero.” That makes no kind of sense, as I’m sure Senn would agree, but it would have to if probability was frequentism.

Bayesian philosophy, at best, comes in a finite number of flavors. It could be that some of these are false (I agree subjectivism, as it is usually understood, is), but in no way can we imagine any individual theory as being embedded in an infinite sequence of theories, which is required for frequentist theory to hold. No: either we can prove each theory true or false, or our evidence is not (yet?) sufficient, and thus we are only probably sure each theory is true or false. This sounds like a Bayesian statement, no? (If so, do we fail because of self-reference? Well, no, because we can build this theory from simpler propositions.)

It could be that Senn took a subjective Bayesian tack when he formed his title, or perhaps he took a logical probability, or objective, Bayesian one. (Incidentally, I’ll call this latter theory LPB for short.) Or he could have meant some as yet unknown (or at least unidentified) theory. Whatever it was, it couldn’t have been frequentism, as shown.

His leading candidate is eclecticism (Senn is not frequentist), which is one of two things. One is no belief at all. It means “I’ll do whatever I want whenever it seems good to me.” There is no theory here to disprove, nor prove. To say “I’m an eclectic” this way means “I don’t want to argue for anything, just against things.” Since we go nowhere engaging with this “theory”, we pass on to number two. This is to say, “I’ll take a little of that, some of this, and some of the other.” Here we have several sub-theories. As such, this kind of eclecticism is actually a whole theory (the compilation of sub-theories) which might be true or false. Thus Senn might have used Bayes for his title and he might use frequentism for (say) dice tosses.

Senn recalls that Fisher himself was “skeptical” of attempts to unify probability. Hacking, another Big Cheese, in line with other well-aged curds, is of the same opinion. Why should we have a theory? Why not many? The obvious answer to this is that there is that which is true and that which is false and we should seek the truth. If it turns out a theory of probability works for all kinds of uncertainty, we’re stuck with it. If it must be that several theories are true, then we must accept them all. But it’s wrong to use desire or suspicion as proof there are many and not one theory.

Senn himself proved that frequentism is out (and forever) as a complete theory of probability because it cannot handle propositions like his “probably wrong.” But this isn’t proof that Bayes is everywhere right; not yet. Senn’s later examples might be sufficient to show all versions of Bayes are wrong, in which case some other theory must be true.

But we’ll have to see next time, because we’re already out of space, and because the next topic isn’t simple.


Cornell Teaching Sojourn: Probability, Stats, & R

Time for the annual migration to Ithaca via a well accoutered golden coach (complete with undergraduates feeding professors grapes grown at Cornell’s orchards). There I will linger for two weeks, ruling as benign and loving dictator over ILRST 5150, i.e. Statistical Research Methods in ILR’s MPS program.

The class works by me holding forth with dulcet but brief pontifications followed by intense questioning of the students, as a cop might grill a suspect. “What did I just say? What in the dark-mattered universe do you think I meant by that? Have you signed up for the wine tour yet?”

The wine tour—completely unofficial and off the books—ends Week One with a journey to several Finger Lakes wineries to sample their wares. To be cruelly honest, many of these are poor. If the wines aren’t sour and vinegary, they are so sweet you could stand a teaspoon up in them. One unbearable vineyard (the name of which is hidden in a riddle) produces nothing but pinkish paint thinner. But everywhere the wines are wet and contain (among other chemicals) ethanol, which is welcome after five full days of statistics statistics statistics and with another week of the same to come.

(But there are dangers, too. At one stop on the wine trail, I was once nearly abducted by a bachelorette party and had to be rescued by one of my students.)

The class contains almost no math and certainly no memorization of formulas. I figure the computer can do those things for you, and that time spent proving things mathematically removes time spent in understanding what probability is and learning the strengths and limitations of statistics. As regular readers know, the latter are many, nefarious, and ubiquitous.

I have only one or two canned examples. The rest have to be provided by the students themselves. This eliminates having to figure out a whole new field and its data and how to describe its uncertainty. Besides, textbook examples are far too neat, even coy. Better to see how messy, compromising, and ambiguous collecting data is. Gives a far better appreciation of the ease of making mistakes and the resultant over-confidence.

I teach R; successfully, too. Yes, it is a programming language, but that is its great advantage. I was able to teach R to a man who did not know what a spreadsheet was and could not type. He did not own a computer. This wasn’t because of my ability, but because learning the rudiments of any logical programming language is something almost anybody can do. (I do not include SAS in this list; it is an appalling language.)

Following my custom, for the next two weeks posts will reflect, broadly or in detail, what is going on in the class. I won’t have time to do anything more. Feel free to ask questions, but understand I might not be able to get to all of them.


Update A good joke.