Why probability cannot be subjective

A reader recently disputed my condensation of the tenets of Bayesian subjective probability. (I promised a thread on which we could discuss the matter more fully, so here it is.) Here is what I said:

To [subjective Bayesians], all probabilities are experiences, feelings that give rise to numbers which are the results of bets you make with yourself or against Mother Nature (nobody makes bets with God anymore). To get the probability of A you poll your inner self, first wondering how you’d feel if A were true, then how you’d feel if A were false. The sort of ratio, or cut point, where you would feel equally good or bad becomes the probability. Subjective Bayesianism, then, was a perfect philosophy of probability for the twentieth century. It spread like mad starting in the late 1970s and still holds sway today; it is even gaining ground on frequentism. In its favor, it should be noted that, after we get past the bare axioms, the math of subjective Bayesianism and logical probability is the same.

The formal name subjectivists have given to guessing probabilities is elicitation, i.e. the process whereby you “poll your inner self.” I invite all to search for this term; you will find many sources. A good summary is given by this paper from the Aviation Human Factors Division, Institute of Aviation, at the University of Illinois at Urbana-Champaign:

Gamble Methods [of probability elicitation]. Probabilities can also be determined using two gamble-like methods.
In the certain-equivalent method, the expert chooses either a certain payoff or a lottery where the payoff depends on the probability in question, and the elicitor adjusts the amount of the certain payoff until the expert is indifferent between the two choices. In the lottery-equivalent method, the expert chooses either a lottery where the outcome depends on a probability set by the elicitor or a lottery where the outcome depends on the probability in question.

Now, if that doesn’t sound like “To get the probability of A you poll your inner self, first wondering how you’d feel if A were true, then how you’d feel if A were false. The sort of ratio, or cut point, where you would feel equally good or bad becomes the probability” then I’ll eat my hat (the old straw one, the Montecristi, that no longer fits).
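
To make the gamble method concrete, here is a minimal sketch of the certain-equivalent procedure in Python. Everything in it is an assumption for illustration: the expert is modelled as a risk-neutral function with a hidden degree of belief, the names `make_expert` and `elicit_probability` are hypothetical, and the elicitor homes in on the indifference point by bisection, which the quoted paper does not specify.

```python
def make_expert(belief, prize=100.0):
    """Hypothetical risk-neutral expert holding a hidden degree of belief.

    Returns a function that answers True when the expert prefers the
    certain payoff to a lottery paying `prize` if the event occurs.
    """
    def prefers_certain(certain_amount):
        return certain_amount > belief * prize
    return prefers_certain


def elicit_probability(prefers_certain, prize=100.0, tol=1e-6):
    """Bisect on the certain payoff until the expert is (nearly) indifferent."""
    low, high = 0.0, prize
    while high - low > tol:
        mid = (low + high) / 2
        if prefers_certain(mid):
            high = mid   # certain payoff too attractive: lower it
        else:
            low = mid    # lottery preferred: raise the certain payoff
    # For a risk-neutral expert the indifference point is belief * prize.
    return (low + high) / 2 / prize


expert = make_expert(belief=0.3)
elicited = elicit_probability(expert)
print(round(elicited, 4))
```

Note that the procedure recovers the expert's hidden belief, whatever it happens to be; nothing in it checks that belief against evidence.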

There are whole groups of people whose job is to investigate ways of guessing probabilities. One group is, I kid you not, BEEP (Bayesian Elicitation of Experts’ Probabilities) at the University of Sheffield. They have a wide array of semi-psychological, semi-statistical papers on the pitfalls and joys of probability guessing. An excellent summary is this paper by some pretty bigwigs in statistics.

Back of the envelope

As rough estimates, and in many cases, I have absolutely no problem with guessing. I’d even go so far as to say that when any decision has to be made, then unless the situation can be completely deduced, we nearly always fall back on guessing. “Guessing” is usually called “decision making” or “expert opinion.”

But this does not imply, and it is not true, that probability is subjective. There is also the potentially very large problem that elicitation makes you too certain because of the quest for quantification (the point of the original post).

The following is an excerpt from my forthcoming introductory book; it comes after discussions of frequentism and logical probability.

Why probability can’t be subjective

If 3 out of 4 dentists agree that using Dr Johnston’s Whitening Powder makes for shiny teeth, what is the probability that your dentist thinks so? Given only the evidence (premises) that 3 out of 4 etc., then we know the probability is 0.75 that your dentist likes Dr Johnston’s Whitening Powder.

But what if you learned your dentist had just attended an “informational seminar” (with free lunch) sponsored by Galaxy Pharmaceuticals, the manufacturer of Dr Johnston’s Whitening Powder? This introduces new evidence, and will therefore modify the probability that your doctor would recommend Dr Johnston’s.

It may suddenly seem that probability is a matter of belief, of subjective feeling, because different people will have different opinions on how the free lunch will affect the doctor’s endorsement. Probability cannot be a matter of free choice, however. For example, knowing only that a die has 6 sides, and knowing nothing else except that the outcome of the die toss is contingent, then the probability of seeing a 6 is 1 in 6, or about 0.17, regardless of what you or I or anybody thinks. [This is from a discussion of logical probability where the evidence is “A die which will be tossed once has six sides, just one of which is labeled ‘6’” and we want the probability of “We see a 6”, which, given the explicit evidence, is 1/6.]

After you learn of your doc’s cozying up to the pharmaceutical representative, you would be inclined to increase your probability that he would tout Dr Johnston’s to, say, the extent of 0.95. I may come to a different conclusion, say, 0.76 (just slightly higher). Why? Because we are using different sets or collections of information, different evidence or premises, which naturally change our probability assessments. You might know more about pharmaceutical companies than I do, for example, and this causes you to be more cynical, whereas I know more about the purity and selflessness of doctors, and this causes me to be trusting.

But, if I agreed with you about the new evidence, and I felt it was as relevant as you did, then we would share the same probability that the conclusion was true. This, of course, is very unlikely to happen. Rarely will two people agree on a list of premises when the argument involves human affairs, and so it is natural that for most complex things, people will come to different probabilities that the conclusions are true. Does this remind you of politics?

Because people never agree on the set of premises (and they cannot loosely agree on them; they have to agree on them exactly), probabilities will differ. In this sense probabilities are subjective; or rather, it is the choice of premises that is subjective. The probability assigned to a conclusion given a set of premises is not. The probability of a conclusion always follows logically from the given premises.
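
The dentist example can be sketched with Bayes' rule in odds form, as one hedged way to see the point. The code starts from the shared premise (3 out of 4, so odds of 3 to 1) and multiplies by a likelihood ratio standing in for how strongly each person thinks the free lunch bears on the endorsement. The ratios 19/3 and 19/18 are assumptions, chosen only because they reproduce the 0.95 and 0.76 used above; the function name `update` is hypothetical.

```python
from fractions import Fraction


def update(prior, likelihood_ratio):
    """Posterior probability via Bayes' rule in odds form.

    likelihood_ratio = Pr(evidence | conclusion) / Pr(evidence | not conclusion).
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


prior = Fraction(3, 4)  # "3 out of 4 dentists", the shared premise

# Assumed likelihood ratios: how much more likely the seminar attendance is
# if the dentist endorses the powder than if he does not.
you = update(prior, Fraction(19, 3))    # the cynic's premise
me = update(prior, Fraction(19, 18))    # the trusting premise
same = update(prior, Fraction(19, 3))   # anyone adopting the cynic's premise

print(float(you), float(me), you == same)
```

Different likelihood ratios (different premises) give different answers, but anyone who adopts the same prior and the same ratio is forced to the same number.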

14 Comments

  1. Luis Dias

    It’s always a pleasure to read your pedagogical posts, Mr. Briggs. I’ve been learning a lot of statistics!

    Thanks

  2. Rich

    You seem to be saying, “All probabilities are deduced logically from subjective assessments of evidence” and that the logical deduction isolates the probability from the subjectivity of the evidence assessment. This may be true but I’m not sure it’s necessarily a practical distinction. It might even be that we are too sure of our probabilities if we can label them “non-subjective”.

    Incidentally, if I remember correctly, IBM had a risk assessment method years ago called “Delphi” which polled experts for a number, then told them what all the others had said and polled them again, repeating till convergence. So, a “rigorous” method for assessing gut-feeling.

    Rich

  3. Joy

    Luis:
    Absolutely, they’re far superior to the andragogical ones.

  4. PI

    In other words, “Probability can’t be subjective, unless people disagree on assumptions, in which case it is subjective, and since nobody ever agrees on assumptions, probability must be subjective”.

    As Rich said, this is kind of splitting hairs. You haven’t really said anything here that any of, say, the Sheffield subjectivists would disagree with. They’ll agree that if everyone’s using the same likelihood function, they will logically deduce the same probabilities conditioned on the hypothesis. But, being Bayesians, they say that’s not the end of the story and if you want to assign a probability to a hypothesis, you need a prior too. The choice of prior (and the choice of likelihood!) is a premise upon which people’s opinions can differ, and therefore any inference will be subjective.

    While it’s true that elicitation may lead to overconfident priors, and you can argue whether elicitation is a good way to come up with priors in the first place, that doesn’t say anything about whether probability itself is subjective.

  5. PI

    Rich,

    The Delphi method was invented by the RAND Corporation, around 1960.

  6. Briggs

    Rich, PI,

    I appreciate the fine distinctions might be lost in my examples. Let me further highlight an example:

    E1 = “An item which will be tossed once has n sides, just one of which is labeled S”

    The proposition of interest

    A = “We see an S”

    Then

    Pr( A | E1 ) = 1/n

    A subjectivist might agree with this but has two problems: justifying why he does so and why anybody should agree with him, and explaining why he does not pick another number. The logical probabilist must choose 1/n, regardless of what he wants the number to be. I invite anybody who is a subjectivist to argue either (1) for 1/n, or (2) against it.

    Even more illuminating is this classic:

    E2 = “All men are mortal and Socrates is a man”

    The proposition of interest is

    B = “Socrates is mortal”

    Then

    Pr( B | E2 ) = 1

    A subjectivist could argue that the probability was some other number, an impossibility in logic(al probability). To prove what I just said is false requires you to show how it is impossible for a subjectivist to supply a different answer (and note that just saying “It’s a valid argument” doesn’t work, because then you have to answer “What is a valid argument and why isn’t the first one so”).

    I realize this post falls far short of a proof or final argument for logical probability, however. Books and books and 100s of journal articles discuss these topics; it’s nearly impossible to summarize easily.

    I was a referee on a paper (in W&F) where the author was trying to introduce the use of Bayesian probability in a problem with a beta prior. All you have to know here is that the beta has two parameters which must be specified: different choices will lead to different answers. The usual values for the parameters are 1 and 1 (or 1/2 and 1/2) chosen by semi-logical appeals to niceness criteria, but the author, on a whim, chose 10 and 3.

    I tried to argue with him and the editor that this was silly, but he countered that since Bayes was subjective, he was free to choose whatever prior he wanted. There’s no countering that argument, not ever, given the premise that probabilities are subjective. (The paper, incidentally, was published with the author’s odd values.)
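
A small sketch of why the choice matters, under assumptions made here for illustration: the data are binomial, with 3 successes in 10 trials (numbers invented, not taken from the paper), and the summary is the posterior mean, which for a beta(a, b) prior is (a + successes) / (a + b + trials). The function name `posterior_mean` is hypothetical.

```python
from fractions import Fraction


def posterior_mean(a, b, successes, trials):
    """Mean of the beta posterior after binomial data with a beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


successes, trials = 3, 10  # invented data, for illustration only

priors = {
    "beta(1, 1)": (Fraction(1), Fraction(1)),
    "beta(1/2, 1/2)": (Fraction(1, 2), Fraction(1, 2)),
    "beta(10, 3)": (Fraction(10), Fraction(3)),
}

for name, (a, b) in priors.items():
    mean = posterior_mean(a, b, successes, trials)
    print(f"{name}: posterior mean = {mean} (about {float(mean):.3f})")
```

With these invented data the two conventional priors give posterior means of 1/3 and 7/22, while beta(10, 3) pulls the answer up to 13/23, above one half.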

    The choice of probability models for observables and the probability models for the parameters of those models make a huge difference in the final answer, which no one disputes. Old-school frequentists rightly fear that people could select priors so as to produce desired results. This can happen, so the fears of the old guard are real (though perhaps exaggerated).

    Our task, then, is to provide models (and parameter models) that are logically deduced given certain evidence. Naturally, the examples I give here are rather simple and perhaps do not seem as controversial as they really are. It is still the case, I argue, that if we agree on the evidence then we must agree on the probability. Because the choice of evidence is free, this in no way says that probability is subjective, because once the evidence is fixed, so is the probability. I’ll try to think of better, simpler examples of this.

  7. Joy

    Anyone:
    So a subjectivist must be introducing some other factor that was not in the original premise. That’s moving the goal posts halfway through, surely.
    A pure maths professor once told me that applied maths was “rubbish”. I thought that sounded odd. I am amazed that in what looks like a black and white subject, there is any room for controversy. Surely if the premise is agreed upon then the rest is a given? This might be the sort of thing he meant.

  8. ok, let me use my street knowledge to stumble through this dentist example.

    two proposed facts: 3 out of 4 dentists in the total population of dentists recommend the product. my dentist went to a company-sponsored product demonstration.

    it is suggested that the probability that my dentist will recommend the product is somewhere between 76% and 95%, inclusive, the reasoning being that it would be different than 3 out of 4.

    it is that last statement that suggests something to me. the “3 out of 4” dentist population has nothing to do with the company-sponsored-demonstration population. the only overlap is that they are both dentists.

    what if the product has a great marketing campaign but is backed by junk science? perhaps the dentists that get a closer look at the product use it at a significantly lower rate, say 1 out of 5 – that 1 being thankful for the free meal and wasn’t paying attention to anything anyway.

    it is true that the “3 out of 4” population wholly includes the “company-sponsored” population, but i think that the similarities end there.

    what if a new car came on the market. no one has driven it yet. a wonderfully creative marketing campaign ensues. 3 out of 4 drivers say they would consider buying it. then a subset of drivers attend the road show and get to test drive it. the car moves like a hunk of cheese. almost everyone – 19 out of 20 drivers – that actually see the car walk away unimpressed.

    how can we suggest that Population A (opinion swayed by marketing, free samples (whitening powder), and other qualitative factors (likes to have a variety of products available for patients, feels sorry for the sales rep)) has anything whatsoever to do with Population B (had opportunity to set aside all general public information, and actually review the science)?

    if we can’t do anything but state that Pop B was drawn from members of Pop A, then it seems to my admittedly uneducated mind that mentioning “3 out of 4” in context of Pop B is simply wrong. the stats applicable to one population have nothing to do with the other population.

    sure, we can get from A to B, but just like we can get from virtually any population to another – it’s just a matter of the number of layers we wish to add or remove in the process: MADE becomes MAKE becomes BAKE becomes CAKE becomes CEDE, yet MADE and CEDE have little in common except two letters – and even one of those was removed and added back in.

    am i that far off base?

  9. Gamblers Anonymous is the primary subjective Bayesians club, not the University of Sheffield, although there could be significant overlap. In fact, I’ll bet there is.

  10. Joy

    Clyde_M:

    The reason that the second statement about the dentist has a different probability is that the new information about his having been on some course or soiree is now part of the evidence that goes into the calculation. A and B are separate examples. For this reason, your argument that, for instance, your dentist may have been too busy eating vol-au-vents to care is also valid only in a separate example. It would mean that your probability statement would be different and would be based on your own evidence, which would differ from another’s.
    I think that’s right. I think the point is that once you’ve defined or decided upon the premise, you are not then free to embellish the probability without first altering the initial premise again on receipt of additional evidence.
    Mike D:
    They may all be wrong in SheffieldUni but bet they’re ‘100% blade!’

  11. joy, i agree with you. i think my point is this …

    Population A – any definition of people – are given only generally available information about Product Z. 3 out of 4 adopt the product.

    Population B – again, any definition of people – are given a personalized demonstration of Product X. Assume 1 out of 5 adopt the product.

    As presented above, it seems to me that if I am given the Population A adoption data, there is no reliable manner to estimate the Population B adoption rate. The populations and products are dissimilar.

    Now, I do not see how that inability is cured if we first standardize the product – both populations had their contact with Product Z (Product X is no longer in the example), and then further to establish similarity of the populations – Pop B is a subset of Pop A.

  12. Tom

    You wrote, “If 3 out of 4 dentists agree that using Dr Johnston’s
    Whitening Powder makes for shiny teeth, what is the probability that
    your dentist thinks so?”

    In what sense are your dentist’s thoughts a matter of probability at
    all? I know it’s common to talk this way, but I don’t think it makes
    any sense. He either does or doesn’t like DrJWP. It’s not like there’s
    a population of events in which he sometimes likes it and sometimes
    doesn’t. On the other hand, if you asked, “what is the probability
    that a randomly chosen dentist likes DrJWP?”, then I’d say you could
    give a meaningful answer. I suspect I’m just being overly literal
    here, but this has always bothered me.

    I confess I have a similar problem understanding what probabilities
    quoted in weather forecasts are supposed to mean. The path of a
    hurricane is a matter of physics, not probability. Precisely what is a
    statement like “there’s a 30% chance that hurricane XYZ will hit New
    Orleans” supposed to mean? How do you validate a probability like
    that? Sorry if I’m just being dense.

  13. Briggs

    Tom,

    I should have been clearer about the dentist example. Your evidence can be expanded to “3 out of 4 doctors prefer Dr Johnston’s. You haven’t yet spoken to your doctor but you will ask him whether he prefers Dr Johnston’s.” The proposition of interest is “Your doctor will prefer it.”

    The probabilities of this proposition being true given either “3 out of 4 etc.” or “3 out of 4 plus lunch etc.” are the main questions. Each probability is between 0 and 1.

    The probability that your doctor prefers Dr Johnston’s given “Your doctor’s knowledge” is either 0 or 1 just in case he doesn’t or does prefer the powder. His evidence is different than your evidence (before you ask him). So naturally you will have different answers to the probability question.

    As for weather forecasts—which I think I will expand in a new post—they mean just what they say: “Given all the evidence at our disposal, there is a 30% chance that the hurricane will hit Tom.”

    This does not mean 3 out of 10 hurricanes will hit you: there is only one hurricane of interest, after all.

    Verification is a huge question, so I hope you don’t mind if I hold off answering here. But as a hint: that 30% forecast is better than one that said “80% chance” if the hurricane doesn’t hit.
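
One common way to make that hint concrete is the Brier score: the squared difference between the forecast probability and the outcome (1 if the event happened, 0 if not), where lower is better. Using it here is an assumption for illustration, since nothing above commits to a particular verification measure, and the function name `brier` is hypothetical.

```python
def brier(forecast, outcome):
    """Squared error of a probability forecast; outcome is 1 (hit) or 0 (miss)."""
    return (forecast - outcome) ** 2


for outcome, label in [(0, "hurricane misses"), (1, "hurricane hits")]:
    s30 = brier(0.30, outcome)
    s80 = brier(0.80, outcome)
    better = "30%" if s30 < s80 else "80%"
    print(f"{label}: 30% scores {s30:.2f}, 80% scores {s80:.2f}, {better} forecast is better")
```

Under this score the 30% forecast beats the 80% forecast when the hurricane misses, and loses to it when the hurricane hits; a single event can rank two forecasts but says little about either forecaster over the long run.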

  14. Rich

    “It is still the case, I argue, that if we agree on the evidence then we must agree on the probability. Because the choice of evidence is free, this in no way says that probability is subjective, because once the evidence is fixed, so is the probability.”

    I think that’s what I said. I worked with someone once who did research on computer speech analysis. He said, “The first thing you do is make the analog signal digital” as though we should retreat as soon as possible from the scruffy, untidy analog world into the clean, tidy world of digital. This has a similar flavour. Real-world experience is messy and subjective; let’s move on quickly to the clean and perfect world of maths. I’m not saying it’s wrong, only that we ought to know we’re doing it. Oh. That sounds like what you say too.

    By the way, doesn’t Karl Popper have something to say about probability that goes even further and decides that the whole idea of probability is therefore incoherent? It’s a while since I read him. “The Logic of Scientific Discovery”, wasn’t it?

    Rich
