September 26, 2008 | 14 Comments

Why probability cannot be subjective

A reader recently disputed my condensation of the tenets of Bayesian subjective probability. (I promised a thread on which we could discuss the matter more fully, so here it is.) Here is what I said:

To [subjective Bayesians], all probabilities are experiences, feelings that give rise to numbers which are the results of bets you make with yourself or against Mother Nature (nobody makes bets with God anymore). To get the probability of A you poll your inner self, first wondering how you’d feel if A were true, then how you’d feel if A were false. The sort of ratio, or cut point, where you would feel equally good or bad becomes the probability. Subjective Bayesianism, then, was a perfect philosophy of probability for the twentieth century. It spread like mad starting in the late 1970s and still holds sway today; it is even gaining ground on frequentism. In its favor, it should be noted that, after we get past the bare axioms, the math of subjective Bayesianism and logical probability is the same.

The formal name subjectivists have given to guessing probabilities is elicitation, i.e. the process whereby you “poll your inner self.” I invite all to search this term, where you will find many sources. A good summary is given by this paper from the Aviation Human Factors Division, Institute of Aviation, at the University of Illinois at Urbana-Champaign:

Gamble Methods [of probability elicitation]. Probabilities can also be determined using two gamble-like methods.
In the certain-equivalent method, the expert chooses either a certain payoff or a lottery where the payoff depends on the probability in question, and the elicitor adjusts the amount of the certain payoff until the expert is indifferent between the two choices. In the lottery-equivalent method, the expert chooses either a lottery where the outcome depends on a probability set by the elicitor or a lottery where the outcome depends on the probability in question.

Now, if that doesn’t sound like “To get the probability of A you poll your inner self, first wondering how you’d feel if A were true, then how you’d feel if A were false. The sort of ratio, or cut point, where you would feel equally good or bad becomes the probability” then I’ll eat my hat (the old straw one, the Montecristi, that no longer fits).
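The certain-equivalent method quoted above can be sketched as a simple search. This is only an illustration under loud assumptions: the expert is simulated, is risk-neutral (takes the sure amount whenever it beats the lottery's expected value), and answers consistently. The names `elicit_probability`, `make_expert`, and the payoff of 100 are hypothetical and do not come from the paper.

```python
def elicit_probability(prefers_certain, payoff=100.0, tol=1e-6):
    """Adjust a certain payoff by bisection until the expert is indifferent.

    prefers_certain(amount) -> True if the expert takes the sure amount
    over a lottery paying `payoff` when the event in question occurs.
    """
    lo, hi = 0.0, payoff
    while hi - lo > tol:
        certain = (lo + hi) / 2
        if prefers_certain(certain):
            hi = certain  # sure amount is too attractive, so lower it
        else:
            lo = certain  # lottery still preferred, so raise the sure amount
    indifference_amount = (lo + hi) / 2
    # Assumption: for a risk-neutral expert, indifference amount = p * payoff.
    return indifference_amount / payoff


def make_expert(belief, payoff=100.0):
    """Hypothetical risk-neutral expert holding a private probability `belief`."""
    return lambda certain: certain > belief * payoff


estimate = elicit_probability(make_expert(0.3))
print(round(estimate, 4))
```

Note what the search recovers: whatever number the expert already carried around inside. The procedure measures the cut point; it does not say where that cut point came from.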

There are whole groups of people whose job is to investigate ways of guessing probabilities. One group is, I kid you not, BEEP (Bayesian Elicitation of Experts’ Probabilities) at the University of Sheffield. They have a wide array of semi-psychological, semi-statistical papers on the pitfalls and joys of probability guessing. An excellent summary is this paper by some pretty bigwigs in statistics.

Back of the envelope

As rough estimates, and in many cases, I have absolutely no problem with guessing. I’d even go so far as to say that when any decision has to be made, then unless the situation can be completely deduced, we nearly always fall back on guessing. “Guessing” is usually called “decision making” or “expert opinion.”

But this does not imply, and it is not true, that probability is subjective. There is also the potentially very large problem that elicitation makes you too certain because of the quest for quantification (the point of the original post).

The following is an excerpt from my forthcoming introductory book; it comes after discussions of frequentism and logical probability.

Why probability can’t be subjective

If 3 out of 4 dentists agree that using Dr Johnston’s Whitening Powder makes for shiny teeth, what is the probability that your dentist thinks so? Given only the evidence (premises) that 3 out of 4 etc., then we know the probability is 0.75 that your dentist likes Dr Johnston’s Whitening Powder.

But what if you learned your dentist had just attended an “informational seminar” (with free lunch) sponsored by Galaxy Pharmaceuticals, the manufacturer of Dr Johnston’s Whitening Powder? This introduces new evidence, and will therefore modify the probability that your doctor would recommend Dr Johnston’s.

It may suddenly seem that probability is a matter of belief, of subjective feeling, because different people will have different opinions on how the free lunch will affect the doctor’s endorsement. Probability cannot be a matter of free choice, however. For example, knowing only that a die has 6 sides, and knowing nothing else except that the outcome of the die toss is contingent, then the probability of seeing a 6 is 1 in 6, or about 0.17, regardless of what you or I or anybody thinks. [This is from a discussion of logical probability where the evidence is “A die which will be tossed once has six sides, just one of which is labeled ‘6’” and we want the probability of “We see a 6”, which, given the explicit evidence, is 1/6.]

After you learn of your doc’s cozying up to the pharmaceutical representative, you would be inclined to increase your probability that he would tout Dr Johnston’s to, say, the extent of 0.95. I may come to a different conclusion, say, 0.76 (just slightly higher). Why? Because we are using different sets or collections of information, different evidence or premises, which naturally change our probability assessments. You might know more about pharmaceutical companies than I do, for example, and this causes you to be more cynical, whereas I know more about the purity and selflessness of doctors, and this causes me to be trusting.

But, if I agreed with you about the new evidence, and I felt it was as relevant as you did, then we would share the same probability that the conclusion was true. This, of course, is very unlikely to happen. Rarely will two people agree on a list of premises when the argument involves human affairs, and so it is natural that for most complex things, people will come to different probabilities that the conclusions are true. Does this remind you of politics?

Because people never agree on the set of premises—and they cannot loosely agree on them, they have to agree on them exactly—probabilities will differ. In this sense, probabilities are subjective—rather, it is the choice of premises that is subjective. The probability assigned to a conclusion given a set of premises is not. The probability of a conclusion always follows logically from the given premises.

September 25, 2008 | 13 Comments

More evidence that people are more sure than they should be

From Jerry Pournelle (What? You haven’t read Lucifer’s Hammer yet?) on how just about everybody making bets in the financial markets was wrong. This “everybody” includes very highly educated, extraordinarily well paid, respected, etc. etc., people.

One of my favorite lines: “Given incorrect models to work with, the computers continued to forecast profits right up to the crash.”

Another: “As to what can be done, it may not matter. That is, it’s important what we do, but the chance that it will be done sanely and rationally is very small.” Of course, what we do will be pronounced as “the” thing to do. After all, the eventual plan, whatever it might be, will be made by experts.

Pournelle’s worry, as should be ours, is that the only thing that will happen is the creation of yet another big-government bureaucracy.

“Pournelle’s Iron Law of Bureaucracy states that in any bureaucratic organization there will be two kinds of people: those who work to further the actual goals of the organization, and those who work for the organization itself. Examples in education would be teachers who work and sacrifice to teach children, vs. union representatives who work to protect any teacher including the most incompetent. The Iron Law states that in all cases, the second type of person will always gain control of the organization, and will always write the rules under which the organization functions.”

Ah, government bureaucracy. Is there anything experts at the government can’t fix? I know I can’t wait for the EPA to start regulating the “pollutant” CO2. They ought to figure a way to tie mortgages to global warming. Then things will really get better.

Yes, a disconnected rant today. All I know is that I have been prudent and actually have saved to buy a house, did not try to purchase anything I couldn’t afford, and now I will be asked to pay for the mistakes of all the experts and fools who brought this on.

In any government bailout, the first thing I would require is that any executive of the firms that are being helped would lose all of their personal assets. Every penny. Then I’d sue the traders and stockholders to recover more. I’d do all that before I started taking money from innocent civilians.

As it is, the executives from Fannie Mae, Lehman Brothers, etc., will all walk away very rich men. They will be rewarded.

And the government will continue to bloat.

September 22, 2008 | 35 Comments

Not all uncertainty can be quantified

(This essay will form, when re-written more intelligently, part of Chapter 15, the final Chapter, of my book. Which is coming….soon? The material below is neither easy nor brief, folks. But it is very important.)

To most of you, what I’m about to say will not be in the least controversial. But to some others, the idea that not all risk and uncertainty can be quantified is somewhat heretical.

However, the first part of my thesis is easily proved; I’ll prove the second part below.

Let some evidence we have collected—never mind how—be E = “Most people enjoy Butterfingers”. We are interested in the truth of this statement: A = “Joe enjoys Butterfingers.” We do not know whether A is true or false, and so we will quantify our uncertainty in A using probability, which is written like this

#1    Pr( A | E )

and which reads “The probability that A is true given the evidence E”. (The vertical bar “|” means “given.”)

In English, the word most at least means more than half; it could even mean a lot more than a half, or even nearly all—there is certainly ambiguity in its definition. But since most at least means more than half, we can partially answer our question, which is written like this

#2    0.5 < Pr( A | E ) < 1

and which reads “The probability that A is true is greater than a half but not certain given the evidence E.” This answer is the best we can do with the given evidence.

This answer is a quantification of sorts, but it is not a direct quantification like, say, the answer “The probability that A is true is 0.673.”

It is because there is ambiguity in the evidence that we cannot completely quantify the uncertainty in A. That is, the inability to articulate the precise definition of “most people” is the reason we cannot exactly quantify the probability of A.

The first person to recognize this, to my knowledge, was John Maynard Keynes in his gorgeous, but now little read, A Treatise on Probability, a book which argued that all probability statements were statements of logic. To Keynes—and to us—all probability is conditional; you cannot have a probability of A, but you can have a probability of A with respect to certain evidence. Change the evidence and change the probability of A. Stating a probability of A unconditional on any evidence disconnects that statement from reality, so to speak.

Other Theories of Probability

For many reasons, Keynes’s eminently sensible idea never caught on and instead, around the same time his book was published, probability theory bifurcated into two antithetical paths. The first was called frequentism: probability was defined to be the number of experiments in which A is true divided by the total number of experiments, in the limit as that total goes to infinity¹. This definition makes it difficult (an academic word meaning impossible) to answer what is the probability that Joe, our Joe, likes Butterfingers. It also makes it difficult to define the probability for any event or events that are constrained to occur less than an infinite number of times (so far, this is all events that I know of).
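Written out, the frequentist definition just described is a limit. The symbols are labels introduced here for illustration only: n is the number of experiments run so far and n_A is the number of those in which A turned out true.

```latex
\Pr(A) \;=\; \lim_{n \to \infty} \frac{n_A}{n}
```

For any finite n the ratio n_A / n is only an observed relative frequency; the definition itself refers to a limit that no actual sequence of experiments ever reaches.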

The second branch was subjective Bayesianism. To this group, all probabilities are experiences, feelings that give rise to numbers which are the results of bets you make with yourself or against Mother Nature (nobody makes bets with God anymore). To get the probability of A you poll your inner self, first wondering how you’d feel if A were true, then how you’d feel if A were false. The sort of ratio, or cut point, where you would feel equally good or bad becomes the probability. Subjective Bayesianism, then, was a perfect philosophy of probability for the twentieth century. It spread like mad starting in the late 1970s and still holds sway today; it is even gaining ground on frequentism.

What both of these views have in common is the belief that any statement can be given a precise, quantifiable probability. Frequentism does so by assuming that there always exists a class of events—which is to say, hard data—to which you can compare the A before you. Subjective Bayesianism, as we have seen, can always pull probabilities for any A out of thin air. In every conceivable field, journal articles using these techniques multiply. It doesn’t help that the many times probability estimates are offered in learned publications, they are written in dense mathematical script. Anything that looks so complicated must be right!

Mathematics

The problem is not that the mathematical theories are wrong; they almost never are. But because the math is right does not imply that it is applicable to any real-world problems.

The math often is applicable, of course; usually for simple problems and in small cases the results of which would not be in much dispute even without the use of probability and statistics. Take, for example, a medical trial with two drugs, D and P, given to equal numbers of patients for an explicitly definable disease that is either absent or present. As long as no cheating took place and the two groups of patients were balanced, then if more patients got better using drug D, that drug is probably better. In fact, just knowing that drug D performed better (and no cheating and balance) is evidence enough for a rational person to prefer D over P.

All that probability can do for you in cases like this is to clean up the estimates of how much better D might be than P in new groups of patients. As long as no cheating took place and the patients were balanced, the textbook methods will give you reasonable answers. But suppose the disease the drugs treat is not as simply defined. Let’s write what we just said in mathematical notation so that certain elements become obvious.

#3    Pr ( D > P | Trial Results & No Cheating & Patients Like Before) > 0.5.

This reads, the probability that somebody gets better using drug D rather than P given the raw numbers we had from the old trial (including the old patient characteristics) and that no cheating took place in that trial and the new patients who will use the drugs “look like” the patients from the previous trial, is greater than 50% (and less than certain).

Now you can see why I repeatedly emphasized that part of the evidence that usually gets no emphasis: no cheating and patients “like” before. Incidentally, it might appear that I am discussing only medical trials and have lost sight of the original thread. I have not, which will become obvious in a moment.

Suppose the outcome of applying a sophisticated probability algorithm gave us the estimate of 0.72 for equation #3. Does writing this number more precisely help if you suppose you are the doctor who has to prescribe either D or P? Assume that no cheating took place in the old trial, then drug D is better if the patient in front of you is “like” the patients from the old trial. What is the probability she is so (given the information from the old trial)?

The word like is positively loaded with ambiguity. Not to be redundant, but write out the last question mathematically.

#4    Pr ( My patient like the others | Patient characteristics from previous trial)

The reason to be verbose in writing out the probability conditions is that it puts the matter starkly. It forces you, unlike the old ways of frequentism and subjective Bayesianism, to specify as completely as possible the circumstances that form your estimate. Since all probability is conditional, it should always be written as such so that it is always seen as such. This is necessary because it is not just the probability from equation #3 that is important; equation #4 is, too. If you are the doctor, you do not—you should not—focus solely on probability #3 because what you really want is this:

#5    Pr ( D > P & My patient like before | Trial Results & No Cheating & Patient Characteristics)

which is just #3 x #4. I am in no way arguing that we should abandon formal statistics, which produces quantifications like equation #3. But I am saying that since, as we already know, exactly quantifying #4 is nearly impossible, we will be too confident of any decisions we make if we, as is common, substitute probability #3 for #5 because, no matter what, the probability of #3 and #4 both can never exceed the probability of #3 alone, and is strictly less whenever #4 is short of certain.
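A small sketch of the product rule at work may help. The 0.72 for #3 is the figure used above; the bounds on #4 are purely hypothetical (nothing in the trial supplies them), and they are given as a range precisely because #4 cannot be pinned to a single number. The function name `joint_probability` is likewise just a label for this illustration.

```python
def joint_probability(p_better_given_like, p_like):
    """Product rule: Pr(D > P & like | ...) = Pr(D > P | like, ...) * Pr(like | ...)."""
    for p in (p_better_given_like, p_like):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_better_given_like * p_like


p3 = 0.72               # equation #3, the figure used in the text
p4_bounds = (0.5, 0.9)  # equation #4: hypothetical bounds, not an estimate

# Propagate the bounds on #4 through to #5.
p5_bounds = tuple(joint_probability(p3, p4) for p4 in p4_bounds)
print(p5_bounds)
```

With these made-up bounds, #5 lands somewhere between about 0.36 and 0.65. Every value in that range is below the 0.72 that would be reported if #3 were mistaken for the final answer.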

Appropriate caveats and exceptions are usually delineated in journal articles when using the old methods, but the results are buried in the text, which causes them to be weighed more or less importantly, and which gives the reader a false sense of security. Because, in the end, we are left with the suitably highlighted number from equation #3, that comforting exact quantification reached by implementing impressive mathematical methods. That final number, which we can now see is not final at all, is tangible, and is held on to doggedly. All the evidence to the right of the bar is forgotten or downplayed because it is difficult to keep in mind.

The result of equation #3 is produced, too, only from the “hard data” of the trial, the actual physical measurements from the patients. These numbers have the happy property that they can be put into spreadsheets and databases. They are real. So real that their importance is magnified far beyond their capacity to provide all the answers. They fool people into thinking that equation #3 is the final answer, which it never is. It is always equation #5 that is important to making new decisions. Sometimes, in simple physical cases, probabilities #3 and #5 are so close as to be practically equal; but when the situation is complex, as it always is when involving humans, these two probabilities are not close.

Everything That Can Happen

The situation is actually even worse than what we have discussed so far. Probability models, the kind that spit out equation #3, are fit to the “hard data” at hand. The models that are chosen are usually picked because of habit and familiarity, but responsible practitioners also choose the models so that they fit the old data well. This is certainly a rational thing to do. The problem is that, since probability models are only designed to say something about future data, the old data does not always encompass everything that can happen and so we are limited in what we can say about the future. All we can say for certain is what has happened before might happen again. But it’s anybody’s guess whether what hasn’t happened before might happen in the future.

The probability models fit the old data well, but nobody can ever know how well they will fit future data. The result is that overreliance on “hard data” means that probabilities of extreme events are underestimated and those of mundane events overestimated. The simple way to state this is that the system is built to engender overconfidence.²
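Here is a toy illustration of how a model fit to old data can underestimate extremes. Everything in it is assumed for the sake of the sketch: the "world" is a made-up mixture that is calm 95% of the time and wild 5% of the time, the analyst habitually fits a single normal distribution, and the threshold of 6 is arbitrary. It uses only the Python standard library.

```python
import random
from statistics import NormalDist

rng = random.Random(2008)  # fixed seed so the sketch is repeatable


def draw():
    """One observation from a hypothetical world: usually calm, sometimes wild."""
    sigma = 5.0 if rng.random() < 0.05 else 1.0
    return rng.gauss(0.0, sigma)


# The "hard data" at hand, and the habitual model fit to it.
old_data = [draw() for _ in range(500)]
model = NormalDist.from_samples(old_data)

threshold = 6.0  # an arbitrary "extreme event"

# What the fitted model says about exceeding the threshold...
model_tail = 1.0 - model.cdf(threshold)

# ...versus the truth, which we know only because we invented this world.
true_tail = (0.95 * (1.0 - NormalDist(0.0, 1.0).cdf(threshold))
             + 0.05 * (1.0 - NormalDist(0.0, 5.0).cdf(threshold)))

print(f"model: {model_tail:.2e}   truth: {true_tail:.2e}")
```

In this invented world the fitted model puts the extreme event at a far smaller probability than the true one. In real problems there is no `true_tail` to compare against, which is the whole difficulty.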

Decision Analysis

You’re still the doctor and you still have to prescribe D or P (or nothing). No matter what you prescribe something will happen to the patient. What? And when? Perhaps the malady clears up, but how soon? Perhaps the illness is merely mitigated, but by how much? You not only have to figure out what treatment is better, but what will happen if you apply that treatment. This is a very tricky business, and is why, incidentally, there is such a variance in the ability of doctors.³ Part of the problem is explicitly defining what is meant by “the patient improves.” There is ambiguity in that word improve, in what will happen when either of the drugs is administered.

There are two separate questions here: (1) defining events and estimating their probability of occurring and (2) estimating what will happen given those events occur. Going through both of the steps is called computing a risk or decision analysis. This is an enormously broad subject which we won’t do more than touch on, only to show where more uncertainty comes in.

We have already seen that there is ambiguity in computing the probability of events. The more complex these events the more imprecise the estimate. It is also often the case that part (2) of the risk analysis is the most difficult. The events themselves cannot be articulated, either completely or unambiguously. In simple physical systems they often can be, of course, but in complex ones like the climate or ecosystems they are not. Anything involving humans is automatically complex.

Take the current (!) financial crisis as an example. Many of the banks and brokerages failed both to define the events that are now happening and to estimate the extent of the cost of those events. How much will it cost to clean it up? Nobody knows. This is the proper answer. We might be able to bound it—more than half a billion, say—and that might be the best anybody can say (except that I have been asked to pay for it).

Too Much Certainty

What the older statistical methods and the strict reliance on hard data and fancy mathematics have done is to create a system where there is too much certainty when making conclusions about complex events. We should all, always, take any result and realize that it is conditional on everything being just so. We should realize that those just so conditions that obtained in the past might not obtain in the future.

Well, you get the idea. There is already far too much information to assimilate in one reading (I’m probably just as tired of going on and on as you are of reading all this!). As always, discussion is welcome.

—————————
¹Another common way to say infinity is the euphemism “in the long run”. Keynes famously said that “In the long run we shall all be dead.” It’s always been surprising to me that the same people who giggle at this quip ignore its force.

²There is obviously a lot more to say on this subject, but we’ll leave it for another time.

³A whole new field of medicine has emerged to deal with this topic. It is called evidence based medicine. Sounds good, no? What could be wrong with evidence? And it’s not entirely a bad idea, but there is an overreliance on the “hard data” and a belief that only this hard data can answer questions. We have already seen that this cannot be the case.

September 19, 2008 | 5 Comments

Book coming…

I’ve been taking the past few days and building an Index for my “101” book. It is painstaking, meticulous…well, excruciatingly dull work. But it’s nearly done.

This is slowing me down from posting new entries here.

One that I especially want to get to was suggested by reader Mike D and a May issue of the Economist. Can all risks be quantified? Can all probabilities be quantified?

The answer, perhaps surprisingly, is no. It’s surprising if you hang out in statistics and economics departments. Places with far too much math and far too little philosophy.

Anyway, book should be done “soon.” End of the month?

Stay tuned so you can be the first one on your block to own a copy.