# Not all uncertainty can be quantified

(This essay will form, when rewritten more intelligently, part of Chapter 15, the final chapter, of my book. Which is coming… soon? The material below is neither easy nor brief, folks. But it is very important.)

To most of you, what I’m about to say will not be in the least controversial. But to some others, the idea that not all risk and uncertainty can be quantified is somewhat heretical.

However, the first part of my thesis is easily proved; I’ll prove the second part below.

Let some evidence we have collected—never mind how—be E = “Most people enjoy Butterfingers”. We are interested in answering the truth of this statement: A = “Joe enjoys Butterfingers.” We do not know whether A is true or false, and so we will quantify our uncertainty in A using probability, that is written like this

#1    Pr( A | E )

and which reads “The probability that A is true given the evidence E”. (The vertical bar “|” means “given.”)

In English, the word most at least means more than half; it could even mean a lot more than a half, or even nearly all—there is certainly ambiguity in its definition. But since most at least means more than half, we can partially answer our question, which is written like this

#2    0.5 < Pr( A | E ) < 1

and which reads “The probability that A is true is greater than a half but not certain given the evidence E.” This answer is the best we can do with the given evidence.

This answer is a quantification of sorts, but it is not a direct quantification like, say, the answer “The probability that A is true is 0.673.”
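The partial answer in equation #2 can even be computed with, so long as it is carried along as an interval rather than collapsed to a single number. A minimal sketch (the `ProbInterval` class and the Fréchet conjunction bound are my own illustrative choices, not anything from the text):

```python
# Represent a partially quantified probability as an interval, not a point.
# "Most people enjoy Butterfingers" pins Pr(A | E) down only to (0.5, 1).

from dataclasses import dataclass

@dataclass
class ProbInterval:
    lo: float  # greatest lower bound the evidence supports
    hi: float  # least upper bound the evidence supports

    def __and__(self, other):
        # Conjunction bounds: Pr(A & B) can be no higher than either
        # marginal, and no lower than max(0, lo_a + lo_b - 1) (Fréchet).
        return ProbInterval(max(0.0, self.lo + other.lo - 1.0),
                            min(self.hi, other.hi))

# Evidence E = "most people enjoy Butterfingers" gives only:
pr_A_given_E = ProbInterval(0.5, 1.0)
print(pr_A_given_E)  # the honest answer: somewhere above a half
```

The point of the sketch is that the interval never pretends to more precision than the evidence contains; combining two vague statements widens, never narrows, what can honestly be claimed.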

It is because there is ambiguity in the evidence that we cannot completely quantify the uncertainty in A. That is, the inability to articulate the precise definition of “most people” is the reason we cannot exactly quantify the probability of A.

The first person to recognize this, to my knowledge, was John Maynard Keynes in his gorgeous, but now little read, A Treatise on Probability, a book which argued that all probability statements were statements of logic. To Keynes—and to us—all probability is conditional; you cannot have a probability of A, but you can have a probability of A with respect to certain evidence. Change the evidence and you change the probability of A. Stating a probability of A unconditional on any evidence disconnects that statement from reality, so to speak.

## Other Theories of Probability

For many reasons, Keynes’s eminently sensible idea never caught on and instead, around the same time his book was published, probability theory bifurcated into two antithetical paths. The first was called frequentism: probability was defined to be the ratio of the number of experiments in which A is true to the total number of experiments, as that number of experiments goes to infinity1. This definition makes it difficult (an academic word meaning impossible) to answer what the probability is that Joe, our Joe, likes Butterfingers. It also makes it difficult to define the probability for any event or events that are constrained to occur fewer than an infinite number of times (so far, that is every event I know of).
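The limiting-ratio definition can be sketched numerically, which also shows its trouble: at any finite number of experiments the ratio merely wanders near some value, and nothing in the simulation speaks to a single case like Joe's (a seeded toy simulation of my own, not anything from the text):

```python
import random

random.seed(1)

def frequency(p_true, n):
    """Fraction of n simulated 'experiments' in which A comes out true."""
    return sum(random.random() < p_true for _ in range(n)) / n

# The ratio only settles down as n grows; at any finite n it is merely
# an approximation, and frequentism defines probability only in the limit.
for n in (10, 1000, 100_000):
    print(n, frequency(0.7, n))
```

No finite run of this ever *is* the probability under the frequentist definition; and no run of it, however long, tells you about one particular Joe.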

The second branch was subjective Bayesianism. To this group, all probabilities are experiences, feelings that give rise to numbers which are the results of bets you make with yourself or against Mother Nature (nobody makes bets with God anymore). To get the probability of A you poll your inner self, first wondering how you’d feel if A were true, then how you’d feel if A were false. The sort of ratio, or cut point, where you would feel equally good or bad becomes the probability. Subjective Bayesianism, then, was a perfect philosophy of probability for the twentieth century. It spread like mad starting in the late 1970s and still holds sway today; it is even gaining ground on frequentism.

What both of these views have in common is the belief that any statement can be given a precise, quantifiable probability. Frequentism does so by assuming that there always exists a class of events—which is to say, hard data—to which you can compare the A before you. Subjective Bayesianism, as we have seen, can always pull probabilities for any A out of thin air. In every conceivable field, journal articles using these techniques multiply. It doesn’t help that when probability estimates are offered in learned publications, they are usually written in dense mathematical script. Anything that looks so complicated must be right!

## Mathematics

The problem is not that the mathematical theories are wrong; they almost never are. But the fact that the math is right does not imply that it is applicable to any real-world problem.

The math often is applicable, of course; usually for simple problems and in small cases whose results would not be in much dispute even without the use of probability and statistics. Take, for example, a medical trial with two drugs, D and P, given to equal numbers of patients for an explicitly definable disease that is either absent or present. As long as no cheating took place and the two groups of patients were balanced, then if more patients got better using drug D, that drug is probably better. In fact, just knowing that drug D performed better (and no cheating and balance) is evidence enough for a rational person to prefer D over P.

All that probability can do for you in cases like this is to clean up the estimates of how much better D might be than P in new groups of patients. As long as no cheating took place and the patients were balanced, the textbook methods will give you reasonable answers. But suppose the disease the drugs treat is not as simply defined. Let’s write what we just said in mathematical notation so that certain elements become obvious.

#3    Pr ( D > P | Trial Results & No Cheating & Patients Like Before) > 0.5.

This reads, the probability that somebody gets better using drug D rather than P given the raw numbers we had from the old trial (including the old patient characteristics) and that no cheating took place in that trial and the new patients who will use the drugs “look like” the patients from the previous trial, is greater than 50% (and less than certain).
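For a simple, cleanly defined trial, a number for equation #3 can be produced in many ways. One common sketch (my own illustration; the uniform priors, the Monte Carlo approach, and the function name are assumptions, not anything the post prescribes) draws the two cure rates from the trial counts and counts how often D's rate exceeds P's:

```python
import random

random.seed(0)

def pr_D_beats_P(cured_D, n_D, cured_P, n_P, draws=100_000):
    """Monte Carlo estimate of Pr(cure rate of D > cure rate of P | trial),
    drawing each rate from Beta(successes + 1, failures + 1), i.e. a
    uniform prior updated by the trial counts."""
    wins = 0
    for _ in range(draws):
        rate_D = random.betavariate(cured_D + 1, n_D - cured_D + 1)
        rate_P = random.betavariate(cured_P + 1, n_P - cured_P + 1)
        wins += rate_D > rate_P
    return wins / draws

# Say 70 of 100 improved on D, and 55 of 100 on P:
print(pr_D_beats_P(70, 100, 55, 100))  # well above 0.5
```

Note what this computes: only equation #3, conditional on the trial results. Nothing in it touches the "no cheating" or "patients like before" conditions, which is exactly the essay's point.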

Now you can see why I repeatedly emphasized that part of the evidence that usually gets no emphasis: no cheating and patients “like” before. Incidentally, it might appear that I am discussing only medical trials and have lost sight of the original thread. I have not, which will become obvious in a moment.

Suppose the outcome of applying a sophisticated probability algorithm gave us the estimate 0.72 for equation #3. Does having this more precise number help if you are the doctor who has to prescribe either D or P? Assuming that no cheating took place in the old trial, drug D is better if the patient in front of you is “like” the patients from that trial. What is the probability she is (given the information from the old trial)?

The word like is positively loaded with ambiguity. Not to be redundant, but let us write out that last question mathematically.

#4    Pr ( My patient like the others | Patient characteristics from previous trial)

The reason to be verbose in writing out the probability conditions is that it puts the matter starkly. It forces you, unlike the old ways of frequentism and subjective Bayesianism, to specify as completely as possible the circumstances that form your estimate. Since all probability is conditional, it should always be written as such so that it is always seen as such. This is necessary because it is not just the probability from equation #3 that is important; equation #4 is, too. If you are the doctor, you do not—you should not—focus solely on probability #3, because what you really want is this:

#5    Pr ( D > P & My patient like before | Trial Results & No Cheating & Patients Character)

which is just #3 x #4. I am in no way arguing that we should abandon formal statistics, which produces quantifications like equation #3. But I am saying that since, as we already know, exactly quantifying #4 is nearly impossible, we will be too confident of any decision we make if we, as is common, substitute probability #3 for #5, because, no matter what, the probability of #3 and #4 together is never greater than the probability of #3 alone.
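The overconfidence argument is just arithmetic, and worth making explicit (0.72 is the essay's hypothetical value for #3; the 0.8 for #4 is an invented, generous guess):

```python
# Equation #3: Pr(D > P | trial evidence), from the fancy algorithm
p3 = 0.72

# Equation #4: Pr(my patient is "like" the trial patients | characteristics).
# Not exactly quantifiable; suppose, generously, we guess 0.8.
p4 = 0.80

# Equation #5: what the doctor actually needs, #3 x #4
p5 = p3 * p4

print(p5)  # about 0.576, noticeably less than the comforting 0.72
```

Whenever #4 is less than certain, quoting #3 as if it were #5 overstates the evidence, and the gap grows as the "like before" condition gets shakier.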

Appropriate caveats and exceptions are usually delineated in journal articles using the old methods, but they are buried in the text, which causes them to be weighed less heavily than they should be and gives the reader a false sense of security. Because, in the end, we are left with the suitably highlighted number from equation #3, that comforting exact quantification reached by implementing impressive mathematical methods. That final number, which we can now see is not final at all, is tangible, and is held on to doggedly. All the evidence to the right of the bar is forgotten or downplayed because it is difficult to keep in mind.

The result for equation #3 is produced, too, only from the “hard data” of the trial, the actual physical measurements from the patients. These numbers have the happy property that they can be put into spreadsheets and databases. They are real. So real that their importance is magnified far beyond their capacity to provide all the answers. They fool people into thinking that equation #3 is the final answer, which it never is. It is always equation #5 that is important to making new decisions. Sometimes, in simple physical cases, probabilities #3 and #5 are so close as to be practically equal; but when the situation is complex, as it always is when involving humans, these two probabilities are not close.

## Everything That Can Happen

The situation is actually even worse than what we have discussed so far. Probability models, the kind that spit out equation #3, are fit to the “hard data” at hand. The models that are chosen are usually picked because of habit and familiarity, but responsible practitioners also choose models that fit the old data well. This is certainly a rational thing to do. The problem is that probability models are meant to say something about future data, and the old data does not always encompass everything that can happen, so we are limited in what we can say about the future. All we can say for certain is that what has happened before might happen again. But it is anybody’s guess whether what hasn’t happened before might happen in the future.

The probability models fit the old data well, but nobody can ever know how well they will fit future data. The result is that over-reliance on “hard data” means that probabilities of extreme events are underestimated and those of mundane events overestimated. The simple way to state this is that the system is built to engender overconfidence.2
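A stylized way to see that overconfidence (my own toy comparison, not anything from the post): fit a thin-tailed model to a world that is actually heavy-tailed, and the model's extreme-event probabilities come out absurdly small while everyday values look fine.

```python
import math

def normal_tail(z):
    """Pr(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def heavy_tail(z):
    """A heavy-tailed stand-in for reality: survival falling like 1/z^2."""
    return min(1.0, 1.0 / z**2)

z = 4.0  # a "four sigma" event
print(normal_tail(z))  # ~3e-5: the fitted model says nearly impossible
print(heavy_tail(z))   # 0.0625: the heavy-tailed world says one in sixteen
```

The two curves are nearly indistinguishable in the bulk of the old data, which is where the model was fit; only out in the tail, where tomorrow's surprises live, does the thousand-fold underestimate appear.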

## Decision Analysis

You’re still the doctor and you still have to prescribe D or P (or nothing). No matter what you prescribe, something will happen to the patient. What? And when? Perhaps the malady clears up, but how soon? Perhaps the illness is merely mitigated, but by how much? You not only have to figure out which treatment is better, but what will happen if you apply that treatment. This is a very tricky business, and is why, incidentally, there is such variance in the ability of doctors.3 Part of the problem is explicitly defining what is meant by “the patient improves.” There is ambiguity in that word improve, in what will happen when either of the drugs is administered.

There are two separate questions here: (1) defining events and estimating their probability of occurring and (2) estimating what will happen given those events occur. Going through both of the steps is called computing a risk or decision analysis. This is an enormously broad subject which we won’t do more than touch on, only to show where more uncertainty comes in.

We have already seen that there is ambiguity in computing the probability of events; the more complex the events, the more imprecise the estimate. It is also often the case that part (2) of the risk analysis is the most difficult: the events themselves cannot be articulated, either completely or unambiguously. In simple physical systems they often can be, of course, but in complex ones like the climate or ecosystems they cannot. Anything involving humans is automatically complex.
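Step (2) can be sketched as an expected-outcome calculation over the enumerated events. The essay's point holds here too: every probability and every "benefit" score below is a judgment call (all numbers invented for illustration):

```python
# A toy decision analysis: expected benefit of each treatment.
# Both the event probabilities and the 'benefit' scores are
# judgment calls -- precisely the ambiguity under discussion.

treatments = {
    "D": [(0.60, 1.0),   # (probability, benefit): full recovery
          (0.30, 0.4),   # partial mitigation
          (0.10, 0.0)],  # no effect
    "P": [(0.45, 1.0),
          (0.35, 0.4),
          (0.20, 0.0)],
}

def expected_benefit(outcomes):
    """Probability-weighted benefit over the enumerated events."""
    return sum(p * benefit for p, benefit in outcomes)

for name, outcomes in treatments.items():
    print(name, round(expected_benefit(outcomes), 3))
```

The machinery is trivial; the hard part, which no spreadsheet shows, is whether the listed events are the events that can actually happen, and whether "benefit 0.4" means anything precise.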

Take the current (!) financial crisis as an example. Many of the banks and brokerages failed both to define the events that are now happening and to gauge the extent of the cost of those events. How much will it cost to clean it up? Nobody knows. This is the proper answer. We might be able to bound it—more than half a billion, say—and that might be the best anybody can say (except that I have been asked to pay for it).

## Too Much Certainty

What the older statistical methods and the strict reliance on hard data and fancy mathematics have done is to create a system in which there is too much certainty when making conclusions about complex events. We should all, always, take any result and realize that it is conditional on everything being just so. We should realize that the just so conditions that obtained in the past might not obtain in the future.

Well, you get the idea. There is already far too much information to assimilate in one reading (I’m probably just as tired of going on and on as you are of reading all this!). As always, discussion is welcome.

—————————
1Another common way to say infinity is the euphemism “in the long run”. Keynes famously said that “In the long run we are all dead.” It has always been surprising to me that the same people who giggle at this quip ignore its force.

2There is obviously a lot more to say on this subject, but we’ll leave it for another time.

3A whole new field of medicine has emerged to deal with this topic. It is called evidence based medicine. Sounds good, no? What could be wrong with evidence? And it’s not entirely a bad idea, but there is an over-reliance on the “hard data” and a belief that only this hard data can answer questions. We have already seen that this cannot be the case.

We have an engineer here who insists that if customer X implements action Y then X will save 44% on their monthly bill. No matter how much we explain conditions, demand curves, etc., he just thinks the savings is 44% all of the time, every time. It’s like negotiating with a textbook.

2. Interesting categories for this post. . .

But yes. The answer we get for analysis depends strongly on

a) the constraints we put on our analysis and
b) what evidence we accept as valid.

This is true both when computing probabilities and in other areas!

Oddly enough, in many of the climate change statistical arguments, the discussion of what is accepted as evidence is made as obscure as possible; the math is highlighted. Those unfamiliar with the math think the math is the key thing. However, it’s the assumptions, constraints and evidence accepted as valid that often drive the answer.

3. Raphael says:

In English, the word most at least means more than half; it could even mean a lot more than a half, or even nearly all—there is certainly ambiguity in its definition.

Given the context that 10 people enjoy Butterfingers, 8 people enjoy Snickers, and 6 people enjoy Heath, in English, we can say, “Most people enjoy Butterfingers,” and your definition does not hold.

If most refers to only two groups, as would be implied by your definition, such as those who enjoy Butterfingers and those who do not, in English, we use more instead.

4. Steve Hempell says:

Matt,

OT sort of but I didn’t want to fill up your e-mail. Thanks for the reply on AUC. Have you seen the Wavelet analysis of SSN on

Any idea what it really means?

5. Lucia,

Almost all climate discussions concern #1: the Pr(global warming | various evidence) which is understandably interesting. But behind the discussions is another assumption, namely #2: the Pr(the future will be bad | global warming actually happens). That second conditional probability is rarely discussed. Based on numerous dire reports, like rising sea levels, spread of tropical diseases, crop failures, increase in storms, extinction of polar bears, etc., most people have assumed that #2 has a 100% probability, or at least is greater than 0.5.

In some ways, assumption #2 is necessary to make discussion about #1 interesting. Without #2, who cares about #1? Even less frequently considered is #3: the Pr(the future will be good | global warming actually happens).

#3 is my favorite rejoinder to discussions of #1. I phrase #3 as “warmer is better.” People don’t like it, though, because it takes the air out of discussions of #1. People would rather be worried about something than not worried, or more properly, if people are not worried then they don’t get involved in #1 discussions. What would be the point? It’s just a waste of time. Or an academic pursuit, which could be viewed as much the same thing.

Of course, bad futures and good futures are highly subjective, even when the future becomes the present. Was today a bad day or a good one? That depends on your situation and point of view. It is another conditional, #4: Pr(future good (or bad) | individual attitudes).

That last one, #4, conditioned on attitudes, can be very upsetting. Some people have lousy attitudes no matter what, and some people just don’t care one way or the other. A handful of people (definitely not most) are happy regardless of whatever happens. We all know how annoying those latter people are. That’s probably why they are so few in number; group one and two are always trying to convert the foolishly happy people to either the lousy attitude or just don’t care factions.

Putting 1-4 together, we ask what is the Pr(people will be happy | any future) and are forced to conclude it’s a very small number. But we can’t be certain.

My point, and it may not have anything to do with Brigg’s lovely exposition above, is that the future is largely unknown except in that it will be what we make of it. Oh yes, and also that warmer is better.

6. PS — Raphael you got it just backwards. In your 10-8-6 example, we say more people enjoy Butterfingers than enjoy other candy bars. Only when more than 50% prefer a single brand do we say that most people enjoy Butterfingers.

7. Briggs says:

Raphael,

I’m with Mike. Your example is ambiguous. It can read either “10 out of 24 people prefer Butterfingers” or “all 10 people enjoy Butterfingers, and 8 of them enjoy etc.”

In the first case most people do not prefer Butterfingers; in the second case they do. I still claim that “most people” means “more than half,” though it does carry with it the unstated assumption that I am talking about a fixed population of people.

But I can fix my example. Change the evidence to E = “10 out of a group of fewer than 20 people enjoy Butterfingers”; then Pr( A | E ) > 0.5, but you can’t do better.

8. Nigel Mellish says:

Wow, what an amazingly bad summary of Bayesian Subjectivism. It’s pretty much just ad hominem.

It’s kind of feeding the troll, I know, but I have to ask what your feelings are about Bayesian Objectivism?

9. Joy says:

Briggs:
Thank you for this very clear explanation. At the risk of being called a dunce, please could you answer a very sincere question, although you are free to laugh, I won’t notice. Was that example (Butterfingers) an example of uncertainty being quantified? Or was this a demonstration of uncertainty? I assume the latter.
“What do you think about evidence based practice?” I thought this was the most ridiculous question I had ever heard. On thinking for a second the penny dropped that this was a catch phrase that had hitherto eluded me. Wasn’t this something that we had all been doing all along? I think I got away with it.
Medicine is one example, but in physio the evidence either way is rarely compelling. The human body does not behave like a machine or computer but many still refuse to recognise this.
In my area of interest, chronic pain, there is even more uncertainty and much speculation. I have long suspected that the power of placebo is underestimated.
If “most” and “more” are ambiguous, “better” is even worse!
“On a scale of 1 to 10, if 0 is no pain and 10 is having your leg sawn off without anaesthetic, where is your pain on that scale?” The answer to this depends on countless factors and only one of these factors has anything to do with pain. The interpretation of the answer is open to yet more speculation. Medicine can never be free from subjectivity as long as humans are involved.
It’s high time professional modellers of climate were indeed introspective and self-critical in their own practice.
Mike D:
Well said as always, although not convinced about the “happy foolish” people and the “future is in our hands”.

10. Joy says:

Never mind, I’ve read again and see that it is a quantification of sorts. Sorry.

11. Bob North says:

Raphael, Briggs, Mike D. –

Most is a funny word which can be used in multiple ways. In the context used originally by Briggs and in the example given by Raphael, it definitely means at least “more than half” or possibly “much more than half.” However, if the 24 people in the example above had voted on their favorite candy bar, we would be correct in saying “Butterfingers received the most votes.” Don’t get stuck on a single definition of the word without understanding the context in which it is used.

Lucia – you are absolutely right that too many people lose sight of the underlying assumptions and constraints of an analysis, as well as the validity of the evidence, and think that it is the math that drives the answer. It is something I would call quantitative bias. No matter how much I interpolate, extrapolate, estimate from proxies, or manipulate data, if I can come up with a number at the end, many people are more likely to accept it as more valid than evidence which can’t readily be quantified and is thus deemed “anecdotal” evidence.

William – thank you for this most interesting post.

12. Briggs–
One of the difficulties is there is a difference between use of “most” and “the most”:

“Joe ate the most pie” doesn’t mean he ate more than half the pie. “Joe ate most of the pie” does mean he ate more than half the pie.

Also, the first online dictionary I consulted agrees with Raphael’s use of “most” in the butterfinger example. See most. The first definition is:

# Greatest in number: won the most votes.

So, if there are three candidates, and one gets 49% of the votes, one gets 48% of the votes and one gets 3%, the one with the most votes wins. That “most” is 49%, which is less than half!

On the other hand, further down on the page I link, they do include the “more than half” definition.

Merriam Webster has most:

1 : greatest in quantity, extent, or degree 2 : the majority of

The first definition would agree with Raphael’s butterfinger example and the voting example. The second definition agrees with you!

So, it appears “most” may mean either thing. I cannot begin to guess which way is used by most people. 🙂

MikeD–

Almost all climate discussions concern #1: the Pr(global warming | various evidence) which is understandably interesting.

Even on this first question, people often make their choice of “various evidence” rather obscure. The choice of evidence, and a number of other assumptions make a large difference to the outcome of an analysis.

Often, the issues surrounding choice of evidence can be understood by people who are relatively unfamiliar with the mathematics or precise terminology of statistics.

Lots of the hockey-stick argument relates to choice of proxies. How do you decide which proxies should be used in a PCA analysis? (My Answer: Beats me!)

I’ve posted discussions on whether
a) we should estimate variability in trends on the real earth based on the variability of trends predicted by models of the earth and
b) whether we should estimate uncertainty in trends for the past 10 or so years using variability from a period which contains the largest volcano eruption of the century as well as a number of other large ones.

If we estimate variability based on earth data, we get different answers from estimates based on model data. If we use periods with no Pinatubo / Fuego / El Chicon eruption, we get different estimates. These matter when we are trying to determine whether a projection of “2C/century trend plus earth noise” is consistent with earth data.

The real arguments are often not over the math. They are over what we accept as evidence when estimating p(A | evidence).

There are actually many choices of evidence underlying every probability assumption for even the narrowest questions in AGW (or science in general.)

Of course, Matt Briggs may be about to discuss the later problems you number 2,3,4 etc. rather than 1.

13. Briggs says:

Nigel,

Well, I’ll tell you what. Why don’t you offer us a clearer explanation and we’ll discuss.

I will admit to taking liberties with my definition, but not so many that it’s an unfair portrayal. I’d be interested to see if any subjectivist can show exactly where it’s wrong. Actually, maybe I should start a new thread for this since it would take us off topic here.

For your second question, I’ll say that most or nearly all of the math between subjective Bayesianism and objective Bayesianism is identical. We differ mainly in the axioms. And, of course, how probabilities are interpreted.

An example of how prior beliefs influence the debate on global warming is the debate about whether or not the lack of warming over the last decade has any statistical significance.

The AGW consensus believes (based on what they accept as accepted science) that because of high and increasing levels of CO2 the probability that there is an underlying trend of warming of at least 0.2C per decade is very high.

The data over the last decade shows that there has been no significant warming at all, but if you believe that there is a real trend then it is easy to think it very likely that the current lack of trend is just “noise”. That will appear to you to be much more likely than the science of greenhouse gasses and the computer climate models being in error. You will be frustrated that sceptics want to throw out the science based on data from a relatively short period of time.

Lucia on her Blackboard blog has shown that the hypothesis that the trend since 2001 is 0.2C per decade can be rejected at the 95% level. However if you believe that the science tells us that the probability that the trend really is that high has a value of 0.99 then you will reject Lucia’s analysis as unimportant. You will believe that the less than 5% chance of the trend existing is more likely than the 1% chance of the climate models being wrong.

15. Patrick–
But what I show does depend on certain assumptions. Early on, I used only AR(1), but done two ways. AR(1) seemed acceptable to ‘others’ back when it ‘proved’ a statistically significant warming trend since 2000, but was decreed flawed when it ceased to show any such thing, and began to show that 2C/century is excluded based on data since 2001. (My 2001 date selected based on publication dates on documents.)

So, yes, it appears that people’s assessment of whether or not a particular set of assumptions is sound, or whether evidence is convincing seems to change based on the result. Supposedly, this is not supposed to happen in science. 🙂

Still, AR(1) might not be perfect– (though using only recent data, you really can’t distinguish between AR(1) and other models. You need more– and for a number of reasons, we could argue more doesn’t help all that much either.)

Since AR(1) could be imperfect (and quite likely is) I’ve been trying various different models– searching both through “model data” and considering the effect of measurement artifacts.

This week, I’m trying to go through the various permutations of answers we get with the ARMA(1,1) model Tamino recently suggested (which actually coincides with the AR(1)+white noise model I had discussed and applied in an earlier post). The answers depend on
a) what we accept as evidence and
b) the choice of statistical model.

Right now it looks like the largest difference depends on what we take as evidence for the amount of variability. That’s the topic Briggs is discussing– though I have reason to believe he is thinking of my “volcano argument” in particular.

Still it fits in the larger point– the estimate of the probability is conditional on what you accept as evidence (and a number of other things.)
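For readers following along, the dependence on the noise model can be made concrete with the textbook effective-sample-size rule of thumb for AR(1) errors (a standard approximation, not Lucia's actual calculation):

```python
import math

def trend_se_inflation(rho):
    """Factor by which AR(1) autocorrelation rho inflates the naive
    (white-noise) standard error of an estimated trend, via the
    effective-sample-size approximation n_eff = n * (1 - rho) / (1 + rho)."""
    return math.sqrt((1.0 + rho) / (1.0 - rho))

# The same data, under different assumed noise models:
for rho in (0.0, 0.5, 0.9):
    print(rho, round(trend_se_inflation(rho), 2))
```

With rho = 0.9 the standard error more than quadruples relative to white noise, so the same observed trend can look decisive or insignificant depending entirely on which noise model you accept as evidence.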

So far, I haven’t resorted to Bayes Law in any blog post.:)

BTW Patrick — If I adopt the point of view of an engineer monitoring a tank of fluid as I heat and mix it, I believe there is an “underlying uptrend” that is masked by “random fluctuations” bringing cooler eddies of fluid near my thermometer. (Apologies to Briggs for the “underlying trend”…)

That said, at this point, I’m pretty sure it ain’t 2C/century. I’ve looked at this several ways I haven’t blogged about. But, right now, it would take a lot to convince me there is no uptrend at all. (In fact, I’d bet a plate of brownies on that!)

16. Raphael says:

Re: Most

Most is the superlative of some, much or many. It is used just like every other superlative– when comparing more than two things. When comparing two things, we use a comparative.

Both a comparative and a superlative denote a thing that has the transcendent quality among those things being compared.

Thus:

Mike D, you have it backwards.

Briggs, my point was that most is even more ambiguous than you led people to believe. You seem to be stuck on the assumption that most means one particular thing. It does not. It is a superlative, and without context we should not assume a specific definition. (In particular a definition which would grammatically use a comparative rather than a superlative.)

Bob North, definitions need not apply. I try not to get too tied up in definitions. Definitions are subject to rapid change. Rules for grammar, on the other hand, can change, but that takes time, lots of time.

Lucia, I blame being annoyed by the use of most on you. 🙂

17. Raphael says:

Re Mike D having it backwards. That wasn’t quite accurate.

In your 10-8-6 example, we say more people enjoy Butterfingers than enjoy other candy bars.

Yes that is another grammatical way of describing a superlative. example: The tallest tree is taller than any other tree.

Only when more than 50% prefer a single brand do we say that most people enjoy Butterfingers.

The use of only is wrong. 🙂

18. Thanks for using drug trials as your example. I am an individual who suffers from bipolar disorder. Over the years I have become exasperated (mildly stated) by the medical profession’s tendency to push drugs at us which don’t work well because (a) the medical model used when designing the drugs is incomplete and therefore (b) the assignation of a probability close to 1 to the statement “my patient is like those in the drug trial” is likely to be inaccurate, since it is not even clear what factors the patients are supposed to have in common.

Even the hard data from drug trials is not compelling: the best drugs give statistically significant results, but usually help less than 50% of the persons in the trial.

Nevertheless the psychiatrists are quite happy to very confidently offer the drugs to the patient (i.e., us bipolar persons) with no warning that the drugs may not work, leaving us with hopes and expectations which are likely to be dashed, and a distinct sense that perhaps it is our fault that things went wrong, because the drug should be working, because the psychiatrist was so confident about it.

Treating patients like this does not get them better.

19. Briggs says:

Raphael,

You have convinced me. In the book, the example is now:

E = “At least 10 out of 20 people prefer Butterfingers”

I think this is better:

E = “At least half of all people prefer Butterfingers”

But I will also keep the “most” example because the idea of it was to show ambiguity in definitions and meaning of evidence.

Thanks.

Jinnah,

Thanks for letting us know about your experiences. Your example about statistically significant results not being any indication of what fraction of patients will see a benefit is excellent (unfortunately for you).

20. Joy says:

Jinna,

I agree. I am not a Doctor, but a physio, and I observe the following:
Just as no two faces are the same, no two patients’ conditions are the same, ever.
We are all different. The closer the clinician is to any given study or drug, the more likely they will be to overlook this, either due to academic necessity or over-enthusiasm in the noble quest to find an answer. Treating patients is a humbling experience. Staff are there to make patients feel better, not the other way round.

You are right in your summary that so often the patient is made to feel guilty for not responding. Some Orthopaedic surgeons and Neurosurgeons, for example, still blame patients for having pain despite having had surgery. “The disc is gone, now you cannot possibly still be in pain”…“so why does it hurt more than before?” Your experience is widespread in medicine and the professions allied to it. So while some clinicians blame the patients, the irony is that some clinicians blame themselves, and some patients blame the clinician!
So we’re all blaming each other. Matt Briggs blames the statistics!
What advice would I give to your consultant? “Historically the patient is often right, so listen very, very carefully, Sir.”
Joy

21. Joy says:

…and sorry Jinnah I spelt your name wrong.

22. JH says:

“Matt Briggs blames the statistics!”

No kidding. In this case, as a fellow statistician, I am sorry that Briggs is a statistician, and would like nothing more than to swear!

F^@#$%^&*()*^%$#@ (I don’t swear well, and English is not my best language). Ha.

Yeah, it’s not my fault that I fail to recognize my limited abilities to make accurate uncertainty specifications.

Well, I always blame Microsoft Word for my grammatical errors.

23. Raphael says:

Briggs,

I don’t think you need to change the example. All that would need to be done is to rewrite,

“In English, the word most at least means more than half; it could even mean a lot more than a half, or even nearly all—there is certainly ambiguity in its definition. But since most at least means more than half, we can partially answer our question, which is written like this,”

to specify the restriction of “most” to a specific definition.

24. Joy says:

JHon:
So who do you blame, JHon?

25. JH says:

Joy, Always, always, there is no one else to blame but my spouse!

I just don’t see how you reached the conclusion that “statistics” is to be blamed. I hope you understand why I, a statistician, was defensive about such a statement.

26. Joy says:

JHon:

I’m glad you said you were cross, I was getting that feeling although I wasn’t sure why, honest.
Would it help if I said misuse of statistics?
Or a misunderstanding of the limits of statistics?
Or foolish physios and Doctors using tools that they don’t understand when they involve themselves in statistics? That’s the sort of thing I meant.

I did not blame statistics. I deliberately left my own opinion out of that list; it wasn’t about me.
If a patient does not respond to treatment, the first person I blame is myself.
You are right to blame Microsoft Word for your grammatical errors. Don’t rely on it.
Hope that helps you feel less cross. (I’d put a smiley but I don’t know how and the text ones look naff.)

27. Briggs says:

C’mon, JH, you know I’m right.

We statisticians, particularly those of us who are academics, often give ourselves far too much credit. We concentrate too much on the methods we know, because, after all, we know them and can test others for their knowledge.

But while doing this, while artfully calculating a nifty new saddlepoint approximation, we often lose sight of the original goal of any analysis. We substitute our faith in the mathematics of the methods for confidence that the final results are useful in real life.

We’d do better in our classes if we tossed out teaching yet another regression method (say, ANOVA), and instead taught how to be skeptical.

28. JH says:

Briggs,

Busy Thursday! D^^**!!!%%% (I hope this doesn’t read as “I am cross” due to possible malfunction of internet transmission)!

I agree with you about the quantification of uncertainty. I don’t disagree with most of your opinions regarding Statisticians or the mathematics simply because I am an open-minded skeptic (not being ironic here, you know I am).

Without the uncertainty, there wouldn’t be the science of probability and statistics. Do most statisticians appreciate the “beauty” of uncertainty? I know I do.

Giving ourselves too much credit as statisticians? I don’t know.

However, I can tell you this. I believe that, say, our well-known advisors deserve all the credit they wish to have. They have enjoyed and devoted their lives to statistics research. They helped us (at least me, let me speak for myself only) learn and get a Permanent Head Damage. Of course, they have done more good than damage (if any at all).

Oh my, this just occurred to me. My advisor guided me through my first rewarding journey “to explore strange new worlds, …, to boldly go where no man has gone before.” (Yeah, I love all the Star Trek Captains.)

If one loses sight of the original goal of any analysis, that’s unfortunate.

Whatever math methods academic statisticians utilize or devise, the goal is mostly the same as what we all wish to do in our lives – to better understand and quantify uncertainty. And “quantification” often involves mathematics.

Just because the math is seemingly useless now, it doesn’t mean that it won’t be applicable in the future. And the math sometimes can provide necessary insights as to why a solution or idea works and how it can be applied.

Teaching students how to be skeptical? Ideally, Yes.

Well, a skeptic who doesn’t know the basic tools or understand the uncertainty? No. So if I can help them understand what uncertainty is (and learn how to construct simple graphs and summary statistics and… love the magic of those three dots) in an intro stat course, I consider it a “job well done.”

This is too long and boring. Don’t enjoy being serious at all.

LOL and Peace.

29. Briggs says:

Oh, JH, you know I love you. But tell me this.

What exact passage, or equation, in this post do you disagree with?

I’m sure you’d agree that we teach how to get answers to eq. #3 but rarely emphasize that we really need eq. #5.

I know you’re a fan of frequentism, but can you use it for #1?

I do point out that statisticians include the appropriate caveats to analysis in their papers, but I think you’ll have to agree, especially in classical analysis, people latch on to the p-value and minimize or forget everything else.

How about the model selection part? (I didn’t call it that, of course.) Where we fit to the data at hand but obviously cannot fit to data we don’t have, and so our models are routinely “overdispersed” (a technical term meaning too certain).
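(A toy sketch of that last point, with made-up data: a model that fits the sample in hand perfectly looks maximally certain, yet can go badly wrong on the data we don’t have.)

```python
import numpy as np

# Made-up data, purely for illustration: 8 observed points from a sine
# curve, fit with a degree-7 polynomial that passes through every point.
x_obs = np.arange(8.0)
y_obs = np.sin(x_obs)

coeffs = np.polyfit(x_obs, y_obs, deg=7)  # interpolates all 8 points

# In-sample, the fit is essentially perfect: maximal apparent certainty.
in_err = np.max(np.abs(np.polyval(coeffs, x_obs) - y_obs))

# On points we did not observe, the same model misses badly.
x_new = np.array([8.0, 9.0])
out_err = np.max(np.abs(np.polyval(coeffs, x_new) - np.sin(x_new)))

print(in_err)   # essentially zero
print(out_err)  # far larger: the certainty was an artifact of the fit
```

The in-sample residual says nothing about how the model behaves off the sample, which is exactly where predictions live.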

30. Joy says:

Sorry about the ‘on’, it’s my speech software, it’s not to be trusted, it is mischievous at times. It no doubt seeks revenge, as I have in the past been known to deliberately misspell words and names.
So you like playing with and crunching numbers but take no responsibility for any such numerical mastication? In the matter of your point about appreciation of the beauty of uncertainty etc.:
Uncertainty is the real world. It’s the difference between theory and practice.
Without patients there would be no Medics. There is a well-known saying in Physio that goes “everyone loves to talk about physio but no-one actually likes doing it”.
When you use words like “no kidding” and “I would love nothing more than to swear” and “I hope you know why I… am defensive” maybe you can see why I thought you were cross! It doesn’t take much at times. Some professional statisticians are so used to being right all the time. It’s a shock to their delicate constitution when there is malfunction.
I loved Tinkerbell and hated Star Trek; still do. That’s where I’m going wrong. It was that awful, ill-considered, mustard-coloured thing the captain used to wear, every day, and forever!

32. JH says:

Finding an exact passage that I disagree with? I understand what your viewpoints are, and that’s it. I usually don’t disagree with people, unless I know for sure they are wrong.

Your criticism of classical approaches is not unjustified. Too much credit, or too much certainty in our statistical analyses? Over-dispersed models by the “modelers”? I wish we all were perfect geniuses. The Bayesian method has its own issues too, e.g., prior specification. There are definite benefits to carrying out a careful Bayesian statistical analysis. Bayesian or not, we all must be modest in the claims that we make.

I got an idea, maybe I should have a sign on my office door stating “buyers, beware: over-dispersed conclusions, adopt with great discretion.”

I can tell you what I really really appreciate about academic statisticians… their new ideas and reasoning in their research.

Jack Nicholson’s performance in the movie “A Few Good Men” was brilliant, especially when he said

“Y o u C a n ’ t H a n d l e t h e T r u t h.”

So, maybe I just cannot accept the possibility of Bayesian analysis being better or classical analysis being wrong. Oh, maybe Luis’s point that the truth is in the middle is correct.

33. JH says:

Joy, my daughter loves Tinkerbell too. Have a great weekend.

34. MrPete says:

I love where you’re going with this!

I’m looking for more real-world examples of your three-part uncertainty model (data, model, parameters) that would make sense to “normal” folks.

The medical example is of interest to me as well.

I had an apparent medical problem (suddenly unconscious for a minute or two in various situations).

Many tests later, a neurologist confidently concluded: “you have a seizure disorder.”

Before going on the blithely-recommended meds (“MrPete, it’s very safe, only 25% of patients find it affects their ability to think!” “Guess what I do for a living!”), I got a second opinion.

A sleep doc was quite confident I had a sleep disorder of some kind. “We’ve found 93 sleep disorders. You have one of these five: A,B,C,D,E. Testing will show us which one. We can treat them all.”

Testing showed that I do have a sleep disorder. But it is not one of the five, nor indeed any of the 93. So they give it a name: idiopathic. Any time you hear that word from a doctor, you’re hearing confidence about the unknown :).

Sleep doc: good data, good model, still unable to interpret.

Conclusion #1: people fit the data to the models they know.
Conclusion #2: we know a lot less than we think.

No surprise here, of course.

A related story that I find fascinating: the med I take to alleviate my symptoms has a completely unknown mechanism of action. Yet it is proven effective. A good example of how useful it can be to have limited knowledge.

The ironic conclusion:
* I have a medical condition (yet we don’t know what it really is).
* My symptoms are treated by a medication (yet we don’t know how it works).
* But we can scientifically prove both of the above are real, not imaginary.

🙂

35. Briggs says:

Petey,

Does this happen when you’re being lectured by a loved one? I have similar symptoms.