Teaching Journal: Day 4

Today is the quietest day, a time when all is still, a moment when nary a voice is raised and, quite suddenly, appointments are remembered, people have to be seen, the room empties. Because this is the day I introduce the classical confidence interval, a creation so curious that I have yet to have a frequentist stick around to defend it.

Up until now we have specified the evidence, or premises, we used (“All Martians wear hats…”) and this evidence has let us deduce the probabilities of the conclusions (which we have also specified, and always will, and always must; e.g. “George wears a hat”).

But sometimes we are not able to use the premises (data, evidence) in a direct way. We still follow the rules and dictates of logic, of course, but sometimes the evidence is not as clear as it was when we learned that “Most Martians wear hats.”

The game of petanque is played by drawing a small circle into which one steps, keeping both feet firmly planted. A small wooden ball called a cochonette is tossed 6 to 10 meters downstream. Then opposing teams take turns throwing manly steel balls, or boules, towards the cochonette trying to get as close as possible to it. It is not unlike the Italian game of bocce, which uses meek wooden balls.

Now I am interested in the distance the boule will be from the cochonette. I do not know, before I throw, what this distance will be. I therefore want to use probability to quantify my uncertainty in this distance. I needn’t do this in any formal way. I can, as all people do, use my experience in playing and make rough guesses. “It’s pretty likely, given all the games I have seen, the boule will be within 1 meter of the cochonette.” Notice the clause “given all the games I have seen”, a clause which must always appear in any judgment of certainty or uncertainty, as we have already seen.

But I can do this more formally and use a store-bought probability distribution to quantify my uncertainty. How about the normal? Well, why not. Everybody else uses it, despite its many manifest flaws. So we’ll use it too. That I’m using it and accepting it as a true representation of my uncertainty is just another premise which I list. Since we always must list such premises, there is nothing wrong so far.

The normal distribution requires two parameters, two numbers which must be plugged in else we cannot do any calculations. These are the “m = central parameter” and the “s = spread parameter.” Sometimes these are mistakenly called the “mean” and “standard deviation.” These latter two objects are not parameters, but are functions of other numbers. For example, everybody knows how to calculate a numerical mean; that is just a function of numbers.

Now I can add to my list of premises values for m and s. Why not? I already, quite arbitrarily, added the normal distribution to the list. Might as well just plug in values for m and s, too. That is certainly legitimate. Or you can act like a classical statistician and go out and “collect data.”

This would be in the form of actual measurements of actual distances. Suppose I collect three such measurements: 4cm, -15cm, 1cm. This list of measurements is just another premise, added to the list. A frequentist statistician would say to himself, “Well, why don’t I use the mean of these numbers as my guess for m?” And of course he may do this. This becomes another premise. He will then say, “As long as I’m at it, why don’t I use the standard deviation of these numbers as my guess for s?” Yet another premise. And why, I insist, not.
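A minimal sketch of those two guesses, using only Python's standard library. The variable names are mine, and the n - 1 divisor for the standard deviation is an assumption; it is the choice that yields a spread of about 10.2cm for these three numbers.

```python
import statistics

# The three measured distances from the example, in centimetres.
distances_cm = [4, -15, 1]

# Guess for m: the arithmetic mean of the measurements.
m_guess = statistics.mean(distances_cm)

# Guess for s: the sample standard deviation (n - 1 divisor, an assumption).
s_guess = statistics.stdev(distances_cm)

print(f"guess for m: {m_guess:.1f} cm")  # -3.3 cm
print(f"guess for s: {s_guess:.1f} cm")  # 10.2 cm
```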

We at least see how the mistake arises from calling the parameters by the names of their guesses. Understandable. Anyway, once we have these guesses (and any will do) we can plug them into our normal distribution and calculate probabilities. Well, only some probabilities. The normal always—as in always—gives 0 probabilities for actual observable (singular) events. But skip that. We have our guesses and we can calculate.
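The point about zero probabilities can be checked directly. This is only a sketch: it plugs the rounded mean and standard deviation of the three measurements into the standard library's `NormalDist`, and the helper `prob_between` is a name invented here for illustration.

```python
from statistics import NormalDist

# Rounded guesses from the three measurements (cm): mean and standard deviation.
uncertainty = NormalDist(mu=-3.3, sigma=10.2)

def prob_between(low_cm, high_cm):
    """Probability the normal assigns to the interval [low_cm, high_cm]."""
    return uncertainty.cdf(high_cm) - uncertainty.cdf(low_cm)

# An interval of distances gets a positive probability...
print(prob_between(-5, 5))

# ...but the window around a single value, say 1 cm, shrinks toward nothing.
for half_width in (1.0, 0.1, 0.001):
    print(half_width, prob_between(1 - half_width, 1 + half_width))

# The singular event "the distance is exactly 1 cm" gets probability 0.
print(prob_between(1, 1))
```

Only intervals of distances receive a positive probability; any single observable value receives none.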

The frequentist statistician then begins to have pangs of (let us say) conscience. He doubts whether m really does equal -3.3cm (as it does here) and whether s really does equal 10.2cm (as it does here). After all, three data points isn’t very many. Collecting more data would probably (given his experience) change these guesses. But he hasn’t collected more data: he just has these three. So he derives a statement of the “uncertainty” he has in the guesses as estimates of the real m and s. He calls this statement a “95% confidence interval.” That 95% has been dictated by God Himself. It cannot be questioned.

Now the confidence interval is just another function of the data, the form of which is utterly uninteresting. In this example, it gives us (-10cm to 3.3cm). What you must never say, what is forbidden by frequentist theory, is to say anything like this, “There is a 95% chance (or so) that the true value of m lies in this confidence interval.” No, no, no. This is disallowed. It is anathema. The reason for this proscription has to do with the frequentist definition of probability, which always involves limits.
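The text above does not say which function of the data produced its interval. For comparison only, here is a sketch of the usual textbook Student-t interval for a mean, which is an assumption about the intended form. The critical value 4.303 (two-sided 95%, 2 degrees of freedom) is read from a standard t table. For these three numbers this formula gives a much wider interval than the one quoted above.

```python
import math
import statistics

distances_cm = [4, -15, 1]
n = len(distances_cm)

mean = statistics.mean(distances_cm)
sd = statistics.stdev(distances_cm)  # n - 1 divisor

# Assumption: two-sided 95% Student-t critical value for n - 1 = 2
# degrees of freedom, taken from a standard t table.
T_CRIT_95_DF2 = 4.303

# Half-width of the interval: critical value times the standard error.
half_width = T_CRIT_95_DF2 * sd / math.sqrt(n)
ci_low, ci_high = mean - half_width, mean + half_width

print(f"95% t-interval for m: ({ci_low:.1f} cm, {ci_high:.1f} cm)")
# 95% t-interval for m: (-28.7 cm, 22.0 cm)
```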

The real definition of the CI is this: if I were to repeat this experiment (where I measured three numbers) an infinite number of times, and for each repetition I calculated a guess for m and a confidence interval for this guess, and then I kept track of all these confidence intervals (all of them), then 95% of them (after I got to infinity) would “cover”, or contain, the real value of m. Stop short of infinity, then I can say nothing.
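A finite simulation can only gesture at this definition, since it must stop short of infinity, but it does show the bookkeeping. The sketch below assumes a Student-t interval and invents a "real" m and s purely for illustration. It repeats the three-measurement experiment many times and counts how often the interval covers the real m.

```python
import math
import random
import statistics

# Hypothetical "real" parameter values, chosen only for illustration.
TRUE_M, TRUE_S = 0.0, 10.0
N_PER_EXPERIMENT = 3
T_CRIT_95_DF2 = 4.303  # two-sided 95% Student-t critical value, 2 df
REPETITIONS = 20_000   # finite, so only a stand-in for "infinity"

rng = random.Random(4)  # fixed seed so the run is repeatable

covered = 0
for _ in range(REPETITIONS):
    # One repetition of the experiment: three simulated measurements.
    sample = [rng.gauss(TRUE_M, TRUE_S) for _ in range(N_PER_EXPERIMENT)]
    mean = statistics.mean(sample)
    half_width = (T_CRIT_95_DF2 * statistics.stdev(sample)
                  / math.sqrt(N_PER_EXPERIMENT))
    # Does this repetition's interval contain the real m?
    if mean - half_width <= TRUE_M <= mean + half_width:
        covered += 1

coverage = covered / REPETITIONS
print(f"fraction of intervals covering the real m: {coverage:.3f}")
```

The fraction describes the whole collection of intervals. It says nothing about whether any single interval in the collection contains the real m.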

The only thing I am allowed to say about the confidence interval I actually do have (that -10cm to 3.3cm) is this: “Either the real value of m is in this interval or it isn’t.” That, dear reader, is known as a tautology. It is always true. It is true even (in this case) for the interval (100 million cm, 10 billion cm). It is true for any interval.

The interval we have then, at least according to strict frequentist theory, has no meaning. It cannot be used to say anything about the uncertainty for the real m we have in front of us. Any move in this direction is verboten. Including finite experiments to measure the “width” of these intervals (let he who readeth understand).

Still, people do make these moves. They cannot help but say something like, “There is (about) a 95% chance that m lies in the interval.” My dear ones, these are all Bayesian interpretations. This is why I often say that everybody is a Bayesian, even frequentists.

And of course they must be.

Homework

Typo patrol away!

Find, in real life, instances where the normal has been used with confidence intervals. Just you see if whoever used the interval interpreted it wrong.

12 Comments

  1. William Sears

    One dimensional petanque?

  2. JohnK

    For example, two paragraphs buried in:
    Rosner B. Fundamentals of Biostatistics. Fourth Edition, 1995. p. 162.

    ===

    “Therefore, we cannot say that there is a 95% chance that the parameter µ will fall within a particular 95% CI. However, we can say the following:

    “Over the collection of all 95% confidence intervals that could be constructed from repeated random samples of size n, 95% will contain the parameter µ.”

    ===

    “WE cannot say”? What is this strangely passive formulation — as if nobody had anything to do with it, it just ‘happened’ that “we cannot say.”

    And how nicely the author bowdlerizes-out that tawdry word “infinite” with euphemism: “Over the collection of ALL 95% confidence intervals that COULD be constructed…”

    And just how many is ‘all’ that ‘could’ be constructed, perchance? Why, it’s quite a lot, isn’t it? Like, an INFINITY?

    “WE cannot say”? ‘We’ means YOU.

    YOU – and Tom, Sarah, Matilda, etc. – “cannot say” (and cannot EVER say) that any “particular 95% CI” that YOU could ever calculate has 95% confidence.

    WHICH ONE of the infinite set of confidence intervals (“all 95% confidence intervals that could be constructed”) is YOUR confidence interval?

    “We cannot say.”

    Every single one of the confidence intervals that you or anybody else can actually calculate contains NO information about WHICH ONE of the infinite set of confidence intervals your particular confidence interval is.

    YOUR confidence interval — and Tom’s, Sarah’s, and Matilda’s — says NOTHING about how ‘close’ or ‘far’ it is from the central parameter.

    “We cannot say.”

    And then, of course, the next 200+ pages of the book contain CIs. Everywhere.

    Because, even though “we cannot say,” somehow, we must.

    After all, it is expected of us.

    With 95% confidence.

  3. Doug M

    Can we say:

    if X1, X2…Xn are normally distributed random variables with mean m and standard deviation s.

    Sample mean M = sum(X)/n
    and sample variance S^2 = sum(X^2)/n - M^2

    P(M+2*S>m>M-2*S) approximately equals 0.95

  4. Will

    You _need_ the CI if you’re going to render pretty curved bands on top of a regression, and we all know that a graph with a pretty CI band is way better than one without. Dressing up a graph is the best way of avoiding questions about your methods.

    God forbid someone show a histogram of residuals…

    Confidence intervals are like saying: “my model isn’t wrong. It’s as right as it can be!”

  5. Ray

    @ JohnK
    Ditto. I have a textbook where the author correctly explains that the CI has to do with repeated experiments and it really does not give you the probability that a parameter is in that interval. He then uses CIs throughout the book. He suddenly developed amnesia and forgot his explanation.

    BTW, shouldn’t “m and s” be “c and s”?

  6. Perhaps the reason so few “frequentists” respond at this stage is because those straw bogeymen were never there in the first place.

    Most conventional statisticians neither endorse nor depend on frequentism as a definition of probability though they may correctly state that if you have any consistent measure of probability in some context, then for a repeatable experiment in that context the average of a long series of repetitions will have a low probability of being far from the expected value (which gets lower as the number of repetitions increases). Yes, this is circular and doesn’t tell us what that probability actually “means”, but I have yet to see evidence that you can do better.

  7. JH

    I hope you will show your students how to perform a Bayesian analysis on the petanque data set.

    If I were you, I would first use the ubiquitous coin-tossing example to demonstrate the simple Bayesian framework. Prior. Toss some coins. Likelihood Function. Posterior. Predict prob heads in next toss. Is this a fair coin? The Bayesian way. Definitely don’t want to teach statistics (calculus) relying solely on a statistical software (calculator).

    Why do you teach the confidence interval if you don’t value it?

    Anyway, setting the interpretation aside, it’s obvious that we can’t employ the over-used formula for the petanque example because of an overt assumption violation. Even if one bogusly assumes a normal distribution, using the data set {4,-15,1cm}, the resulting 95% CI for the mean is (-29, 22) and the CI for the distance that the boule will be from the cochonette is (-73, 66). Note that one can easily use a different confidence level.

    No matter what, if you want to teach classical methods, you still need to teach them correctly.

  8. JH

    Modern academic statisticians understand the basics of both classical and Bayesian methods. Most people would agree that the Bayesian conclusions are easier for most people to understand. They really don’t like to rehash the same thing over and over again.

    These are the “m = central parameter” and the “s = spread parameter.” Sometimes these are mistakenly called the “mean” and “standard deviation.”

    It’s known that a normal distribution is uniquely determined by its mean and standard deviation, both of which have non-negotiable mathematical definitions. A mean is a central parameter, but a central parameter may not be the mean. The same thing holds for standard deviation. The difference between a sample mean and a parameter mean is not hard for my students to understand at all.

  9. Brian

    Everyone must understand the environment of their calculation. The question is the distance, which would be the resultant of the X, Y, & Z distances. As such a negative distance is impossible. The actual mean would be (4 + 1 + 15) / 3 = 6.666…. Just to add another twist, is time a consideration? Over time their positions will change, possibly becoming closer together. With infinite time the distance will always be zero. Now my head is spinning too.

  10. Uncle Mike

    Let’s not beat around the bush. The students are going to venture forth into the real world someday (either via graduation or other means) where they will be considered adults with responsibilities, and among those are passing judgment on “science” or something like it.

    And they will most likely encounter confidence intervals, and p-values, and other nonsense perpetrated by “scientists” who wish to impose some crap or other on the rest of us.

    We live in the Age of Science where “scientists” are high priests who can screw you royal, take your house, shut down your economy, poison you, make you dance a jig, and have been given total control over every aspect of our lives.

    It’s those damn “experts” who run the show. But they aren’t really experts; they are frauds and bounders and grasping little shits. You know who you are and you know that I’m talking about you and you know I’m right.

    So it is imperative that we arm our children, soon to be adults with responsibilities, with the intelligence and means to ferret you out and expose your little game.

    We can be scholarly about it, and probably should be, but the real purpose is to catch the con artists, the would-be kings, the authoritarian nincompoops who aspire to slick the rest of us.

    Roger that?

  11. Hey Uncle Mike, Wouldn’t our kids be better able to call BS on the frauds and bounders if they had a good understanding of what they were actually pretending to be doing rather than being taught how to win a fight with a straw man?
