This post originally ran 3 November 2014.
Reader Professor Doctor Moritz Heene writes:
I read your post on CIs with great interest, especially this one: http://wmbriggs.com/blog/?p=11862, see “Thinner is better”: “Frequentists prefer thinner, which is to say, narrower intervals over wide, assuming that, ceteris paribus, narrow intervals are more precise. For example, larger samples result in narrower intervals than small samples. But since all you can say is your interval either contains the true value or it doesn’t, its width does not matter. The temptation to interpret the width of an interval in the Bayesian fashion is so overwhelming that I have never seen it passed up.”
However, a colleague with whom I discussed this issue sent me the following lines, and I wonder what you think of them. I think he made a reasonable point: “For me a confidence interval is a summary of the effects I would [have] rejected if submitted to a hypothesis test (and we don’t need to think discretely here, we can think of the p-value as the continuous measure that it is of inconsistency of data with null, so I have stronger evidence against effects closer to the end of the confidence interval).
So a tight confidence interval is one that rejects many effects I may find interesting to know are rejected. A wide confidence interval is one that does not reject many effects I may find interesting.”
Your colleague didn’t pass up the Bayesian interpretation, either. He can’t really be blamed. The official frequentist meaning is so perplexing to keep in mind, and its consequences so intolerable, that relief is sought.
To repeat the official definition. Observe data, posit a parameterized probability model to “explain” that data, and construct a confidence interval (for one of these parameters). Now repeat the “experiment” such that the repetition is exactly the same as the first run, except that it is “randomly” different. Reconstruct the confidence interval. Repeat the “experiment” a third time, giving a third confidence interval. Repeat again. Then again. Then again forever.
When you reach forever, 95% of your collection of confidence intervals will overlap the “true” value of the parameter.
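The recipe above can be imitated, though never completed, on a computer. What follows is only a sketch under stated assumptions: the "experiment" is a draw of n values from a normal model whose spread sigma is taken as known, the interval is the textbook z-interval for the mean, and the names confidence_interval and coverage are made up for this illustration. Since no simulation ever reaches forever, the code can only report the fraction of intervals that overlapped the true value in some finite number of repetitions.

```python
import math
import random


def confidence_interval(sample, sigma, z=1.96):
    """Return the z-interval (a, b) for the mean, assuming sigma is known."""
    n = len(sample)
    mean = sum(sample) / n
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


def coverage(true_mean, sigma, n, repetitions, seed=0):
    """Fraction of repeated intervals that contain true_mean.

    Each repetition is a fresh "experiment": n draws from the assumed
    normal model, followed by a newly constructed interval.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        a, b = confidence_interval(sample, sigma)
        if a <= true_mean <= b:
            hits += 1
    return hits / repetitions


# The simulation knows the "true" value only because we chose it.
# Short runs need not land on 0.95; the definition speaks only of the limit.
for reps in (20, 200, 5000):
    print(reps, coverage(true_mean=0.0, sigma=1.0, n=25, repetitions=reps))
```

Notice that each single pass through the loop answers only one question about its interval: did it contain the true value or not?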
But what, you ask, about the confidence interval you have in hand? What does it mean? Well, it means just what I said, and nothing more. The only thing you can say about the confidence interval before you—regardless of its width—is that either the true value of the parameter lies within it or it doesn’t.
Suppose your interval is [a, b]. Either the true value of the parameter is in the interval or it isn’t. Introduce hypothesis testing: form a “null” which says the true value of the parameter is some number c. The frequentist then checks whether c is outside [a, b]. If so, he “rejects” the null.
Rejects is a word more apt than you think. For, as Neyman, the man who invented confidence intervals, tells us, rejecting a “null” is a pure act of will, just as is assigning the “null” a value of c. All we know is that the true value of the parameter is in [a, b] or it isn’t, which is a tautology and true for any interval. So when the “null” is rejected, there is no basis for it besides “I want”.
Your colleague says he would reject “nulls” where c is anywhere not in a to b. Well, he might. But he does so—on the official frequentist theory—with no basis. We are not entitled to say that the true value of the parameter has any probability to be c nor any probability to be in the interval [a, b]. We are not entitled to say that any finite collection of confidence intervals will be “well behaved” either. Only once we have an infinite collection are we allowed to speak—but only because we have observed everything that can ever be observed.
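The mechanics of the colleague's rule are easy to write down, which is part of its appeal. Here is a minimal sketch, again assuming a normal model with known sigma and a two-sided z-test; the function names and the five data values are invented for illustration. It shows that "c lies outside the 95% interval" and "the p-value for c is below 0.05" are the same decision stated two ways. The code computes the decision and nothing else: no line of it assigns a probability to the parameter.

```python
import math
from statistics import NormalDist


def z_interval(sample, sigma, level=0.95):
    """Two-sided z-interval (a, b) for the mean, assuming sigma is known."""
    n = len(sample)
    mean = sum(sample) / n
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_crit * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


def p_value(sample, sigma, c):
    """Two-sided z-test p-value for the "null" that the mean equals c."""
    n = len(sample)
    mean = sum(sample) / n
    z = (mean - c) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))


def rejects_by_interval(sample, sigma, c, level=0.95):
    """Reject when c falls outside the interval [a, b]."""
    a, b = z_interval(sample, sigma, level)
    return c < a or c > b


def rejects_by_p_value(sample, sigma, c, level=0.95):
    """Reject when the p-value falls below 1 - level."""
    return p_value(sample, sigma, c) < 1 - level


data = [4.1, 5.3, 4.8, 5.9, 5.2]  # made-up observations
for c in (4.0, 5.0, 6.0):
    print(c, rejects_by_interval(data, 1.0, c), rejects_by_p_value(data, 1.0, c))
```

The p-value shrinks as c moves away from the sample mean, which is the "continuous measure" the colleague has in mind.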
It is a Bayesian interpretation to say that the parameter “probably” or “likely” lies in [a, b]. It is a Bayesian interpretation that the parameter “could very well be” c. If you decide to reject the “null”, or to “fail to reject” it, with any kind of sureness or conviction or hope (the word “confidence” is lost to us here) then you have used a Bayesian interpretation.
Of course, this Bayesian interpretation is not a formal one, where the priors have been set up in the official fashion and so forth, but assigning any kind of probability, quantified or in the form of a “feeling”, to a parameter just is to be Bayesian.
If this is confusing, well, so it is. But that’s frequentism for you. A bizarre idea that you only know a probability at the end of all time.