# Frequentists Are Closet Bayesians: Confidence Interval Edition

Actually, all frequentists *and* Bayesians are logical probabilists, but if I put that in the title, few would believe it.

A man might call himself an Anti-Gravitational Theorist, a science which describes the belief that gravity is subject to human will. But the very moment that man takes a long walk off a short dock, he’s going to get wet.

One can say one believes anything, but in the end we are all Reality Theorists. We’re stuck with what actually is. This notion deserves full development, but today one small corner: confidence intervals. Never a stranger invention will you meet. Though these creatures have an official frequentist mathematical definition, they are *always*—as in *always*—interpreted in a Bayesian or logical probability sense.

**The official definition**

For some uncertain proposition, a parameterized probability model is proposed. Classical procedure, both frequentist and Bayesian, expends vast efforts in estimating this parameter or these parameters (for ease, suppose there is just one parameter). This estimate is a guess of the parameter’s “true” value given the observed data.

Nobody ever believes, nor should they believe, the guess. To compensate for the overt precision of the guess, the confidence interval was created. This is an interval, usually contiguous but not required to be, around the guess. For example, if the guess is 1.7, the confidence interval might be [1.2, 2.1]. The interval need not be symmetric. The width of the interval is determined by a pre-set number, the most typical being used like “95% confidence interval.”

If you were to repeat the “experiment” which gave rise to the data used in constructing the guess and confidence interval, a new guess and confidence interval could be calculated. The repetition is required to be identical to the old “experiment”, except that the new “experiment” is supposed to be “randomly” different in those aspects related to the parameter. Nobody knows what this means, because, of course, it is impossible to rigorously define. But let that pass.

Repeat the “experiment” a third time. Again, a new guess and confidence interval can be calculated. Repeat a fourth time, a fifth, and so on *ad infinitum*. At the end, we will have an infinite collection of confidence intervals. The punchline: 95% of these confidence intervals will “cover” the true value of the parameter.

This means that any individual confidence interval in the collection, e.g. [-72.2, -64.8], might not contain the true value of the parameter, but that, in the limit, 95% of those intervals will. It may be helpful to understand that the interval *itself* is what classicists call a “random variable”.
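The official definition can be mimicked on a computer. The following is a minimal sketch, not anything from the original argument: it assumes a normal model with a known standard deviation, and the names `coverage`, `true_mean`, `sigma`, `n`, and `reps` are illustrative choices (the value 1.7 is borrowed from the guess used above). It repeats the "experiment" many times and counts how often the interval covers the parameter.

```python
import random
from statistics import NormalDist, fmean

def coverage(true_mean=1.7, sigma=1.0, n=20, reps=10_000, level=0.95, seed=0):
    """Fraction of simulated intervals that cover the (known) true mean.

    Assumes normal data with known sigma, so the interval is
    sample mean +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for level=0.95
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = fmean(sample)
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / reps

print(coverage())   # a number near 0.95; each single interval either hit or missed
```

The long-run fraction is a property of the procedure. Inside the loop, every individual interval simply covers the true value or fails to, which is the only statement the strict theory allows about any one of them.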

The natural question is, “What about *my* confidence interval, the one I calculated for the data I have; what might I say about *it*?”

Only this: *that your interval either contains the true value or it doesn’t*. According to frequentist theory, *this is it.* You may *not* say more. Doing so is strictly verboten.

Yet everybody, and I mean *everybody*, does say more. Indeed, everybody acts like a Bayesian. And that’s because, like our anti-gravitational theorist, frequentism breaks down here. Frequentism is self-consistent, just as the anti-gravitational theory is; but like that theory, it fails upon meeting the real world.

**95% chance**

Everybody will say that the actual interval has a 95% chance, or thereabouts, of containing the true value of the parameter. The “thereabouts” is used by the frequentist to comfort himself that he is not a Bayesian, but it’s a dodge. Frequentist theory insists that the actual interval must not be associated with any probability proposition. To state, or feel, or assume, or think that the actual interval has a chance, even an unquantified chance, of containing the true value of the parameter is to act like a Bayesian.

Bayesians, acknowledging this, call their creations “credible intervals”, which in many textbook explanations overlap (a pun!) frequentist confidence intervals exactly or nearly so.

**Thinner is better**

Frequentists prefer thinner, which is to say, narrower intervals over wide, assuming that, *ceteris paribus*, narrow intervals are more precise. For example, larger samples result in narrower intervals than small samples. But since all you can say is your interval either contains the true value or it doesn’t, its width does not matter. The temptation to interpret the width of an interval in the Bayesian fashion is so overwhelming that I have never seen it passed up.

**Computer experiments**

Some complex probability models aren’t amenable to analytically calculated confidence intervals, and in these cases computer simulations are run to prove the proposed models are well behaved. These show the simulated confidence intervals “cover” the parameter (which in these cases is known exactly) at the specified percent (say 95%).

These simulated intervals, necessarily finite in number, are usually close to these percents, and their authors ask us to take this evidence to support the proposition that confidence intervals computed for real problems (with unsimulated data) will behave well. But this is to assign a (Bayesian) probability, albeit non-numerical, to future, unseen confidence intervals.

**Hypothesis tests**

Many “null” hypotheses suppose the parameter equals 0 (the exact value matters not for this criticism) and if the confidence interval does not contain 0, the “null” is rejected. To keep within frequentist theory, this is a pure act of will. It is saying, the 0 is not in the interval *because I do not want it to be*. This must be so because all we can know, according to the strict view of frequentist theory, is that the 0 is in the interval or it isn’t. We cannot attach *any* kind of measure of uncertainty to this judgement.

It is a Bayesian interpretation to claim, conditional on the observational evidence of the interval, that 0 is likely in or out of it.
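The mechanical link between the interval and the test can be written down in a few lines. This is a sketch under assumptions that are not in the post: a normal model with known `sigma`, and hypothetical helper names `z_interval`, `two_sided_p`, and `rejects_null`. It shows only that "the null value is outside the 95% interval" and "the two-sided p-value is below 0.05" are the same decision rule.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Known-sigma interval for a mean: xbar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width

def two_sided_p(xbar, sigma, n, null=0.0):
    """Two-sided p-value of the matching z-test."""
    z = (xbar - null) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejects_null(xbar, sigma, n, null=0.0, level=0.95):
    """The interval-based rule: reject when the null value is not covered."""
    low, high = z_interval(xbar, sigma, n, level)
    return not (low <= null <= high)

for xbar in (0.1, 0.5, 1.0):
    print(xbar, rejects_null(xbar, 1.0, 16), two_sided_p(xbar, 1.0, 16) < 0.05)
```

Nothing in the rule attaches a probability to the null itself; it is a fixed recipe for when to act, which is the sense in which rejection is a decision.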

**True values**

As said above, everybody takes the confidence interval to express a chance that the true value lies within the interval, even though this is forbidden on theoretical grounds. Jerzy Neyman, who invented confidence intervals, understood these facts, but nearly all those who follow him do not. Jim Franklin in his influential paper “Resurrecting logical probability” (Erkenntnis, 2001, pp. 277-305), in the section “Frequentists are secret Logical Probabilists” (and from where I stole today’s title), said (ellipses original):

Neyman was to some degree aware of the problem, and it is thus entertaining to watch him explain what he is doing, given that it is not reasoning [Neyman wanted to remove reasoning from objective statistical procedures]. As is common among philosophers who have rejected some familiar form of justificatory reasoning, he seeks to permit the conclusion on some “pragmatic” grounds:

The statistician…may be recommended…to state that the value of the parameter θ is within…(Neyman 1937, p. 288)

Later, Neyman admitted the trick of rejecting “nulls”: “To decide to ‘assert’ does not mean ‘know’ or even ‘believe’. It is an act of will” (from Neyman 1938, p. 352). To paraphrase Franklin’s quote of Neyman (to remove the unnecessary mathematical notation; Neyman 1941, p. 379, emphasis original to Neyman):

We may *decide* to behave as if we actually knew that the true value [of the parameter] were [in the confidence interval]. This is done as a result of our *decision* and has nothing to do with ‘reasoning’ and ‘conclusion’.

As Franklin said, “It is difficult to argue with pure acts of will, for the same reason that it is difficult to argue with tank columns”.

Howson and Urbach (*Scientific Reasoning: The Bayesian Approach*, second edition, 1993, pp. 237-241) said:

Neyman was careful to observe, however, that when asserting that the parameter is contained in the interval, the statistician is not entitled to conclude or believe that this is really true. Indeed, Neyman held the idea of inductive reasoning to a conclusion or belief to be contradictory, on the grounds that reasoning denotes “the mental process leading to knowledge” and this “can only be deductive.”

Neyman makes the category mistake, common in probabilistic reasoning, with confusing a decision with a probability, the same kind of mistake, incidentally, which leads to the “subjective” interpretation of probability—but that is a subject for another day.

As Howson and Urbach point out, the confusion about confidence intervals runs deep. Suppose there are two intervals calculated, a 90% and a 95%. The frequentist must not say that either is more likely to contain the “true” value of the parameter. Indeed, he has no reason to prefer one over the other, since to do so would be to interpret the intervals in a Bayesian sense. All the frequentist can say is that each interval, individually, either contains the parameter or it does not.

**Future experiments**

Some statisticians, noticing the discrepancy, will admit that they do not know whether the confidence interval before them contains the true (or “null”) value of the parameter, but that over time, over many implementations of frequentist procedures, i.e., in the “long run”, 95% of those intervals will cover the true value.

But this is a mistake (some Bayesians make it, too, when they argue for using procedures which are Bayes but which have “good frequentist properties”). Since one wants to know about *this* interval, it is pointless to argue from other intervals not yet created. Though if one could, it would be a Bayesian (logical probability) interpretation.

It would say that given all these other intervals cover the true values of the parameter 95% of the time, therefore mine is likely to cover the true value of the parameter. Some would swap in “95%” for “likely.” Either way, this is logical and not frequentist probability.

**What now?**

Since no frequentist can interpret a confidence interval in any but a logical probability or Bayesian way, it would be best to admit it and abandon frequentism, as this author says.

OT

http://www.ritholtz.com/blog/2014/05/statistically-appropriate-climate-change-debate/

More fruitful than to continue the battle between the Bayesians and the frequentists is to concentrate one’s attention upon the question of how, in logic, one can generalize. This is the so-called “problem of induction.” It is this problem that set off the battle.

I’m trying to understand the difference. Does the following analogy hold any water: Frequentist is to Bayesian as discrete is to continuous?

To forestall any objection that Matt (excuse me, William M. Briggs) has mischaracterized frequentists in re confidence intervals, I append the following from (the popular) Rosner B. Fundamentals of Biostatistics. Fourth Edition, 1995, p. 162 (which I have at hand; the 7th edition came out in 2010):

“Therefore, we cannot say that there is a 95% chance that the parameter µ will fall within a particular 95% CI. However, we can say the following: Over the collection of all 95% confidence intervals that could be constructed from repeated random samples of size n, 95% will contain the parameter µ.”

Thus, Rosner specifically states that, regarding any actual confidence interval that we can calculate, ‘we cannot say’, then goes on to say ‘but… infinity!’, just as Matt has it.

What Rosner (slyly?) omits is what we CAN say about a calculated CI. About any particular CI, he says, ‘we cannot say’, but he never goes on to state what we CAN state about a particular CI.

As Matt notes, what we CAN say about any particular CI is not precisely nothing. It’s: “the parameter falls within our particular calculated CI, OR it doesn’t”.

Then Rosner spends the next 200 pages ignoring what he just said and treating our ‘particular 95% CI’ AS IF we could say more about it than that.

Thus does Rosner vindicate Matt’s statements and analysis in every particular.

You might like the discussion between Neyman and a student called Milton Friedman, referred to in my post at:

http://davegiles.blogspot.ca/2011/08/overly-confident-future-nobel-laureate.html

Dave,

Thanks.

JohnK,

This is exactly it. Pretty standard example from textbooks. The theory is screwy, they know it, they ignore it, and they still say it is right.

Gary,

Hmmm. Maybe not. There’s nothing wrong with continuity, of course. But to start there is always wrong, because ALL our measurements are necessarily discrete.

Terry,

Exactly.

Bob,

Thanks.

Just for slaps and giggles, I took a quick blick in my old undergrad stat text: Meyer: *Introductory Probability and Statistical Applications*. It makes precisely the same cautionary notes as our hosts. One may only assert that the parameter is in the interval or not. The 95% refers to the probability that this horseshoe has ringed the stake; but we may not say that there is a 95% probability that the parameter is in *this* interval. The parameter is not the variable; the interval is.

Then I checked Leonard Savage in *The Foundations of Statistics* and he sees no use whatever in interval estimates. He seems to advocate something called a “behaviorist” theory of probability, so I don’t know where he falls in the axis. (I have neither finished nor digested his book.)

YOS,

Savage also advocates sitting “bolt upright” with a pencil in hand while contemplating statistics. His is an excellent book, though I don’t favor his proto-subjectivism. His material on decisions is still worth reading.

The example you and John K found are in (practically) every introductory text book. Introduce, define, then forget—and hope nobody remembers. Which nobody does.

Except for cranks who end up blogging for a living.

I think I understand why, if we are frequentists, we can never say anything about the probability of an unknown, fixed parameter, being in any interval – it either is in it or not.

So I have some follow up questions:

1. What does a C. I. allow us to say that is useful, that helps me make a decision ?

2. Can we just forget about C.I. and use Bayesian credibility intervals?

3. Or can we say ‘I know the parameter is either in this interval or not’ but I am going to take some action as if it is highly likely in this interval, and hopefully I am right…?

Can I just ask? – I’m a civilian in all this mathematical philosophy and so can very easily get the wrong end of the stick.

An experimenter is collecting a sample from a population and using the results produced by mathematically manipulating this sample to make estimates about the whole population. Right?

Let’s imagine the samples collected are like sweets in a bag – you put your hand in and pull one out (one sample containing lots of individual data points), another time you do it again and pull out a different one (I realize this is a discrete object, while in reality as you resample you will mix the data – maybe this is where this analogy breaks down!)

My simple way of thinking is that 5% (or whatever) of the sweets pulled out of the bag will be orange – where the maths will give a confidence interval which does not contain the population average (or whatever), and 95% of the sweets pulled out of the bag will be coffee flavoured with the confidence interval accurately telling you something about the population.

I don’t get a chance to taste the sweet, ie I don’t know if the maths has given me an accurate representation of the population.

But I do know how often the different sweets will appear.

Is this anything like an accurate analogy?

Is it a frequentist, bayesian, or just a confused civilian way of looking at what I am doing when I mathematically manipulate a sample to tell me something about the whole population?

Thanks for a fascinating web site. Great informative, thought provoking reading.

I am still a bit confused by this – when people say that the parameter is either inside or outside the confidence interval and therefore the probability of A – that the parameter is inside the interval – is either 100% or 0%, surely that p(A|E) assumes it is known where the population average (or whatever) is. If your E doesn’t include this data (or assumption) then it is possible for p(A|E) to have values other than 100% or 0%.

The probability of you knowing a fact is down to the knowledge (or assumptions) you have about that fact.

The ball in my hand is with 100% probability black, but (as you assume I’ve picked it from a bag mixed 70/30 with black and white balls) the probability you assign to it being black is 70%, not 0% or 100%.

It would be wrong to tell me the probability I give should be either 0% or 100% – right?

So isn’t it also wrong to say with either 0% or 100% probability the parameter is without/within the confidence interval?

Oh I am a confused civilian!

Just to edit my post immediately above. I said:

It would be wrong to tell me the probability I give should be either 0% or 100% – right?

Given that I know I’m holding a black ball, that is wrong. It is “you” who has made the assumptions, therefore given these assumptions …

It would be wrong to tell you that the probability you give should be either 0% or 100% – right?

Chinahand,

Lots of ways to go wrong with probability. Best short answer is that your commonsense is leading you in the right direction. If you don’t understand confidence intervals, don’t try to. They are wrong, as shown, and learning them can only corrupt your thinking.

See also the classic posts page for more.

Chinahand @5:30 AM,

your reasoning seems indeed very natural, and in fact many people argue along these lines and believe that 95% of all the confidence intervals they calculate should cover the true value. They don’t care that such a conclusion is forbidden in pure frequentist thinking, so telling them “das ist verboten!” 🙂 is unlikely to make them change their mind.

But, as far as I understand, the above conclusion is not only forbidden, it is wrong, too. The point is this: when estimating a parameter like the mean value µ, say, and calculating a confidence interval for it, your result might be µ = 8.5 with CI = [7,10]; according to the mathematical algorithm which produced the CI, this means that *if* the true value of µ were exactly 8.5 and the experiment were repeated many many times, then 95% of the resulting CI’s would contain 8.5 and would taste like coffee ;-). But to arrive at this conclusion, we had to *assume* that µ really and truly equals 8.5. Which, in all likelihood, it doesn’t. If this were the case, then our CI is indeed one of the 95% of intervals covering it. But the true µ could of course have other values, like µ = 8.65347, or 8.453, or 9.4565 etc. Also in these cases, we had a 95% chance that our CI covers the true mean, and indeed it does. On the other hand, µ could equal 6.5, 12.233, or any other value outside our calculated CI. It was of course unlikely for us to find an interval which does not cover µ, 5% in each case, but there are so many possible values of µ outside our (one and only) CI, so many possible cases, that we cannot infer that our CI has to cover µ with 95% probability.

CI’s refer, as is common in frequentist statistics, to the probability of certain outcomes, given a value of a parameter. What we would like to know instead is the probability of certain values of the parameters, given our single outcome. To turn this information around, we need a prior probability distribution on µ, and could invoke Bayes’ formula. For all we know, the prior probability for µ to be inside the interval [7,10] might even be zero (perhaps for some physical reason), but then the probability of µ being an element of [7,10] would still be zero after the “Bayes update”, despite the tempting 95% probability interpretation of our CI.
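The point about the prior can be checked numerically. What follows is a rough sketch, assuming a normal likelihood whose standard error is backed out of the quoted interval [7, 10], a grid of candidate means from 0 to 20, and two hypothetical priors (`flat_prior` and `excluding_prior`); none of these choices comes from the comment itself.

```python
from statistics import NormalDist

CI = (7.0, 10.0)
XBAR = 8.5
# Standard error implied by a 95% interval of this width (an assumption).
SE = (CI[1] - CI[0]) / (2 * NormalDist().inv_cdf(0.975))

def flat_prior(mu):
    return 1.0

def excluding_prior(mu):
    # Zero prior mass inside [7, 10], e.g. "for some physical reason".
    return 0.0 if CI[0] <= mu <= CI[1] else 1.0

def posterior_mass_inside(prior, grid_lo=0.0, grid_hi=20.0, steps=2001):
    """Posterior probability that mu lies in CI, by a simple grid sum."""
    step = (grid_hi - grid_lo) / (steps - 1)
    grid = [grid_lo + i * step for i in range(steps)]
    # Bayes: posterior weight is prior times likelihood of the observed mean.
    weights = [prior(mu) * NormalDist(mu, SE).pdf(XBAR) for mu in grid]
    inside = sum(w for mu, w in zip(grid, weights) if CI[0] <= mu <= CI[1])
    return inside / sum(weights)

print(posterior_mass_inside(flat_prior))       # close to 0.95
print(posterior_mass_inside(excluding_prior))  # exactly 0.0
```

With a flat prior the posterior mass inside the interval lands near 95%, which is why the two kinds of interval so often coincide; with the excluding prior the same data leave the mass at zero.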

Hey Nick,

I am pretty certain you are wrong in saying:

according to the mathematical algorithm which produced the CI, this means that *if* the true value of µ were exactly 8.5 and the experiment were repeated many many times, then 95% of the resulting CI’s would contain 8.5

The 95% CI is not *if* the sample mean were the true mean.

The CI is based on the distribution of the sample means being normally distributed about the true mean (based on the central limit theorem). Therefore, the differences of the sample mean from the true mean are also normally distributed. Using this assumption, we can set up a probability equation that gives an interval for which the sample mean is between 95% of the time a sample is measured. I.e., since the difference is now assumed to be a normally distributed random variable, we can now give a 95% interval of its distribution. We can then solve for the interval of the true parameter and find our CI. However, we need to note that, when we solve for the parameter, the end points of the CI are now in terms of the normally distributed random variable that is the sample mean (i.e., capital X, where X is the random variable “sample mean”). This is where we would say, if we wanted to be frequentists, aha the true mean is fixed, and the end points are our random variables. Therefore, 95% of these generated CIs will cover the mean, but this specific interval either covers it or does not.
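The steps just described can be written out as a short derivation. This is a sketch for the simplest textbook case; the known standard deviation $\sigma$ and the sample size $n$ are assumptions introduced here for illustration, not symbols used in the comment.

```latex
\begin{align}
  \bar{X} &\sim N\!\left(\mu,\ \sigma^2/n\right)
    && \text{(central limit theorem, approximately)} \\
  P\!\left(-1.96 \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le 1.96\right) &= 0.95
    && \text{(standardized difference)} \\
  P\!\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le
           \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) &= 0.95
    && \text{(solved for } \mu \text{)}
\end{align}
```

In the last line $\mu$ is fixed and the end points are random, because they depend on $\bar{X}$. Once an observed mean is plugged in, nothing random remains in the expression, which is why the strict reading refuses to attach the 0.95 to the realized interval.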

I could be wrong…but I posted this to persuade Briggs to sort me out if I am!

What I am still unclear on is how a C.I. should change our behavior if we have to make some decision based on a parameter estimation. I.e., I read some study and for some set of data the sample mean is 3 with CI of 2.8, 3.2. I read another study and the sample mean is 3, but with CI of 0.1 and 5.9. Do I care? Or do I ignore the CI and look at other measurements…maybe Bayesian, that are more useful?

One quick follow up…what I meant when I said:

“Using this assumption, we can set up a probability equation that gives an interval for which the sample mean is between 95% of the time a sample is measured…. ”

I mean we can give the distribution in terms of a standardized normal distribution.. usually -1.96, 1.96 for the 95% interval.

Confused Will,

Not to further confuse you but why would you make a decision based on the parameter at all? It’s a bit like buying a car because it’s red regardless of how the car will perform wrt other requirements, like how fast it can go, stopping distance, carrying capacity, etc. IOW: why care about a model’s parameters when what you really want to know is the model’s predictive performance?

I was already able to predict with 100% accuracy what misinterpretations and howlers (statistical, philosophical, and historical, wrt Neyman especially) you’d be promulgating in a post with this title. Frequentists are not keen to use probability to quantify degree of belief, support, plausibility or what have you in hypotheses about the parameter value. As in real life, and the rest of science, they employ probabilistic properties of inference methods in order to ascertain which claims are well or poorly warranted. They detach “amplicative” inferences, and do not keep to presumed deductive updating. Do you see any posteriors on the knowledge claims inferred about the properties of the SM Higgs particle? No. It would take a book to really answer these mistakes, but the courageous reader can find them all on my blog and beyond. Ironically, it is the Bayesian who, if sensible, acts like a frequentist wrt his .95 HPD intervals. Statements they give .95 belief should be true 95% of the time–or if they don’t strive for this, they will happily assign .95 to HPDs known to exclude the true value with probability 1. (This is what Berger and Wolpert concede. Look up optional stopping on my blog.) Of course, the subjectivists are not “really” misled, given what they believed—that’s the Bayesian magic.

Deb,

The only howler I have is—wait for it—Hold the Mayo!

Sorry. I know you’ll forgive me (even though you must have heard that a thousand times).

No, no. You missed where I say I really meant logical probability and where I say (as I say often) subjective probability is also an error.

I think the main error both frequentists and subjectivists make is to assume all probabilities are quantifiable. This is why you don’t hear from me probabilities on the Higgs.

You say frequentists “are not keen to use probability to quantify degree of belief, support, plausibility or what have you in hypotheses about the parameter value…they employ probabilistic properties of inference methods in order to ascertain which claims are well or poorly warranted. They detach ‘amplicative’ inferences, and do not keep to presumed deductive updating.”

As I often say, I’m not too keen on the Cult of the parameter. And I’m not too sure what ‘amplicative inferences’ are.

Say. How about a guest post? Or a link back to the post or posts you feel explain these kinds of things?

I don’t think the CI assumes a particular value; only that a whole buncha CIs (calculated from unbiased samples pulled from the same population in the same way etc.) will be such as to include the True Value about 95% of the time. It may or may not be the initial point estimate. If your first sample (the only one usually that you will get) results in X-bar = 8.5 with CI = [7,10] what you might say is: “Our best guess for µ is that it is 8.5, but it may just as easily be somewhere else nearby,” with the CI being a definition of “nearby.” Suppose a second sample, equally unbiased, results in 8.7±1.5, so we get [7.2, 10.2]. If we consider both samples, we might pragmatically conclude that the fox has been tracked to [7.2, 10] if we assume that the population itself has not changed between the two samples. This may not be a safe assumption in many cases.

It’s worth pointing out a concrete example of a confidence interval procedure that doesn’t look too wrong at first glance but can be shown to give garbage. Jaynes gives a nice example of one in Confidence intervals vs. Bayesian intervals on the 22nd page of the pdf file (page number 196).

Short version: a protective chemical prevents a device from failing but is exhausted at an unknown time, after which time to failure follows an exponential distribution with known mean. Interest lies in estimating the lifetime of the protective chemical. The confidence interval is based on the mean time to failure. Jaynes gives specific data points (sample size = 3) where the 95% confidence interval lies entirely after the minimum time to failure, i.e., the direct evidence of the data show that the parameter of interest cannot possibly be in the confidence interval, 95% confidence or no.

A sample size of three? I wouldn’t expect much of anything from that sort of thing. Besides, on the assumptions, the usual CI calculations, based on normality of sample averages, are unlikely to be fruitful. N=3 is not really big enough for the Central Limit Theorem to come riding to the rescue for data as highly skewed as an exponential. In fact, the mean time to failure is the sum of two distinct distributions: the (apparently unknown) distribution of chemical depletion plus the “random” failures thereafter (which gives the exponential life curve). Sounds like a three-parameter Weibull, but with the minimum time to failure unknown. With enough data, it might make more sense to plot on Weibull probability paper and determine probability limits empirically.

It seems to me that the CI is intended to bound the mean value, not the minimum value. Individual measurements always spread out farther than means.

Ye Olde Statistician,

That failure had nothing to do with the assumptions being violated or the sample size. It failed because CI’s themselves are seriously flawed. All you need do to find examples of it is look for situations when it starts to deviate appreciably from posterior Credibility Intervals.

Mayo,

There is so much nonsense in your response I can’t get to all of it. But this one in particular is worth bringing out:

“Ironically, it is the Bayesian who, if sensible, acts like a frequentist wrt his .95 HPD intervals. Statements they give .95 belief should be true 95% of the time–or if they don’t strive for this, they will happily assign .95 to HPDs known to exclude the true value with probability 1.”

This is categorically false. Bayesians want whatever interval they create to contain the true value 100% of the time it’s actually needed (i.e. not some future repetitions that never happen).

The Frequentist failure to realize this leads them to try and create intervals which are wrong alpha% of the time theoretically, but in practice are wrong more like 60-90% of the time.

Corey,

Thanks. Good citation to show both that CIs are interpreted as credible intervals, and that the two aren’t that different in most applications, but that some can be.

Ye Olde Statistician,

Just to be clear, the CI procedure in question is based on the exact N=3 sampling distribution of the mean, not an asymptotic approximation. The time to depletion is assumed fixed and identical from sample to sample, but unknown; and also by assumption, the distribution for the post-depletion component of time-to-failure is exponential with known mean equal to 1. In this setup, if we knew the exact mean of total time-to-failure, we’d know the time to depletion.

The problem is that the sample mean is not sufficient for the time-to-depletion parameter. A CI procedure based on the sufficient statistic (which is the sample minimum, not the sample mean) works just fine — and is identical to the posterior credible interval under a flat prior for time to depletion.
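The setup described here is small enough to compute directly. The sketch below makes several assumptions that should be checked against Jaynes's paper: the three failure times 12, 14, 16 are the values as I recall them from that paper, the level is the 90% mentioned elsewhere in this thread, and the function names are hypothetical. It uses the fact that the sum of three unit-mean exponential waiting times is Gamma(3, 1), and finds the shortest interval by bisection.

```python
import math

def gamma3_pdf(y):
    """Density of the sum of three Exp(1) waiting times, i.e. Gamma(3, 1)."""
    return 0.5 * y * y * math.exp(-y)

def gamma3_cdf(y):
    return 1.0 - math.exp(-y) * (1.0 + y + 0.5 * y * y)

def matching_upper(a):
    """Point b above the mode (2) whose density equals that at a < 2."""
    target, lo, hi = gamma3_pdf(a), 2.0, 60.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if gamma3_pdf(mid) > target:   # density falls as we move right of 2
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def shortest_gamma3_interval(level):
    """Equal-density (hence shortest) interval holding `level` of Gamma(3, 1)."""
    lo, hi = 0.0, 2.0
    for _ in range(200):
        a = (lo + hi) / 2
        if gamma3_cdf(matching_upper(a)) - gamma3_cdf(a) > level:
            lo = a                     # too much mass: move the left end right
        else:
            hi = a
    a = (lo + hi) / 2
    return a, matching_upper(a)

def mean_based_ci(data, level=0.90):
    """CI for the depletion time built from the sample mean (n must be 3)."""
    n = len(data)
    a, b = shortest_gamma3_interval(level)
    xbar = sum(data) / n
    return xbar - b / n, xbar - a / n   # since n * (xbar - theta) ~ Gamma(3, 1)

def minimum_based_interval(data, level=0.90):
    """Interval from the sample minimum; also the flat-prior credible interval."""
    n, smallest = len(data), min(data)
    return smallest + math.log(1.0 - level) / n, smallest

data = [12.0, 14.0, 16.0]   # assumed values, as recalled from Jaynes's paper
print(mean_based_ci(data))
print(minimum_based_interval(data))
```

For these data the mean-based interval starts above 12, even though a failure at time 12 shows the depletion time cannot exceed 12, while the minimum-based interval ends exactly at 12. Both procedures have the advertised long-run coverage; only one of them uses the sufficient statistic.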

There needn’t be frequentists and Bayesians. Both have their mathematical theories. Both have their merits and shortcomings. Horses for courses. “Modern” statisticians see them as methods for data analyses. I find the century-old debate between frequentists and Bayesians tiring. In efforts to sort out big data, both frequentist and Bayesian ideas should be valued.

Dear Mr. Briggs,

Please see Jon Williamson’s book for a brief description of the differences between logical probability and objective Bayesianism. Either in chapter 1 or 2. (I don’t have the book with me at this moment. It is a book worth reading.) To conflate them into one is misleading. If I remember correctly, Williamson (Mayo also) claims that philosophers agree that logical probability is of little practical use.

Corey,

Instead of using the minimum, Jaynes constructed a 90% confidence interval using a less efficient estimator, via the method of moments, for the parameter \theta defined to be the time of guaranteed safe operation. Don’t you think this is a bit unfair? Just as one can pick a ridiculous prior to demonstrate the problem of Bayesian analysis.

Anon,

I understand Mayo just fine.

“The Frequentist failure to realize this leads them to try and create intervals which are wrong alpha% of the time theoretically, but in practice are wrong more like 60-90% of the time.”

You and Briggs make strange claims about what frequentists and Bayesians want, know and do not know! Competent statisticians would know what \alpha is. But… how did you come up with the number of 60-90% of the time?

Forecasting/prediction is one of the main purposes of statistical modeling. The assumption for making a forecast based on the available data is, in a way, the repeatability of certain situations under which the data set is collected.

Anon, Corey,

You’ll notice that JH did not answer a single criticism mentioned in the post. She points to a book, which I have and have read, that actually agrees with me, in the main. And she offers a three-out-of-four-dentists-agree argument to dismiss logical probability.

Now these are all fine debating tactics for high school, but as philosophical criticism, they won’t do and can be ignored.

JH,

The way to rebut professionally will be to start a sentence with words like, “The confidence interval actually has a 95% probability, just like people use it…” or whatever. Feel free to have another go. But if you don’t answer any specific critique I made above, I won’t bother answering you.

Mr. Briggs,

I was not debating with you at all. As I said, I find the century-old debate between frequentists and Bayesians tiring.

I only want to point out that there are differences between logical probability and objective Bayesianism. Williamson offers explanations and references in his book. Check them out!

If I were home, I would have typed them up for you… only because I love you.

No, you won’t respond to my comments directly because you won’t admit that logical probability is of little practical use in statistical modeling.

Yep, I am basically repeating my previous comments. No debating tactics.

JH,

Oh, I can wait until you get home to type out Williamson’s opinions, from his book *In Defence of Objective Bayesianism*.

Too bad you were tired and couldn’t respond to any of the criticisms made of confidence intervals. Perhaps when you regain strength you could have a go?

Oh, Mr. Briggs, is Jaynes a member of the Cult of the parameter? I think so. He obviously understood what a parameter is for!

Mr Briggs,

Better yet, you seem to have the book, so why don’t you just scan it for your readers? It is in chapter 1 or 2. Two weeks is a long time for curious readers to wait, don’t you think?

Nope, I’d rather spend my energy on something else. If you want more arguments from both sides, check out the stat. stackexchange site. You would find interesting examples to help you better understand confidence intervals and credible intervals. Great ones; I have used a couple of them for my classes.

JH,

For somebody tired of the debate, you sure have a lot of energy to come back and NOT answer any of the critiques of confidence intervals.

What’s to answer about your critiques? Yep, for example, it will take much more time to demonstrate that larger sample sizes do not always result in narrower intervals than small samples.
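One way to read that last claim is as a statement about realised intervals: because the sample standard deviation varies from sample to sample, a particular larger sample can produce a wider interval than a particular smaller one. The sketch below assumes standard normal data, sample sizes of 10 and 20, and the usual Student-t interval for a mean; all of these choices, and the function names, are illustrative.

```python
import random
import statistics

# Two-sided 95% Student-t critical values for 9 and 19 degrees of freedom.
T_CRIT = {10: 2.262, 20: 2.093}


def t_interval_width(sample):
    """Full width of the 95% t interval for the mean (n must be 10 or 20)."""
    n = len(sample)
    return 2 * T_CRIT[n] * statistics.stdev(sample) / n ** 0.5


def fraction_larger_sample_wider(trials=20000, seed=2):
    """Share of simulated pairs where the n=20 interval is wider than the n=10 one."""
    rng = random.Random(seed)
    wider = 0
    for _ in range(trials):
        small = [rng.gauss(0, 1) for _ in range(10)]
        large = [rng.gauss(0, 1) for _ in range(20)]
        wider += t_interval_width(large) > t_interval_width(small)
    return wider / trials


print(f"n=20 interval wider than n=10 interval in "
      f"{fraction_larger_sample_wider():.1%} of simulated pairs")
```

On average the larger sample does give the narrower interval; the simulation only shows that this is not guaranteed for any given pair of samples.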

“The problem is that the sample mean is not sufficient for the time-to-depletion parameter.”

Of course it isn’t. By hypothesis (assuming the process is statistically stable), you’re estimating the mean of a distribution-plus-a-constant. That is, only the period *after* the unknown constant period is the actual random variable. The exponential distribution applies to failures that are due to a constant hazard rate, and by supposition that does not cover the entire operational lifetime. This is not unusual for reliability problems, where there are often multiple failure regimes. And they aren’t even constant.

A similar problem, but one from the real world, is the number of times a telecomm underwater repeater must be calibration-tested, using test-tweak-test-tweak strategies in the calibration lab. On the face of it, it sounds like a Poisson, but it isn’t because there is always the initial test. However, “number of tests minus one” *did* behave like a Poisson.

The problem is that a “chunk” of data may not come from a distribution at all, and the Stat 101 stuff doesn’t help. Example: sample paste weights on lead battery grids over a period of several hours had a particular mean, let’s say 7.0. But at no time during that period was the pasting machine running that average. For the first four hours it was running lighter weights; then there was a density change in the lead oxide paste batch and for the next four hours the paste ran heavier. So one is wary of estimators, even interval estimators, under any conditions. Product lots seldom mimic statistical distributions.

The key point is that in both cases, Bayes and Classic estimation methods (I do not like the word “frequentist”, for it seems to imply that asymptotic convergence properties are a logical requirement only for the second and not for the first), the parameter can be estimated using a random set (i.e. a confidence interval shrinking towards the unknown parameter as the sample increases), or a random variable (a point estimator converging to the parameter). I do not see too clearly why “classic statisticians think as Bayesians when they use a confidence interval.” The same can be said the other way around.

The logic behind the interpretation of CIs as Bayesian is compelling. There is just this little rub about choosing a prior. Which one and why?

See http://www.bayesian-inference.com/priorclasses

Briggs says, “The punchline: 95% of these confidence intervals will “cover” the true value of the parameter.”

If the above is true then, if A is a random draw from the set of all possible confidence intervals, it follows that P(A)=95%, which is a true probabilistic statement about a confidence interval.

I don’t need to be convinced that Bayesian interpretation is far superior to the frequentist approach, but I think Briggs is painting a far too extreme version of frequentist logic. Where, for instance, does the idea that “Frequentist theory insists that the actual interval must not be associated with any probability proposition” come from?

A confidence interval is defined as the random interval [x,y] around some parameter b, for which P(x<b<y)=.95 is true. So I'm not getting how one cannot make probabilistic statements about confidence intervals (or any other interval, including a subset of the CI). Given the central limit theorem, the probability P(x<b<y) can be defined for any x and y.
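The long-run statement P(x<b<y)=.95 can be checked by simulation. The sketch below assumes normally distributed data, a sample size of 20 and the Student-t interval for a mean; the true mean of 1.7 and all names are illustrative. It estimates how often the procedure covers the true value over many repetitions, and then shows that a single realised interval simply does or does not contain it.

```python
import random
import statistics

N = 20
T_CRIT_19 = 2.093  # two-sided 95% Student-t critical value, 19 degrees of freedom


def t_interval(sample):
    """95% t interval for the mean of a sample of size N."""
    centre = statistics.fmean(sample)
    half = T_CRIT_19 * statistics.stdev(sample) / len(sample) ** 0.5
    return centre - half, centre + half


def coverage(true_mean=1.7, sigma=1.0, trials=20000, seed=4):
    """Fraction of repeated 'experiments' whose interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(N)]
        lo, hi = t_interval(sample)
        hits += lo < true_mean < hi
    return hits / trials


print(f"long-run coverage of the procedure: {coverage():.3f}")

# One realised interval: it either contains 1.7 or it does not.
rng = random.Random(5)
one_sample = [rng.gauss(1.7, 1.0) for _ in range(N)]
lo, hi = t_interval(one_sample)
print(f"single interval [{lo:.2f}, {hi:.2f}] contains 1.7: {lo < 1.7 < hi}")
```

The simulation is neutral on the dispute in this thread: it confirms the coverage property of the procedure and leaves open what, if anything, may be said about the one interval in hand.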

The fact that frequentists treat the interval [x,y] as random and the parameter as fixed does not mean we can't make probabilistic statements, such as a confidence interval.

Bayesians treat the parameters as random and the data as fixed, while frequentists treat the parameters as fixed and the data as random. How does it help to argue that frequentists don't get to make probability statements? Both groups are dealing with probability statements.

If all frequentists use Bayesian logic, are you sure you are not using an overly narrow conception of frequentist theory?

Knut,

Your counter example fails because your “random draw” does not correspond to frequentist theory. Indeed, it is Bayesian.

If you are unsure where the statements I made came from, as you appear to be, you are welcome to look up the references I gave.

Plus, you have not answered the main criticism, which is that all one can say about the interval in front of you is that it either contains the “true value” or it doesn’t.

While your practical criticism is valid within reason, I can summarize your main point as follows: If something is not immediately obvious, don’t bother to learn it. It must be wrong and useless. I wish you luck as a scientist with that attitude.

Frequentist confidence intervals summarize evidence in a rigorous and objective way. It allows you to compare the precision of different experiments and to combine their evidence. What else could you want?

Bayesian inference is not more intuitive and it does not protect you from making the wrong decisions based on its probability statements. If you take it seriously, you also have to talk about the problem with subjective priors, and if you want to avoid logical inconsistencies, about Jeffreys priors and reference priors. And then the ease of Bayesian inference is out of the window. Maximising a likelihood is simple.

HD,

Allow me to summarize your main point this way: “Frequentism is right even when it’s wrong.”

See the Classic Posts page.