# Model Selection and the Difficulty of Falsifying Probability Models: Part III

To clarify: to prove false means to demonstrate a valid argument with a *certain* conclusion. If a theory/model says an event is merely unlikely—make it as unlikely as you like, as long as it remains possible—then if that event happens, the theory/model is *not* falsified. To say, “We may as well consider it falsified: it is practically falsified” is like saying, “She is practically a virgin.” False means *false*; it has *no* relationship with *unlikely*.

A theorist or statistician has in hand *a priori* evidence which says models M_{1}, M_{2}, …, M_{k} are possible. Some of these, conditional on the theorist’s evidence, may be more likely than others, but each might be true. If these models have probability components, as most models do, and these probability components say that any event is possible, no matter how unlikely, then none of these models may *ever* be falsified. Of course, if some models in the set say certain events are impossible, and those events are subsequently observed to occur, then that subset of models is falsified; the remaining models then become more likely to be true.

In Bayesian statistics, there is a natural mechanism to adjudge the so-called posterior probability of a model’s truth: it is called “posterior” because the model’s truth is conditional on observation statements, that is, on what happens empirically (this needn’t be empirical evidence, of course; any evidence will do; recall that probability is like logic in that both study the relationships *between* statements). Each model’s *a priori* probability is modified to its *a posteriori* probability via a (conceptually) simple algorithm.
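The “(conceptually) simple algorithm” is just Bayes’ theorem applied across the candidate models. A minimal sketch in Python, where the priors and likelihoods are invented numbers purely for illustration:

```python
# Posterior over candidate models M_1..M_k via Bayes' theorem:
#   P(M_i | D) = P(D | M_i) * P(M_i) / sum_j [ P(D | M_j) * P(M_j) ]
# The numbers below are made up for illustration only.

priors = [0.5, 0.3, 0.2]          # P(M_i): a priori probabilities (sum to 1)
likelihoods = [0.10, 0.40, 0.05]  # P(D | M_i): how probable each model makes the data

# Total probability of the data under the whole set of candidate models
evidence = sum(p * l for p, l in zip(priors, likelihoods))

# A posteriori probabilities: each prior reweighted by its likelihood
posteriors = [p * l / evidence for p, l in zip(priors, likelihoods)]

print(posteriors)  # the middle model now carries most of the probability
```

Note that the posteriors still sum to one over the *specified* set of models: the computation says nothing about whether the true model was included in the list, which is exactly the point taken up below.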

These *a posteriori* probabilities may be ordered from high to low, and the model with the highest *a posteriori* probability picked as “the best.” The only reason one would want to do this is if a judgment must be made about which theory/model will be subject to further action. The most common example is a criminal trial. Here, the theories or models are suspects in some crime. At the end, only one theory/model/suspect will face punishment; that is, at most one will. It may be that no theory/model/suspect is sufficiently probable for a decision to be made. But if the suspect is found guilty, it is not that the convicted theory/model/suspect is certainly guilty, for the other theories/models/suspects might also have done the deed, yet the probability that they did so, given the evidence, is adjudged low. This implies what we all know: that the convicted might be innocent (the probability he is so is one minus the probability he is guilty).

It is often the case (not just in criminal trials) that one model (given the evidence and *a priori* information) is overwhelmingly likely, and that the others are extraordinarily improbable. In these cases, we make few errors by acting upon the belief that the most probable model is true. Our visual system works this way (or so it has been written). For example, your brain assures you that that object you’re reaching for is a cup of coffee, and not, say, cola. Sipping from it provides evidence that this model was true. But as we all know, our vision is sometimes fooled. I once picked up from the carpet a “cookie” which turned out to be a wiggling cockroach (big one, too).

Now, since we began with a specified set of suspects (like my “cookie”), one path to over-certainty is not to have included the truly guilty party in the list. Given the specific evidence of that omission, the probability that any of the listed suspects is guilty is exactly zero (these theories/models are falsified). But in the trial itself, that specific evidence is not included, so we may, just as we did with the green men, calculate probabilities of guilt for the (not-guilty) suspects. Keep in mind that all probability and logic is conditional on specific, explicit premises. The probability or certainty of a conclusion changes when the premises do.
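The point about premises can be made concrete. Conditioning on the premise “the guilty party is among these suspects” renormalizes the probability over the listed suspects to one, even when fuller evidence would put nearly all the probability on someone omitted from the list. A toy sketch, with invented numbers and hypothetical suspect names:

```python
# Probabilities are conditional on explicit premises. Suppose fuller evidence
# puts most probability on an unlisted suspect X, but the trial's premise is
# "the guilty party is one of A, B, C". All numbers are invented.

full = {"A": 0.05, "B": 0.10, "C": 0.05, "X": 0.80}

# Condition on the trial's premise: guilt lies in {A, B, C} only.
premise = {"A", "B", "C"}
total = sum(p for s, p in full.items() if s in premise)
conditional = {s: p / total for s, p in full.items() if s in premise}

# B now carries about half the probability of guilt, and the conditional
# probabilities sum to one, even though on the fuller evidence X almost
# certainly did it. Change the premises and the probabilities change.
print(conditional)
```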

So what is the probability that we have not included the proper theory/model/suspect? That question *cannot* be answered: at least, it cannot be answered except in relation to some evidence or premises. This applies to *all* situations, not just criminal trials. What might this external evidence look like? We’ll find out in Part IV.

You either eat some very strangely shaped cookies or you have an unusual species of roach residing nearby.

OK. The trial thing is maybe where pragmatism breaks down. The police essentially have a time limit in many cases, self-imposed perhaps, but a limit nonetheless. They have to discard large numbers of potential suspects; we never find out why. Plus there’s the fact that the trial itself and the appearance of the defendant are often seen as evidence of guilt. ONN recently ran a parody in which a young white NJ murder defendant was ordered by the court to be seen as a large black man, with the jury instructed to see her as such.

The probability that the wrong person has been selected is the point of the trial. In theory, publishing a paper opens it to critical review, essentially a trial. Unfortunately, all too often the general appearance and form of the paper is taken as compelling evidence. Papers rarely discuss discarded alternatives, and sometimes the evidence amounts to hand-waving. And, just as with juries, it’s the show that counts. Science publication is run as a business. There’s really no easy way to fix it.

All very interesting. (No, not sarcastic …)

Leaving any other comments until I have heard the full story!


Mr Briggs, you wrote: “At the end, only one theory/model/suspect will face punishment; that is, at most one will.”

I appreciate that the criminal trial is just an analogy, but it really doesn’t support the “one true model” hypothesis. Unlike criminal suspects, models can never be convicted. This is due to the nature of the charge brought against them, which is the purely metaphysical crime of being the “one true model”.

Starting from the quotation above: surely more than one person can be guilty of a particular crime? Some crimes by definition require more than one criminal participant (price-fixing cartels). Moreover, people are not models (well, not mathematical models anyway). Models are predictive; a criminal suspect is not. For the analogy to work, perhaps it should be the ‘hypothesis of suspect guilt’ that is put on trial, rather than the suspect as a person?

Applying that idea to models, it would mean that it is the hypothesis that the model is the “one true model” that is being tested, not the model itself. Is the distinction important? I think it is, because unlike the criminal trial, we cannot find that hypothesis ‘guilty as charged’, meaning we cannot prove it is the one true model. Instead, at most, we can only find the hypothesis ‘not guilty’ of being the one true model, which we do by showing that the model is not a true model. However, you showed convincingly that falsification is not possible for a statistical model. If we cannot falsify a model, we leave its status as the “one true model” undecided. It therefore seems that all statistical models must share the same unknown (and unknowable) true/false status regarding whether they are the “one true model”. Putting models on trial cannot change that status.

Stephen P,

The statement “more than one person can be guilty of a particular crime” is itself a model or theory. Models/theories are not only predictive, they are also explanative. Nearly all of frequentist statistics is based on explanative models (though many models should be predictive). In the case of a criminal trial we only have the evidence provided about this list of suspects/models/theories. There is nothing to predict.

William B,

I think that my statement, “more than one person can be guilty of a particular crime”, is a fact, not a model or theory. It can be deduced syllogistically from the definition of certain practices (like price fixing) and their legality.

I would be happy to replace ‘predictive’ with ‘explanative/predictive’ in my text. I don’t think it changes the argument.

So I guess this means that ESP is not falsifiable?

@StephenPickering

The model in a crime case is the description of how and by whom the crime was committed: suspect A drove 500 miles in 10 minutes from place B to D on his bike to kill victim E by cutting his throat with a daisy, because A did not like F’s tie.

Invalidating the model means proving that A was somewhere else, or showing that the proposed sequence of events could not have happened in the real world.

@Mike B

If ESP makes observable predictions, it is verifiable.

I think the trial example confuses rather than clarifies your point. Stick to logic and rational models.