When I checked FiveThirtyEight.com’s Senate prediction, it said “Republicans have a 72.3% chance of winning a majority.” There were also words to the effect that the site finds the “probability that each party will win control of the Senate by running those odds through thousands of simulations.”
What are these “simulations”? Who knows. But probably not what you or even the staff at FiveThirtyEight think they are. “Simulations” have more than a patina of mysticism about them and often mask what’s really happening. But it doesn’t matter. Call whatever evidence Nate Silver’s team assembled, which includes the evidence of their calculation methods, S.
S has whatever polling data Silver thought relevant (at the date I accessed his forecast), details on past races, maybe even information about the voters in each state, anything that was thought probative of the proposition R = “Republicans win the majority.”
Put into equation form, we have Pr( R | S ) = 0.723. Note that that’s 0.723 and not 0.722. Never mind.
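For the curious, here is a minimal sketch of what "running those odds through thousands of simulations" could mean mechanically. It is not FiveThirtyEight's method, which we do not know. The per-race probabilities are invented, the races are assumed independent, and the names `exact_majority_prob` and `simulated_majority_prob` are hypothetical. The point is only that, once S fixes the per-race odds, the answer can be deduced directly, and the simulation merely approximates that deduction.

```python
import random

def exact_majority_prob(win_probs, seats_needed):
    """Deduce Pr(at least seats_needed wins) directly from the per-race odds.

    Assumes the races are independent (an assumption made for this sketch).
    """
    # dist[k] = probability of exactly k wins among the races processed so far
    dist = [1.0]
    for p in win_probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - p)   # this race is lost
            new[k + 1] += mass * p     # this race is won
        dist = new
    return sum(dist[seats_needed:])

def simulated_majority_prob(win_probs, seats_needed, n_sims=20000, seed=0):
    """Approximate the same probability by repeated pseudo-random draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        wins = sum(rng.random() < p for p in win_probs)
        if wins >= seats_needed:
            hits += 1
    return hits / n_sims

# Hypothetical per-race win probabilities, invented for illustration only.
race_probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
needed = 4  # hypothetical number of wins required for a majority

exact = exact_majority_prob(race_probs, needed)
approx = simulated_majority_prob(race_probs, needed)
print(f"deduced: {exact:.4f}  simulated: {approx:.4f}")
```

The two numbers agree to within simulation noise, which is all a simulation of this kind can offer: an approximation of a probability already fixed by the evidence.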
Now the outcome was that the Republicans did take the Senate. This observation does not mean the probability was right, nor that it was wrong. Assuming Silver’s team didn’t make any errors (a big assumption, given that they admitted to using “simulations”), the probability was correct. It would have been correct no matter what happened in the elections. That is, the probability did not have to be Pr( R | S ) = 1 or 0.
Here’s why. Suppose our evidence is that C = “We have a two-sided coin which, when flipped, must show one and only one of two sides, head or tail” and proposition H = “A head shows.” We have Pr(H | C) = 1/2. We apply this model to a real coin and discover, after flipping, that the head shows, i.e. H is true given this observation. The probability of H given C was correct. The probability of H given the observation, which equals 1 (of course), is also correct.
With a coin it is possible, and it has been done, to find the precise physical conditions which will tell us exactly which way the coin will land. That is, given sufficient resources we can find a C’ for which either Pr(H | C’) = 0 or Pr(H | C’) = 1, depending on whether the coin will land tails or heads. This C’ is then a causal model.
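The contrast between C and C’ can be sketched in a few lines. Everything here is a toy: `causal_coin` is a hypothetical stand-in for the real physics (it assumes the coin simply changes face once per half turn), and the function names are invented for illustration. Under C we know only the list of possibilities, so the probability is deduced by counting. Under C’ we know the conditions of the flip, so the probability collapses to 0 or 1.

```python
from fractions import Fraction

def prob_given_possibilities(possibilities, target):
    """Pr(target | C), where C says only that exactly one listed outcome must occur."""
    favorable = sum(1 for side in possibilities if side == target)
    return Fraction(favorable, len(possibilities))

def causal_coin(starts_heads_up, half_turns):
    """Hypothetical C': a toy physics in which the coin changes face every half turn."""
    heads_up = starts_heads_up if half_turns % 2 == 0 else not starts_heads_up
    return "head" if heads_up else "tail"

def prob_head_given_causal(starts_heads_up, half_turns):
    """Pr(H | C'): always 0 or 1, because C' determines the outcome."""
    return 1 if causal_coin(starts_heads_up, half_turns) == "head" else 0

p_c = prob_given_possibilities(["head", "tail"], "head")
print(p_c)                               # 1/2
print(prob_head_given_causal(True, 3))   # 0
print(prob_head_given_causal(True, 4))   # 1
```

Both answers are correct at once. They differ only because the evidence on the right-hand side of the bar differs.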
The original model C had nothing to do with causality, except in the weak sense that it correctly listed the possibilities. Now we don’t know for sure, but it’s likely that S contained no causal elements either. What might one of these elements have looked like? Suppose one Senate race was fixed (say in Illinois or New York, where such things are de rigueur) and that S had that information. This part of the election was purely causal. But all of S obviously was not.
Other examples of mixed models, which contain both causes and probabilities, are, yes, Global Warming Models, though they lean toward being fully causal. The final forecasts are not fully causal, however, which is important. Here’s why. C’ was a fully causal model. Suppose we asked the same folks who program GCMs to model our coin. Because they’re too busy applying for new multi-year grants, they don’t pay too much attention and say Pr( H | C’) = 0, when in fact H occurs. Then C’ has been falsified.
But suppose a climatologist makes a forecast which says, “Pr( Temperatures soar | 98% Consensus Model) = 1 – epsilon”, where epsilon is any number greater than 0 (and less than 1, naturally). Then even if the temperatures do not soar, which they did not, the model has not been falsified.
No non-causal probability model can be falsified, as long as that model gives non-zero probabilities to possible events. This is a fact of life.
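The rule can be written down as a check. This is a minimal sketch with a hypothetical helper name, `is_falsified`. A forecast is falsified only when it called the actual outcome impossible: it said 0 and the event happened, or it said 1 and the event did not.

```python
def is_falsified(prob_assigned_to_event, event_occurred):
    """True only if the model gave probability 0 to what actually happened."""
    if event_occurred:
        return prob_assigned_to_event == 0
    # The event did not occur, so the model assigned it (1 - prob) of not occurring.
    return prob_assigned_to_event == 1

# The sloppy causal coin model: Pr(H | C') = 0, yet H occurred.
coin_model_falsified = is_falsified(0, event_occurred=True)

# A forecast of 1 - epsilon for soaring temperatures, which did not soar.
epsilon = 0.01  # any number strictly between 0 and 1 gives the same verdict
climate_model_falsified = is_falsified(1 - epsilon, event_occurred=False)

# The Senate forecast: Pr(R | S) = 0.723, and R occurred.
senate_model_falsified = is_falsified(0.723, event_occurred=True)

print(coin_model_falsified, climate_model_falsified, senate_model_falsified)
# True False False
```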
Thus can pollsters and climatologists remain ever hopeful, because, I must emphasize, the probabilities they issued (assuming no error in calculation) are correct, they are right, they are true. And truth is always to be embraced.
Now Silver’s and any modeler’s job is to find those elements and conditions that make their models as close to causal as possible, which means making their predictions as close to 0 or 1 as possible. It often turns out that two or more different people have differing information, which results in different—and correct—probabilities. Thus is gambling born.
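A small illustration of differing information producing differing, and equally correct, probabilities. The die example and the names `alice` and `bob` are invented here, not taken from any forecast. Alice knows only that a six-sided die was thrown. Bob also knows the result was even. Each deduces the probability of a six from what he or she knows.

```python
from fractions import Fraction

def prob(proposition, possibilities):
    """Pr(proposition | exactly one of the listed possibilities occurs)."""
    favorable = [s for s in possibilities if proposition(s)]
    return Fraction(len(favorable), len(possibilities))

def is_six(side):
    return side == 6

die = list(range(1, 7))

# Alice's evidence: a six-sided die was thrown.
alice = prob(is_six, die)

# Bob's evidence: the same, plus the knowledge that the result was even.
bob = prob(is_six, [s for s in die if s % 2 == 0])

print(alice, bob)  # 1/6 1/3
```

Bob's number is closer to 0 or 1 than Alice's only because he holds more of the relevant information. Neither number is wrong, and a wager between the two is easy to imagine.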
From this, it might be clear that we can also find models which are as far from causal as possible, but which still contain all the correct information about possibilities. C was an example. These pure probability models are extremely important, and words like “uniformity”, “maximum entropy”, “information”, and “random” (as computer science theorists use the term) arise. These for another day.
Update: I idiotically forgot to say anything about usefulness. So here’s a teaser: probabilities are not decisions!