Statistics

Everything Is Already In The Simulation

On Drudge a while back was linked the article “Pandemic ‘could wipe out 900 million people,’ experts warn” in some tabloid or sensationalistic rag (New York Times?).

A chilling simulation has revealed just how easily a new pathogen could wipe out a huge slice of the world’s population — up to 900 million people.

Researchers at Johns Hopkins University simulated the spread of a new illness: a new type of parainfluenza, known as Clade X.

The simulation was designed so the pathogen wasn’t markedly more dangerous than real illnesses such as SARS — and illustrates the tightrope governments tread in responding to such illnesses.

Here’s the world’s simplest chilling simulation:

     Input X —> Output X.

Now imagine you’re a scientist anxious to understand how millions will die. Input “Millions will die from splenetic fever” (i.e., a mind-fever produced by consuming too much news media). What’s the output? “Millions will die from splenetic fever.”

What’s the headline?

Artificial Intelligence Computer Model Predicts Millions To Die By Splenetic Fever!

Doubtless “climate change” would feature in the body of the breathless article.

You will say the example is silly, which it is. But it is no different in a fundamental sense from the linked article. There, there was an Input and Output, and an algorithm to turn the one into the other. (The algorithm was “—>”.)
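For the skeptical, here is a minimal sketch of that simplest simulation in Python. The function names `chilling_simulation` and `headline` are made up for illustration; the only point is that what comes out is exactly what was put in.

```python
def chilling_simulation(x: str) -> str:
    """The whole algorithm is the arrow: hand the input straight back."""
    return x


def headline(y: str) -> str:
    """Dress the output up the way a breathless article would."""
    return f"Artificial Intelligence Computer Model Predicts {y}!"


x = "Millions will die from splenetic fever"
y = chilling_simulation(x)
print(headline(y))
```

Swap in any X you like. The headline follows along obediently.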

The algorithm is designed by the scientist or “researcher.” It does what it is told to do. Always. The algorithm, any algorithm, was programmed on purpose to say “When you see X, say Y”, however complicated the steps in between from X to Y. This is so even if the algorithm uses “randomness” (see the full dope on the severe limitations of simulation methods).
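A rough sketch of the “randomness” point, assuming a toy model in which each person dies with a probability the designer picks. The names `simulate_deaths`, `p_death`, and `seed` are hypothetical, and the numbers are invented, not taken from any real study.

```python
import random


def simulate_deaths(population: int, p_death: float, seed: int) -> int:
    """Count deaths when each person dies with designer-chosen probability p_death."""
    rng = random.Random(seed)  # the "randomness" is itself a fixed recipe
    return sum(1 for _ in range(population) if rng.random() < p_death)


# Same inputs, same seed: the same answer every time.
first = simulate_deaths(10_000, 0.12, seed=1)
second = simulate_deaths(10_000, 0.12, seed=1)
print(first == second)

# The extremes show who is really in charge: the designer's p_death.
print(simulate_deaths(10_000, 0.0, seed=1), simulate_deaths(10_000, 1.0, seed=1))
```

Change the seed and the count wobbles a little around whatever `p_death` dictates. Change `p_death` and the toll moves wherever you tell it to go.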

Of course, some algorithms are so complicated that some people cannot see which combinations of X lead to which combinations of Y. So what? Some people can’t multiply two numbers without a calculator, but multiplication is no mystery. That X leads to Y is in any algorithm by design. It was put there!

If you want to cheat, or cheat yourself, the path is clear. Call X whatever you like, label the algorithm a “simulation” or “deep learning” or “artificial intelligence” or similar, and then express marvel at Y. Again, sometimes the path is not clear from X to Y, and the way the algorithm produces Y might teach you something about X. But since X is put there by you, and the algorithm does what you told it, it cannot be marvelous when it works as it should.

This, incidentally, is why there is not one whit of difference between a “simulation”, “forecast”, “prediction”, “prognostication”, “scenario”, or any of the other words that describe getting from X to Y. People who take refuge in a failed “scenario” by claiming the scenario wasn’t a forecast are fooling themselves. And possibly you, too.

There is no saying the Y has to be certain: it need only be probable with reference to X and the algorithm.

Anybody notice the similarities between any probability model, or mathematical model, or indeed any model at all? You should by now.

A simulation, prediction, etc., can fail in two ways. X could be mismeasured or misspecified while the algorithm is good. Mistakes happen. Or X could be fine and the algorithm stinks. Or both at once. Pros, like those behind the linked article above, rarely screw up X. But they love their algorithms too well. Algorithms can be right in saying Y from X, but wrong in why Y truly came about. Monkeys throwing darts can pick good stocks.

Of course, I am not saying there will not be a pandemic where a seventh of the population is wiped out. Nor am I claiming “a doomsday cult” won’t release “a genetically engineered virus.” But if you’re writing a simulation that takes as input X = “Doomsday cult releases genetically engineered virus”, part of that algorithm that leads to Y = “Nearly a billion die” has to specify, by design, the kind of virus that would kill a billion in a manner that must be imagined by the algorithm’s designers.
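To make that concrete, here is a toy epidemic model, a bare-bones discrete-time SIR sketch and emphatically not the model used in the linked article. Every name and number in it (`toy_pandemic`, `r0`, `fatality_rate`, the population of one million) is an assumption invented for illustration. Notice that the death toll is simply the fatality rate the designer typed in, multiplied by however many people the designer's transmission settings manage to infect.

```python
def toy_pandemic(population: float, r0: float, fatality_rate: float,
                 days: int = 1000, infectious_days: float = 10.0) -> float:
    """Deaths from a made-up virus under a simple day-by-day SIR model."""
    gamma = 1.0 / infectious_days   # daily recovery-or-death rate
    beta = r0 * gamma               # daily transmission rate implied by r0
    s, i, r = population - 1.0, 1.0, 0.0  # one initial case
    for _ in range(days):
        new_infections = min(s, beta * s * i / population)
        new_resolved = gamma * i
        s -= new_infections
        i += new_infections - new_resolved
        r += new_resolved
    # The designer decides what fraction of resolved cases are deaths.
    return fatality_rate * r


mild = toy_pandemic(1_000_000, r0=2.5, fatality_rate=0.1)
nasty = toy_pandemic(1_000_000, r0=2.5, fatality_rate=0.2)
print(round(mild), round(nasty))
```

Double `fatality_rate` and the body count doubles. Drop `r0` below one and the chilling pandemic fizzles. The output was in the inputs all along.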

That is, we are not at from our simple chilling algorithm. Update: Yes, I wrote the previous sentence. And even I have no idea what it means.


12 replies »

  1. You have ruined me forever. Every time I see a study or headline, engage in a discussion with someone, I immediately ask or at least wonder what the input parameters and the algorithm for output are. I find myself saying things like “That depends on the input parameters” or “What was the algorithm designed to do?”. I am ruined for all of eternity, never able to simply say WE’RE ALL GOING TO DIE!!!! again. 🙂

  2. A person can do a lot with a pencil, a piece of paper, numbers, and a goal. Lots of folks make their living doing this.

  3. Algorithms that simulate disease transmission are tunable, because different diseases differ in mortality, infection rate and whatnot. Tweaking the parameters so you can model historical epidemics is reasonable, because then you will get some insight into what might happen when a much more aggressive variant of some disease, or a new one, is found.

    What this headline says is that this particular new nasty will not kill everybody, even with the model tweaked for this disease to its most horrible setting. Which is good news, in a way, but good news doesn’t sell.

  4. Models and more predictions. More crystal balls. Some inside people’s heads. This is like the Briggs Saturday movable feasts on pulp often fiction.

    When something, anything, bad happens, he and others will join in the refrain,
    “we was right”.
    Like the hypochondriac’s gravestone which reads:

    “told you I was ill”.
    It’s so important to be proved right!

  5. Sheri —

    So true. I can’t even engage in morning break room banter over the latest headline stating caffeine is good without first asking for the underlying assumptions in the model.

  6. Using epidemic simulation makes sense. It doesn’t provide an absolute truth, but it can, in fact, provide insight into the problem. Briggs’ problem should be with the headlines, which imply certainty, rather than with the use of the model.
