The Only Thing That Has Fat Tails Are Rappers: More On The PP Fallacy

Evidence for an event cannot be the horrific consequences of that event.

Suppose we’re interested in the probability of Y = “Horrific plague”. Since nothing has a probability, we first need to supply evidence probative of Y with which to calculate a probability. I only mean “calculate” loosely; not all probability is strictly quantifiable.

Suppose this evidence: X = “Horrific plagues are horrific”. Then what is

     Pr(Y | X) = ?

I am praying it is obvious that X is not probative of Y. X indeed is merely a restatement of the quality of Y: horrific plagues are horrific. This is a tautology. It is not new evidence. X in no way gives us any evidence of how Y might come about, about how Y is caused, about the circumstances of Y in any way. X merely restates that horrific is horrific.

Thus Pr(Y | X) has no solution; the probability is, if you like, the entire unit interval (this is deduced via obscure arguments not of interest here). We cannot get from X to Y. Agreed?

Suppose instead our evidence is: X = “Horrific plagues have fat tails.” Now what is

     Pr(Y | X) = ?

Some seem to think that, suddenly, this probability is now greater than 0; perhaps small, perhaps not a tangible number, but a number greater than 0 just the same. And since a number greater than 0 is real enough, Y suddenly becomes possible. We have moved from complete ignorance in whether Y could happen to definite information that Y might happen.

Then comes step two: since Y might happen, given our assumed X, and Y is by definition horrific, then we must take immediate steps to stop or mitigate Y!

Now this all came up in a conversation I had with a fellow, a rabid fan of Taleb, and I can only hope my brigade of boosters is as tenacious, who kept insisting, via his master, that because horrific plagues “have” “fat tails” we should move to protect against them. He used the analogy of buying fire insurance for his home. He acknowledged he only had a dim idea of the probability of his home burning, but he knew well the consequences of it. So he bought insurance.

The two scenarios are not equivalent. (I informed him his insurance company had such evidence and knew the price, which is how they set the price.)

First, by “fat tails” Taleb is mixing up probability of events and the consequences of those events, sort of. Horrific plagues, for instance, are rare: we observe they do not often happen, though they do happen on occasion, and (again) by definition they are horrific. Taleb thinks that, somehow never mind how, these rare consequential events actually possess the property of “fat tailness”.

Since it is a property, it can be measured. And since it can be measured, Y suddenly becomes possible, and since Y is possible and horrific, it should be protected against. Like Black Swans Attacking From Outer Space.

Do we even need to dissect this? The answer is: yes; yes, we do. Because somehow this argument “feels” convincing to some.

What can “fat tails” mean? Well, it can mean “rare”. Thus X = “The event Y is rare” then

     Pr(Y | X) = small.

Nothing wrong with this mathematically. But it’s circular since X assumes the probability we wish to deduce. We wanted the probability of Y with respect to some evidence, and the evidence dictated the rarity.

So “fat tails” meaning “rare” is out. How about “fat tails” means “horrific”, as in consequential? What’s Pr(Y|X) now?

We already did that. We can’t deduce the probability of the event by discussing the consequences of the event.

We have rejected both “rare” and “horrific” (or other similar large consequential words). How about “fat tails” means “rare and horrific”? Then what is Pr(Y|X)? Easy: Pr(Y | X) = small. But that’s again circular and thus of no help.

Saying “fat tails” is absolutely useless, except as a shorthand to say “bad things happen, and really bad unpredicted things happen rarely.” Which everybody always knew. If we knew how to predict the really bad things, we’d predict them. It follows that because we can’t predict them (always), we can’t predict them (always).

So what are people really doing when they’re fretting over, say, this new coronavirus? Some are doing this: X = (something like) “Y is really terrible, and I worry I’ll die if I get the bug, or that many will die if they get it, as some have already died and where the media is implying huge numbers might die, so we better do something to prevent its further spread.” Then

     Pr(Y | X) = modest to large.

This X is a confusion of many different pieces of evidence, some of which are truly probative, others of which are restatements of “fat tails”.

An epidemiologist has a better X. He uses historical evidence of outbreaks, ties this together with mathematical this and that, which also uses the plain evidence that the outbreak is now occurring. With this he pegs a Pr(Y|X). This may be high or low depending on what model the epidemiologist uses. It has nothing to do with “fat tails” in any ontological sense, though everything to do with probability (when “fat tails” means rare).

You care about the coronavirus now because you’re hearing about it now. But two weeks ago, or whenever it was, before you heard about it you didn’t care about it. There was no X most were willing to consider, except historical X, about possible new plagues. It’s very reasonable for somebody to argue X = “Horrific plagues have happened before, and here is a list of reasons, such as easy air travel, why they may happen again.” We can get a Pr(Y|X). We can calculate, roughly anyway, costs of protection and so on. Fun stuff.

You didn’t two weeks ago, or whenever, say “Oh my! Horrific plagues have fat tails and anything might happen, which we know because we don’t know what can happen, so let’s ban air travel now just in case!” That’s a pure Talebism, relying on the precautionary principle. I quote myself:

That we don’t know what we don’t know is known, or should be, and is thus a given. But because we don’t know what we don’t know does not make what we don’t know bad. It could also be good, or benign. To say it could only be bad is the PP [precautionary principle] Fallacy.

It is incoherent to run in a circle screaming “We know nothing so we must do something!” Can it really be that “fat tails” means “unpredicted”? That Pr(Y| Y unpredicted) = not small? That’s what the precautionary principle does for you: truly something out of nothing.

You care now because you have heard definite evidence in favor of the proposition Y. That evidence may be good or bad, who knows at this point. It is not irrational, conditional on these X, to say the probability this plague will be horrific is not small. But you’re saying this because of the definite evidence of infection and deaths you have heard about. You’re not saying it because horrific plagues have “fat tails” and “fat tails” are scary.

We come to the main two points: all probability is conditional, and the conditions are what are important. No event “has” a probability. All probability depends on the evidence we assume. And it is that evidence which we should always be debating. Probability (except for the math bits) is always trivial once the evidence is laid down.
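To make the point concrete, here is a minimal sketch, not anything from Taleb or from an epidemiologist's toolkit. It takes one and the same proposition, Y = “the quantity lands more than k units from the center”, and computes Pr(Y|X) under two different assumed X: a standard normal model and a standard Cauchy model (a textbook “fat-tailed” choice). Both models, the function names, and the values of k are assumptions picked for illustration only.

```python
import math


def normal_two_sided_tail(k: float) -> float:
    """Pr(|Z| > k) given the assumption that Z is standard normal."""
    # 2 * (1 - Phi(k)) equals erfc(k / sqrt(2)).
    return math.erfc(k / math.sqrt(2.0))


def cauchy_two_sided_tail(k: float) -> float:
    """Pr(|C| > k) given the assumption that C is standard Cauchy."""
    # The Cauchy CDF is 1/2 + atan(x)/pi, so the two-sided tail is 1 - 2*atan(k)/pi.
    return 1.0 - (2.0 / math.pi) * math.atan(k)


if __name__ == "__main__":
    # Same Y at each k; only the assumed evidence X changes.
    for k in (1, 3, 5, 10):
        p_normal = normal_two_sided_tail(k)
        p_cauchy = cauchy_two_sided_tail(k)
        print(f"k={k:>2}  Pr(Y|X_normal)={p_normal:.3e}  Pr(Y|X_cauchy)={p_cauchy:.3e}")
```

The event is identical in both columns. The “tail” belongs to the model we assumed, which is why the argument worth having is over which X is warranted.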

So I ask you, what is the X = actual evidence for this actual outbreak to be an actual Y, i.e. a horrific plague? If you only go with the negatives, your Pr(Y|X) will be high. If you factor in nervousness and instant news cycles hungry for sensation, your Pr(Y|X) will be much lower.

Which X are the correct X? That is the right question.

10 Thoughts

  1. “(I informed him his insurance company had such evidence and knew the price, which is how they set the price.)”

    They estimate the probability from the frequencies too (frequentism). 😉

    Justin

  2. Fat tails: my approximation underpredicts the frequency of unusual/uncommon events and overpredicts the frequency of common events. This commonly occurs when using simple approximations (e.g. point estimates of sample means and variances, and normal distributions).

  3. In the case of insurance, they would be out of business if their models were bad. Generally it seems that the relative frequency of the kinds of past events that they insure appears to predict pretty well the number of such events a company must pay for each year, so they can set their prices appropriately. This is not at all the same as the probability of something Bad happening to You, as that Bad thing had very specific causes. If you are a skydiver who isn’t great at packing your parachute, your probability is probably very different from someone who stays home all day every day.
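The pricing idea in this comment can be sketched in a few lines. This is a toy under stated assumptions, not how any real insurer prices a policy: the function names are hypothetical, the figures in the usage note are invented for illustration, and loadings for expenses and profit are ignored.

```python
def relative_frequency(claims: int, policy_years: int) -> float:
    """Share of past policy-years that ended in a claim (the pooled evidence)."""
    if policy_years <= 0:
        raise ValueError("policy_years must be positive")
    if not 0 <= claims <= policy_years:
        raise ValueError("claims must lie between 0 and policy_years")
    return claims / policy_years


def break_even_premium(claims: int, policy_years: int, average_payout: float) -> float:
    """Expected payout per policy-year, conditional only on the pooled history."""
    return relative_frequency(claims, policy_years) * average_payout


if __name__ == "__main__":
    # Invented illustrative figures: 30 fires in 10,000 policy-years, 200,000 average payout.
    print(break_even_premium(claims=30, policy_years=10_000, average_payout=200_000.0))
```

The number that comes out is conditional on the pooled history and nothing else. Add the evidence that a given policyholder packs his own parachute badly and the relevant X, and so the probability, changes.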

  4. To give Taleb the benefit of charity, perhaps he means by a fat tail that he is quantifying the uncertainty in Y by a distribution that has a large or divergent first moment, as opposed to a Gaussian or exponential.
    That doesn’t bail out the PP, but at least makes the arguments clearer. Support for the estimate of Y is still needed.

  5. Thanks, Matt. My understanding, which may indeed be incorrect, is that a “fat-tailed” distribution means that events far from the mean (average) are more common than in a standard Gaussian normal distribution.

    A common example in the area where I use statistics, which is climate science, is the difference between Gaussian normal distributions, and high Hurst Exponent distributions like fractional Gaussian noise. The FGN distribution is much “fatter-tailed” than the Gaussian normal distribution. Here’s a comparison of the two:

    https://wattsupwiththat.files.wordpress.com/2015/07/histogram-trends-fractional-gaussian-random-normal-pseudodata.png?resize=681%2C667

    The result of this difference between FGN and Gaussian normal is a lot of claims that uncommon events far from the average run-of-the-mill weather are “significant”, or that they are not “natural” but man-made, when in fact they are a totally expected result of a “fat-tailed” FGN distribution.

    Anyhow, that’s how I understand “fat-tailed”. But as I’ve said more than once … I’ve been wrong before …

    w.

  6. Willis,

    A “fat-tailed” distribution, as I said in the original post (and in many other places), is one which puts higher probability on larger events than a normal distribution. This is Taleb’s big push: to remove reliance on normal distributions. Same things used in regressions, as I go on and on about.

    Nothing “has” a “fat tail”, though. Nothing is normal or “has” a normal tail, either. Probability is only assigned, or deduced, with respect to evidence assumed. And it’s the evidence that counts. There may be here in this case some biologist waving his arms around saying, “Look at my evidence. I predicted this!” To this fellow, the event was not rare, it was highly probable given the evidence he used. To me, not a viral expert, it was informally rarer.

    Here today (and to answer your question on the other post too) is an article from Forbes showing how to get it wrong: The Coronavirus Is A Black Swan Event That May Have Serious Repercussions For The U.S. Economy And Job Market.

    “A black swan event is a term used on Wall Street that refers to a rare and unpredictable occurrence that is beyond what is expected and has severe consequences.”

    Basically just saying BAD THINGS HAPPEN and this is a BAD THING. This guy also thinks that all such events are “unpredictable”. And that means “can’t be assigned not-low probability”, or something like that.

    Now some things are unpredictable in this sense, like an electron’s position given we know its velocity with certainty. We know this based on certain evidence.

    There’s no proof I know of that shows souped-up cold virus outbreaks are unpredictable. It’s just that most don’t predict it. The evidence that would have given the event high probability might have been available. Might still be available.

  7. Matt, that clarifies some things very much, thanks. However, I’m stuck on this one:

    ===
    “Nothing “has” a “fat tail”, though. Nothing is normal or “has” a normal tail, either. Probability is only assigned, or deduced, with respect to evidence assumed. And it’s the evidence that counts.”
    ===

    I would say that a time series of 10,000 throws of an unweighted pair of dice has a normal distribution. And I would say that a time series of 10,000 hourly measurements of temperature does not have a normal distribution.

    And more to the point, this knowledge lets me evaluate things like whether a series of throws of the dice might indicate that they are weighted, or a series of temperatures might indicate that they are being influenced by some underlying change in variables.

    It doesn’t seem to me (although as I said, I’ve been wrong more than once … well, more than twice if I’m being perfectly honest) that I have NOT made assumptions about the evidence in these cases.

    What am I missing here? And please be aware, this is an honest question. As in all scientific fields, I’m totally self-taught. This has left me with an education which is wonderful because it is immensely broad, but sometimes it is too shallow …

    Regards,

    w.
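The dice question can be made exact. A minimal sketch, assuming the quantity of interest is the sum of the two dice: it deduces the distribution of that sum twice, once from the evidence “both dice are fair” and once from a hypothetical loading in which the six face comes up half the time. The function name and the loaded weights are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def sum_distribution(weights=None):
    """Exact distribution of the sum of two independent six-sided dice.

    `weights` holds the assumed probabilities of faces 1..6 (fair if omitted).
    """
    if weights is None:
        weights = [Fraction(1, 6)] * 6
    if len(weights) != 6 or sum(weights) != 1:
        raise ValueError("weights must be six probabilities summing to 1")
    dist = {}
    # Enumerate all 36 ordered pairs of (face, probability) and add up the products.
    for (a, pa), (b, pb) in product(enumerate(weights, start=1), repeat=2):
        dist[a + b] = dist.get(a + b, Fraction(0)) + pa * pb
    return dist


if __name__ == "__main__":
    fair = sum_distribution()
    # Hypothetical loading: faces 1..5 at 1/10 each, face 6 at 1/2.
    loaded = sum_distribution([Fraction(1, 10)] * 5 + [Fraction(1, 2)])
    for total in sorted(fair):
        print(f"sum={total:>2}  Pr(sum | fair)={fair[total]}  Pr(sum | loaded)={loaded[total]}")
```

Under either assumption the distribution is discrete on 2 through 12, so any bell-curve description is an approximation. Checking a run of throws for weighting amounts to asking which assumed X makes the observed throws more probable.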

  8. Willis,

    No problem. Best place to start is the Book page, which has many links to articles on this topic, or the book itself. Probability is only conditional on the evidence assumed; that pile of dice throws has causes, which if known would eliminate the need for probability.

    Of course, all evidence has assumptions attached to it, without exception. This is a well known argument in epistemology, anyway.

    This is for everybody. What evidence counts about the new/novel coronavirus? And the dual question: what does not count as evidence? To answer either (important) question requires assumptions. Even assumptions of the words and symbols you use to express the problem.

    How that all ties together is too large for a comment, but see those other sources.

  9. Dear Willis,

    10,000 throws of a pair of dice follow the binomial distribution which is discrete. The normal distribution is continuous and can produce parameters that are impossible in that situation.

    Also, dice throws are NOT a time series in that their order (in time) is not of import or interest.

    Hourly temperature measurements are a time series. Their order in time is vitally important: night-time is cooler than day-time, summer is warmer than winter, etc. Two measurements an hour apart are probably related (dependent) more so than measurements taken 1,000 hours apart.

    Time series are also discrete unless the measurements are continuous, which they never are. They are also never normal and do not follow a Gaussian curve.

    Time series analysis is quite complicated. There are many types of time series data. Some are cyclical, such as temperature measures. Some are hyperbolic, such as growth curves. Some have other discrete (step) shapes, such as in survival data. There are lots of other factors to consider, but “normality” is not one of them. Neither are “tails”, fat or otherwise.
