*Say! What happened to lessons three through four or five? Who knows. This morning, I’m dreadfully rushed, so just a sketch. I do not expect anybody to be convinced this fine day.*

Where were we?

Suppose I’m interested in the ages (in whole years) of my blog readers. Now, except for three or four readers, I don’t know these ages, do I? Which means I’m uncertain, and thus I’ll use some kind of probability model to quantify my uncertainty in these numbers.

In some cases, I can supply premises (evidence, information) that allow me to deduce the probability model that represents my uncertainty in some observable. This applies to most casino games of chance.

But most times I cannot find such evidence. That is, there do not exist plausible premises that allow me to say that a certain probability model is *the* probability model that should be used. What to do? Why, just assume *for the sake of argument* that I do know which probability model that should be used! Problem solved.

Most times, for anything that resembles a number (like ages), a normal distribution is used. This is usually done out of laziness, custom, or ignorance of other choices. Before I can describe just what assuming a probability model does, we should understand what a normal distribution is.

It is the bell-shaped curve you’ve heard of, and it gives the probability of every number. And every number is just that: every number. How many are every? Well, from all the way out to negative infinity, progressing through zero, and shooting off towards positive infinity. And in between these infinities are infinite other numbers. Why, even between 0 and 1 there are an infinite number of numbers.

Because of this quirk of mathematics, when using the normal to quantify probability, the probability of *any number* is zero *in all problems* (not just ages). That is, given we accept a normal distribution, the probability of seeing an age of (say) 40 is precisely zero. The probability of seeing 41 is zero, as is the probability of seeing 42, 43, and so on.

As said, this isn’t just for ages: the probability of any number anywhere in any situation is zero when using normals. But even though the probability of anything happening is zero, we can still (bizarrely) calculate the probability of *intervals* of numbers. For example, we can say that, given a normal, the chance of seeing ages between 40 and 45 is some percent; even though each of the numbers in that interval can’t happen.
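To see the oddity in numbers, here is a small sketch using only the standard library. The parameters (mean 40, standard deviation 15) are made up for illustration, not fitted to any real readers:

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal with mean mu and standard deviation sigma."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 40, 15  # hypothetical parameters, assumed for the example

# The probability of any single point is an interval of width zero:
p_exactly_40 = normal_cdf(40, mu, sigma) - normal_cdf(40, mu, sigma)
print(p_exactly_40)  # 0.0

# Yet intervals receive positive probability:
p_40_to_45 = normal_cdf(45, mu, sigma) - normal_cdf(40, mu, sigma)
print(p_40_to_45)  # roughly 0.13: a healthy chance, built from "impossible" points
```

So the model says no particular age can happen, while the collection of those ages between 40 and 45 happens about 13% of the time.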

Somewhat abnormal, no? It’s still worse because, as said, normals give probability to every interval, including the interval from negative infinity to zero. Which in our case translates to a definite probability of ages less than 0. It also means the normal assigns positive probability to ages greater than, say, 130. An example later will make this all clearer.
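A quick sketch of this point, again under the made-up assumption that ages follow a normal with mean 40 and standard deviation 15:

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal with mean mu and standard deviation sigma."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 40, 15  # hypothetical parameters, assumed for the example

p_negative_age = normal_cdf(0, mu, sigma)    # P(age < 0): impossible, yet positive
p_over_130 = 1 - normal_cdf(130, mu, sigma)  # P(age > 130): ditto

print(p_negative_age)  # small, but strictly greater than zero
print(p_over_130)      # tiny, but still positive
```

Small numbers, yes; but the model insists, with nonzero probability, on readers not yet born and readers older than anybody who has ever lived.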

The main point: the normal stinks as a vehicle to describe uncertainty. So why is it used? Because mathematicians love mathematics, and because of a misunderstanding of what statisticians call the *central limit theorem*. That theorem says that, for nearly any set of numbers (those drawn independently from a distribution with finite variance), the distribution of their *averages* converges to a normal as the sample size grows to infinity.

This theorem is correct; it’s mathematics precise and true. But not all mathematical constructions have any real-life applicability. Anyway, the central limit theorem is a theorem about averages, not actual observations.

Plus we have the problem that we’re not interested in averages of the ages, but of the ages themselves. Another problem: I don’t (sad to say) have infinite numbers of readers.
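The averages-versus-observations distinction can be simulated. Suppose, purely as an assumption for illustration, that ages are drawn uniformly from 20 to 60: the raw draws are flat, nothing like a bell, but their averages pile up tightly around the middle, which is all the central limit theorem speaks of.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

def draw_age():
    # Raw observations: uniform over 20..60, decidedly not normal.
    return random.randint(20, 60)

# Averages of samples of size 100: these are what the CLT describes.
averages = [statistics.mean(draw_age() for _ in range(100)) for _ in range(2000)]

# The averages cluster near 40 with small spread (roughly 1.2 here),
# while the individual ages stay spread evenly across 20 to 60.
print(statistics.mean(averages))
print(statistics.stdev(averages))
```

The bell shape belongs to the averages, a quantity nobody here asked about; the ages themselves never acquire it, no matter how many readers show up.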

Yet it is inescapable that normal distributions are used all the time, everywhere; and it is said that they can sometimes give reasonable approximations. Both statements are true. They are ubiquitous (I almost wrote iniquitous). And they *can* give reasonable approximations. It’s just that they often do not.

We have to understand what is meant by “approximation”. This is tricky; almost as tricky as viewing probability as logic for the first time.

Now, based on my knowledge that ages are in whole years, and that nobody can be less than 0, and that nobody can be of Methuselahian age, the probability that any pronouncement I make using a normal distribution about ages is true is exactly 0; which is to say, it is false. This means I *know with certainty* that I will be talking gibberish when I use a normal.

Unless I add a premise which goes something like, “All pronouncements will be roughly correct; but none will be exactly correct.” And what does that imply? Well, we shall see.

(Fisher, incidentally, knew of the problems of normals and warned users to be cautious. But like his warning about over-reliance on p-values, the warning was quickly forgotten.)