Let’s Try This Time Series Thing Again: Part I

Part I, II, III, IV, V.

We’ve gone on and on about how to think about time series, but we are having trouble grasping some very simple ideas. The discussion here, and on other blogs, demonstrates there is a lot of confusion and plenty of misunderstanding. Also a complete lack of humor. Who would have guessed that something as banal as statistics could get so many people so excited?

The true test of an honest mind is how seriously it considers arguments that produce uncomfortable conclusions. This is not to say that uncomfortable conclusions are always right; clearly they are not. But in the case of how to think about time series, I am right and my enemies are wrong.

I rarely ask this, but you’d be doing us all a favor if you passed this series on to those in need of it. I’ll answer questions after the series is completed. I’ll LaTeX this up when it’s finished, sans asides, for easy and portable reading. Remember: be nice.

——————————————————————–

Below is pictured a time series. Imagine it is something to do with climate, say, monthly temperature anomalies. Let’s first suppose that each of the points in the picture is measured without error. That is, we are 100% sure that each point is what it is. The first value is X1 = 0.43. Given our observations, what is the probability that X1 = 0.43? It is 1, or 100%. And so on for each data point. If you find yourself disagreeing with me at this point, well, there is nothing I can do for you: we must remain forever at odds.

Now, something caused that data to take the values it did. Call this cause T. (Something causes every observation to take the values it does.) You must agree with this, too, or all is lost. T will be more or less complicated depending on what X is. If X is, say, global average temperature (anomaly), then T will contain everything that can change the temperature, even down to butterflies flapping their wings. T is the earth and sun, etc.

In real life, we rarely (if ever) know exactly, precisely, down to each photon, what T is. But suppose we did. Then we can answer questions like this: what is

(1) Pr(X1 = 0.43 | T)?

It is 1, or 100%. T says, after all, exactly what causes each X, so if we know T, we know with certainty, before taking any observations, what each X will be.[1] Equation (1) is different from

(2) Pr(X1 = 0.43 | Observations),

which also equals 1, or 100%. In other words, we know (1) before we take observations, but we know (2) after. This is an important distinction. Okay so far?

Again, we hardly ever know T precisely in real life; we surely do not know it if X is any kind of atmospheric or oceanic temperature. We might guess, or use evidence compiled from various sources, to say that, although we cannot know T exactly, we can approximate it, i.e. we can model it. To be clear: no scientist claims to know T precisely, but all believe (I do, too) that we can approximate T by a model.

One person will say that the best model is M1, another will claim that it is M2, and so on. It will usually be the case that

(3) Pr(Xi = x | Mj) n.e. Pr(Xi = x | Mk)

where “n.e.” means “not equal”, x is some value, i is for the i-th value of X, and j and k are indexes over our collection of posited models. This should be no surprise: if instead in (3) there was equality for all i, j, and k, then there would be no difference in the models.
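To make (3) concrete, here is a minimal sketch in Python. The two models are invented for illustration and do not come from the data in the picture: each one says X is roughly normal, but with a different center and spread. Because a continuous model gives probability zero to any exact value, the sketch reads "X = 0.43" as "X rounds to 0.43 at two decimals", which is also an assumption.

```python
from statistics import NormalDist

# Hypothetical models (assumptions, not from the post): both treat X as
# normally distributed, but they disagree about the center and the spread.
M1 = NormalDist(mu=0.4, sigma=0.2)
M2 = NormalDist(mu=0.0, sigma=0.5)

def pr_value(x, model, half_width=0.005):
    """Pr(X rounds to x | model), for values recorded to two decimals."""
    return model.cdf(x + half_width) - model.cdf(x - half_width)

p1 = pr_value(0.43, M1)
p2 = pr_value(0.43, M2)

print(f"Pr(X = 0.43 | M1) = {p1:.4f}")
print(f"Pr(X = 0.43 | M2) = {p2:.4f}")
print("Equal?", p1 == p2)
```

The same question put to two different models gets two different answers, which is all that (3) says.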

A sticking point for you might be using the language of probability to speak of physical models. It shouldn’t be. For one, probability is the language of uncertainty, and don’t forget that we don’t know T, we only guess that M is a good approximation of T, so we have to speak not in terms of certainty, but uncertainty.

Let’s take a fully deterministic model as an example:

(4) M = “Y_{i+1} = Y_i + 2.”

From this we can ask about this probability (or pose any other question about the Ys),

(5) Pr( Y17 > Y12 | M )

which is 1, or 100%. There is no problem, therefore, using probability even though M itself has no probabilistic components. Again, if you fail to agree with this, we must part ways.
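A short Python sketch of (5), for those who like to check. M does not say where the sequence starts, so the starting values tried below are assumptions. Whatever the start, the seventeenth value exceeds the twelfth by 10, so the probability deduced from M is 1.

```python
def y_sequence(y1, n):
    """First n values of Y under M (each value is the previous one plus 2),
    starting from an assumed first value y1."""
    ys = [y1]
    for _ in range(n - 1):
        ys.append(ys[-1] + 2)
    return ys

def pr_y17_gt_y12_given_m(y1):
    """Pr(Y17 > Y12 | M): 1.0 if the claim follows from M, else 0.0."""
    ys = y_sequence(y1, 17)
    # ys[16] is the 17th value and ys[11] is the 12th (lists start at 0).
    return 1.0 if ys[16] > ys[11] else 0.0

# Hypothetical starting values; M itself is silent about the first Y.
starts = [-100, 0, 0.43, 1000]
probs = [pr_y17_gt_y12_given_m(y1) for y1 in starts]
print(probs)
```

Nothing random appears anywhere in the code, yet the question "what is the probability, given M?" still has a perfectly good answer.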

Let’s get back to X, which we are imagining has something to do with temperature. What we cannot ask is this: what is

(6) Pr(X1 = 0.43)?

There is no answer because we are not considering how X1 came about. We’re missing the stuff that comes after the vertical bar “|”. If we say X was caused as T said it was, then we have eq. (1). If we mean (6) to implicitly incorporate the observations, i.e. given we have already seen X1, then we have eq. (2). We must first supply a “premise” of how X1 came about: eq. (6) is therefore incomplete. We can ask, for instance, this:

(7) Pr(X1 = 0.43 | M1),

or similar questions for every different model we are considering.

The only other point you must understand, before we move on, is that usually

(8) Pr(Xi = x | Observations) n.e. Pr(Xi = x | M),

for most (or even all) i and for any M which is not T. That is, once we have seen what Xi is, the probability of Xi taking the value it took given we see Xi is 1 or 100%, but the probability the model predicted this value is in general something less than 100%. With me? I hope so, else we will have troubles with what follows.
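The difference between (6), (2) and (7) can be sketched in Python. Everything named here is an assumption made for illustration: the function pr_equals, the normal model M, and the reading of "X1 = 0.43" as "X1 rounds to 0.43 at two decimals". The point is only that the answer depends on what is supplied after the bar, and that with nothing supplied there is no answer at all.

```python
from statistics import NormalDist

def pr_equals(x, premise=None, half_width=0.005):
    """Pr(X = x | premise). The premise is either an observed value
    (measured without error) or a model; without one there is no answer."""
    if premise is None:
        # Eq. (6): nothing after the bar, so the question is incomplete.
        raise ValueError("Pr(X = x) needs a premise after the '|'")
    if isinstance(premise, NormalDist):
        # Eq. (7): probability the model gives to x, at two-decimal precision.
        return premise.cdf(x + half_width) - premise.cdf(x - half_width)
    # Eq. (2): we have seen the value, so the probability is 1 or 0.
    return 1.0 if premise == x else 0.0

observed_x1 = 0.43                    # the first value in the series
M = NormalDist(mu=0.4, sigma=0.2)     # hypothetical model, not from the post

p_obs = pr_equals(0.43, observed_x1)
p_model = pr_equals(0.43, M)
print("Pr(X1 = 0.43 | Observations) =", p_obs)
print(f"Pr(X1 = 0.43 | M) = {p_model:.4f}")
```

Given the observation the probability is 1; given the model it is something well short of 1, as in (8).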

—————————————————————————————-

[1] If your objection is that T might contain “randomness” (quantum or “normal”), wait until Part IV.


Update: Link to the data (CSV file), for those who like to touch.

Let’s Try This Time Series Thing Again: Part I — 11 Comments

1. Doc, you are indeed a patient man (glutton for punishment?). In all the various attempts at trying to get your point across (I think I understood it the first time) you must have realized that trying to explain this in terms of anomalies is going to get people to shut down and approach whatever you say with the attitude of debunking you.

I wonder whether saying that the graph shows the stability of a new drug formulation over time would have gone over better.

The years of fighting the AGW wars have hardened positions so much that even the 30,000 pound bunker-busting bombs cannot get to them.

A for effort, though.

2. I just spent some time over at Phil’s. All I can say is that the definition of skepticism must have undergone radical change over the years. Most there seem to have no inclination to come here to see for themselves. That’s not how a proper skeptic behaves, dontchya know. So Matt is more like John the Baptist, a voice crying out from the wilderness. We all know what happened to him.

3. Sometimes “T” is the problem. Sometimes the value (or worth) of T is overstated or interpreted as more or less meaningful than it is. Sometimes there are agenda-driven groups who see their job in life as rejecting or promoting specific T’s or the meaning of T if they are unable to prove it is larger or smaller than they believe it should be. In some cases even recognizing there is a “T” value will get you in trouble (Lawrence Summers). One “T” that is sure to get you on someone’s shitlist is to say that some rapes occur because the woman is wearing provocative clothing. So in the end almost any T of political or social importance is controlled, elevated, dismissed, ostracized, lied about, treated as the third rail, etc., and becomes meaningless except to destroy a legitimate argument or to force action on a useless program. No one wants their T to be gored and T’s abound in the graveyard of failed efforts.

4. T is never the problem. T is what T is. The T will set you free, after all.
No, the problem is the difficulty we human beings have relating to all things T.

Nice post Briggs, looking forward to the rest.

5. Well, the idea of such a series as representing something that is “measured without error” and the claim that “Something causes every observation to take the values it does” both find me challenged. So the chances that there is nothing you can do for me do seem to be rather unpromisingly high.

6. GoneWithTheWind,

You seem to be confusing ‘T’ with ‘M’. How does one overstate, say, a temperature reading? The data collection may have a bias, but that only affects ‘M’.

7. So let me understand you.

Are you saying that Pr(X1 = 0.43 | M) reflects the probability that X1 is 0.43 given a model M about my measuring equipment?

In this case, for example, M might be that “the thermometer is in thermodynamic equilibrium with its surroundings”, OR “the weather station is a suitable random representative of normally distributed temperatures in the vicinity.”

Is that the thing you have in mind?

Or are you saying that Pr(X1 = 0.43 | M) reflects the probability that X1 is 0.43 given a model M about how X1 was caused?

In this case, M might be that “temperatures are caused by solar radiation + backscattered radiation, together with random fluctuations of wind currents.”

Thanks,