Do not smooth times series, you hockey puck!

21 October 2011: Welcome Register fans! Comments on this site are always closed after 8 days to control spam. To see more about BEST, read this post.

The advice which forms the title of this post would be how Don Rickles, if he were a statistician, would explain how not to conduct time series analysis. Judging by the methods I regularly see applied to data of this sort, Don’s rebuke is sorely needed.

The advice is particularly relevant now because there is a new hockey stick controversy brewing. Mann and others have published a new study melding together lots of data and they claim to have again shown that the here and now is hotter than the then and there. Go to and read all about it. I can’t do a better job than Steve, so I won’t try. What I can do is to show you what not to do. I’m going to shout it, too, because I want to be sure you hear.

Mann includes at this site a large number of temperature proxy data series. Here is one of them called wy026.ppd (I just grabbed one out of the bunch). Here is the picture of this data:
[Figure: wy026.ppd proxy series]

The various black lines are the actual data! The red line is a 10-year running mean smoother! I will call the black data the real data, and I will call the smoothed data the fictional data. Mann used a “low pass filter” different from the running mean to produce his fictional data, but a smoother is a smoother and what I’m about to say changes not one whit depending on what smoother you use.

Now I’m going to tell you the great truth of time series analysis. Ready? Unless the data is measured with error, you never, ever, for no reason, under no threat, SMOOTH the series! And if for some bizarre reason you do smooth it, you absolutely on pain of death do NOT use the smoothed series as input for other analyses! If the data is measured with error, you might attempt to model it (which means smooth it) in an attempt to estimate the measurement error, but even in these rare cases you have to have an outside (the learned word is “exogenous”) estimate of that error, that is, one not based on your current data.

If, in a moment of insanity, you do smooth time series data and you do use it as input to other analyses, you dramatically increase the probability of fooling yourself! This is because smoothing induces spurious signals—signals that look real to other analytical methods. No matter what, you will be too certain of your final results! Mann et al. first dramatically smoothed their series, then analyzed them separately. Regardless of whether their thesis is true—whether there really is a dramatic increase in temperature lately—it is guaranteed that they are now too certain of their conclusion.
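
To see the effect, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the analysis Mann et al. ran: the two series are pure, independent Gaussian noise, the smoother is a 10-point running mean like the red line above, and the function names (running_mean, correlation, simulate) are invented for this example. Since the two series have nothing to do with each other, any correlation between them is spurious.

```python
import random
import statistics


def running_mean(series, window):
    """Running mean over `window` points; returns len(series) - window + 1 values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_trials=500, length=150, window=10, seed=1):
    """Average |r| between independent noise series, raw and after smoothing."""
    rng = random.Random(seed)
    raw, smooth = [], []
    for _ in range(n_trials):
        # Two unrelated white-noise series (an assumption of this sketch).
        x = [rng.gauss(0, 1) for _ in range(length)]
        y = [rng.gauss(0, 1) for _ in range(length)]
        raw.append(abs(correlation(x, y)))
        smooth.append(abs(correlation(running_mean(x, window),
                                      running_mean(y, window))))
    return statistics.fmean(raw), statistics.fmean(smooth)


mean_abs_raw, mean_abs_smooth = simulate()
print(f"mean |r|, raw series:      {mean_abs_raw:.3f}")
print(f"mean |r|, smoothed series: {mean_abs_smooth:.3f}")
```

On runs like this the average absolute correlation between the smoothed pairs comes out a good deal larger than between the raw pairs, even though nothing real connects them. A method that treats the smoothed points as independent observations will read that as signal.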

There. Sorry for shouting, but I just had to get this off my chest.

Now for some specifics, in no particular order.

  • A probability model should be used for only one thing: to quantify the uncertainty of data not yet seen. I go on and on and on about this because this simple fact, for reasons God only knows, is difficult to remember.
  • The corollary to this truth is that the data in a time series analysis is the data. This tautology is there to make you think. The data is the data! The data is not some model of it. The real, actual data is the real, actual data. There is no secret, hidden “underlying process” that you can tease out with some statistical method, and which will show you the “genuine data”. We already know the data and there it is. We do not smooth it to tell us what it “really is” because we already know what it “really is.”
  • Thus, there are only two reasons (excepting measurement error) to ever model time series data:
    1. To associate the time series with external factors. This is the standard paradigm for 99% of all statistical analysis. Take several variables and try to quantify their correlation, etc., but only with a mind to do the next step.
    2. To predict future data. We do not need to predict the data we already have. Let me repeat that for ease of memorization: Notice that we do not need to predict the data we already have. We can only predict what we do not know, which is future data. Thus, we do not need to predict the tree ring proxy data because we already know it.
  • The tree ring data is not temperature (say that out loud). This is why it is called a proxy. Is it a perfect proxy? Was that last question a rhetorical one? Was that one, too? Because it is a proxy, the uncertainty of its ability to predict temperature must be taken into account in the final results. Did Mann do this? And just what is a rhetorical question?
  • There are hundreds of time series analysis methods, most with the purpose of trying to understand the uncertainty of the process so that future data can be predicted, and the uncertainty of those predictions can be quantified (this is a huge area of study in, for example, financial markets, for good reason). This is a legitimate use of smoothing and modeling.
  • We certainly should model the relationship of the proxy and temperature, taking into account the changing nature of the proxy through time, the differing physical processes that will cause the proxy to change regardless of temperature or how temperature exacerbates or quashes them, and on and on. But we should not stop, as everybody has stopped, with saying something about the parameters of the probability models used to quantify these relationships. Doing so makes us, once again, far too certain of the final results. We do not care how the proxy predicts the mean temperature; we do care how the proxy predicts temperature.
  • We do not need a statistical test to say whether a particular time series has increased since some time point. Why? If you do not know, go back and read these points from the beginning. It’s because all we have to do is look at the data: if it has increased, we are allowed to say “It increased.” If it did not increase or it decreased, then we are not allowed to say “It increased.” It really is as simple as that.
  • You will now say to me “OK Mr Smarty Pants. What if we had several different time series from different locations? How can we tell if there is a general increase across all of them? We certainly need statistics and p-values and Monte Carlo routines to tell us that they increased or that the ‘null hypothesis’ of no increase is true.” First, nobody has called me “Mr Smarty Pants” for a long time, so you’d better watch your language. Second, weren’t you paying attention? If you want to say that 52 out of 413 time series increased since some time point, then just go and look at the time series and count! If 52 out of 413 time series increased then you can say “52 out of 413 time series increased.” If more or less than 52 out of 413 time series increased, then you cannot say that “52 out of 413 time series increased.” Well, you can say it, but you would be lying. There is absolutely no need whatsoever to chatter about null hypotheses etc.
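
The counting in that last point needs no model at all. A sketch in Python, assuming that “increased since some time point” means the latest value is higher than the value at that time point (that definition is an assumption made here for illustration; pick your own and state it), and using three tiny made-up series rather than real proxy data:

```python
def increased_since(series, start_index):
    """True if the latest value is higher than the value at start_index."""
    return series[-1] > series[start_index]


def count_increased(all_series, start_index):
    """Count how many series ended higher than they were at start_index."""
    return sum(1 for s in all_series if increased_since(s, start_index))


# Three made-up series, purely for illustration (not proxy data).
toy_series = [
    [1.0, 2.0, 3.0, 4.0],  # ends higher than it started
    [4.0, 3.0, 2.0, 1.0],  # ends lower than it started
    [2.0, 5.0, 1.0, 2.0],  # ends exactly where it started
]

n_up = count_increased(toy_series, start_index=0)
print(f"{n_up} out of {len(toy_series)} time series increased.")
```

The statement printed is a plain description of known data. No p-value enters into it, and none is needed.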

If the points—it really is just one point—I am making seem tedious to you, then I will have succeeded. The only fair way to talk about past, known data in statistics is just by looking at it. It is true that looking at massive data sets is difficult and still somewhat of an art. But looking is looking and it’s utterly evenhanded. If you want to say how your data was related with other data, then again, all you have to do is look.

The only reason to create a statistical model is to predict data you have not seen. In the case of the proxy/temperature data, we have the proxies but we do not have temperature, so we can certainly use a probability model to quantify our uncertainty in the unseen temperatures. But we can only create these models when we have simultaneous measures of the proxies and temperature. After these models are created, we then go back to where we do not have temperature and we can predict it (remembering to predict not its mean but the actual values; you also have to take into account how the temperature/proxy relationship might have been different in the past, and how the other conditions extant would have modified this relationship, and on and on).
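
The difference between predicting the mean and predicting the actual values can be sketched in Python. Everything here is an assumption for illustration: the calibration data are synthetic, the model is a plain straight-line least-squares fit, and the names (fit_line, standard_errors) are invented for the example. It makes no attempt at the harder parts mentioned above, such as a proxy/temperature relationship that changes through time.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(resid_ss / (n - 2))  # residual standard deviation
    return {"intercept": intercept, "slope": slope, "s": s,
            "xbar": xbar, "sxx": sxx, "n": n}


def standard_errors(fit, x0):
    """Standard error of the fitted mean at x0, and of a new observation at x0."""
    leverage = 1 / fit["n"] + (x0 - fit["xbar"]) ** 2 / fit["sxx"]
    se_mean = fit["s"] * math.sqrt(leverage)      # uncertainty in the mean
    se_new = fit["s"] * math.sqrt(1 + leverage)   # uncertainty in an actual value
    return se_mean, se_new


# Synthetic calibration period: proxy and temperature measured together.
rng = random.Random(0)
proxy = [rng.uniform(0, 2) for _ in range(100)]
temperature = [10 + 1.5 * p + rng.gauss(0, 1) for p in proxy]

fit = fit_line(proxy, temperature)
se_mean, se_new = standard_errors(fit, x0=1.0)
print(f"standard error of the mean temperature at proxy = 1.0: {se_mean:.2f}")
print(f"standard error of an actual temperature at proxy = 1.0: {se_new:.2f}")
```

The standard error for a new value is always larger than the one for the mean, because it adds the scatter of individual temperatures around the line. With plenty of calibration data the first shrinks toward zero while the second does not. Reporting only the first is how one ends up too certain.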

What you cannot, or should not, do is to first model/smooth the proxy data to produce fictional data and then try to model the fictional data and temperature. This trick will always—simply always—make you too certain of yourself and will lead you astray. Notice how the red fictional data looks a hell of a lot more structured than the real data and you’ll get the idea.

Next step is to start playing with the proxy data itself and see what there is to see. As soon as I am granted my wish to have each day filled with 48 hours, I’ll be able to do it.

Thanks to Gabe Thornhill of Thornhill Securities for reminding me to write about this.


Do not smooth times series, you hockey puck! — 81 Comments

  1. Now that certainly clarified things in my mind, and I should have known better in the first place. Thanks for the kick in the pants…heh


  2. Smoothing will also make the start/end points of the time series flawed, won’t it? Since those data will weigh more? Especially the end point, which is of most interest?

  3. Mikael,

    Depends on the smoothing method used. Some do, some don’t. But all smoothing methods take the real data and turn it into fictional data.

  4. “Dr. Mann has the puck, he twists, he turns, he’s looking for a winger to pass off to, he’s trying the boards, now he’s winding up for a shot . . . . . . ohhhhhhhhhhhhhhh right in his own net.

    That’s another own net goal by Dr. Mann . . . how many more can his teammates on the Screaming Mannomatics take before they kick his [*butt] off the team?”

    *some words changed by editor

  5. Hi -

    I had a colleague once who was famous for being able to pull out some amazingly high statistical correlations via OLS and also using co-integrated techniques.

    She took the original monthly data, just as we all did, and ran it first through X11. But unlike the rest of us, she didn’t use the seasonally adjusted numbers, but rather … yep, you guessed it – she used the trend component of the seasonally adjusted numbers.

    She didn’t understand what was wrong with doing that, and she insisted that was the right and proper way to do any sort of statistical analysis. After all, the statistics proved her right: her r2 and DW numbers put the rest of us to shame.

    I never wanted to have anything to do with her after that…

    Outstanding post. :-)

  6. Great post!
    [BTW there is one typo, repeated twice: "We do no need to predict the data we already have." should of course be 'not'.]

    So many people who do climate stats abuse regression, probably never even having heard the words “stochastic process”.

    Mostly their whole point is to predict future temperatures (whether or not they are careful about using the word ‘predict’). But it seems we can’t even say whether temperatures in the recent past have gone up or down. “It’s climate change” vs “It’s just weather”. Any number given will be challenged with a “cherry picking” claim.

    I think climate data and the stock market data have a lot more in common than people think. With the stock market, we see general slow climbs and occasional (but not periodic) crashes, and lots of noise in between. With the climate, we see general cooling trends, with occasional (but not periodic) sudden warming, with lots of noise in between. Interestingly, this pattern repeats at hugely different scales. With both there are random external events which have effects.

    The difference between the climate data and the stock market data is that people know better than to extrapolate the recent stock market data hundreds of years into the future.

  7. Doubter,

    That’s what I get for writing the post after reading Burns. I’ve rewritten it in Queen’s English.

  8. If it’s such a basic point, why is it not covered in peer review? Can you write it up with specific references and impact, and as a comment? As it is now, it is not clear as a scholarly rebuttal.

  9. What rot.

    TCO’s implication is correct, there is no blanket prohibition in smoothing data. Your own work on hurricane statistics used data smoothed in multiple ways – sea surface temperature smoothed in space and time and detrended (to give the AMO index), the hurricane numbers themselves smoothed into annual averages, seasonally smoothed ENSO indices etc etc.

    What you meant to say (if you’ll forgive the imposition) is that you need to take account of the fact that data was smoothed if you do any analysis with it. Degrees of freedom decrease, auto-correlation structure changes etc.

    Frankly, your claim that smoothing is at all times a bad thing has about as much credibility as Karl Rove’s statements on the role small town mayors should play in national politics.

    An additional point worth making is that predictions are not just of things in the future. They can just as well be things that have already happened, but that you don’t know about. All historical sciences work that way (cosmology, archeology, paleo-climate) and many detective shows (CSI). The goal of paleo-climate reconstructions actually has very little to do with predicting the future except very indirectly (despite many people’s opinions to the contrary).

  10. I’m actually concerned about using smoothed data (for instance in a significance test). But I just want it better spelled out exactly what the concern is, with math and references and such. Not an “on high” put down.

  11. Gavin,

    You completely misrepresent the content of Briggs’s post. Here is what Briggs wrote:

    “Unless the data is measured with error, you never, ever, for no reason, under no threat, SMOOTH the series! And if for some bizarre reason you do smooth it, you absolutely on pain of death do NOT use the smoothed series as input for other analyses! If the data is measured with error, you might attempt to model it (which means smooth it) in an attempt to estimate the measurement error, but even in these rare cases you have to have an outside (the learned word is “exogenous”) estimate of that error, that is, one not based on your current data.”

    Now tell us what is wrong with what Briggs actually wrote.

  12. Gavin:
    What does Karl Rove have to do with this? Are you saying that there is nothing wrong with how Mann has handled the proxies?

  13. Hi Matt,

    This issue would seem to merit attention in the literature. You are surely aware that the statistical expert Dr. Michael Mann has just had his new improved smoothing technique published in GRL ( here ). You have addressed this in an amusing way but I hope your AGU counterpart can help thrash out the issues in detail.

  14. I just posted the following remark over at, on the 9/5 thread Proxy Screening by Correlation. I do recognize, as per Gavin above, that it is commendable that Mann et al did not use 144 DOF to obtain critical values for their correlations on smoothed series. However, I argue that they still did not sufficiently adjust. In particular, there is no way you can get 13 DOF out of 9 or 10 real observations! I don’t usually follow this blog, so commenters may wish to post on CA.

    CA post follows:

    RE Bob B, #9, William Briggs makes a very important point on the webpage Bob links in his comment: If you smooth a series and then do statistics with it, you’re surely kidding yourself, since the true independent sample size is much smaller than the nominal sample size.

    Mann et al report (SI p. 5) that “To avoid aliasing bias, records with only decadal resolution were first interpolated to annual resolution, and then low-pass filtered to retain frequencies f [less than] 0.05 cycle/yr (the Nyquist frequency for decadal sampling.) …. We assumed n = 144 nominal degrees of freedom [for correlations] over the 1850-1995 (146 year) interval for correlations between annually resolved records … and n = 13 degrees of freedom for decadal resolution records. The corresponding one-sided p=0.10 significance thresholds are |r| = 0.11 and |r| = 0.34 respectively.”

    The Punta Laguna examples considered by Steve would apparently be considered decadal, since the spacing is never less than about 8 years. They would then be interpolated to annual frequency, itself a smoothing operation, as shown in the graphs in the recent CA threads, and then further massaged with the unspecified (Butterworth?) low-pass filter, removing cycles shorter than approximately 1/.05 = 20 years. These were then correlated with local instrumental temperature, at an annual frequency, apparently, yielding r = .397 for #382 and .627 for #383 over the full 146 year calibration period, according to the spread sheet on Mann’s website. Since there are only about 15 decades in 146 years, they “conservatively” used a critical value of .34 based on only 13 DOF rather than 144 DOF as would be appropriate for serially independent errors. Since these two exceeded this value, they were apparently included, while the other two Punta Laguna series, which fell short, were not used for the reconstruction.

    Although it is commendable that Mann et al did not use 144 DOF, 13 DOF is still way too many for these two series. #382 has only 9 true observations, while 383 has only 10. With 9 observations, there are only 7 DOF, and the .10 1-tailed critical value of r is .472, not .34 (actually .351 by my calculation, but close enough) as for 13 DOF per Mann et al. Since .397 falls short of this, there is no way #382 is significant, even at this feeble “significance” level, corresponding to a t-stat of about 1.28.

    The reported r for #383 does exceed .472, but in order for the test to be valid, the 10 actual observations should have been correlated directly with the corresponding instrumental temperatures, per statistician Briggs. Since the 10 real observations were massaged by first interpolating and then smoothing with a filter that damps cycles under 20 years in duration, it is hard to say what the true effective sample size is — perhaps as small as 146/20 = 7, for 5 DOF!

    So it looks as if Mann et al may have admitted far too many of the “decadal” series, even by their very weak criterion.

    PS: Note that although the Pyrgophorus coronatus of #382 is a gastropod or snail, Cytheridella ilosvayi of #383 is a tiny crustacean or “seed shrimp”, and not a mollusk at all.

  15. Bernie,

    Of course Gavin doesn’t think that there is nothing wrong with how Mann has handled the proxies. There is nothing that Mann could write or say that he would consider wrong, well as long as it goes along with his preconceived notion that the world is warming, that it is catastrophic and that humans are responsible through nothing else than CO2.

    Sorry for being snippy but I’m tired of self proclaimed gods.

  16. I am VERY sympathetic to a concern about smoothing data and then confusing it with real data (since it’s processed) and with doing significance tests on it. But WM Briggs just has an on high post. If it’s such a no brainer then why wasn’t it picked up? And if they did make a mistake anyhow and it’s such an obvious thing, why doesn’t WM Briggs get off a scholarly comment?

    Hu McC shows that Mann did at least concern himself somewhat with the issue and try to deal with it (although perhaps not optimally). I am so sick of our side just cackling and egging each other on and not doing MATH to quantify things.

    Hu, don’t know about your math…but appreciate your effort and your approach.

  17. Hey Gav—can I call you Gav? —nice to see you back,

    Actually, not rot.

    You are right to say that I have, in print, made the mistake of smoothing time series and using them as input to other analyses before. I have also, in print too, used p-values. I mean, both in peer-reviewed journals. I did this before I knew better (actually, in medical journals it is still nearly impossible to not use p-values; but that is a story for another time).

    But not in the hurricane paper. In those papers I did the first type of analysis spoken of above: trying to find out what physical measures helped predict hurricane numbers. You might recall that when I gave the paper up at NASA I spoke of the skill of the model I used: this is in relation to the second type of analysis spoken of above, i.e. does the model have skill in predicting hurricanes (it does). This is mentioned in the article, but the editors, who thought that people wouldn’t understand it, made me take it out. The figures are still in the original arxiv paper, which is here. They are the same as the ones I used in my talk. Whatever else I did, I did not smooth them into annual averages; instead I predicted the actual values and quantified the uncertainty of those predictions (this is Figure 3b in the arxiv paper). But, you are right that the final work is missing this (I do talk about it, but no pictures; regrettable).

    Karl Rove? Don’t make me quote Al Gore back at you. Which reminds me, don’t forget to cast your guess of who will win the next presidential race.

    You’re right to say that prediction is not always about the future. You’ll note that, in my original post, I agree with this. Re-read the last few paragraphs: trying to predict past temperatures with a model based on proxy is a reasonable thing to try. Trying to predict past temperatures with a smoothed proxy is not. There is no reason to do it, none.

    You often read—not just in papers in this controversial topic—that people smooth the series “to reduce” or “to remove noise.” Again, barring measurement error, what is noise? It’s usually the part of the real data that is hard to model, or that doesn’t fit preconceptions, or whatever. Smoothing does toss this data out, or reduces its influence, with the end result that you will be too certain of yourself.

  18. TCO,

    You asked the best question of the bunch, “If it’s such a no brainer [not to smooth] then why wasn’t it picked up?”

    Why, indeed? What I said above, except for one small part that is not related to Mann’s work in any way, is not unknown in the literature. There are also dozens and dozens—maybe even hundreds and hundreds—of articles on what p-values are not and why they should not be used or at least radically deemphasized.

    How does bad statistics escape scholarly peer review?

    As scientists are, or should be, forever saying: peer review is no guarantee of correctness. It increases the probability that a result is true if it goes through peer review, but it nowhere comes close to making it certain. If you’re a regular reader of this blog, you will have also noticed that some papers—those on topics that people want to be true—are absolutely atrocious, even risible. You can’t always believe what you read in a scholarly journal.

    What usually happens is this: a paper is received by an editor who then sends it to two to three people he thinks will be likely to understand it. The editor doesn’t and can’t know everybody, so this pool of people is limited. In meteorology/climatology, the papers containing heavy statistics are usually sent to other people who have used similar heavy statistics. They are almost never actually sent to statisticians—which isn’t always bad. Not many statisticians will understand the context. Since the reviewers’ work is similar to the author’s, and as long as they all roughly—and I emphasize roughly—agree, then the paper goes through.

    This process usually, but not always, results in quite a lot of business as usual and conservative results. New ideas leak in slowly. Usually—again, not always—a new idea is criticized by a reviewer with the comments “I don’t recognize this, neither will anybody else; try it the old way.” That isn’t necessarily bad, either, because, of course, most new ideas stink.

    Anyway, what you end up seeing is a lot of papers that don’t look terribly different from one another. A suspect method will be used because “everybody else uses it”, and even if people know it’s suspect, they feel that they have such long experience with it, that they can mentally adjust for its shortcomings. Sometimes they can, but usually they end up too sure of themselves.

    Even everything I’ve written in this comment is well known, and has been studied in scholarly fashion.

  19. RE TCO and peer review. (BTW not seen you at CS lately)

    The original MBH went through peer review or maybe peer review lite.

    Imagine asking anyone from the UK Royal Society to conduct peer review on a paper that suggests there is no AGW. Have you seen the junk they have produced on geo-engineering with cheerleading from the Economist?

    For about 40 years there were literally thousands of papers, books, studies and trials that showed that stomach ulcers were caused by stress or diet. All were peer reviewed and all were wrong.

    Of course peer review is necessary and indeed outside of climate science and a lot of medical research, it still works. If Mann had submitted to any of the major maths journals he would have been told where to go.

    On a related issue in one of the CA threads, someone has observed that the basic data appear to show no MWP. This takes us back to MBH. The changes in climate during the MWP and the LIA are well documented in the historical record. Yet no one in the peer review process thought that an outcome that contradicted all of the historical record was wrong. There were ice fairs on the Thames in the early 19th century. There have been none since. The thermometer record since then shows warming in the UK but most of it happening before the rise in CO2.


    Paul Maynard – my real name

  20. Mann has been thoroughly discredited by the incompetence shown in Hockey Stick I. Hockey Stick II is very likely to be a low budget sequel.

  21. Briggs:

    Well write it up as a comment then. Leave out the “you hockey puck” and the exasperation. But point out the error and its effect (if there was one). Something like:

    “It is well known not to smooth data as an input from references A, B, C and D. They describe how doing so causes effect G. Mann attempts to deal with this by adjusting the degrees of freedom. Still, this is non-optimal because of V (fill in).

    An estimation of the effect of smoothing is presented here: fill in. And a non-smoothed result would look like ‘fill in’.” [this part may be a little tricky, given the complexity of his method.]


    BTW, I’m particularly concerned about using smoothed data as the input for calibration. Since we have such a short record of instrument and data overlap. So wiggle matching gives me more confidence in the calibration of the CFR method (which is questionable, not established). Zorita has been pretty clear about this intuitively.

    Of course I’m not sure the mathematical extent. That’s for a stats guy. But it’s intuitively what I worry about.

    Heck, maybe there is even some sort of esoteric argument how the systems respond to the several year effects more so than the individual ones, so smoothing is justified. (would feel better with direct modeling, though, perhaps with a lag term). But let’s tee things up and have that argument clearly, if that is the case.


    WM: Thanks for the kind words. But I still come down to the need for a more thoughtful (and appropriately placed) debunking. “You hockey puck” and “you just shouldn’t do this” won’t cut it. Cause plainly he did. Got it into the literature, etc. Maybe an imperious putdown would work if you were the editor or the reviewer or what have you. But this thing is out there, now.

  22. Jeff’s comment was silly and is the sort of thing that makes the rest of us look silly. Mann has done a lot of work, so one bad paper would not “thoroughly” discredit his opus.

    Also, even that paper is not thoroughly discredited. There are some issues with parts of that work. Some generally recognized. (except for by him…I think he has been grudging to admit even the clear things…like failure to describe his method properly). Some in debate. Some not at issue. BTW, you should be careful of some of the Steve McI cackles, since Steve is very prone to confounding faults (mushing them all together, trying to have them support each other…which is a rhetoric trick, but not good science logic) and to not quantifying impacts.

    So, Mann is not “thoroughly discredited”. And his new work should be judged as such. Heck, maybe it corrects some of the issues of the previous work. In that case, the “discrediting meme” is kinda silly. Since what we see is an evolution to better method.

  23. Excellent post, Mr Smarty-Pants.
    I remember my Dad saying to me once, that if there was a significant change in some data, it should be obvious from just looking at the graph – you shouldn’t have to do any fancy statistical analysis to extract it.
    I’m glad to see some serious comment from you on this.
    I look forward to more:
    What do you think of the way ‘climate scientists’ extend the data in order to get their fictional smoothed data right up to the present? Especially Mann’s brilliant ‘minimum roughness’ method that forces the smoothed data to go thru the last data point?
    What do you think of drawing a straight line fit through wiggly noisy data and calling it a trend?
    What do you think of the way ‘climate scientists’ have abandoned the r2 statistic used by all statisticians (because it didn’t give the results they wanted), and invented their own ‘RE’ statistic?
    (PS: I claim I also encouraged you to write about statistics and time series, by email!)

    TCO, this is the weakness of peer review. Papers are reviewed by people in the same self-reinforcing clique who all make the same errors. Mann’s paper will have been reviewed by someone like Gavin.

  24. PaulM:

    I’m on that…and two steps past you. It would not surprise me to see a flaw in statistics in a published science paper. My point is that Briggs needs to more clearly nail it and in the literature. Just “tsking” will not cut it.

  25. As a world renowned expert in statistics, I am well placed to proclaim that Matt Briggs has it right. The tail is wagging the dog! The dog is running after a stick that was not thrown and is still in his master’s hand. Chase it, Gavin!
    Bernie! Spot on!

    Straightforward plain talking, no picture language, that’s what I like. Just press the button and stand back.

    I smell something fishy, and it’s not the dead Kippur.

  26. “An additional point worth making are that predictions are not just of things in the future. They can just as well be things that have already happened, but that you don’t know about.”

    I need a new dictionary. According to my old dictionary “prediction” is to foretell a future event. The dictionary doesn’t say anything about foretelling a past event.

  27. You can predict past or current things in this sense. You take a (knowingly imperfect) survey (or set of proxies or what have you) and put them through a model (in this case the Mannian CFR system, which concerns me…but that’s what they use). And then you come out with an estimate. It’s even possible (without a time machine) that the prediction may be validated or invalidated in the future. For instance if more definite, rock clad proxies are found in the future.

    An example of a “current” prediction would be a limited survey as opposed to a census. Or estimates of a submarine position, classification, etc. based on data and model (sound at hydrophones combined with the signal processing, to include user processing). Both of them are predictions in everything except for a semantic sense. But then you wouldn’t advance a purely semantic nit would you?

  28. Ray,

    Nah, I’m with Gav here. Predict means make a guess of something you don’t know. Doesn’t matter when the thing happened.

  29. The trouble with your argument Mr. Briggs is that it makes perfect sense to the layman. I’m reminded that in the discipline of Tasseography, the occult skill lies in reading patterns where normal people see nothing.

  30. Chris,

    Are you sure you don’t have it backwards? If you smooth, the final confidence intervals should increase, get wider, to express greater uncertainty.

    Smoothing cannot make you more certain, which classically is expressed with narrower confidence intervals.

  31. I did not quote the paper wrong. You’ll need to check the context of that quote yourself to see if it’s relevant or incorrect, as I don’t get into the statistical end of this stuff yet (I’m still getting my edumucation!!! Give me some time). It does interest me when blogs and other such places challenge the peer-review, and very rarely do I find the claims end up holding closer scrutiny.

  32. TCO:
    irresistible! What girl could refuse such a proposition!

    Sounds like you’ve got the bare bones of an excellent blog! I can’t resist a bone.
    I defy you to keep a straight face:

    There once was an old man of ESSER
    Whose knowledge got lesser and lesser
    It at last grew so small
    He knew nothing at all
    And now he’s a college professor

    There once was a man called Buck
    Who had the most terrible luck
    He got on a punt
    And fell off the front
    and got bitten to death by a duck

  33. When I read, “Confidence intervals have been reduced to account for smoothing,” my mind replaced “smoothing” with “something.” It would make just as much sense.

  34. “Confidence intervals have been reduced to account for smoothing” probably means “I just make this up as I roll along”.

  35. Doubter: “The difference between the climate data and the stock market data is that people know better than to extrapolate the recent stock market data hundreds of years into the future.”

    Bad analogy. In climate data there *is* an underlying physical process (no matter how loudly Briggs wishes to shout about it, he’s wrong on that point). In the stock market there is no physical process – only some abstract process that models the sum of the economic behavior of many actors.

    Economics may aspire to the state of physics but it will never get there because there are no underlying physical laws.

    When Briggs says: ” There is no secret, hidden “underlying process” that you can tease out with some statistical method …”

    He’s wrong. We’re talking about physics. There *is* an underlying process.

    This whole post is rubbish.

    (And I have a background in both physics and market data, so I’ll claim a small amount of authority in making that distinction).

  36. JM,

    since you are so sure about the underlying physical process that can be teased out by climatologists and modelers, care to let us in on it??

    Or, are you just being rhetorical??

    You state:

    ” There *is* an underlying process.”

    Just one JM??

    You also state:

    “This whole post is rubbish.”

    Additionally you claim authority to make this statement due to your having a background in both physics and market data. JM, the Janitor has a background in those areas also. Care to exhibit some of this fabled authority from your BACKGROUND??

  37. Score! Most modeling is an exercise in throwing away data, so no wonder the modelers groan at this post.

    Re predicting the past, especially temperature with proxies: it’s fruitless! Unless you have a time machine with a thermometer on board. Predicting the past is an oxymoron anyway. I prefer the term “hindcasting” as in “I got my hind cast out of the Scientist’s Club for rewriting history.” Want to bet me about who won the Super Bowl last year?

    Re hockey pucks: what’s the difference between a hockey mom and a fired up statistician/logician? The lipstick, I hope, in Matt’s case anyway.



  38. We’re talking about temperature, right? The underlying physical process is thermodynamics, the first law (conservation of energy) and how a change in either energy input to a system or retention will change the equilibrium condition.

    The observational data can then be analysed to a.) detect any change in that process or its parameters and b.) remove noise.

    The alternative proposition – that there is no underlying process – is a statement that climate changes randomly “just because it feels like it” – ie. the temperature we observe is unrelated to any physical cause.

    That is very much like stock markets, which are known to be more or less random walks that can be modelled by a stochastic process. In that situation, Briggs’s point about smoothing would have some validity because the volatility of the data has meaning – a stock price represents a real event, not noise, and is not due to some factor external to the model – and it is important to retain the raw observations.

    Now in the absence of a physical hypothesis (model), temperature data could not be distinguished from stock prices except in its external character. But since we do have a physical hypothesis we are using the data to validate that hypothesis.

    To make that test we *have* to extract the underlying process (or at least a representation of it) and test it against our model.

    So the assumption of an underlying process – particularly when we can articulate it in terms of known physical laws – is perfectly valid.

  39. JH,

    Allow me to agree with you and say that my original language was not careful where it needed to be. I often deride others for sloppy writing, so I will accept my drubbing like a man.

    There absolutely is an underlying physical process (or processes) driving climate change. It of course is valid to say so, and ridiculous to say otherwise. It is even a good, and usually a necessary, course of action to marry statistical analysis with physical analysis. If I said or implied differently, then I was wrong and you’d be right to think I was off my nut.

    What I was trying to say is something like this. You will often see the phrase “The data was normally generated.” The assumption is that there is some sort of stochastic process that is driving or creating the data, or if not the entire part of the data, then the “noise” component. You say something similar when you say “The observational data can then be analysed to a.) detect any change in that process or it’s parameters and b.) remove noise.” (You also say that I hint that “climate changes randomly”, but if you have read anything I have said about the word “random” and its common misinterpretations, then you would know that I would never, ever even whisper this.)

    These two statements, while commonly used, are false. Let me handle “b” first. I’ll ask anybody to define “noise.” Just what part of the data is it? It’s usually that part that is “left over” after some probability model is applied to the data. The model will usually be purposely imperfect, something easy to handle and nicely linear. The “noise” will be said to be generated by some probability distribution.

    This poor use of language is what accounts for the misunderstanding of what statistics is. What people should say is “My uncertainty in the values of the data—and not just the parameters of the probability distribution describing that uncertainty—is quantified by this probability model M.” Or, “Given the model M is true, there is an x% chance that data I have not yet seen (in a certain specified situation) will fall between ylow and yhigh.”

    That latter statement, assuming no miscalculations or forgotten “divide by 2s” (my mortal enemy), is always true. This does not imply that the model M is true. In fact, it says absolutely nothing about the veracity of M.

    For any given set of data there is one true “model”, the one that produced the exact values you see. If you knew what this model was, then you could apply it to data not yet seen and predict it exactly. If you discovered that your prediction was in error, by even a little, then your model was wrong (by “wrong” I mean “not right” as in wrong). In climate—or meteorology—there certainly is such a model, and we know lots of its components. We can never represent it exactly because of its enormous complexity, however, so we approximate.

    Now comes the tricky part (see video below; sorry, I can’t help myself). What is the best approximation? What might be obvious, and what is true, is that there are an infinite number of possible approximations for any set of data. How can you tell which one to use? Any number of models fit the past data perfectly, including the models implied in statement “a”, so you cannot look to how well the model fit the data set in hand. To tout, say, R², which is a measure of closeness between the model and the past data, is in this vein.

    This is because it is also not true (it is false) that any model that “fits” the past data better than some other model will continue to do so for data not yet seen. It is not even necessarily more likely to. Many people are surprised to learn this. We are getting far afield here, so let me wrap up (for now) and say you can only trust a model after it has shown itself better than another in actual decisions you make using the model. Among other things, it means that a model that works for you might not work for me, but that is another subject, and I don’t mean to say that this necessarily applies to temperature reconstructions in all circumstances.
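    The point about fit versus prediction can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are a made-up straight line plus noise, and the polynomial degrees, the split into “past” and “not yet seen” points, and the helper name `mse` are all hypothetical choices, not anything taken from the temperature reconstructions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy data: a straight line plus noise (purely illustrative).
x = np.linspace(0.0, 1.0, 30)
y = 2.0 * x + rng.normal(scale=0.5, size=x.size)

# Treat the first 20 points as the "past" and the last 10 as "not yet seen".
x_past, y_past = x[:20], y[:20]
x_new, y_new = x[20:], y[20:]


def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))


# A simple model (degree 1) and a more flexible one (degree 6), both fit to the past only.
simple = np.polyfit(x_past, y_past, deg=1)
flexible = np.polyfit(x_past, y_past, deg=6)

in_simple, in_flexible = mse(simple, x_past, y_past), mse(flexible, x_past, y_past)
out_simple, out_flexible = mse(simple, x_new, y_new), mse(flexible, x_new, y_new)

print("past data  : simple", in_simple, "flexible", in_flexible)
print("unseen data: simple", out_simple, "flexible", out_flexible)
```

    The flexible model cannot fit the past data worse than the simple one, since the straight line is a special case of it, yet that closer fit says nothing about how it behaves on the held-out points.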

    The only question I haven’t answered is how you can tell whether a given “hypothesis” is true or false. Usually, just by looking. But what if you cannot ever learn by looking? Think of the results of a jury trial for murder, where the jurors can never actually see the past events yet still have to make a decision. Then you have to do something else. We’ll talk about that in another post.

    The Tricky Part

  40. I have no special skills in either statistics, physics or economics but I know a dodgy argument when I see one.

    JM makes a false diochotomy by arguing that if we question the supreme importance of one physical factor then we must reject them all.

    Quote”The alternative proposition – that there is no underlying process – is a statement that climate changes randomly “just because it feels like it” – ie. the temperature we observe is unrelated to any physical cause.”End Quote

    I do not think that anyone has ever said that the temperature changes we observe are unrelated to any physical cause. The alternative position to the belief that the primary driver of temperature changes is the level of CO2 in the atmosphere is not that there are no physical causes of temperature change, but that there are many physical causes.

    Does he really claim that any changes in temperature that do not fit in with his model must be noise and have no underlying physical cause?

  41. On thermodynamics.

    Thermodynamics says if there is net energy addition into a system the temperature change is either greater than or equal to zero. The ‘equal to’ part is for the case of phase change. In the absence of phase change the temperature of the system must increase as energy is added into the system.

    There are no exceptions.

    The Great Global Average Temperature Totem shows both increases and decreases in temperature as a function of time.

    We have been told unendingly that there is an energy imbalance away from equilibrium such that there is an energy addition into the Earth system.

    By thermodynamics, the temperature of the system can only increase.

    I can only conclude that we are not talking about the fundamental principles of thermodynamics.

    What then are the underlying physical principles that supply the theoretical foundation for The Great Global Average Temperature Totem?

    All corrections to incorrectos will be appreciated.

  42. I would like to say that I think the stock market analogy is not as bad as JM is making it out to be. Here’s JM:

    Bad analogy. In climate data there *is* an underlying physical process (no matter how loudly Briggs wishes to shout about it, he’s wrong on that point). In the stock market there is no physical process – only some abstract process that models the sum of the economic behavior of many actors.

    I agree that the stock market is measuring “some abstract process that models the sum of the economic behavior of many actors,” but I wonder whether we should be so quick to dismiss the idea that there is a unified, measurable thing at work here all the same. That is, all these independent actors are trying to maximize wealth by (a) innovating, (b) getting more efficient or (c) both. Repeat that a number of times in series for a large number of actors and you get something like a unified force. Which is a fancy way of saying that we DO know what “economic activity” is “working toward” – the accumulation of wealth. That’s why things like dollar cost averaging and indexed mutual funds work out – because over the long term (e.g. a 20 year span) there is always more value in the stock market than there was before – because there IS a predictable “force” at work here – or at least something that works like one. Notwithstanding, anyone who gets too confident in his predictions in the stock market will invariably get burned – just because of the nature of the data, the measurement techniques, and the nature of the “force” being studied. This seems a good analogy for climate change to me. That is, there IS something “out there” being measured, but we’re not clear on what all the contributing factors are, and the nature of our data collection methods make them somewhat unreliable. So – caution of the kind identified by Briggs here is in order.

  43. Matt:
    Could you restate your 5:54 comment with more explicit implications for Mann et al’s 2008 HS paper?

    I am not clear whether you are saying:
    (i) Mann et al’s approach is wrong because it fails to adequately recognize the large amount of uncertainty introduced by smoothing the proxies they use in their model; and/or
    (ii) Mann et al’s model, M, is but one of a number of possible models of the data and it is unclear what makes M superior to the other models; and/or
    (iii) Mann et al’s model, M, is problematic because it combines multiple proxies which in turn represent unspecified models and possibly competing models of the physical processes related to the earth’s temperature.

    You may also be focusing on something else that I have simply missed.

  44. Doubter:
    In the stock market there are no underlying physical laws? This is also wrong. We may never understand them, but it is wrong to say there are no underlying physical laws. Unless you want to use the Lewis Carroll method, Or invoke God again, I mean the blue one, not the stupid aqua one.

    What is the Monte Carol method?
    Depends on
    What? Is a rhetorical question.
    And that was not an answer. Was it?
    And who gives a fig?

  45. Apologies for spelling “dichotomy” incorrectly above. I really should not use big words if I do not know how to spell them.

    Can I put things in the way that I as a non-scientist understand the argument?

    As I understand it if we could know everything about past climate we could work out the exact cause of every temperature change everywhere in the world. This is based on the belief that we understand basic physics, and that climate is produced by purely physical activities. Our problems arise because we do not have (and in practice certainly cannot have) enough observational data to inform us completely.
    We know that every bit of weather is caused by purely physical activity, but our models of weather behaviour are imperfect because we do not have all the theoretically possible data. To describe the difference between what we observed and what our models had predicted as “noise” seems perverse. It is not the data that is “noisy” but the model that is inaccurate (or should I say imprecise, I am not sure of the distinction).

    I can well understand that if you had some data which had temperature at time 1 as 0C and at time 2 as 10C and at time 3 as 20C, simply to smooth these results before analysis and say that over the period from time 1 to time 3 the average temperature was 10 might throw out a lot of relevant information, and cause some false conclusions. Is that what this thread is all about?

  46. I’m still missing a more helpful explication of why not to smooth data. Something other than “don’t ever do it”. Intuitively, I would think there are some real dangers from it in terms of significance, in terms of losing insight, even in terms of confusing ourselves over the difference between data and processing. So I’m actually sympathetic to the kvetch.

    But I still don’t see the explication. And given that Mann’s paper got into the literature, perhaps it really would be good to make a scholarly comment that corrects the mistake and prevents more of the same (if it was a mistake).


  47. TCO,

    A perfect example would be from the post just above yours. Smoothing would remove your “extreme” points, then you could tout something like “Today’s temperatures are unprecedented for the last 1300 years…” and you would have a pretty graph which backs up your case.

  48. underlying process: I didn’t say anything about this for either the climate or the stock market, but since you brought it up, I will concede that there probably is one in both cases. But how much does that actually help us to predict the future?

    There are several levels of science. There’s the Newtonian mechanics level, which is pretty straightforward, even deterministic up to Heisenberg, or until you get into n-body problems. Then there is statistical mechanics, which can deal with huge numbers of particles pretty well, and not too hard until you get into turbulence and so on. Note that it says nothing predicting the paths of individual particles, so it is in a way less than Newtonian and in a way it is more. The third level is where you have complex complicated systems with many individuals and you want to predict paths at the individual level. Now it’s getting hard. Or fun, if you’re masochistic.

    Can we predict the path evolution will take? We know the fundamental principles pretty well, but we don’t know future events, random or not. And even if we did, it all depends on lots of little chance events. There is an element here that evolution will do what it wants just because it feels like it, if you don’t mind putting it that way.

    Climate is (maybe) like that. It is certainly not deterministic. Even if we knew the present conditions exactly, we could not predict the weather very far into the future, even with perfect computational ability. Edward Lorenz (famous for chaos theory and the butterfly effect) was on the right track.

    What is the physical model behind climate theory? It is certainly not linear. How much statistics does it take to prove that? Do we know enough of the physical model to say for sure it is not a random walk? Throwing everything that does not fit your preferred belief-system model into the category “weather” is not scientific.

    We find ourselves in the situation described above. We might be able to describe the space of possible outcomes, but we cannot know which one of those will be the actual outcome.

    In statistics, you can fit data points as closely as you want, but this is not a good thing, because you want to fit the actual system, not the particular data points.
    In fitting climate models to past weather, you can’t know if you are overfitting.

  49. Hurrah! Finally some statistical sanity.

    If one assumes, that the point to point data changes, are not noise, but are actually real believable data, then one may argue that each data point is actually supposed to be different, because the system under study made them different.
    So any kind of low pass filtering algorithm; or for that matter any kind of filtering algorithm, can only THROW AWAY INFORMATION.

    You do not add information by filtering; but you can remove noise by filtering, and noise in climate data is simply measurement errors of various kinds.

    If you do enough low pass filtering, you eventually end up with no variation at all; so why not simply keep a running average of ALL of the data values you have to date, and simply report the single number value of that average; it is probably at least as meaningful as your five year running average is. Remember that the first point of a data set must vanish from a five year running average, as will the last point, so you throw away history as well as actual data.
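    A rough sketch of that effect, assuming a centered running mean that only uses windows lying entirely inside the data (the function name `running_mean` and the 200-point random series are illustrative assumptions, not real temperature data):

```python
import numpy as np


def running_mean(values, window):
    """Running mean over windows that fit entirely inside the data."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


rng = np.random.default_rng(1)
series = rng.normal(size=200)  # stand-in for 200 yearly values (assumed data)

# Wider windows leave fewer points and less variation; a window as long as
# the series collapses everything to the single overall average.
for window in (5, 25, 100, 200):
    smooth = running_mean(series, window)
    print(window, smooth.size, float(smooth.std()))
```

    With a 5-point window the output is four points shorter than the input (two lost at each end), and with a 200-point window only one number, the overall mean, is left.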

    Actually, the same concept can be applied to the data gathered from different locations.

    On any northern summer day, one can measure temperatures on earth (surface) that can range as high as +60C, or as low as -90C, and every temperature between those extremes can be found somewhere. Those temperatures are different, because they are supposed to be different; so why attempt to average them in any kind of way, to get some number (+15C say) that wasn’t measured at any actual place at any actual time, and has no scientific significance to it whatsoever.

    That 150 C range of temperatures also covers a wide variety of terrains, and ground cover, even deep oceans, and the thermal energy flows in each of those different environments relate to the local temperature in totally different ways, so there is no relationship between the “average” global temperature (even if it was possible to measure such a number) and the energy balance of the planet.

    You can learn about as much by simply averaging all the telephone numbers in the Manhattan Telephone Directory, to come up with a mean telephone number. The numbers in that book are all supposed to be different, because each relates to a single telephone somewhere, and the average of all the numbers is of no interest to anyone, unless it happens to be your phone number.

    Signals can be improved by filtering out real noise; but nothing extra is learned by throwing away much of the real information contained in the signal.

  50. One slight correction; since we are being pedantic.

    The “King’s English” is a technical term describing the approved language of the Royal Court (British anyway), and has nothing to do with the occupier of that position.

    so there is no such thing as “The Queen’s English.”


  51. George,

    Shh! You’ll wake the children. Next you’ll be telling them that parameters and probability don’t exist either.

  52. Briggs,

    Hah, I like the Santa Math! I’m familiar with the writings of Bruno D., too.

    How about a discussion on the difference between intensive and extensive measurements with respect to linear operations on temperature? When I try that one on engineers, I usually get funny looks. Some of them average clock times, too.

  53. Does George E Smith not overstate the case when he criticises the use of global mean temperature?

    Imagine Starbucks head office receiving data every day from all its coffee shops with the gross takings, and the details of the sales of each item. At the end of each month it could work out the average sales per shop, the anomaly of this from the long term average for that shop at that time of year, and at the end of the year it could produce a time series to see what the trend is for this anomaly when averaged over the whole company. Now this trend is very far from telling the whole story about the health of the business, but that does not mean that it would not contain some useful information.

    Businesses do regularly publish their like-for-like sales figures and these figures are seized upon by analysts because they do give a picture of whether or not the business is growing. I know this analogy is not perfect, but without a GMT how would we know whether the world is warming or not?

    I don’t agree with him about “The Queen’s English” either. I remember when I was at school in England in 1960, using a text book that was called “The Queen’s English”, and I have just traced a book written in 1856 called “A Plea for the Queen’s English” by Henry Alford.

  54. “Signals can be improved by filtering out real noise; but nothing extra is learned by throwing away much of the real information contained in the signal.”


    I put your words through a language sharpness filter to find the sharp bits and never was a truer word spoken in jest. This is how the graph looks: all words out of quotes are mine. It’s poetic.
    “Just a word to the wise…”
    “scientists need to put their own babies on the chopping block and try to kill them.”good job you weren’t talking to the foolish!
    I think the whole jumping up and down on him for a remark is silly. Although opportunistic. I make black humor all the time. Hate PC speech codes or stifling of good fun. Let’s not use the namby pampby tactics of our opponents. It justifies them and makes us ball-less like them.
    “Using the actors for an inference is silly…’Capisce’?”
    “it is not clear as a ‘scolarly’ rebuttle”
    “I am actually concerned about using smoothed data for instance in a significance testbut I just want it better spelled out…”!!!
    “with math and references and such, not an on high put down”
    “and I love Sarah Palin”
    “I am very sympathetic to a concern about smoothing data…”
    “”but WM Briggs just has an on high post…”
    “If it’s such a no brainer”
    “an obvious thingk”
    “Why doesn’t WM Briggs get off a scholarly comment”?
    “Mann did at least concern himself somewhat with the issue”!!!
    “perhaps not optimally” (Understatement)
    “side just cackling and egging each other on”
    “Hu don’t know about your Math…. But appreciate your approach”
    “well ‘RIGHT’ it up as a comment then…”
    “leave out the “you hockey puck”” Who’s team are you in?
    “it is well know…(fill in”
    “fill in”
    “fill in”
    “fill in”
    “fill in”
    “given the complexity of his method.
    “BTW I’m particularly concerned about using smoothed data…””Zorita has been pretty clear about this ’ inuTitavely’”?
    “of course I’m not sure the mathematical extent..” evidently
    “that’s for a stats guy” Yep
    “heck, maybe there is even some sort ofesoteric argument”!!!
    “but lets “T” things up and have that argument clearly”

    “WM thanks for the kind words”
    “but this thing is out there now”
    “Jeff’s comment is silly and is what makes the rest of us look silly”!! How silly? Could we quantify?
    “I’m on all that and two steps past you”…and I’m half way back!
    “…more clearly nail it and in the literature. Just “stking”won’t cut it”…
    “so I’m actually sympathetic to the kvetch” did you say VET!!!
    “perhaps it really would be good to make a’scolarly’ comment…”!! couldn’t agree more.

  55. Patrick,

    Starbucks and the global climate are not commensurate. Starbucks pools and averages because all the dollars get mixed into the same pot, where they are exchangeable. (the dollars from Seattle spend just the same as the dollars from, say, Miami). George’s post points out that this is not the case with climate/temperature measurements. The temperature in Seattle does not get pooled with the temperature in Miami in the physical world, as it apparently does in the climate models.

    When I pool Starbucks (net) profits, I am essentially talking about a big pile of cash that “exists” somewhere. When I average temperature values taken at different locations, the “average” is a fantasy (an estimate of a parameter) that doesn’t describe any particular location.

  56. Air Force: Thanks.

    Joy: You lookin’ mighty fine in those comments…

    BillR: I think that there is obviously more information and potentially some very interesting information in the change in temp distribution (by latitude, by land versus water, for examples) rather than just the average versus time. All that said, looking at the average versus time is a simple way to get started on analysis.

    Pat: You go, girl.

  57. # Chris Colose wrote on 07 Sep 2008 at 3:19 pm

    “(I’m still getting my edumucation!!! Give me some time). It does interest me when blogs and other such places challenge the peer-review, and very rarely do I find the claims end up holding closer scrutiny.”

    Dear Chris,

    If you continue your education, eventually you will find that there is no such thing as a perfect paper and that many faulty papers pass peer review.

    However, I would not look to the claims of error that come out in the popular press or the wilder blogs.

  58. TCO:
    I make you right there! I missed the quotation marks out of this comment, one of your more poignant. I would hate to take the credit for such wisdom.

    “I think the whole jumping up and down on him for a remark is silly. Although opportunistic. I make black humor all the time. Hate PC speech codes or stifling of good fun. Let’s not use the namby pampby tactics of our opponents.” It justifies them and makes us ball-less like them.

  59. Bill R, I did say that I knew my Starbucks analogy was not perfect, but I think that it is better than you suggest. Let me see if I can improve it.

    Imagine if a sample of Starbucks coffee shops gave each customer a feedback questionnaire and every day these shops reported back to head office the average score of customer satisfaction. Adding up these scores would not be like adding up the dollars and cents into a big pile of cash that “exists” and presumably you would describe the “average” as a fantasy (an estimate of a parameter) that does not describe any particular location.

    But surely Starbucks would be getting useful information if they found that over a time there was a trend in this average?

  60. This is JH, not JM…wanna be J-L Picard though.

    Smoothing techniques can be used to not only filter out the noise but also remove seasonality to make the long term trend clearer. If one is to look at the trend, I think smoothing is a natural first step. They do sometimes pick up trends undetectable by our eyes.

    You know, I will soon need bifocals.

    The measurement errors can yield distorted modeling results, for example, the choice of best time-series model in terms of certain criteria. I suspect the trend probably won’t be affected. Of course, as you pointed out, it will depend on the (structure) relationship between the proxy/surrogate and true variable. Will have to do some study to be sure.

  61. I don’t see the problem with smoothing data over time and extrapolating trends. It worked well in the 1920s and 1990s for the stock market and 1945 to 2005 for predicting house prices. All of this variability stuff is vastly over-rated in my humble opinion.

    Just because the relationship between tree rings and temperature has some fluctuations in it and there appear to be fluctuations in the tree ring thicknesses over time doesn’t mean that you can’t get equally good relationships as the geniuses on Wall Street have been able to develop. Mathematical models ARE the real world after all. Everything else is an illusion.

  62. Rdd,

    Ah, but that kind of smoothing is for a different purpose. That is standard time series analysis with the goal of trying to predict future values of the same observable.

    The smoothing I am talking about is where one series (or more) is smoothed and then input into another analysis. The results will be too certain.

    We disagree about the “variability stuff.” I argue that most people underestimate variability, which is another way of saying they are too confident.
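    The “too certain” claim can be checked with a small simulation. This is a sketch under assumptions chosen only for illustration: both series are independent white noise, the smoother is a 10-point running mean, and the names `running_mean` and `correlation_spread` are hypothetical.

```python
import numpy as np


def running_mean(values, window):
    """Running mean over windows that fit entirely inside the data."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def correlation_spread(n_points=100, window=10, n_trials=2000, seed=42):
    """Spread of the sample correlation between two unrelated series,
    computed on the raw values and again after smoothing both series."""
    rng = np.random.default_rng(seed)
    raw, smoothed = [], []
    for _ in range(n_trials):
        a = rng.normal(size=n_points)
        b = rng.normal(size=n_points)  # generated independently of a
        raw.append(np.corrcoef(a, b)[0, 1])
        smoothed.append(
            np.corrcoef(running_mean(a, window), running_mean(b, window))[0, 1]
        )
    return float(np.std(raw)), float(np.std(smoothed))


raw_sd, smoothed_sd = correlation_spread()
print("spread of correlation, raw series     :", raw_sd)
print("spread of correlation, smoothed series:", smoothed_sd)
```

    The two series have nothing to do with each other, yet after smoothing the sample correlation wanders much further from zero. An analysis that treats the smoothed values as if they were independent observations will therefore report more confidence than the data support.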

  63. Briggs

    I’m really sort of confounded here (stats is not my forte), but I can’t really see your point.

    You start with two processes which I’ll rewrite as follows (if I mess this up please let me know):

    X0 = ax + b + w(v,s0)
    X1 = ax + b + w(v,s1)

    where a = slope (same for both), b = offset (same for both), w = white noise process (same for both), v = variance (same for both), but ….

    s0 = seed for process 0
    s1 = seed for process 1

    The important thing here IMHO is that both X0 and X1 are identical *except* for the seed values.

    You then generate your input data by letting both processes run for a while, but because the seeds are different the output is different. To the eye, they look completely uncorrelated, provided the amplitude of w() is large enough to overwhelm (or at least disguise) the linear process “ax + b”.

    At this point you say X0 “has nothing to do with” X1. (I don’t agree, because stochastically they are identical, but I’ll pass on that point for a second.)

    You then start doing correlations and smoothing until you get to the following situation:

    X0-smoothed = ax + b
    X1-smoothed = ax + b

    i.e., you’ve removed all the variation which was originally due solely to the differing seeds for the white noise process w().

    So what? You started with the same thing, did two runs to get some output then started smoothing to remove the random element and ended up with the same thing.

    Why are you surprised?

    Now second question:- If your purpose is to uncover (and validate or deny) the signal of “ax + b”, how could you possibly do this without removing w() by smoothing? Isn’t that the point of the whole exercise?

    Now if the white noise process w() was what you were interested in, I could understand your complaint, it has been eliminated by smoothing.

    But if the signal of the physical process you are trying to detect (ax + b) is the item of interest (which it is in climate matters), why are you concerned that the noise (weather) is removed by smoothing?

    Sorry, I just don’t get it.

  64. JM,

    Your equations don’t represent what I did. They should start

    X0_i = epsilon_0i
    X1_i = epsilon_1i

    where epsilon_ji ~ N(0,1). The “a” and “b” in your equations should be 0.

    “Stochastically” the same? Well, let’s say that our uncertainty in X0_i is described by a normal distribution with parameters 0 and 1, and knowledge of X0_j (where j does not equal i) or any X1_k (for all k) does not change this. Same for X1_i.

    Your second set of equations aren’t right either (well, aren’t right for what I did).

    S0_i = a*X0_(i-1) + a*X0_(i-2) + … + a*X0_(i-k)

    where a = 1/k, and the same thing for S1_i. This holds when we do the running means smoother. For the low pass, it’s slightly different.
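
    The setup above can be sketched in a few lines. This is a minimal illustration, not the code behind the original post: the series length, the window k = 10, the number of trials, and the helper names (running_mean, pearson, mean_abs_corr) are all assumptions chosen for the example. It draws two independent N(0,1) series, applies the trailing running mean S_i to each, and compares the average absolute correlation before and after smoothing.

```python
import math
import random


def running_mean(x, k):
    """Trailing mean S_i = (x[i-1] + x[i-2] + ... + x[i-k]) / k, i.e. a = 1/k."""
    return [sum(x[i - k:i]) / k for i in range(k, len(x) + 1)]


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_abs_corr(n=100, k=10, trials=200, seed=1):
    """Average |correlation| of two independent series, raw and smoothed."""
    rng = random.Random(seed)
    raw_total = smooth_total = 0.0
    for _ in range(trials):
        x0 = [rng.gauss(0, 1) for _ in range(n)]  # X0_i = epsilon_0i
        x1 = [rng.gauss(0, 1) for _ in range(n)]  # X1_i = epsilon_1i
        raw_total += abs(pearson(x0, x1))
        smooth_total += abs(pearson(running_mean(x0, k), running_mean(x1, k)))
    return raw_total / trials, smooth_total / trials


raw, smooth = mean_abs_corr()
print(f"mean |r|, raw series:      {raw:.3f}")
print(f"mean |r|, smoothed series: {smooth:.3f}")
```

    The two series have nothing to do with each other by construction, so any increase in the typical size of the correlation after smoothing comes from the smoother alone.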


  65. It all depends. We engineers smooth input data to our Kalman filters all the time. However, you generally want to make sure that the smoothing prefilter has a much higher bandwidth than the Kalman filter, or you may be taking out the information it needs to converge accurately. You have to be especially careful of inducing excessive phase delay, which is why we usually like to have at least an order of magnitude between the bandwidth of the Kalman filter and the smoothing prefilter in any real-time processing. If you are using a “moving average” smoothing prefilter, what we like to call a “finite impulse response filter”, you can compensate phase delay in post-processing just by shifting the data forward. If the smoothing prefilter bandwidth is approaching the bandwidth of the Kalman filter, then you are best advised to model the prefilter difference equations in the Kalman filter. This task can be considerably simplified without increasing the dimension of the filter state if you decimate the prefiltered data, which you might as well do as you just smoothed out all the high frequency content anyway.

    I doubt many of you will read this far. If you skipped ahead, let me put it in a nutshell – you can smooth input data, but you have to be careful how you do it and how you process subsequent data. It takes a very experienced and very talented operator to do it right so, if you are not sure of the bona fides of the person doing it, you should proceed with caution.
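
    The phase-delay remark about moving averages can be checked on a toy signal. The sketch below is an illustration only: the window length k = 5, the ramp input, and the function name causal_moving_average are assumptions, and it stands in for no particular prefilter. A k-point trailing average delays its input by (k - 1)/2 samples, so re-labelling each output with a time stamp shifted by that amount lines it up with the input again.

```python
def causal_moving_average(x, k):
    """k-point trailing average; output j corresponds to time t = j + k - 1."""
    return [sum(x[t - k + 1:t + 1]) / k for t in range(k - 1, len(x))]


k = 5                                  # odd window, so the delay is a whole number of samples
x = [float(t) for t in range(20)]      # a ramp: x[t] = t
y = causal_moving_average(x, k)

delay = (k - 1) // 2                   # delay of the moving average, in samples
times = list(range(k - 1, len(x)))     # time stamps at which each output is produced

# On a ramp, the output produced at time t equals the input at time t - delay.
lagged_input = [x[t - delay] for t in times]

# Compensation in post-processing: shift the time stamps by the delay.
compensated_times = [t - delay for t in times]

print("filter output:        ", y[:3])
print("input delayed by", delay, ":", lagged_input[:3])
```

    The shift only works after the fact, since the value assigned to time t uses samples that arrive later than t, which is why it is a post-processing fix rather than a real-time one.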

  66. I wouldn’t call smoothed data fictional data, I would call it transformed data. I’ll agree that it can increase spurious correlation but that doesn’t mean that there isn’t some dynamic model that fits it well. Whether you have enough data to fit that model with reasonable certainty is another question. If you take the inverse transform of your fit, then you can see how much of the real data is explained by the fit to the transformed data.

  67. John,

    Not quite. An infinite number of models will fit your already observed data, and will fit it as close as you like with respect to any measure of goodness. But so what? The real test of a model is its ability to well predict new data.

    Also, search for the term “smoothing” on the site here.

  68. Whether a model makes good predictions is outside the scope of my comment. Even a simple line can make predictions as it is an estimate of the derivative and the Taylor series tells us how much error is introduced by higher order derivatives assuming we have a decent estimate of the first derivative. In some circumstances within the same process the estimate of the derivative will predict the future well, and in others it won’t. We cannot know when there will likely be a rapid or discontinuous change in the trend without a good understanding of the underlying process.

    With regard to climate science, it is nonlinear, so there are likely processes like hysteresis and possibly bifurcations that will introduce rapid changes in the behavior of the system. When these changes occur, we will have to throw out our fit and start again. Most simple model fits are an approximation about some region of a much more complex system.

  69. I can’t find it here but John S made a comment about cross spectral correlation. While not totally related to this thread, I find it interesting because the peaks of the cross power spectrum mark areas in the frequency spectrum where there could possibly exist some correlation, and the narrower the bandwidth of these peaks the easier it will be to use them to make predictions. However, the narrower the peak the less statistical evidence there will be to evaluate whether the correlation is spurious.

  70. “We engineers smooth input data to our Kalman filters all the time.”

    Um, no. Not this engineer. Never, ever, ever. By “smoothing” the inputs, you introduce time-correlation into the inputs, which you then should have to represent in your measurement model. Not good. No, the correct way to handle noisy measurements is by modeling the amount of noise in the measurement in the measurement error covariance (usually called “R”, in notation found in Gelb, for instance).

    If you’re only smoothing in one direction, you’re introducing a time lag, which may throw off your filter in a completely different way. Frequently your measurement is a dynamic quantity, so time delay is going to result in residuals that are time-correlated in a way that is almost sure to confuse a filter that you’ve made overly confident by smoothing its inputs, and then reflecting the attendant reduction in noise by lowering “R”.

    It is true that sometimes we engineers are forced to feed one Kalman filter with the output of another Kalman filter. There are ways for handling these situations, but you wouldn’t do this by design; you do it because you have no other choice. I’ve done a number of these, and never once has it been necessary or desirable for me to smooth the inputs to the downstream filter.

    EOR (End of Rant)

    NB: I work in defense, not climate science. It may be that complete no-nos in the world of defense are less important in climate science, but it is NOT true in general that Kalman filter measurements are smoothed or otherwise manipulated. The measurement is your best information; by messing with it, you are destroying information.
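
    The time-correlation that smoothing introduces is easy to see numerically. The following is a minimal sketch under assumed settings (white N(0,1) measurement noise, a 10-point running mean, and made-up helper names), not a Kalman filter. It compares the lag-1 autocorrelation of the noise before and after smoothing; for a k-point running mean of white noise the theoretical value is (k - 1)/k.

```python
import random


def running_mean(x, k):
    """k-point running mean (one output per full window)."""
    return [sum(x[i:i + k]) / k for i in range(len(x) - k + 1)]


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


rng = random.Random(3)
k = 10
noise = [rng.gauss(0, 1) for _ in range(5000)]   # uncorrelated measurement noise
smoothed = running_mean(noise, k)

raw_ac = lag1_autocorr(noise)
smooth_ac = lag1_autocorr(smoothed)
print(f"lag-1 autocorrelation, raw:      {raw_ac:.3f}")
print(f"lag-1 autocorrelation, smoothed: {smooth_ac:.3f}")
print(f"theory for k-point mean:         {(k - 1) / k:.3f}")
```

    A measurement model that assumes independent errors from sample to sample matches the raw noise but not the smoothed noise, which is the mismatch described above.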

  71. I think one of the problems here is that what Mann is doing, IIRC, is smoothing a naturally varying (but accurately measurable, one might suspect) process. In doing so, he’s removing a great deal of the natural variation in the name of “noise” reduction. And also, inevitably, introducing a different kind of time correlation into the data than is in there in the first place. Possibly to the point of dominating or even masking what’s actually in there.

  72. Slartibartfast, doesn’t the process of sampling smooth data, because we don’t sample over an infinitesimal period of time and no measurement device has infinite bandwidth?

  73. Nothing’s missing, a thermometer has thermal mass therefore it smooths data, a microphone diaphragm has inertia therefore it smooths data, a turbine flow meter has rotational inertia, therefore it smooths data, a digital to analog converter has inductance and capacitance and therefore it smooths data, etc, etc, etc, …….

  74. My apologies, John; I had interpreted “smooth” in your question as an adjective, not a verb. My error.

    Sampling does inherently roll off the frequency content of the data. Nyquist, and all that. Sampling sensors do have frequency limitations as well. All this says is if we’re missing important frequency content by sampling and by using the sensors we use, then we’re sampling too slowly, and using sensors that are not sufficient to the task. Typically, you use a measurement process that introduces time-correlation of the data that’s short compared with your treatment of the data; in other words, you want to provide data samples to your estimation process that are spaced several time constants apart, so that they are effectively decorrelated. Barring that, you’ve got to account for the time-correlation effect in the data processing.

    Which means that, in the case of proxies (and I feel that I need to emphasize here that I am not any sort of climatologist or dendrochronologist), you have to have some kind of model that represents the timewise correlation (as well as, and this is very important, correlation to other variables) of the measurements. If the sequence in question was the reading of a single thermometer, sampled hourly over a span of years, why then you’d have to account for diurnal and seasonal temperature variation (as well as other important effects) as a prominent artifact of the measurement. I haven’t looked to see what proxies the chart in the main post represents, but a similar understanding of the random and nonrandom drivers in the measurement is a must.

    As I said: not a climatologist. But I do optimal estimation for a living, and I have a great deal of experience with model-based estimation. Most important thing: your model has to be sufficiently detailed to reflect how the real world is behaving.

  75. From where I stand, a Monte Carlo Routine could be either a statistical process or a bunch of really pretty girls kicking up their legs (I much prefer the latter…).
    What is a Monte Carlo Routine?