You will have already seen this comparison graph from the Daily Mail.
That paper points out correctly that the version of the plot provided by BEST is presented in such a fashion as to obscure that not much of interest has happened over the past decade.
But this graph also obscures the uncertainty in the curve. It appears as if it is certain that historical temperatures were lower in years prior to 1950. This is false. The further back in time we go, the less certain we are of what the global average temperature was. We cannot tell with sufficient confidence whether dates before 1900 were warmer or cooler than they are now.
According to the Daily Mail story:
[Muller] admitted it was true that the BEST data suggested that world temperatures have not risen for about 13 years. But in his view, this might not be ‘statistically significant’, although, he added, it was equally possible that it was — a statement which left other scientists mystified.
‘I am baffled as to what he’s trying to do,’ Prof Curry said.
Prof Ross McKittrick, a climate statistics expert from Guelph University in Ontario, added: ‘You don’t look for statistically significant evidence of a standstill.
‘You look for statistically significant evidence of change.’
Muller’s and McKittrick’s reported[1] comments betray a (common) misunderstanding of what statistical “significance” means. Here is what it does mean in the context of temperature change.
To achieve statistical “significance” requires three things: a start date from which the analysis begins, an end date on which the analysis ends, and a fixed probability model. All three are arbitrary, at least partially ad hoc, and changing any of them will give different results. Which is the best start date? Depends on what question you want to ask. Is the best end date always today? No. And what probability model is best? Nobody knows.
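A minimal sketch of how the verdict moves with the dates. The series is a made-up toy (a small drift plus a slow wave), not BEST data, and the straight-line model and the helper names `ols_trend` and `trend_by_window` are assumptions chosen only for illustration. The ratio of slope to standard error is the quantity a classical test would compare against a cutoff.

```python
import math

def ols_trend(y):
    """Least-squares slope of y against 0, 1, ..., n-1 and its standard error."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (yi - y_mean) for x, yi in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - (intercept + slope * x)) ** 2 for x, yi in enumerate(y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, se

def trend_by_window(series, windows):
    """Fit the same straight line over several (start, end) index windows."""
    results = {}
    for start, end in windows:
        slope, se = ols_trend(series[start:end])
        ratio = slope / se if se > 0 else math.inf
        results[(start, end)] = (slope, ratio)
    return results

# Toy monthly series: hypothetical numbers, NOT real temperature data.
toy = [0.002 * t + 0.3 * math.sin(t / 7.0) for t in range(240)]
windows = [(0, 240), (0, 120), (120, 240), (180, 240)]

for (start, end), (slope, ratio) in trend_by_window(toy, windows).items():
    print(f"months {start:3d}-{end:3d}: slope {slope:+.4f} per month, slope/SE {ratio:+.2f}")
```

Same data, same model, different windows, different answers. Swapping the straight line for another model would move the numbers again.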
Any analysis also assumes that the data underlying these three choices is perfect, error free, and that it represents the question asked. For example, are the few land surface stations we have chosen truly representative of global average temperature? Let’s not argue about that and instead assume that BEST’s data is perfect, representative, etc.
Take the (assumed perfect) Daily Mail graph at January 2001. What is the probability that the temperature was warmer in January 2007? Do look at it before answering.
The answer is 1, or 100%. And the reason is that it was certainly warmer in 2007 just because the measurements showed that it was. Is this increase (for we already know it was an increase) “statistically significant”? We have start and end dates, we have assumed pure data, so all we need is a model. Which should we choose?
People are inordinately fond of (various forms of) straight-line regression. Why? I’m guessing when I say simplicity. This is true even though there is not much solid physical evidence that the atmosphere responds linearly to forcings and feedbacks over any scale measured in years, while there is plenty of evidence that the atmosphere instead responds non-linearly, perhaps even chaotically. What physics dictates temperature should increase only in a straight line over any two arbitrary time points?
But never mind that. For as soon as we start asking these kinds of questions, we have already gone off the rails. Remember: we already know that the temperature was higher in 2007 than in 2001. We were done before we began.
To insist on answering whether the change was “statistically significant” is nonsensical except in two cases. The first is that we believe the data was measured with error and we want to ascertain what the real change in temperature was, given our belief that we have modeled the error and the time course of temperature accurately. The second is that we believe the data sources were different over this time span and we want to quantify the probability of a change given no difference in the data sources. But that, too, requires having a model of the change and of the time course of temperature.
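A sketch of the first case, where the readings carry error. It assumes (my assumption for illustration, nothing more) that each reading is the truth plus independent Gaussian error with a known standard deviation, and that we have no other information. The function name and the numbers are hypothetical.

```python
import math

def prob_real_increase(observed_start, observed_end, error_sd):
    """Probability that the true end value exceeds the true start value.

    Assumes each reading equals truth plus independent Gaussian error
    with standard deviation error_sd (an illustrative assumption).
    """
    diff = observed_end - observed_start
    if error_sd == 0:
        # Perfect measurements: the observed change is the real change.
        return 1.0 if diff > 0 else 0.0
    # The difference of two independent errors has sd = error_sd * sqrt(2).
    z = diff / (error_sd * math.sqrt(2))
    # Standard normal CDF evaluated at z.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical readings of 0.4 and 0.6 under three assumed error sizes.
for sd in (0.0, 0.1, 0.5):
    print(sd, prob_real_increase(0.4, 0.6, sd))
```

With no error the probability is 1 and we are done before we begin. Only once error enters does the question have any content, and the answer then depends entirely on the error model we assumed.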
As it is, for data over the last decade, which has a large component of satellite measurements, there is (probably) negligible error, thus there is no need for any statistical model[2]. But there is still this notion of “no change” we would like to get at. We cannot say there has been no change, because (looking at the graph) clearly there has been. But we can say things like “Only thrice have monthly temperatures increased more than 1.6 degrees centigrade over the 1950 to 1980 average.” And so on.
Incidentally, why 1.6 degrees? Why not? It too is arbitrary: pick whatever number is meaningful to the exact decision you want to make.
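Statements of this kind need no model at all, only counting. A minimal sketch, with made-up anomalies standing in for the real series and a hypothetical helper name:

```python
def count_exceedances(anomalies, threshold):
    """Count how many values are strictly greater than threshold."""
    return sum(1 for a in anomalies if a > threshold)

# Hypothetical monthly anomalies in degrees C relative to a baseline; not real data.
monthly = [0.9, 1.7, 1.2, 1.65, 0.4, 1.8, 1.6, 1.1]

# Try whichever thresholds matter to the decision at hand.
for threshold in (1.0, 1.6):
    print(threshold, count_exceedances(monthly, threshold))
```

Change the threshold and the count changes, which is the point: the number has to come from the decision, not from the data.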
For historical data we do need a statistical model because temperature was measured with error and the components that went into creating the (operationally defined) global average temperature (GAT) have changed. We can’t directly measure what the GAT was, so we have to model[3]. Since a model is necessary to declare a change (within some level of certainty), so are beginning and end dates. Which to choose? And which model? What physical justification do you offer for these three choices?
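The footnoted distinction between predictive uncertainty and parameter uncertainty can be sketched with textbook regression formulas. I use a straight line here only because its formulas are short, not because it is physically justified; the data and the function name are hypothetical.

```python
import math

def line_uncertainties(y, x0):
    """Fit a straight line to y at x = 0..n-1; return two standard errors at x0.

    Returns (se_fitted_line, se_new_observation). The second adds the
    scatter of individual values around the line, so it is always larger.
    """
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (yi - y_mean) for x, yi in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - (intercept + slope * x)) ** 2 for x, yi in enumerate(y))
    s2 = sse / (n - 2)  # residual variance
    leverage = 1 / n + (x0 - x_mean) ** 2 / sxx
    se_line = math.sqrt(s2 * leverage)       # uncertainty in the model's line
    se_new = math.sqrt(s2 * (1 + leverage))  # uncertainty in an actual value
    return se_line, se_new

# Hypothetical yearly values; not real temperature data.
values = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 0.5, 0.8]
for x0 in (4, 12):
    print(x0, line_uncertainties(values, x0))
```

Quoting only the first number makes the model look more certain about the temperature itself than it is. Both numbers grow as `x0` moves away from the middle of the data.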
Change focus: The best picture that gets at the idea of uncertainty comes from Anthony Watts:
This shows NASA’s GISS and BEST’s data for Los Angeles. The difference between these two sources (even for years that are not historically remote) is as much as two degrees. Is it any wonder, then, that some of us are concerned when we hear predictions that, say, fifty years hence the GAT will be 0.5 degrees higher than some arbitrary number, when we cannot even say within that accuracy what the average temperature of Los Angeles was last year?
Update: Still assuming the perfection of the data points in the last decade, something caused the GAT to change. What? A statistical model (chosen from an infinite number of models) might be fit and insight might, just might, be gleaned from it, but far better would be to investigate the physics over this period. And if that statistical model has any value, it should be able to skillfully (used in the technical sense) predict data into the future. We cannot tell whether the model is actually good until we wait to see what happens.
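Skill in the technical sense means beating a naive reference forecast on data the model has not seen. A minimal sketch using a mean-squared-error skill score, with hypothetical numbers and with persistence (repeat the last known value) as the assumed reference:

```python
def mse(predictions, observations):
    """Mean squared error between paired predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(predictions, observations)) / len(observations)

def skill_score(model_preds, reference_preds, observations):
    """1 - MSE(model) / MSE(reference); positive means the model beat the reference."""
    return 1 - mse(model_preds, observations) / mse(reference_preds, observations)

# Hypothetical numbers only: values observed after the model was fit.
observed = [0.50, 0.55, 0.45, 0.60]
model = [0.52, 0.50, 0.50, 0.55]        # forecasts from some fitted model
persistence = [0.48, 0.48, 0.48, 0.48]  # naive reference: repeat the last known value

print(skill_score(model, persistence, observed))
```

A score of 1 is a perfect forecast, 0 is no better than the reference, and a negative score means we would have done better by guessing naively. None of this can be computed until the future arrives.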
[1] I say “reported” because of my suspicion that these words are imperfect representations of what each gentleman said.
[2] A model is only needed for these years if one wants to predict temperatures beyond the end date. There is no reason to predict what we already know.
[3] And we should give our predictive uncertainty of this temperature, not the uncertainty of the parameters in our model. See this.