A reader writes:
I am a fairly new reader of your blog, coming from WattsUpWithThat and reading with delight and frustration your thoughts on statistics and climate. I have a question in this regard, as I am trying to figure out a thing or two about my local climate and weather.
Recently we were told that Norway (my home) was 2-something degrees warmer in 2014 than the “normal”. I asked our Met office how they calculated this, and the reply I got was that they take all stations available and smear them across a 1×1 km grid through some kind of interpolation in order to get full coverage of mainland Norway. And thus they can do an average.
Now, I realise that whatever we do there is no such physical thing as a mean temperature for Norway. But say that I would like to calculate an average of the available data, would it not be more appropriate to just do the average of all stations without the interpolation?
I would really appreciate it if you would give an opinion on this.
Anders Valland
Interpolation is a source of great over-certainty. Here’s why.
You can operationally define a temperature “mean” as the numerical average of a collection of fixed, unchanging stations. The utility of this is somewhat ambiguous, particularly for large areas, but adding a bunch of numbers together poses no theoretical difficulty.
The problem comes when the stations change—say an instrument is swapped—or when old stations are dropped and new ones added. That necessarily implies the operational definition has changed. Which is also fine, as long as it is remembered that you cannot directly compare the old and new definition. Nobody remembers, though. Apples are assigned Orange designations. It’s all fruit, right? So what the heck.
It gets worse with interpolation. This is when a group of stations, perhaps changing, are fed as input to a probability model, and that model is used to predict what the temperature was at locations other than the stations. Now, if it worked like that, I mean if actual predictions were made, then interpolation would be fine. But it doesn’t, not usually.
There are levels of uncertainty with these probability models, which we can broadly classify into two kinds. The first is that internal to the model itself, which is called the parametric uncertainty. Parameters tie observations to the model. If you can remember the “betas” of regression, these are they. Nearly all statistical methods are obsessively focused on these parameters, which don’t exist and can’t be seen. Nobody except statisticians cares about parameters. When the model reports uncertainty, it’s usually the uncertainty of these parameters.
The second and more important level of uncertainty is that of the prediction itself. What you want to know is the uncertainty of the actual guess. This uncertainty is always, necessarily always, larger than the parametric uncertainty. It’s hard to know without knowing any details about the models, but my experience is that, for interpolation models, prediction uncertainty is 2 to 8 times as large as the parametric uncertainty. This is an enormous difference.
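To see the size of the gap, here is a minimal sketch with made-up numbers (not any Met Office or interpolation code): an ordinary regression fit with statsmodels, where the confidence interval on the fitted mean plays the role of the parametric uncertainty and the prediction interval plays the role of the predictive uncertainty.

```python
# Minimal sketch, fabricated data: parametric vs. predictive uncertainty in OLS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 30)
y = 2.0 + 0.5 * x + rng.normal(scale=2.0, size=x.size)   # fake noisy readings

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

new_X = np.array([[1.0, 5.0]])                 # intercept column plus x = 5
pred = fit.get_prediction(new_X).summary_frame(alpha=0.05)

ci = pred["mean_ci_upper"] - pred["mean_ci_lower"]   # parametric (the betas)
pi = pred["obs_ci_upper"] - pred["obs_ci_lower"]     # predictive (a new value)
print(float(ci.iloc[0]), float(pi.iloc[0]))
```

Run it and the prediction interval comes out several times wider than the interval on the fitted mean, which is the whole point.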
If the interpolation is used to make a prediction, it must be accompanied by a measure of uncertainty. If not, toss it out. Anybody can make a guess of what the temperature was. To be of practical use, the prediction must state its uncertainty. And that means prediction and not parametric uncertainty. Almost always, however, it’s the latter you see.
You have to be careful because parametric uncertainty will be spoken of as if it is prediction uncertainty. Why? Because of sloppiness. Prediction uncertainty is so rare that most practitioners don’t know the difference. In order to discover which kind of uncertainty you’re dealing with, you have to look into the guts of the calculations, which are frequently unavailable. Caution is warranted.
The uncertainty is needed to judge how likely that claimed “2 degrees warmer” is. If the actual prediction with prediction uncertainty is “90% chance of 2 +/- 6” (the uncertainty needn’t be symmetric, of course; and there’s no reason in the world to fixate on 95%), then there is little confidence any warming took place.
But watch out for parametric uncertainty masquerading as the real thing. It happens everywhere and frequently.
The messing with data justified by pseudo-statistics is just another form of fudging, less obvious than simply altering the data as in the case of the Paraguayan weather stations. See
https://notalotofpeopleknowthat.wordpress.com/2015/01/26/all-of-paraguays-temperature-record-has-been-tampered-with/
Thank you for this elaboration on my question. I am going to have to take our met office to task for this then and ask about both the parametric and prediction uncertainties.
One of the reasons I started looking into this was that I could not get my head around what the uncertainties in these estimates would be. That is why I initially thought that it would be more appropriate to simply calculate the average of the stations. I was thinking about how to handle the inevitable changes in instrumentation, closing and starting of stations etc. I still believe that doing the simple calculation should achieve the same thing IF we are looking for changes only. I am not arguing that it would make more sense, only that it would be as appropriate as the smearing/infilling method.
And lately this issue has become even more serious, what with the discussion of 2014 as a record year. I believe what you say here just underscores that the +/- 0.1 degree uncertainty is way below the real figure.
The more that is averaged, the less that is known. We sacrifice completeness for convenience. Dr. Heisenberg had a tiny insight about this sort of thing.
I had to prepare a prediction of a certain fuel consumption over the next 5 years, and to do that I had to establish a monthly relationship between temperatures, number of users over time, and consumption. The relationship turned out to be quite linear while daily temperatures were under 15°C, yet there is an almost constant residual consumption for temperatures above 15°C. With such a ski-like piecewise linear curve I could produce a well-converged model of the past and establish the 2 sigma boundaries. So with my simple model and a simplified model of temperature change I could do the projection with diverging upper and lower boundaries… which will prove wrong as soon as temperature variation does something extraordinary. And we know it often does just that. So I put that sort of comment in the conclusion.
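For what it’s worth, a rough sketch of that ski-shaped relationship with fabricated numbers (a simple heating-degree-day style regression, not the commenter’s actual model):

```python
# Rough sketch, made-up data: consumption linear below 15 °C, flat above it.
import numpy as np

rng = np.random.default_rng(5)
temps = np.linspace(-10, 25, 200)                 # daily mean temperatures
hdd = np.maximum(15.0 - temps, 0.0)               # heating demand below 15 °C
consumption = 50 + 8 * hdd + rng.normal(0, 10, temps.size)

X = np.column_stack([np.ones_like(hdd), hdd])
beta, *_ = np.linalg.lstsq(X, consumption, rcond=None)
print(beta)   # roughly [50, 8]: a flat base load plus a per-degree heating slope
```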
A whole process can be explained like this: measure with a micrometer, mark with a chalk, cut with an axe, and kick it in its place.
I know precisely all the mistakes I made in my prediction, which got truncated to the model’s sigmas (cut with an axe), but there was no way to do it any better with the data at hand, and the whole idea of predicting something, even with perfect data, is a slippery thing.
The obfuscation starts at hourly temperature averages. The hourly averages are fine as such, but there are holes in hourly records, and you seldom have opportunity to work with raw data, so you grab the daily, or worse monthly averages. In monthly averages you already have two rounds of averaging upon interpolation. And so on.
In my case the urban heat island effect worked in my favour, as I had to do a prediction for fuel distribution in a city area only, precisely where temperature data came from.
Been there, done that. It requires a certain level of conceit to brag about such predictions while knowing what exactly stands behind it.
The funniest part is that the bosses were very happy. My conscience is clear thanks to the comment at the end of the report, but I’m fairly sure no one paid attention to it. Such things go past the executive summary part.
Reminds me of a TV show called “The Agency” where they reconstructed the image of a face by interpolating between what could have been only four pixels, captured from the reflection off the back of a car’s side mirror by a liquor store camera. Hey! It’s done with a computer, so how could it be wrong?
It is science!
How can you even question that?
GISTemp:
http://pubs.giss.nasa.gov/docs/2010/2010_Hansen_etal_1.pdf
HADCRUT3/4:
http://onlinelibrary.wiley.com/doi/10.1029/2005JD006548/full
http://www.earth-syst-sci-data.net/6/61/2014/essd-6-61-2014.pdf
Briggs (pt. II),
Berkeley Earth:
http://static.berkeleyearth.org/papers/Methods-GIGS-1-103.pdf
Looks like a lotta guts to me.
I have a question on using the “average” of data. If the data is extremely divergent, such as the average temperature of the globe, what meaning does the average have? Average seems to only be useful if the numbers are very similar. Otherwise, it’s just a number, like “average height”, “average income” etc. People use it because it’s easy. Is easy correct in this case?
In regards to adjustments, etc.:
“Government climate scientist Peter Thorne, speaking in his personal capacity, said that there was consensus for the adjustments.”
So consensus is there. How dare you question consensus!
(http://www.foxnews.com/science/2013/01/10/hottest-year-ever-skeptics-question-revisions-to-climate-data/)
Brandon,
See the Classic Posts for more details on BEST.
Don’t forget that there are all types of interpolation formulas, linear, quadratic, polynomial, etc. It’s like curve fitting. Practically everybody fits a polynomial to the data but you can use other functions.
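A quick illustration of the point, with made-up readings: the same five points infilled linearly and with a cubic spline give different values in between, so the “interpolated temperature” depends on the choice of function.

```python
# Quick illustration, fabricated readings: the infilled value depends on the interpolant.
import numpy as np
from scipy.interpolate import interp1d

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.5, 1.0, 4.0, 3.0])      # made-up station readings

linear = interp1d(x, y, kind="linear")
cubic = interp1d(x, y, kind="cubic")

print(float(linear(1.5)), float(cubic(1.5)))  # same data, different "infill"
```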
Sheri,
Measures of central tendency can be a hot-button topic (pun intended?). Part of me thinks that they are hold-overs from when plotting and other forms of data presentation were hard. A mean and a sigma were easily calculated and transmitted. Some data sets (I think HADCRUT4 is one of them) don’t use the mean; they use the median of an ensemble of some kind. So we get to argue about mean, median, and mode, along with whether or not those ‘mean’ anything in the context.
Nowadays, it’s a snap to generate histograms and other pictures of the density of values. With those, fat tails can be easily seen by the viewer. There is not much excuse any more to show a mean and a standard deviation when you can make a picture (and write the values on the figure). A mean and standard deviation are not the easiest things to picture, unless the variable is “normally distributed”. If there is skew or fat tails, you’re SOL.
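For instance, a minimal sketch with simulated data (not temperatures): a histogram makes the fat tails of a Student-t sample obvious in a way a mean and a standard deviation never would.

```python
# Minimal sketch, simulated data: histograms show fat tails directly.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
normal_sample = rng.normal(size=5000)
fat_tailed = rng.standard_t(df=2, size=5000)   # Student-t with heavy tails

plt.hist(normal_sample, bins=80, range=(-8, 8), alpha=0.5, density=True, label="normal")
plt.hist(fat_tailed, bins=80, range=(-8, 8), alpha=0.5, density=True, label="fat-tailed t")
plt.legend()
plt.show()
```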
Sheri,
The absolute surface temperatures are quite different from place to place, which is the argument for doing the anomaly calculations and then taking the area-weighted averages of the resulting trends. Which are still divergent … the poles are more sensitive than the tropics.
Briggs,
Speaking of BEST, Mosher is taking questions over at ATTP’s today: https://andthentheresphysics.wordpress.com/2015/02/09/guest-post-skeptics-demand-adjustments/
If the temperature adjusters are asked a question along the lines of: “Can you possibly justify altering a temperature reading at a station that occurred 20 years ago based on a temperature reading in that region that happened yesterday? Your model is not even physically real?”
There are two types of replies: “This is just the way we do things.” And “The problems you’ve found are obscure anomalies that don’t affect the overall result.”
Although it’s hard to see, if they alter past temperatures by as much as 4C, how that can in any sense be justified on any grounds whatsoever. But they just shrug it off.
One real problem with interpolation is missing intervening temperatures. For example, there are places in Pennsylvania and Maryland which regularly reach winter temperatures as low as -40F that would be completely passed over by interpolating from surrounding readings.
Another is that the method of interpolation used may not be appropriate for some gaps. For example, the ground temperatures between Pittsburgh and DC can vary by more than 15 degrees but, considering there is an intervening mountain range between them, a linear interpolation certainly doesn’t seem reasonable. Even Baltimore and Quantico, which are effectively on the same plain and where you would expect similar temperatures, have had differences of 20 degrees or more. There have even been recorded differences of similar magnitude between DC and Quantico, though they are much closer together. While a linear interpolation may seem reasonable, one really doesn’t know.
If these prove difficult, then think of the difficulty of interpolating ocean surface air temperature when the only measurements available are land-based and thousands of miles apart.
The thing is, until you have measurements within the gaps, your knowledge of your uncertainty within them is practically zero.
“The problems you’ve found are obscure anomalies that don’t affect the overall result.”
To which one might ask: if changing them doesn’t affect anything, then why do it?
Sheri asked what meaning does the average have?
If you take a course in thermodynamics you learn that temperature is an intensive parameter and intensive parameters do not scale. Average temperature is physically meaningless.
It’s like measuring the voltage at all the nodes of a printed circuit board and calculating the average voltage. It’s meaningless.
Average temperature is physically meaningless.
Not entirely. Are you saying there is no average temperature of a house or bath water?
@Ray
” Average temperature is physically meaningless.”
Average home price is also physically meaningless. But I still find that information useful when I’m house shopping.
James: Makes sense. To me, there should be better ways of dealing with temperature variations than averaging.
Brandon: Anomalies may help, but we are still starting with an average global mean and figuring from there. Bottom line, the average is starting point and I’m not sure average is what we need. As you note, the values input are still divergent.
DAV: Excellent question. (To which one might ask: if changing them doesn’t affect anything, then why do it?)
Ray: I am inclined to agree that average temperature has no meaning. Our temperatures can vary by 20 to 30 degrees in one day. So we go from 20 to 60 back to 30. The average is 37 degrees. If I tell someone the average temperature was 37, that says really nothing about the swings in the temperature. Average flattens out all the hill and valleys, and in most cases, that is not a good idea.
I’m glad folks are discussing this.
One confusion people have is this. They think we are creating an average temperature. We don’t. Well, we call it an average, but technically it is not.
What is it?
It’s a prediction. Spatial stats provides predictions of what the temperature would have been at unvisited locations. To get this prediction we use the data from visited locations, plus a bunch of assumptions.
What we want to do is predict what the temperature would have been at location x,y,z and time t, IF it would have been measured perfectly.
Such that if you were god or had a secret stash of perfect data you could come along and check our prediction.
In some cases we actually do have some hold-out data that we can use to validate the prediction. In the US, for example, we can hold out the best data we have (from CRN), construct a prediction, and then test the prediction against this very good (it’s not perfect) data. We can also go buy data (ouch) from some providers of pristine data to check the prediction.
The prediction works pretty simply:
T = C + W
The temperature at a given location and time is a combination of C, the climate at that location and W the weather.
Climate is a function of position (lat, lon, alt, season). It’s found through a simple regression. That regression explains about 93% of the variation. You tell me the lat, lon, alt, and season for an unvisited location and I will predict the temp. Simple. Easy to test. So we don’t exactly infill or extrapolate or interpolate. We have a model of the climate at a location. This goes back to old-school physical geography. Now, we continue to play with adding other variables (distance from coast, geomorphology, etc.) to get a better regression, but 93% is PDG.
The residual is W. The weather is the stuff that is not explained by location.
The weather gets kriged, a fancy word for interpolated.
Next you add C + W and you have a prediction. That prediction says
“If you had secret perfect data about the past, this is our best guess (minimizing the error) of what it would be.” Or, going forward, if you wanted to hold out some current data from a really good source, you could test the prediction that way.
Adjustments? Technically we don’t do adjustments. An adjusted series is nothing more than the fitted values of the regression at the given data points.
In other words, the recorded data (which is suspect, doubtful, and full of error) says X. The model predicted X+e for this location.
You can believe the model and its uncertainty which is actually calculated
You can trust the “raw” data and have an unmeasured over confidence.
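For readers who want to see the shape of the T = C + W idea, here is a rough sketch with fabricated station data. It is not the Berkeley Earth code; ordinary least squares plus a radial-basis-function interpolator stand in for the actual regression and kriging steps.

```python
# Rough sketch, fabricated stations: C = regression on position, W = smoothed residual.
import numpy as np
from sklearn.linear_model import LinearRegression
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
n = 200
lat = rng.uniform(58, 71, n)                  # fake station locations
lon = rng.uniform(4, 31, n)
alt = rng.uniform(0, 1500, n)
temp = 25 - 0.8 * (lat - 58) - 0.0065 * alt + rng.normal(0, 1.5, n)

X = np.column_stack([lat, lon, alt])
climate = LinearRegression().fit(X, temp)     # C: climate ~ position
resid = temp - climate.predict(X)             # W: what position can't explain
weather = RBFInterpolator(X[:, :2], resid, smoothing=1.0)   # stand-in for kriging

new = np.array([[63.4, 10.4, 50.0]])          # an unvisited lat, lon, alt
print(climate.predict(new) + weather(new[:, :2]))   # the C + W prediction
```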
Mosh,
“What we want to do is predict what the temperature would have been at location x,y,z and time t, IF it would have been measured perfectly.”
But that’s the thing isn’t it? Like with the lakes of cold in PA and MD there are undoubtedly similar basins along both sides of the Appalachians down into GA and maybe even rivers of cold air. Nearly a thousand miles of it. It’s caused by topography. I would expect the same with any mountain range.
Would seem your caveat is incomplete.
Sheri,
I don’t understand. Are you saying it’s your understanding that the absolute temperatures are averaged first, then the global anomaly is calculated from there?
What’s the alternative? Pielke Sr. stumps for doing everything in joules instead of just reporting ocean heat content that way. I think the argument has some merit in principle if only to do away with the need for averaging but, well let’s just say that idea is the subject of some lively debate.
Doing GMST anomaly doesn’t make the original data go away. We’ve still got the daily mins and maxes at the daily level. Raw and adjusted.
In thermodynamics average temperature is meaningless because temperature is an intensive parameter. I didn’t say you can’t calculate an average temperature or an average price.
“You can believe the model and its uncertainty which is actually calculated
You can trust the “raw” data and have an unmeasured over confidence.”
The above claim is the most amazing piece of rubbish I’ve read recently. I don’t “believe” your model of the temperature, nor as a consequence do I have to “believe” the raw data (assuming there is some reason to be suspicious of it.) I look at independent reality checks to see if I should trust your assumptions.
* Is your model consistent with min/max recorded values?
* What independent measures can I use for consistency?
* Is your model consistent with physical reality (i.e., does it do physically impossible things like adjust temperature backward in time)
* How consistent are your model’s trends with near surface sat trends?
And so on.
For example, if a temperature model shows that the Great Lakes region is experiencing record warm temperatures while at the same time the Great Lakes region is showing record ice coverage in summer, or I look at a long standing temperature station in Antarctic which doesn’t have a nearby station for thousands of KM, and it’s been adjusted, then the BS detector fires.
Come back to us when you’ve got *independent* verification for why your model should be believed. NOTE: Cross checking the model against different parts of the model for consistency is NOT independent evidence. Such checks are circular reasoning.
Anyone who claims “believe my model because it’s better than not having a model” frankly, can’t be taken seriously. Models are a dime a dozen. Until they are verified against physical data, they should be ignored.
Hey, I’ve got a thought. When I was teaching statistics to radiology residents (an apt illustration of “In the country of the blind the one-eyed man is king”), I told them about “box-plots”, a nice, non-parametric way of showing data spread and outliers. Given the availability of effortless computer graphics, why not use box-plots to present temperature data?
(Google for example figures)
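In that spirit, a minimal example with fabricated daily temperatures: box plots by month show the spread and the outliers directly, with no parametric assumptions.

```python
# Minimal example, fabricated data: monthly box plots of daily temperatures.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
daily = [rng.normal(loc=m, scale=5, size=30)      # ~30 daily values per month
         for m in (-5, -3, 2, 8, 14, 18)]

plt.boxplot(daily)
plt.xticks(range(1, len(months) + 1), months)
plt.ylabel("Daily mean temperature (°C)")
plt.show()
```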
Ray,
If you’re doing thermodynamics problems you probably want to be working in Kelvin, and you might not want seasonal variability removed either. So again, the station data are still available if you need them to do physics.
Brandon: Yes, I thought the global temperatures were averaged, that was the “zero” line, so to speak, and the anomalies calculated from there. If that’s not correct, I have no idea then where the average came from.
I see no reason why we need a global mean temperature. My preference is to look at the changes in each location over time. If I had enough computer, I’d use every reading I could get. Then, graph the values and see if there is a trend at each location. If the trends at the locations are all in an upward direction, then warming is occurring. How do we know if it’s global–we don’t. But we don’t know now. It’s burying the US in snow, but globally it’s supposed to be hotter.
Hills and valleys are important in looking at habitability, etc. If a place has huge temperature swings, living there is more challenging. We really should be keeping those hills and valleys. I know this is available, but it’s not what most climate science uses.
Has anyone here looked at studies on kriging? I’ve read a couple of them and it looked interesting. While I am skeptical of interpolating temperatures over much distance, I’m open to evidence that we have found a way to make accurate fill-ins.
Sheri,
Ok, I understand. It would be a TERRIBLE idea to average the absolute temps first then do the anomaly. It’s the other way around, the anomalies are done by month at the local level (either by station or grid depending on provider … but I’m fuzzy on all the particulars), then the whole mess is rolled up to zonal, hemispheric and global mean anomalies.
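A toy sketch of that order of operations (pandas, with hypothetical column names): anomalies are formed per station and calendar month against a baseline period first, and only then combined upward.

```python
# Toy sketch, hypothetical columns: station, year, month, temp.
import pandas as pd

def station_anomalies(df, base_start=1961, base_end=1990):
    base = (df[(df.year >= base_start) & (df.year <= base_end)]
            .groupby(["station", "month"])["temp"].mean()
            .rename("baseline"))
    out = df.join(base, on=["station", "month"])
    out["anomaly"] = out["temp"] - out["baseline"]
    return out

# Only after this per-station step are anomalies rolled up, e.g.:
# monthly_mean = station_anomalies(df).groupby(["year", "month"])["anomaly"].mean()
```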
That would be information overload for us mere mortals …
Think about that. Does that mean every location must be trending downward before we can say that cooling is occurring?
[mutter mutter mutter]
You and I have had that conversation before. What do you say in the summer when it’s really damn hot? Oh well, that’s just the weather today?
My rule of thumb: if the data are available, they’re being used. Just think about it. It costs money to host them on public servers. That means there’s demand for them. What’s communicated in media is the barest fraction of what climatologists are using. You and I are only just talking temperature here. How often do you see a global mean water vapor content plot? Sea level pressure plot?
Sheri re: kriging, you won’t be happy with Cowtan and Way if you’ve read that one. The interpolations make the Arctic warmer ….
Brandon: No, it’s not the same conversation about weather versus climate. It’s now asking how so many lows still manage to be high. Not just the US, but Europe and other places. Many, many lows should affect the average and anomalies, but they are not doing that. My curious mind says this is not computing and I should find out why this is happening. I don’t say hot proves anything any more than I say cold does, except when the observations are in direct conflict with what I am being told.
Agreed that we rarely see much of the actual science involved, which is a truly sad situation. There’s lots of money in climate science. I think they can afford the servers.
Yes, if all locations in an area are cooling, then that area is cooling. It’s what the temperature is doing. Why would I call it anything else?
I’m going to have to research that anomaly/average thing. It’s not making sense. It’s late; I’ll check it out more tomorrow.
I have read Cowtan and Way. I don’t base my like or dislike on the outcome. I am trying to understand whether the process is valid or not.
Lets see if I can help
“Sheri asked what meaning does the average have?
If you take a course in thermodynamics you learn that temperature is an intensive parameter and intensive parameters do not scale. Average temperature is physically meaningless.
It’s like measuring the voltage at all the nodes of a printed circuit board and calculating the average voltage. It’s meaningless.”
This is wrong.
The wrongness of course traces back to Essex.
In some manifestations the global “averages” are called INDEXES. Why?
Because we merge SST and SAT, so it’s basically not a physical thing. But it doesn’t really matter, because we are not really creating an average.
NEVER FORGET THAT.
Now of course people call it an average, and we say the average is higher, but you can’t really average temperatures.
So what is this thing? It’s a prediction. When I say the average temperature of the earth (say, land) is 10C, that means, operationally, that an estimate of 10C will yield you the lowest prediction error if you select random places to take measurements. Sound odd? Yup, that’s why we call it an average.
And it does have meaning. A very clear operational meaning. When I tell you that the average today is say 1C warmer than the average in 1900, that means
Pick any place you like. Look at the temp today. Then, if you have a temp for that place in 1900, my prediction (1C cooler than today) will minimize the error. Go ahead, try it. The “average” is the best prediction for the temperature at unvisited locations, where you are allowed to guess any location at random.
Go ahead.. you can try to come up with a better prediction.
So, physically averaging intensive parameters doesn’t give you a physical answer. BUT we are not “averaging”; we are using data at known locations to predict data at unvisited sites. Now if you integrate that prediction field you will get a number. Duh. We call that number an average, but operationally it’s not.
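A tiny numerical check of that operational claim, with made-up numbers: among constant guesses for randomly sampled locations, the mean of the field gives the smallest squared prediction error.

```python
# Tiny check, made-up numbers: the mean minimizes squared error at random "visits".
import numpy as np

rng = np.random.default_rng(7)
field = rng.normal(10.0, 5.0, size=100_000)    # pretend: the whole temperature field
visits = rng.choice(field, size=10_000)        # temperatures at random sampled spots

for guess in (8.0, 12.0, field.mean()):
    print(guess, np.mean((visits - guess) ** 2))   # error is smallest near the mean
```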
“But that’s the thing isn’t it? Like with the lakes of cold in PA and MD there are undoubtedly similar basins along both sides of the Appalachians down into GA and maybe even rivers of cold air. Nearly a thousand miles of it. It’s caused by topography. I would expect the same with any mountain range.
Would seem your caveat is incomplete.”
##################################
You would be surprised how few cold drainage areas there are.
Now, where they exist, we will get a local answer where some “climate” is in the residual. However, by comparing our result in the US with a product (PRISM) that does account for cold air drainage, we find that the answer doesn’t change! Go figure.
That said, I now have some new data sets that will allow me to model cold air drainage using a topographical wetness index: basically a 1km worldwide grid that captures areas where you could get cold air drainage. This may give us better local detail, BUT the global answer won’t change. Some local areas may change.
Now I bet you thought you were schooling me on rivers of cold air.
Here is the deal. I’ve been working on the problem for a couple of years.
Sadly, I know before I start that the global answer won’t change, not in any way that matters. CO2 still warms the planet.
Here is a paper by a couple of R buddies. See how much TWI helped:
http://onlinelibrary.wiley.com/doi/10.1002/2013JD020803/epdf
“Come back to us when you’ve got *independent* verification for why your model should be believed. NOTE: Cross checking the model against different parts of the model for consistency is NOT independent evidence. Such checks are circular reasoning.”
One standard way of testing the model is to construct the field with a subset of data and then predict your hold out data. Did that.
Another way to test it is to acquire new data. For example, since we built our first field, 2000 new stations have been recovered from archives.
That means we can test the field against records we never saw. Did that.
Another way to test it is to buy data from people who operate private networks.
You can do that. I will give you the contacts.
Another way to test it is to do joint projects with countries that have tons of data they never shared (China, India, Korea). Working on that.
Another way to test it is to go make new measurements at locations not sampled before. Robert Way has got some cool stuff on this; you’ll have to wait.
Lastly, when I say model I mean statistical model; “estimate” is probably a word you are used to.
“* Is your model consistent with min/max recorded values?
YES, it is derived from these values. We use primarily daily data, min/max.
* What independent measures can I use for consistency?
A) ITSI data we didn’t use can be used for out-of-sample testing
B) Various states run agriculture networks that have out-of-sample data
C) There are data sources in China, India, and Korea that were not used
D) Satellite data
E) Reanalysis data (nowcasts from the weather service)
Been there done that.
* Is your model consistent with physical reality (i.e., does it do physically impossible things like adjust temperature backward in time)
The model doesn’t adjust data. It’s physically realistic.
* How consistent are your model’s trends with near surface sat trends?
It matches the newest, best satellite data.
Understand this is a STATISTICAL MODEL. All global averages are statistical models. You read the word “model” and your knee jerked.
Hope your chin is ok
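For concreteness, a generic sketch of the hold-out test described above. The `fit` and `predict` arguments are placeholders for whatever spatial method is used; this is not the BEST code.

```python
# Generic sketch: withhold stations, fit on the rest, score the held-out ones.
import numpy as np

def holdout_rmse(coords, temps, fit, predict, frac=0.2, seed=0):
    """coords, temps: numpy arrays; fit/predict: placeholder callables."""
    rng = np.random.default_rng(seed)
    n = len(temps)
    test = rng.choice(n, size=int(frac * n), replace=False)
    train = np.setdiff1d(np.arange(n), test)
    model = fit(coords[train], temps[train])
    err = predict(model, coords[test]) - temps[test]
    return float(np.sqrt(np.mean(err ** 2)))    # RMSE on never-seen stations
```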
“One standard way of testing the model is to construct the field with a subset of data and then predict your hold out data. Did that.”
Self referential. E.g., if your model smears data in a consistent way, you’ve demonstrated that the smearing is consistently done. Congratulations.
“The model doesnt adjust data”
What strange meaning have you now assigned to the word ‘adjust’? What model are you referring to? Are you claiming that GISS, etc., do not change older data based on new data? Why do historical records keep changing? This means you are making a nonsense claim, doesn’t it? Or you’re intentionally misleading?
“Understand this is a STATISTICAL MODEL..”
I have nothing against models. On the other hand, I have no reason to believe a model just because someone shows me they created one. Any nitwit can create a model. Hence when you wrote:
“You can believe the model and its uncertainty which is actually calculated
You can trust the “raw” data and have an unmeasured over confidence.”
You wrote complete nonsense. The calculated error is only meaningful if the assumptions are 100% correct. I can create any sort of statistical model that is 99.9999999% certain of a claimed result, so long as you don’t question the assumptions of the model. In other words, you wrote nonsense.
I must rephrase something:
It requires not merely a certain but a huge level of conceit to brag about such predictions while knowing what exactly stands behind it.
Every day one learns something new. It’s not only for the old dogs.
Now for a real-life example of an urban heat island experienced first hand. It is winter, and I drove to my club, which is some 30km away from the city. Right after the last city houses there is snow cover, and at the club location there is some 20cm of it. Going back the same evening, the snow cover had risen to some 40cm, but it stopped right where I had left it, at the first city houses, and it continues to be so. And they say that station siting in urban heat islands does not spoil measurements.
In a country like mine, with ~100 inhabitants per square km, only about 10% of all territory ever being visited by people (as checked through cadastre data), only about 1% of all territory being inhabited clutters (as checked through clutters in GIS), and all stations sitting firmly on that 1% of territory, I say bollocks. You can’t possibly interpolate open areas using only station data sited in settlements, in a country with a heating season longer than 6 months, and say you did a good job. It is just impossible. The minima are completely screwed up.
Consider an analysis such as this one:
https://notalotofpeopleknowthat.wordpress.com/2015/02/04/temperature-adjustments-transform-arctic-climate-history/#more-13039
A link to a new website seems fitting here.
http://climatechangepredictions.org
Mosher,
I like your model. It’s kinda cool (no pun intended). I understand how it’s making predictions/projections for areas where there are no instruments. What I also understand is that with predictions come errors and it seems that the errors with this type of prediction can be in full degrees versus the real physical temperature. When a year is reported as the hottest, coldest, whatever by tenths or even hundredths of degree, that’s when you lose me. It seems improbable to move from less precision to more precision within the error bounds of modelled temperature anomalies.
It goes without saying that imputation/interpolation of missing values (the grid boxes without observations) will add uncertainty.
Whether interpolation would be appropriate depends on the number of missing cases, the mechanism of missingness, the interpolation method, and so on. Interpolation is, in general, a weighted average of nearby values or of all data values.
Deleting missing cases is not innocuous. For example, in certain situations higher-income survey participants are more likely to report their income than lower-income ones. Removing the missing cases would probably result in a mean income estimate with an upward bias.
Missing data imputation has been studied intensively in statistics. Search for keywords “imputation” “interpolation” and “missing value.” One might also find some fun reads on “image interpolation.”
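A minimal example of “interpolation as weighted average”: inverse-distance weighting to fill one empty grid box from three nearby stations, with made-up coordinates and readings.

```python
# Minimal example, made-up numbers: inverse-distance weighting for one grid box.
import numpy as np

stations = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # made-up locations
readings = np.array([2.0, 4.0, 3.0])
target = np.array([0.4, 0.4])                               # the empty grid box

d = np.linalg.norm(stations - target, axis=1)
w = 1.0 / d**2                                              # nearer stations count more
print(np.sum(w * readings) / np.sum(w))                     # the infilled value
```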
If you are looking for easy-to-understand answers to the following questions, visit this site:
http://www.ncdc.noaa.gov/monitoring-references/faq/anomalies.php
* What is a temperature anomaly?
* Why use temperature anomalies (departure from average) and not absolute temperature measurements?
* How is the average global temperature anomaly time-series calculated?
(Weighted average, really.)
Ray bouncing off of Sheri: you guys really got me thinking about the meaning/value of average temperature.
In the US Pacific Northwest, there’s typically thick cloud cover throughout the winter, with little temperature variation on a typical day. Since heat flow comes from temperature differences, one could assume that there’s not much heat exchange taking place, at least at ground level. But some days might see a short break in the clouds, and temperatures will quickly spike (up during the day, or down during the night). It’s not hard to think that the bulk of the heat exchange for the day could happen during one of these spikes. The calculated average temperature for that day would tend to obscure a perhaps significant heat exchange.
The other thought I had from reading about the various corrections applied to the temperature record is that these corrections are applied in a manner that is anything but ‘blind’. This seems like a virtual invitation for ‘researcher bias’ to slip in, unnoticed even by the researcher. Even one of the researchers, in a video response defending the corrections, seemed oblivious to the distinct pattern of cooling the more distant past and warming the recent past that seemed to jump out of his own graph, undermining his argument significantly.
If there’s an obvious pattern to the corrections, one that biases the reported data in favor of the “consensus”, the need for some sort of explanation/justification for the pattern is also obvious, at least to those of us outside the consensus.
Sheri,
My mistake then. Appreciate, however, my generally dim view of news media headlines which play up single extreme warm-weather events as fingerprints of AGW. Also note that snow is not solely a function of temperature. It’s quite cold in Antarctica, however most of it is considered a desert on the basis of low annual precipitation rates.
I agree that the way you’ve stated it is counterintuitive, and that, yes, you should dig into it.
This all hinges on what observations you look at. My understanding of how weather works is that if there’s anomalous cold in one place it’s typically offset somewhere else with anomalous warm. Now if that anomalously warm place happens to be sparsely populated …
My mistake for assuming a global frame of reference when you were writing from a local one. I was challenging what I thought was your notion that all areas of the planet must be warming for the planet to be considered warming.
We’ll see. In the US, for the next two years, that might depend on Obama making sure that all the pens in the Oval Office are perpetually out of ink. My point though is that there are a lot of data already on public servers which we the collective public don’t often talk about.
Well ok, but here’s the thing, even without C&W’s kriging methods the argument still goes that the poles are warming at a faster rate than the mid latitudes. And yet, the poles are still “cold” in terms of how we temperate-zone dwellers think of it — quite obviously the Arctic hasn’t warmed to the point where it’s not capable of sending a large dollop of cool dry air south to meet warm moist air coming up from the sub-tropics resulting in one hell of a blizzard.
Brandon: I have lived all of my life where there are four seasons and snow. I understand the lack of correlation between snow amounts and temperature. Perhaps I should have added, for those living in warmer areas, that the temperatures in the areas I am referring to with excess snow are well below the average temperature for this time of year, some 10 to 20 degrees below. Then there’s the very foolish statement made by a climate scientist that snow would be a thing of the past. Sure, he was one guy with a degree and an audience, but he’s an “expert” and experts are supposed to know these things. He failed miserably, both in his predictions and in getting people to believe in global warming.
If that anomalous warm place happens to be sparsely populated? By people or by thermometers? No people, no news. No thermometers, no real data. See my point? (I actually go by the map that shows where NASA has thermometers, which are sparse at the poles and in Russia. Actually, most are in Europe and North America, which leads to question again why the bitter cold there is not affecting the average.)
I understand the poles are still cold and that the use of the term “warming” is problematic. Maybe certain media types and others should not have spoken about melting Arctic ice so much. People get confused and think you need warmth to melt ice. Consequently, no ice melt, no warming to many people. There is massive miscommunication and outright lies and exaggerations. I know you understand that. However, until that’s corrected, people will remain skeptical and downright hostile. People do not like being lied to (witness the Brian Williams affair). Honesty would go a long way to aid global warming science.
(Very true on the “hell of a blizzard”!)
Milton,
Indeed. I think it’s fair to say that the bulk of surface station records used to build global anomaly time series are daily min/max. And many of those min/max data were derived from station-keepers who recorded the min/max by manually reading whatever the thermometer said when the schedule called for doing so. If there was a schedule. More modern tech helped out with this, first it was thermometers which mechanically recorded the min and max when they happened. Now we’ve got fully automated equipment that could in theory record temperature every second, but in the interest of processing time and storage space we settle for hourly resolution. [1]
But for the sake of argument let’s assume a station which used an automated min/max apparatus over its entire record, and recorded those values daily. We’ll leave out the complication of maxes and mins sometimes happening before GMT midnight and sometimes after. At the very least, for any given month we know the temperature never fell below the lowest minimum recorded during the month. And we’ve got a distribution of minimum daily temperatures to work with as well, on the order of 30 per month on average, which is a fairly healthy sample size from which we can do all sorts of interesting things statistically. I argue that we could derive a pretty robust analysis of what temperature is doing at that location over years and decades on the sole basis of knowing the minimum amount of energy at that location.
Increasing GHGs reduces the rate of heat loss. Where else better to look for that than in the trend of daily, monthly and annual minima?
—————
[1] Such equipment could compute the mean and other descriptive stats as it goes and log those daily or hourly etc. I wouldn’t be at all surprised if that’s exactly what’s being done.
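A minimal sketch of the suggestion above about trends in the minima, using a fabricated series: reduce the daily minima to annual minima and fit a simple trend.

```python
# Minimal sketch, fabricated series: trend in the annual minimum temperature.
import numpy as np

rng = np.random.default_rng(11)
years = np.arange(1950, 2015)
annual_min = -20 + 0.03 * (years - 1950) + rng.normal(0, 2.0, years.size)

slope, intercept = np.polyfit(years, annual_min, 1)
print(10 * slope, "degrees per decade in the annual minimum")
```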
Sheri,
So I believe you’re talking about Hansen, are you not? No, that’s not right: who was the scientist and what did he say?
I do see your point because that’s exactly what I was driving at. Since your main argument here really is that the news media screws the pooch when covering science in general, no contest. That there are political motives behind that, still no contest.
Force of habit for me these days is to go out of my way to mention the increasing trend in Antarctic sea ice coverage whenever ice is subject. See, I just did it. It’s automatic like.
Color me somewhat confused then. Ultimately more warmth on balance = less ice on balance. We know this already from glaciation cycles, but those things typically evolve over thousands of years, and our understanding of what happens on century and decadal time periods is quite limited by the very fact that the instrumental record extends only to the century scale.
So, the Antarctic caught “everyone” by surprise. Which is a good thing in science, a bad thing in the court of public opinion when the bulk of the message has been ice melting, not freezing.
Your question is exactly why GMST anomaly time series MUST do things like area-weighting the observations, doing the anomaly calcs at the local level first. A simple arithmetic average of absolute temps would be biased toward regions with the highest density of observations per unit area.
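A small illustration of the area-weighting point, with made-up zonal anomalies: grid cells shrink toward the poles, so a simple cos(latitude) weight changes the answer.

```python
# Small illustration, made-up numbers: naive vs. area-weighted mean anomaly.
import numpy as np

lats = np.array([75.0, 45.0, 15.0, -15.0, -45.0, -75.0])   # zonal band centers
anoms = np.array([2.0, 0.8, 0.3, 0.2, 0.5, 1.5])           # fabricated anomalies

w = np.cos(np.radians(lats))                 # band area shrinks toward the poles
print(anoms.mean())                          # naive average over-weights the poles
print(np.average(anoms, weights=w))          # area-weighted average
```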
I do understand it, and I don’t like it. Something I wrestle with is knowing how many rocks to reserve for the distortions, nonsense and lies from my own side. Part of the challenge is first figuring out whether it’s a blatant lie or a simple misunderstanding. And I always must consider that everyone is wrong about a particular issue, especially myself. That can’t always be done on a factual basis, so a lot of times I’m reduced to deconstructing the logic of the argument to see if it holds up. Which sucks. A perfectly logical argument written in a neutral, non-rhetorical “tone” can still be dead nuts wrong, and intentionally so.
I’m not going to cop to the implication that climatologists and AGW activists are more dishonest than not. I don’t know how one would even begin going about objectively measuring such a thing. It’s certainly true that I believe the distortions are more on the side of my opposition, but I’m very firmly cognizant that I express an opinion based on belief here. Ultimately my opinion is that it’s not my side’s job to cater to everyone’s perceptions of how the debate ought to be prosecuted. Clearly, it’s everyone’s moral duty to be honest and truthful in matters of public policy. But the argument of “if only everyone were more honest, we’d listen” rings rather hollow for me, not least because of the broad-sweeping nature of the argument — which is a rhetorical tactic that I specifically see as dishonest.
Nudge. Nudge.
Brandon: David Viner East Anglia (of course)
http://www.independent.co.uk/environment/snowfalls-are-now-just-a-thing-of-the-past-724017.html
Automatic responses! Scary, isn’t it?! (I do it too–sign of spending a great deal of time on one subject)
“must do things like area-weighting”, etc: No, they need to admit they lack real data and go gather some. Since they are still stuck in the “alarmist” stage, they can’t see that. If we don’t use circular reasoning and assume what we are trying to prove, we realize we just don’t have enough data. Gridding, weighting and in-filling are signs one lacks sufficient data. I know it’s expensive–maybe they could use the money the scientists blow on conferences and stuff and actually put in thermometers. Then gather data and build models that might actually work.
Activists are pretty much by definition willing to lie, and they do lie. It’s because the WIN is all they care about. You can check the history of activists and you should see this. Sure, some of those lower down in the movement are true believers, but many at the top are not. The same thing happens in politics and religion, unfortunately. The scientists who are media huggers (those who love to see their faces on TV) I doubt are sincere. Others I don’t know.
I cannot excuse bad behaviour no matter who does it or why. Lying to the public alienates them. You can choose not to believe that, but it won’t change it. People do not like finding out they have been sold a political goal cloaked in science and lies.
Things like this article just serve to further the belief that global warming is nothing but a scam to kill capitalism.
http://news.investors.com/ibd-editorials/021015-738779-climate-change-scare-tool-to-destroy-capitalism.htm
Sheri, “Activists are pretty much by definition willing to lie and do lie. It’s because the WIN is all they care about. You can check the history of activists and you should see this”
Well said, but I wonder whether they are actually liars. To lie you have to know what the truth is, and I don’t think many of the warmists actually know what the truth is or how one should do science with integrity. So let’s term what they say and do “counter-factual”, or CF for short.
Brandon, I’m not calling you a liar!
Bob: As I noted, some probably believe and some are so invested they don’t care about the truth anymore. I don’t think you can lie without knowing the truth, so CF sounds good.
Sheri,
But of course.
“Children just aren’t going to know what snow is,” he said.
My first reaction would not be fear, but welcome. That’s great, less sidewalk and roadway salting. To the extent I have concern about stuff melting, it’s glaciers. Maybe I should worry about that, but it’s difficult … I’ll be long dead by the time whole meters or tens of meters of sea rise are supposed to happen. My nephews’ kids though … hard to say what that would mean for them.
Can’t do that with the data that are already in the books, Sheri. We’ve had that conversation before. You note the expense of gathering more observations, which I appreciate. I also appreciate that money is being spent to do exactly that.
I don’t have to check up on it, I’ve seen it happen. I’ve been at odds with the fringe of the fringe for nearly my entire adult life, not so much for lying, but more for an unreasonably extreme point of view coupled with much wishful thinking. Wasn’t until later that I started picking up on the dishonesty. I know someone has to be lying though, because so many different opposing things are being said. Whenever I’m given to taking out the broad brush, I typically apply it to everyone.
Probably true. I have the same thoughts about oil industry execs. [wink]
I’m not objecting to your principles on the matter. It’s the unspoken subtext that your end of the debate is squeaky clean. That’s the fundamental problem I have with all partisan politics … it makes hypocrites out of everyone. So my message to you is, you can want the lies to stop but as a student of human behavior I can all but assure you that it won’t. For damn sure I can’t prevent it from happening, and sure as heck don’t want radicals’ views and behaviors foisted on me to defend just because I happen to broadly agree with the scientific basis for their position.
At a news conference last week in Brussels, Christiana Figueres, executive secretary of U.N.’s Framework Convention on Climate Change, admitted that the goal of environmental activists is not to save the world from ecological calamity but to destroy capitalism.
“This is the first time in the history of mankind that we are setting ourselves the task of intentionally, within a defined period of time, to change the economic development model that has been reigning for at least 150 years, since the Industrial Revolution,” she said.
Which is pretty much where I stopped reading. The selected quote does not say what the editorializing paragraph above it says. Oldest trick in the book, and a sure sign that my time would be better spent reading something else.
What I know for sure is that I am not out to kill capitalism. I don’t personally know anyone on my side of the debate who is advocating that. I would not support doing it. And I’m really beyond sick and tired of being called an alarmist over at WUWT by people who are constantly fretting about UN Agenda 21 and “warmunists” who want to send us all back to the Stone Age. I think it’s bullcrap of the highest order.
Bob, I’ll cop to true believer who knows he could very well be wrong but really doesn’t think so.
Brandon: Yes, I know they can’t do that with the data already on the books. Doesn’t matter–they need to get this right before trying to fix something that may very well not be a problem.
Oil executives actually couldn’t care less about climate change. They worry more about political actions taken before the data are there. Oh, and they love those wind and solar subsidies they lap up (wink, wink).
You are absolutely correct I cannot stop the lying. However, I can refuse to accept it and can dismiss those who lie in order to obtain a particular outcome. The fact that many people lie does not excuse it. (I have never said that both sides don’t lie. It’s just that the global warming people seem to be the loudest at this point.)
It’s kind of hard to argue that calling the Industrial Revolution bad isn’t anti-capitalism, but maybe. Too bad you stopped reading. Okay, there was nothing more, but stopping reading is just not cool. 🙂 I was simply pointing out that things environmentalists say can be problematic.
As for WUWT beating you up, don’t go there. I don’t. And I agree that the Agenda 21 stuff is really annoying, too. However, there’s really no way to avoid the stone age except nuclear, which I know you are okay with. However, wanting nuclear may fall under the same umbrella as not wanting lying: nice in theory, very tough in reality. (You know both sides of this issue are very, very set in their ways. I am often riding skeptics about behaving the same way they accuse warmists of behaving. My feeling is that if you’re not ready to be beaten up by both sides, it’s best to stay out of the discussion. Or find blogs that don’t allow the banter.)
(I am signing off for today—too much going on to keep commenting. Will return tomorrow or Friday. Meantime, have fun!)
Sheri,
What does getting it right mean exactly? How long is that going to take? These are rhetorical questions. NO public policy is ever done with absolute certainty that it’s the correct course of action.
I know. I have the exact opposite opinion. But it’s a zero-sum argument done at that level of generality so I just don’t go there. When I do, I try to keep it specific, speaking of …
… I reserve the right to ignore liars. 🙂
You know, they were great to me at first over there and I had what I thought were some very interesting discussions. Honeymoon lasted, oh, a week. Maybe two. And don’t think for a second I don’t give as good as I get.
More than ok with. I think it’s essential. Not to stay out of the stone age though. That’s hyperbole plain and simple. The Amish are a good model, but the horse poo could be a problem.
I dropped the bomb over at ATTP’s today and not for the first time. So far no reaction. It’s been a while since I’ve gotten any more than token resistance by force of habit, and a few times it’s been discussed in earnest after I brought it up. I don’t know anyone on my particular blog circuit who “stumps” for it like I do though. It’s one of those issues I think I just have to be patient about.
The intermediate step seems somewhat broken to me. Let me try to work through it…
Situation: we have a number of data collection points, unevenly distributed throughout the area under consideration. In addition, these points are discrete, whereas the “temperature in a given area” is continuous.
So what are we trying to calculate the average of?
We can trivially calculate the average of the data collection points. But that doesn’t necessarily tell us much about the temperature at any single point, let alone the temperature at a point where there is no collector.
If we’re trying to model the continuous temperature curve (for given points in time), then it’s not illegitimate to interpolate between the available points, subject to some pretty brutal assumptions about data consistency between points.
If we then average the results from this model, then all we’re really doing is generating a weighted average of the existing data points. We have no more information about non-measured points than when we started. All the model gives us is an algorithm to generate the weights. And perhaps a way of being explicit about where we have higher or lower data coverage.
I’m not claiming this is illegitimate, unless we think the interpolation is adding data. It’s not; we have exactly the same amount of data as we started with, with whatever limitations (and errors) existed in that data.
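That claim is easy to check numerically. In a one-dimensional case with made-up stations, the average of a linearly interpolated field is (up to discretization error) exactly a weighted average of the original readings, with the weights set by station spacing:

```python
# Quick check, made-up stations: averaging the infilled field = weighted average of points.
import numpy as np

x = np.array([0.0, 1.0, 4.0, 10.0])          # unevenly spaced "stations"
y = np.array([3.0, 5.0, 2.0, 6.0])

grid = np.linspace(0, 10, 10001)
field_mean = np.interp(grid, x, y).mean()    # average of the infilled field

w = np.array([0.5, 2.0, 4.5, 3.0]) / 10.0    # half-span around each station / total span
print(field_mean, np.sum(w * y))             # nearly identical
```

The interpolation supplies only the weights; the information still comes entirely from the original stations.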
Let me try this again …
So did Joseph Schumpeter regarding aggregated accounts in national economies. No matter the context, this same problem eventually appears.
Briggs,
When physicists measure fundamental constants they are, in effect, trying to predict the true value of such, aren’t they? A long time ago a couple of sociologists, Fischoff being one, showed that successive published values of some fundamental constants were 42 standard deviations away from earlier values. Fischoff concluded that physicists were not very good at quantifying biases. But another way to state this is just as you have for models in general–the uncertainty contributed by model parameters is a poor guide to the uncertainty of prediction. And this ought to provide a cautionary story about making predictions, but never does.
Do they use the humidity when assessing the temperature change?
Or do they just smear the temperature across the 1km grid?