April 8, 2008 | 120 Comments

Why multiple climate model agreement is not that exciting

There are several global climate models (GCMs) produced by many different groups. There are a half dozen from the USA, some from the UK Met Office, a well known one from Australia, and so on. GCMs are a truly global effort. These GCMs are of course referenced by the IPCC, and each version is known to the creators of the other versions.

Much is made of the fact that these various GCMs show rough agreement with each other. People have the sense that, since so many “different” GCMs agree, we should have more confidence that what they say is true. Today I will discuss why this view is false. This is not an easy subject, so we will take it slowly.

Suppose first that you and I want to predict tomorrow’s high temperature in Central Park in New York City (this example naturally works for anything we want to predict, from stock prices to the number of people who will vote for a certain USA presidential candidate). I have a weather model called MMatt. I run this model on my computer and it predicts 66 degrees F. I then give you this model so that you can run it on your computer, but you are vain and rename the model to MMe. You make the change, run the model, and announce that MMe predicts 66 degrees F.

Are we now more confident that tomorrow’s high temperature will be 66 because two different models predicted that number?

Obviously not.

The reason is that changing the name does not change the model. Simply running the model twice, or a dozen, or a hundred times, does not give us any more evidence than running it just once. We reach the same conclusion if, instead of predicting tomorrow’s high temperature, we use GCMs to predict next year’s global mean temperature: no matter how many times we run the model, or in how many different places in the world we run it, we are no more confident of the final prediction than if we had run the model only once.

So Point One of why multiple GCMs agreeing is not that exciting is this: if all the different GCMs are really the same model, each just bearing a different name, then we have not gained new information by running the models many times. And we might suspect that somebody who keeps telling us that “all the models agree,” to imply there is greater certainty, either does not understand this simple point or has ulterior motives.

Are all the many GCMs touted by the IPCC the same except for name? No. Since they are not, we might hope to gain much new information from examining all of them. Unfortunately, they are not, and cannot be, that different either. We cannot here go into the details of each component of each model (books are written on these subjects), but we can draw some broad conclusions.

The atmosphere, like the ocean, is a fluid and it flows like one. The fundamental equations of motion that govern this flow are known. They cannot differ from model to model; or, to state this positively, they will be the same in each model. On paper, anyway, because those equations have to be approximated in a computer, and there is no universal agreement, nor is there a proof, of the best way to do this. So the manner in which each GCM implements this approximation might be different, and these differences might cause the outputs to differ (though this is not guaranteed).
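
A toy sketch of the point, in Python (this is obviously not a GCM; I use the logistic equation to stand in for the real equations of motion, and every number in it is invented): the same equation, approximated by two different but perfectly respectable numerical schemes, gives two different answers.

```python
# Toy illustration, not a GCM: the *same* equation, dx/dt = r*x*(1 - x),
# approximated two different ways on a computer. The approximations can
# disagree even though the underlying "physics" is identical.
r, x0, T = 2.5, 0.1, 4.0   # invented growth rate, initial state, run length

def f(x):
    return r * x * (1.0 - x)

def euler(dt):
    """Forward Euler approximation."""
    x, t = x0, 0.0
    while t < T:
        x += dt * f(x)
        t += dt
    return x

def rk4(dt):
    """Classical fourth-order Runge-Kutta approximation."""
    x, t = x0, 0.0
    while t < T:
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += dt
    return x

print(euler(0.5), rk4(0.5))   # same equation, two approximations, two answers
```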

The equations describing the physics of a photon of sunlight interacting with our atmosphere are also known, but these interactions happen on a scale too small to model, so the effects of sunlight must be parameterized, which is a semi-statistical semi-physical guess of how the small scale effects accumulate to the large scale used in GCMs. Parameterization schemes can differ from model to model and these differences almost certainly will cause the outputs to differ.
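
To give a flavor of what a parameterization is, here is a deliberately crude, made-up example: the unresolvable sub-grid effect of clouds on sunlight at a grid cell is replaced by a simple formula in grid-scale quantities with tunable constants. The function absorbed_solar and its constants alpha and beta are inventions of mine for illustration, not anything lifted from an actual GCM.

```python
# Hypothetical, heavily simplified "parameterization": the true sub-grid
# effect of clouds on sunlight is replaced by a formula in grid-scale
# quantities with tunable constants (alpha, beta). Two modeling groups can
# legitimately pick different constants, or a different formula altogether,
# and their models will then disagree.

def absorbed_solar(solar_in, cloud_fraction, alpha=0.6, beta=1.0):
    """Toy grid-cell shortwave absorbed at the surface (W/m^2)."""
    # alpha: assumed strength of the cloud albedo effect
    # beta: assumed shape of the dependence on cloud fraction
    return solar_in * (1.0 - alpha * cloud_fraction ** beta)

# Same inputs, two plausible parameter choices, two different answers:
print(absorbed_solar(340.0, 0.5, alpha=0.6, beta=1.0))   # one group's choice
print(absorbed_solar(340.0, 0.5, alpha=0.7, beta=1.5))   # another group's choice
```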

And so on for the other components of the models. Already, then, it begins to look like there might be a lot of different information available from the many GCMs, so we would be right to make something of the cases where these models agree. Not quite.

The groups that build the GCMs do not work independently of one another (nor should they). They read and write for the same journals, attend the same conferences, and are familiar with each other’s work. In fact, many of the components used in the different GCMs are the same, even exactly the same, in more than one model. The same person or persons may be responsible, through some line of research, for a particular parameterization used in all the models. Computer code is shared. Thus, while there are some reasons for differing output (and we haven’t covered all of them yet), there are many more reasons that the output should agree.

Results from different GCMs are thus not independent, so our enthusiasm generated because they all roughly agree should at least be tempered, until we understand how dependent the models are.
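
One way to see the cost of that dependence is a standard statistical fact: averaging n estimates buys much more confidence when the errors of those estimates are independent than when they are correlated. The correlation below is invented purely to show the arithmetic; it is not an estimate for any real set of GCMs.

```python
# If n model outputs each have error variance sigma^2, and every pair of
# errors has correlation rho, then the variance of their average is
#     sigma^2 * (1 + (n - 1) * rho) / n,
# which shrinks like 1/n only when rho = 0 (truly independent models).

def variance_of_mean(sigma2, n, rho):
    return sigma2 * (1.0 + (n - 1) * rho) / n

sigma2, n = 1.0, 8
print(variance_of_mean(sigma2, n, rho=0.0))   # 0.125: eight independent models
print(variance_of_mean(sigma2, n, rho=0.9))   # 0.9125: eight nearly identical models
```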

This next part is tricky, so stay with me. The models differ in more ways than just the physical representations previously noted. They also differ in strictly computational ways and through different hypotheses of how, for example, CO2 should be treated. Some models use a coarse grid-point representation of the earth and others use a finer grid: the first method generally attempts to do better with the physics but sacrifices resolution; the second attempts to provide a finer look at the world, while typically sacrificing accuracy in other parts of the model. While the positive feedback in temperature caused by increasing CO2 is the same in spirit for all models, the exact way it is implemented in each can differ.

Now, each climate model, as a result of the many approximations that must be made, has, if you like, hundreds (even thousands) of knobs that can be dialed to and fro. Each twist of a dial produces a difference in the output, so tweaking these dials is a necessary part of the model-building process. Much time is spent tuning and tweaking the models so that they can, as closely as possible, reproduce the past, already observed climate. Thus, the fact that all the GCMs can roughly represent the past climate is again not as interesting as it first seemed. They had better, or nobody would seriously consider the model a contender.

Reproducing past data is a necessary but not sufficient condition for the models being able to predict future data. It is also not at all clear how these tweakings affect the accuracy of predictions of new data, meaning data that was not used in any way to build the models: that is, future data. Predicting future data has several components.
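
Here is a caricature of the danger, with a nine-knob curve standing in for a climate model and made-up numbers standing in for the climate record: tune the knobs to the past and the fit looks splendid; ask the same tuned model about genuinely new data and it can wander badly.

```python
import numpy as np

# Toy illustration, not a GCM: a model with enough adjustable knobs can
# always be tuned to reproduce the past, yet still predict poorly.
rng = np.random.default_rng(1)
years = np.arange(30, dtype=float)
past = 0.01 * years + 0.2 * rng.standard_normal(30)   # pretend observed series

# "Tune" a nine-knob model to the past: the in-sample fit looks great...
knobs = np.polyfit(years / 30.0, past, deg=9)
in_sample_error = np.abs(np.polyval(knobs, years / 30.0) - past).mean()

# ...but on genuinely new data (the next decade) it can wander badly.
future_years = np.arange(30, 40, dtype=float)
future = 0.01 * future_years + 0.2 * rng.standard_normal(10)
out_sample_error = np.abs(np.polyval(knobs, future_years / 30.0) - future).mean()

print(in_sample_error, out_sample_error)   # small versus (typically) much larger
```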

It might be that one of the models, say GCM1, is the best of the bunch in the sense that it matches future data most closely. If this is always the case, if GCM1 is always closest (using some proper measure of skill), then the other models are not as good; they are wrong in some way, and they should be ignored when making predictions. The fact that they come close to GCM1 should not give us more reason to believe the predictions made by GCM1; the other models are not providing new information in this case. This argument, which is admittedly subtle, also holds if a certain group of GCMs is always better than the remainder of models: only that close group can be considered independent evidence.

Even if you don’t follow—or believe—that argument, there is also the problem of how to quantify the certainty of the GCM predictions. I often see pictures like this:
[Figure: GCM predictions]
Each horizontal line represents the output of a GCM, say predicting next year’s average global temperature. It is often thought that the spread of the outputs can be used to describe a probability distribution over the possible future temperatures. The probability distribution is the black curve drawn over the predictions, and neatly captures the range of possibilities. This particular picture looks to say that there is about a 90% chance that the temperature will be between 10 and 14 degrees. It is at this point that people fool themselves, probably because the uncertainty in the forecast has become prettily quantified by some sophisticated statistical routines. But the probability estimate is just plain wrong.
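
The recipe behind pictures like this amounts to something like the following sketch, where every number is invented for illustration and is not any model's actual output:

```python
import statistics

# Treat the spread of model outputs as if it were a probability distribution
# for next year's temperature (the "black curve").
outputs = [10.8, 11.3, 11.9, 12.0, 12.1, 12.4, 12.8, 13.5]   # eight made-up GCM outputs
print(statistics.mean(outputs), statistics.stdev(outputs))   # center and width of the curve

# The trap: if every model said exactly 12, the spread would be zero, and this
# recipe would announce a perfectly certain forecast -- which nobody believes.
print(statistics.stdev([12.0] * 8))   # 0.0, i.e. (false) 100% certainty
```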

How do I know this? Suppose that each of the eight GCMs predicted that the temperature will be 12 degrees. Would we then say, would anybody say, that we are now 100% certain in the prediction?

Again, obviously not. Nobody would believe that if all GCMs agreed exactly (or nearly so) that we would be 100% certain of the outcome. Why? Because everybody knows that these models are not perfect.

The exact same situation was met by meteorologists when they tried this trick with weather forecasts (this is called ensemble forecasting). They found two things. First, the probability forecasts made by this averaging process were far too sure—the probabilities, like our black curve, were too tight and had to be made much wider. Second, the averages were usually biased—meaning that the individual forecasts should all be shifted upwards or downwards by some amount.
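
In its simplest form, the kind of correction the forecasters ended up making looks something like this sketch: estimate the bias and a spread-inflation factor from past forecast/observation pairs, then apply both to new forecasts. Every number here is invented, and real calibration methods are more elaborate than this.

```python
import statistics

# Past forecast/observation pairs (invented): the ensemble ran consistently warm.
past_ensemble_means = [12.6, 12.9, 12.4, 13.1, 12.7]
past_observations   = [12.1, 12.3, 12.0, 12.5, 12.2]

bias = statistics.mean(m - o for m, o in zip(past_ensemble_means, past_observations))
inflation = 1.8   # assumed: the raw ensemble spread was found to be too narrow

# A new ensemble forecast (also invented), before and after correction.
new_ensemble = [11.9, 12.2, 12.5, 12.6, 12.8, 13.0]
raw_mean = statistics.mean(new_ensemble)
raw_spread = statistics.stdev(new_ensemble)

calibrated_mean = raw_mean - bias            # shift the forecasts down
calibrated_spread = inflation * raw_spread   # admit more uncertainty
print(calibrated_mean, calibrated_spread)
```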

This should also be true for GCMs, but the fact has not yet been widely recognized. The amount of certainty we have in future predictions should be less, but we also have to consider the bias. Right now, all GCMs are predicting warmer temperatures than are actually occurring. That means the GCMs are wrong, or biased, or both. The GCM forecasts should be shifted lower, and our certainty in their predictions should be decreased.

All of this implies that we should take the agreement of GCMs far less seriously than is often supposed. And if anything, the fact that the GCMs routinely over-predict is positive evidence of something: that some of the suppositions of the models are wrong.

April 5, 2008 | 11 Comments

Spanish Expedition

I have returned from Madrid, where the conference went moderately well. My part was acceptable, but I could have done a better job, which I’ll explain in a moment.

Iberia Airlines is reasonable, but the seats in steerage were even smaller than I thought. On the way there, I sat next to a lady whose head kept lolling over onto me as she slept. The trip back was better, because I was able to commandeer two seats. Plus, there was a large, boisterous group of young Portuguese men who apparently had never been to New York City before. They were in high spirits for most of the trip, which made the journey seem shorter. About an hour before landing they started to practice some English phrases which they thought would be useful for picking up American women: “Would you go out with me?”, “I like you”, and “You are a fucking sweetheart.”

My talk was simultaneously translated into Spanish, and I wish I had been more coherent and had spoken more slowly. The translator told me afterwards that I talked “rather fast.” I know I left a lot of people wondering.

The audience was mostly scientists (of all kinds) and journalists. My subject was rather technical and new, and while I do think it is a useful approach, it is not the best talk to present to non-specialists. My biggest fault was my failure to recognize and speak about the evidence that others found convincing. I could have offered a more reasonable comparison if I had done so.

I’ll write about these topics in more depth later, but briefly: people place great weight on the fact that many different climate models agree closely in simulating past observations. There are two main, and very simple, problems with this evidence, which I could have, at the time, done a better job of pointing out. For example, I could have asked this question: why are there any differences between climate models? The point being that eight climate models agreeing is not eight independent pieces of evidence. All of these models, for instance, use the same equations of motion. We should be surprised that there are any differences between them.

The second problem I did point out, but I do not think I was convincing. So far, climate models over-predict independent data: that is, they all forecast higher temperatures than are actually observed. This is for data that was not used to fit the models. This means, this can only mean, that the climate models are wrong. They might not be very wrong, but they are wrong just the same. So we should be asking: why are they wrong?

There was a press conference, conducted in Spanish. I can read Spanish much better than I can hear it, which is a fault I should work harder to correct, but it meant that I could not follow most of the comments or questions well. I was the critical representative, and a Professor Moreno was my foil. The most pertinent question to me was (something like) “Do I think it is time for new laws to be passed to combat global warming?” I said no. Professor Moreno vehemently disagreed, incorrectly using as an example the unfortunate heat wave in Spain that was responsible for a large number of deaths. Incorrect, because it is impossible to say that this particular heat wave was caused by humans (in the form of anthropogenic global warming). But the press there, like here (like everywhere), enjoyed the conflict between us, so this is what was reported.

Here, for the sake of vanity, are some links (in Spanish) for the news coverage. We were also on the Spanish national television news on the first night of the conference, but I didn’t see it because we were out. Some of these links may, of course, expire.

  1. ¿Existe el cambio climático?
  2. Estadístico de EEUU rebaja la fiabilidad de las predicciones del IPCC contra la opinión general
  3. Un estadístico americano pone en duda la veracidad del cambio climático
  4. Un experto americano duda de las consecuencias del cambio climático
  5. Evidencias apabullantes
  6. Un debate sobre cambio climático termina a gritos en Madrid

Madrid itself was wonderful, and my hosts Francisco García Novo and Antonio Cembrero were absolute gentlemen, and I met many lovely people. I was introduced to several excellent restaurants and cervecerías. The food was better than I can write about—I nearly wept at the Museo del Jamón. I felt thoroughly spoiled. Dr Novo introduced me to La Grita, a subtle sherry that is a perfect foil for olives. I managed to find some in the duty-free shop, and I recommend that if you see some, you snatch it up.

Come back over the next few days. By then, I hope to have written something on the agreement of climate models.

March 31, 2008 | 7 Comments

Tall men in planes

I am off to Spain today, for the conference, to present my unfinished, and unfinishable, talk. Why unfinishable? I am asking people to supply estimates for certain probabilities (see the previous post), on which there will never be agreement, nor will these estimates cease changing through time. I am somewhat disheartened by this, and would like to say something more concrete, but I am committed. So. It’s eight hours there and back, crammed into a seat made for, let us say, those of a more diminutive stature than I. There will be no more postings until Saturday, when I return, which is why I leave you with this classic column I wrote several years ago, but which is just as relevant today.
[Image: Burden of the very tall]

Lamentations of the Very Tall

An alternate title of this article could have been, “Short People Rejoice!” for it’s my conviction that the world is mercilessly biased in favor of tiny people. That is, probably you.

I say “probably you” because of the firm statistical grounding in the fact that it is quantifiably improbable for a random person to be tall. I’m also assuming that you, dear reader, are a random person, and therefore most likely belong to the endless, but shallow, sea of short people.

Here’s the thing: since you are probably short you are likely to be unaware of how tall people suffer, so I’m going to tell you. For reference, I am a shade over six-two, which is tall, but not professional basketball player tall. This is still taller than more than nine-tenths of the American population, however.

Life as a tall man is not all bad. It’s true I’ve developed strong forearms from beating off adoring females who lust after my tallness, but there are many more misfortunes that outweigh the unending adulation of women. Showers for one.

Shower heads come to mid-chest on me. I’ve developed a permanent stoop from years of bending over to wash my hair—and then from scrunching down to see my reflection in the mirror, typically placed navel high, so that I can comb it.

The lamentations of the tall when it comes to airplane seats are too obvious to mention. As is our inability to fit into any bathtub or fully on any bed.

I once worked in a building that required, for security reasons, a peephole to be drilled into the door. I stood guard over two workers who dickered over where to place the pencil mark that would indicate where they were going to drill. Each in turn stepped up to the door and put a dot in the spot where their eye met the door. The marks didn’t quite match, but they soon settled on the difference.

Ultimately, the hole was about crotch high on me. To be fair, I was in Japan and the workers were Japanese, and therefore on the not tall side of the scale. Because I was in the military, I wasn’t entirely comfortable bending down to that degree1. This meant that I breached security each time I opened the door because I couldn’t see who was on the other side. Suspicious, is it not?

It was at this point that I began to believe that this discrepancy in height was not entirely genetic and that sinister motives may be behind the prejudices of the non-tall.

For example, I have to place my computer monitor on three reams of paper so that it approaches eye level, and I have to raise my chair to its maximum so that my knees aren’t in my chin, but when I do my legs won’t fit under the desk. No matter how I position myself I am in pain. I sit2 in a factory made cubicle-ette which, as far as I can tell, causes no difficulties for my more diminutive co-workers. This is more evidence of the extent of the conspiracy of the non-tall.

Shopping is suspiciously dreadful too. Short people can freely walk into any department store and grab something, anything, off the rack, while we tall men are stuck with places like Ed’s Big and Tall. These stores are fine if you have a waist of at least 46 inches and you have stumpy legs, but they are nearly useless otherwise.

Pants for the tall are a cruel joke. Even if they carry labels that promise lengths of 35 or more inches, we know that these labels are a lie. Yes, the legging material may stretch for yards and yards, but there is never enough space where it counts. These pants are called “short-rise” for obvious reasons. I asked a salesguy (a non-tall man, of course), do they make long-rise pants anymore? He didn’t stop laughing. Normally, I’d have my revenge by not buying anything from him, but I couldn’t buy anything from him in the first place. I could do nothing but fume.

I’m not sure how we, the tall, will be able to overcome these horrific adversities. In raw numbers we are but a small minority—a fairly imposing looking minority it’s true—but a minority just the same. Still, there is word that something can be done and I hear that we’re to discuss ideas at our next official Tall Man Meeting. Don’t bother trying to sneak in, though, because we take measurements at the door.

1If I had been in the Navy, I would have been used to it, of course.
2This was true then; it no longer is. I do not have a desk now.
March 28, 2008 | 9 Comments

Quantifying uncertainty in AGW

My friends, I need your help.

I have written a paper on quantifying the uncertainty of effects due to global warming, but the subject is too big for one person. Nevertheless, I have tried to—in one location—list all of the major areas of uncertainty, and I have attempted to quantify them as well. I would like your help in assessing my guesses. I am not at all certain that I have done an adequate or even a good job with this.

At this link is the HTML version of the paper I am giving in Spain (I used latex2html to encode this; it is not beautiful, but it is mostly functional).

At this link is the PDF version of the paper, which is far superior to the HTML. This paper, complete with typos, is about draft 0.8, so forgive the minor errors. Call me on the big ones, though.

I would like those interested to download the paper, read it, and help supply numbers for the uncertainty bounds found within. I ask that you not do this facetiously or glibly, and that you not purposely underestimate the relevant probabilities. I want an open, honest, and intelligent discussion of the kinds and ranges of uncertainties in the claims of effects due to global warming. For example, the words “Al Gore” should never appear in any comment. If you have no solid information to offer in a given area, please feel free not to comment on it.

The abstract for the paper is

A month does not go by without some new study appearing in a peer-reviewed journal which purports to demonstrate some ill effect that will be caused by global warming. The effects are conditional on global warming being true, which is itself not certain, and which must be categorized and bounded. Evidence for global warming is in two parts: observations and explanations of those observations, both of which must be faithful, accurate, and useful in predicting new observations. To be such, the observations have to be of the right kind, the locations and timing where and when they were taken should be ideal, and the measurement error should be negligible. The physics of our explanations, both of motion and e.g. heat, must be accurate, the algorithms used to solve and approximate the physics inside software must be good, chaos on the time scale of predictions must be unimportant, and there must be no experimenter effect. None of these categories is certain. As an exercise, bounds are estimated for their certainty and for the unconditional certainty in ill effects. Doing so shows that we are more certain than we should be.

My conclusions (which will make more sense, obviously, after you have read the paper) are

Attempting to quantify, to the level of precision given, the uncertainties in effects caused by global warming, particularly through the use of mathematical equations that imply a level of certainty which is not felt, can lead to charges that I have done nothing more than build an AGW version of the infamous Drake equation (Drake and Sobel 1992). I would not dispute that argument. I will claim that the estimates I arrived at are at least within an order of magnitude of the actual uncertainties. For example, the probability that AGW is true might not be 0.8, but it is certainly higher than 0.08.

The equations given, then, are not meant to be authoritative or complete. Their purpose is to concentrate attention on what exactly is being asked. It is too easy to conflate questions of what will happen if AGW is true with questions of whether AGW is true. And it is just as easy to confuse questions of the veracity and accuracy of observations with questions of the accuracy of the models or their components. People who work on a particular component are often aware of its boundaries and restrictions, and so are more willing to reduce the probability that this component is an adequate description of the physical world, but they are usually likely to assume that the areas with which they do not have daily familiarity are more certain than they are. Ideally, experts in each of the areas I have listed should supply a measure of uncertainty for that area alone. I would welcome a debate and discussion on this topic.

I also would not make the claim that I have accurately listed all the avenues where uncertainty arises (for example, I did not even touch on the uncertainty inherent in classical statistical models). But the ones I did list are relevant, though not necessarily of equal importance. We do have uncertainty in the observations we make and we do have uncertainty in the models of those observations. At the very least, we know empirically that we cannot predict the future perfectly. Further, the claims made about global warming’s effects are also uncertain. Taken together, then, it is indisputable that we are less certain that both global warming and its claimed effects are true than we are of either AGW or its effects alone.
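
The arithmetic behind that last sentence is nothing more than the multiplication of probabilities. Using the 0.8 for AGW mentioned above, and a 0.7 (invented here purely for illustration) for a claimed effect given that AGW is true:

```python
p_agw = 0.8               # probability AGW is true (the illustrative figure above)
p_effect_given_agw = 0.7  # probability of a claimed ill effect, given AGW (made up)

# The unconditional probability of the effect can never exceed either factor.
p_effect = p_agw * p_effect_given_agw
print(p_effect)           # 0.56: less certain than either piece alone
```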

Thanks everybody.