*I was asked to talk about how to handle uncertainty in forecasts. Below is a rough sketch.*

Many of you run or work with models that produce forecasts, guesses of the future. The outputs of these models are statements like, “Sales in June will be up 15%,” “The number of cars sold in 2011 will be 1.4 million,” or “The percent of people who vote for candidate A will be 38%.”

Typically, these forecasts will be in the form of points, single numbers, as the examples suggest. But nobody believes that sales in June will be up *exactly* 15%, that the number of cars sold will be *precisely* 1.4 million, or that the share of voters for A will be equal to 38.000%.

When we hear these numbers, we automatically, depending on our experience level, add a “plus or minus,” where the width of that adjustment is wider or narrower depending on whether our trust in the forecast is low or high.

The National Weather Service used to be reluctant to issue precipitation forecasts other than to say that it was or wasn’t likely. Today, we hear probabilities such as “The chance of rain is 40%”, which provide a fuller understanding of the uncertainty in the prediction.

Incidentally, since it is often asked, that “40%” is defined to mean that, during the relevant period stated (usually 12 hours), there is a 40% chance that at least a trace of precipitation will fall somewhere in the relevant location (usually a large area). It does *not* mean that precipitation will certainly fall over 40% of the area.

Stating the forecast probabilistically is necessary for those who make decisions about precipitation: for example, farmers deciding whether to irrigate, or you deciding whether to carry an umbrella. Of course, these probabilities are married to precipitation quantity forecasts (called QPFs in the lingo), which increase their usefulness.

No forecast is complete without some indication of its uncertainty. If you run an economic model whose output is a single point you are making a very strong statement. It is no different than claiming that you are *certain* that the future is known, that car sales *will be* 1.4 million.

Adding an internal “plus or minus” moves you away from dogmatism, but the problem is that these adjustments can be swayed by emotion, inexperience, or desire (moving a *forecast* in the direction you want it to go is called wishcasting; moving the *thing* forecast in the direction of the forecast is called cheating). Also, you cannot know whether *your* “plus or minus” is the same as the next guy’s.

Obviously, you would like some way to account for uncertainty systematically. Models come in two pieces: inputs and internals. Suppose we have an economic model that produces forecasts of car sales. Some of its inputs might be outputs from other forecasts, such as GDP, the unemployment rate, and so forth. Other inputs might carry more specific information on the car industry: this information might be knowns, such as the number of manufacturers, or uncertainties, such as iron prices.

The internals might be, and usually are, black boxes: software packages provided by third parties. Even so, all models are sets of equations, algorithms, various modules and parameterizations, and so forth.

The easiest thing to do is to run the model with your inputs fixed. Then, since those inputs are not certain, “perturb” them—change them in accordance with their uncertainty—and then rerun the main model with the perturbed inputs.

For example, suppose our model has one input, a GDP guess for 2011 of $14.2 trillion. Run your model with that as an input. Your car sales model spits out 1.4 million. But GDP will not certainly be $14.2 trillion. It might be $14.1 or $14.3 trillion. Rerun your model twice more, once with each of these as input.

Repeat as necessary, once for each possible GDP input. You will have an ensemble of outputs, which you can then weight by the uncertainty of each of its inputs. If you have adequately represented the uncertainty of the inputs, your ensemble will automatically give you a “plus or minus” envelope around car sales.
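Here is a minimal sketch of that procedure in Python. Everything specific in it is an assumption made for illustration: `car_sales_model` is a toy stand-in for the black box (tuned only so that $14.2 trillion gives 1.4 million cars), and the weights attached to the three GDP scenarios are invented. With a real model you would swap in your own function and your own input probabilities.

```python
import math

def car_sales_model(gdp_trillions):
    """Hypothetical stand-in for the black-box model: millions of cars sold."""
    # Toy assumption: 1.4 million cars at $14.2 trillion GDP, and
    # 0.5 million more (or fewer) cars per trillion of GDP above (or below).
    return 1.4 + 0.5 * (gdp_trillions - 14.2)

# Assumed GDP scenarios (trillions of dollars) and the weight given to each.
gdp_scenarios = {14.1: 0.25, 14.2: 0.50, 14.3: 0.25}

def weighted_ensemble(model, scenarios):
    """Run the model once per scenario; return the weighted mean and spread."""
    outputs = [(model(x), w) for x, w in scenarios.items()]
    total = sum(w for _, w in outputs)
    mean = sum(y * w for y, w in outputs) / total
    variance = sum(w * (y - mean) ** 2 for y, w in outputs) / total
    return mean, math.sqrt(variance)

mean, spread = weighted_ensemble(car_sales_model, gdp_scenarios)
print(f"car sales: {mean:.3f} million, plus or minus {spread:.3f} million")
```

The spread here is a weighted standard deviation of the ensemble outputs. It reflects only the uncertainty you fed in through the inputs, nothing about the model's internals.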

Of course, that example is somewhat screwy because car sales are part of GDP. Or maybe it’s not so screwy after all, since many economic models that use GDP as input forecast something that is part of GDP.

Again, this was just a rough sketch. We haven’t begun to discuss how to handle the uncertainty inherent in the model’s internals. The trick is in setting up an experiment where both the model’s inputs *and* internals are perturbed in such a way as to fully quantify your uncertainty in the eventual forecast.

We’ll get there!


I was recently reviewing a forestry research paper in which the authors made this statement:

“According to our analysis, the combined effects of land-use practices across the East caused a substantial and sustained net loss of forest between 1973 and 2000 (table 1). Forest land cover declined from an estimated 54.7% (±1.9%) in 1973 to 52.4% (±1.7%) in 2000, resulting in a 2.3% (±0.7%) change in regional land cover. This equates to a 4.1% decline in the total area of forest. The annual rate of loss was 0.15%, with declines occurring during all four time intervals. The estimated cumulative decline in net forest totaled more than 3.70 million ha.”

These numbers were obviously intended to inspire, if not fear, then at least a certain amount of hand-wringing.

But, wait! The data table for this information showed this in the heading:

“Percentage forest area and 85% confidence interval (±)”

I don’t believe I have ever reviewed any forest research (in my 45-year career) that worked with 85% confidence limits. While there are a number of other problems (assumptions and definitions) with this paper, those limits, to me, indicate that the numbers are just WAGs and that the paper is not worth spending time on.

Good opportunity to clear up what may be a mistake.

I’ve worked with the weather guys for decades, and was under the impression that when they reported a “40 percent” chance of precipitation, they meant that, according to the information they had available, precipitation had occurred 40 times out of a hundred in the past under conditions similar to those observed. But, let’s face it, a forty percent chance of precip is still slight, no?


My favorite misinterpretation of chance of rain is that there is a 40% chance that at any given moment it will be raining.

I spend a lot of my day looking at economic forecasts, usually the opposite way around from what you describe. I don’t care how many more cars will be sold if GDP rises; I wonder what the reported level of GDP will be if 1 million cars are sold. I don’t really care how accurate these forecasts may be. I want to know the consensus. Then, if GDP comes in above or below consensus, I know how to react. Or, if I have a view that is different from consensus, I can position myself.

When GDP numbers come out on April 30, the Commerce Department will not have finished compiling their numbers for the month of March. Many of the inputs to GDP will be based on only 2/3 of the necessary data and estimates for the remainder. In May, the Commerce Department will have all three months and will re-release Q1 GDP. Then in June they will clean up data on the margins and declare “GDP final.” But it isn’t final. It is still subject to regular revisions. We make our money on the forecast of a forecast.

Now, when it comes time to be in the business of making forecasts, points are not awarded for being on the nose. Credit goes to those who are more accurate than their peers. There is nothing to be gained by straying too far from the herd. Anecdotally, I have a friend who processes bankruptcy data. He sells his analysis of trends and his forecasts to banks and credit card companies. At some point, one of the credit card processors asked my friend to revise his forecasts to be in line with the already published forecasts of the in-house economist. It would be less embarrassing for them if both estimates turned out to be wrong than to have to explain why they were different.

I’m eagerly looking forward to additional entries on this topic. Thanks to the person who suggested it, and to you for tackling it.

Quantifying sensitivity and quantifying uncertainty are not the same things, but I’m sure you’ll get there.

Dennis, are you quite sure that 85% wasn’t a typo for 95%?

The solution you propose, to “perturb” the inputs, is basically a Monte Carlo simulation…and I agree that this is a great way to get a better estimate of a model’s forecast uncertainty. For those not familiar, this is where all inputs are modeled according to their most reasonable distributions (normal, binomial, etc.), then thousands of trials are run, each time with the input values randomly assigned according to their distributions. The resulting model output distribution, after the thousands of trials, gives a good estimate of your forecast uncertainty, although of course this depends on a) how well your model represents the true processes involved and b) how well you know your inputs.
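A rough sketch of that Monte Carlo procedure in Python, for anyone who wants to see the mechanics. The model, the second input (`iron_price_index`), and both input distributions are invented for illustration; none of them come from the post. The seed is fixed only so that the run is repeatable.

```python
import random
import statistics

def car_sales_model(gdp_trillions, iron_price_index):
    """Hypothetical toy model (millions of cars); not the post's actual model."""
    return 1.4 + 0.5 * (gdp_trillions - 14.2) - 0.2 * (iron_price_index - 1.0)

def monte_carlo(model, n_trials=10_000, seed=0):
    """Draw every input from its assumed distribution and collect the outputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trials):
        gdp = rng.gauss(14.2, 0.1)    # assumed: normal, sd of $0.1 trillion
        iron = rng.gauss(1.0, 0.05)   # assumed: normal, sd of 5% of baseline
        outputs.append(model(gdp, iron))
    return outputs

outputs = monte_carlo(car_sales_model)

# With n=20 the cut points are the 5th, 10th, ..., 95th percentiles.
cuts = statistics.quantiles(outputs, n=20)
low, high = cuts[0], cuts[-1]
print(f"median forecast: {statistics.median(outputs):.3f} million")
print(f"90% of trials fall between {low:.3f} and {high:.3f} million")
```

The interval is only as good as the two caveats above: it says nothing about whether the model itself is right, and it inherits whatever is wrong with the assumed input distributions.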

Dearie

No. They refer to it in the body of the report as well as displaying it in their tables.

To Mr. Rice, I respectfully disagree. Perturbing the inputs yields information about the sensitivity of the model to perturbations of the input variables, but it does not provide information about the uncertainty of the model.

Imagine a model: aX + bY + cZ = Q

Suppose we try out a range of alternative values of X. The model then produces a range of values for Q. We can then determine how sensitive Q is to various values of X in our model.

But we cannot determine how uncertain Q is from that exercise, because we don’t know if the model has validity based just on those tests. Sensitivity yes, validity no.

The model could be a pile of junk. Q could be the number of visitations from space aliens. X could be the number of hamburgers sold at Wendy’s. No amount of sensitivity analysis is going to tell us whether the model is any good.
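To make the point concrete, here is a small Python sketch of a one-at-a-time sensitivity sweep on the model above. The coefficient values and the fixed values of Y and Z are made up for illustration.

```python
def model(x, y, z, a=2.0, b=1.0, c=0.5):
    """The linear model Q = aX + bY + cZ, with made-up coefficients."""
    return a * x + b * y + c * z

def sweep_x(x_values, y=1.0, z=1.0):
    """One-at-a-time sensitivity: vary X while holding Y and Z fixed."""
    return [model(x, y, z) for x in x_values]

x_values = [0.0, 1.0, 2.0, 3.0]
q_values = sweep_x(x_values)

# Change in Q per unit change in X; for a linear model this is just a.
sensitivity = (q_values[-1] - q_values[0]) / (x_values[-1] - x_values[0])
print(q_values, sensitivity)
```

Notice that no observed value of Q appears anywhere in the calculation. The sweep recovers the coefficient a and nothing else, so it would give the same answer whether Q were car sales or alien visitations.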