Are Tree Rings Low Pass Temperature Filters?

I don’t know the answer to this question: I am not an expert in this area, and I haven’t the time or resources to track down the data to discover the answer. I even wonder if there exists controlled experimental data. It must be the case that the objections I make below are well known and have been considered by those who regularly use tree rings as proxies for temperature. I’m just hoping that some regular reader might know where to go on this matter.

Ulf Büntgen and a pal or two saw their “2500 Years of European Climate Variability and Human Susceptibility” published recently in Science (thanks to reader Sylvain Allard for bringing this to my attention). The abstract begins:

Climate variations have influenced the agricultural productivity, health risk and conflict level of preindustrial societies. Discrimination between environmental and anthropogenic impacts on past civilizations, however, remains difficult because of the paucity of high-resolution palaeoclimatic evidence. Here we present tree ring-based reconstructions of Central European summer precipitation and temperature variability over the past 2500 years. Recent warming is unprecedented, but modern hydroclimatic variations may have at times been exceeded in magnitude and duration.

The admission is “paucity of high-resolution palaeoclimatic evidence”, meaning no direct measures of temperature and precipitation exist. Thus the reliance (I emphasize the word) on tree-ring data as a stand-in for what was not measured.

Then comes the key: “Recent warming is unprecedented”. Is it? How can we know? Well, by examining the reconstructions of temperature using tree rings: that is, by building statistical models of temperature as functions of tree-rings.

My question is this. Suppose, ceteris paribus, that temperature changes rapidly year on year. I leave “rapidly” undefined, as I do how the temperature changes (more in summer than in other seasons? equal change through all seasons? etc.; each possibility would presumably influence the way trees react to temperature). Can trees keep up with rapid temperature change?

My guess, based, it’s true, on vague biology, would be that the tree ring responds to this temperature change linearly when the year-on-year temperature change is slow, and it responds to the temperature change (say) logarithmically when the year-on-year temperature change is fast. That is, when temperature change is too quick, the tree can’t catch up and doesn’t respond as quickly to extremes. Tree rings would then, in effect, be a low pass filter on temperatures.

If that is true, then any reconstruction of temperature based on tree rings would always show less variability than would actual temperature measurements. The past would necessarily look calmer than the present, so to speak. Reconstructions would have more hockey sticks than at the Joe on a Friday night in January.
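To make the guess concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the 0.5-degree threshold, and the exact logarithmic form are hypothetical, and no real tree-ring data are involved. It only shows that if rings respond linearly to small year-on-year changes and logarithmically to large ones, the recorded swings are smaller than the true swings whenever change is fast.

```python
import math
import random


def ring_response(delta, threshold=0.5):
    """Hypothetical ring response to a year-on-year temperature change.

    Linear while |delta| <= threshold, logarithmic beyond it. Both the
    threshold and the log form are illustrative assumptions, not biology.
    """
    size = abs(delta)
    if size <= threshold:
        return delta
    # Continuous at the threshold, but grows only logarithmically past it.
    return math.copysign(threshold + math.log1p(size - threshold), delta)


def rms(values):
    """Root mean square: a simple measure of typical swing size."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def variability_ratio(step_sd, n_years=2000, seed=1):
    """RMS of ring-recorded changes divided by RMS of true changes."""
    rng = random.Random(seed)
    deltas = [rng.gauss(0.0, step_sd) for _ in range(n_years)]
    recorded = [ring_response(d) for d in deltas]
    return rms(recorded) / rms(deltas)


if __name__ == "__main__":
    for sd in (0.1, 0.5, 2.0):
        ratio = variability_ratio(sd)
        print(f"step sd {sd}: recorded/true variability = {ratio:.2f}")
```

With slow change the ratio stays near one; with fast change it drops well below one, which is the "calmer past" effect described above. A reconstruction calibrated only on the slow regime would not see the compression.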

Now, if you knew how trees responded to rapid change, then you could of course incorporate this knowledge into your reconstruction models. But this removes these models from the land of simple regressions, and, unless the researcher is very careful, the results will almost certainly be too certain in times of rapid change (the confidence or credible intervals should widen considerably when the regime switches from linear to logarithmic).

Then, too, we have the difference between estimates of the parameters of these models versus these models’ estimates of the actual observables (temperature). Use of the former—which is all you see in classical statistics—guarantees over-certainty.
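The gap between the two kinds of uncertainty can be shown with an ordinary one-regressor least squares fit. This is a minimal sketch under the usual textbook assumptions; the function name and the data are invented for illustration and are not from any reconstruction. The first standard error describes the fitted mean (parameter uncertainty only); the second describes a new observation of the thing itself, and it is always the larger of the two.

```python
import math


def interval_widths(xs, ys, x_new):
    """Standard errors for the fitted mean and for a new observation.

    Ordinary least squares with one regressor; returns (se_mean, se_pred).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)
    leverage = 1.0 / n + (x_new - x_bar) ** 2 / sxx
    se_mean = math.sqrt(s2 * leverage)          # uncertainty in the parameters only
    se_pred = math.sqrt(s2 * (1.0 + leverage))  # uncertainty in the observable itself
    return se_mean, se_pred


if __name__ == "__main__":
    # Made-up ring widths (mm) and summer temperatures (deg C).
    widths = [1.0, 1.2, 0.9, 1.5, 1.1, 1.3, 0.8, 1.4]
    temps = [14.1, 14.9, 13.8, 15.6, 14.2, 15.1, 13.5, 15.2]
    se_mean, se_pred = interval_widths(widths, temps, x_new=1.25)
    print(f"se of fitted mean: {se_mean:.3f}, se of new observation: {se_pred:.3f}")
```

Reporting only the first number, when the question is about the temperature itself, understates the uncertainty.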

Controlled experimental data would answer the question. Grow a stand of trees, paying attention to the ceteris paribus, then change temperature slowly, then rapidly, and see what happens. Of course, since tree rings are laid down annually (I stand ready to be corrected here), this experiment will take some time.

You might then try to look to the wild where we have simultaneous (say) thermometer-based temperatures and tree rings, which must have been done. The problem here is observational bias. Chances are overwhelming that that ceteris paribus bit will not have been understood properly. I say this because it rarely is. This difficulty isn’t strong enough to bar these kinds of experiments, but it is sufficiently forceful that we should always look at our results with some skepticism, especially if our goal is to forecast temperature changes of fractional degrees.

As I have said, all this is surely well known; thus this post is more a way for me to organize my thinking than any kind of review. I await enlightenment from you.

Will The Religious Out-Breed Us All?

Thanks to long-time reader and contributor Ari Schwartz for bringing this to our attention.

“It is widely agreed that religion has biological foundations—that belief in the supernatural, obedience to authority or susceptibility to ceremony and ritual depend on genetically based features of the human brain.” Thus does Robert Rowthorn begin his paper on “Religion, fertility and genes: a dual inheritance model” with a falsity. Thus we are not later surprised to learn that Rowthorn has “proved” that the religious—whatever they are, poor things—will out-breed the “normals”, to the extent that the genes of the enlightened folks will be watered down with, well, with holy water.

If what Rowthorn said was false, what is true is that some have said that religion has biological foundations. One reading makes this trivially true: we are biological creatures with brains that allow us to think up religious thoughts. But that’s not what Rowthorn has in mind. He says “religion promotes the evolution of genes that predispose people towards religious belief or behaviour.” Got it? Religion itself makes people religious. Sigh. This is what happens when people read Dawkins with minds far too open. Suddenly, any idea sounds good, no matter how illogical.

How’s it work?

For religion to influence genetic evolution it must convey some kind of selective advantage. Such an effect might come about through social bonding via ritual, formation of group identity through myth, honest signalling through participation in costly ceremonies and adherence to social norms through love or fear of God.

Religion (which must be sentient, like a meme; or something) also makes people fertile. That’s what scares the bejesus, and the Jesus, out of Rowthorn. “The more devout people are, the more children they are likely to have.” He’s particularly fearful of them Amish who have a “total fertility rate of 4.8”. Why, if that sort of rate keeps up, the world will be flooded with beautiful hand-made quilts, not to mention the glut of various sauces and jams that even now squirt like a fire hose out of Lancaster, PA.

Remember the good old days? When national or royal academies of science would only publish articles of value and intrinsic worth? Papers which were insightful and had a reasonable chance to not only be true, but were untainted with mind-rattling gibberish? Maybe my glasses are rose-colored, but surely a work like Rowthorn’s would never have passed the bar of the Royal Society even twenty years ago, a time when even the John Birch Society would have rejected this man’s wild thesis.

Just for a start, Rowthorn has forgotten the Quakers, the thousands upon thousands of Catholic priests, nuns, and brothers, the growing population of Buddhist monks, Shinto priests, and myriad other holy men and women of various stripe whose main goal in life is not to pass on their religious genes. Even though we have no (or almost no) chromosomal material from any of these exceedingly religious people, yet we are able to replenish their stock year upon year. How can this be?

Nowhere does the economist Rowthorn—he is on the Faculty of Economics, Cambridge—acknowledge the idea that those Amish breeders are less well off than his presumably barren but richer colleagues. There is bountiful evidence that wealth is a bar to pregnancy, and not just personal wealth, but that of a community. The better off a region (or country) is, the fewer the kids the ladies of that region like to have. The love of money trumps the love of babies.

And how come the religious haven’t taken over by now, forcing their beliefs down our throats (the main fear of those discussing this paper on Slashdot)? Defections, says Rowthorn. Yes, even though the religious gene is pernicious, yet some people are able to overcome its influence. The people able to accomplish this miraculous feat—they become what they are not by sheer force of will, even though their wills were under the control of their genes—might be said to have been born again. They abandon their Earthly genes and adopt Enlightened memes which overpower their genes. Or something.

Perhaps it’s overwork or overexposure to economical equations that accounts for people like our Rowthorn. All those formulas have a way of inducing a sense of self that can be unhealthy. Maybe that’s why economists don’t have a lot of kids. I think I’ll model this.

The Similarities Between Psychics And Global Warming Activists

The statistical evidence series will continue after the weekend.

“I see the letter G.” The woman closed her eyes, cocked her head, and looked inwardly. She became grave, tense. “There’s…wait a second…it’s coming through…yes! I can just make out a body of water nearby.” She settled back, opened her eyes, a wide smile overcoming the frown. She waited for the applause that was sure to come.

The woman? A psychic telling a distraught family where the body of their daughter can be found. Or maybe an activist making guesses of where the next global warming calamity will occur. The two aren’t that far apart. Here’s why.

The New York Post reports that a “clairvoyant” was hired by the family of “Melissa Barthelemy, 24, of Buffalo”, who had gone missing. Police suspected foul play. The unnamed psychic was reported to have said she saw Melissa’s “body buried in a shallow grave overlooking a body of water.” Weirdly, the seer also predicted that there would be the letter “‘G’ in a sign nearby.”

Sadly, but also—let’s admit it—somewhat thrillingly, the police found Melissa where the psychic said she would be. And not just Melissa, a horrific mass grave in which “cops unearthed the skeletons of the victims, missing call girls, each wrapped in burlap bags on Long Island’s Gilgo Beach.” Long Island’s Gilgo Beach! There’s the body of water! There’s the G!

The evidence tells us the psychic was right. Therefore, the psychic is psychic; that is, this person (we don’t know whether it’s a man or a woman; I’ll assume it’s a woman) must have the powers she said she does. If you want to say it in a complicated way, the “body of water” and the “G” confirm the theory that paranormal powers exist.

And this is true: the evidence does indeed confirm the paranormal theory. If you’re a skeptic of psi powers, you might not like this conclusion, but that can’t be helped. When a theory predicts an event will happen, and if that event happens, that theory is confirmed to the degree the predicted events match the reality.

Are we done? As John Wayne would say, Not hardly. For that same body of water and “G” also confirm the theory that the psychic is just guessing, and for obvious reasons. Melissa was a Long Island resident, and Long Island is filled with rivers, creeks, lakes, ponds, cisterns, swimming pools; even a well known ocean is nearby. There is virtually no place that is not “near” a body of water. And the “G”? Well, anything with “Long Island” will qualify (can you locate the “G”?). Then there are gas stations, various “ings”, etc., etc. There was almost no way that the psychic could have been wrong. Her supporters will never suffer disappointment.
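The point that a vague prediction confirms both theories about equally can be put in numbers. A minimal sketch, with every probability invented for illustration (the function name is hypothetical too): if the prediction would come true almost as often under plain guessing as under real psychic power, then seeing it come true barely moves the probability that the psychic is psychic. Only a prediction that guessing would rarely get right can shift things much.

```python
def posterior_psychic(prior, p_hit_if_psychic, p_hit_if_guessing):
    """Probability the psychic theory is true after the prediction comes true.

    Plain Bayes' rule over two rival theories: 'psychic' and 'just guessing'.
    All inputs are illustrative assumptions supplied by the caller.
    """
    numerator = prior * p_hit_if_psychic
    return numerator / (numerator + (1.0 - prior) * p_hit_if_guessing)


if __name__ == "__main__":
    prior = 0.01  # assumed starting probability of real psychic power
    # Vague prediction ("near water", "a G nearby"): guessing almost always hits.
    vague = posterior_psychic(prior, 1.0, 0.99)
    # Specific prediction (say, an exact address): guessing almost never hits.
    specific = posterior_psychic(prior, 1.0, 0.001)
    print(f"after vague hit: {vague:.4f}; after specific hit: {specific:.4f}")
```

Both hits "confirm" the psychic theory, but only the specific one does so to a degree worth noticing.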

Unspecific predictions also plague the global warming forecasts of activists. I don’t mean all predictions of climate change, some of which are quite detailed. I have in mind those colorfully vivid sayings of doom given out by green organizations, typically in appeals for donations. These overly earnest folks say that if we don’t “do something”, bad things will occur. Near a body of water, usually.

Those who have “We have to save the planet!” ever on their lips are ready to interpret any untoward event as evidence that their worst fears are realized. Remember the Indonesian tsunami? That was near a body of water, and more than one activist was ready to blame it on mankind; some especially clever agitators were even able to point to global warming. This year’s cold and snow in the States? Global warming, too. Poverty in the third world? Climate change. A lot of racism is caused by the climate chaos, too. More prostitutes, pimps, and pirates? Reliance on fossil fuels.

People will always be creative enough to tie any environmentally bad thing—never, of course, good things—to the theory that mankind is responsible. Just as with psychics, whatever happens will be confirmation that their beliefs are true. They will not, so to speak, see that bodies of water are everywhere. This is why it is so difficult to convince the True Believer that his angst is misplaced.

For a more in-depth look at a psychic supposedly helping detectives solve a murder, see the Tabitha Horn case.

Model Selection and the Difficulty of Falsifying Probability Models: Part III

To clarify: to prove false means to demonstrate a valid argument with a certain conclusion. If a theory/model says an event is merely unlikely—make it as unlikely as you like, as long as it remains possible—then if that event happens, the theory/model is not falsified. To say, “We may as well consider it falsified: it is practically falsified” is like saying, “She is practically a virgin.” False means false; it has no relationship with unlikely.

A theorist or statistician has in hand a priori evidence which says model M1, M2, …, Mk are possible. Some of these, conditional on the theorist’s evidence, may be more likely than others, but each might be true. If these models have probability components, as most models do, and these probability components say that any event is possible, no matter how unlikely, then none of these models may ever be falsified. Of course, those models in the set that say certain events are impossible, and these events subsequently are observed to occur, then this subset of models can be falsified; the remaining models then become more likely to be true.

In Bayesian statistics, there is a natural mechanism to adjudge the so-called posterior probability of a model’s truth: it is called “posterior” because the model’s truth is conditional on observation statements, that is, on what happens empirically (this needn’t be empirical evidence, of course; any evidence will do; recall that probability is like logic in that both study the relationships between statements). Each model’s a priori probability is modified to its a posteriori probability via a (conceptually) simple algorithm.
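That algorithm is short enough to sketch. The example below is an illustration under stated assumptions, not anybody's production method: the four rival models are hypothetical coin models, each fixing a different chance of a head, the prior is the equal 1/k split, and the data (7 heads in 10 tosses) are made up. Each prior is multiplied by the probability that model gave to what was seen, and the results are rescaled to sum to one.

```python
from math import comb


def binomial_likelihood(p_head, heads, tosses):
    """Probability of seeing `heads` heads in `tosses` tosses under one model."""
    return comb(tosses, heads) * p_head ** heads * (1.0 - p_head) ** (tosses - heads)


def posterior_model_probs(priors, likelihoods):
    """Turn prior model probabilities into posterior ones via Bayes' rule."""
    joint = [p * like for p, like in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    # Hypothetical rival models: each asserts a fixed chance of a head.
    # The first says heads are impossible, so any head falsifies it.
    models = [0.0, 0.3, 0.5, 0.7]
    priors = [1.0 / len(models)] * len(models)
    likelihoods = [binomial_likelihood(p, heads=7, tosses=10) for p in models]
    for p, post in zip(models, posterior_model_probs(priors, likelihoods)):
        print(f"model P(head)={p}: posterior {post:.3f}")
```

The model that called heads impossible ends with posterior probability exactly zero: it is falsified. The others are only made more or less likely, and the posteriors say nothing about any model left off the list.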

These a posteriori probabilities may be ordered, from high to low and the model with the highest a posteriori probability picked as “the best.” The only reason one would want to do this is if a judgment must be made which theory/model will be subject to further actions. The most common example is a criminal trial. Here, the theories or models are suspects in some crime. At the end, only one theory/model/suspect will face punishment; that is, at most one will. It may be that no theory/model/suspect is sufficiently probable for a decision to be made. But if the suspect is found guilty, it is not that the convicted theory/model/suspect is certainly guilty, for the other theories/models/suspects might also have done the deed, yet the probability that they did so, given the evidence, is adjudged low. This implies what we all know: that the convicted might be innocent (the probability he is so is one minus the probability he is guilty).

It is often the case (not just in criminal trials) that one model (given the evidence and a priori information) is overwhelmingly likely, and that the others are extraordinarily improbable. In these cases, we make few errors by acting upon the belief that the most probable model is true. Our visual system works this way (or so it has been written). For example, your brain assures you that that object you’re reaching for is a cup of coffee, and not, say, cola. Sipping from it provides evidence that this model was true. But as we all know, our vision is sometimes fooled. I once picked up from the carpet a “cookie” which turned out to be a wiggling cockroach (big one, too).

Now, since we began with a specified set of suspects (like my cookie), one path to over-certainty is to not have included the truly guilty in the list. Given the specific evidence of omission, the probability the other suspects are guilty is exactly zero (these theories/models are falsified). But, in the trial itself, that specific evidence is not included, so that we may, just as we did with the green men, calculate probabilities of guilt of the (not-guilty) suspects. Keep in mind that all probability and logic is conditional on specific, explicit premises. The probability or certainty of a conclusion changes when the premises do.

So what is the probability that we have not included the proper theory/model/suspect? That question cannot be answered: at least, it cannot be answered except in relation to some evidence or premises. This applies to all situations, not just criminal trials. What might this external evidence look like? We’ll find out in Part IV.

Model Selection and the Difficulty of Falsifying Probability Models: Part II

I hope all understand that we are not just discussing statistics and probability models: what is true here is true for all theories/models (mathematics, physics, chemistry, climate, etc.). Read Part I.

Suppose for premises we begin with Peano’s axioms (which themselves are true given the a priori), from which we can deduce the idea of a successor to a number, which allows us to define what the “+” symbol means. Thus, we can eventually hypothesize that “2 + 2 = 4”, which is true given the above premises. But the hypothesis “2 + 2 = 5” is false; that is, we have falsified that hypothesis given these premises. The word falsified means to prove to be false. There is no ambiguity in the word false: it means certainly not true.

Now suppose our premises lead to a theory/model which says that, for some system, any numerical value is possible, even though some of these values are more likely than others. This is the same as saying no value is impossible. Examples abound. Eventually, we see numerical values which we can compare with our theory. Since none of these values were impossible given the theory, no observation falsifies the theory.

The only way a theory or model can be falsified is if that theory/model says “These observations are impossible—not just unlikely, but impossible” and then we see any of these “impossible” observations. If a model merely said a set of observations were unlikely, and these unlikely observations obtained, then that model has not been falsified.

For example, many use models based on normal distributions, which are probability statements which say that any observation on the real line is possible. Thus, any normal-distribution model can never be falsified by any observation. Climate models generally fall into this bucket: most say that temperatures will rise, but none (that I know of) say that it is impossible that temperatures will fall. Thus, climate models cannot be falsified by any observation. This is not a weakness, but a necessary consequence of the models’ probabilistic apparatus.
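The contrast between "unlikely" and "impossible" is easy to see on the log scale. In this minimal sketch (the function names and the numbers are assumptions for illustration), a normal model gives every finite observation a finite log density, however absurd the observation, while a model that confines values to a fixed range gives anything outside it a log density of minus infinity, that is, probability zero. Logs are used because the raw normal density of a very distant value underflows to 0.0 in floating point even though it is mathematically positive.

```python
import math


def normal_log_density(x, mu, sigma):
    """Log density of a normal model; finite for every finite x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def uniform_log_density(x, low, high):
    """Log density of a model that allows only values in [low, high]."""
    if low <= x <= high:
        return -math.log(high - low)
    return -math.inf  # the model says this observation is impossible


if __name__ == "__main__":
    wild_observation = 1000.0  # an absurd temperature, in made-up units
    print("normal model: ", normal_log_density(wild_observation, mu=15.0, sigma=1.0))
    print("bounded model:", uniform_log_density(wild_observation, low=10.0, high=20.0))
```

Only the bounded model can be falsified by the wild observation; the normal model merely calls it fantastically unlikely.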

Statisticians and workers in other fields often incorrectly say that they have falsified models, but they speak loosely and improperly and abuse the words true and false (examples are easy to provide: I won’t do so here). None of these people would say they have proved, for example, a mathematical theorem false—that is, that they have falsified it—unless they could display a chain of valid deductions. But somehow they often confuse unlikely with false when speaking of empirical theories. In statistics, it is easy enough to see that this happens because of the history of the field, and its frequent use of terms like “accepting” or “rejecting” a hypothesis, i.e. “acting like” a model has been proved true or falsified. However, that kind of language is often used in physics, too, where theories which have not been falsified are supplanted wholly by newer theories.

For a concrete example, take a linear regression model with its usual assumptions (normality, etc.). No regression model can be falsified under these premises. The statistician, using prior knowledge, decides on a list of theories/models, here in the shape of regressors, the right-hand-side predictive variables; these form our premises. Of course, the prior knowledge also specifies with probability 1 the truth of the regression model; i.e. it is assumed true, just as the irascible green men were. That same prior knowledge also decides the form of these models (whether the regressors “interact”, whether they should be squared, etc.). To emphasize, it is the statistician who supplies the premises which limit the potentially infinite number of theories/models to a finite list. In this way, even frequentist statisticians act as Bayesians.

Through various mechanisms, some ad hoc, some theoretical, statisticians will winnow the list of regressors, thus eliminating several theories/models, in effect saying of the rejected variables, “I have falsified these models.” This, after all, is what p-values and hypothesis testing are meant to do: give the illusion (“acting like”) that models have been falsified. This mistake is not confined to frequentism; Bayesian statisticians mimic the same actions using parameter posterior distributions instead of p-values; the effect, of course, is the same.

Now, it may be that the falsely falsified models are unlikely to be true, but again “unlikely” is not “false.” Recall that we can only work with stated premises, that all logic and probability are conditional (on stated premises). It could thus be that we have not supplied the premise or premises necessary to identify the true model, and that all the models under consideration are in fact false (with respect to the un-supplied premise). We thus have two paths to over-certainty: incorrect falsification, and inadequate specification. This is explored in Part III.

Model Selection and the Difficulty of Falsifying Probability Models: Part I

These next posts are in the way of being notes to myself.

Logic is the study of the relation between statements. For example, if “All green men are irascible, and Bob is a green man”, then we know that “Bob is irascible” is certainly true. It isn’t true because we measured all green men and found them irascible, for there are no green men; it’s true because syllogisms like this produce valid conclusions.

We know that “there are no green men” because we know that “all observations of men have produced no green ones”, and we know that based on further evidence, extending in a chain to the a priori. As it is not necessary for what follows, this proof is left for another day.

Probability is no different than logic in that it is the study of the relation between statements. Given the premises assumed above, the probability the conclusion is true is 1. Modify the first word in the first premise (“All”) to “Most”, then the probability of the conclusion is less than 1 and greater than 0 (the range depending on the definition of “Most”).

In either case, logic or its generalization probability, we cannot know the status of a conclusion without reference to a specific set of premises. We cannot know the probability of so simple a statement as “This coin will show a head when tossed” without reference to some set of premises—which might include observational statements. Thus, all probability is, just as all logic is, conditional.

This background is necessary to emphasize that we cannot know whether a given model or theory is true (regardless of whether that model is wholly probabilistic, deterministic, or in between) without reference to some list of premises. In classical (frequentist) statistics, that premise is (eventually) always “Model M is true”, therefore we know with certainty, given that premise, that “Model M is true”. This premise is usually adopted post hoc, in that many models may be tried, but all are discarded except one.

The “p-value” is the probability of getting a larger (in absolute value) ad hoc statistic than the one actually observed given the premises (1) the observed data, (2) a statement about a subset of the parameter space, and most importantly (3) the model, which is assumed true. If the model is false, the p-value still makes sense because, like our green men, it only assumes the model is true. “Making sense” is not to be confused with being useful as a decision tool. It makes sense in just the same way our green men argument makes sense, but it has no bearing on any real-world decision.
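A small sketch makes the three premises visible. It assumes, purely for illustration, the simplest textbook case: a normal model with known spread, a claim that the central parameter equals some value, and a z statistic. The function name and the data are made up. Notice that the model enters only as an assumption inside the calculation; nothing in the output reports on whether it is true.

```python
import math
from statistics import NormalDist, fmean


def two_sided_p_value(data, null_mean, sigma):
    """Two-sided p-value for a z statistic.

    Premises: (1) the observed data, (2) the claim that the central
    parameter equals null_mean, (3) a normal model with known sigma,
    which is simply assumed true.
    """
    n = len(data)
    z = (fmean(data) - null_mean) / (sigma / math.sqrt(n))
    # Probability, under the assumed model, of a statistic larger in
    # absolute value than the one actually observed.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    made_up_data = [0.4, 1.1, -0.2, 0.9, 0.7, 1.3, 0.1, 0.8]
    print(f"p-value: {two_sided_p_value(made_up_data, null_mean=0.0, sigma=1.0):.4f}")
```

Feed the same function data that plainly did not come from a normal model and it will return a number just as readily, which is the sense in which the p-value "makes sense" without being a guide to whether the model holds.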

Importantly, despite perpetual confusion, the p-value says nothing about whether (3) the model is true; nor does it say anything about whether (2) the statement about the subset of the parameter space is true. The theory or model is always assumed to be true: not just likely, but certain.

I leave aside here the argument that a theory leads to a unique model: my claim is that the two words are synonymous. Whether or not this is so, a model is a unique, fixed construct (e.g., every addition or deletion of a regressor in a regression is a new theory/model). The ad hoc statistic or hypothesis test of frequentist statistics forms part of the theory/model (in this way, there are always two theories under frequentist contention, with one being accepted as true, the other false).

In Bayesian statistics, there is a natural apparatus for assessing the truth of a model. There is always the element of post hoc model selection in practice, but I’ll assume purity for this discussion. If we begin with the premise, “Models M1, M2, …, Mk are available”, and join it with “Just one model is labeled Mi”, then the prior probability “Model Mi is true” given these premises is 1/k. It is important to understand that if the premise were merely “Model Mi is either true or false”, then the probability “Model Mi is true” is greater than 0 and less than 1, and that is all we can say. This makes sense (and it is different from the frequentist assertion/premise that “Model Mi is true”) because again all logic/probability is concerned with the connections between statements, not the statements themselves (this is the major mistake made in frequentism).

That last assertion means that the list of models under contention is always decided externally; that is, by premises which are unrelated to whether the models are true, or even good or useful. There might be some premise which says, “Given our previous knowledge of the subject at hand, these models are likely true”; that premise might go on to assign prior probabilities different than 1/k for each model under consideration. But it is of the utmost importance to understand that it is we who close the universe on acceptable models. In practice, this universe is always finite: that is, even though we can make statements about them, we can never consider an infinity of models.

In Part II, model selection and what falsifiability is.

Job Wanted: Tall Statistician Seeks Match

I am hoping that among my faithful readers there is somebody that knows somebody who might know a guy who’s looking for a statistician.

The economy is such that consulting opportunities are rarer than a conservative at NPR. So in the interest of continuing to finance my daily three, I thought I should turn my ability of wearing suits—and liking it—into cash by finding a real job. Do any companies still require suits?

My sometime unorthodox views, and insistence on remaining in New York City when going would have been wiser, have made it difficult to discover a position in a research university. Despite some earlier opportunities, I stayed put until the two young gentlemen who I’m responsible for could be booted onto the streets without me incurring legal penalties.

I would love to teach at a school where the students actually want to learn and are not there just to “Get a degree.” In my fantasy, these kids matriculate already knowing how to write and are not unfamiliar with multiplication and division of two-digit numbers. And since this is my hallucination, I envision the college shares my desires and puts knowledge (and not “diversity”, etc.) first. Sigh.

There are these places called “think tanks”, and although I’ve never seen one, I picture them as highly walled zoos in which curious specimens of thinking animals are kept on display and are made to perform (on paper) for food. I have no contacts with zookeepers.

I wouldn’t fit in well with (standard) pharmaceutical companies. The work that they do is constrained and restricted by the dictates of various bureaucracies, and necessarily they are not interested in the kind of statistics I preach.

I am a mean R programmer (I curse the computer a lot) and am eager to demonstrate the usefulness of predictive statistics to some adventurous firm.

Re-location is fine (and easy; we only rent). For years, I’ve been trying to find a way back into Texas where I started my illustrious career, via, it’s true, the nepotism of my Uncle Sam. I don’t want to live in California, as gorgeous as that state is (I would like to keep at least some of the money given to me by my potential employer).

I’m free immediately. The only hitch is that I promised Cornell I would teach a two-week course at the end of this June.

My CV may be viewed on-line, or downloaded.

Email or call (beginning Monday, 17 January) 917-392-0691.