July 2, 2008 | 21 Comments

Lizards all male climate change club

Nature magazine reports this headline: Condemned to single-sex life by climate change.

They are talking about a species of lizards called tuatara that live “on about 30 small islands in New Zealand’s north.” The disgusting, scaly creatures are in exile on those islands because they have everywhere else been “wiped out by predators.” No word on who or what these predators are or why the predators cannot follow the tuatara to the islands and thus continue their campaign of herpetological genocide.

Anyway, the lizards are about to go extinct and it’s all your fault. It seems that when the weather is hot, more male tuatara lizards are born than female lizards. And we all know what happens when there are more boy than girl lizards. It becomes impossible to get a date and procreate.

This “doomsday prediction”, we are told by researchers, is assured because of (what else?) global warming.

How do the researchers know this? Why, a computer told them so.

Previous computers did not tell them so, which forced the researchers to reprogram them, this time incorporating in their models “physics of heat transfer with terrain data.” Well, that is impressive. The researchers then “simulated climate change and then monitored its effect on specific points across the island.”

What they found was shocking: Rampant maleness, which naturally carries with it the consequence of enforced bachelorhood.

For those of you who are not as computer savvy as I, let me summarize. Researchers programmed a computer to show that when the temperature rises, fewer female lizards are born. They then told the computer that temperatures were in fact rising. The computer then said “fewer female lizards are born.”

The researchers pored over this result and came to the conclusion that “warmer temperatures caused by global warming imply fewer female lizards will be born.” They wrote this in a paper which was duly summarized at Nature. Science in action!

All might not be lost because, the researchers suggest, the lizards might be “translocated” ( = moved) to cooler climes. I just hope that those mysterious predators aren’t in the new translocations.

July 1, 2008 | 30 Comments

I wish I was making this up

Martin Creed

Another piece of data is in that shows money does not correlate with intelligence.

“Artist” Martin Creed (pictured above) created a “work” called 850, which he will exhibit at the well-known Tate Britain art gallery starting today.

The “work” consists of having joggers, once every thirty seconds, trot through the museum.

Yes, you read that right. Joggers, wearing shorts and looking like they came from the park, will run lightly through a hall or two in the name of “art.”

Guardian writer Adrian Searle claims that the wonderful thing about this “art” is “that it is gloriously pointless.” It’s not surprising the paper should feel that way, since much of its reporting falls into this category. Searle argues that people should not try to decide whether 850 is “art” but “whether the work captures the imagination, whether it gives pleasure and makes people think.”

So, on this theory, I could put a certain piece of Mr Searle’s anatomy in a vice and start to twist, an act which is certainly imaginative and would give me some pleasure. It would also cause Searle to do some serious thinking. But would he call it art?

People should not feel anger or despair over idiocy like 850, now common in the “art” world. They should instead view it as a chance to raise their income bracket. Since rich people (those who run galleries and buy and sell “art”) are now utterly incapable of judging quality, and are dead scared of admitting their ignorance, the door is wide open for any “artist” to sell them anything. The only key seems to be that the “work” has to be completely asinine, childish, devoid of any value, and, of course, politically correct.

It also cannot be cheap. The more exorbitantly priced your excrescence, the better chance it has to sell. For you must understand that the sole purpose of this “art” is to allow its owner to boast that he owns it. Or, in the case of the Tate, to claim that it is unique.


Wired’s theory: the end of theory

Chris Anderson, over at Wired magazine, has written an article called The End of Theory: The Data Deluge Makes the Scientific Method Obsolete.

Anderson, whose thesis is that we no longer need to think because computers filled with petabytes of data will do that for us, doesn’t appear to be arguing seriously; he’s merely jerking people’s chains to see if he can get a rise out of them. It worked in my case.

Most of the paper was written, I am supposing, with the assistance of Google’s PR department. For example:

Google’s founding philosophy is that we don’t know why this page is better than that one: If the statistics of incoming links say it is, that’s good enough. No semantic or causal analysis is required.

He also quotes Peter Norvig, Google’s research director, who said, “All models are wrong, and increasingly you can succeed without them.”


The scientific method is built around testable hypotheses….The models are then tested, and experiments confirm or falsify theoretical models of how the world works…But faced with massive data, this approach to science – hypothesize, model, test – is becoming obsolete.

Part of what is wrong with this argument is a simple misconception of what the word “model” means. Google’s use of page links as indicators of popularity is a model. Somebody thought of it, tested it, and found it made reasonable predictions (as judged by us visitors who repeatedly return to Google because we find its link suggestions useful), and thus it became ensconced as the backbone of Google’s rating model. It did not spring into existence simply by collecting a massive amount of data. A human still had to interact with that data and make sense of it.

Norvig’s statement, which is false, is typical of the sort of hyperbole commonly found among computer scientists. Whatever they are currently working on is just what is needed to save the world. For example, probability theory was relabeled “fuzzy logic” when computer scientists discovered that some things are more certain than others, and nonlinear regression was recast as mysterious “neural networks,” which aren’t merely “fit” with data, as happens in statistical models; instead they learn (cue the spooky music).

I will admit, though, that their marketing department is the best among the sciences. “Fuzzy logic” is absolutely a cool sounding name which beats the hell out of anything other fields have come up with. But maybe they do too well because computer scientists often fall into the trap of believing their own press. They seem to believe, along with most civilians, that because a prediction is made by a computer it is somehow better than if some guy made it. They are always forgetting that some guy had to first tell the computer what to say.

Telling the computer what to say, my dear readers, is called—drum roll—modeling. In other words, you cannot mix together data to find unknown relationships without creating some sort of scheme or algorithm, which are just fancy names for models.

Very well: there will always be models and some will be useful. But blind reliance on “sophisticated and powerful” algorithms is certain to lead to trouble. This is because these models are based upon classical statistical methods, like correlation (not always linear), where it is easy to show that spurious relationships become certain to appear as the size of the data grows. It is also true that the number of these false signals grows at a fast clip. In other words, the more data you have, the easier it becomes to fool yourself.
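
For the skeptical, here is a minimal sketch of the effect. It assumes one particular reading of "more data", namely more measured variables for a fixed number of observations, and it uses pure noise, so every "relationship" it finds is false by construction. The function names, the sample size of 30, and the cutoff of 0.4 are my own illustrative choices, not anything from Anderson's article.

```python
import itertools
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_spurious(n_vars, n_obs=30, threshold=0.4, seed=0):
    """Count pairs of independent noise columns whose |r| exceeds threshold.

    Every column is independent Gaussian noise (an assumption made for
    illustration), so any pair that passes the cutoff is a false signal.
    """
    rng = random.Random(seed)
    columns = [
        [rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in range(n_vars)
    ]
    return sum(
        1
        for a, b in itertools.combinations(columns, 2)
        if abs(pearson(a, b)) > threshold
    )


if __name__ == "__main__":
    # Print: number of variables, number of pairs examined, false signals found.
    for n_vars in (5, 20, 80):
        n_pairs = n_vars * (n_vars - 1) // 2
        print(n_vars, n_pairs, count_spurious(n_vars))
```

The number of pairs examined grows with the square of the number of variables, so even a small per-pair chance of a fluke adds up to a large pile of "discoveries" that mean nothing.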

Modern statistical methods, no matter how clever the algorithm, will not bring salvation either. The simple fact is that increasing the size of the data increases the chance of making a mistake. No matter what, then, a human will always have to judge the result, not only in and of itself, but how it fits in with what is known in other areas.

Incidentally, Anderson begins his article with the hackneyed, and false, paraphrase from George Box “All models are wrong, but some are useful.” It is easy to see that this statement is false. Suppose I give you only this evidence: I will throw a die which has six sides, just one of which is labeled ‘6’. Given that evidence, the probability I see a ‘6’ is 1/6. That probability is a model of the outcome. Further, it is the correct model.

June 30, 2008 | 6 Comments

IMS: Citation Indexes Stink

The Institute of Mathematical Statistics (I am a member) has issued a report on the widespread misuse of Citation Statistics.

The full report may be found here.

The non-surprising main findings are:

  • Statistics are not more accurate when they are improperly used; statistics can mislead when they are misused or misunderstood.
  • The objectivity of citations is illusory because the meaning of citations is not well-understood. A citation’s meaning can be very far from “impact”.
  • While having a single number to judge quality is indeed simple, it can lead to a shallow understanding of something as complicated as research. Numbers are not inherently superior to sound judgments.

The last point is not just relevant to citation statistics, but applies equally well to many areas, such as (thanks to Bernie for reminding me of this) trying to quantify “climate sensitivity” with just one number.

More findings from the report:

  • For journals, the impact factor is most often used for ranking. This is a simple average derived from the distribution of citations for a collection of articles in the journal. The average captures only a small amount of information about that distribution, and it is a rather crude statistic. In addition, there are many confounding factors when judging journals by citations, and any comparison of journals requires caution when using impact factors. Using the impact factor alone to judge a journal is like using weight alone to judge a person’s health.
  • For papers, instead of relying on the actual count of citations to compare individual papers, people frequently substitute the impact factor of the journals in which the papers appear. They believe that higher impact factors must mean higher citation counts. But this is often not the case! This is a pervasive misuse of statistics that needs to be challenged whenever and wherever it occurs.
  • For individual scientists, complete citation records can be difficult to compare. As a consequence, there have been attempts to find simple statistics that capture the full complexity of a scientist’s citation record with a single number. The most notable of these is the h-index, which seems to be gaining in popularity. But even a casual inspection of the h-index and its variants shows that these are naive attempts to understand complicated citation records. While they capture a small amount of information about the distribution of a scientist’s citations, they lose crucial information that is essential for the assessment of research.
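
The report's point about averages can be sketched in a few lines. The two "journals" below are hypothetical, with citation counts I made up purely for illustration, and the calculation is a plain mean of citations per article rather than the official impact factor formula. Both journals get the same average, yet a typical paper in one looks nothing like a typical paper in the other.

```python
from statistics import mean, median

# Hypothetical citation counts per article (invented for illustration only).
JOURNAL_SKEWED = [0, 0, 1, 1, 2, 2, 3, 3, 4, 84]       # one blockbuster paper
JOURNAL_EVEN = [8, 9, 9, 10, 10, 10, 10, 11, 11, 12]   # steady performers


def summarize(citations):
    """Return the mean, the median, and the share of articles below the mean."""
    avg = mean(citations)
    below = sum(1 for c in citations if c < avg)
    return {
        "mean": avg,
        "median": median(citations),
        "share_below_mean": below / len(citations),
    }


if __name__ == "__main__":
    for name, data in (("skewed", JOURNAL_SKEWED), ("even", JOURNAL_EVEN)):
        print(name, summarize(data))
```

Picking a paper from the skewed journal and assuming it has the journal's average number of citations would be wrong nine times out of ten, which is the misuse the report complains about.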

I can report that many in medicine fixate on and are enthralled by a journal’s “impact factor”, which is, as the report says, a horrible statistic, and one with an awful sounding name. The “h index” is “the largest n for which he/she has published n articles, each with at least n citations.”
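
The quoted definition translates directly into code. This is a minimal sketch; the function name and the sample citation records are mine, invented for illustration.

```python
def h_index(citations):
    """Largest n such that n articles each have at least n citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the top `rank` articles all have >= rank citations
        else:
            break      # counts only fall from here, so no larger n can work
    return h


if __name__ == "__main__":
    # Two hypothetical records (made-up numbers) with very different totals.
    steady = [4, 4, 4, 4]
    blockbuster = [1000, 500, 4, 4]
    print(h_index(steady), h_index(blockbuster))
```

Both hypothetical records share an h-index although one has nearly a hundred times the citations of the other, which is the kind of lost information the report is talking about.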

Naturally, now that we statisticians have weighed in on the matter, we can expect a complete stoppage in the usage of citation statistics.