December 28, 2007 | No comments

Were the cannonballs on or off the road first?

There’s something of a controversy over whether photographer Roger Fenton placed cannonballs in a road and then took pictures of them. He also took a picture of the same road cleared of cannonballs. The question is whether the cannonballs were ON the road when he got there, or whether they were OFF and he placed them there to get a more dramatic photo. This drama unfolds at Errol Morris’s New York Times blog.

The question of whether they were first ON or OFF (Morris uses the capital letters, so I will, too) excited considerable interest, with hundreds of people commenting one way or the other, each commenter offering some evidence to support his position.

Some people used the number (Morris uses the ‘#’ symbol) and position of the balls, others argued from the sun shadows, some had words about gravity, and so on. Morris compiled the evidence used by both sides, ON (cannonballs on the road first) and OFF (cannonballs placed there by Fenton), and he presented this summary picture (go to his blog to see the full-sized image):

[Image: Morris’s summary chart of the cannonball evidence]

This is an awful graph: the order of evidence types is arbitrary (it would have been better to list them in order of importance); the use of color is overwhelming and difficult to follow; and, worst of all, the two graphs are on an absolute scale. 288 people supported ON and 153 supported OFF, so counting the absolute numbers and comparing them, as this picture does, is not fair: of course the ON side, with almost twice as many people, will have higher counts in most of the bins. What’s needed is a percentage comparison.
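For instance, here is a minimal sketch (in Python) of the kind of percentage comparison I mean. The per-category counts below are invented purely for illustration; only the group totals of 288 and 153 come from Morris’s tally.

    # Compare evidence categories as percentages of each side, not raw counts.
    # The per-category numbers here are hypothetical; only the totals are real.
    on_total, off_total = 288, 153

    evidence = {  # hypothetical counts of commenters citing each evidence type
        "position of balls": {"ON": 40, "OFF": 30},
        "sun shadows":       {"ON": 25, "OFF": 20},
        "gravity":           {"ON": 10, "OFF": 12},
    }

    for kind, counts in evidence.items():
        pct_on = 100 * counts["ON"] / on_total
        pct_off = 100 * counts["OFF"] / off_total
        print(f"{kind:20s} ON: {pct_on:5.1f}%   OFF: {pct_off:5.1f}%")

Dividing by each side’s total puts the two groups on the same footing, which is the whole complaint against the original chart.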


December 27, 2007 | No comments

Will Smith on reprogramming Hitler

Roger Kimball, in his blog, has an entry on the actor Will Smith’s “Reprogramming Hitler” comments. The subject is benevolence. It is well worth reading.

A quote: “The Australian philosopher David Stove got to the heart of the problem when he pointed out that it is precisely this combination of universal benevolence fired by uncompromising moralism that underwrites the cult of political correctness.” He goes on to quote Stove at length (go to the original site to read).

I thought it would be helpful to extend Stove’s quote. To those who would suppose that “Ought not wrongs to be righted?” is a rhetorical question, Stove writes:

It does not follow, from something’s being morally wrong, that it ought to be removed. It does not follow that it would be morally preferable if that thing did not exist. It does not even follow that we have any moral obligations to try to remove it. X might be wrong, yet every alternative to X be as wrong as X is, or more wrong. It might be that even any attempt to remove X is as wrong as X is, or more so. It might be that every alternative to X, and any attempt to remove X, though not itself wrong, inevitably has effects which are as wrong as X, or worse. The inference fails yet again if (as most philosophers believe) “ought” implies “can.” For in that case there are at least some evils, namely the necessary evils, which no one can have any obligation to remove.

These are purely logical truths. But they are also truths which, at most periods of history, common experience of life has brought home to everyone of even moderate intelligence. That almost every decision is a choice among evils; that the best is the inveterate enemy of the good; that the road to hell is paved with good intentions; such proverbial dicta are among the most certain, as well as the most widely known, lessons of experience. But somehow or other, complete immunity to them is at once conferred upon anyone who attends a modern university.

David Stove, On Enlightenment, Transaction Publishers, New Brunswick, New Jersey, p. 174

December 26, 2007 | 8 Comments

How many false studies in medicine are published every year?

Many, even most, studies that contain a statistical component use frequentist, also called classical, techniques. The gist of those methods is this: data is collected, a probability model for that data is proposed, a function of the observed data—a statistic—is calculated, and then a thing called the p-value is calculated.

If the p-value is less than the magic number of 0.05, the results are said to be “statistically significant” and we are asked to believe that the study’s results are true.

I’ll not talk in detail about p-values here; but briefly, to calculate one, a belief about certain mathematical parameters (or indexes) of the probability model is stated, usually that these parameters equal 0. If the parameters truly are equal to 0, then the study is said to have no result. Roughly, the p-value is the probability of seeing a statistic (in infinite repetitions of the experiment) larger than the one the researcher actually got, assuming that the parameters in fact equal 0.
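Here is a rough sketch of that recipe in Python. The two-sample t-test is my choice of statistic (no particular test is implied above) and the data are simulated, so take this only as an illustration of the mechanics.

    # Sketch of the frequentist recipe: collect data, assume a model,
    # compute a statistic, compute a p-value under "parameters equal 0".
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    group_a = rng.normal(size=30)   # observed data, group A
    group_b = rng.normal(size=30)   # observed data, group B

    # Statistic: the two-sample t statistic computed from the data.
    t_stat, p_value = stats.ttest_ind(group_a, group_b)

    # p_value: probability, assuming the true difference in means is 0, of a
    # t statistic at least as extreme as the one observed (two-sided here).
    print(f"t = {t_stat:.3f}, p = {p_value:.3f}")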

For example, suppose we are testing the difference between a drug and a placebo. If there truly is no difference in effect between the two, i.e. the parameters are actually equal to 0, then about 1 out of 20 times we ran this experiment we would expect to see a p-value less than 0.05, and so falsely conclude that there is a statistically significant difference between the drug and the placebo. We would be making a mistake, and the published study would be false.
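A quick simulation shows this rate; the t-test and the sample sizes are arbitrary choices of mine, but any test calibrated at the 0.05 level behaves the same way when there is truly no effect.

    # Simulate many drug-vs-placebo experiments with no true difference and
    # count how often the test nonetheless declares "significance" (p < 0.05).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_experiments, n_per_arm = 10_000, 50

    false_positives = 0
    for _ in range(n_experiments):
        drug = rng.normal(size=n_per_arm)     # no true effect
        placebo = rng.normal(size=n_per_arm)  # same distribution as drug
        _, p = stats.ttest_ind(drug, placebo)
        if p < 0.05:
            false_positives += 1

    print(false_positives / n_experiments)  # roughly 0.05, i.e. 1 in 20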

Is 1 out of 20 a lot?

Suppose, as is true, that about 10,000 issues of medical journals are published in the world each year. This is about right to within an order of magnitude. The number may seem surprisingly large, but there are an enormous number of specialty journals, in many languages, hundreds coming out monthly or quarterly, so a total of 10,000 over the course of the year is not too far wrong.

Estimate that each journal has about 10 studies it is reporting on. That’s about right, too: some journals report dozens, others only one or two; the average is around 10.

So that’s 10,000 x 10 = 100,000 studies that come out each year, in medicine alone.

If all of these used the p-value method to decide significance, and if the effects being hunted for were not actually there, then about 1 out of 20 studies would be falsely reported as true; thus about 5000 studies would be reported as true but would actually be false. And these will be in the best journals, done by the best people, and taking place at the best universities.

It’s actually worse than this. Most published studies do not have just one result which is reported on (and found by p-value methods). Typically, if the main effect the researchers were hoping to find is insignificant, the search for other interesting effects in the data commences. Other studies look for more than one effect by design. Plus, for all papers, there are usually many subsidiary questions asked of the data. It is no exaggeration, then, to estimate that 10 (or even more) questions are asked of each study.

Let’s imagine that a paper will report a “success” if just one of the 10 questions gives a p-value less than the magic number. Suppose for fun that every question in every study in every paper is false. We can then calculate the chance that a given paper falsely reports success (assuming the 10 questions are independent): it is just over 40%.
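That figure is easy to check under the independence assumption:

    # Chance that at least one of 10 truly null, independent questions
    # produces p < 0.05.
    p_any_success = 1 - (1 - 0.05) ** 10
    print(round(p_any_success, 3))  # 0.401, i.e. just over 40%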

This would mean that about 40,000 out of the 100,000 studies each year would falsely claim success!

That’s too high a rate for actual papers—after all, many research questions are asked which have a high prior probability of being true—but the 5000 out of 100,000 is also too low, because the temptation to go fishing in the data is too strong. It is far too easy to make these kinds of mistakes using classical statistics.

The lesson, however, is clear: read all reports, especially in medicine, with a skeptical eye.