Poor Statistics Undermine The Reliability Of Neuroscience

I’m all aglow
Note carefully the picture which accompanies this post. The right-most glow is centered on the upper-middle-fifth amygdalic cingulatum region of your author’s brain. Statistics show that this region is “associated” with feelings of joy; more specifically, the shivers of delight one experiences when saying, “I told you so!”

(The other smaller glow to the left is “associated” with pleasant thoughts of Myers Dark rum, now ridiculously expensive.)

The synaptic juices started flowing and the glow glowed after I read “Many Neuroscience Studies May Be Based on Bad Statistics” in Wired, which opens:

The fields of psychology and cognitive neuroscience have had some rough sledding in recent years. The bumps have come from high-profile fraudsters, concerns about findings that can’t be replicated, and criticism from within the scientific ranks about shoddy statistics. A new study adds to these woes, suggesting that a wide range of neuroscience studies lack the statistical power to back up their findings.

The study is “Power failure: why small sample size undermines the reliability of neuroscience” in Nature Reviews Neuroscience by Katherine Button, John Ioannidis, and several others. Best thing about that paper was a short guide to terms researchers ought to know. My favorite:

Winner’s curse
The winner’s curse refers to the phenomenon whereby the ‘lucky’ scientist who makes a discovery is cursed by finding an inflated estimate of that effect. The winner’s curse occurs when thresholds, such as statistical significance, are used to determine the presence of an effect and is most severe when thresholds are stringent and studies are too small and thus have low power.
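A quick simulation makes the curse concrete. This sketch is not from Button's paper: the true effect (0.3 standard deviations), the group size (20 per arm), the known-variance z-test, and the function name are all assumptions picked for illustration. It runs many small two-group studies and compares the average estimate across all of them with the average among only those that cleared the significance threshold.

```python
import random
import statistics


def simulate_winners_curse(true_effect=0.3, n_per_group=20,
                           n_studies=5000, seed=1):
    """Simulate small two-group studies (assumed sd = 1 in both groups)."""
    rng = random.Random(seed)
    # Standard error of a difference in means when sd is known to be 1.
    se = (2 / n_per_group) ** 0.5
    all_estimates, significant = [], []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        estimate = statistics.fmean(treated) - statistics.fmean(control)
        all_estimates.append(estimate)
        # Two-sided z-test at the conventional 5% level.
        if abs(estimate) / se > 1.96:
            significant.append(estimate)
    return {
        "power": len(significant) / n_studies,
        "mean_all": statistics.fmean(all_estimates),
        "mean_significant": statistics.fmean(significant),
    }


if __name__ == "__main__":
    for name, value in simulate_winners_curse().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the average over all studies sits near the true effect, while the average over the "winners" comes out well above it, because with so few subjects only an overestimate can clear the threshold.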

Button and team did a meta-meta analysis of fMRI studies and the like and discovered what will be no secret to regular readers: the statistics of these works ain’t too hot. Specifically, many (most?) have very low power. “The consequences of this include overestimates of effect size and low reproducibility of results.”

They looked at 48 meta-analyses, which comprised “730 individual primary studies.” The median power was 18%. If you don’t have a feel for that, the “normal” power for medical studies is 80%+. That’s the level grant granters want, anyway. Button’s finding means half the studies are worse than anemically powered.
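To get a feel for what 18% versus 80% means in subjects, here is a minimal sketch using the normal approximation for a two-sided, two-group comparison. The function names and the "medium" standardized effect of 0.5 are my assumptions, not figures from Button's paper, and a t-test would ask for slightly more subjects than this approximation does.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, sd 1


def power_two_group(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means.

    effect_size is the standardized mean difference (assumed known sd).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * math.sqrt(n_per_group / 2)
    # Probability of landing in either rejection tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(effect_size, target_power=0.80, alpha=0.05):
    """Smallest n per group reaching target_power (ignores the far tail)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(target_power)
    return math.ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    print(f"power with 9 per group: {power_two_group(0.5, 9):.2f}")
    print(f"n per group for 80% power: {n_per_group_for_power(0.5)}")
```

With an effect of 0.5, roughly nine subjects per group lands in the neighborhood of the 18% figure, while 80% power takes about seven times as many.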

The Scientist quotes Hal Pashler, a psychologist at the University of California, San Diego, as saying, “This paper should help by revealing exactly how bad things have gotten.” Can’t go too much by that, because it’s standard journalistic practice to fetch a quote from somebody who didn’t write the paper (and often didn’t read it). But in this case Pashler is right.

Or maybe I’m just happy to agree with him. Here’s why I do.

Point one, I did an extensive (maybe too extensive) critique of Sam Harris’s paper “The neural correlates of religious and nonreligious belief.” One of the worst papers, in a series of bad papers, that I’ve ever read. Shoddy experimental design, editorialism masked as science, data mysteriously disappearing, biases galore, et cetera.

If you’re only going to read two things, read these entries of the review “Can fMRI Predict Who Believes In God?” Part I and Part Last.

Points two and higher: click to one of these two reviews: “Yet Another Study ‘Proves’ Liberal, Conservative Brain Differences” and “Brain Atrophy Responsible For Religious Belief?”

I’ve done many more, but these capture the gist. Strange that these “studies” have a sort of theme to them, no?

Wired had the sense to ask why so many bad studies? Reason one: studies are too expensive. But since scientists must publish lest they perish, reason two: “[T]he pressure on scientists to publish often, preferably in high-profile journals, to advance their careers and win funding from the government.”

Since that pressure will not be lifted even after Button’s identification of systematic flaws, it is rational to expect a continuation of systematic flaws. Gives me a kind of job security, though.

IDing poor science doesn’t pay as well as generating it, however. Actually it pays not at all. That’s why I think warm thoughts about rum: to keep my brain lit up.


Thanks to Mike Flynn for pointing us to this fine news.


  1. Can’t you get somebody to nominate you for a MacArthur “Genius” award? You would be set for life in the rum dept. I peripherally knew one of the early winners many years ago. He was a bright and engaging person, but no more so than you seem to be. However, he did publish good science in the major journals before they went down the crapper… Maybe that’s your problem: you’re out of your time.

  2. Gary,

    I think you’re confusing “genius” with “loquacious.” I would, however, accept from that honorable committee a free bottle.

  3. Even ignoring the bad statistics, there seems to be even less actual science in neuroscience than there is in climate science.

  4. Many moons ago, can’t recall the exact number, a software called the ‘Social Sciences Package’ or something similar arrived on the Auckland campus. Others may recall the exact name. A second generation IBM had been installed at the time and this package was part of the deal. It became all the rage. It swept through the soft sciences like a dose of salts – and some of the harder ones too. I have wondered in recent years how much of the present use and abuse of stats stems from that time. I well recall some of the more amusing thesis results. Shucks, the term GIGO had not even entered the lexicon.

    While correlation does not imply causation, that package certainly provided a good helping of serendipity to the publish-or-perish cause.

  5. Dr K.A. Rodgers,

    SPSS, an acronym something like Statistics for Social Scientists. Maybe it was from the Spanish.

    Who remembers the phrase, “I’m going to submit my data to SPSS”? As if SPSS can be a truth generator.

  6. If you keep following the links you end up with the dead salmon effect.


    Much the same point has been made by James Le Fanu in “The Rise and Fall of Modern Medicine” reviewed here.


    He relates the problem to the tremendous success of medicine, and science in general, in the first half of the twentieth century due to the picking of low hanging fruit. This led to the expansion of funding, number of researchers, and research institutes. However, there is no longer enough work (useful research) per scientist, who still must be kept busy and this has led to more and more trivial or incorrect results. This is the publish or perish problem that you mention. I am not suggesting that we have run out of fruitful areas of research, but only that the imbalance of funding and scientific personnel has produced perverse incentives. Le Fanu estimates that the inflection point occurred in the early 1980’s.

  7. William S,

    Thanks for the Salmon reminder!

    Le Fanu’s date sounds about right; see “Are Scientific Papers Becoming Worse?” The increase in universities/research centers can be likened to expansion teams and the watering down of talent.

    The low-hanging fruit comment is also interesting. Growth curves take a typical S-shape: slow increase at first, followed by a burst of energy, then a slow tapering off. Where are we on the curve? See “Podcasts Episode #2: The Limits of Human Knowledge and The Singularity” (I haven’t done one of these in over a year). Everybody assumes we’re at the beginning. But a case can be made we’re near the tapering off.

    Quick evidence: the increased praise for the banal. More superlatives expended per paper than ever before, etc.

    Update: if you listen to my voice, you’ll hear why I haven’t done a podcast in a while. No courage of my convictions (it was still in my atheist days). Maybe I should re-start them up, after smoking a few more cigars to put some meaty timbre into the vocal cords.

  8. So if you do a PhD in neuroscience and then go on to star in ‘The Big Bang Theory’ you’re moving in the right direction.

  9. As a graduate student finishing up their PhD in Neuroscience, I applaud your post, and all the press that shoddy statistics in neuroscience are getting. This issue needs to be at the forefront to correct the problem, most specifically in human social, cognitive and affective neuroscience.

    At the same time, though, some of the most exciting work ever done to understand how the brain works is going on right now. Optogenetics allows us to understand sufficiency and specificity in greater detail than ever before. Karl Deisseroth and Ed Boyden (along with a couple others) will win the Nobel Prize for this work inside of ten years.

    So, while we may lambaste plenty of researchers for their shoddy work, we should remember that the collective effort is what matters to all of society.

    PS: if you’re going to tear into other people’s work, make sure you post an accurate picture of the region of the brain you’re referring to, lest you misinform others while trying to educate them.

  10. “Update: if you listen to my voice, you’ll hear why I haven’t done a podcast in a while. No courage of my convictions (it was still in my atheist days). Maybe I should re-start them up, after smoking a few more cigars to put some meaty timbre into the vocal cords.”

    You don’t have to do that, Briggs, just get autotune. Or buy a Darth Vader mask at Toys R Us for $20.
