Are Scientific Papers Becoming Worse?

The United States Football League began in 1983 with twelve cities, many of which already had an NFL team. The level of play was decidedly inferior to the “senior” league because, of course, the better men were already in the NFL. Still, the organizers thought that more football was wanted, even though the quality would not be on par with what fans had come to expect.

But people failed to love the expansion league, and it never did well enough to be able to pay top talent. It lasted only three seasons, playing its last in 1985.

This is a familiar story in sports. With a given population and infrastructure, there’s only so much top talent to go around. You can’t expand indefinitely and expect consistent quality.

The same must be true of professors at universities. There are only so many great brains to go around. It’s true that as the population grows there are more potential recruits into the white-coat leagues. And when training in the youth associations (i.e., math clubs, science fairs, band) is functioning well, there is a better chance that prospects will be recruited.

But again, you can’t expand indefinitely and expect consistent quality. And the professoriate has certainly expanded and is continuing to expand. It’s college for all, regardless of whether most can handle the rigors!

It’s natural to wonder how much the swelling of the ranks and the dilution of talent account for the Wall Street Journal’s findings in “Mistakes in Scientific Studies Surge.”

Seems retractions by journals have gone from near none ten years ago to well over 300 the past two years. Some of these retractions are from the authors of the papers themselves, after they conscientiously notice their mistakes, but many others are from the editorial boards of the journals after they identify various shenanigans of the authors.

The growth in shoddy work has been so explosive that the blog Retraction Watch has popped up to document the flood. First two headlines: “A quick Physical Review Letters retraction after author realizes analysis was ‘performed incorrectly’” and “Cal Poly Pomona education researcher leaves post after rampant plagiarism is revealed.” What a depressing site!

What’s going on? Expansion, as we saw. The number of journals in every field has exploded. The Far East, Lower Southern Half, Asian Journal of Research Studies: Part D, and so on. Why? Half of what earns a professor tenure is raw paper count. Quality is important, but only at the top schools. At most places, the only determination is weight: the more papers the better. Considering that most professors in the sciences have only one or maybe two good ideas in their entire lives, yet they must publish half a dozen or more papers a year, it is no surprise that much of what makes its way into print is of little or no value. Or even of positive harm, as the WSJ article argues.

The more journals there are, the more papers, and the more papers, the more bad ones. Not just sloppy or ridiculous papers, which regular readers of this site know are rampant, but fraudulent ones, too. Corners cut, numbers fudged, bandwagons jumped on and ridden into the dust. This isn’t just in medicine, where there are many multitudes (as in thousands) of papers appearing monthly, but also in research into “climate effects.” It’s not unusual to see, in the same journal even, one paper which “proves”, “Fruit Bat Numbers To Decline When Climate Change Hits” and another which “shows”, “Fruit Bats To Increase Without Number, New Plague, When Climate Change Hits.”

And let’s not forget money. Money is what’s going on, and lots of it. Research dollars from governments have flooded the system, making it easier for professors to set up little fiefdoms. The more money a professor brings in, the higher the rewards from the university bureaucracy (corollary: the more money, the larger this bureaucracy grows). Now, the only way to bring in the bucks is by publishing in sexy fields. Better publish quickly and in bulk, too, because you have a dozen guys breathing down your neck, itching to increase their “impact” scores ahead of yours.

The temptations here are enormous and, increasingly, they are not resisted. According to the WSJ: “‘The stakes are so high,’ said the Lancet’s editor, Richard Horton. ‘A single paper in Lancet and you get your chair and you get your money. It’s your passport to success.’”

Solution? There isn’t one. Among the race of people are liars, cheats, thieves, slobs, connivers, enthusiasts, zealots. And scientists, you may be surprised to learn, are people. Just because you know how to solve the equations of motion does not mean you are gifted with higher morals than the common man.

The only way to reduce fraud and mistakes is to reduce the number of papers. And since paper counting will never go away, the only way to reduce the number of papers is to reduce the number of professors. And that doesn’t seem likely to happen.

Comments

Are Scientific Papers Becoming Worse? — 35 Comments

  1. You are ignoring the possibility that more retractions can be a result of more attentive eyes noticing mistakes (which is exactly the opposite of your “limited good brains” idea), combined with faster diffusion of information.

  2. @ anonymous. I’ll see your “faster diffusion of information” and raise you a

    “…but many [retractions] are from the editorial boards of the journals after they identify various shenanigans of the authors.”

    Why precisely is that not the “result of more attentive eyes noticing mistakes” which you somehow think the author ignored? A little quick on the trigger, mayhap?

  3. Another reason is the explosive growth in the number of researchers from the former “third world”: India, China, SEA. They generally have lower quality standards and a lower “internal filter”. They submit a lot of publications to established journals and, statistically, more bad stuff passes through.
    And the solution is quite possible: a crowdsourced moderation system, like the one on Slashdot, or Google +1. Introduce this system to arxiv.org and a lot of low-quality stuff will be flagged down at once.

  4. arXiv has a de facto moderation system — responses to papers, linked to the original submission. Someone else sees a paper they like/don’t like/whatever and writes their own paper in response. Happened repeatedly with the Dantzig Selector paper and has happened several times for other papers. Oftentimes the responses are so fast and furious that all of a sudden a stream of papers appears responding to each other, all typeset, most of them carefully argued, all transparently available to anyone who wishes to chronicle the development of an idea or ideas. Needless to say the pace and transparency of such exchanges is vastly superior to that embraced by more traditional journals.

    Contrast this with (say) your average Nature Publishing Group journal. The revisions requested by referees are not available, the timeline of any remark or rebuttal (should NPG deign to publish it, for some reason) is rarely if ever noted, and as an assistant professor friend of mine noted, “I don’t get evaluated on how well I referee”. Moreover, (unlike Science or Nucleic Acids Research, for example) there appears to be no standing requirement for open and durable availability of data and materials sufficient to reproduce results. From a business and “impact” standpoint this is probably optimal. From the standpoint of how it shapes the public perception of science, it is abysmal.

    Furthermore, for major CS and machine learning conferences, the topic of how best to allocate submissions to qualified reviewers is itself an active area of research. As best as I can tell, this has nothing whatsoever to do with the situation in (e.g.) biomedical research, to say nothing of fields like epidemiology or psychology. So for-profit corporations are selling per-page advertisements in journals which are effectively judged by their ‘impact factor’, a commercially derived number in and of itself, on the one hand; and on the other, we have a freewheeling exchange of ideas in full public view, albeit requiring quite a bit of effort (and of course, papers on arXiv are regarded as worthless in the pursuit of tenure — but anyone who is publishing regularly in the core topics on arXiv without attracting a stormfront of ridicule ought to be able to make it in the real world, so this ought not to be such an issue).

    I am not convinced that it is arXiv which is most desperately in need of fixing.

    The degree of corruption and fraud in (say) stem cell or cancer research appears to be a large scalar multiple of that witnessed in less fashionable fields such as old-school model-organism developmental biology, solid-state physics, or inorganic chemistry. I wonder why that is? Might it have something to do with the corrupting effects of money?

    True or not, I find that this explanation submits to Occam’s Razor much more readily than “there are tons of yellow and brown people invading the sciences, who can’t possibly be as smart as whitey”. Setting aside issues of political correctness, the prior likelihood of money ruining science is substantially greater than that of (insert developing country here) ruining scientific endeavours.

    JMHO

  5. I disagree. We may well have a shortage of genius but not a shortage of good intelligent people. I think the mistakes and poor quality in science are about politics and personal agendas intruding into what should be pure science. The journals have shirked their responsibility to monitor the work they publish. Worse, the journals are participating in much of the fraud. We need to find and correct the intentional failings.

  6. Fewer professors implies less fraud. Really?

    Given that the number of Ph.D. students has been steadily on the rise, and is unlikely to decrease as governments compete to train more and more Ph.D. students, fewer professorships will lead to harsher competition. I doubt this will reduce fraud.

    Of course, in a free market, we would expect fewer professorships to translate into fewer Ph.D. students. But it does not work like that. Undergraduates have very little reliable information on the job market. They don’t see all the Ph.D. students who failed to get jobs. They only see those who succeeded. Moreover, the government (in most countries) actively entices students to attend graduate schools by providing funding and scholarships. (These are signals to the students that Ph.D.s are a good thing.) Society in general does a good job at presenting a glorious image of Ph.D. holders. You rarely see underemployed scientists on TV. (How many Ph.D.s in Physics do enterprise computing for a living? Or teach yoga? Quite a few. But we never, ever, see them.)

    So, I submit to you that the solution is more full-time, secure and well paid research positions (not necessarily professorships). This would lessen the pressure on publication speed and make fraud less profitable, more risky.

    Disclaimer: I have a tenured position.

  7. It doesn’t make sense to have fewer professors when the overall population is growing. Maybe a fixed fraction is a good idea, and we shouldn’t have a growth in the relative number of professors to the total population. Growth in science fields is just as necessary as all others.

    Maybe the argument you’re making is that you should have fewer papers per professor, i.e. only publish the really worthwhile ideas? But you’re right, the current system doesn’t allow that…

  8. “there are tons of yellow and brown people invading the sciences, who can’t possibly be as smart as whitey”
    Those yellow and brown people are doing perfectly well in science if they reside in western universities and labs.
    There are two factors here:
    1. Negative selection: the brightest and best are leaving their native countries.
    2. A mostly lower scientific culture in those countries, influenced by factor 1 and by traditional learning models.

  9. These mistakes are much more frequent than most people realize. They are difficult to catch and take a long time to find and resolve. One step in the right direction would be “reproducible research”. Making data, protocol, and analysis code available on publication of a paper would speed up the scientific process of evaluation. Journal editors and funding agencies should move in this direction.

  10. “The only way to reduce fraud and mistakes is to reduce the number of papers.”

    I don’t see how this could work. I guess if you don’t do anything, you’ll make no mistake.

  11. “retractions by journals have gone from near none ten years ago to well over 300 the past two years.”

    Presuming some science in these figures and assuming it is indicative of a trend (near none forever before ten years ago and increasing beyond 300/2 years in the future), I find it more easily explainable by a change in culture (e.g. retractions becoming a requirement rather than a taboo), technology (e.g. plagiarism detection software), and a general trend toward scandal-mongering. No doubt the increase in population and the pressure to publish are decreasing quality, but the increase in the number of retractions is not evidence of it (especially if it has increased so abruptly).

  12. Some excellent examples of the “bad science” problem can be found in

    McKitrick, Ross R. and B.D. McCullough (2009) Check the Numbers: The Case for Due Diligence in Policy Formation. (Fraser Institute, February 2009)

    McKitrick presents a number of examples of bad science enabling even worse government policies. Some of his examples are from “Social Science”, which is close to an oxymoron, but the causes of “bad research” are similar.

  13. Oh, really.

    Let’s take the old classic: the Theory of Evolution.

    Defined as: Modified Descendants.

    1) Bacteria: acquire whole genes from their environment. Meets the definition; Cited in peer-reviewed journals as clear-cut evidence of ToE.
    2) Viruses: Ditto. Ditto.
    3) Alleles: Via a process that strongly resembles playing with LEGO, a given ‘specie’ (which is defined as stupidly as humanly possible) adjusts to changing environment. The phenomenon called Natural Selection. Ditto. Ditto.
    4) Human (or whatever) children. Ditto. Not-Ditto, due to obvious idiocy of the definition.
    5) NEW genes or alleles, Ditto. Never, ever, Ditto-ed.

    1-3 are used as evidence for 5. Yet not one single new gene in any of them. As clear cut an example of the conflation fallacy as you will ever find.

    Yet NOT ONE person here will stand against this cornerstone of Liberalism, Humanism, Atheism, whatever-ism.

    Now, just what does that say about the majority of people who practice either science or math?

    I would suggest that you rethink what you wrote here, Briggs, but getting a Liberal to reason is impossible.

    =

    Oh, but it does go on.

    Consider Natural Selection. As the environment changes, the ‘specie’ changes. There are resistive factors, of course, a kind of information inertia due to the non-linear, weighted, impact of multiple factors in effect on a given specie.
    Not one, single, NEW gene required.
    Now. Even a gibbering moron should be able to realize that this simple, reasonable, process renders the VERY CONCEPT of ‘transitional fossils’ invalid.

    Well, not quite. First you have to understand that legged whales can be reduced to no-legged whales via the simple actions of NS.
    There. Now the gibbering moron has enough to think with.

    But now there is that very strong stasis-jump pattern in the fossils… which is ever so incompatible with the existence of NS… but don’t go hurting your little minds.

    =

    One could go on, and in similar ways introduce a small semblance of reason into such things as irreducible complexity (or rather, against the so-called arguments against it). But why bother.

    =

    What does it say, Briggs, about you and your cronies, that none of the above moved you so much as a millimeter?

    Science is corrupt, because the Liberals who practice it are unfit to do so: put more simply for your benefit: purposefully-stupid people only do smart things if there are purposefully non-stupid people to hold them to account.

  14. As Serge mentioned, there are different standards for learning and scholarship. I recently read an article about one country where scholarly papers were expected to include large swaths of material copied from experts in the field. In that tradition, you were well-educated if you could quote the sages, without necessarily understanding or being able to build upon what they discovered.

    I’ve started ignoring papers from a particular country because they inevitably involve taking two or three computer/statistics concepts, combining them in strange ways, and documenting that this combination seems to work okay in a couple of test cases and merits further investigation. Not fraudulent, but really a matter of technique dredging that reads like an undergraduate term paper rather than a real research paper.

    I think that aspect is more common in non-western schools, while in western schools the pressure is to show that you’re publishing and therefore using all of that grant/project money well. Want more money for your computer department? Show some trivial technique supports global warming.

  15. I tend to read a fair number of papers, and do so for the same reason I like to read about history: lots of boring stuff, but occasionally there is a gem waiting to be uncovered.

    Something I’ve noticed is that work (both published and unpublished) written post-1998 almost always reports positive findings. In other words, the average author (both pre- and post-PhD) seems obsessed with confirming the theory they had going in.

    Have a peek at some of the dendrochronology stuff from the 1980s; authors willingly admit to unknowns. These days it seems the emphasis is on certainty.

    Of course this could just be a case of ‘back in my day we walked to school uphill both ways’.

  16. If enough papers are published, there is a greater chance that some of them will be really great. That’s not the problem. The problem lies in knowing which ones are profound.

    Bob Newhart (Newhart, 1960):

    You know the idea … if you put an infinite number of monkeys, at an infinite number of typewriters, they would type all the great books. Now, they are going to type a lot of gibberish, too. So they would have to hire guys to check the monkeys to see if they were turning out anything worthwhile. … Look, I’ve got something: “To be or not to be … that is the gezortenblatt …”.

    retrieved from http://projects.chass.utoronto.ca/chwp/CHC2005/Butler/Butler.htm

    No matter how many (or few) papers there are, the problem lies in filtering the wheat from the chaff. It can easily be argued that the establishment is not particularly good at recognizing new and important work.

    There is a new book: http://www.scientificamerican.com/article.cfm?id=how-the-hippies-saved-physics “How the Hippies Saved Physics”. It makes a persuasive argument that a group of, mostly unpublished, ‘hippie’ physicists kept alive and extended the physics that is now the basis of quantum communication. http://en.wikipedia.org/wiki/Quantum_entanglement

    The problem lies with filtering. The establishment cannot be trusted to filter correctly. So, I say, open the flood gates and find a way to deal with the resulting information overload.

  17. I’m not sure a complex system involving multiple variables, including the interests of funding agencies, researcher self-interest, university politics and interests, journal self-interest, genuine scholarship, ethical character, etc., could be distilled to a single variable: the number of papers.

  18. Briggs said, “The only way to reduce fraud and mistakes is to reduce the number of papers.”

    Which is like saying that the only way to reduce defects in new cars is to produce fewer cars. Fortunately, Toyota and Honda found a better way.

  19. I don’t like the USFL analogy. Look at all of the sports leagues that have expanded and at the same time have generally increased the quality of play. Sure, there are people who will say that Kobe is not as good as Chamberlain, or that Pujols couldn’t hold a candle to Ruth. But, across the board, the worst player in most sports today would be a star in a previous era.

    On professorships, the number of Ph.D.s grows much faster than the number of tenure-track university jobs. I don’t think that the expansion of professorships is the source of the problem.

    On the point of bad published research, 80% of published epidemiological “statistically significant” studies fail to replicate in controlled experiments…

    Negative results should have an easier time getting published. Almost none do. I said as much to a scientist friend of mine, and she laughed at the idea.

    The reviewers should have more at stake, including having their names attached to the research, along with any independent results reproducing the experiments.

  20. There is a difference between research and football. In football, if someone pays you to throw the game, you are likely to be found out and drummed out of the game, if not worse. In research, if someone pays you to get the politically correct result, your chances of prizes and election to a National Academy are much enhanced. An example is climate research, which is political correctness on steroids.
    Speaking of steroids, in football that will eventually be found out too and squelched.

  21. When I was an undergraduate student in college in the 1950s, there was a mathematician and popular satirist from Harvard University, Tom Lehrer. One of his satirical songs was our favorite, Lobachevsky. One line from that song sticks in my mind.
    “I remember the first day I meet the great Lobachevsky. In one word he told me the secret of success in research. Plagiarize. Let no one else’s work evade your eyes, remember why the good Lord gave you eyes, don’t shade your eyes, plagiarize, only please call it research.”

    Twenty years later I began my career in university teaching and found that plagiarism was rampant, with no imperative to eliminate it from college ranks. Students bragged about it and many technical students flagrantly used plagiarism. Could any of these great minds referred to above suffer from the same lack of moral imperative to be honest because they got away with plagiarism during their educational careers? Have we made our own bed?

  22. Jon Shivley above is on track, but I can go back almost 40 years to graduate school, wherein some students made up data and published theses using it. No one caught them because the reviewers (committee) rarely read the thesis or dissertation. And often didn’t understand it anyway.

    I published some work, I’m not sayin’ where, in which a nabla was accidentally switched with a V with an arrow over it (vector velocity), and the resulting equation violates the second law of thermodynamics. I didn’t proofread well enough to catch it, but none of the reviewers did either. I’m mediocre, so was everyone else. The decline in abilities was in evidence then?

    In the first couple of decades of the last century Robert Wood and Irving Langmuir spent quite a lot of time debunking junk and pathological science. Quality of science already in decline at this point?

    We could point out how Newton spent a lot of time fudging his data in order to make observations support his mechanical theory. I have found circular logic in his work. Was the decline already in progress in 1689?

    Or is it that even the finest minds occasionally make mistakes? We are simply quicker to find them and more willing to point them out?

  23. Pingback: Matt Briggs wonders whether scientific papers are becoming worse (we wonder anyone doubts it) | JunkScience Sidebar

  24. Most of the “research papers” I see published these days (in mathematical and applied statistics as well as in statistics used in specific areas e.g. medicine, geo-sciences, social sciences (!), climate research) have NO or very little novel statistical content; most look like they should be a worked example/exercise in a textbook. It is common for “researchers” to try to squeeze two or three diluted papers out of one idea (or worse, data set) instead of providing one concise report.

    The source of the problem, I believe, is the way “research” is “incentivised”: money/tenure/job security as a consequence of papers published (“if it was published it must be good”). In any respectable field it may take 10+ years of postgraduate study to properly near the research horizon in that field (not just a flimsy PhD). Expecting (allowing?) people to publish before that happens introduces a deluge of papers where authors simply rediscover older (and usually far more general) ideas and repackage them (knowingly or unknowingly) as cutting-edge research to inflate their paper count for the “research” administrators/managers.

    So, what to do?

    Firstly we should ditch the absolute paper count as a measure of a researcher. “Quality” of research may be better measured by number of *positive* citations in *”respectable”* journals. Papers citing a paper which is subsequently shown to contain plagiarism or fallacious content should attract negative citations and the other work of that same author should be similarly downgraded. Google should PageRank+ researchers!

  25. I was going to say that I thought “specie” meant metal money as in coins. And then I realised, “Oh yes, and it evolved into paper money!” Then it occurred to me that there was little or no point in saying any of this but as that has never stopped anyone else: “Submit Comment”.

  26. I kind of like this analogy, although I know it’s not perfect:

    Consider that the sum of human knowledge is a pier, extending into the Sea of Ignorance, built of stones. Every once in a while, a real “strong” person carries a huge stone out to the end of the pier, drops it off, and extends the pier. Then, some people who are perhaps not as strong can carry smaller stones out to fill in the gap between the new stone and the existing pier. Then some other even less strong people can carry smaller stones yet, until the surface of the pier is once again smooth. Under this analogy, we don’t all have to be the strongest person alive to help build the pier, we just need to be able to carry a stone and put it in the right place.

    Where we seem to be today is that (1) we have more builders working on the pier than ever before, (2) people are getting “credit” for carrying out a grain of sand, and (3) when a stone is dropped that doesn’t align with what was already there, it’s not usually called out. Items (1) and (3) are pretty hard to fix (Mr Briggs’ proposal to limit PhD’s notwithstanding – and I would note in passing that PhD’s haven’t done all of the heavy lifting in the past). We might be able to fix item (2) if the journals could quit carrying the “grain of sand” papers – this, in turn, could eventually result in Mr Briggs getting his wish, as some of the weaker academics would no longer be able to publish, and thus (academically) perish. What is required is for journal referees to set a minimum standard “stone size” for accepting articles – papers must contain at least 51% new content or something.

    As I said, it’s not a perfect analogy, but it’s something to discuss. So, discuss?

  27. Pingback: William M. Briggs, Statistician » A Synthetic What? Lady Who Won Lotteries Is Statistician, More

  28. “….reduce the number of professors. And that doesn’t seem likely to happen.”

    Oh but it will. Dissolution of the Monasteries.

  29. “No amount of experimentation can ever prove me right; a single experiment can prove me wrong.” Albert Einstein

    Presumably backed up with a paper.

    In light of this, one professor is worth both a hundred and a thousand.

  30. Pingback: Publicering ingen garanti för kvalitet | The Climate Scam

  31. David, how does Professor McKitrick integrate his own experience in his discussion of bad science?

    Ross McKitrick and Patrick J. Michaels, 2004: Erratum “A test of corrections for extraneous signals in gridded surface temperature data, Clim Res 26: 159–173″, Climate Research 27: 265-268.
    http://www.int-res.com/articles/cr2004/27/c027p265.pdf

  32. Math: 2+2=4
    New Math: 2 mod n + 2 mod n = 4 mod n>2
    CAGW Math: 2+2=8 (+/- 6)