could we prevent some scientific retractions?

Fabio Rojas and others have been discussing retractions over on our buttoned-up nemesis, and making the excellent point that the presence of scientific retractions is good for science. However, retraction can only be good for science insofar as bad or even falsified science has taken place to begin with.

The case of Larry Sanna brings to mind the question of whether some such incidents might be preventable. For background, see this article in Nature. Essentially, Wharton scholar Uri Simonsohn is developing techniques for determining when data are likely to be faked; in this case, the telltale sign was a set of overly consistent standard errors. Sanna’s work, much of it done while he was on the faculty here at UNC, seems very likely to be falsified to some extent, as he has resigned from the faculty at Michigan (where he had been for only a few months) and asked that several papers be retracted. Meanwhile, I understand Simonsohn plans to publish on the techniques he uses for identifying falsified studies, which raises the question of whether falsifiers will just get more sophisticated in response.
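To make the “overly consistent” idea concrete, here is a minimal sketch of that general flavor of check (in Python, with entirely made-up numbers; this is an illustration under my own assumptions, not Simonsohn’s actual procedure): simulate how much sample standard deviations should vary across conditions under honest sampling, and ask whether the reported values are implausibly similar.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reported summary statistics: three experimental conditions,
# each with n participants, whose standard deviations look suspiciously alike.
n = 20
reported_sds = np.array([2.01, 2.03, 2.02])

# Pooled SD serves as the common population value under the benign
# assumption that the conditions simply share one true SD.
pooled_sd = np.sqrt(np.mean(reported_sds**2))

# Observed dispersion of the reported SDs.
observed_spread = np.std(reported_sds)

# Simulate honest sampling: draw each condition from a normal distribution
# with the pooled SD and record how much the sample SDs vary across
# conditions in each simulated "study".
n_sims = 10_000
simulated_spreads = np.empty(n_sims)
for i in range(n_sims):
    sample_sds = [np.std(rng.normal(0.0, pooled_sd, n), ddof=1)
                  for _ in reported_sds]
    simulated_spreads[i] = np.std(sample_sds)

# Share of honest simulations whose SDs are at least as similar as the
# reported ones; a tiny value is a red flag, not proof of fraud.
p_value = np.mean(simulated_spreads <= observed_spread)
print(f"Pr(spread <= observed | honest sampling) = {p_value:.4f}")
```

Even a check like this only flags anomalies for further scrutiny; it cannot by itself distinguish fabrication from, say, an unreported data-cleaning step.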

For good reasons, we expect administrations to keep their hands off the day-to-day conduct of intellectual work. I would certainly bristle at the prospect of prior review of my work. But cases like these (and the others discussed on OrgTheory) show that there are significant repercussions to academic fraud: other scholars and students who weren’t involved in the fraud get swept up in the consequences, and future work in cumulative sciences may be sent on the wrong path by false findings earlier on. Any thoughts on how these could be prevented without undue prior restraint on intellectual work?

Author: andrewperrin

University of North Carolina, Chapel Hill

9 thoughts on “could we prevent some scientific retractions?”

  1. There’s falsified data and there’s mistaken data. Falsified data can be detected because real data are messier than fake data. Actually making up data is a terrible sin, and I do think it needs to be detected and punished. But mistaken data is, I suspect, a more pervasive problem.

    Several times in my own apparently never-ending research project I have found a mistake in my own data, sometimes arising from a programming mistake like an error in an interpolation formula, sometimes arising from an error in a data source I obtained from someone else, including official government sources. There are at least three major points in my data journal that contain the entry: “All results prior to this date are based on bad data and need to be verified.” It is only because of my own weakness as a scholar that the work based on the bad data never reached publication and, hence, did not need to be retracted. But as I am more anal than most people (although admittedly not as anal as the real masters of data-handling), I find it hard to believe that there isn’t a significant fraction of false findings due more to sloppiness or inevitable human weakness than to conscious fraud. But then there is the question of whether error is not perhaps just as bad as fraud, since it produces the same problem of corrupting the pursuit of knowledge. I believe this issue was discussed in Sayers’s Gaudy Night, as well as, I’m sure, in more esoteric sources.


    1. And we have no real standard for how much error is acceptable. So, for instance, if you have a research assistant do data entry, it is inevitable that a number here or there will get transposed. At what threshold does this mean that the data should be thrown out?


      1. All data sets have errors. You don’t throw data out. BUT if people were rigorously checking, there would need to be a lot of errata published as people re-run tables and check results after finding errors. That we don’t do this is sobering. Think about how often a critical comment on an article, typically launched for some theoretical reason, includes some correction of data as part of the comment.


    1. I don’t know, what would you prefer? Doppelgaenger? Superego? Annoying older sibling? (I think Evil Twin is taken since OrgTheory calls somebody else that.)

      As for intermarriage: I have no objections!


  2. Data and code have to be made available for reanalysis, but that’s just a start. If NSF won’t fund it, maybe ASA should have a replication budget. You could select the papers at random. Another idea: random lie detector tests of a small % of researchers.

