flawed science moves good science

I was fortunate to attend a talk by an economist visiting our campus last week and, during lunch, she mentioned the embarrassment that the Reinhart and Rogoff scandal caused the economics profession, including being flogged by Stephen Colbert. I then explained the embarrassment in our fair discipline, the Regnerus affair, of which she had not heard (which, itself, made me very happy). I realize that many might be losing an appetite for this topic, but I think that juxtaposing these two episodes shows some fairly sharp contrasts and lessons for academic work more generally.

Both, I believe, point to fundamental problems in our publication systems. Equally important, however, I submit that sociology’s handling of the Regnerus affair actually conveys a relatively healthy response that, through the subsequent devastating critiques, produced important knowledge. I also submit that the publication of Regnerus’s paper led to this outcome far more quickly than what happened in response to the Reinhart and Rogoff scandal.

Social Science Research published Regnerus’s paper, with all of its substantial flaws, through the normal peer-review process, which had substantial flaws of its own. Then what happened? A flurry of intellectual, political, and legal writing on the subject followed almost instantly. Regnerus made his data available. Neal used these data to build a publicly accessible, open-source reanalysis repository, which then became a published critique of Regnerus’s original study. Science moved very fast and showed the flaws of the initial results.1 Any author hoping to cite Regnerus’s findings in the future should now also cite the scientific response.

The process also led to substantial introspection about our discipline’s norms. The journal itself devoted considerable space to its own policies and shortcomings (and the description of the publication process provides an invaluable resource for graduate students who want to learn about reviews, writing, and editing). We had more discussions than I can enumerate, ranging, even this week, from retraction to conflicts of interest to the role of outside groups in scientific research.

The Reinhart & Rogoff paper, by comparison, was not subjected to peer review. Instead, they posted the paper as a working paper at NBER, an insider’s club of economists. It appears that the data were not made public. Upon release, a new paper (itself released as a working paper without peer review) found that the original included an Excel (!) formula error and questionable analytical decisions (all with precedent in the literature, however). An additional analysis showed that Reinhart & Rogoff probably got the causal order wrong. The time lapse between the publication of the original and the revision in economics, without peer review: almost three years (a year and a half if you count the last revision). And it’s not as if their paper didn’t matter: world economies were put in precarious situations partially on the basis of their results.

I think that science worked in both cases. I am proud that sociology made quicker work of debunking incorrect findings compared to economics, and think that the institutional structure of a journal helped sociology move more quickly (NB: I am not proud that bad research got published in the first place). In both cases, however, our intellectual curiosity and academic freedom to pursue new research (and shared data) allowed subsequent research to correct the scientific record.

That said, we must be extra careful when our results affect the lives of our fellow humans through Supreme Court cases or central bank policies. We should strive to publish on topics important enough for people to read our work and to do so in a language that they can understand. Arguing that no one reads or cares about our work is a poor defense and an indictment of much of our own research. We must always act as if our research matters (because it does), and in both cases I fear that our disciplines became too cavalier about publishing results without double- and triple-checking our work when the stakes for others’ lives were high.

Science always progresses haltingly. Progress comes from false starts, challenges to received wisdom, and questioned assumptions. If we expect peer review to be perfect every time, or expect that no bad research will ever be published, we hold ourselves to an impossible standard, which is a danger in itself. Worse, by demanding that everything be flawless, we would stifle innovative research and risk becoming beholden to received wisdom. That, I believe, is a greater danger.


  1. I note that the Perrin, Cohen, and Caren piece went from submitted to accepted in eight and a half weeks, two of which included Thanksgiving and Christmas/New Year’s. This seems like a comparable timeline to the Regnerus publication, but I do not know the normal publication time at Journal of Gay & Lesbian Mental Health, so it might be that this journal is just extremely prompt. I would value any feedback from the authors about their experience publishing their piece, especially since two of them are contributors here.

6 Comments

  1. Posted April 30, 2013 at 1:44 pm | Permalink

    Science may have progressed as a result of these scandals, but in my reading, the immediate damage was sufficiently bad in both cases to count this one as a loss for both disciplines.

    To the extent that Reinhart and Rogoff’s analysis influenced policymaking in 2010-2012, and that influence was based in part on the 90% threshold story founded on data errors and a questionable weighting scheme that went obscured in the published version, economics did the world a disservice. And this is despite the immediate flaws around the arguably much more important issue of reverse causality pointed out by Krugman et al. in 2010-2011, which somehow failed to invalidate the R+R claims the way that a simple (but largely irrelevant) coding error did.

    To the extent that the Regnerus finding allows conservative justices on the Court to read a rational basis into the motives of California voters in Hollingsworth v. Perry, sociology will have failed as well. The discipline responded incredibly quickly, by our standards, and yet still not quickly enough to prevent the study’s finding from being politically useful.

    I’m not entirely sure where I’m going with this – I’ve been racking my brain for something comparative about the two cases to say for the past few days, and all I can come up with is that the timescale of academia interfaces poorly with the timescale of policymaking, and in particular, this mismatch allows policymakers to selectively seize an outlier finding for political use before academia can appropriately respond. Retraction becomes potentially more important – disputed findings are probably more useful than retracted ones – and perhaps we should create some new category of “challenged” findings that are considered retracted until some kind of post-publication review occurs. I don’t know, but I’m just unsatisfied with any kind of “the system worked!” narrative around either of these cases.

    • Posted April 30, 2013 at 2:10 pm | Permalink

      Dan, thank you for your thoughtful reply. Policy makers justified policies on the basis of poor scientific results. There seem to me two counterfactuals at play here.

      First, would central bankers, or Paul Ryan for that matter, have developed different economic policies in the absence of R-R? Would Scalia have decided differently on the case? No and no.

      Second, what system would be better than self-correction through further results? I cannot think of one. What research becomes eligible for “challenged” status? Even the idea of “challenged” findings expressly invokes peer review: what is further scientific research other than an attempt to do some kind of “post-publication review”? It seems to me that any kind of review that remains outside of the peer review process is open to far more political meddling than further scientific review.

      I cannot think of a process that adjudicates these disputes better, though this might reflect my own lack of creativity more than a true state of affairs.

      I am not celebrating what happened. We just expect more from the system than it can provide.

      • Posted April 30, 2013 at 2:20 pm | Permalink

        In re: your first point – It’s incredibly difficult to figure out the impact of academic debates on policymaking (though there are lots of admirable attempts to do so, and useful ways of thinking about the problem, across different literatures). That said, I think there is at least a chance in these cases that the research in question “mattered,” in part due to the way the findings circulated – and in the Regnerus case, the way they were funded. Someone somewhere thought they needed something saying just X, and along it came. Maybe Scalia wouldn’t have voted any differently, but Scalia isn’t the swing justice. Maybe Ryan wouldn’t have proposed a different plan, but Ryan isn’t the median congresscritter. R+R and R were useful to the radical flanks in mobilizing support from more moderate actors. How useful were they? That’s very hard to say.

        In re: your second point – I’m not really sure either. I haven’t got anything better right now. But I think we ought to take these examples as a moment to explore those alternatives, and not take the current peer review system as some kind of perfectly functioning, perfectly evolved institution.

        It’s worth noting again that R+R was not peer-reviewed, nor would it have ever passed peer-review muster (I think / hope). But it circulated as if peer-reviewed because its authors were prominent Harvard economists and because they were willing to oversell their own results in public fora, while referring back to a paper that seemed quite legitimate (published in the AER!).

      • Posted April 30, 2013 at 3:00 pm | Permalink

        Excellent points, all.

        “But I think we ought to take these examples as a moment to explore those alternatives, and not take the current peer review system as some kind of perfectly functioning, perfectly evolved institution.”

        Let me turn this around: then we should not expect the peer review process to produce perfectly evolved research given that it is not a perfectly functioning, perfectly evolved institution. Calling for retraction, to me, sets up the expectation that the peer review process should perfectly function and catch all errors.

        I only wish to argue that the process worked as it was designed to do. By all means, we should develop something that improves the current process. Many offer open review as a repair to the current system, but as you and I both point out, the case of “open review” with an NBER paper failed even more than the traditional peer-reviewed system (at least in terms of correcting the record).

  2. Posted April 30, 2013 at 2:00 pm | Permalink

    Interesting approach, and I do think the comparison is worthwhile, though like Dan I am less positive on the outcomes of each.

    As for the speed of our publication: The three of us had been collecting material and writing the paper piecemeal essentially since the Regnerus paper first emerged. At some point JGLMH sent around a request for papers on aspects of the Regnerus paper, and we decided that would be a good opportunity to have our paper thoroughly and seriously peer reviewed and published quickly. (One of the advantages of JGLMH is that they publish a PDF essentially as soon as it’s accepted, so it’s available quickly.) We did not suggest reviewers. The paper received a “revise and resubmit,” which we did, and was then accepted on the second round.

    • Posted April 30, 2013 at 2:13 pm | Permalink

      Thank you, Andy! I really appreciate your willingness to share the process and, more importantly, the work that the three of you did in the paper.
