A few years ago, economists Carmen Reinhart and Kenneth Rogoff wrote a paper on national debt, in which they found little relationship between debt and economic growth as long as debt was below 90% of GDP, but lower growth above that level. The paper was widely cited as support for economic strategies of 'austerity'.
Some economists at the University of Massachusetts attempted to repeat their analysis and didn't get the same result. Reinhart and Rogoff sent them the data and spreadsheets they had used, and it turned out that the analysis they had done didn't quite match the description in the paper. Part of the discrepancy was an error in an Excel formula that accidentally excluded several countries, but Reinhart and Rogoff had also deliberately excluded some countries and time periods with high growth and high debt (including Australia and NZ immediately post-WWII), and had given each country the same weight in the analysis regardless of the number of years of data it contributed. (paper — currently slow to load, summary by Mike Konczal)
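To see why the equal-weight-per-country choice matters, here is a minimal sketch in Python with invented numbers (the country names and growth figures are hypothetical, not the actual dataset): a country contributing a single bad high-debt year counts as much as one contributing many ordinary years, so the two averaging schemes can even disagree in sign.

```python
# Hypothetical sketch of the weighting issue; the numbers are invented, not the actual data.
# growth[country] = annual growth rates (%) for years in which debt exceeded 90% of GDP.
growth = {
    "country_X": [-7.0],                           # a single, very bad year
    "country_Y": [2.0, 2.5, 3.0, 2.5, 2.0, 3.0],   # six ordinary years
}

# Equal weight per country: each country's own mean counts once, however many years it has.
country_means = [sum(v) / len(v) for v in growth.values()]
equal_weight = sum(country_means) / len(country_means)

# Equal weight per country-year: pool every annual observation.
pooled = [g for v in growth.values() for g in v]
pooled_mean = sum(pooled) / len(pooled)

print(f"equal weight per country:      {equal_weight:+.2f}%")  # -2.25%: growth looks negative
print(f"equal weight per country-year: {pooled_mean:+.2f}%")   # +1.14%: growth looks positive
```

Neither weighting is automatically wrong, but the choice should be stated and defended, not discovered later by people trying to reproduce the result.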
Some points:
- The ease of making this sort of error in Excel is exactly why a lot of statisticians don't like Excel (despite its other virtues), so that part of the story has received a lot of publicity.
- Reinhart and Rogoff point out that they only claimed to find an association, not a causal relationship, but they certainly knew how the paper was being used, and if they didn't think it provided evidence of a causal relationship they should have said so a lot earlier. (I think Dan Davies on Twitter put it best)
- Chris Clary, a PhD student at MIT, points out that the first author (Thomas Herndon) of the paper demonstrating the failure to replicate is also a grad student, and notes that replication is a job often left to grad students.
- The Reinhart and Rogoff paper wasn't the primary motivation for, say, the UK Conservative Party to want to cut taxes and government spending: the Conservatives have always wanted to do that, and it's a significant part of their basic platform. The paper, at most, provided a bit of extra intellectual cover.
- The fact that the researchers handed over their spreadsheet pretty much proves they weren't being deliberately deceptive — but it's a lot easier to convince yourself to spend time checking all the details of a calculation when you don't like the answer than when you do.
Roger Peng, at Johns Hopkins, has also written about this incident. It would, in various ways, have been tactless for him to point out some relevant history, so I will.
The Johns Hopkins air pollution research group conducted the largest and most comprehensive study of the health effects of particulate air pollution, looking at deaths and hospital admissions in the 90 largest US cities. This was a significant part of the evidence used in setting new, stricter air pollution standards — an important and politically sensitive topic, though a few orders of magnitude less so than austerity economics. One of Roger's early jobs at Johns Hopkins was to set up a system that made it easy for anyone to download their data and reproduce or vary their analyses. The size of the data and the complexity of some of the analyses meant just emailing a spreadsheet to people was not even close to adequate.
Their research group became obsessive (in a good way) about reproducibility long before other researchers in epidemiology. One likely reason was a traumatic experience in 2002, when they realised that the default settings of the software they were using had led to incorrect results in a lot of their published air pollution time series analyses. They reported the problem to the EPA and their sponsors, fixed it, and reran all the analyses in a couple of weeks; the qualitative conclusions fortunately did not change. You could make all sorts of comparisons with the economists' error, but that is left as an exercise for the reader.