August 30, 2012

Conclusions of difference require evidence of difference

One of the problems in medical research, exacerbated by the new ability to measure millions of genetic variables at once, is that you can always divide people into sensible subgroups.

If your treatment doesn’t work, or your hated junk food isn’t related to cancer, overall, you can see if the relationship is there in men or in women. Or in younger or older people. Or in Portuguese-speaking bassoonists. The more you chop up the data, the more likely you are to find some group where there’s a difference. You can then focus on that group in your results.

To combat this tendency, my Seattle colleague Noel Weiss has been promoting the slogan “conclusions of difference require evidence of difference”.  That is, if you want to report that cupcakes cause cancer in men but not in women, you need evidence that the relationship is different in men and in women.  Finding supportive evidence in men but not finding it in women isn’t enough: that’s not evidence of a difference.  Needing evidence of a difference is especially important when you wouldn’t expect a difference.  We expect most things to have basically similar effects in men and women, and where the effects are different there’s usually an obvious reason.

All this is leading up to a story in the Herald, where a group of genetics researchers claim that a well-studied variant in a gene called monoamine oxidase increases happiness in women, but not in men. We know this is surprising, because the researchers said so — they were expecting a decrease in happiness, and they don’t seem to have been expecting a male:female difference. The researchers say that the difference could be because of testosterone — and of course it could be, but they don’t present any evidence at all that it is.

Anyway, as you will be expecting by now, I found the paper (the Herald gets points for giving the journal name), and it is possible to do a simple test for a difference in the ‘happiness’ effect between men and women. And there isn’t much evidence for a difference. For people who collect p-values: about 0.09 (a Bayesian would reach a similar conclusion after a lot more work). So, unless we already expected a benefit in women and no effect in men, the data don’t give us much encouragement for believing that pattern.
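The simple test alluded to here is standard: given the estimated effect in each subgroup and its standard error, compare the estimates directly rather than comparing significance verdicts. A minimal sketch, using hypothetical numbers (not taken from the paper), is:

```python
import math

def difference_test(b1, se1, b2, se2):
    """Two-sided z-test for a difference between two independent
    subgroup effect estimates (b1, b2) with standard errors (se1, se2)."""
    se_diff = math.sqrt(se1**2 + se2**2)      # SE of the difference
    z = (b1 - b2) / se_diff                   # standardised difference
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal p-value
    return z, p

# Hypothetical effect estimates for illustration only:
# b_w = effect in women, b_m = effect in men, with standard errors.
z, p = difference_test(b1=0.30, se1=0.10, b2=0.05, se2=0.11)
print(f"z = {z:.2f}, p = {p:.3f}")            # p is around 0.09
```

The point of the calculation: an estimate that is “significant” in one group and not the other can still be entirely compatible with the two effects being identical, which is exactly the situation where a conclusion of difference is unearned.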

Testing for differences isn’t the ideal solution — even better would be to fit a model that allows for smooth variation between a constant effect and separate effects — but testing for differences is a good precursor to putting out a press release about differences and trying for headlines all over the world. We can’t expect newspapers to weed this sort of thing out if scientists are encouraging it via press releases.

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.