February 13, 2016

Neanderthal DNA: how could they tell?

As I said in August:

“How would you even study that?” is an excellent question to ask when you see a surprising statistic in the media. Often the answer is “they didn’t,” but sometimes you get to find out about some really clever research technique.

There are stories around, such as the one in Stuff, about modern disease due to Neanderthal genes (press release).

The first-ever study directly comparing Neanderthal DNA to the human genome confirmed a wide range of health-related associations — from the psychiatric to the podiatric — that link modern humans to our broad-browed relatives.

It’s basically true, although as with most genetic studies the genetic effects are really, really small. There’s a genetic variant that doubles your risk of nicotine dependence, but only 1% of Europeans have it. The researchers estimate that Neanderthal genetic variants explain about 1% of depression and less than half a percent of cardiovascular disease. But that’s not zero, and it wasn’t so long ago that the idea of interbreeding was thought very unlikely.
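To see why a risk-doubling variant can still matter very little across a whole population, here is a back-of-envelope sketch. It uses Levin's population attributable fraction, a standard epidemiology formula that is not taken from the paper, and it assumes "doubles your risk" means a relative risk of 2 and that 1% of people carry the variant. The function and argument names are made up for the illustration.

```python
def attributable_fraction(carrier_freq, relative_risk):
    """Levin's population attributable fraction.

    carrier_freq:  proportion of the population carrying the variant
    relative_risk: risk in carriers divided by risk in non-carriers
    """
    excess = carrier_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Assumed inputs: 1% carriers, risk doubled in carriers.
paf = attributable_fraction(carrier_freq=0.01, relative_risk=2.0)
print(f"{paf:.2%}")  # prints 0.99%
```

Under these assumptions, a variant that doubles an individual's risk would account for just under 1% of cases in the population.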

Since hardly any Neanderthals have had their genome sequenced, how was this done? There are two parts to it: a big data part and a clever genetics part.

The clever genetics part (paper) uses the fact that Neanderthals and modern humans, whose ancestors had been separated for a long time (350,000 years), had accumulated lots of little, irrelevant differences in DNA as mutations, like a barcode sequence. Given a long enough snippet of genome, we can match it up either to the modern human barcode or the Neanderthal barcode. Neanderthals are recent enough (50,000 years is maybe 2500 generations) that many of the snippets of Neanderthal genome we inherit are long enough to match up the barcodes reliably. The researchers looked at genome sequences from the 1000 Genomes Project and found genetic variants existing today that are part of genome snippets which appear Neanderthal. These genetic variants are what they looked at.
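The barcode idea can be shown with a toy example. This is only a sketch of the matching intuition, not the method used in the paper, which is far more sophisticated. The two reference strings, the `classify` function, and the `min_sites` threshold are all invented for illustration. Each string lists the allele at a handful of sites where the two lineages are assumed to differ.

```python
# Made-up alleles at 12 sites where the two lineages are assumed to differ.
MODERN = "ACGTTGCAAGCT"
NEANDERTHAL = "GTACCATGGATC"


def classify(snippet, start, min_sites=5):
    """Label a snippet by which reference 'barcode' it matches at more sites.

    snippet:   alleles observed at consecutive informative sites
    start:     index of the first site the snippet covers
    min_sites: snippets covering fewer sites are not called at all
    """
    n = len(snippet)
    if n < min_sites:
        return "too short to call"
    modern = sum(a == b for a, b in zip(snippet, MODERN[start:start + n]))
    neand = sum(a == b for a, b in zip(snippet, NEANDERTHAL[start:start + n]))
    if modern == neand:
        return "ambiguous"
    return "Neanderthal-like" if neand > modern else "modern-like"


print(classify("GTACCATG", 0))  # prints Neanderthal-like
print(classify("ACGTTGCA", 0))  # prints modern-like
print(classify("GTA", 0))       # prints too short to call
```

The threshold reflects the point about snippet length: a short stretch covers too few informative sites to say which barcode it came from.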

The Big Data is a collection of medical records at nine major hospitals in the US, together with DNA samples. This is nothing like a random sample, and the disease data are from ICD-9 diagnostic codes rather than detailed medical record review, but quantity helps.

Using the DNA samples, they can see which people have each of the Neanderthal-looking genetic variants and what diseases these people have, and then find the very small differences.
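In its simplest form, that comparison is just the rate of a diagnosis in carriers versus non-carriers. The sketch below uses entirely made-up counts and a hypothetical `risk_by_carrier` helper. A real analysis would adjust for things like age, sex, and ancestry, so this is not the researchers' actual procedure.

```python
def risk_by_carrier(records):
    """Proportion diagnosed among carriers and among non-carriers.

    records: iterable of (carries_variant, has_diagnosis) booleans
    """
    counts = {True: [0, 0], False: [0, 0]}  # carrier status -> [cases, total]
    for carrier, diagnosed in records:
        counts[carrier][0] += int(diagnosed)
        counts[carrier][1] += 1
    return {status: cases / total for status, (cases, total) in counts.items()}


# Made-up data: 1,000 carriers with 110 cases, 9,000 non-carriers with 900 cases.
records = (
    [(True, True)] * 110 + [(True, False)] * 890
    + [(False, True)] * 900 + [(False, False)] * 8100
)

risks = risk_by_carrier(records)
risk_ratio = risks[True] / risks[False]
print(f"carriers {risks[True]:.1%}, non-carriers {risks[False]:.1%}")
print(f"risk ratio {risk_ratio:.2f}")
```

With these invented numbers the difference is 11% versus 10%, which is the sort of small gap that only shows up clearly when there are a lot of records.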

This isn’t really medical research. The lead researcher quoted in the news is an evolutionary geneticist, and the real story is genetics: even though the Neanderthals vanished 50,000 years ago, we can still see enough of their genome to learn new things about how they were different from us.

 


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.