
Monday, March 10, 2014

Unusually strong positive selection over the past 5,000 years, rather than population replacement or even admixture, is responsible for the high frequencies of light skin, hair and eyes among present-day Eastern Europeans, according to a new paper by Wilde et al. in PNAS.

The authors were able to infer pigmentation traits from ancient DNA for 63 Eneolithic and Bronze Age samples, mostly from Kurgan mounds from the Pontic-Caspian steppe of Ukraine and surrounds. The results suggest that the ancient individuals were overall much darker than present-day Ukrainians, who, nevertheless, appear to be their direct descendants based on mitochondrial DNA (mtDNA) sequences. Quoting the paper:

To this end we compared the 60 mtDNA HVR1 sequences obtained from our ancient sample to 246 homologous modern sequences (29–31) from the same geographic region and found low genetic differentiation (FST = 0.00551; P = 0.0663) (32). Coalescent simulations based on the mtDNA data, accommodating uncertainty in the ancient sample age, failed to reject population continuity under a wide range of assumed ancestral population size combinations (Fig. 1).

Conversely, continuity between early central European farmers and modern Europeans has been rejected in a previous study (33). However, the Eneolithic and Bronze Age sequences presented here are ∼500–2,000 y younger than the early Neolithic and belong to lineages identified both in early farmers and late hunter–gatherers from central Europe (33).

...

In sum, a combination of selective pressures associated with living in northern latitudes, the adoption of an agriculturalist diet, and assortative mating may sufficiently explain the observed change from a darker phenotype during the Eneolithic/Early Bronze age to a generally lighter one in modern Eastern Europeans, although other selective factors cannot be discounted. The selection coefficients inferred directly from serially sampled data at these pigmentation loci range from 2 to 10% and are among the strongest signals of recent selection in humans.
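To get a feel for what selection coefficients of 2 to 10% imply over roughly 5,000 years, here's a minimal Python sketch. It is not the method used in the paper, which infers the coefficients from serially sampled data. It's just the textbook deterministic model of genic selection, in which the odds of the favoured allele grow by a factor of (1 + s) each generation. The 25-year generation time and the starting frequency of 0.2 are my assumptions, picked purely for illustration.

```python
def allele_frequency_after(p0, s, generations):
    """Frequency of a favoured allele after some generations of genic selection.

    Deterministic textbook model (no drift): every generation the odds
    p / (1 - p) of the favoured allele are multiplied by (1 + s).
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must be strictly between 0 and 1")
    odds = p0 / (1.0 - p0) * (1.0 + s) ** generations
    return odds / (1.0 + odds)


YEARS = 5000               # time span discussed in the post
YEARS_PER_GENERATION = 25  # assumption: human generation time
START_FREQUENCY = 0.2      # assumption: hypothetical starting allele frequency

generations = YEARS // YEARS_PER_GENERATION

# s = 0 is a no-selection baseline; 0.02 and 0.10 are the ends of the quoted range.
for s in (0.0, 0.02, 0.10):
    p = allele_frequency_after(START_FREQUENCY, s, generations)
    print(f"s = {s:.2f}: frequency after {generations} generations = {p:.3f}")
```

Under these assumptions, even the low end of the quoted range is enough to push a minority allele to a clear majority within the time span, which gives some idea of why coefficients of this size stand out.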

Well, either this is indeed a remarkable finding, or something's not quite right. I think it's the latter.

The argument for genetic continuity from the Eneolithic/Bronze Age to the present on the Pontic-Caspian steppe based on mtDNA sequences is actually very weak. The results could simply mean that the ancient samples shared deep maternal ancestry with modern Ukrainians and most other Europeans.

Indeed, we know for a fact that much of the Pontic-Caspian steppe was occupied by Turkic groups of Asian origin from the early Middle Ages until only a couple of hundred years ago. They were eventually cleared out by Tsarist Russia, and mainly replaced by East Slavic settlers from just northwest of the steppe. This process might not be easy to see by comparing low-resolution mtDNA data, even between European populations separated by 5,000 years, but it's likely to be obvious when looking at full mtDNA genomes, high-density genome-wide data, and/or Y-chromosome haplogroups.

Surprisingly, the article doesn't mention Keyser et al. 2009, a very important study which showed that a sample of Kurgan nomads from Bronze and Iron Age South Siberia had frequencies of light hair and eyes comparable to those of present-day Northern and Eastern Europeans (see here). Also worth noting is that the most common Y-chromosome haplogroup among these individuals was R1a, which is today the most frequent haplogroup in Eastern Europe, including Ukraine.

What this suggests to me is that the Kurgan cultural horizon was not genetically homogeneous. I suspect that Kurgan groups closer to the Balkans carried significantly higher levels of Near Eastern Neolithic farmer ancestry, and were thus much darker than those in the more temperate northerly regions. However, it seems that at some point, the Neolithic farmer DNA was diluted enough by continuous movements of lightly pigmented groups from the north and east, possibly made up mostly of males, that there was a major shift in pigmentation traits from Near Eastern-like to North European-like across most of Eastern Europe. This scenario actually fits very nicely with the latest on the genetic origins of Europeans (see here).

We won't know what really happened until we see at least a few complete ancient genomes from Eastern Europe. But for now, I'd have to suspend my disbelief to accept that present-day Eastern Europeans are, by and large, descendants of these exceedingly brunet prehistoric people of the Pontic-Caspian steppe.

Saturday, March 8, 2014

Studies of ancient genomes usually feature unsupervised analyses with the ADMIXTURE software. These are very informative, but only if interpreted in the right context and with caution, because they attempt to fit the ancient samples, often thousands of years old, into ancestral clusters mostly derived from present-day populations. That's like putting the cart before the horse.

So I thought I'd try a different approach, in the hope of achieving more straightforward results, and run ADMIXTURE in supervised mode, with the 24,000-year-old MA-1 or Mal'ta boy genome from South Siberia as one of the reference samples. After a lot of tweaking of the dataset, the experiment seems to have worked, because the cluster created from the ancient genome is basically identical to the MA-1-derived Ancient North Eurasian (ANE) component recently described in the Lazaridis et al. preprint.
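For anyone wanting to try something similar, here's a rough Python sketch of how a supervised run can be set up. As I understand the ADMIXTURE manual, supervised mode is switched on with the `--supervised` flag and expects a `.pop` file with the same prefix as the input, holding one label per individual in the same order as the PLINK `.fam` file, with `-` for individuals whose ancestry is to be estimated. The sample IDs, the `dataset.bed` file name and both helper functions are hypothetical, and the toy example uses only two reference clusters rather than the four in my run.

```python
def build_pop_labels(fam_lines, reference_labels):
    """Return one .pop label per individual, in .fam order.

    reference_labels maps an individual ID (second .fam column) to the name
    of its reference cluster; everyone else gets "-" (ancestry unknown).
    """
    labels = []
    for line in fam_lines:
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) < 2:
            raise ValueError(f"malformed .fam line: {line!r}")
        labels.append(reference_labels.get(fields[1], "-"))
    return labels


def supervised_command(bed_path, labels):
    """Build the ADMIXTURE command line, with K set to the number of reference clusters."""
    k = len(set(labels) - {"-"})
    return ["admixture", "--supervised", bed_path, str(k)]


# Hypothetical .fam rows: family ID, individual ID, father, mother, sex, phenotype.
fam_lines = [
    "MA1 MA1 0 0 1 -9",
    "Han Han_01 0 0 2 -9",
    "Karitiana Karitiana_01 0 0 1 -9",
]
# Hypothetical mapping of reference samples to cluster labels.
reference_labels = {"MA1": "ANE", "Han_01": "ENA"}

labels = build_pop_labels(fam_lines, reference_labels)
print("\n".join(labels))  # these lines would be written to dataset.pop
print(" ".join(supervised_command("dataset.bed", labels)))
```

The command is only printed here, not executed, so the sketch runs without ADMIXTURE installed.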

Note also that the ANE in my analysis peaks among the Karitiana Indians at around 43%. This is very much in line with a TreeMix graph in Raghavan et al., which shows a Karitiana individual with 41.6% (plus or minus 3.4%) admixture from a clade ancestral to MA-1 (see image here).

Nevertheless, there are clearly some issues with this test. For instance, many South Asians show unexpectedly high levels of Sub-Saharan African admixture (in particular, the Austroasiatic samples from India score around 6-7%, which has never been reported before). I'd say this is because they carry genetic variation indigenous to South Asia that doesn't fit well into any of the four ancestral components. The Eastern non-African (ENA) cluster, based on Han Chinese samples, captures most of this diversity, but some of it appears to be siphoned off into the other three clusters. I think the only way to really solve this problem is to include pre-Neolithic genomes from South Asia in the analysis.

By the way, I used 53K SNPs covered at a read depth of 2x or more, but varying the minimum read depth from 1x to 3x doesn't change the results very much.
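The read depth cut-off itself is simple to apply. Here's a minimal Python sketch, assuming the per-SNP read depths for the ancient genome are already available as a dictionary. The SNP IDs and depths are made up for illustration, and the function name is hypothetical.

```python
def snps_at_min_depth(depth_by_snp, min_depth):
    """Return the IDs of SNPs covered by at least min_depth reads, sorted by ID."""
    return sorted(snp for snp, depth in depth_by_snp.items() if depth >= min_depth)


# Hypothetical per-SNP read depths for the ancient genome.
depth_by_snp = {"rs0001": 1, "rs0002": 2, "rs0003": 3, "rs0004": 0, "rs0005": 5}

# Compare the three thresholds mentioned above.
for min_depth in (1, 2, 3):
    kept = snps_at_min_depth(depth_by_snp, min_depth)
    print(f"depth >= {min_depth}x: {len(kept)} SNPs kept")
```

A stricter threshold leaves fewer but more reliable markers, so rerunning the analysis on each subset is a quick way to check that the results don't hinge on low-coverage calls.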