Tuesday, November 25, 2014

This Mezzavilla et al. paper is currently up for public comment at bioRxiv. My comment is that we really need ancient genomes to be able to answer the sorts of questions that the authors of this paper are trying to answer. Nevertheless, it's an interesting read.

Background: The ancient Silk Road has been a trading route between Europe and Central Asia from the 2nd century BCE to the 15th century CE. While most populations on this route have been characterized, the genetic background of others remains poorly understood, and little is known about past migration patterns. The scientific expedition "Marco Polo" has recently collected genetic and phenotypic data in six regions (Georgia, Armenia, Azerbaijan, Uzbekistan, Kazakhstan, Tajikistan) along the Silk Road to study the genetics of a number of phenotypes.

Results: We characterized the genetic structure of these populations within a worldwide context. We observed a West-East subdivision albeit the existence of a genetic component shared within Central Asia and nearby populations from Europe and Near East. We observed a contribution of up to 50% from Europe and Asia to most of the populations that have been analyzed. The contribution from Asia dates back to ~25 generations and is limited to the Eastern Silk Road. Time and direction of this contribution are consistent with the Mongolian expansion era.

Conclusions: We clarified the genetic structure of six populations from Central Asia and suggested a complex pattern of gene flow among them. We provided a map of migration events in time and space and we quantified exchanges among populations. Altogether these novel findings will support the future studies aimed at understanding the genetics of the phenotypes that have been collected during the Marco Polo campaign, they will provide insights into the history of these populations, and they will be useful to reconstruct the developments and events that have shaped modern Eurasians genomes.
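
As a quick sanity check on the "~25 generations" figure, here's a minimal sketch that converts generations into an approximate calendar year. The generation times (25 to 30 years) and the 2014 sampling year are my assumptions for illustration; the abstract doesn't state which values the authors used.

```python
def admixture_year(generations, sampling_year=2014, years_per_generation=29):
    """Convert an admixture age in generations to an approximate year CE.

    sampling_year and years_per_generation are assumed values,
    not figures taken from the paper.
    """
    return sampling_year - generations * years_per_generation


# Try a few plausible generation times for the ~25 generation estimate.
for gen_time in (25, 29, 30):
    print(gen_time, "years/generation ->", admixture_year(25, years_per_generation=gen_time), "CE")
```

Under these assumptions the estimate lands somewhere between the mid 13th and late 14th century CE, which overlaps with the Mongol expansion of the 13th century and its aftermath.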

Tuesday, November 18, 2014

Many of us are waiting impatiently for the new manuscript from the Reich Lab on the genetic shifts in Central and Eastern Europe during the late Neolithic/early Bronze Age, which will apparently include genome-wide data from Bell Beaker, Corded Ware and Yamnaya remains (see here). Rumor has it that it'll appear at bioRxiv within a few weeks.

In the meantime, it might be useful to check out this review paper by Guido Brandt et al. on the present state of play in European paleogenetics.

It's a thorough summary of almost all ancient DNA results to date from Europe, and includes some very nice maps and other figures that look like updates on the stuff from Brandt et al. 2013 (see here). However, there are a couple of major problems with this paper that drag it down a few notches in my estimation.

Firstly, the authors leave open the possibility that Indo-European languages were introduced into Europe by early Neolithic farmers from Anatolia. Maybe they're trying to be diplomatic and humor those who won't let this failed hypothesis finally die, because otherwise I have no idea why they even considered it.

There are some very good reasons now why this is indeed a failed hypothesis. For one, linguistic evidence shows that all Indo-European languages in Europe include similar loans of non-Indo-European origin associated with farming, like the words for bean, carrot, hemp, oats and pea (for instance, see here).

These words were in all likelihood borrowed by the early Indo-Europeans from someone else as they spread out across Europe well after agriculture had been established throughout much of the continent. So who was this someone else? Probably the non-Indo-European descendants of the non-Indo-European early farmers from Anatolia.

Ancient DNA shows something similar. All ancient European genomes in a farming context sequenced to date from the Neolithic to the Copper Age are clearly distinct from present-day Indo-European speaking Europeans. But they very closely resemble present-day Sardinians, whose ancestors only became Indo-European speakers during the late Iron Age.

The other serious problem with this paper is the suggestion that present-day Northeast Europeans show the highest genome-wide affinity to Pitted Ware hunter-gatherers because the eastern Baltic acted as a refugium during the Last Glacial Maximum (LGM). It's on page 10 of the PDF.

This must be some sort of oversight, because I refuse to believe that the authors aren't aware of the fact that the eastern Baltic was covered in a big fuck off ice sheet during the LGM. Here's a map from Mangerud et al. 2004.

A much more plausible explanation for why present-day Northeast Europeans show the highest genome-wide affinity to Pitted Ware hunter-gatherers, and indeed all European hunter-gatherers for whom we have data, is that their ancestors were amongst the last people in Europe to take up farming and Christianity.

Thursday, November 13, 2014

First of all, here's a map with some basic info about these two Upper Paleolithic North Eurasian genomes. They're separated by less than 10,000 years and a couple thousand kilometers, so in theory they shouldn't be all that different.

Let's see what TreeMix has to say on the matter. Note that the graphs also include five other ancient genomes: Denisova, Altai Neanderthal, Loschbour, Stuttgart and BR2 (LBA_Hungary).

Admittedly, I'm still learning to use TreeMix. But with that in mind, I'd say the graphs above appear very reasonable, and show outcomes that generally fit with what I've seen elsewhere.

For instance, Denisova harbors something chimp-like that isn't shared with the Altai Neanderthal. This might be a signal of the introgression from an unidentified archaic hominin that has already been reported in scientific literature.

With regard to Kostenki14, the graphs back one of the main conclusions of Seguin-Orlando et al. (i.e., the people who first analyzed and published this genome), in that it appears basal to later Europeans. However, the last two graphs suggest that this basal ancestry is not the same thing as the Stuttgart-related Basal Eurasian component described in Lazaridis et al., even though, if I understand correctly, Seguin-Orlando et al. were saying that it is.

In fact, the basal stuff carried by Kostenki14 seems to be related to the greater part of Ust'-Ishim's genetic makeup. I say the greater part, because Ust'-Ishim also appears to harbor Papuan-like ancestry not shared with Kostenki14.

Is there anything I can do to make these graphs more informative? Perhaps add or take away some samples? Feel free to let me know in the comments below.

By the way, I downloaded the Kostenki14, LBA_Hungary and Ust'-Ishim genomes from Genetic Genealogy Tools. The rest of the samples came from the Reich Lab's Human Origins dataset, available here.
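
For anyone who wants to try something similar, here's a rough sketch of how allele counts can be written out in the plain-text layout that TreeMix reads: a header line of population names, then one line per SNP with a comma-separated pair of allele counts for each population, all gzipped. The population names and counts below are made up for illustration, and this isn't the exact pipeline I used.

```python
import gzip


def treemix_lines(populations, snp_counts):
    """Build TreeMix-style input text.

    populations: list of population names (header line).
    snp_counts: one entry per SNP, each a list of (allele1, allele2)
                count pairs in the same order as populations.
    """
    lines = [" ".join(populations)]
    for counts in snp_counts:
        if len(counts) != len(populations):
            raise ValueError("each SNP needs one count pair per population")
        lines.append(" ".join(f"{a1},{a2}" for a1, a2 in counts))
    return "\n".join(lines) + "\n"


def write_treemix_input(path, populations, snp_counts):
    """Write the text gzipped, since TreeMix expects a .gz input file."""
    with gzip.open(path, "wt") as handle:
        handle.write(treemix_lines(populations, snp_counts))


# Hypothetical toy data: three populations, two SNPs.
toy_pops = ["Chimp", "Mbuti", "Kostenki14"]
toy_counts = [
    [(2, 0), (10, 6), (1, 1)],
    [(0, 2), (4, 12), (0, 2)],
]
print(treemix_lines(toy_pops, toy_counts))
```

The resulting file can then be passed to TreeMix with its `-i` option. A single ancient genome only contributes two alleles per SNP at most, so its branch is bound to be noisy.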

Update 14/11/2014: After looking over the results above and reading the comments below, I made a few changes to the dataset and came up with a couple more graphs that I think are worth sharing. I'm quite certain now that the so-called Basal Eurasian ancestry carried by Stuttgart and Kostenki14 can't be lumped into a single component.

Thursday, November 6, 2014

At last, we have an ancient genome from pre-LGM Europe: Kostenki14 (K14) from the famous Kostenki Upper Paleolithic site in southern Russia. The paper, Seguin-Orlando et al. 2014, is locked away behind a paywall, but at least the supplementary materials are open access.

K14 is dated at 38,700-36,200 cal BP and belongs to Y-chromosome haplogroup C-M130, a basal and widespread paternal marker that has already been reported in three other ancient European genomes: La Braña-1 from Mesolithic Spain and NE5 and NE6 from Neolithic Hungary. It also belongs to mitochondrial (mtDNA) haplogroup U2.

The shared drift stats of the form f3(Mbuti;K14,X), where X is the test population, reveal that from among present-day Eurasians, this early European is most similar to Northeast Europeans, such as Lithuanians, Estonians and Belarusians, and some Western Europeans, like Basques and Orcadians (i.e., people from the Orkney Isles). This is also what we've seen from other indigenous European hunter-gatherer genomes sequenced to date.
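
To make the f3(Mbuti;K14,X) idea a bit more concrete, here's a minimal sketch of an outgroup f3 statistic computed from per-SNP allele frequencies. It's the plain textbook form, the average of (outgroup - A) x (outgroup - B) across SNPs, without the sample-size corrections or standard errors that real software reports. All of the frequencies are invented toy values, not data from the paper.

```python
def outgroup_f3(p_out, p_a, p_b):
    """Naive outgroup f3: mean over SNPs of (p_out - p_a) * (p_out - p_b).

    Higher values mean A and B share more drift since splitting
    from the outgroup. No bias correction or jackknife is applied.
    """
    if not p_out or not (len(p_out) == len(p_a) == len(p_b)):
        raise ValueError("need equal-length, non-empty frequency lists")
    total = sum((o - a) * (o - b) for o, a, b in zip(p_out, p_a, p_b))
    return total / len(p_out)


# Hypothetical allele frequencies at four SNPs.
MBUTI = [0.1, 0.2, 0.3, 0.4]     # outgroup
K14 = [0.5, 0.6, 0.1, 0.9]       # ancient sample
POP_NEAR = [0.5, 0.7, 0.1, 0.8]  # test population tracking K14 closely
POP_FAR = [0.2, 0.2, 0.3, 0.5]   # test population staying near the outgroup

for name, freqs in (("POP_NEAR", POP_NEAR), ("POP_FAR", POP_FAR)):
    print(name, outgroup_f3(MBUTI, K14, freqs))
```

In this toy case the test population whose frequencies track K14 gets the larger value, which is the sense in which the stat ranks present-day groups by similarity to the ancient sample.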

As far as Eurasians are concerned, Papuans and Melanesians are the most distinct from K14, somewhat paradoxically so, considering the ancient genome's Oceanian-like Y-haplogroup. The authors speculate that this might be because they carry ancestry from a very basal lineage that went its own way before the split between West Eurasians and East Asians. But I'm wondering whether this result can't simply be explained by the inflated Denisovan admixture among Oceanians (usually reported at around 5%)?

Indeed, there's no mention anywhere in the paper that K14 has Denisova ancestry. However, much like the recently published Ust'-Ishim genome, it shows significantly larger genomic tracts of Neanderthal origin than present-day Eurasians. The implication of this is obvious, and well covered elsewhere, so I won't go into it here.

Arguably the most controversial outcome of the study is that it shows K14 to be partly of Basal Eurasian origin. This is a highly divergent Eurasian clade first described in Lazaridis et al. (see here), and associated with Neolithic farmers. Seguin-Orlando et al. came to their conclusion via two sets of D-statistics and an ADMIXTURE run, which showed K14 to carry a component specific to the Middle East.
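
For readers who haven't met D-statistics before, here's a minimal sketch of the allele-frequency form of D(W,X;Y,Z). The numerator sums (w - x)(y - z) across SNPs and the denominator sums (w + x - 2wx)(y + z - 2yz). A value near zero is what you'd expect if W and X form a clade relative to Y and Z, while a clear departure from zero hints at extra shared ancestry. The frequencies below are toy values, I'm not reproducing the specific tests from the paper, and a real analysis needs a block jackknife to decide whether D differs significantly from zero.

```python
def d_statistic(w, x, y, z):
    """Naive D(W, X; Y, Z) from per-SNP allele frequencies.

    Positive values mean W shares more alleles with Y (or X with Z);
    negative values mean the opposite. No standard error is computed.
    """
    if not w or not (len(w) == len(x) == len(y) == len(z)):
        raise ValueError("need equal-length, non-empty frequency lists")
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(w, x, y, z))
    if den == 0:
        raise ValueError("no informative SNPs")
    return num / den


# Hypothetical allele frequencies at three SNPs.
W = [0.1, 0.5, 0.9]
X = [0.2, 0.5, 0.7]
Y = [0.3, 0.4, 0.8]
Z = [0.5, 0.4, 0.6]

print(d_statistic(W, X, Y, Z))
print(d_statistic(X, W, Y, Z))  # swapping W and X flips the sign
```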

If true, then this finding debunks one of the main premises in Lazaridis et al., which is that Basal Eurasian admixture first arrived in Europe from the Middle East with Neolithic farmers. However, it doesn't debunk this paper's model of the formation of the modern European gene pool. Basically, for that to happen we'd need the Basal Eurasian component to show up in pre-Neolithic samples from Western and Central Europe.

Nevertheless, David Reich (one of the co-authors of Lazaridis et al.) seemed so taken aback by the news that he suggested K14 might be contaminated. Or at least, he was reported to have made this suggestion (scroll down to the last paragraph here).

This is interesting because Reich is currently working on a paper that includes ancient genomes from the Samara Valley, which isn't too far away from the Kostenki site (see here). Judging by his reaction to K14's purported Basal Eurasian admixture, we can probably assume that the pre-Neolithic genomes he's analyzed from Russia don't show any signals of this type of ancestry.

In any case, the model devised by Seguin-Orlando et al., set out in the figure below, is actually very similar to the one in Lazaridis et al., with NEOL basically standing in for EEF (Early European Farmer) and MHG for WHG and SHG (Western European Hunter-Gatherer and Scandinavian Hunter-Gatherer, respectively).

However, the suggestion that the Yenisei Siberians carry MHG rather than ANE doesn't look right to me. Why would Siberians carry European rather than Siberian hunter-gatherer ancestry? I suspect the problem is that MHG is a composite of WHG and ANE (because, as we know, SHG are partly ANE). Thus, if the Yenisei Siberians do carry both ANE and WHG, because they might indeed harbor some ancient European admixture, then perhaps this is simply being classified as MHG? If so, then I suppose it's not technically wrong, but it does look confusing.