Thursday, April 27, 2017

Figuratively speaking, of course. The relevant paper is behind a paywall at Science here. But the supplementary info PDF is freely available here. The press release from the lab that did the research is here.

Abstract: The genomic changes underlying both early and late stages of horse domestication remain largely unknown. We examined the genomes of 14 early domestic horses from the Bronze and Iron Ages, dating to between ~4.1 and 2.3 thousand years before present. We find early domestication selection patterns supporting the neural crest hypothesis, which provides a unified developmental origin for common domestic traits. Within the past 2.3 thousand years, horses lost genetic diversity and archaic DNA tracts introgressed from a now-extinct lineage. They accumulated deleterious mutations later than expected under the cost-of-domestication hypothesis, probably because of breeding from limited numbers of stallions. We also reveal that Iron Age Scythian steppe nomads implemented breeding strategies involving no detectable inbreeding and selection for coat-color variation and robust forelimbs.
...
The 14 ancient genomes reported here have strong implications for the horse domestication process. First, it has recently been discovered that a now-extinct lineage of wild horses existed in the Arctic until at least ~5.2 ka and significantly contributed to the genetic makeup of present-day domesticates (14,15). The timing of the underlying admixture event(s) is, however, unknown. Using D statistics, we confirmed that this extinct lineage shared more derived polymorphisms with the Sintashta and especially Scythian horses than with present-day domesticates (Fig. 2B). The domestic horse lineage, thus, experienced a net loss of archaic introgressed tracts within the past ~2.3 ky.
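For anyone unfamiliar with the D statistics mentioned in the excerpt, here is a minimal sketch of the basic ABBA-BABA calculation. It is not the paper's pipeline: the function name, the toy frequencies and the population labels in the comments are all assumptions made for illustration, and it assumes that every site has already been polarised so that the outgroup carries the ancestral allele. A real analysis would also need a block jackknife to judge significance, which is left out here.

```python
def d_statistic(p1, p2, p3):
    """ABBA-BABA D statistic from derived-allele frequencies.

    p1, p2, p3 are equal-length lists of derived-allele frequencies
    (one value per SNP) for populations P1, P2 and P3. The outgroup is
    assumed to be fixed for the ancestral allele at every site, so it
    drops out of the formula (an assumption of this sketch).
    A positive value means P3 shares more derived alleles with P2
    than with P1.
    """
    abba = 0.0
    baba = 0.0
    for a, b, c in zip(p1, p2, p3):
        abba += (1.0 - a) * b * c   # P2 and P3 share the derived allele
        baba += a * (1.0 - b) * c   # P1 and P3 share the derived allele
    total = abba + baba
    if total == 0.0:
        raise ValueError("no informative sites")
    return (abba - baba) / total


# Made-up toy frequencies, NOT data from the paper.
# Hypothetical roles: P1 = present-day domesticates,
# P2 = an ancient (e.g. Scythian) group, P3 = the extinct lineage.
balanced = d_statistic([0.0, 1.0], [1.0, 0.0], [1.0, 1.0])
skewed = d_statistic([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])
```

In the toy data the first call has one ABBA site and one BABA site, so the two cancel out, while in the second call P3 only ever shares derived alleles with P2.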

Thursday, April 20, 2017

For a while now I've been hearing rumors that the Reich Lab was working on Late Bronze Age and Iron Age samples from Pakistan's Swat Valley for a new paper on the Indo-Europeanization of South Asia. This has now been confirmed officially in a newsletter released by Padova University. See here.
I'm betting they'll be modeled as well over 50% Steppe_EMBA or Yamnaya-related. In other words, similar to the Kalasha people of the Hindu Kush, but even more Yamnaya-like. Exciting times ahead.
The archaeological paper mentioned in the newsletter is available behind a paywall here. I skimmed through it and didn't really understand it. But the authors seem to agree with the general consensus that these samples represent some of the earliest Indo-Aryan speakers in South Asia; likely descendants of recent migrants from the Central Asian steppes.

Abstract: The protohistoric graveyards of north-western Pakistan were first excavated in the 1960s, but their chronology is still debated, along with their relationship to broader regional issues of ethnic and cultural change. Recent excavation of two graveyards in the Swat Valley has provided new dating evidence and a much better understanding both of grave structure and treatment of the dead. Secondary burial was documented at Udegram, along with the use of perishable containers and other objects as grave goods. The complexity of the funerary practices reveal the prolonged interaction between the living and the dead in protohistoric Swat.

Wednesday, April 19, 2017

Part of the introduction to the new Lopez et al. preprint on the genetics of Zoroastrians says this:

The Zoroastrian religion developed from an ancient religion that was once shared by the ancestors of tribes that settled in Iran and northern India. It is thought to have been founded by the prophet priest Zarathushtra (Greek, Zoroaster). Most scholars now believe he lived around 1200 BCE, at a time when the ancient Iranians inhabited the areas of the Inner Asian Steppes prior to the great migrations south to modern Iran, Afghanistan, Northern Iraq and parts of Central Asia.

Disappointingly, in the rest of the preprint we hear nothing about these great migrations from the Eurasian Steppe, or whether they brought at least some of the ancestors of modern-day Zoroastrians to what is now Iran.
The preprint's title, The genetic legacy of Zoroastrianism in Iran and India: Insights into population structure, gene flow and selection, makes it clear that the authors are focusing on the genetic legacy of the Zoroastrians. OK, but why not also expand the focus to a detailed analysis of their genetic origin?
Possibly there's another paper on the way on the genetic origin of the Zoroastrians and other Indo-Iranians? Perhaps, but I'd say the issue here is that the authors have decided to make their main points with haplotypes, rather than unlinked SNPs, probably because, in principle, haplotypes are more powerful than unlinked SNPs.
Thus, they've chosen to limit themselves to using only a few relatively high quality, ancient genomes as reference samples. However, none of these ancient genomes are from the Eurasian Steppe.
As a result, the preprint includes a set of technically powerful haplotype analyses that, unfortunately, say nothing about the potential steppe origin of the Zoroastrians and are generally very difficult to interpret.
To fix this problem they can either sequence a couple of relevant ancient samples from the steppe at a high enough coverage to be useful as reference samples in haplotype tests, and/or expand their use of formal statistics to model Zoroastrians with the already available pseudo-haploid ancients from the steppe (see here).
Actually, since the Iranian Zoroastrians from this study are available online courtesy of Broushaki et al. 2016, I can try some formal statistics models now, using the latest qpAdm and the updated qpAdm methods from Lazaridis & Reich 2017. The results are sorted by statistical fit, best to worst:

Not a huge difference there in terms of the fits. The best model is with Yamnaya_Kalmykia, probably because it has the highest ratio of southern ancestry amongst these ancient steppe herder and warrior groups. Interestingly, the next best model is with the early Sarmatians from Pokrovka, Russia, who were, in all likelihood, Iranian-speakers.
I've also tested many other models using ancient Near Eastern reference samples other than Anatolia_ChL (Anatolia Chalcolithic), and can say with some confidence that the Zoroastrians have, one way or another, ~20% ancient steppe-related ancestry.
But how do other Iranian groups compare? It's an interesting and important question, because if modern-day Zoroastrians harbor elevated ancient steppe-related ancestry compared to other Iranians, this would strengthen the case for the steppe origin of Zarathushtra and his early followers. Let's test this using the same Sarmatian model as above (except with Yoruba added for the Bandaris to account for their minor African admixture):

So the Iranian Jews and Chalcolithic farmers from Iran basically show 0% Sarmatian-related ancestry. On the other hand, non-Jewish and non-Zoroastrian Iranians harbor, on average, 21.02±5.34% Sarmatian-related ancestry. That's actually not significantly different from the Zoroastrian result of 25.7±4.7%.
But importantly, modern-day Zoroastrians certainly don't appear to fall short in this regard compared to other ethnic and/or regional Iranian groups, despite being a relatively strong genetic isolate for many generations. What this suggests is that the Sarmatian-related ancestry mostly arrived south of the Caspian sometime between the Chalcolithic and the rise of Islam in Iran, quite possibly with the early followers of Zarathushtra during the Iron Age.
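The "not significantly different" claim above can be checked with some quick arithmetic. This is only a rough sketch: it assumes that both ± values are standard errors and that the two estimates are independent, neither of which is guaranteed (the 21.02±5.34% figure is an average across several groups, so its ± may be a spread rather than a standard error). The function name is my own.

```python
import math


def z_for_difference(est_a, se_a, est_b, se_b):
    """Z-score for the difference between two independent estimates.

    Assumes se_a and se_b are standard errors (an assumption of this
    sketch, not something stated in the results).
    """
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Sarmatian-related ancestry (%): Zoroastrians vs. other Iranians
z = z_for_difference(25.7, 4.7, 21.02, 5.34)
print(f"difference = {25.7 - 21.02:.2f} points, Z = {z:.2f}")
```

Under those assumptions the difference of about 4.7 percentage points gives a Z-score well below 1, which is nowhere near the |Z| of about 2 or 3 usually taken as a sign of a real difference.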
Citation...
Lopez et al., The genetic legacy of Zoroastrianism in Iran and India: Insights into population structure, gene flow and selection, bioRxiv, Posted April 18, 2017, doi: https://doi.org/10.1101/128272

Tuesday, April 18, 2017

A new preprint on the genetic legacy of the Zoroastrians has just appeared at bioRxiv. I'm reading it now. Might make some comments later [Update 20/04/2017: Zarathushtra and his steppe posse]. Here's the abstract:

Zoroastrianism is one of the oldest extant religions in the world, originating in Persia (present-day Iran) during the second millennium BCE. Historical records indicate that migrants from Persia brought Zoroastrianism to India, but there is debate over the timing of these migrations. Here we present novel genome-wide autosomal, Y-chromosome and mitochondrial data from Iranian and Indian Zoroastrians and neighbouring modern-day Indian and Iranian populations to conduct the first genome-wide genetic analysis in these groups. Using powerful haplotype-based techniques, we show that Zoroastrians in Iran and India show increased genetic homogeneity relative to other sampled groups in their respective countries, consistent with their current practices of endogamy. Despite this, we show that Indian Zoroastrians (Parsis) intermixed with local groups sometime after their arrival in India, dating this mixture to 690-1390 CE and providing strong evidence that the migrating group was largely comprised of Zoroastrian males. By exploiting the rich information in DNA from ancient human remains, we also highlight admixture in the ancestors of Iranian Zoroastrians dated to 570 BCE-746 CE, older than admixture seen in any other sampled Iranian group, consistent with a long-standing isolation of Zoroastrians from outside groups. Finally, we report genomic regions showing signatures of positive selection in present-day Zoroastrians that might correlate to the prevalence of particular diseases amongst these communities.

Wednesday, April 12, 2017

An abstract book from a recent mathematics meeting in Estonia includes an abstract on the genetic impact of Bronze Age steppe pastoralists on Europe and South Asia. Titled A Pre-Existing Isolation by Distance Gradient in West Eurasia May Partly Account for the Observed “Steppe” Component in Europe, it's mostly authored by scientists from the Estonian Biocentre including Luca Pagani and Mait Metspalu. You can read it here.

Even though it's just an abstract of a paper that might never be published, it's so obviously wrong that I can't let it go. This is the sort of thing I'd expect to see from some of the half deranged visitors in the comments section at this blog, not scientists from the Estonian Biocentre.

First of all, even though the abstract doesn't spell out which data crunching algorithms were used by the authors, it's pretty clear to me that the main part of their analysis was run with ADMIXTURE. That basically makes it a pointless exercise from the outset, simply because ADMIXTURE is not designed for these types of analyses.

Why? Because it's impossible to accurately recapitulate ancient population structure with ADMIXTURE; the results are always significantly skewed in some way, usually by heavy genetic drift in one or more of the test populations. In other words, there's no way to truly revive ancient populations with ADMIXTURE components. And if you can't do that, then how can you estimate their impact more or less accurately? Not possible.

In any case, whether the authors relied on ADMIXTURE or not is immaterial to the fact that all of their main points are clearly wrong. Before I go through these points, and explain why they're wrong, I need to explain exactly what the Steppe component really is and isn't.

The Steppe component is the genetic structure of Early and Middle Bronze Age (EMBA) steppe pastoralist groups Afanasievo, Poltavka and Yamnaya. And it's a very specific thing. It isn't a component inferred from a random run of ADMIXTURE that peaks in Afanasievo, Poltavka and/or Yamnaya, or any other ancient populations.

Keep in mind also that Steppe_EMBA is a very specific mixture of older and contemporaneous populations. Using the formal-statistics-based qpAdm method, which models ancestry directly based on f4-statistics, Steppe_EMBA is probably best modeled as a mixture of Eastern European Hunter-Gatherers (EHG), Caucasus Hunter-Gatherers (CHG), and Anatolia Chalcolithic (Anatolia_ChL), with ancestry proportions of around 0.453, 0.453 and 0.094, respectively. See here.
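To give a feel for why f4-statistics can be used to model ancestry proportions, here is a small sketch of the linearity property behind this kind of modeling. It is not qpAdm itself, which fits the weights by generalised least squares and tests the model's rank. The allele frequencies, the reference population and the two outgroups below are made-up toy values; only the three mixture weights come from the model quoted above.

```python
def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d),
    where each argument is a list of allele frequencies."""
    pairs = zip(a, b, c, d)
    return sum((ai - bi) * (ci - di) for ai, bi, ci, di in pairs) / len(a)


# Mixture weights from the model described in the text
weights = {"EHG": 0.453, "CHG": 0.453, "Anatolia_ChL": 0.094}

# Made-up allele frequencies at four SNPs (illustration only)
sources = {
    "EHG": [0.10, 0.80, 0.30, 0.55],
    "CHG": [0.70, 0.20, 0.60, 0.15],
    "Anatolia_ChL": [0.40, 0.50, 0.90, 0.35],
}
ref = [0.25, 0.45, 0.50, 0.60]    # hypothetical reference population
out1 = [0.05, 0.95, 0.20, 0.40]   # hypothetical outgroup 1
out2 = [0.60, 0.30, 0.75, 0.10]   # hypothetical outgroup 2

# An idealised target whose frequencies are exactly the weighted mixture
n_snps = len(ref)
target = [
    sum(weights[name] * sources[name][i] for name in weights)
    for i in range(n_snps)
]

# f4 of the target equals the weighted sum of the sources' f4 values
lhs = f4(target, ref, out1, out2)
rhs = sum(weights[name] * f4(sources[name], ref, out1, out2) for name in weights)
print(abs(lhs - rhs) < 1e-12)
```

Because the weights sum to one, the f4-statistic of the mixed population against any pair of outgroups is just the weighted average of the same statistic for its sources. Roughly speaking, qpAdm runs this logic in reverse: it looks for the weights that best reproduce the target's observed f4-statistics across many outgroups.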

I believe that in this model Anatolia_ChL represents some type of minor western admixture amongst the close relatives of CHG still living in the Caucasus during the Eneolithic/Early Bronze Age, and/or minor gene flow from the Balkans onto the steppe. But that's a topic for another day, perhaps after the release of the Bell Beaker behemoth?

Below is a visual representation of the model, using a typical Principal Component Analysis (PCA) of Western Eurasian population structure. Note the tight cluster formed by the Steppe_EMBA groups and individuals, which is easily differentiated from all ancient populations outside of the steppe, except, importantly, Corded Ware.

Thus, since I know exactly what the Steppe component is and isn't, I can test for admixture from it and its ancestral components as best I can using qpAdm. Below are results for a few pertinent ancient populations (no idea how to model the farmers from Early Neolithic Iran at this stage, but I've already underlined their unique genetic character here and have no reason to believe that they're responsible for any part of the Steppe_EMBA signal in Europe or South Asia). If you're wondering why I chose Hungary_HG as the potential Western Hunter-Gatherer source, it's because it provided the best statistical fits overall. Also note that Ukraine_HG/N is based on samples from the Pontic Steppe.

The models involving Steppe_EMBA and CHG are almost always worse than the best models without them. As far as I can see, there's no strong evidence here of any mixture from a population even similar to Steppe_EMBA in any of these groups, except perhaps Ukraine_HG/N.

However, qpAdm results are dependent on the choice of pright and pleft populations (outgroups and potential mixture sources, respectively). Therefore, with different pright and pleft populations it might be possible to model all of the above groups with significant Steppe_EMBA admixture.

But of course there are other tests that I can run to double check my qpAdm models, such as the West Eurasian PCA. And clearly, the PCA basically supports the qpAdm results, with none of the test groups showing much, if any, deviation towards Steppe_EMBA or CHG from their main mixture clines.

So now let's take a look at the key points made in the abstract and why they're so far off the mark:

However ancient DNA samples from East European and Caucasian Hunter-Gatherers as well as from Early Iranian Neolithic, dating from before the Yamnaya expansion, already show signs of this so called “Steppe” component (Lazaridis et al. 2016).

There's no persuasive evidence for this; see my qpAdm and PCA models above for CHG and various Eastern European Hunter-Gatherer groups. As for the Early Neolithic farmers from Iran, there are no formal models that really make sense for them; we probably don't yet have old enough Near Eastern genomes to serve as potential mixture sources. But the idea that they're somehow interchangeable with Steppe_EMBA is patently idiotic.

Such an observation is compatible with the presence of a pre-existing genetic gradient ranging from Caucasus/Iran all the way to Europe, which likely formed through isolation by distance over thousands of years.

It's not. Isolation by distance has nothing to do with it, because there's no persuasive evidence for the existence of Steppe_EMBA ancestry, or even anything similar, outside of the steppe until the Late Neolithic/Early Bronze Age (LNBA). All of the evidence available to date points to a sudden, massive and perhaps even violent explosion of Steppe_EMBA peoples deep into Europe and also across much of Asia during the LNBA.

Here we show that such a gradient, defined as decrease of “steppe” component with distance from Iran, can be inferred from ancient samples pre-dating the Yamnaya expansion (r^2 = 0.93).

Not possible, because, as I've just pointed out, pre-Bronze Age samples from Iran (Iran_ChL) do not show strong evidence of Steppe_EMBA ancestry, a.k.a. the Steppe component.

When analysed in the light of this gradient, later ancient and modern samples from Europe still display an excess of Steppe component, however this excess is less pronounced than previously estimated.

Horseshit. Nothing's changed.

Additionally we found that, of the analysed samples, modern South Asians show the highest excess of “steppe” component, pointing to the documented, recent links between the Caucasus/Iran populations and the South Asian peninsula.

No, you're conflating Steppe_EMBA ancestry with Neolithic ancestry from what is now Iran because you don't know how to differentiate them. But this has already been done many times over on this blog and also in scientific literature.

...

By the way, Iosif Lazaridis made a couple of observations related to the Pagani et al. abstract on Twitter. See here and here.

Friday, April 7, 2017

A very useful new paper on the origin and spread of mitochondrial (mtDNA) haplogroup U7 has just appeared at Scientific Reports.

It re-iterates some key points that I've made about this haplogroup; that it's a South Caspian-specific lineage and conspicuous by its absence from all Yamnaya samples sequenced to date. In fact, along with other South Caspian-specific lineages, such as U1, U3a, HV2 and HV0, it's missing from all Early Bronze Age steppe samples sequenced to date (see here).

This is surely a major problem for those positing that ancient populations from the South Caspian, in other words what is now mostly Iran, made a significant contribution to the formation of Early Bronze Age steppe pastoralist groups, including Yamnaya.

However, I'd say the paper's conclusion that U7 probably spread into Europe before the Early Bronze Age is a bit iffy. Based on the available ancient European mtDNA, it looks to me as if it mostly spread into Europe after the Early Bronze Age. So why are there European-specific U7 lineages, such as U7a19, seemingly with coalescent times dating to the Neolithic in Europe? Well, perhaps because after these lineages moved to Europe, they went extinct in the Near East? From the paper, emphasis is mine:

Abstract: Human mitochondrial DNA haplogroup U is among the initial maternal founders in Southwest Asia and Europe and one that best indicates matrilineal genetic continuity between late Pleistocene hunter-gatherer groups and present-day populations of Europe. While most haplogroup U subclades are older than 30 thousand years, the comparatively recent coalescence time of the extant variation of haplogroup U7 (~16–19 thousand years ago) suggests that its current distribution is the consequence of more recent dispersal events, despite its wide geographical range across Europe, the Near East and South Asia. Here we report 267 new U7 mitogenomes that – analysed alongside 100 published ones – enable us to discern at least two distinct temporal phases of dispersal, both of which most likely emanated from the Near East. The earlier one began prior to the Holocene (~11.5 thousand years ago) towards South Asia, while the later dispersal took place more recently towards Mediterranean Europe during the Neolithic (~8 thousand years ago). These findings imply that the carriers of haplogroup U7 spread to South Asia and Europe before the suggested Bronze Age expansion of Indo-European languages from the Pontic-Caspian Steppe region.

...

Compared to other subclades of hg U, both the phylogenetic structure and the ancestral origin of hg U7 are rather obscure. This haplogroup is characterized by generally low population frequencies and limited sequence diversity, despite a geographic distribution ranging from Europe to India [14,16,25,27,30,31,32,33]. Recently, it has been detected in skeletal remains from Southwest Iran [my note: that was U7a] dated ~six thousand years ago (kya) [34] as well as in remains from the Tarim Basin in Northwest China (3.5–4.0 kya) [35].

...

Another major episode of gene flow affecting the European gene pool appears to have occurred during the Late Neolithic and Early Bronze Age, from a source in the Pontic-Caspian Steppe region north of the Caucasus [3,54,66,72]. It has been suggested that this migration resulted in a further substantial shift in the genetic profile of Europeans and was a major vehicle for the movement of Indo-European languages to Europe [3,72], and likely also to South Asia [54]. Interestingly, the autosomal genetic component in Europeans considered to derive from the Steppe is almost fixed in two pre-Neolithic ancient genomes from the South Caucasus. This component is distributed eastwards towards South Asia as well [54], where it mimics the distribution of U7 (Pearson’s r = 0.65, p = 0.01). Our time estimates for the expansion and differentiation of hg U7 in the Near East, Central Asia, South Asia, and Europe, however, predate these putative late Neolithic-early Bronze Age migrations and thereby rule them out as a major vehicle for the spread of U7 to Europe and South Asia. In this respect, it is also noteworthy that Yamnaya herders of the Steppe so far analysed (n = 43) show no traces of U7 [3,55,72,73] – and U7 is rarely found in this region today (Fig. 2).

...

The expansion time of hg U7 in the Near East, Central Asia and South Asia is more consistent with autosomal multi-locus estimates for the genetic separation of these regions during the Terminal Pleistocene [74], suggesting a common demographic process, whose origin was unclear previously. Here, we show that the frequency and distribution of U7b lineages indicate an origin of this clade in the Near East, whilst for U7a these statistics cannot differentiate between South Asia and the Near East (including the Caucasus) as a possible homeland.

Thursday, April 6, 2017

At Scientific Reports, Meiri et al. present and analyze an updated dataset of ancient cattle and pig DNA from the Eastern Mediterranean. At the moment, ancient pig DNA is actually one of the best resources for studying human population movements in the region during the tumultuous Bronze and Iron Ages.
However, this is likely to change later this year or next year, with the publication of high density ancient human genome-wide DNA data for the Minoans, Mycenaeans, Philistines and other main players in the Bronze and Iron Age Eastern Mediterranean.
In any case, interestingly, pig mitochondrial (mtDNA) haplogroup Y2 is found on the Pontic Steppe during the Neolithic-Chalcolithic (7000-3500 BCE). It then appears during the Early to Middle Bronze Age (3500-1550 BCE) in Greece and Anatolia. I do wonder if these pigs migrated south with the speakers of Proto-Greek and Proto-Anatolian?

Abstract: The Late Bronze of the Eastern Mediterranean (1550–1150 BCE) was a period of strong commercial relations and great prosperity, which ended in collapse and migration of groups to the Levant. Here we aim at studying the translocation of cattle and pigs during this period. We sequenced the first ancient mitochondrial and Y chromosome DNA of cattle from Greece and Israel and compared the results with morphometric analysis of the metacarpal in cattle. We also increased previous ancient pig DNA datasets from Israel and extracted the first mitochondrial DNA for samples from Greece. We found that pigs underwent a complex translocation history, with links between Anatolia with southeastern Europe in the Bronze Age, and movement from southeastern Europe to the Levant in the Iron I (ca. 1150–950 BCE). Our genetic data did not indicate movement of cattle between the Aegean region and the southern Levant. We detected the earliest evidence for crossbreeding between taurine and zebu cattle in the Iron IIA (ca. 900 BCE). In light of archaeological and historical evidence on Egyptian imperial domination in the region in the Late Bronze Age, we suggest that Egypt attempted to expand dry farming in the region in a period of severe droughts.
...
Haplotype Y2 is considered to have a Near Eastern origin [27, 28]. However, the existence of pig haplotype Y2 in our Greek samples during the Early Helladic II (one radiocarbon determination – 2875–2581 cal BCE) (Fig. 3) together with the findings of Mesolithic wild boar remains in Romania and northeast Italy [33, 35] challenge this conventional wisdom. The absence of haplotype Y2 from Anatolia in the Neolithic (despite a large sample size, n = 38 [28]) on one hand, and its presence in Romania during this period on the other [33] suggest a west-to-east translocation, from Greece to Anatolia no later than the Early Bronze Age.

Monday, April 3, 2017

Abstract: Two recent palaeogenetic studies have identified a movement of Yamnaya peoples from the Eurasian steppe to Central Europe in the third millennium BC. Their findings are reminiscent of Gustaf Kossinna's equation of ethnic identification with archaeological culture. Rather than a single genetic transmission from Yamnaya to the Central European Corded Ware Culture, there is considerable evidence for centuries of connections and interactions across the continent, as far as Iberia. The author concludes that although genetics has much to offer archaeology, there is also much to be learned in the other direction. This article should be read in conjunction with that by Kristiansen et al. (2017), also in this issue.

Abstract: Recent genetic, isotopic and linguistic research has dramatically changed our understanding of how the Corded Ware Culture in Europe was formed. Here the authors explain it in terms of local adaptations and interactions between migrant Yamnaya people from the Pontic-Caspian steppe and indigenous North European Neolithic cultures. The original herding economy of the Yamnaya migrants gradually gave way to new practices of crop cultivation, which led to the adoption of new words for those crops. The result of this hybridisation process was the formation of a new material culture, the Corded Ware Culture, and of a new dialect, Proto-Germanic. Despite a degree of hostility between expanding Corded Ware groups and indigenous Neolithic groups, stable isotope data suggest that exogamy provided a mechanism facilitating their integration. This article should be read in conjunction with that by Heyd (2017, in this issue).