Saturday, January 30, 2016

Is anyone else thinking what I'm thinking? The Principal Component Analysis (PCA) below should be self-explanatory. But if you're having problems with the abbreviations and acronyms, consult the list of definitions here.

Tuesday, January 26, 2016

PNAS has just released a new paper on the population history of India. It's not a bad effort, but it's very speculative and not particularly insightful, mainly because it doesn't include any ancient DNA from South Asia. Let's be honest: nowadays, if you want a really hard-hitting paper of this sort, you need some ancient DNA. The paper is open access. Here's the abstract.

India, occupying the center stage of Paleolithic and Neolithic migrations, has been underrepresented in genome-wide studies of variation. Systematic analysis of genome-wide data, using multiple robust statistical methods, on (i) 367 unrelated individuals drawn from 18 mainland and 2 island (Andaman and Nicobar Islands) populations selected to represent geographic, linguistic, and ethnic diversities, and (ii) individuals from populations represented in the Human Genome Diversity Panel (HGDP), reveal four major ancestries in mainland India. This contrasts with an earlier inference of two ancestries based on limited population sampling. A distinct ancestry of the populations of Andaman archipelago was identified and found to be coancestral to Oceanic populations. Analysis of ancestral haplotype blocks revealed that extant mainland populations (i) admixed widely irrespective of ancestry, although admixtures between populations was not always symmetric, and (ii) this practice was rapidly replaced by endogamy about 70 generations ago, among upper castes and Indo-European speakers predominantly. This estimated time coincides with the historical period of formulation and adoption of sociocultural norms restricting intermarriage in large social strata. A similar replacement observed among tribal populations was temporally less uniform.

Saturday, January 23, 2016

My dataset was recently enriched with six ancient individuals from Roman York, courtesy of Martiniano et al. 2016.
They were either gladiators or soldiers. Each one was decapitated. This may have been a coup de grâce or a burial rite. At least one, 3DRIF-26, was not native to Britain.
In fact, isotopic evidence suggests that he spent his childhood in a region with a hot and dry climate such as North Africa or the Levant. Moreover, his top matching population in terms of pairwise Identical-by-State (IBS) allele sharing is present-day Saudis (see here).
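For anyone unfamiliar with the measure, pairwise IBS sharing can be sketched in a few lines of Python. This is only an illustration, not the pipeline behind the result above. The 0/1/2 genotype coding, the `None` marker for missing calls and the function names `ibs_similarity` and `rank_by_ibs` are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Genotypes = Sequence[Optional[int]]  # assumed coding: 0, 1 or 2 copies of an allele; None = missing


def ibs_similarity(a: Genotypes, b: Genotypes) -> float:
    """Mean proportion of alleles shared identical-by-state by two samples.

    SNPs with a missing call in either sample are skipped.
    """
    shared = 0
    compared = 0
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue
        # 2 shared alleles for identical genotypes, 1 for a one-allele
        # difference, 0 for opposite homozygotes.
        shared += 2 - abs(ga - gb)
        compared += 1
    if compared == 0:
        raise ValueError("no SNPs with calls in both samples")
    return shared / (2 * compared)


def rank_by_ibs(target: Genotypes, references: Dict[str, Genotypes]) -> List[Tuple[str, float]]:
    """Sort reference samples from highest to lowest IBS similarity to the target."""
    scores = {name: ibs_similarity(target, g) for name, g in references.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A top match from a ranking like this only says which reference sample shares the most alleles overall. It doesn't separate different streams of ancestry, which is why a formal mixture model is a useful follow-up.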
However, I thought it might be useful to revisit 3DRIF-26's genetic affinities after taking into account his non-trivial Sub-Saharan admixture. This can be done with qpAdm. The best ten models are listed below.
Please note that in the last model I had to use 3DRIF-26 as a mixture source for present-day Egyptians, because he has less Yoruba-related admixture than the Egyptians.

I'd say these results provide rather convincing evidence that 3DRIF-26's West Eurasian ancestry is derived from the Levant. Moreover, his relatively high level of Sub-Saharan admixture suggests that he came from the southern Levant or perhaps a nearby region, like the Sinai Peninsula.
Interestingly, the best models feature a couple of religious minorities (Samaritans and Lebanese Druze), an island population (Cypriots), and a fairly unique group in terms of genetic structure from Israel's Negev Desert (BedouinB). This suggests that 3DRIF-26 may have belonged to a similar religious or geographic isolate population, or, alternatively, that most of the Levant has experienced significant genetic shifts since he was alive.
The rest of the headless Romans were, in all likelihood, born and raised in or near Britain. However, two of the individuals, 3DRIF-16 and 6DRIF-3, show elevated IBS affinity to Lithuanians and Poles. At the same time, they both belong to Y-chromosome haplogroup R1b-U106 (aka M405), which is a marker generally thought to have arrived in Britain with Anglo-Saxons and Scandinavians. This might be a coincidence, but probably not.
D-stats confirm that they do show elevated Northeastern European affinity relative to the other three Romans. Only one of the Z-scores is statistically significant (>3), but most of the others would probably also reach significance with more SNPs and higher quality sequences.
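As a rough guide to what sits behind a D-stat and its Z-score, here's a minimal Python sketch of D(W, X; Y, Z) computed from per-SNP allele frequencies, with a delete-one-block jackknife for the standard error. It is not the software used for the results above. The function names and the equal-sized blocks of consecutive SNPs are assumptions made for the example, and dedicated tools usually define blocks by genetic distance and weight them, which this sketch does not attempt.

```python
import math


def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies in four populations."""
    num = 0.0
    den = 0.0
    for pw, px, py, pz in zip(w, x, y, z):
        num += (pw - px) * (py - pz)
        den += (pw + px - 2 * pw * px) * (py + pz - 2 * py * pz)
    if den == 0:
        raise ValueError("no informative SNPs")
    return num / den


def d_with_z(w, x, y, z, n_blocks=10):
    """Return (D, Z), with the standard error from a delete-one-block jackknife.

    Assumption for this sketch: SNPs are in genomic order and are split
    into n_blocks roughly equal runs of consecutive markers.
    """
    w, x, y, z = list(w), list(x), list(y), list(z)
    n = len(w)
    if n_blocks < 2 or n_blocks > n:
        raise ValueError("n_blocks must be between 2 and the number of SNPs")
    full = d_statistic(w, x, y, z)
    size = n // n_blocks
    estimates = []
    for b in range(n_blocks):
        lo = b * size
        hi = n if b == n_blocks - 1 else lo + size
        # Recompute D with block b left out.
        estimates.append(
            d_statistic(w[:lo] + w[hi:], x[:lo] + x[hi:],
                        y[:lo] + y[hi:], z[:lo] + z[hi:])
        )
    mean = sum(estimates) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((e - mean) ** 2 for e in estimates)
    se = math.sqrt(variance)
    if se == 0:
        raise ValueError("jackknife estimates are identical; cannot form a Z-score")
    return full, full / se
```

With fewer SNPs or noisier genotypes the jackknife estimates spread out, the standard error grows, and the Z-score shrinks, which is why low coverage samples often fall short of the usual threshold of 3 even when the underlying signal is real.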

My guess is that 3DRIF-16 and 6DRIF-3 were Britons of mixed origin, with recent ancestry from Scandinavia and/or East Central Europe. Indeed, they can be modeled with qpAdm as part Swedish and Polish.

Friday, January 22, 2016

The global distribution of J2-M172 sub-haplogroups has been associated with Neolithic demic diffusion. Two branches of J2-M172, J2a-M410 and J2b-M102 make a considerable part of Y chromosome gene pool of the Indian subcontinent. We investigated the Neolithic contribution of demic dispersal from West to Indian paternal lineages, which majorly consists of haplogroups of Late Pleistocene ancestry. To accomplish this, we have analysed 3023 Y-chromosomes from different ethnic populations, of which 355 belonged to J2-M172. Comparison of our data with worldwide data, including Y-STRs of 1157 individuals and haplogroup frequencies of 6966 individuals, suggested a complex scenario that cannot be explained by a single wave of agricultural expansion from Near East to South Asia. Contrary to the widely accepted elite dominance model, we found a substantial presence of J2a-M410 and J2b-M102 haplogroups in both caste and tribal populations of India. Unlike demic spread in Eurasia, our results advocate a unique, complex and ancient arrival of J2a-M410 and J2b-M102 haplogroups into Indian subcontinent.

Tuesday, January 19, 2016

The purported migrations that have formed the peoples of Britain have been the focus of generations of scholarly controversy. However, this has not benefited from direct analyses of ancient genomes. Here we report nine ancient genomes (~1 ×) of individuals from northern Britain: seven from a Roman era York cemetery, bookended by earlier Iron-Age and later Anglo-Saxon burials. Six of the Roman genomes show affinity with modern British Celtic populations, particularly Welsh, but significantly diverge from populations from Yorkshire and other eastern English samples. They also show similarity with the earlier Iron-Age genome, suggesting population continuity, but differ from the later Anglo-Saxon genome. This pattern concords with profound impact of migrations in the Anglo-Saxon period. Strikingly, one Roman skeleton shows a clear signal of exogenous origin, with affinities pointing towards the Middle East, confirming the cosmopolitan character of the Empire, even at its northernmost fringes.

British population history has been shaped by a series of immigrations, including the early Anglo-Saxon migrations after 400 CE. It remains an open question how these events affected the genetic composition of the current British population. Here, we present whole-genome sequences from 10 individuals excavated close to Cambridge in the East of England, ranging from the late Iron Age to the middle Anglo-Saxon period. By analysing shared rare variants with hundreds of modern samples from Britain and Europe, we estimate that on average the contemporary East English population derives 38% of its ancestry from Anglo-Saxon migrations. We gain further insight with a new method, rarecoal, which infers population history and identifies fine-scale genetic ancestry from rare variants. Using rarecoal we find that the Anglo-Saxon samples are closely related to modern Dutch and Danish populations, while the Iron Age samples share ancestors with multiple Northern European populations including Britain.

Monday, January 11, 2016

Anyone who still thinks that Y-chromosome haplogroup R1a originated in South Asia should burn this map into their brains. It'll come in useful over the next few years as we learn from ancient DNA about the conquest of the Indian subcontinent, and indeed much of Asia, by pastoralists from the western Russian and Ukrainian steppes.

X marks the spot of the burial site of Poltavka sample I0432 from the Mathieson et al. 2015 dataset. This individual belongs to Y-chromosome haplogroup R1a-Z93(Z94+), which today accounts for well over 90% of the R1a lineages in Asia and peaks in frequency at over 60% in the northern parts of South Asia.

Moreover, the dating of his burial site, 2925-2536 calBCE, suggests that he lived not long after the Z93 and Z94 mutations came into existence. That's because Z93 doesn't appear to be much older than 5,000 years based on full Y-chromosome sequence data (see here and here, including the comments).

So I0432 could well turn out to be a crucial piece in the puzzle of the peopling of South Asia.

Interestingly, this individual was flagged as an outlier in the Poltavka sample set by Mathieson et al., hence his other moniker: the Poltavka outlier. However, this wasn't because of any ancestry from South or even Central Asia. In fact, it was because he was too western.

Principal Component Analyses (PCA) featuring a wide range of present-day and ancient samples from Europe and Asia, like the one below, show that Poltavka outlier clusters further west than most Corded Ware individuals from Germany. Right click and open in a new tab to view full size.

In the past, using qpAdm, I modeled Poltavka outlier as 63.7% Yamnaya Samara and 36.3% German Middle Neolithic. This is probably not very far from the truth, but qpAdm offers a supervised mixture test in which the results are heavily reliant on the choice of outgroups, so I thought I'd revisit the issue with TreeMix, which allows an unsupervised analysis.
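To show why this kind of supervised test leans so heavily on the outgroups, here's a toy Python sketch of the general idea behind f4-based mixture modelling. It is not qpAdm's actual algorithm and it won't reproduce the figures above. It simply assumes that each f4-statistic of the target against a set of outgroup pairs is a weighted average of the same statistics for the two sources, and solves for the weight by least squares. The function name `mixture_proportion` and the input layout are made up for the example.

```python
def mixture_proportion(target, source1, source2):
    """Least-squares alpha such that target ≈ alpha * source1 + (1 - alpha) * source2.

    Each argument is a list of f4-statistics measured against the same
    outgroup pairs, in the same order (a hypothetical input layout).
    """
    num = 0.0
    den = 0.0
    for t, s1, s2 in zip(target, source1, source2):
        diff = s1 - s2
        num += (t - s2) * diff
        den += diff * diff
    if den == 0:
        raise ValueError("these outgroups cannot tell the two sources apart")
    return num / den
```

The estimate only uses differences between the two sources as seen through the outgroups. If the outgroups relate to both sources in much the same way, the denominator shrinks and the estimate becomes unstable, so a different outgroup set can shift the result.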

In a dataset including seven relatively high coverage Copper Age (CA), Early Bronze Age and Middle Neolithic (MN) European genomes, TreeMix picked out Poltavka outlier as the most likely sample to be admixed, showing a mixture edge of 33% from the base of the branch leading to the Iberian MN individual to that of Poltavka outlier.

This outcome is very similar to my qpAdm model, but it suggests an even more western source of admixture in Poltavka outlier. Could this admixture actually be from Iberia? I wouldn't discount this possibility, considering the presence of Bell Beaker communities, possibly of Atlantic or even Iberian origin, as far east as present-day Poland. Indeed, according to Cassidy et al. 2015, German Beakers show high affinity to MN and CA Iberians (see page 51 in the supp info here).

I double checked my TreeMix result with D-stats, and yep, when placed in a clade with Poltavka or Samara Yamnaya, Poltavka outlier shows the strongest signal of admixture from the Iberia MN individual.

At the same time, however, the signal from the Early Neolithic (EN) Iberian fails to reach significance (Z<3). This suggests that TreeMix and D-stats might be seeing the Iberia MN sample as the most attractive mixture source because of her high level of Western European hunter-gatherer (WHG) ancestry, which Poltavka outlier also has plenty of, rather than because of anything specific to Iberia.

In any case, it's clear enough that Poltavka outlier was the result of mixture between Yamnaya-related western steppe pastoralists and the descendants of Middle Neolithic Europeans with a high ratio of WHG ancestry. Where this admixture actually took place and which archaeological cultures were involved will have to be resolved with further sampling of ancient remains from Central and Eastern Europe.

However, it's already impossible to place the origin of Poltavka outlier anywhere in Asia, which suggests that both Z93 and Z94 are also from well inside the generally accepted borders of Europe.

This obviously has implications for the origins of the Indo-Iranians, because the widespread presence of these mutations in Asia gels very nicely with the idea, and indeed academic consensus, that Indo-Iranian languages expanded rapidly from the Eurasian steppe into Asia during the Bronze Age.

Considering that Poltavka outlier came from a Kurgan burial, and was therefore an individual of some social standing, he might be the direct ancestor of many millions of present-day Asians. If so, this won't be very difficult to prove in the near future as ancient DNA research revs up a few notches.

On a related note, apparently there's a paper on the way with ancient DNA results from Rakhigarhi, a Harappan site in Haryana, northern India (see here). As far as I know, the results will include Y-chromosome haplogroups of three males, but I don't think we'll see any decent genome-wide data at this stage. However, hopefully I'm wrong and the paper will come out with full ancient genomes.

Feel free to post your predictions in the comments. I'm tentatively expecting a couple of instances of J2 and maybe an L or H. Razib made basically the same prediction recently so I'm not being original. What I do know is that we won't see any R1a-Z93. The only way that might happen is if, say, someone coughed or sneezed on the Harappan remains.

Saturday, January 2, 2016

No one's done this yet, probably because at this stage it's still a crazy idea. But sometimes crazy ideas actually work. Here's a map:

The map is based on the spreadsheet below, which shows the total amount of relatively large, probably in most part Identity-by-Descent (IBD), genome-wide tracts shared by the ancient individuals in centimorgans (cM). An extended version of the table, including ~1500 present-day Eurasians, can be viewed here.

I used Beagle 3 and fastIBD for the job. The dataset included just over 300K SNPs that showed a call rate of 100% in all of the ancient samples, so as not to potentially bias their results by imputing missing markers.
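The two bookkeeping steps described here, keeping only markers called in every sample and adding up shared tracts per pair, can be sketched in Python as follows. This isn't Beagle or fastIBD code and doesn't read their file formats. The genotype dictionary, the `(sample_a, sample_b, start_cm, end_cm)` segment tuples and the function names are assumptions made for the example.

```python
def fully_called_snps(genotypes):
    """Indices of SNPs with a call in every sample (a 100% call rate).

    genotypes: dict mapping sample name -> list of calls, with None for missing.
    """
    samples = list(genotypes.values())
    if not samples:
        return []
    n_snps = len(samples[0])
    return [i for i in range(n_snps)
            if all(calls[i] is not None for calls in samples)]


def total_shared_cm(segments, min_cm=0.0):
    """Sum shared segment lengths in cM for each pair of samples.

    segments: iterable of (sample_a, sample_b, start_cm, end_cm) tuples,
    a hypothetical layout. Segments shorter than min_cm are ignored.
    """
    totals = {}
    for a, b, start_cm, end_cm in segments:
        length = end_cm - start_cm
        if length < min_cm:
            continue
        pair = tuple(sorted((a, b)))  # treat (A, B) and (B, A) as the same pair
        totals[pair] = totals.get(pair, 0.0) + length
    return totals
```

Restricting the marker set to fully called SNPs trades density for comparability, since no sample's sharing then depends on imputed genotypes.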
To do this by the book, I'd need to run many more ancient individuals, at least a few from each archaeological culture of interest, sequenced at comparably high coverage and genotyped in exactly the same way. This might be possible within a year or two.
Having said that, the results from my quick and dirty test run make perfect sense. Here are a few observations:

- The Corded Ware individual from Germany shows a close relationship to the Yamnaya individual from the North Caspian region, but no relationship to the two Neolithic farmers from Central Europe, NE1 and Stuttgart, supporting the idea that the Corded Ware Culture was introduced into Central Europe by migrants from the Pontic-Caspian Steppe.
- The Srubnaya individual from the North Caspian shares a lot of cM with the Corded Ware individual, and also shows a stronger relationship to other ancient Central Europeans than to the Yamnaya individual buried only kilometers away, suggesting that the Srubnaya Culture was introduced to the Pontic-Caspian Steppe from Central Europe or surrounds.
- The closer relationship between the Yamnaya individual and the Late Bronze Age Hungarian, BR2, than between the latter and the Corded Ware individual, gels with archaeological data showing that Yamnaya groups moved into the Carpathian Basin via the Balkans.
- Weak segment sharing between the Yamnaya individual and Kotias, a Mesolithic Caucasus hunter-gatherer (CHG) from western Georgia, suggests that the Yamnaya population did not receive its CHG admixture from the southwestern Caucasus.
- Elevated segment sharing between BR2 and present-day speakers of Baltic and Slavic languages suggests that BR2, or his close relatives, contributed genealogically in a significant way to the Balto-Slavic expansions that affected most of East Central and Eastern Europe during the Iron Age and early Medieval period.

Friday, January 1, 2016

Summary: Anatolia and the Near East have long been recognized as the epicenter of the Neolithic expansion through archaeological evidence. Recent archaeogenetic studies on Neolithic European human remains have shown that the Neolithic expansion in Europe was driven westward and northward by migration from a supposed Near Eastern origin [1–5]. However, this expansion and the establishment of numerous culture complexes in the Aegean and Balkans did not occur until 8,500 before present (BP), over 2,000 years after the initial settlements in the Neolithic core area [6–9]. We present ancient genome-wide sequence data from 6,700-year-old human remains excavated from a Neolithic context in Kumtepe, located in northwestern Anatolia near the well-known (and younger) site Troy [10]. Kumtepe is one of the settlements that emerged around 7,000 BP, after the initial expansion wave brought Neolithic practices to Europe. We show that this individual displays genetic similarities to the early European Neolithic gene pool and modern-day Sardinians, as well as a genetic affinity to modern-day populations from the Near East and the Caucasus. Furthermore, modern-day Anatolians carry signatures of several admixture events from different populations that have diluted this early Neolithic farmer component, explaining why modern-day Sardinian populations, instead of modern-day Anatolian populations, are genetically more similar to the people that drove the Neolithic expansion into Europe. Anatolia’s central geographic location appears to have served as a connecting point, allowing a complex contact network with other areas of the Near East and Europe throughout, and after, the Neolithic.