Focusing on European population genetics and modern physical anthropology.

Friday, March 3, 2017

The genetic history of Northern Europe (or rather the South Baltic)

A second preprint in only a few days on the Neolithic transition in the Baltic region has just appeared at bioRxiv: Mittnik et al. 2017. You can read about the first one here. Keep in mind also that we recently saw a paper on the same topic at Current Biology.
Can't these labs coordinate things a little better and perhaps focus on different parts of Europe? Wouldn't that be the sensible thing to do considering the limited funding for ancient DNA research?
Nevertheless, Mittnik et al. is an important addition to what we've already seen, for me mainly because it shows that largely unadmixed Western Hunter-Gatherers (WHG) lived in the South Baltic region at least as late as ~4,450 calBCE, which is the date assigned to the four Narva samples in the preprint. So now we have a plausible explanation for the inflated WHG-related ancestry in modern-day Balts and Northern Slavs.

Despite its geographical vicinity to EHG [Eastern Hunter-Gatherers], the eastern Baltic individual associated with the Mesolithic Kunda culture shows a very close affinity to WHG in all our analyses, with a small but significant contribution from EHG or SHG [Scandinavian Hunter-Gatherer], as revealed by significant D-statistics of the form D(Kunda, WHG; EHG/SHG, Mbuti) (Z>3; Supplementary Information Table S2).
...
The results for the Kunda individual are mirrored in the four later eastern Baltic Neolithic hunter-gatherers of the Narva culture (Fig. 2) and further supported by the lack of significantly positive results for the D-statistic D(Narva, Kunda; X, Mbuti) (Supplementary Information Table S2) demonstrating population continuity at the transition from Mesolithic to Neolithic, which in the eastern Baltic region is signified by a change in networks of contacts and the use of pottery rather than a stark shift in economy as seen in Central and Southern Europe [15].
...
Furthermore, the individual Spiginas2, which is dated to the very end of the Late Neolithic, has a higher proportion of the hunter-gatherer ancestry, as seen in ADMIXTURE (darker blue component in Fig. 2b), and is estimated to be admixed between 78±4% Central European CWC and 22±4% Narva (Supplementary Information Table S6). A reliance on marine resources persisted especially in the north-eastern Baltic region until the end of the Late Neolithic [29] and in combination with the proposed large population size for Baltic hunter-gatherers a ‘resurgence’ of hunter-gatherer ancestry in the local population through admixture between foraging and farming groups is likely, and has been described for the European Middle Neolithic [2,30].
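For readers unfamiliar with the D-statistics quoted above, here is a minimal sketch of how a statistic of the form D(W, X; Y, Z) and its Z-score can be computed from per-SNP allele frequencies. This is not the authors' pipeline. It uses the commonly used allele-frequency form of the estimator, a block jackknife over equal-sized contiguous SNP blocks as a simplification, and simulated populations whose names and parameters (`w`, `x`, `y`, `z`, the 20% gene flow, the drift size) are hypothetical and chosen only for illustration.

```python
import numpy as np


def d_statistic(w, x, y, z, n_blocks=50):
    """Return (D, Z) for D(W, X; Y, Z) from per-SNP allele frequencies.

    A positive D means W shares more alleles with Y than X does.
    """
    w, x, y, z = (np.asarray(v, dtype=float) for v in (w, x, y, z))
    num = (w - x) * (y - z)
    den = (w + x - 2 * w * x) * (y + z - 2 * y * z)
    d_all = num.sum() / den.sum()

    # Delete-one-block jackknife (assumption: equal-sized contiguous blocks).
    blocks = np.array_split(np.arange(num.size), n_blocks)
    loo = np.array([
        (num.sum() - num[b].sum()) / (den.sum() - den[b].sum()) for b in blocks
    ])
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    return d_all, d_all / se


def simulate(n_snps=20000, drift=0.05, gene_flow=0.2, seed=1):
    """Hypothetical tree ((W, X), Y), Z with extra gene flow from Y into W."""
    rng = np.random.default_rng(seed)

    def step(p):
        # One branch of random drift, kept inside valid frequency bounds.
        return np.clip(p + rng.normal(0.0, drift, n_snps), 0.01, 0.99)

    root = rng.uniform(0.2, 0.8, n_snps)
    z = step(root)            # outgroup
    anc = step(root)          # ancestor of W, X and Y
    y = step(anc)
    wx = step(anc)            # ancestor of W and X
    x = step(wx)
    w = (1 - gene_flow) * step(wx) + gene_flow * y
    return w, x, y, z


w, x, y, z = simulate()
d, z_score = d_statistic(w, x, y, z)
print(f"D = {d:.4f}, Z = {z_score:.1f}")
```

In this toy setup W plays the role of Kunda, X of WHG, Y of EHG/SHG and Z of Mbuti. A clearly positive D with Z>3 is the kind of result the preprint reports for Kunda.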

The only gripe I have with this manuscript is the Principal Component Analyses (PCA). They just look messy and appear to suffer from projection bias, so they're hard to read and probably confusing for a lot of people.
Projection bias is also known as shrinkage. Basically, it's when the projected samples get pulled towards the center of the plot, because the PCA space shrinks for them relative to the reference samples that were used to compute the components. It happens a lot in ancient DNA papers. I find it irritating. But whenever I bring up this issue with authors of these papers, I'm basically told that their PCA look like other PCA from similar papers, so there's no problem. So, essentially, since everybody's doing it wrong, then it's the right way to do it. Awesome logic there.
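To make the shrinkage effect concrete, here is a small simulation sketch. It assumes two hypothetical populations that differ slightly in allele frequencies, computes the PCA on one set of individuals, and then projects a second set drawn from exactly the same two populations. All parameters (sample sizes, SNP count, drift) are made up for illustration, and plain SVD stands in for whatever software a given paper actually used.

```python
import numpy as np


def simulate_genotypes(rng, freqs, n_samples):
    """Draw diploid genotypes (0, 1 or 2 copies) for n_samples individuals."""
    return rng.binomial(2, freqs, size=(n_samples, freqs.size)).astype(float)


def projection_shrinkage(n_per_pop=20, n_snps=5000, drift=0.03, seed=0):
    """Ratio of PC1 spread for projected samples vs. reference samples."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.1, 0.9, n_snps)
    shift = drift * rng.choice([-1.0, 1.0], n_snps)
    freqs = [np.clip(base + shift, 0.01, 0.99), np.clip(base - shift, 0.01, 0.99)]

    # Reference samples define the axes; held-out samples are only projected.
    ref = np.vstack([simulate_genotypes(rng, f, n_per_pop) for f in freqs])
    held_out = np.vstack([simulate_genotypes(rng, f, n_per_pop) for f in freqs])

    mean = ref.mean(axis=0)
    u, s, vt = np.linalg.svd(ref - mean, full_matrices=False)
    ref_pc1 = u[:, 0] * s[0]              # PC1 scores of the reference samples
    proj_pc1 = (held_out - mean) @ vt[0]  # PC1 scores of the projected samples
    return np.abs(proj_pc1).mean() / np.abs(ref_pc1).mean()


shrinkage = projection_shrinkage()
print(f"projected / reference spread on PC1: {shrinkage:.2f}")
```

A ratio below 1 means the projected individuals sit closer to the origin than reference individuals from the same populations, even though nothing about their ancestry differs. The effect comes from having far more SNPs than samples, which is the usual situation in these papers.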
Citation...
Mittnik et al., The Genetic History of Northern Europe, bioRxiv, Posted March 3, 2017, doi: https://doi.org/10.1101/113241

57 comments:

@Davidski wow you're smooowkin dude. This is like 10 new posts in the past 3 days plus responding to all the comments on the other threads. When the behemoth comes out you'll need a sleeping bag under your desk and smelling salts under your nose.

Very satisfying to see one issue resolved: both of the Narva folk are I2a1. If I remember, one of the early R1a's in the last paper seemed to be placed between Kunda and the Early Ceramic, and I thought it was a little too close to the ceramic (I can't remember, but I thought it was labelled Narva). Seems like that line is clarified a little. Still think ceramics came from East to West primarily through Baikal, despite whatever happened in Mali or whatever.

I think we might be saying the same thing. Narva is almost fully WHG, but the transition from Narva to what they call the EN is pretty drastic. My thought was that the transition to the ceramic was brought by immigrants up the Volga and that's when you start to see lineages like Karelia man and all the mtdna alphabet.

@Samuel, Wow, that's a stark picture of some drum-beating, canoe-paddling hunters. Almost uniformly U. Roy posted the link of the Mesolithic Sardinians. That's some weird stuff too.

The Narva individual Spiginas1 (dated to ca. 4440–4240 cal BCE) belongs to a mitochondrial haplogroup of the H branch, providing the first direct evidence that this branch was present among European foragers without gene-flow from farmers (Extended Data Table 1).

Of course, all of this demonstrates what I have been saying for ten years, i.e. that the R1b sample from Samara belonged to a tiny subclade of R-L23 that very likely came from the West (but we'll see with the next data), and above all that my R-Z2110 is the ancestor of CTS7556>Y5572 >CTS9219....

Sample I0575 has not only the mutations Y:7186135G>C and S20902:18383837C>T, shared by all the R-L23Z2105 subclades, but also the mutation Y:8446627A>G (SNP Y21707), found only in the subclade that expanded from Samara. Of course this doesn't exclude that other R-L23 samples gave rise to other subclades (but I have said that these are very likely only R.M73-M478, R-CTS7763 and perhaps some subclades of R-L277 and L584). But on that we are waiting for more data.

"The Scythians of the eastern steppe were seemingly derived from Yamnaya and East Eurasian ancestors, and not from temporally closer Sintashta/Andronovo populations that carried EEF ancestry. Similar to present-day South Asians, who are best modelled with Early/Middle Bronze Age steppe, not Andronovo/Sintashta."

Two CWC samples, Gyvakarai1 and Plinkaigalis242, lack the early farmer component that is also missing in EMBA Steppe samples. Gyvakarai1 is R1a-M417.

“The presence of ancestry from the Pontic Steppe among Baltic CWC individuals without the Anatolian farming component must be due to a direct migration of steppe pastoralists that did not pick up this ancestry in Central Europe. This could lend support to a linguistic model that sees a branching of Balto-Slavic from a Proto-IndoEuropean homeland in the west Eurasian steppe”

So M417 is probably Balto-Slavic from the steppe.

CWC from Olsund in Sweden has Balto-Slavic R1a-Z645 and is genetically more similar to Baltic CWC than to German CWC.

“This could indicate that the route of CWC expansion into Northern Sweden might have not been northward from Southern Scandinavia but instead westward across the Baltic Sea either by boat or over the frozen sea during winter”

Check out Table S4 of the supplementary info. Baltic BA can be fitted as Baltic LN+MN farmer but not as Baltic LN+Narva. So EEF-heavy people (by heavy I mean the amount Northern Europeans have) migrated into the East Baltic. So maybe...lots of EEF-derived mHG H came to the Baltic in the Bronze Age? That's actually possible considering Neolithic Romanians had 50-60% H.

Also D(Modern eastern Baltic population, Baltic_BA; X, Mbuti) gets the most positive score when Middle Eastern or East Asian populations are X. East Asian isn't a surprise. But Middle Eastern is. Maybe it means modern Balts have a little extra EEF, and maybe it also means they have relatively recent Southeast European ancestry, which would carry a significant amount of relatively recent Middle Eastern ancestry.

"So the Slavic I2a1 has hunter gatherer origin or farmer?" Hunter-gatherer. The highest frequencies of I2a1 are seen in western and central Polesie, from where it spread in all directions together with the early Slavic migrations. There was one I2a in the Dnieper-Donets culture, which has clear connections with the late Mesolithic Ertebølle culture.

Yeah, just to clarify: in the other thread where I said the PCA were similar, it was really just to counter the idea that these were particularly bad, as they look no worse or better than in the other papers. They could well *all* be placing the outlying edge ancient populations (Natufians, CHG / Iran_N, EHG, WHG) too close to recent people. It's just that these and the ones in Saag 2017 don't look particularly bad.

I would also say that the relative distances between the ancient samples look reasonably correct: the Europe_MN look about 25:75 between WHG:Anatolian_N, Yamnaya is about 60:40 between EHG:CHG, and the LNBA are about 50:50 between Yamnaya and a population slightly more HG rich than Europe_MN. (This isn't incompatible with the outlying edges being too close though. If you move CHG and EHG both around 20% further out, they would still have a similar relative distance to Yamnaya to what they have now, so long as this affects all the outlying ancients equally. Only if EHG were too close but CHG was not would you have a problem.)
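The point about relative distances can be checked with a quick sketch. It measures how far along the line between two source populations a third sample sits, then repeats the measurement after scaling. The PC coordinates below are invented purely for illustration (they are not read off the preprint's plots), and PCA position is only a rough proxy for ancestry proportions.

```python
import numpy as np


def cline_fraction(source_a, source_b, target):
    """Fraction of the way from source_a to source_b at which target projects."""
    a, b, t = (np.asarray(v, dtype=float) for v in (source_a, source_b, target))
    ab = b - a
    return float(np.dot(t - a, ab) / np.dot(ab, ab))


# Hypothetical PC1/PC2 coordinates, chosen only for illustration.
ehg = np.array([-0.06, 0.04])
chg = np.array([0.05, 0.08])
yamnaya = 0.6 * ehg + 0.4 * chg  # placed 40% of the way from EHG to CHG

original = cline_fraction(ehg, chg, yamnaya)

# Case 1: every ancient sample is shrunk by the same factor.
all_scaled = cline_fraction(0.8 * ehg, 0.8 * chg, 0.8 * yamnaya)

# Case 2: only the two outlying sources are moved 20% further out.
sources_moved = cline_fraction(1.2 * ehg, 1.2 * chg, yamnaya)

print(original, all_scaled, sources_moved)
```

With these made-up coordinates the fraction is unchanged when everything is scaled together, and shifts only slightly when both sources move outward by the same proportion. Moving just one of the two sources would change it more, which is the problem case described above.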

So although we might not want to take these as read for affinity to modern people, they look good for relative affinities of the ancients. Plus the LNBA Europeans mostly overlay present day people from the same general areas (esp once time has homogenised local cultures to similar levels of steppe / neolithic / other HG as in very late Bronze Age Baltic_BA), which is what we'd expect.

(Btw for anyone who wants to see, here's all those PCA from Mittnik overlaid as best as I could do very quickly, because I wanted to visualise them together and had a few minutes to do it - http://i.imgur.com/dqiUsap.png. Plus ugly overlaid clines - http://i.imgur.com/rGpilaW.png / http://i.imgur.com/CMyAkbU.png).

If so, they should be autosomally identical to Yamnaya (based on the ADMIXTURE - the different Baltic_LN are actually labelled, in white on a white background), with maybe a small offset of less CHG>more HG. No Anatolian.

In the PCA, the 6 Baltic_LN aren't labelled. Some overlap Steppe_MLBA and presumably these are the relatively Anatolia heavy samples with typical HG component - Plinkaigalis241, Kunila2 and some RISE samples I can't make out on the legend.

Then there are two clear outliers. One overlaps the Baltic_BA and this is presumably Spiginas2, a sample about 25% richer in HG ancestry than the other Baltic_LN, which is very similar to Baltic_BA in ADMIXTURE. The other is at the very edge of the Yamnaya/Steppe EMBA cluster, so it's either Plinkaigalis242 or Gyvakarai1, mirroring their "No Anatolia; typical HG" status in the ADMIXTURE.

Evidence? Everyone has understood that Samara was composed of hg. R1a and a little R1b-L23; that they migrated to the Baltic carrying the Balto-Slavic languages (no R1b has been found amongst them), and migrated to Andronovo and Sintashta as Indo-Iranians and gave birth to the Scythians of Iranian languages. The tiny R1b subclade, belonging only to the R-L23-Z2105 subclade with perhaps some extinct line, took part in those migrations (which above all carried hg. R1a) as far as the Mongol/Chinese/Turk peoples, and afterwards also to the Indian subcontinent, where a few of those haplotypes may have survived. But from these samples only a few subclades survived in Eastern Europe, the Caucasus and the Middle East, different from the Western European ones, which anyway didn't derive from them. I have explained in my previous letters which subclades may have derived from these haplotypes there and which not. Evidence? You lack:
R-M335
R-V88 and all subclades (not older in Africa and Middle East than 5000 years)
R-L389+ (except the haplotype with YCAII=23-23 found in Armenia, whereas Italy has all the 4 hts known so far)
R-Z2109-Y4512 only in Western Europe
R-Z2110 and subclades found in Western Europe and back migrated Eastward as CTS9219
R-M269
R-L51
R-L11
R-U106
R-P312 and all the Western European subclades...
Evidence?

Another cool minor thing about this paper with the outgroup f3 stats is that with the Baltic_BA we see for the first time LNBA populations who get their highest f3s with Eastern European populations outside Lithuania.
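As a rough illustration of what an outgroup f3 statistic measures: f3(Outgroup; A, B) is the average over SNPs of (o - a)(o - b), where o, a and b are allele frequencies, and it grows with the amount of drift A and B share after splitting from the outgroup. The sketch below leaves out the finite-sample correction that real tools apply, and the four simulated populations and their parameters are hypothetical.

```python
import numpy as np


def outgroup_f3(outgroup, pop_a, pop_b):
    """Uncorrected f3(Outgroup; A, B) from per-SNP allele frequencies."""
    o, a, b = (np.asarray(v, dtype=float) for v in (outgroup, pop_a, pop_b))
    return float(np.mean((o - a) * (o - b)))


rng = np.random.default_rng(2)
n_snps, drift = 20000, 0.05


def drifted(p):
    # One branch of random drift, kept inside valid frequency bounds.
    return np.clip(p + rng.normal(0.0, drift, n_snps), 0.01, 0.99)


outgroup = rng.uniform(0.2, 0.8, n_snps)
shared = drifted(outgroup)   # ancestor shared by A and B only
pop_a = drifted(shared)
pop_b = drifted(shared)
pop_c = drifted(outgroup)    # splits off without the shared branch

f3_ab = outgroup_f3(outgroup, pop_a, pop_b)
f3_ac = outgroup_f3(outgroup, pop_a, pop_c)
print(f"f3(O; A, B) = {f3_ab:.5f}, f3(O; A, C) = {f3_ac:.5f}")
```

Ranking present-day populations by their f3 with an ancient sample, as in the paper, is the same comparison repeated with each modern population in the B slot.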

"2x I2a1 in Narva So the Slavic I2a1 has hunter gatherer origin or farmer?"

There is no doubt that the ancient paternal "Slavic" marker is of hunter-gatherer origin. Currently, however, we have no certainty as to its exact whereabouts in the Neolithic and Bronze Ages. None of the ancient DNA discovered so far is "Slavic" I2a (=CTS10228 and subclades). Which is not too surprising, since the formation time according to YFull is 3,300 BCE, followed by a very lengthy 3,000 year "survival period". And none of the specific aDNA markers post M-423 which "lead up" to the Slavic I2a have been found either. Only side branches thereof, quite removed. Dnipro-Donets, Motala, Narva (here discussed) are just that. We would need an L-621 to see some possibility of ancient movement. So at the moment, the best guess is that the ancestors of the I2a Slavs were hiding just about anywhere they could not easily thrive. In "hunter-gatherer rich" areas of the north...

"In "hunter-gatherer rich" areas of the north..." Agreed. Vadim Verenich published a very extensive article on this topic. His main findings are consistent with the Doggerland theory and the movement of huge numbers of Western hunter-gatherers to the East in the 7th millennium BC (6500-6200BC). The submersion of Doggerland obviously played a great role in the spread of I2a in the Baltic and Eastern Europe. Though we should not rule out a much earlier presence of I2a in Eastern Europe (Swiderian culture (11000-8000BC)). But the post-6000BC movement is obvious in the forest and forest-steppe zone cultures of Eastern Europe. There was a growing presence of warfare and violent deaths in that period. Its impulses even reached the Volga and the Urals. https://verenich.wordpress.com/2013/12/27/происхождение-гаплогруппы-i2a-и-путь-миг/

Indeed. I read Verenich's great study a while back. I'm not a linguist, but I found one of his later points intriguing: the notion that Slavic speech was basically formed via a "deflection" of R1a "Baltic" by the lost language of I2a HG's. Do you happen to know the source of this idea? He has a bibliography but does not cite a source for this point specifically.

OT: For the other paper we haven't talked about much here, anyone have any thoughts about the two Mesolithic, pre-Neolithic Sardinian mtdna sequences? http://www.nature.com/articles/srep42869. I3 and J2b1, not U, suggesting the pre-Neolithic may not have been WHG / Villabruna, in which case lower degree of WHG incorporation in Sardinia compared to the rest of Europe may make sense.

Of course, there is some increased WHG compared to Anatolian, so incorporated groups must have either been richer in WHG than Anatolian, or substantial population replacement happened after the Early Neolithic (both seem likely).

They run models supporting total replacement over any admixture, but unsure how strong this modeling is, and how much it squares (or does not) with previous work earlier this year on Sardinian specific mtdna haplogroups, and with effects of culture and natural selection.

Remaining question is also how typical WHG-atypical mtdna was for South Europe below the 43rd parallel before the Neolithic. Is this general or specific only to Sardinia? Continenza (central Italy 11200 BP - 10510 BP) suggests WHG-typical mtdna was typical.

So who wants to bet that one of the western Yamnaya groups will be identical to Gyvakarai1, Plinkaigalis242 and Ardu2, including a shitload of Z645?

How similar is Latvia HG ZVEJ32 to these? I know it's not formal, but nMonte seems to show German Beakers, for example, have a preference for Latvia HG ZVEJ32 over other WHG, almost every time. It's like ZVEJ32 is something like what Yamnaya ran into before they made their incursion deeper into Central Europe.

Or, Western Yamnaya really was ZVEJ32, Gyvakarai1, Plinkaigalis242 and Ardu2, and that's why nMonte is picking ZVEJ32 for BB.

Latvia HG ZVEJ32 is probably very similar to the new Narva samples, and might be acting as a proxy for southern Narva ancestry in eastern Bell Beakers.

On the topic of tests that can be done with new samples, doesn't apply to the Scythians, but the Bronze Age Baltic might be worth running through the Eurogenes K13 and K15.

Probably sounds a bit retro and crazy to do on ancient samples I know, with the massive projection problems that have come up... but by the time of the Bronze Age Baltic (which is later than 1000 BC I think), you should start to have captured a lot more of the modern drift, and issues with samples being well outside of the scope of modern variation (and so shrinkage) should decrease.

I've been looking at Eurogenes K13 and K15 lately, and with the British Iron Age Hinxton samples, when they were run in K13 they had a position richer in the North_Atlantic component than any present day populations. Effectively giving them exaggerated positions within the Scandinavian or Scottish/Irish cluster. Esp Hinxton2 and Hinxton3.

(On this tangent, I was pleasantly surprised from not looking at it for ages how this Eurogenes K13 ADMIXTURE captures population distances and structure within present day populations really well. Probably very slightly under specified for East Asia at least, but great, esp it seems to me once distances are accounted for via PCoA adjustment).

It would be interesting to see how the Baltic_BA functions here, whether similarly to the Iron Age British samples with North Atlantic, they have an exaggerated degree of closeness to the Baltic component, above what is found by modern people. (In a sense might be more interesting than the Basal K7, or anything like that, which might tell us what we'd already expect to know and shows weaker splitting and structure between the NE and NW Europe, based on its components that form around older drift...)

Interesting that the excess of ANE in Okunevo is also accompanied by some ASI. I think this is the same thing we see with the Srubnaya_outlier, which also has that excess of ANE and some amount of ASI. So maybe instead of MA1 types being further east in Siberia they were further south around the Pamirs?

In Ryu's post above, Okunevo gets some 4% Chamar. So I checked that and I get the same results. Then I checked Srubnaya_outlier to see if it was noise in Okunevo (low quality) but I get similar results:

The Chamar are more West Eurasian than anything else; their ASI is probably somewhere between 45% and 30%, so Chamar scores for Okunevo are likely reflective of shared ANE and Basal, as Shaikorth has already suggested.

Thanks for testing that. After further investigation, this is probably related to some excess of ANE not well captured by other samples. Using weighted values and adding Karitiana I could get Paniya to 0% in the Srubnaya_outlier. With Itelmen instead, it is reduced to 1.2%. It didn't work for Okunevo, but I'm less confident in the quality of that sample.

Yes, using the D-stats datasheet I see similar behaviour as qpAdm. ASI is all over Europe. I think this is dependent on the outgroups, though, since using Georgian instead of Kotias in the columns makes most of that ASI change into CHG, which is probably what it is. I think I haven't tested the effects on S-C Asia. I'll see if I have a sheet still around with those changes to give it a try.