Focusing on European population genetics and modern physical anthropology.

Saturday, January 23, 2016

The enigmatic headless Romans from York

My dataset was recently enriched with six ancient individuals from Roman York, courtesy of Martiniano et al. 2016.
They were either gladiators or soldiers. Each one was decapitated. This may have been a coup de grâce or a burial rite. At least one, 3DRIF-26, was not native to Britain.
In fact, isotopic evidence suggests that he spent his childhood in a region with a hot and dry climate such as North Africa or the Levant. Moreover, his top matching population in terms of pairwise Identical-by-State (IBS) allele sharing is present-day Saudis (see here).
However, I thought it might be useful to revisit 3DRIF-26's genetic affinities after taking into account his non-trivial Sub-Saharan admixture. This can be done with qpAdm. The best ten models are listed below.
Please note that in the last model I had to use 3DRIF-26 as a mixture source for present-day Egyptians, because he has less Yoruba-related admixture than the Egyptians.
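For readers unfamiliar with the pairwise IBS allele sharing mentioned above, here is a minimal sketch in Python. It assumes diploid genotypes coded as 0, 1 or 2 copies of one allele, with None for missing calls. The function name and the coding are illustrative assumptions, not the pipeline behind the results above.

```python
def ibs_similarity(genotypes_a, genotypes_b):
    """Mean proportion of alleles shared identical-by-state between two samples.

    Genotypes are coded 0/1/2 (copies of one allele); None marks a missing call.
    Each site contributes 2, 1 or 0 shared alleles out of 2.
    """
    shared = 0
    compared = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip sites missing in either sample
        shared += 2 - abs(a - b)
        compared += 2
    if compared == 0:
        raise ValueError("no overlapping non-missing sites")
    return shared / compared


# Hypothetical toy genotypes, for illustration only
ancient = [0, 1, 2, None]
modern = [0, 1, 0, 2]
print(ibs_similarity(ancient, modern))
```

Ranking reference populations by their average similarity to the ancient sample gives a "top match" list of the kind referred to above.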

I'd say these results provide rather convincing evidence that 3DRIF-26's West Eurasian ancestry is derived from the Levant. Moreover, his relatively high level of Sub-Saharan admixture suggests that he came from the southern Levant or perhaps a nearby region, like the Sinai Peninsula.
Interestingly, the best models feature a couple of religious minorities (Samaritans and Lebanese Druze), an island population (Cypriots), and a fairly unique group in terms of genetic structure from Israel's Negev Desert (BedouinB). This suggests that 3DRIF-26 may have belonged to a similar religious or geographic isolate population, or, alternatively, that most of the Levant has experienced significant genetic shifts since he was alive.
The rest of the headless Romans were, in all likelihood, born and raised in or near Britain. However, two of the individuals, 3DRIF-16 and 6DRIF-3, show elevated IBS affinity to Lithuanians and Poles. At the same time, they both belong to Y-chromosome haplogroup R1b-U106 (aka M405), which is a marker generally thought to have arrived in Britain with Anglo-Saxons and Scandinavians. This might be a coincidence, but probably not.
D-stats confirm that they do show elevated Northeastern European affinity relative to the other three Romans. Only one of the Z-scores is statistically significant (>3), but most of the others would probably also reach significance with more SNPs and higher quality sequences.
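As a rough illustration of how a D-statistic and its Z-score are obtained, here is a minimal Python sketch. It assumes per-SNP allele frequencies for four populations (W, X, Y, Z) and uses a simple unweighted delete-one-block jackknife over equal-sized contiguous blocks. Real tools such as ADMIXTOOLS use a weighted block jackknife over genomic blocks, so treat the function names and the blocking scheme here as simplifying assumptions.

```python
import math


def d_statistic(sites):
    """D(W, X; Y, Z) from per-SNP allele frequencies (w, x, y, z)."""
    num = 0.0
    den = 0.0
    for w, x, y, z in sites:
        num += (w - x) * (y - z)
        den += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    if den == 0:
        raise ValueError("no informative sites")
    return num / den


def jackknife_z(sites, n_blocks):
    """Z-score of D using an unweighted delete-one-block jackknife."""
    n = len(sites)
    if n_blocks < 2 or n_blocks > n:
        raise ValueError("need between 2 and len(sites) blocks")
    # Split SNPs into contiguous blocks (a stand-in for genomic blocks)
    blocks = [sites[i * n // n_blocks:(i + 1) * n // n_blocks]
              for i in range(n_blocks)]
    estimates = []
    for i in range(n_blocks):
        rest = [s for j, blk in enumerate(blocks) if j != i for s in blk]
        estimates.append(d_statistic(rest))
    mean = sum(estimates) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((e - mean) ** 2 for e in estimates)
    return d_statistic(sites) / math.sqrt(var)


# Hypothetical toy frequencies, for illustration only
toy_sites = [(0.8, 0.2, 0.7, 0.3), (0.6, 0.4, 0.9, 0.1),
             (0.7, 0.3, 0.6, 0.4), (0.9, 0.1, 0.8, 0.2)]
print(d_statistic(toy_sites), jackknife_z(toy_sites, n_blocks=4))
```

With fewer SNPs the jackknife standard error grows, which is why a real signal can fall short of the |Z| > 3 threshold in low-coverage samples.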

My guess is that 3DRIF-16 and 6DRIF-3 were Britons of mixed origin, with recent ancestry from Scandinavia and/or East Central Europe. Indeed, they can be modeled with qpAdm as part Swedish and Polish.

113 comments:

How come no analysis of the Anglo Saxon? I am curious as to the SHG affinity and in general the affinity of that sample to all of the ancient groups. Would be interesting to see if it clusters closer to BBC or CWC, and the relation to Neolithic and Mesolithic Scandinavians.

U106 isn't a big surprise here. Scandinavians have been moving to Britain since well before the Romans. Remember the Kent grave with several Scandinavians? U106 could be there since then, or even Bronze Age. There is also the chance that these U106 individuals are those Germanics which refused to become foederati, and were imprisoned and shipped around the empire. Have you tried modelling the individual as USA German and Scandinavian?

Dave: it is debated, but scholars long suspected that some of the Belgae that moved to Britain in the 1st century BC had Germanic or partially Germanic ancestry and had come from the East not very long before the Roman conquest of modern day Belgium. My first hypothesis would be that they are behind the more eastern like aDNA and R1b-U106. The Wikipedia article on Belgae discusses this question at length. https://en.wikipedia.org/wiki/Belgae

Besides IBS with Lithuania/Poland there's no good evidence that any of the Headless Romans had lots of Continental ancestry. They didn't have high IBS with Belarus, what's up with that? Shouldn't Belarus share any recent ancestry a Headless Roman has with Poland and Lithuania?

IBS, British/Dutch-only PCA, and fineSTRUCTURE confirm a close relationship between 6/7 of the Headless Romans and Isles Celts, especially Welsh.

British/Dutch only PCA: One extreme was represented by Irish and one by Dutch. Of British, English and Anglo Saxons were closest to the Dutch cluster, and all the headless Romans clustered with Welsh, and didn't strongly deviate towards Dutch at all. This doesn't support them having ancestry from Germania which would have been similar to the Anglo Saxons and modern Dutch.

Thanks for the very interesting analysis, especially regarding individual 3DRIF-26.

I have had a look at the original paper and noticed some interesting findings in figure 1 (http://www.nature.com/ncomms/2016/160119/ncomms10326/fig_tab/ncomms10326_F1.html).

In the PCA plot (fig. 1a) 3DRIF-26 clusters with modern Jordanians and Palestinians, rather than Saudis or Bedouins, and especially not Cypriots. Same results can be seen in the ADMIXTURE analysis in fig. 1b. I was wondering, how accurate are these results relative to your qpAdm results?

Generally, I would agree that this individual came from southern and more inland rather than northern and more coastal Levant. In my opinion northern Levant (as well as Cyprus), became much more shifted to a SW Asian component during historic years, mainly through the Arab expansion. 3DRIF-26 on the other hand, showed substantial SW Asian admixture, even before any Arab expansion occurred.

I read this paper "Headless Romans" which is about deciphering the geographic origins of the samples. It confirms there were Continental-Europeans there, but the Headless Romans were mostly from Britain or even York. I find it possible but unlikely any of the Headless Romans we have DNA from had significant Continental European ancestry.

Strontium/oxygen isotopes from 18 individuals suggest most grew up near York. A good number, though, were consistent with being from a more northern latitude and a few were consistent with being from a more southern latitude. Combined, about half have somewhat strange results for someone who grew up in York. 2/18 for sure did not grow up in Britain. One probably grew up in Gaul or Germany or further east and one somewhere in Southern Europe.

The results do suggest diverse origins of the Headless Romans, but most were probably from York. If not from York, most were from Britain. What makes them diverse is that many were probably from different parts of Britain than York and some from other parts of Europe. The Arab/Levant guy confirms diversity in the Headless Romans, and that their violent lives were started/controlled by the Roman empire and therefore involved people from other parts of the empire.

Has anyone considered that the U106 guys were relatives and making U106 over represented?

@Slumbery, "Dave: it is debated, but scholars long suspected that some of the Belgae that moved to Britain in the 1st century BC had Germanic or partially Germanic ancestry and had come from the East not very long before the Roman conquest of modern day Belgium."

There's recent ancestry shared between Dutch/Anglo Saxons that Germania would have shared. None of the Headless Romans show evidence of having this Germanic ancestry. Having Belgae (if they weren't significantly German) or Gaulish ancestry makes more sense than Germanic.

There are a couple of issues with the PCA in the paper. It includes a lot of samples with high levels of Sub-Saharan admixture, so the ratio of West Eurasian to Sub-Saharan ancestry is a major factor for samples that have significant Sub-Saharan admixture. In other words, it's not very informative for such samples in regards to the fine scale nature of their West Eurasian component.

The ancient samples were projected onto eigenvectors computed with the modern samples, which means that the PCA space for the ancient samples is slightly shrunk, and as a result they appear closer to 0 in both dimensions than they should. This is a common problem in ancient DNA papers.

The ADMIXTURE analysis doesn't include a Sub-Saharan cluster. It just has a North African cluster, which means that any samples with significant Sub-Saharan admixture will show elevated membership in the North African cluster.

Times of Grace,

Ashkenazi_Jew 0.912
Yoruba 0.088
chisq 16.103 tail prob 0.00288466

Sephardic_Jew 0.923
Yoruba 0.077
chisq 10.153 tail prob 0.0379256
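The tail probabilities in these two outputs can be reproduced from the chisq values. A minimal Python sketch follows, assuming the fits have 4 degrees of freedom. That number is an inference from the reported figures, not something stated in the output, and the function name is illustrative.

```python
import math


def chi2_tail_prob_even_df(chisq, df):
    """Upper-tail probability of a chi-square statistic for even df.

    For df = 2m the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = chisq / 2.0
    total = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * total


# df=4 is an assumption that happens to match the reported tail probabilities
for label, chisq in [("Ashkenazi_Jew + Yoruba", 16.103),
                     ("Sephardic_Jew + Yoruba", 10.153)]:
    print(label, chi2_tail_prob_even_df(chisq, df=4))
```

A lower chisq means a higher tail probability, so the Sephardic_Jew model is the better fit of the two, although both are fairly poor.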

Krefter,

Belarusians are not identical to Poles and Lithuanians, so that's not a compelling argument.

Tesmos,

Not sure. I think it's HS2 from the Schiffels paper. This individual is mostly Germanic, but shows a mixed character in my analyses.

Thanks. First I thought Sephardic_Jew results would be close to Cypriot results because of similar admixtures. But the Ashkenazi result is more surprising because it's a lot further away in comparison. Is it only because of greater European mixture in Ashkenazi than Sephardic or would that be simplifying things too much?

Ashkenazi Jews are largely of Near Eastern ancestry, but they have a fairly complex genetic history that probably includes admixture from a couple of different parts of Europe and maybe even East Asia (via the Silk Road), plus they've gone through a very strong founder effect. So I don't think they'll prove to be an accurate proxy for any ancient Near Eastern population.

Thanks again for the info. Yes, I thought they would be closer but how can it be proven or assumed to be of mostly Near East ancestry? 23&me has the best Ashkenazi ancestry calculator and even their advisor Itsik Pe'er describes the Ashkenazi makeup between 45-55% Near Eastern-European split https://www.youtube.com/watch?v=QHMYGuXEBZI

Druze have been endogamous for even longer and Ashkenazi Jews are naturally more European than them because of admixture. Everyone accepts that. But to be majority Near Eastern, minor European, and closed off should mean they're genetically closer to other closed off minorities like Druze and Samaritans in the Near East than the Sephardic Jews are -- who are from the same population but historically less endogamous -- but that's not the reality. Khazar [bs IMO] or East Asian means they'd pull more East compared to Sephardic Jews but that's not reality either. This ancient Levant genome must be regarded too because it's so different to what everyone expected the old Near East to be like.

Itsik Pe'er uses modern European and Near Eastern populations to estimate European/Near Eastern ancestry proportions in Ashkenazi Jews.

So to assume that his estimates are close to reality, we have to assume that he's using the correct reference samples. Since we don't have any ancient DNA pertaining to Jewish and Ashkenazi ethnogenesis, then that's a very big and risky assumption IMO.

Also, Ashkenazi Jews aren't just an isolate. They're the result of a severe late bottleneck involving, at most, a few hundred people. So not only do they have all of these minor exotic admixtures that Samaritans and Druze lack, but some of their allele frequencies associated with these admixtures might be very skewed.

The upshot is that their overall genetic structure can easily be in large part of ancient Near Eastern origin, but we have no way to really test this yet, because we don't know what their Near Eastern ancestors were like, but even so, they can't be expected to be useful proxies for ancient Near Easterners, because their population history is so unusual.

Sephardi Jews have a much higher effective population size, and perhaps contrary to what most people think, they don't carry much admixture from outside of the East Mediterranean region. So I'm not surprised they provide a better fit for the Roman outlier.

It’s of course a really great treat to find this middle eastern genome in, of all places, Roman England! However, I wouldn’t say there’s anything that unexpected about 3DRIF-26. The only minor surprise is seeing a nice chunk of African ancestry already present *prior* to the Islamic Arabian expansion. Previously I believed that Yemenite Jews are a good proxy for ancient Arabians, but they lack African while 3DRIF-26 doesn’t, so…

Anyways I’ve always believed there to be subtle but significant population structure within the middle east at that time, and 3DRIF-26 supports that idea. That said he does just look within the range of modern Arabians / Bedouins. I’d guess he was a Nabatean but he could’ve been one of many related desert-dwelling groups in the region (isotopic evidence suggests arid environment).

BTW, can someone remind me the exact difference between BedouinA and BedouinB? I’ve always found it strange that they act quite differently in various analyses.

PF, Yemenite Jews don't lack (East) African admixture. But it is ancient, so it can be hidden by the presence of a SW Asian component.

BedouinA are similar to Levantine Muslims, including sizeable CHG ancestry, and minor West African from the Islamic slave trade. BedouinB are very "Arabian", and their minor African admixture is mainly East African, same as in 3DRIF-26 and non-Muslim Middle Eastern minorities.

But on the other hand it's proven to be an equally risky assumption to think modern populations are so very different from ancient populations -- as we just saw with the ancient Levant genome. My guess is north Sinai origin but that's compared to modern populations. I expected ancient genomes from Israel by now because there's much interest in facts over assumptions. Behar, Pe'er, and others have focussed on Jewish genetics for years but no useful ancient genome has appeared yet and I don't know why. It can't be due to lack of funding.

Individually the Sephardim show Ashkenazi, Southern Europe, North Africa, Near East, and less significant ancestry in differing quantities that can be explained by levels of local admixture depending on the Sephardic population. But I still thought it would affect them more.

I don't think Druze would lack exotic admixture. The 2008 Shlush study on Druze supports their oral tradition of heterogeneous ancestral origins that began after the Islamic expansion and supports endogamy linked to local village and cousin marriage. So it should be expected to affect their genetics after 1000 years in some similar ways to Ashkenazi Jews. But the surprise is today's Druze are not greatly different from their neighbors. It too can mean other areas of the modern Near East are less different than we should expect compared to 1000 years ago or more.

Regarding the lack of ancient genomes from the Levant, the hot climate is an issue, which seems to have been overcome recently, but it's also politically sensitive in Israel. That said, I know some Bronze Age remains were sent to Australia just recently.

"The only minor surprise is seeing a nice chunk of African ancestry already present *prior* to the Islamic Arabian expansion. Previously I believed that Yemenite Jews are a good proxy for ancient Arabians, but they lack African while 3DRIF-26 doesn’t, so…"

The best solution to the SSA admixture conundrum is to compare Middle Eastern populations (including 3DRIF-26) side by side in a k=3 or k=4 run, so there'll be minimal 'hiding' of SSA in other components. The paper's own global PCA positioned 3DRIF-26 among the least SSA-shifted Middle Easterners, and that's with projection bias that presumably drew him closer to SSAs than he should be, probably making him the least SSA-shifted Middle Eastern individual there. It's confusing how he models as more SSA-admixed than most modern Middle Easterners.

I'd be very surprised if 3DRIF-26 had less Sub-Saharan admixture than Yemenite Jews. And they do have quite a bit, probably around 8%, but a lot of it is hiding in the various Southwest Asian, Red Sea, East Med and other southern components.

Other Near Eastern and Mediterranean populations have a couple per cent at least. It only disappears completely in the North Caucasus.

That's why it's so hard to estimate Sub-Saharan ancestry with ADMIXTURE; there aren't enough unadmixed samples from Southern Europe and the Near East. It's only really possible to get it right using formal methods and a 0% SSA baseline set by Neolithic Anatolians and Caucasus HGs.

@Dave: First a minor correction. While the York graves included many beheaded people, acc. to the Supp. Mat all six samples were not beheaded. They seem to have been selected for technical reasons - as the Supp.Mats point out, aDNA appears to be best conserved in the inner ear bones, and those are difficult to recover from headless corpses. However, all six had fighting wounds, partly also cuts in the neck, so they obviously died a violent death.

Can you run the two Lithuanian-shifted samples against Hungary_IA? The Marcomannic Wars, and possibly also the Pannonian Upheaval before, resulted in quite some migration towards Silesia and possibly beyond; you had reported about some "out of Pannonia" movement into Eastern Europe before. Hence, I could imagine those two to be Pannonians (pre-Hunnic/Slavic/Magyar etc.), either serving with the Roman army or becoming gladiators after having been caught during the Marcomannic Wars.

Otherwise, let's not forget that the Suebians (including subgroups such as the Vandals and Burgundians) once dwelt in today's Poland - the Oder river was called "Suebian river" by Roman authors. The first Suebian migration westwards already occurred during the 2nd/1st century BC, possibly connected to the Cimbri and Teutones crossing their lands. Under their leader Ariovistus, the Suebians made incursions into Gaul, which gave Caesar the cause for the Gallic War. Some of those Suebians might well have made it into Gaul and later into Britain. Suebian soldiers in 3rd century Roman service can't be excluded either. And Suebian DNA, to the extent it was differentiated from other Germanics, should rather be expected in Swabia, Alsace, Burgundy, even NW Iberia, than in contemporary Dutch and North German samples.

"Whichever the identity of the enigmatic headless Romans from York, our sample of the genomes of seven of them, when combined with isotopic evidence, indicate six to be of British origin and one to have origins in the Middle East."

And I can't run these samples against Hungary_IA, because there aren't enough markers to get solid results.

At some point, maybe this year even, we should see some genomes from Migration Period era Central and Eastern Europe, which might prove useful in this context.

So what do you think for now, David? That he may actually be a Levantine or does Nabatean seem more likely? I'm guessing no matter the outcome his genome suggests significant change in the region? Does anyone know when he will be uploaded on gedmatch?

I'd be very surprised if 3DRIF-26 had less Sub-Saharan admixture than Yemenite Jews. And they do have quite a bit, probably around 8%, but a lot of it is hiding in the various Southwest Asian, Red Sea, East Med and other southern components.

Other Near Eastern and Mediterranean populations have a couple per cent at least. It only disappears completely in the North Caucasus.

That's why it's so hard to estimate Sub-Saharan ancestry with ADMIXTURE; there aren't enough unadmixed samples from Southern Europe and the Near East. It's only really possible to get it right using formal methods and a 0% SSA baseline set by Neolithic Anatolians and Caucasus HGs.

David, can you explain how you calculate the levels of Sub-Saharan ancestry across West Asia, Southern Europe and the Caucasus, preferably in a separate blog thread? Dienekes had opposing views to you on that topic, and I think this is a topic that deserves more attention.

Basically I use Anatolia Neolithic and/or Kotias as pure West Eurasian references, and Yoruba as the Sub-Saharan reference. Considering that Yoruba probably has some Near Eastern Neolithic farmer ancestry, then ~9% Sub-Saharan admixture for 3DRIF-26 might be too high. But if so, it's only 1-3% too high at most.
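The size of that possible overestimate can be sketched with simple arithmetic. The Python snippet below assumes a plain two-way mixture in which the Yoruba reference itself carries some West Eurasian ancestry. The function name and the example fractions for Yoruba are hypothetical, chosen only to show how the correction behaves.

```python
def corrected_ssa(yoruba_fraction, west_eurasian_in_yoruba):
    """Sub-Saharan ancestry implied by a Yoruba-based estimate.

    If the target is modelled as a * Yoruba + (1 - a) * West Eurasian, and
    Yoruba itself is e West Eurasian, the truly Sub-Saharan part is a * (1 - e).
    """
    if not (0.0 <= yoruba_fraction <= 1.0 and 0.0 <= west_eurasian_in_yoruba <= 1.0):
        raise ValueError("fractions must be between 0 and 1")
    return yoruba_fraction * (1.0 - west_eurasian_in_yoruba)


estimate = 0.09  # the ~9% Yoruba-related estimate for 3DRIF-26
for e in (0.0, 0.1, 0.2, 0.3):  # hypothetical West Eurasian fractions in Yoruba
    true_ssa = corrected_ssa(estimate, e)
    print(f"e={e:.1f}: corrected={true_ssa:.3f}, overestimate={estimate - true_ssa:.3f}")
```

Under this simple model, the estimate only ends up a point or so too high unless the West Eurasian fraction in the reference is large.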

There's no point dwelling on this until we see more ancient genomes from the Near East.

Related to the statement Onur just quoted and his question regarding it, how can Sub-Saharan African hiding in Southwest Asian be distinguished from some back-migrant component hiding in Sub-Saharan African? As in some back-migrant component which is earlier than the Sardinian-like component in Kum6, the existence of which is suggested by osteoarchaeological evidence.

Samples made from the various Southwest Asian and Red Sea, and to a lesser extent even East Med and ENF, allele frequencies pull towards Sub-Saharan Africans on PCA plots. The Southwest Asian/Red Sea clusters have around 15% Sub-Saharan admixture on average.

Thank you for bringing up K8. According to K8, most northern West Asians, Southern Europeans and Caucasians have Sub-Saharan ancestry (the Sub-Saharan plus Pygmy components) at most in noise levels (less than 1%).

Samples made from the various Southwest Asian and Red Sea, and to a lesser extent even East Med and ENF, allele frequencies pull towards Sub-Saharan Africans on PCA plots.

It seems that would be the case whether the commonality was from pre-Neolithic back-migration from Southwest Asia to Sub-Saharan Africa or the other way around. It seemed like you were assuming that it is all due to SSA to Southwest Asia, so I was asking if you had a statistical reason for assuming that. I'm not doubting the more recent SSA to Southwest Asia geneflow, I'm just curious about methodology for sorting it out, but anyway your response to Onur while I was still typing, that "there's no point dwelling on this until we see more ancient genomes from the Near East", pretty much answers the question.

For South Europeans maybe Yoruba is still a better reference, even if it might have a small amount of West Eurasian admixture (based on the assumption that Near Easterners have East African admixture, but NW Africans have West African admixture).

I'd personally be interested in knowing how much Sub-Saharan there is in Spaniards. From admixture it would seem that on average it should be around 1.5%, but the Near Eastern cluster might be confounding things.

David, could you try when possible these stats? (Checking with Icelandic too for reference of theoretically 0% SSA):

Yes, that last one is repeated. I made 2 pairs of stats, one for Spanish and one for Icelandic, but obviously the second one of each pair is the same, so no need to run it twice.

@Onur, you're probably right. But each population can have its own possible problems. I just wanted to see the ballpark of an f4 ratio to check if there was big difference with Admixture. If David at some point wants to make a post about all this he can run many stats changing the reference populations to check how robust the results are (for example, for Europeans I was thinking that maybe Loschbour is better than Anatolia_Neolithic, but I'm not sure either way. In theory both should work).

@Davidski, would you really attribute the potentially minor "Eurasian" ancestry in the Yoruba to Neolithic farmers? Is this assumption based on the Mota genome? I find that highly suspect, however, gene-flow from a "Basal Eurasian" source (with origins in North Africa or a South Arabian isolate) could possibly explain such affinities; that would explain why this apparent signal of "Eurasianess" across Africa shows very little variation across highly disparate populations, i.e. Yoruba, Mbuti, Biaka, Dinka, and San. A "Basal Eurasian" would clear things up, maybe we'll find something in the Fezzan or Yemen soon.

@Labayu, It appears that East African or like ancestry is largely only present among SW Asians, particularly Semitic speakers (apparently disappearing completely once you reach the North Caucasus), and that within this region there's an obvious north-south cline of such ancestry, wouldn't that imply gene-flow from an external source? IMHO, this layer of East African ancestry in SW Asia is indicative of gene-flow from across the Red Sea related to the introduction of Semitic to the region.

For a start, it's strange that all populations (modern and ancient) show higher affinity to Mbuti than Anatolia Neolithic does, resulting in all showing some percentage of SSA admixture. Maybe Mbuti has some admixture most similar to Iberia_EN that causes this? I guess that to get really accurate results quite a number of reference populations should be used to cross check and get a good estimate.

East Africans have elevated Eurasian ancestry, specifically of its Caucasoid variety, compared to the rest of Sub-Saharan Africans. So this fact must be taken into account when estimating the levels of Sub-Saharan ancestry in Europe, West Asia and North Africa.

@Onur, no one's disputing that, but SW Asians are shifted towards all Africans relative to other Eurasians; Polako's estimates are based on the Yoruba for example, accordingly, African admixture in SW Asia would only increase if using an ancient East African genome (void of non-African admixture). Dinka or Mota would be fine, but ~25% of the latter's ancestry seems to derive from an ancient African population at the root of AMH.

???? So Mbuti is basically equally close to Yoruba and to Anatolia_Neolithic? Then very wrong assumption on my part that Mbuti would form a clade with Yoruba to the exclusion of Eurasians.

But I'm afraid that even if we used a Yoruba-like population instead of Mbuti the results would not be accurate due to different factors (minor Eurasian admixture in Yoruba, close to noise levels of Sub-Saharan in South Europeans that would make estimates very dependent on reference populations, etc...) Maybe we can do with Admixture for now.

Yeah, West Eurasian in East Africa doesn't affect Dinka as per this: East_Asian French Dinka Chimp 0.009 2.07 - in fact it seems East Asians are more related to Dinka. Stat is from Wong et al. and based on high coverage sequences so it should be as accurate as we can get for now. Mbuti in place of Dinka showed the same numbers, Yoruba was closer to neutral vs French/East Asia.

Sub-Saharan admixture in SW Asia (the Semitic-speaking zone specifically) is evident from ADMIXTURE analyses too. What is in dispute is the existence of more than noise levels of Sub-Saharan admixture in the majority of Northern West Asians, Southern Caucasians and Southern Europeans. ADMIXTURE does not seem to support it and the results of formal analyses can be interpreted in the reverse direction too: backmigration to Africa from West Eurasia.

Mbuti are from near the Ugandan border of the DRC, and most of their languages and closest neighbours are Central Sudanic (usually classified as Nilo-Saharan) rather than Bantu. So we should not really expect them to be close to Yoruba. It would be a different story if using Baka pygmies.

@ Shaikorth, thanks for that. Pretty important revision since that was the main / only novel finding of that paper (and was quite suspect to me at the time). Moved things back to West Eurasian admixture being exactly where it is expected to be.

The revision also actually makes sense, which is always a nice thing for science to do.

I wonder if this means that other stats using Mota as an African outgroup are now sort of invalid?

One thing I don't understand is "(the analysis of admixture with Neanderthals and Denisovans was not affected). "

So er.... what? Yoruba and Mbuti *do* have minor Neanderthal admixture relative to Mota, but *no* West Eurasian, or indeed Eurasian at all, admixture relative to Mota. How does that work then? How is that signal actually going to be Neanderthal related (and not some other kind of "basal to Homo Sapiens" signal)?

@Matt, one thing to note is that East Africans also appear less Western Eurasian admixed than previously assumed; take for example the Somali and Tigray-Tigrinya who are now ~30% and ~40% WE, an almost ~10% decrease from earlier estimates. The fact that about ~25% of Mota's ancestry is derived from a divergent African population at the root of AMH makes me think that WE ancestry in East Africa will likely drop some more in the future.

Yes, I really didn't give it much thought about choosing a population, except actually not being specifically related to Yoruba, thinking that as long as they form a clade it would be better not to be identical. But it was apparently a very naive assumption that just for being Africans they would form a clade to the exclusion of Eurasians.

Though I do wonder why Admixture has no problem telling apart Africans from non-Africans if indeed Yoruba is equally related to Mbuti Pygmies as it is to Anatolian Neolithic farmers. Pure random luck that never fails? Probably not. Maybe the way D-stats work, which can be mysterious at times.

Regarding the revision in the Mota paper, it also makes me wonder about those stats with Mbuti. So do modern Europeans have significant Mbuti admixture when compared to Anatolian farmers? The number of markers is high and the Z-score significant. Can we call it noise?

ADMIXTURE can tell non-Africans because they are a clade formed with a bottleneck, it's good at that. TreeMix shows Yoruba forming a clade with Eurasians, so it makes sense that Mbuti would be equidistant from them, except you wouldn't think Mbuti would *completely* lack West African admixture.

But the Mbuti D stats with Eurasians, beats the heck out of me. Something North African maybe? But in Iceland? Mysteries abound. Or maybe it's just some artifact of ancient genomes.

@ Alberto, possibly of interest just after the Mota paper came out Davidski ran me off these "double outgroup" (one of the most outgroup humans, JHN, with one of the non-human outgroup Chimp) D stats relating to closeness to Yoruba, in descending order of D:

Seems to indicate that all these populations are strongly shifted on the axis compared to Ju_hoan_North (more unique genetic diversity not shared with Yoruba, more falling away from the Yoruba side?). And that while West African populations tend to be closest to Yoruba, in general many West Eurasians are closer on this axis to Yoruba than East Africans (the Mota stat might need reconsideration)...

(A similar run of stats also showed Eurasians were closer to Biaka than Ju_Hoan_North were, under D(JHN,Pop,Chimp,Biaka) but by a lesser degree of magnitude than the shift towards Yoruba).

Re: ADMIXTURE, and why it groups Africans, it may be that ADMIXTURE basically identifies "Cluster of similar genetic diversity / allele frequencies".

So it generates a cluster of Eurasians, who all have relatively similar allele frequencies compared to African diversity, and African fall outside this cluster, so are thereby in cluster 2 (the genetically diverse cluster) together.

And this continues up through K, as ADMIXTURE identifies successively more Eurasian clusters (as they're drifted together, less genetically diverse, have similar frequencies, relatively identifiable as clusters) and African clusters only become apparent at high K.

I could be wrong though, I don't have much (if any) theoretical understanding of how ADMIXTURE *really* works.

Yes, that makes sense. Much like Native Americans form a clade to the exclusion of Eurasians, but it doesn't imply that any and all Eurasians form a clade to the exclusion of Native Americans (I guess? Haven't really seen this tested, but I think it should be true).

Those stats with Yoruba are very interesting. I've never thought much about the African genetic diversity, and probably from Admixture and PCA's including Eurasian populations got a wrong impression of them being more related to each other than they really are.

I do wonder, though, if IBS would show similar results to Dstats or if with IBS would pick up more on possible admixture between populations (I guess Africans have admixed more or less according to geography?) and less on the original genetic diversity.

The formation of ADMIXTURE components in Africans is dependent on the number (and homogeneity) of the samples, just like with other pops.

You can observe the African cluster formation by comparing for instance Lazaridis et al's run with lots of Africans (Yoruba and Khoisan separate at K=5, Hadza appears at K=10 etc) to Behar et al. 2010 (Mbuti and Yoruba separate at K=6, San sample is tiny and does not get a component, looking more like Mbuti than Biakas even at K=10).

1. Semitic expansion: Dated to around 500 BC, affected especially Amharas. Key marker is of course yDNA J1-P58, with 21% frequency in Amharas, 1.6% in South Sudan, but no occurrence in Oromo speakers.

2. pre-Semitic Persian Gulf expansion: That's the expansion dealt with by the Mota study. Markers are yDNA J1(xP58) and T. T has been found in two LBK samples from the Elbe-Saale region, which explains the LBK affinity identified in the Mota study, but it should in principle be a CHG (Zagros?) marker that somehow also got involved with EEF. This expansion is well traceable from crops and domesticates. During the late 3rd millennium BC, IVC received black-eyed peas from West Africa, and sorghum and the donkey from somewhere further east, probably the Ethiopian highlands. In return, Africa received rice, cattle, sheep and goats. The trail goes right through to West Africa, marked among others by 5-10% yDNA T among Oromo speakers, 17.6% with the West African Fulbe nomad pastoralists, or 4.8% with Kanuri (N. Nigeria, NS speakers). Wolof, to a lesser extent also Hausa, shares a number of pastoralist and metallurgical terms with Dravidian languages. Some backflow from NW Africa into Iberia is indicated by cattle DNA. A second trail leads through East Africa down to the Cape (23% T with the Tanzanian Akie, 17.6% with some South African tribes, 3.8% with the Maasai, NS-speaking pastoralists like the Kanuri). That migration, which also left genetic traces with Khoisan (also Ju_hoan?), had already been established by another DNA study prior to the Mota analysis.

3. Maritime contact to SEA: Banana planting in coastal Cameroon is archaeologically evidenced by 500 BC. In principle, bananas could have travelled with the above-mentioned "rice and pastoralism" package through the savannas; however, this is unlikely for genetic, linguistic and ecological reasons. All indicators point to direct, maritime import from SEA, somewhere between Burma and Sulawesi. R. Blench assumes bananas to have been part of a full "tropical neolithic package" supplied from SEA, comprising also yam and taro, plus possibly coconuts, chicken and pigs. Blench regards this package as instrumental for the first stage of the Bantu expansion into and through Central African rainforests, before they took over pastoralism in East and South Africa. In that case, the contact with SEA must already have occurred around 2,000 BC, i.e. contemporary with the expansion under (2).

So, initially, we would have had an East-to-West/South "Savanna neolithic" expansion by speakers of non-Semitic Afroasiatic and Nilo-Saharan languages (e.g. Dinka), and a "tropical neolithic" expansion out of southern Cameroon/Nigeria by speakers of Niger-Congo languages (e.g. Yoruba). Later on, of course, both models/language families got overlaid and mixed, which should to some extent also apply to the non-African genetic contributions they entailed.

4. Earlier contacts are evidenced by a) the Basenji, an archaic domestic dog of the "singing", i.e. non-barking, pre-herding dog type present with pygmies such as the Mbuti. According to DNA analysis, the Basenji encountered some Near Eastern (Israel) wolves on their way to Africa. Natufian dog burials are evidenced from 12 kBC (Magdalenian evidence, Le Morin/Dordogne, dates to 13 kBC). b) American Calabash (bottle gourd), documented as early as 8 kBC on the East Coast (Florida) carrying African aDNA. The Calabash may be mankind's earliest domesticated plant - the Afro-American and Asian clades split some 100,000 years ago. 6 different types of American Calabash aDNA have been identified, differentiated by some 50,000 years of evolutionary history in Africa.

Many of you guys are better than I am at sorting out the ways these contacts may be reflected in formal stats; but generally I would go as Siberian as possible when looking for reasonably unadmixed outgroups.

I hate to throw cold water on the party, but is it possible that this one guy was the only true Roman (from the province of Italy) in the bunch?

Or that he was half Italian?

Point 1: Remember, Italians at this time were of a stock that had been untouched by the German invasions of the next century. It was pre Goth, pre Lombard, pre Gepid, pre Frank, pre Norman. They would come up as much more Southern Europe/Middle East shifted because this was before large scale Germanic invasion.

Point 2: Remember, your calculators are faulty. I know lots of modern Italians who come up as "Middle Eastern" on your tests. Everything from Cypriot (in some) to Caucasian (in others) to Algerian (in another) to Red Sea, etc. Before you say "those are ones with Arab admixture," I can tell you that these are blue-eyed Italians with confirmed genealogy from parts of Italy that were never invaded by Saracens. Your calculators are off, and all my Italian friends say the same thing. You simply don't have enough samples from all the peaks and valleys of the Apennines, where folks were unique to start off with and endogamous since then.

Point 3: Is it possible this guy was of Etruscan heritage? All those "exotic Etruscan" adherents always say they were from elsewhere? It's nonsense, but I am making a point about those who would reject Point 1 above and make Point 3 elsewhere. You can't have it both ways!

Yes, that makes sense. Much like Native Americans form a clade to the exclusion of Eurasians, but it doesn't imply that any and all Eurasians form a clade to the exclusion of Native Americans (I guess? Haven't really seen this tested, but I think it should be true).

In ADMIXTURE Native Americans cluster with East Eurasians until the formation of the Native American component, and even then the Native American component is found in smaller levels in Siberians and is itself closer to the East Eurasian-specific component(s) than it is to the other components FST-wise.

Your scenarios on the ancestry of 3DRIF-26 do not make any sense. He genetically clusters neither with modern Italians nor even with modern Northern West Asians. It seems implausible for him to be from a more northern region than the Levant, his isotope analysis also points towards a desert region for his origin.

As for the calculator issue, Davidski is right, Italians usually show up as some type of Italian in his calculators, and if not as Italian, as from a genetically similar population such as Balkan Greeks.

David, can you add the three more Eurogenes Samaritans to the West_Asia PCA? Even though all Israelite Samaritans are very closely related, we've seen that they can diverge from each other in the PCA. Perhaps one or more of these three will also be near 3DRIF-26, and a cluster will emerge?

They do sort of make a cluster. But as you can see, there's something peculiar about the Roman.

He probably has less CHG and EHG than the Samaritans. The difference is only a few per cent, at best, but it suggests that extra CHG and likely EHG entered the Levant during the last ~1500 years, and this process even affected the extreme isolates like the Samaritans.

Ironically, considering that most people were focusing on Sub-Saharan ancestry as the new component in the Near East, this process may have also led to a fall in Sub-Saharan ancestry in much of the Levant. We'll probably soon see whether that's true.

@capra: "I never heard that Africa had Asian rice before the 1st M AD. What evidence is there it came across in the IVC era?"

N.M. Nayar has worked intensively on this question. A shorter paper is linked, a more in-depth discussion is found on p. 135ff of his book "Origin & Phylogeny of Rice" (Google Books). http://www.africarice.org/workshop/ARC/1.18%20Nayar%20ed2.pdf

He puts forward quite convincing arguments for a genetic sister relationship of the African O. glaberrima and Asian O. sativa. "Out of a total of 36 952 bp of common sites studied in O. glaberrima and O. sativa (indica and japonica), the authors identified 519 (1.4%) substitutions between indica and japonica, and 764 (2.1%) substitutions between japonica and O. glaberrima. Oryza glaberrima was equally distant from both indica and japonica. (..) The point mutations that differentiated O. glaberrima from O. sativa indica and japonica were not unevenly distributed, as would be expected if segments from either sativa group had been introduced into the African rice." He furthermore argues that the purported wild progenitor of O. glaberrima, O. barthii, would in fact be a hybrid of O. glaberrima and other African wild Oryza species.
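The percentages in that quote are easy to check from the raw counts. A quick sketch (variable and function names are mine, the three numbers are the ones quoted above):

```python
TOTAL_SITES = 36952  # common sites compared, as quoted

# Substitution counts as quoted for each pairwise comparison.
substitutions = {
    "indica vs japonica": 519,
    "japonica vs O. glaberrima": 764,
}


def percent_divergence(count, total_sites):
    """Substitutions as a percentage of all compared sites."""
    return 100.0 * count / total_sites


for pair, count in substitutions.items():
    print(f"{pair}: {percent_divergence(count, TOTAL_SITES):.1f}%")

# How much more diverged the African/Asian pair is than the two Asian groups.
ratio = substitutions["japonica vs O. glaberrima"] / substitutions["indica vs japonica"]
print(f"ratio: {ratio:.2f}")
```

So on these counts O. glaberrima is roughly one and a half times as diverged from japonica as japonica is from indica. That is only raw divergence, and says nothing by itself about dates.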

While thus proposing an Asian origin of African rice, he only dates its introduction in the above-mentioned book to about 100 BCE. This already is some evolution from "the early centuries of the Common Era" as in the paper linked above, and his original dating (1973) of the 8th-11th century AD, at that time coming out in strong opposition to a proposed dating as early as 1,500 BC. I'd call this dating attempt a compromise between remaining true to positions once (1973) taken, and intellectual honesty.

Archeological evidence shows:
1. Neolithic settlement, combined with evidence of animal domestication, in W. Africa dating back to at least 1,800 BC (Gajiganna/Kursakata, L. Chad Basin, NE Nigeria).
2. Rice traces in the above, plus two prehistoric settlements from the Inland Niger Delta (Mali).
3. While it was difficult to distinguish between gathered wild and domesticated varieties, in one case (Dia) the excavators concluded on domesticated O. glaberrima, AMS-dated to ~2,500 BP.
4. Use of wild O. barthii is secured from around 1,000 BC (Kursakata), possibly from 1,400 BC (Gajiganna). If Nayar is correct with O. barthii being a hybrid between domesticated rice and wild African varieties, this sets a terminus ante quem for the introduction of domesticated rice into WA. If he is wrong, the newly arrived resident pastoralists must quickly have commenced with rice domestication, otherwise we wouldn't be talking about African rice now - in Asia, domestication (non-shattering trait) took 2-4,000 years.

Further evidence comes from linguistics: The "Nigerian" root for rice (Kanuri "shànkáwá", Hausa "shinkafa", Igbo "osikapa") is obviously of S(E)A origin. Whether it is derived from PAA *cǝŋka:m ‘rice outer husk’ & *(kǝ)ɓaːʔ ‘rice plant’, or a hybrid of Proto-Tai *cheo "rice" and Dzongkha (Bhutan) kambjâ ‘dry paddy’, I leave to others to sort out (see link for further tracing of S(E)A roots). The more westerly root *malo "rice" (Mande, Wolof, Peul, Gurma, Mossi, Ewe etc.) is a bit more difficult to trace, but Proto-Hmong-Mien *mbləu ‘rice plant, paddy’ comes reasonably close. http://www.himalayanlanguages.org/files/driem/pdfs/2011Rice%20and%20the%20Austroasiatic%20and%20Hmong-Mien%20homelands.pdf

Thanks for the PCA. It doesn't look like there's much difference between your global PCA and the paper's. I couldn't calculate exactly, because I don't know which SSA populations the paper included. It's strange, though, because the IA, AS, and Roman Britons (outlier excluded) look to be significantly affected by projection bias, since none cluster with modern Britons in the paper's global PCA. Can projection bias affect different samples differently, even in the same PCA?

There are a bajillion East Asian rice-related words, so finding a vaguely similar African one is meaningless. However, the Austronesian one has enough segments to be a decent proposal. But alas I do not have the time to study Kanuri and figure out whether it is viable. :D

http://www.nature.com/ng/journal/v46/n9/full/ng.3044.html

^ Couldn't find evidence of Asian rice introgression into African rice, and determined that African non-shattering alleles are different from Asian ones.

"Sephardi Jews have a much higher effective population size, and perhaps contrary to what most people think, they don't carry much admixture from outside of the East Mediterranean region."

Davidski, the rate of North Atlantic and West Med-type admixtures among Sephardim makes this improbable. Sephardim definitely cluster comfortably in the E. Med [more clearly than Ashkenazim], but it looks like there's definitely appreciable SW European ancestry, as well as a bit of Maghreb (to be expected, no?).

Why hasn't this issue been discussed at more length? Everyone's obsessed with establishing all the various admixtures of Ashkenazim, while hardly anyone ever talks about Iberian influence on the Sephardi genome.

3DRIF-26 seems to be quite close to six Samaritans on this plot, but slightly shifted toward SSA / Egypt. He's not in the same place as the Lebanese_Christians, Palestinians, Jordanians, or Saudis. Could he have been an Alexandrian Jew, with a small amount of local Egyptian admixture?

I don’t have the expertise to assess all those linguistic relationships, but I know that the methodologies involve more than just general similarity. For the two on FrankN’s list whose languages I have some familiarity with, I can see the relationship is very solid. The circumflex in the Akkadian word kurângu indicates a vowel contraction. It would have initially been kuriangu. The “u” is a case ending, so remove that and you have kuriang. If you write it in the Aramaic alphabetic transliteration, which the Persian Pahlavi script is based on, it would be kwryng, which is very similar to the Middle Persian gwrync. It looks like the only shifts are voiceless velar stop to a voiceless uvular stop and a voiced velar stop to a voiceless palatal stop, which are all only slight changes.

Thanks for your expert opinion! It is good to have input from someone who knows the nuts and bolts of the languages and can tell if a word is a look-alike compound or segmented wrong or whatever. (I recall someone trying to compare goyim with gaijin, lol.)

I was actually dismissing the comparison of *malo to *mbleu, not the various vrihi-like ones. With terms like "rice" long-range transmission is plausible, but the number of possible comparisons is also very large. If we start allowing compounds between Proto-Tai and Dzongkha - forget it.

"For South Europeans maybe Yoruba is still a better reference, even if it might have a small amount of West Eurasian admixture (based on the assumption that Near Easterners have East African admixture, but NW Africans have West African admixture)."

What if... OoA had two steps, so
1) west african tropics (pop A)
2) out of tropics (north and south) (pop Bn and Bs)
3) out of africa (pop C derived from Bn)
with Bs surviving in Khoisan, but the Bn population that originally connected Africa and Eurasia essentially disappeared due to Eurasian back migrations and Bantu expansion. Then using Yoruba would be like comparing A->C instead of A->Bn->C so

"A small tribe inhabited the oasis; the ruins of their settlement are scattered between the palms at the north-western shore of the lake. It is said that one of their sources of subsistence were the worm-like crustaceans they fished from the salty lake. They were moved in the 1980s to a new location outside the sand dunes, in the Wadi Bashir, south of the erg, a settlement of concrete apartments built specifically for the resettlement of this tribe."

This shows that 3DRIF-26 clusters very closely with a set of 4 Samaritans, particularly a Samaritan Levite. (It would seem that these 4 are unadmixed with Jews, while the other 2 from Behar's dataset are admixed.) The Levites in particular were isolated from the other Samaritan families, because of Torah restrictions on who they could marry. 3DRIF-26 lies between the Samaritans and Yemenite Jews. The Samaritans originated from the same population as the Judeans in 444-338 BCE, so we would expect Judeans and Jews to "drift away" from the common ancestral group, particularly after tens of thousands of Idumeans converted to Judaism after 104 BCE. (King Herod of Judea's grandfather Antipater was an Idumean convert to Judaism.) If 3DRIF-26 was a Samaritan, he would certainly cluster right with the Samaritans, rather than close to them. Notice too that the Lebanese Christians, the Jordanians, the Palestinians, and even a Saudi are all on the "other side" of the Samaritans from 3DRIF-26, therefore he cannot be a Nabatean, Qedarite, Iturean, Lihyanite, or other North Arabian, and he cannot be an Egyptian or Bedouin, because the Egyptians and the Bedouin are on the "other side" of the Yemenite Jews from 3DRIF-26 and the Samaritans.

Yemenite Jews do in fact have 10% Y-DNA J2b1-M205 (J2b-M12 xM241). The J-Y18947 subclade, which is the only type of J2b1 to which 3DRIF-26 could belong, is found among South Semitic speakers in Hadhramaut and Yemen. However, mtDNA H5* is not found in Yemen (or Egypt), but is found in the Levant and northward. Shen et al. (2004) show Yemenite Jews with 2/20 (10%) J2b1-M205.

The simplest explanation for the autosomal, Y, and mtDNA data from 3DRIF-26 is that he is a Judean, basically almost identical to the Samaritan ancestral population, but with some additional "Yemenite Jewish"-type admixture that occurred after the split with the Samaritans around 338 BCE. This could be from some group like the Idumeans who converted en masse in 104 BCE, or from Egyptian admixture with Jews in Alexandria.

Does anyone have another plausible explanation that accounts for all the data, particularly the closeup of the PC plot?

The Yemenite Jews no doubt have some Levantine Jewish ancestry, but the entire Kingdom of Ḥimyar converted to Judaism in the late Fourth Century CE, meaning Yemenite Jews may actually be decent proxies for pre-Islamic Arabia. Being intermediate between Samaritans and Yemenite Jews actually supports 3DRIF-26 being something like Nabatean. There is no reason to assume that a Nabatean would be pulled toward Jordanians and Palestinians since their post-Islamic gene flow likely came from all over the Caliphate in addition to Arabia. The area even had Turkic, Circassian, and Kurdish ruling classes at different times, which might have had a disproportionate effect due to variable reproduction rates between elite and non-elite males.

Some less important issues to consider… Idumeans may not have been genetically distinct from Judeans. The Edomite language is almost indistinguishable from Hebrew, and Idumea was an area formerly part of the Kingdom of Judah which was conquered by the Edomites. The Samaritan community at their formation were almost certainly primarily descended from Israelite ancestry but likely had absorbed some of those deported from various parts of the Assyrian Empire into Samaria.

Sargon II’s proclamation on Nimrud Prism D:

I repopulated Samaria more than before. I brought into it people from countries conquered by my hands.

2 Kings 17:24:

the king of Assyria brought people from Babylon, Cuthah, Avva, Hamath, and Sepharvaim, and placed them in the cities of Samaria instead of the children of Israel; and they took possession of Samaria, and dwelt in the cities.

There are also four contracts from the area that have been recovered in excavations which include a mixture of Mesopotamian and local names.

Thanks for the detailed insight into the ancient Levant. Both of your interpretations make sense in terms of identifying the origin of 3DRIF-26, but this is based on the PCA plot shared by 'Samaritan DNA' (https://drive.google.com/file/d/0BwkCInUg9EPlRGxiVkRxaG5KTVk/view).

There is another plot (based on D-stats) that David shared a couple of days ago, which shows 3DRIF-26 shifting away from all Near Eastern populations and hovering somewhere between Sephardic Jews and Ashkenazi Jews (https://drive.google.com/file/d/0B9o3EYTdM8lQT3hldkJmSnRtWlk/view).

"Italians always show up as some type of Italian in my tests at GEDmatch, unless they have unusually high and probably recent admixture from outside of Italy."

Citation? How do you know this? I know several Italians with (a) documented ancestry to the dawn of birth records, (b) from places that were not exactly high targets for invasion, in Roman, Medieval, or Modern times, and (c) with classic European phenotypes -- who type off with your calculators. With those three traits, that's 3 strikes for your "recent admixture" from outside of Italy.

"They never come out Caucasian or Algerian, except in the mixed mode oracle results, you dumb troll."

Testy, testy!

"The reason they do that is because most Italian populations have recent, such as Roman-era, admixture from West Asia and/or North Africa."

Um, no. You have no authority for such a statement, historical, prosopographical, or otherwise. Italy, like all countries, is a big place. There are plenty of regions where folks did not afford or own slaves. And that are remote. And gosh it's so strange: you purport to have found the North African admixture but no Germanic, no Celtic? Odd.

You'll have to accept this at some point. And if the mixed mode oracle results bother you, then complain to the people who designed the oracles.

I'm glad you acknowledge that the oracles are deficient. Might I ask why such a serious scientist as yourself (cough) allows yourself to be affiliated with them?

I made my point, well, to make a point. Grow some humility. The snake oil you peddle is not always perfect, which I know shocks someone as condescending as yourself.

I know several Italians with (a) documented ancestry to the dawn of birth records, (b) from places that were not exactly high targets for invasion, in Roman, Medieval, or Modern times, and (c) with classic European phenotypes -- who type off with your calculators.

Their results reflect their ancestry.

Just because they or you don't like or understand these results doesn't mean they're wrong.

I'm glad you acknowledge that the oracles are deficient. Might I ask why such a serious scientist as yourself (cough) allows yourself to be affiliated with them?

There's nothing wrong with any of the oracles. They always reflect reality. But the output has to be interpreted in the right context.

4Mix is more accurate for recent ancestry. But again, you need to be in tune with reality and know how to interpret the output.

From my correspondences with people who use GEDmatch calculators and Oracles I can say that there is no problem with your calculators or their Oracles. The problem is that many people misinterpret or dislike their results. Many people with normal European ancestries want to find some exotic ancestry and when they do not find it in their results they either protest or misinterpret their results.

"Um, no. You have no authority for such a statement, historical, prosopographical, or otherwise. Italy, like all countries, is a big place. There are plenty of regions where folks did not afford or own slaves. And that are remote. And gosh it's so strange: you purport to have found the North African admixture but no Germanic, no Celtic? Odd."

We can easily define a North African or Levantine admixture by comparative analysis when taking Northern Italy/Central Europe as a benchmark. It's a pretty different type of ancestry after all. "Germanic" or "Celtic" are harder to separate because they are both essentially offshoots of one Central European Bronze Age genetic pool.

The problem with your cherished calculators and the oracles at GEDmatch is that there are 40 of them, and all 40 give different results.

Tell me again how they are never wrong?

I was just corresponding with an attorney, who put it thusly:

"If I was interviewing 40 witnesses, and got 40 different statements, I would know that all were lying."

The way we scientists put it is:

"If I measured temperature with 40 different thermometers, and got 40 different readings, I would know that a minimum of 39, and likely all 40 are wrong."

Your calculators vary greatly in their results from version to version! You put out a new version that is wildly different from the last! Provides entirely contrasting results!

You can sell this stuff to "citizen scientists" on the Interwebs, but soon you, my friend, are the one who will need to be in touch with reality, because with entirely different results on different calculators, it's kind of, um, illogical, to argue they are all right/never wrong, as you do above.

And @George Okromchedlishvili

Tell me again how you can scientifically "baseline" when you don't understand what you're baselining? How you can determine how pop 2 has a different admixture when you use an admixed population as your baseline?

Right is right; science is science; bad logic is bad logic; conflicting theories still conflict. The question is how self-described scientists can justify their theories. Here the answer is, "pretty poorly."

Of course, you can move for burning at the stake. That doesn't improve your logic, only your self-esteem, and only momentarily.

Considering that Halberstadt_LBA (from central northern Germany) comes out closest to Swedes in the oracles of Eurogenes K13 and MDLP K13 Ultimate, I started to realise that the Germanics were on an east-west cline, with even rather western tribes from north of the Harz mountains having quite some eastern affinity. Somehow some of this may have entered 3DRIF-16 and 6DRIF-3.