September 27, 2012

A surprising link between Africans and Denisovans

I took the following populations from the version of the HGDP released by Patterson et al. (2012). I use the _AHOA suffix (Affymetrix Human Origins Array) to distinguish them from other versions of the same populations:

MbutiPygmy_AHOA 11

Italian_AHOA 11

Miao_AHOA 10

Papuan_AHOA 12

Karitiana_AHOA 8

I identified the following SNP subsets:

AFRICA: 67022 SNPs that were polymorphic in MbutiPygmy and monomorphic in the other populations

EURASIA: 94858 SNPs that were polymorphic in at least one non-African population and monomorphic in MbutiPygmy

AFRICA_EURASIA: 367051 SNPs that were polymorphic in both MbutiPygmy and at least one non-African population

ALL: 528931 SNPs that were polymorphic in at least one population

GLOBAL: 168640 SNPs that were polymorphic in all 5 populations

Note that the union of AFRICA, EURASIA, and AFRICA_EURASIA is the ALL set.
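As a rough illustration (not the code actually used for this post), the subset definitions above can be sketched as a small classifier. The function names and the dictionary layout are hypothetical; only the population names and the SNP counts come from the post.

```python
from typing import Dict, Optional

# Non-African populations used in the post (suffix _AHOA dropped for brevity).
NON_AFRICAN = ("Italian", "Miao", "Papuan", "Karitiana")


def classify_snp(polymorphic: Dict[str, bool]) -> Optional[str]:
    """Assign a SNP to AFRICA, EURASIA or AFRICA_EURASIA.

    `polymorphic` maps a population name to True if the SNP is polymorphic
    in that population. Returns None if the SNP is monomorphic in every
    population, i.e. outside the ALL set.
    """
    in_africa = polymorphic["MbutiPygmy"]
    in_eurasia = any(polymorphic[pop] for pop in NON_AFRICAN)
    if in_africa and in_eurasia:
        return "AFRICA_EURASIA"
    if in_africa:
        return "AFRICA"
    if in_eurasia:
        return "EURASIA"
    return None


def is_global(polymorphic: Dict[str, bool]) -> bool:
    """GLOBAL: polymorphic in all 5 populations (a subset of AFRICA_EURASIA)."""
    return all(polymorphic.values())


# Sanity check on the counts reported above: the three disjoint subsets
# should add up to the ALL set.
counts = {"AFRICA": 67022, "EURASIA": 94858, "AFRICA_EURASIA": 367051}
all_count = sum(counts.values())
print(all_count)  # 528931
```

Because the three subsets are mutually exclusive by construction, their counts simply add up to the size of the ALL set.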

Here is a Venn diagram of SNP sharing:

I then calculated all D-statistics of the following form:

D(Pop1, Pop2; Archaic, Chimp)

where Archaic is either Neandertal or Denisova, and Pop1, Pop2 is any possible pair of the modern populations. These D-statistics were calculated for all 5 SNP subsets.
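For readers who want to see what such a statistic looks like in practice, here is a minimal sketch of a frequency-based D-statistic in the style of Patterson et al. (2012). It is not the software used for this post: the function name, the toy frequencies and the sign convention (positive D when Pop1 shares more alleles with the archaic genome) are assumptions made for illustration.

```python
from itertools import combinations, product
from typing import Sequence


def d_statistic(p1: Sequence[float], p2: Sequence[float],
                archaic: Sequence[float], chimp: Sequence[float]) -> float:
    """D(Pop1, Pop2; Archaic, Chimp) from per-SNP allele frequencies.

    Each argument holds the frequency of the same reference allele at each
    SNP. Assumed convention: D > 0 means Pop1 is closer to Archaic.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in zip(p1, p2, archaic, chimp):
        num += (a - b) * (c - d)
        den += (a + b - 2 * a * b) * (c + d - 2 * c * d)
    if den == 0:
        raise ValueError("no informative sites")
    return num / den


# Hypothetical toy data: three SNPs, archaic always differs from chimp.
toy_d = d_statistic([1.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0] * 3, [0.0] * 3)

# Enumerate the statistics described above: every unordered pair of modern
# populations, both archaic genomes, all 5 SNP subsets.
MODERN = ["MbutiPygmy", "Italian", "Miao", "Papuan", "Karitiana"]
ARCHAIC = ["Neandertal", "Denisova"]
SUBSETS = ["AFRICA", "EURASIA", "AFRICA_EURASIA", "ALL", "GLOBAL"]
tests = [(a, b, arch, subset)
         for (a, b), arch, subset in product(combinations(MODERN, 2), ARCHAIC, SUBSETS)]
print(len(tests))  # 100
```

Swapping Pop1 and Pop2 only flips the sign of D, which is why unordered pairs are enough here.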

Some brief observations, before we get to the "main course" of this post:

Eurasians appear substantially Neandertal/Denisovan-admixed when SNPs polymorphic in Africans and monomorphic in Eurasians are used. I can think of no other explanation than archaic African admixture for this finding.

Papuans appear Denisovan-admixed across the board.

For the GLOBAL set, population differences in Neandertal admixture are all non-significant. Given that the GLOBAL set includes SNPs likely to have existed in the ancestral modern humans, this indicates a fairly symmetrical relationship of modern populations to Neandertals.

The most unexpected and surprising finding is, without doubt, the evidence that Africans have more Denisovan ancestry than all Eurasians (except Papuans) when SNPs polymorphic in non-Africans and monomorphic in Africans are used (EURASIA panel). I highlight some comparisons:

First of all, clearly Papuans have a special relationship with Denisovans compared to all the remaining 4 populations:

That's right: Mbuti Pygmies are actually closer to Denisovans than Eurasians over the subset of SNPs that are polymorphic in Eurasians and monomorphic in the Mbuti.

I do not quite know what to make of this surprising signal. I can think of two explanations:

An early Out-of-Africa movement that affected "Denisovans" and Papuans but not other Eurasians. Living Africans are pulled away from Denisovans because of their archaic African ancestry and towards them because of contributions from their ancestors to the Denisovan population. Hence, they appear less Denisovan-like in African-polymorphic sites (where there is an excess of archaic admixture in Africans) and more Denisovan-like in African-monomorphic sites.

An Into-Africa movement of a population related to Denisovans, a kind of "reverse bottleneck" where a subset of Denisova-like variation entered Africa, hence leaving Eurasians polymorphic and Africans monomorphic.

I would like to stress that these results do not really depend on the choice of the MbutiPygmy population. I have also seen them when I carried out similar experiments using Yoruba and Mandenka.

I often get the feeling that the problem of human origins as it stands is one of too little data for too many variables. But, I am more or less convinced that admixture between very divergent populations of Homo heidelbergensis played a major role in shaping modern humans.

UPDATE (29 Sep): I continue the investigation of this link in a new post.

"Eurasians appear substantially Neandertal/Denisovan-admixed when SNPs polymorphic in Africans and monomorphic in Eurasians are used. I can think of no other explanation than archaic African admixture for this finding".

I'm not denying the likelihood of archaic African admixture, but wouldn't another explanation in this case be that the polymorphism in Africa is the result of just a small subset having left Africa and mixed substantially with the population already out of Africa (H. erectus/heidelbergensis for example)?

"That's right: Mbuti Pygmies are actually closer to Denisovans than Eurasians over the subset of SNPs that are polymorphic in Eurasians and monomorphic in the Mbuti. I do not quite know what to make of this surprising signal".

Your second explanation seems more likely than the first in the case of the Mbuti, Yoruba and Mandenka.

"An Into-Africa movement of a population related to Denisovans, a kind of 'reverse bottleneck' where a subset of Denisova-like variation entered Africa, hence leaving Eurasians polymorphic and Africans monomorphic".

But when did that into Africa movement take place? Your data support the rather interesting fact that the Denisova mt-DNA seems to be at the root of both modern and Neanderthal. Perhaps the modern mt-DNA lines descend from an expansion of the Denisova group into Africa, which left behind in Eurasia what became the Neanderthal mt-DNA lines (as well as the 'original', or 'basal' mt-DNA lines). I suggested that scenario when the Denisova mt-DNA was first discovered, but at the time everyone was trying to construct complicated scenarios involving multiple OoAs. My suggestion here would also explain the higher 'Denisova' levels in Australians and Papuans. They have more of the pre-modern ('Denisova') genes than do other Eurasian populations. Nothing to do with complicated scenarios involving hybridism in SE Asia followed by extinction there, and everywhere else except Oz/NG.

"I am more or less convinced that admixture between very divergent populations of Homo heidelbergensis played a major role in shaping modern humans".

Exactly what I have been suggesting all along. Thanks for seeing the light. Welcome.

There have been a couple of studies that have used statistical methods to show some archaic admixture in Paleoafricans, although not a whole lot (maybe 1%). The same statistical methods predicted Neanderthal admixture and Denisovan admixture as well (although didn't know what to make of it at the time).

If Denisovans are Homo erectus, or a species descended from it after Homo erectus broke from the modern human tree (a very plausible assumption), and if the archaic hominins in Africa that admixed at low levels with Paleoafrican populations are closer to Homo erectus than to Neanderthals, who probably arose somewhere outside sub-Saharan Africa (another plausible assumption), then Paleoafricans should be closer to the Denisovan genome than other modern humans, except Papuans/Australian aborigines, who have much more Denisovan admixture than Paleoafricans have archaic admixture. Eurasians, who lack either form of Homo erectus admixture and have only Neanderthal admixture, should be more distant from the Denisovan genome.

Homo erectus evolved in Africa ca. 2 mya, and presumably when Homo erectus migrated as far as Java by 1.8 mya, an African source population remained in Africa and may have persisted in some numbers until competition from modern humans pushed them into extinction in Africa and in Asia, probably sometime in the Upper Paleolithic.

This isn't the only story one can tell with the data but it is a pretty simple narrative that integrates what we know from archaeology with what we know from genetics and makes a fair amount of sense.

Mbuti Pygmies are actually closer to Denisovans than Eurasians over the subset of SNPs that are polymorphic in Eurasians and monomorphic in the Mbuti.

I'm not sure this isn't a side-effect of using only polymorphic Eurasian alleles that are monomorphic in the Mbuti. There are 3 ways this polymorphic/monomorphic split can come about: an allele was lost due to selection/drift in the monomorphic population, a new allele emerged due to mutation in the polymorphic population, or a new allele was introgressed from a 3rd population. Given that diversity in Africa is high and the Eurasian genome has only a small percentage of known non-African admixture, I would expect that mutation alone would account for the bulk of the SNPs in the polymorphic Eurasian sample. However, for argument's sake I'm going to assume an equal 33% chance of the allele being caused by mutation in the Eurasian population (although I'd consider this a minimum).

This gives us 33% of the alleles in the SNP sample being caused by random mutation, and since there are 3 possible nucleotide substitutions at a site, a 33% chance of each mutation producing a particular allele variant. Multiplying these together we get a (roughly) 10% chance of any polymorphic Eurasian allele in the sample matching the Chimp allele by pure chance, and thus not indicating any genetic connection. Consider the following possibility: Chimps have "G" for a particular SNP. After the Chimp/Homo split Homo mutates to "A", which is preserved in both Denisovan and Sapiens after they split. After the African/Eurasian split, Eurasian mutates back to "G". This would show up in the D-statistic as a Eurasian/Chimp affinity when in reality it is just chance.

When using a genome-wide or random sample of SNPs it wouldn't matter, as the 10% of Pop1 alleles that randomly matched the Chimp would be cancelled out by the same thing happening for Pop2. But since you are using a set of SNPs deliberately designed to have more Eurasian mutations than Mbuti ones, it will have a noticeable effect. As explained above, I would expect at least 10% of the SNPs in the set to be "randomly Chimp-like", which is roughly the percentage of increased Eurasian/Chimp affinity that the D-statistics are showing.
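The arithmetic behind the "roughly 10%" figure in this comment can be reproduced in a couple of lines. This is only a sketch of the commenter's own back-of-the-envelope model; both one-third probabilities are the comment's assumptions, not measured values, and the variable names are made up here.

```python
from fractions import Fraction

# Assumed share of Eurasian-only polymorphisms that arose by new mutation
# (the comment's "equal 33% chance" among drift, mutation and introgression).
p_new_mutation = Fraction(1, 3)

# Assumed chance that a substitution lands on one particular base out of the
# three alternatives, here the base that happens to match the Chimp allele.
p_hits_chimp_base = Fraction(1, 3)

# Chance that a Eurasian-polymorphic SNP matches Chimp purely by back mutation.
p_chance_match = p_new_mutation * p_hits_chimp_base
print(p_chance_match)  # 1/9, i.e. about 11%
```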

There have been a couple of studies that have used statistical methods to show some archaic admixture in Paleoafricans, although not a whole lot (maybe 1%).

Such a low level of archaic admixture in Paleoafricans is looking less and less likely with new studies. According to the most recent studies, Paleoafricans probably have the highest amounts of archaic admixture among all existing humans.

There have been a couple of studies that have used statistical methods to show some archaic admixture in Paleoafricans, although not a whole lot (maybe 1%). The same statistical methods predicted Neanderthal admixture and Denisovan admixture as well (although didn't know what to make of it at the time).

Because of the absence of archaic African genomes, inference of archaic African admixture is limited to a recent time depth. At longer time depths (say before 100ka) archaic segments are so small that it becomes probabilistically difficult to show that they are archaic.

It's like throwing a bad coin 5 times vs. 100. At 100 times, it's likely that you'll figure out a deviation from 50-50 chance of heads/tails. At 5 times, it's less likely.
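To put rough numbers on the coin analogy, here is a small sketch that computes how often an exact two-sided binomial test at the 5% level would flag a biased coin. The 60% heads bias and the 5% threshold are hypothetical choices made for illustration, and the function names are mine.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n tosses with heads probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def two_sided_p_value(k: int, n: int) -> float:
    """Exact p-value under a fair coin: total probability of all outcomes
    that are no more likely than the observed one."""
    observed = binom_pmf(k, n, 0.5)
    return sum(binom_pmf(i, n, 0.5) for i in range(n + 1)
               if binom_pmf(i, n, 0.5) <= observed * (1 + 1e-9))


def detection_power(n: int, p_true: float, alpha: float = 0.05) -> float:
    """Chance that n tosses of a coin with bias p_true reject fairness."""
    return sum(binom_pmf(k, n, p_true) for k in range(n + 1)
               if two_sided_p_value(k, n) <= alpha)


power_5 = detection_power(5, 0.6)      # hypothetical bias: 60% heads
power_100 = detection_power(100, 0.6)
print(power_5, power_100)
```

With 5 tosses even the most extreme outcome (all heads or all tails) has a two-sided p-value of 2/32, so the bias can never be detected at the 5% level, whereas 100 tosses catch it a substantial fraction of the time.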

The occurrence of substantial African-specific polymorphism can be explained either in terms of a massive bottleneck during OoA or due to admixture with divergent African populations which contributed new polymorphism. A bottleneck does not explain the pattern of the D-statistics, because it would imply that in the random game of chance, Out-of-Africans ended up with the same alleles as Neandertals, which does not seem plausible.

"EURASIA: 94858 SNPs that were polymorphic in at least one non-African population and monomorphic in MbutiPygmy"

Great, but by excluding non-African monomorphic SNPs you also exclude the possibility that most of these SNPs could be ancestral. In that case, you make the ancestral part of Eurasian SNPs disappear and create a marvellous homozygous correlation with Africans instead.

>I'm not sure this isn't a side-effect of using only polymorphic Eurasian alleles that are monomorphic in the Mbuti.

It does not occur for the equivalent comparisons for Neandertals, so something specific is going on for Denisovans.

But it does occur for Neandertals, as you noted: "Eurasians appear substantially Neandertal/Denisovan-admixed when SNPs polymorphic in Africans and monomorphic in Eurasians are used. I can think of no other explanation than archaic African admixture for this finding." While archaic admixture in Africa is a definite possibility, this finding is exactly what we'd expect to see if using a set of SNPs that is weighted towards African mutations inflates the apparent Chimp affinity for African populations. It's worth noting that the D-statistics for these SNPs with the Mbuti are much higher than for any other results in the table - 33% (Neandertal) and 23% (Denisovan), which would support the idea of an inflated result.

Following this logic, the 12-14% result for Eurasian vs Mbuti Neandertal using the Eurasian SNPs would be underestimated.

Look at the data again. In both the AFRICA and EURASIA panel, Africans are less Neandertal than Eurasians. In the AFRICA panel they are less Denisovan, and in the EURASIA panel they are more Denisovan (except when compared to Papuans who beat everyone else in the Denisovan department).

So, something different is going on with respect to the relationship of Africans to Neandertals/Denisovans.

Put another way, archaic admixture at say 200 kya is effectively part of the definition of the AMH species, and hence, really isn't admixture at all.

You are confusing the time when admixture happened, with the time when the archaic population split. For example, the archaic admixture that was detected by Hammer et al. and Lachance et al. happened a few tens of kya, but involved hominins that split off from modern humans hundreds of kya.

Look at the data again. In both the AFRICA and EURASIA panel, Africans are less Neandertal than Eurasians. In the AFRICA panel they are less Denisovan, and in the EURASIA panel they are more Denisovan

I have looked at the data again and I can see what you are saying... Eurasians are scoring -10% or so for Denisovans when compared to Africans using the polymorphic Eurasian SNPs. We can see that for Neandertal admixture Eurasians are scoring 33% and 12-15% for the AFRICA and EURASIA panels, while for Denisovan they are scoring 25% and -8% to -10% respectively. This -10% for Denisovan stands out remarkably and indicates hitherto unknown Denisovan admixture in African populations.

However, I have shown that there is a minimum 10% expected bias against a population when testing against a set of SNPs that only includes polymorphisms for that population. This bias is because 1 out of every 3 times that the SNP is caused by a genetic mutation in the population, it will revert to the ancestral allele and thus look like an affinity when in fact it isn't. If we allow for a 10% inflation of the scores then we get Neandertal admixture in Eurasians scoring 23% and 22-25% for the AFRICA and EURASIA panels and Denisovan admixture gets 12% and 0-2% respectively. Pretty much in line with expectations.

The negative Eurasian/Denisovan value for the EURASIA panel stands out, but it can be accounted for as the result of inherent bias in the polymorphic/monomorphic SNP selection. The "something different ... going on with respect to the relationship of Africans to Neandertals/Denisovans" is probably just a side effect of the method you are using.

"Put another way, archaic admixture at say 200 kya is effectively parts of the definition of the AMH species, and hence, really isn't admixture at all".

I agree that admixture has been a factor in our evolution right from before the appearance of Australopithecus, but we're not dealing here with that kind of ancient admixture.

"So, something different is going on with respect to the relationship of Africans to Neandertals/Denisovans".

Easily solved if we accept the possibility that at least the modern mt-DNA owes its deeper origins to somewhere outside Africa. Surely there is absolutely no reason to blindly accept that 'modern' humans evolved from an African population whose ancestors had remained behind in Africa, and they had always been confined to that continent.

"There are 3 ways this polymorphic/monomorphic split can come about: an allele was lost due to selection/drift in the monomorphic population, a new allele emerged due to mutation in the polymorphic population, or a new allele was introgressed from a 3rd population".

Wouldn't the polymorphic European SNPs have an even higher chance of being away from Chimp (any mutation if the monomorphic African is Chimp-like) than towards chimp (only 1/3, in case the African is non-Chimp-like)? Moreover, the polymorphic ones are much more likely to be a single than a double mutation, in the first place.

The majority of monomorphic alleles will be non-Chimp because the D-statistic only selects those alleles that have mutated at least once since the Chimp/Homo divergence - it only counts ABBA and BABA combinations, so the alleles must be different between Denisovan and Chimps for the D-statistic to count them. Using 6 Mya as the Chimp/Homo divergence and 1 Mya for Denisovan/Sapiens, and assuming a constant mutation rate, we would expect 5 My worth, or 83%, of the non-Chimp mutations to have occurred in the common Denisovan/Sapiens ancestor. So the vast majority of SNPs being counted will have the Denisovan allele as the "normal" one for Pop1 and Pop2, with the Chimp allele only occurring due to mutation or admixture. Since polymorphic alleles are more likely to be derived (at least twice as likely given the very conservative 33% chance each of drift, mutation or admixture causing the polymorphism), then the majority of the monomorphic alleles will be Denisovan.
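The 83% figure in this comment is just the share of the Chimp-to-Denisovan lineage that is also shared with Sapiens. A tiny sketch, using the comment's round split dates and its constant-mutation-rate assumption (the function name is hypothetical):

```python
def shared_branch_fraction(outgroup_split_mya: float, ingroup_split_mya: float) -> float:
    """Fraction of a lineage's time since the outgroup split that was spent
    in the common ancestor of the two ingroup lineages, assuming mutations
    accumulate at a constant rate."""
    shared = outgroup_split_mya - ingroup_split_mya
    return shared / outgroup_split_mya


# Split dates taken from the comment: Chimp/Homo at 6 Mya, Denisovan/Sapiens at 1 Mya.
fraction = shared_branch_fraction(6.0, 1.0)
print(f"{fraction:.0%}")  # 83%
```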

Moreover, the polymorphic ones are much more likely to be a single than a double mutation, in the first place.

I'm not sure exactly what you mean by "single" and "double" mutation, but if you mean the number of times it's mutated from the Chimp allele, then all the SNPs will be at least "single" (Chimp to Denisovan) and the polymorphic ones more likely to be "double" (Chimp to Denisovan/Sapiens ancestor to Eurasian).

The majority of monomorphic alleles will be non-Chimp because the D-statistic only selects those alleles that have mutated at least once since the Chimp/Homo divergence

Well, I guess I still don't get it. If a site is monomorphic in one population, but polymorphic in another, I have no idea how the D-statistic knows that the monomorphic one is non-Chimp (and not vice-versa). Naively, I would think the opposite.

Look at the Z-GLOBAL column as well. This involves SNPs that are polymorphic in ALL 5 major human groups. MbutiPygmy are significantly more "Denisovan" in that SNP set as well.

You are correct - the Mbuti are showing similar results to the Papuans for the GLOBAL panel (1.5-2% more Denisovan). There may be something significant about these results, but since they are markedly different to the ALL panel results (4% Denisovan for Papuans, nothing significant for Mbuti) I suspect the GLOBAL panel has some inherent bias as well - impossible to tell why without more information on how it's made up, but since "polymorphic" can mean both "90% A, 10% B" and "10% A, 90% B", I would guess that the allele frequencies are not consistent across the populations.

Well, I guess I still don't get it. If a site is monomorphic in one population, but polymorphic in another, I have no idea how the D-statistic knows that the monomorphic one is non-Chimp (and not vice-versa). Naively, I would think the opposite.

Perhaps if you tell me why you think the monomorphic allele would be Chimp I can explain how I see it differently.... so why do you think the monomorphic one is more likely to be Chimp?

but since they are markedly different to the ALL panel results (4% Denisovan for Papuans, nothing significant for Mbuti) I suspect the GLOBAL panel has some inherent bias as well

The difference between the ALL panel and the others is that the ALL panel includes SNPs that are polymorphic in Africans and monomorphic in Eurasians. So, the signal disappears because of the archaic admixture in Africans.

The signal is also present in SNPs ascertained in a single Papuan individual, as shown in the newer post.

Human SNPs are defined by using an outgroup, like chimps or San. Pygmies and non-Africans could perfectly well be monomorphic for a certain SNP, in which case they group together.

The SNPs in the Affymetrix Human Origins Array were discovered in two chromosomes of single human individuals. Even if there were such sites included, they would make no difference for D-statistics, because D-statistics only count BABA and ABBA sites, i.e., sites where modern human populations differ from each other.
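A toy sketch of site-pattern counting shows why such sites drop out. One allele is sampled per group, patterns are read in the order (Pop1, Pop2, Archaic, Chimp), and the function name, the example sites and the sign convention (positive D when Pop1 matches the archaic genome more often) are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

Site = Tuple[str, str, str, str]  # (Pop1, Pop2, Archaic, Chimp) alleles


def count_abba_baba(sites: Iterable[Site]) -> Tuple[int, int]:
    """Count ABBA and BABA sites; every other pattern is ignored."""
    abba = baba = 0
    for pop1, pop2, archaic, chimp in sites:
        if archaic == chimp or pop1 == pop2:
            continue  # uninformative: e.g. BBBA or AABA, where Pop1 == Pop2
        if pop1 == chimp and pop2 == archaic:
            abba += 1  # Pop2 shares the archaic allele
        elif pop1 == archaic and pop2 == chimp:
            baba += 1  # Pop1 shares the archaic allele
    return abba, baba


# Hypothetical example sites.
sites = [
    ("G", "A", "G", "A"),  # BABA
    ("A", "G", "G", "A"),  # ABBA
    ("G", "G", "G", "A"),  # BBBA: both modern populations agree, skipped
    ("A", "A", "G", "A"),  # AABA: both modern populations agree, skipped
    ("G", "A", "G", "A"),  # BABA
]
abba, baba = count_abba_baba(sites)
d = (baba - abba) / (baba + abba)
print(abba, baba)  # 1 2
```

Sites where both modern populations carry the same allele never reach either counter, so they cannot move D in either direction.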

Also, positions where modern humans have fixed differences from outgroups are not SNPs. SNPs are by definition polymorphisms, and fixed sites within a species are not polymorphisms. Such sites are useless for purposes of studying human population differences, because, obviously, populations don't differ in them.

Yes, I just read the 3-page Affymetrix .pdf - truly a good way to derive SNPs. The only possible bias I can think of is from panel 13. For example, Denisovans may have reversed to Chimp from all other human lineages, there, and a deviation from most humans may indicate Denisovan admixture rather than a San mutation.

Something I've just realised is rather interesting. The Denisova element in Africa tends to be found among Mbuti, Yoruba and Mandenka. These groups are found very near the region where A0 and A1 are supposed to have first split. Perhaps the Denisova element lies at the beginning of the modern Y-DNA line. Perhaps the Denisova element was then diluted by admixture with whatever population supplied the modern mt-DNA line. If that is so then hybridism lies right at the base of the origin of 'modern' humans in Africa. Certainly the mtDNA L0/L1''6 split looks to lie somewhat east of the Y-DNA A0/A1 split.


Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.
