May 22, 2014

An interesting try at autosomal DNA nuclear clock-o-logy. It is not quite there yet, but it is interesting nevertheless because it approximates what seems to be the reality, based on archaeological data, much better than previous attempts.

The availability of complete human genome sequences from populations across the world has given rise to new population genetic inference methods that explicitly model their ancestral relationship under recombination and mutation. So far, application of these methods to evolutionary history more recent than 20-30 thousand years ago and to population separations has been limited. Here we present a new method that overcomes these shortcomings. The Multiple Sequentially Markovian Coalescent (MSMC) analyses the observed pattern of mutations in multiple individuals, focusing on the first coalescence between any two individuals. Results from applying MSMC to genome sequences from nine populations across the world suggest that the genetic separation of non-African ancestors from African Yoruban ancestors started long before 50,000 years ago, and give information about human population history as recently as 2,000 years ago, including the bottleneck in the peopling of the Americas, and separations within Africa, East Asia and Europe.

Based on Figure 4c:

Figure 4: Genetic Separation between population pairs (...) (c) Comparison of the African/Non-African split with simulations of clean splits. We simulated three scenarios, at split times 50kya, 100kya and 150kya. The comparison demonstrates that the history of relative cross coalescence rate between African and Non-African ancestors is incompatible with a clean split model, and suggests it progressively decreased from beyond 150kya to approximately 50kya. (...)

This comparison reveals that no clean split can explain the inferred progressive decline of relative cross coalescence rate. In particular, the early beginning of the drop would be consistent with an initial formation of distinct populations prior to 150kya, while the late end of the decline would be consistent with a final split around 50kya. This suggests a long period of partial divergence with ongoing genetic exchange between Yoruban and Non-African ancestors that began beyond 150kya, with population structure within Africa, and lasted for over 100,000 years, with a median point around 60-80kya at which time there was still substantial genetic exchange, with half the coalescences between populations and half within (see Discussion). We also observe that the rate of genetic divergence is not uniform but can be roughly divided into two phases. First, up until about 100kya, the two populations separated more slowly, while after 100kya genetic exchange dropped faster. We note that the fact that the relative cross coalescence rate has not reached one even around 200kya (Figure 4c) may possibly be due to later admixture from archaic populations such as Neanderthals into the ancestors of CEU after their split from YRI [29].

Their population size estimates follow:

Figure 3: Population Size Inference from whole genome sequences
(a) Population size estimates from four haplotypes (two phased individuals) from each of 9 populations. The dashed line was generated from a reduced data set of only the Native American components of the MXL genomes. Estimates from two haplotypes for CEU and YRI are shown for comparison as dotted lines. (...)

A serious problem I have with this graph is that the gradual bottleneck affecting Eurasian-plus populations does not begin to recover within this simulation before c. 40 Ka. That doesn't seem good enough because by that time the Asian population must have expanded at least moderately, as they had colonized all the continent and even Australasia by that date.

This means that there is still a lot of refining to be done to the methodology, because there should be a signal of expansion in Asia much earlier than 40 Ka, rather than an ever more apparent decrease in population size, which is totally inconsistent with the ongoing colonization of a whole continent.

I could try doubling the time scale again (that is, halving the mutation rate once more) to get a more consistent Asian expansion age of c. 80 Ka, but that would push the Eurasian-plus bottleneck to a much earlier date, 600 Ka ago, which is simply nonsensical. So the only possible conclusion is that the algorithm is far from realistic and still needs a lot of work.

Non-Bantu East Africans belong to the proto-Eurasian cluster:

Our results suggest that Maasai ancestors were well mixing with Non-African ancestors until about 80kya, much later than the YRI [Yoruba]/Non-African separation. This is consistent with a model where Maasai ancestors and Non-African ancestors formed sister groups, which together separated from West African ancestors and stayed well mixing until much closer to the actual out-of-Africa migration.

South Asians exchanged a lot with West Eurasians before Neolithic:

.... the GIH [Gujarati emigrants to Texas] ancestors remained in close contact with CEU [NW European emigrants to Utah] ancestors until about 10kya, but received some historic admixture component from East Asian populations, part of which is old enough to have occurred before the split of MXL.

Figure 4: Genetic Separation between population pairs (...) (d) Schematic representation of population separations. Timings of splits, population separations, gene flow and bottlenecks are schematically shown along a logarithmic axis of time. (...)

Overall their population tree makes good sense, except for the apparently too recent dates for nearly all the events and very especially for the intra-Eurasian split. There are no doubt confounding factors acting here. Probably if MXL (Native American component) were excluded, the West-East split could be moved backwards in time.

They rely heavily on the MXL Native American element to calibrate the clock, which makes sense on the surface. But the fact that Native American origins are themselves a mix of West/South Eurasian and East Asian origins may be tricking them. In the tree, MXL derives from East Asians, when we know for a fact that it should be intermediate between East Asia and West/South Eurasia, something that is not reflected at all and that is almost certainly altering the picture.

But, as said above, there are more corners, some quite prominent, to be polished in the whole modeling process before a future version of it can be acknowledged as a reliable "clock" (emphasis on reliable, because some people put way too much faith in these rough approximations, which is clearly an error).

On mutation rates:

Our results are scaled to real times using a mutation rate of 1.25×10⁻⁸ per nucleotide per generation, as proposed recently [16] and supported by several direct mutation studies [14-16]. Using a value of 2.5×10⁻⁸ as was common previously [44, 45] would halve the times. This would bring the midpoint of the out-of-Africa separation to an uncomfortably recent 30-40kya, but more concerningly it would bring the separation of Native American ancestors (MXL) from East-Asian populations to 5-10kya, inconsistent with the paleontological record [25, 26].

In short: using the usual scholastic mutation rates would have been nonsensical. Halving them, which doubles the age estimates, was the common sense needed to achieve minimal coherence with observed reality (how many times have I said that?). It is obviously not enough, but it was needed in any case.
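To make the arithmetic in the quoted passage concrete, here is a minimal Python sketch. It assumes, as the quote itself does, that age estimates scale inversely with the assumed mutation rate. The function name `rescale_age` is hypothetical and not part of MSMC; the only figures used are the two rates and the 60-80kya midpoint quoted above.

```python
def rescale_age(age_kya, mu_used, mu_new):
    """Rescale an age estimate after swapping the assumed mutation rate.

    Assumption: ages are proportional to 1/mu, as in the quoted passage.
    """
    factor = mu_used / mu_new  # > 1 means older dates, < 1 means younger
    return age_kya * factor


MU_PAPER = 1.25e-8  # rate used in the paper, per nucleotide per generation
MU_OLD = 2.5e-8     # rate that was common previously

# Out-of-Africa midpoint as quoted from the paper (60-80 kya).
midpoint_paper = (60, 80)
midpoint_old_rate = tuple(rescale_age(t, MU_PAPER, MU_OLD) for t in midpoint_paper)
print(midpoint_old_rate)  # (30.0, 40.0), the "uncomfortably recent" range
```

The same function also shows the other direction: halving the rate once more doubles every date at the same time, so no single event can be moved without dragging all the others along with it.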

I blogged the U6 paper at http://dispatchesfromturtleisland.blogspot.com/2014/05/the-story-of-mtdna-haplogroup-u6.html and also commented at Dienekes on it. Their incorporation of archaeological evidence from particular well dated and defined archaeological cultures and of paleoclimate data into their genetic analysis, and their willingness to acknowledge that those dates may be more solid than mtDNA dating in close cases, is refreshing. For example, their association of U6 as a whole with the intrusive Levantine Aurignacian, and of U6a with the Iberomaurusian culture in the Maghreb, are both well reasoned.

Their willingness to acknowledge that the history of U6 could have been a complex, multi-waved process, for example, with at least one or two UP waves followed by several successive waves in the Neolithic and later, is also appreciated. This data set is also well suited to mapping out those kinds of complex histories.

Their argument for a non-Levantine origin of mtDNA hg U as a whole is also interesting, if a bit over specific to Central Asia given what the data requires. Their narrative of expansion within Africa is also a sensible read of the evidence.

In general, as I will surely mention tomorrow, I find their chronology for U6 very realistic (within reasonable CIs). As for the arrival of U6 in NW Africa from West Asia, I still have doubts because there's nothing known between Aterian and Oranian (Iberomaurusian). An alternative possibility could be that pre-U6 actually entered NW Africa from Europe, then expanded from a North Moroccan center of radiation in the Oranian. The Aurignacoid wave apparently only reached as far West as Cyrenaica (Dabban), and that is a serious problem for claiming a West Asian origin for even pre-U6. Food for thought.

But in most aspects it is a top quality paper that I really want to write about. Tomorrow almost certainly.

BTW, did you expand the first article on your blog, Andrew? I ask because I remember it being much shorter (although memory can play strange tricks sometimes).

The first one has been on my list since Monday or so. It's a high quality review of U6. Nothing really too new, but it is refreshing to see all the previous research more or less confirmed, with a much greater wealth of detail.

The second one, I rather dislike: the Fst-based conclusions seem a bit amateurish, especially as Denisovan admixture and drift by isolation are not considered for Australasian aborigines, who are the keystone of the argument. Also, their chronological estimates are absurdly recent and the alleged Central Asian route just clashes with archaeological reality. So I'm ignoring it because it lacks merit IMO.

Regarding the second paper, I think it is a very good paper, primarily because it is one of the first papers to do a very good job of combining geometric morphometrics (a very powerful technique) with genetics. Their results are supported by several other recent papers.

They calibrate from 15kya for a distinct Beringian population bottleneck. Given climate and archaeology, a date of around 20kya to 22kya is closer to the mark for that event. A 33%-50% longer set of dates would improve the archaeological fits quite a bit.
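As a rough check of where a figure like 33%-50% comes from, here is a small Python sketch. It assumes that every date scales linearly with the calibration anchor, which is a simplification rather than anything the paper states, and `stretch_percent` is a hypothetical helper name.

```python
def stretch_percent(anchor_used_kya, anchor_alt_kya):
    """Percent by which all dates lengthen if the calibration anchor moves.

    Assumption: dates scale linearly with the anchor age.
    """
    return (anchor_alt_kya / anchor_used_kya - 1.0) * 100.0


# Anchor used (15 kya) versus the two alternatives suggested above.
for alt in (20.0, 22.0):
    print(alt, round(stretch_percent(15.0, alt), 1))
# 20.0 33.3
# 22.0 46.7
```

So an anchor of 20kya gives the lower end of the range exactly, and 22kya lands a little short of the upper end.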
