August 09, 2014

This is quite a useful paper, as it compares different methods of obtaining mutation rate estimates: "archaeological calibration" based on known migration events, and calibration using ancient mtDNA genomes (with known archaeological dates). The authors write:

Our estimate of 143 Kya [112-180 95% HPD] for the TMRCA of all modern human mtDNA is slightly younger but highly consistent with the 157 Kya [120-197 95% HPD] value obtained by Fu et al. (2013b). We estimate the coalescence of the L3 haplogroup (the lineage from which all non-African mtDNA haplogroups descend), often used to date the “out-of-Africa” event, to 72 Kya [54-93 95%HPD], a value also consistent with Fu et al. (2013b) estimation of 78 Kya [62-95 95%HPD]. This estimation rather places a conservative upper bound of 93 kya for the time of the last major gene exchange between non-African and sub-Saharan African populations. As pointed out by Fu et al. (2013b), it is important to recognize that this divergence time may merely represent the most recent gene exchanges between the ancestors of non-Africans and the most closely related sub-Saharan Africans and thus may reflect only the most recent population split in a long, drawn-out process of population separation (Scally and Durbin 2012).

It should be fairly easy to pick out the common ancestor of Eurasian mtDNA (the common ancestor of M+N). I am reasonably sure that the two African red dots to the right of event "8" in the figure are African L3's, and this would place them within the Eurasian variation, and in particular as a relative of Eurasian M.

A similar observation could be found in Supplementary Figure 14 of the Lippold et al. (2014) preprint, with African L3 lineages clearly related to Eurasian M (and nested within the Eurasian phylogeny).

In any case, I don't see any evidence at all from this phylogeny that the date of L3 corresponds to an Out-of-Africa event. Unfortunately I couldn't see an estimate for the split of L3 from the rest of the phylogeny; my eyeball estimate from the figure is that it's about 20ky earlier. Hopefully, someone sooner or later will deal with the question of L3 phylogeny, because the "conventional wisdom" that Eurasian M, N are nested within African L3 variation does not appear to be quite right.

Mol Biol Evol (2014)
doi: 10.1093/molbev/msu222

Improved calibration of the human mitochondrial clock using ancient genomes

Adrien Rieux et al.

Reliable estimates of the rate at which DNA accumulates mutations (the substitution rate) are crucial for our understanding of the evolution and past demography of virtually any species. In humans, there are considerable uncertainties around these rates, with substantial variation among recent published estimates. Substitution rates have traditionally been estimated by associating dated events to the root (e.g. the divergence between humans and chimpanzees) or to internal nodes in a phylogenetic tree (e.g. first entry into the Americas). The recent availability of ancient mtDNA sequences allows for a more direct calibration by assigning the age of the sequenced samples to the tips within the human phylogenetic tree. But studies also vary greatly in the methodology employed and in the sequence panels analysed, making it difficult to tease apart the causes for the differences between previous estimates. To clarify this issue, we compiled a comprehensive dataset of 350 ancient and modern human complete mtDNA genomes, among which 146 were generated for the purpose of this study, and estimated substitution rates using calibrations based both on dated nodes and tips. Our results demonstrate that, for the same dataset, estimates based on individual dated tips are far more consistent with each other than those based on nodes and should thus be considered as more reliable.
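To make the idea of tip calibration concrete, here is a minimal sketch of a root-to-tip regression. This is not the authors' method; it is a much cruder, swapped-in technique that only illustrates why dated tips carry rate information: older samples have had less time to accumulate substitutions, so the slope of distance against sample age gives a rate. The function name, the sample ages and the distances are all invented for illustration.

```python
# Toy root-to-tip regression, a rough stand-in for tip calibration.
# All names and numbers here are hypothetical, not taken from the paper.

def fit_tip_dated_rate(ages_years, root_to_tip):
    """Least-squares fit of root-to-tip distance against sample age.

    ages_years: age of each sample in years before present (0 = modern).
    root_to_tip: genetic distance from the root to each tip (subs/site).
    Returns (rate per site per year, estimated root age in years BP).
    """
    n = len(ages_years)
    mean_age = sum(ages_years) / n
    mean_dist = sum(root_to_tip) / n
    sxx = sum((a - mean_age) ** 2 for a in ages_years)
    sxy = sum((a - mean_age) * (d - mean_dist)
              for a, d in zip(ages_years, root_to_tip))
    slope = sxy / sxx                  # negative: older tips sit closer to the root
    intercept = mean_dist - slope * mean_age
    rate = -slope
    root_age = intercept / rate        # age at which the fitted distance reaches zero
    return rate, root_age


# Synthetic data built from a known rate and root age, so the fit can be checked.
TRUE_RATE = 2e-8        # assumed value, for the toy data only
TRUE_ROOT_AGE = 150_000  # assumed value, for the toy data only
toy_ages = [0, 0, 10_000, 40_000]
toy_dists = [TRUE_RATE * (TRUE_ROOT_AGE - age) for age in toy_ages]

rate, root_age = fit_tip_dated_rate(toy_ages, toy_dists)
print(rate, root_age)
```

With noise-free toy data the fit simply recovers the values it was built from. Real data are noisy and rate-variable, which is presumably why studies of this kind report intervals such as the 95% HPD ranges quoted above rather than point estimates.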

I think they have too few ancient samples to make a reliable estimate of the mutation rate. And nearly half of their samples are from haplogroup U.

The use of ancient mtDNA does place some useful constraints on age estimates, but it still suffers from the problem that the mutation rate is highly variable, so you need a very large sample size whether you are using modern or ancient samples.

@Dienekes - Figure 1 that you linked to is based on only 320 contemporary samples, selected from just a few studies (e.g., 8 samples are the closely related U5b3a1a samples from Sardinia from the Pala et al. study). So the figure does not accurately represent the known structure of the mtDNA phylotree based on more than 20,000 samples. The Lippold draft paper suffers from the same flaw. Their figures do show the structure of their limited set of samples, but it's not very useful for evaluating the relationship of M and N and their sister clades to L3. The van Oven phylotree makes this relationship quite clear with M, N and several African sister clades all sharing the same 3 mutations that define L3.

Interestingly the post glacial expansion is assumed to be typified by H1 and H3 at around 18kya. This would make it pre-neolithically near (or in) south western Europe (IMO).

The American haplogroups are, in order of age: C1 > D1 > A2 > B2 > X2a

This fits very nicely in with the data for Cuba and the rest of the Caribbean so far, as I said in a post. C1 first, B2 last. X2a (which I think came from northern Europe) is listed as 23k in America! I kind of expected C1 to be a lot older than the others though.

"The van Oven phylotree makes this relationship quite clear with M, N and several African sister clades all sharing the same 3 mutations that define L3."

A reminder for all those who like facts. L3 is currently defined by a combination of bases at sites 769 1018 and 16311. But L3 agrees with Denisova at 769 (all African-specific mtDNA lineages agree with Neandertals), is uniquely derived at 1018 (all African specific mtDNA lineages agree with Denisova and Neandertals) and 16311 is hypervariable in modern humans with many actual L3 sequences classified as downstream from the L3 node (e.g., M1, M2a2, M4''67, etc.) carrying the ancestral site (C) and some actual African-specific sequences (L2, L5c2) carrying a derived site (T).

So, contrary to GailT's self-serving misrepresentation of reality, there's only one true mutation on L3 lineages (1018). 769 is ancestral on L3 and 16311 is hypervariable and thus can't be reliably used in phylogenetic analyses.

The age of 31,320 years for haplogroup C1 for example may not indicate the date of its arrival in America for two reasons. Firstly C1 is by no means confined to America and secondly we don't know how diverse the American C1 was when it first arrived. As an illustration I quote further:

"New_Zealand B4a1a1a3 4.94 ... Madagascar B4a1a1a2 4.97"

It is impossible that B4a1a1a of any sort arrived in either place so early. Or even any other haplogroup. In fact the date suggests both B4a1a1a3 and B4a1a1a2 formed very early during the Austronesian expansion. As a result in both cases we can say that the B4a1a1a haplogroups that arrived in the two opposite regions were highly diversified. In other words it seems very unlikely there was any significant bottleneck involved during the journey east and west. Another interesting aspect is that B4a1a1a2 is confined to Madagascar while B4a1a1a3 is confined to Polynesia (not just New Zealand). That in turn suggests whole related tribes moved together, otherwise we would expect to find a mixture of the two haplogroups in both regions, if they really are as ancient as the original Austronesian expansion.

"Remote Oceania B4a1a1a & B4a1a1a1 15.83"

The same applies in this case. Haplogroups of any sort almost certainly did not move beyond the Solomon Islands until the Austronesian arrival, some 4000 years ago. In other words the B4a1a1a that arrived in Remote Oceania was highly diverse. A large population was involved in the expansion.

It is meaningless to say that "769 is ancestral" unless you also specify the base at marker 769. The ancestral state for the RSRS is 769A. L3 and all of its subclades including M and N are 769G. This is immediately obvious to anyone who bothers to look at the phylotree or to look at actual mtDNA sequences.

Hypervariable region markers are, by definition, hypervariable. There are several that are especially variable but that are usually stable enough to be useful for phylogeny. These include markers 152, 16311, 146, 195, 16189, 16129, 16093, 16362 and 150. Whether you include these or not, the combination of A769G and A1018G shows unambiguously that M and N are sister clades to L3a, L3b etc. But for completeness, most samples in L3 are C16311T, and the reversion T16311C! is useful for identifying additional more refined subclades.
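For anyone who wants to check their own sequences, the marker logic above can be sketched in a few lines. This is only an illustration under stated assumptions: the three positions and bases are the ones quoted in this thread (A769G, A1018G, C16311T, polarised against the RSRS), while the function name and the sample are hypothetical. Note that "ancestral" here just means "matches the RSRS", which is exactly the polarity being argued about.

```python
# Markers quoted in this thread as defining L3: position -> (RSRS base, L3 base).
L3_MARKERS = {769: ("A", "G"), 1018: ("A", "G"), 16311: ("C", "T")}


def score_markers(sample_bases, markers=L3_MARKERS):
    """Label each marker position in a sample.

    sample_bases: dict mapping mtDNA position -> observed base.
    Returns a dict mapping position -> 'derived', 'ancestral', 'other' or 'missing'.
    """
    calls = {}
    for pos, (ancestral, derived) in markers.items():
        base = sample_bases.get(pos)
        if base is None:
            calls[pos] = "missing"      # position not covered in this sample
        elif base == derived:
            calls[pos] = "derived"
        elif base == ancestral:
            calls[pos] = "ancestral"    # relative to the RSRS only
        else:
            calls[pos] = "other"        # neither of the two expected bases
    return calls


# Hypothetical sample: carries 769G and 1018G but has C at 16311.
hypothetical_sample = {769: "G", 1018: "G", 16311: "C"}
calls = score_markers(hypothetical_sample)
print(calls)
```

A sample like this one would still sit under L3 on the two coding-region markers, with the C at 16311 read as a reversion, which is how the unstable hypervariable site ends up being used to define subclades rather than to place the sample.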

I find it extraordinary that you can be so obviously wrong and at the same time so rudely insulting in claiming that everyone else is wrong.

"It is meaningless to say that "769 is ancestral" unless you also specify the base at marker 769. The ancestral state for the RSRS is 769A."

769 has G on Denisovan. So this should be considered the ancestral state. L3 agrees with it.

"I find it extraordinary that you can be so obviously wrong and at the same time so rudely insulting in claiming that everyone else is wrong."

The bad news for you is that I'm right. And another piece of bad news is that you only pretend to be competent and objective but in fact you just drive the data toward your desired "out-of-Africa" goal. Whatever you think my attitude is - you deserve it.

769 has G on Denisovan. So this should be considered the ancestral state.

You cannot build a phylotree using a single SNP. You need to look at the full mito-genome. mtDNA has a fast mutation rate and you will find many cases in which the same SNP occurs in unrelated individuals or haplogroups. When you look at the full genome it is obvious that L3 does not descend from Denisovans, and that it does descend from L3'4 which has the ancestral form 769A. Anyone who is willing to spend several hours analyzing mtDNA samples can very easily verify this for themselves, and they will reach the same conclusion as every paper published in the last 30 years. You claim that there is a vast global conspiracy to manipulate the mtDNA phylotree, but anyone can easily verify that this is simply not true.

"You cannot build a phylotree using a single SNP. You need to look at the full mito-genome. mtDNA has a fast mutation rate and you will find many cases in which the same SNP occurs in unrelated individuals or haplogroups. When you look at the full genome it is obvious that L3 does not descend from Denisovans, and that it does descend from L3'4 which has the ancestral form 769A. Anyone who is willing to spend several hours analyzing mtDNA samples can very easily verify this for themselves, and they will reach the same conclusion as every paper published in the last 30 years. You claim that there is a vast global conspiracy to manipulate the mtDNA phylotree, but anyone can easily verify that this is simply not true."

I've spent more time on mtDNA than you. Plus on many other sources of information. You are an amateur, I'm a professional. Your argument doesn't hold water. "obvious", "last 30 years", etc. are arguments that are outside of science. You need to learn scientific methodology before venturing into data analysis. Just go through all the mutations on the human mtDNA lineages and compare them with the Denisovan and Sima counterparts. You'll see the same thing going on with many many other sites. Without ancient DNA attestation, you can't claim that L3'4 is ancestral to L3. They both occur in living humans and the states on individual sites are often vastly different between Denisova and what PhyloTree postulates as ancestral out of thin air. mtDNA lineages were arranged into a present tree prior to ancient hominin mtDNA becoming available. PhyloTree just favors African sequences as supplying the ancestral state. This is a circular argument that's a very easy target for people like me.

"vast global conspiracy."

My approach to human origins is multidisciplinary, hence the results overturn theories generated from a single-discipline viewpoint. It's all very logical. You can believe in my anti-conspiracy conspiracy all you want but I'm just offering a better rational interpretation to observed reality. Get used to it and go back to the drawing board!

@Tobus

"Your use of "should" here seems irrational... what's wrong with the complementary logic "769 has A on Neanderthal. So this should be considered the ancestral state."?"

It's very rational. Denisovans are genetically more divergent than Neandertals. That's what you use to root more recent sequences. The site we are talking about is identical in Sima, Denisovan, chimp and human L3. African L0, etc. and Neandertals represent parallel innovations. That's where the notion that mtDNA is "hypervariable," which GailT misinterprets to her advantage, comes in handy. Derived states can occur through a parallel process but you don't imagine two species reverting to an ancestral state independently. It's just bad methodology.

@German

It's very rational. Denisovans are genetically more divergent than Neandertals. That's what you use to root more recent sequences.

If Neanderthals and Denisovans disagree then it's pretty much 50/50 which has retained the ancestral and which had the novel mutation. Automatically assuming the Denisovan is always ancestral would be bad methodology.

Derived states can occur through a parallel process but you don't imagine two species reverting to an ancestral state independently. It's just bad methodology.

I agree - the most likely phylogeny is the one that requires the least amount of these convergent mutations... in isolation, G at 769 would appear to be ancestral, however, it is flanked by a number of mutations in the L3 lineage that are clearly derived (A1018G, C10400T in M etc.) meaning that it's more parsimonious to consider the G derived... otherwise you have to assume an even larger number of convergent reversals to the ancestral state at other sites in the haplogroup.
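The "fewest changes" argument can be sketched for a single site with Fitch's small-parsimony algorithm, which counts the minimum number of base changes a site needs on a fixed tree. Everything here is a toy under loud assumptions: the topology is a simplified version of the conventional mtDNA tree (which is the very thing in dispute), and the bases at 769 are the ones claimed in this thread, not values I have re-checked against the sequences.

```python
def fitch(tree, states):
    """Fitch small parsimony for one site on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    states: dict mapping leaf name -> observed base.
    Returns (set of most-parsimonious root states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children can agree on a base: no extra change needed at this node.
        return shared, left_cost + right_cost
    # Children disagree: one change is required somewhere below this node.
    return left_set | right_set, left_cost + right_cost + 1


# Assumed toy topology (hypothetical simplification, not a published tree).
toy_tree = ("chimp",
            (("Sima", "Denisova"),
             ("Neanderthal", ("L0", ("L2", "L3")))))

# Bases at position 769 as claimed by commenters in this thread.
site_769 = {"chimp": "G", "Sima": "G", "Denisova": "G",
            "Neanderthal": "A", "L0": "A", "L2": "A", "L3": "G"}

root_states, changes = fitch(toy_tree, site_769)
print(root_states, changes)
```

On this particular topology the minimum is two changes with G at the root, i.e. G to A on the branch leading to Neanderthals plus modern humans and A back to G on L3. Keeping G unchanged all the way down to L3 would need three separate G to A changes (Neanderthal, L0, L2) on the same tree. A single site cannot decide the topology, of course; the real comparison sums such counts over every site for each candidate tree.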

"I suggest that the chimp allele is the relevant one in a situation like this."

To me it seems that human mtDNA is quite divergent from Bonobo and Chimp in this segment. Something must have happened in human-specific evolution to mix things up entirely, though this process had obviously stopped by the time of Sima de los Huesos. The best ancestral allele for humans might thus be available in Sima de los Huesos. Like Denisovans, it has the L3/M/N-specific and apparently "non-African" 769G. Indeed, the entire classification of L3/M/N as being originally "African" largely depends on the proposal of an A769G mutation, which possibly never happened. Somehow I rather perceive a single African (xL3) clade instead, that derives from a G769A mutation! However, Neanderthal G769A may also suggest back-and-forth mutation on the modern human lineage that started somewhere between Sima de los Huesos and just before the earliest radiation of Neanderthal and modern humans. It has also been proposed to exclude Sima de los Huesos entirely from the shared Neanderthal-modern human lineage and put it instead on the Denisovan lineage, though this has been rejected on the basis of fossils (Arsuaga et al., 2014) and appears genetically inspired by scientific ignorance on back-and-forth mutations. I gather some shared mutations between Sima de los Huesos and Denisovans may actually be ancestral. This stresses the worst problem of the current mtDNA phylogeny: we lack precise knowledge of the modern human ancestor. For all we know it ain't rCRS nor L0, nor chimp. It may have been closest to Sima de los Huesos, and some of its divergence may be due to back-and-forth mutations. For one, this doesn't resolve the origin of L3 as African.

"If Neanderthals and Denisovans disagree then it's pretty much 50/50 which has retained the ancestral and which had the novel mutation."

Sima agrees with Denisovan and human L3. So we have the oldest attested and most divergent lineages supported by a bulk of modern ones. So the going-in assumption should be that 769 in Sima/Denisovan/human is ancestral, while 769 on Neandertals and L0'1'3'4'5 is derived. That's good methodology.

"however, it is flanked by a number of mutations in the L3 lineage that are clearly derived (A1018G, C10400T in M etc.) "

There are plenty of sites that are derived in L0'1'2'4'5.

@Rokus

"Somehow I rather perceive a single African (xL3) clade instead, that derives from a G769A mutation! "

Yes, that'd be fine with me. We need to have someone like GailT to do actual work on an alternate tree for us instead of just preaching.

Of course. The point I'm making is that if 769 is indeed ancestral at G, then a whole host of other sites where Africans, archaics and chimps do agree would then have to be treated as being back-mutations, and as you yourself said "you don't imagine two species reverting to an ancestral state independently. It's just bad methodology." It appears that it's impossible to resolve a phylogeny without including some duplicate back-mutations, so the best solution is the one with the least of them. Taken in isolation 769 may look like a simple G->A mutation, but in the context of the entire tree it seems it's much more parsimonious to consider it a G->A->G back mutation.

"The point I'm making is that if 769 is indeed ancestral at G, then a whole host of other sites where Africans, archaics and chimps do agree "

Where's that? I don't know of any studies that compare Africans vs. non-Africans vs. Neandertals vs. Denisovans/Sima vs. chimps. At a time span of 13M years since chimps and non-chimps diverged, back mutations are more likely than at more recent timeframes. It's all a matter of building alternate trees, not advocating for one. Out-of-Africa does the latter.

Ensembl. It took me about 30 seconds to look up the neighbouring sites I provided above that make your version of L3 unlikely.

It's all a matter of building alternate trees

It's a matter of building trees that work, and there are hundreds of people trying every day to build trees that improve the current model... it's all well and good to cherry pick a handful of sites that look odd and whine about a conspiracy theory, but it's hard to take such armchair criticism seriously when everyone who does the actual legwork across the whole of the data comes up with a similar result. Instead of insisting they build you an alternate tree that fits your personal preconceived agenda, perhaps you should accept the research and change your agenda to fit the data.

Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.
