Mitochondrial DNA D and C (new paper)

23Dec

I’m a bit overwhelmed by information of all kinds and real life issues, so I fear I will have to pass quickly and maybe shallowly over this paper, which anyhow seems an important reference for the understanding of the colonization of NE Asia (and America).

The most important substance is without doubt the clarification of the phylogeny and the graphs showing, by color codes, the spread of these two lineages. They also propose archaeologically consistent colonization and recolonization scenarios for NE Asia and even NE India; however, their reliance on the highly uncertain molecular clock methodology is at least arguable.

Haplogroup C

Haplogroup C is derived from CZ, which in turn is derived from M8, which is a basal subclade of macro-haplogroup M. Overall it is removed from the M node by 9 coding region mutations (plus 3 in the control region).

[Figure: haplogroup C graph. Color legend: South/Central Siberia, NE Asia, East Asia, India, Europe]

C1 probably arose in Beringia (most subclades are Native American, not shown in the graph).

C4 and its “granddaughter” C4a are claimed to have expanded first, prior to the LGM; the other major sublineage, C4b, is instead said to be of more recent spread (Holocene). Take these dates with a tablespoon of salt, because the real ones may well be much older and totally different from these descriptions.

A new basal subclade, C4e, has been described with presence in Altai.

C5 is claimed to have a post-glacial-maximum expansion time-frame.

C7 is said to be the most ancient subclade (but still too recent in my opinion) and has a more southerly distribution (East Asia and NE India), yet the graph clearly shows it is also present in America.

A more complete phylogeny is available in the supplemental material.

Haplogroup D

Haplogroup D is a basal subclade of macro-haplogroup M, only separated from the ancestral node by two mutations (both in the coding region). It is also a most diversified haplogroup.

[Figure: haplogroup D graph. Color legend: South/Central Siberia, NE Asia, East Asia, India, Europe]

The authors give an older age of expansion for haplogroup D (35-37 Ka), but it still falls quite a bit short of my understanding, which would make it roughly synchronous with the colonization of West Eurasia from c. 50 Ka ago.

D4, D5 and D6 are also given pre-LGM expansion ages.

There is some refinement of the structure of D4b1a2a, which is now divided into two subclades: D4b1a2a1 and D4b1a2a2. D4b1a2a1 is further subdivided into D4b1a2a1a and D4b1a2a1b.

The authors seem puzzled not to have found any molecular clock guesstimate that corresponds with the archaeologically defined earliest colonization of North Asia, c. 40 Ka ago (Altai). However, this colonization by “Aurignacoid” industries is surely unrelated to these East Asian lineages and most likely related to West Eurasian ones (though it is unclear which ones).

But more “puzzling” is the following:

Importantly, we have not found in northern Asia any genetic signatures of sufficient antiquity to indicate traces of pre-LGM expansions, that originated from the Upper Paleolithic industries that were present both in the southern Siberia and Siberian Arctic, and that date back to ~30 kya, well before the LGM [1], [34], [36]. Apparently, the Upper Paleolithic population of northern Asia did not leave a genetic mark on the female lineages of modern Siberians.

This is probably wrong and surely caused by excessive reliance on molecular clock methodologies that measure not from ancestral nodes but from present-day sequences. These (I understand) are generally conservative towards ancestral values because of the effects of drift, which naturally favors the older and most common variants, except maybe in random accidents or very, very small populations.

Not to mention the usual and awfully wrong practice of calibrating human genealogy using a Pan-Homo divergence well under the most likely age of 8 million years (or more).

So add 20-30% to every age guess you see using the molecular clock methodologies, unless some major improvements are being used or there are other reasons to believe the estimate is correct. This is not any method but a safety measure.
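
As a rough illustration of that safety margin, here is a minimal Python sketch that pads a published estimate by 20-30%. The function name `padded_age_range` and its parameters are made up for this example; it is plain arithmetic, not a dating method of any kind.

```python
def padded_age_range(age_ka, low=0.20, high=0.30):
    """Return (min, max) age in Ka after adding a 20-30% safety margin."""
    return (age_ka * (1 + low), age_ka * (1 + high))


# The 35-37 Ka expansion age that the paper gives for haplogroup D.
for age in (35, 37):
    lower, upper = padded_age_range(age)
    print(f"{age} Ka -> {lower:.1f} to {upper:.1f} Ka")
```

Applied to the 35-37 Ka estimate for haplogroup D, the padded range runs from about 42 Ka to about 48 Ka.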

31 responses to “Mitochondrial DNA D and C (new paper)”

The only really "hard" constraint on the dates is the appearance of C and D haplogroups in the Americas, which implies that the basal mutations in the C and D lineages must have arisen from M before the migration of pre-Native Americans from Siberia to Beringia.

In a sense true. The colonization of America c. 20-15 Ka ago is a lower limit not for C and D as wholes but for the sublineages shared, or split, between North Asians and Americans. But I understand that mtDNA D, with its two mutations from the M node, is equivalent to mtDNA N or maybe R, so it must be from a much older time-frame than the one proposed here, more like 50 Ka or maybe even more. And surely this reasoning applies to all other clades.

"They also propose archaeologically consistent colonization and recolonization scenarios for NE Asia and even NE India" But not SE Asia? Doesn't that rather rock the boat for a southern coastal migration for the spread of either haplogroup? Presumably the presence of either haplogroup in SE Asia is a result of more recent southward movement. And I'm still very suspicious of the current classification of the Australian/New Guinea/Melanesian M haplogroups. So many of them. But only one M haplogroup (M7) between that region and the mutiple Chinese and Indian M haplogroups. Mitochondrial DNA haplogroups are accepted as being the conservative ones regionally yet it seems that the current classification demands we accept that virtually all intermediate haplogroups have been completely exterminated. Further. Three of the M haplogroups (M27, M28 and M29) are confined to island Melanesia. These islands were presumably settled from New Guinea but only M29 is derived from any New Guinea haplogroup. In this case Q. Where did the others come from? The only conclusion compatible with the current classification is that humans completely died out through island SE Asia after they'd reached New Guinea/Australia. But that argument is complicated by the survival of Y-hap C2 and mtDNAs B and F.

Let's see: in my reconstructions of coastal migration I always place the expansion of D and M8 (the ancestor of C) in East Asia. In fact I may even place them a bit further to the north, considering the present data… but in China in any case. They are offshoots of M: you can make them walk through the inhospitable Tibetan Plateau, Takla Makan desert or ultra-cold Siberian steppe/tundra… or you can make them follow the Brahmaputra/Bengal Bay coast and then the Indochinese rivers or coasts, all of them much more hospitable. The overall reasons for a coastal migration between India and East Asia are not these haplogroups in particular (though they fit well too) but especially the Near Oceanian ones. "And I'm still very suspicious of the current classification of the Australian/New Guinea/Melanesian M haplogroups. So many of them. But only one M haplogroup (M7) between that region and the multiple Chinese and Indian M haplogroups". There are some M clades among Negritos of Malaysia… at least. I'm not even sure where each M subclade belongs because there are so many of them. I understand that there are other M clades in SE Asia and East Asia but they are smaller and seldom mentioned. "Further. Three of the M haplogroups (M27, M28 and M29) are confined to island Melanesia. These islands were presumably settled from New Guinea but only M29 is derived from any New Guinea haplogroup. In this case Q. Where did the others come from?" Straight from India, more or less. They are all founder effects, refined by drift. The migration was effectively "rapid" leaving relatively little structure behind.

"you can make them walk through the inhospitable Tibetan Plateau, Takla Makan desert or ultra-cold Siberian steppe/tundra… or you can make them follow the Brahmaputra/Bengal Bay coast and then the Indochinese rivers or coasts" There's a third option, much more likely. They moved through the hill country in the north of Burma/Yunan. Nothing coastal about it. "There are some M clades among Negritos of Malaysia… at least". Apart from M7 there are small amounts of M21 and M22. These are found elsewhere as well, even in Bangla Desh, so they are not actully specific to SE Asia. Most SE Asian M is M7. "Straight from India, more or less. They are all founder effects, refined by drift. The migration was effectively 'rapid' leaving relatively few structure behind". I find that very difficult to accept. Even a 'rapid' movement would not carry the whole population with it. The Polynesians almost certainly moved rapidly out into the Pacific, yet they left plenty of evidence for their trail in spite of the founder effects and drift obviously involved.

"There's a third option, much more likely. They moved through the hill country in the north of Burma/Yunan. Nothing coastal about it".That option is not really different from the coastal model, at least the way I understand it. Actually I have always thought, considering the archaeological evidence and common sense, that coastal migration, riverine migration and migration through other geographies such as hills are all compatible and surely happened in parallel. The term coastal migration is for me not different of tropical migration, often but not necessarily along the coasts. As for the M subclades, I am not sure because I never looked in sufficient depth into the matter (here is where a well documented blog by you could be of help, as you are particularly interested in this region and its lineages). However I see the following generic East Asian M lineages from this (not-updated) tree I created almost two years ago:M7, M8 (incl. CZ), M9 (incl. E), M10, M11, M12'G, M13, M21, M31 (Andaman), M32 (Andaman), M44'52, M46 and D. There may be others. M9 (incl. E) is for example a clear case of important SEA lineage. So it's not just M7. That some are shared with South Asia only emphasizes the coastal/tropical migration process, nothing else. "I find that very difficult to accept. Even a 'rapid' movement would not carry the whole population with it". It does not matter: you can see this process as a subgroup with mtDNA M (possibly mixed in varied amounts with those carrying pre-N) migrating from South Asia and then scattering around, each establishing their own unique founder effects. The migrant groups would all be M (M-root) with whatever "private" sublineages, some of which became dominant by founder effect (plus drift) and we call them now haplogroups with specific names. 
The fastness of the migration is emphasized because some of this lineages are just one (control region) mutational step (or two) downstream of M, what means a few thousand years on average:- M7 (2 mutations)- M9 (1 mutation) > E- M12'G (1 mutation)- M13'46'61 (1 mutation)- M14 (Australia, 1 mutation)- M29'Q (Melanesia, 1 mutation)- M32'56 (and M32, Andaman, 1 mutation)- M44 (1 mutation)- D (2 mutations)All these lineages were surely expanding in their destinations in maybe 10,000 years after the general M expansion in South Asia. They should be roughly contemporary with the expansion of N as well. So, if the general M expansion happened c. 120,000 BP, this happened c. 110 Ka, and, if the general M expansion happened instead c. 75 Ka ago (Toba) then these would have happened c. 70-65 Ka. Very roughly and unrevised all, but the general idea is right. "The Polynesians almost certainly moved rapidly out into the Pacific, yet they left plenty of evidence for their trail in spite of the founder effects and drift obviously involved".Precisely.
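
The arithmetic behind those rough dates can be sketched in Python. The step counts are the ones listed in the comment; the figure of 5,000 years per mutational step is only an assumption, chosen so that two steps add up to the "maybe 10,000 years" mentioned, and the function name `expansion_date_ka` is hypothetical.

```python
# Mutational steps below the M node, as listed in the comment.
STEPS_FROM_M = {
    "M7": 2, "M9": 1, "M12'G": 1, "M13'46'61": 1, "M14": 1,
    "M29'Q": 1, "M32'56": 1, "M44": 1, "D": 2,
}

# Assumption for illustration only: about 5,000 years per mutational step.
YEARS_PER_STEP = 5_000


def expansion_date_ka(m_expansion_ka, steps, years_per_step=YEARS_PER_STEP):
    """Rough date (in Ka) for a clade sitting `steps` mutations below M."""
    return m_expansion_ka - steps * years_per_step / 1000


for m_age_ka in (120, 75):  # the two M expansion scenarios discussed
    for clade, steps in STEPS_FROM_M.items():
        date = expansion_date_ka(m_age_ka, steps)
        print(f"M at {m_age_ka} Ka: {clade} around {date:.0f} Ka")
```

Under these assumptions the two-step clades land at 110 Ka or 65 Ka and the one-step clades at 115 Ka or 70 Ka, the same range as the figures given. A different rate per step shifts everything accordingly.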

Not "precisely". I misread your sentence. Polynesians did not leave such a clear trail. In fact, at least in Y-DNA we are struggling to find a single thread that unifies all Austronesians: founder effects and drift (and lineage borrowing) were dominant clearly. Also Polynesians are much much more recent and had technology (farming) that made them much more stable. In the Middle Paleolithic, the role of founder effects and drift should have been even much more marked, especially as millennia went by, erasing in some cases the smaller "leftover" lineages that could document the migrations as you imagine should be the case.

"In fact, at least in Y-DNA we are struggling to find a single thread that unifies all Austronesians" But if we confine ourselves to just Polynesians the route stands out clearly. Y-hap C2, originated in southern Wallacea and passing through western New Guinea and east along the northern coast, then out into the Pacific. Some O3 followed along. The problem for your view on the Austronesians is that you are looking for an 'Austronesian haplogroup'. As you've pointed out many times culture can be reasonably independent of particualr haplogroups: "founder effects and drift (and lineage borrowing) were dominant clearly". However it is reasonably easy to discern the pattern of mixing in Wallacea and SE Asia for the haplogroups involved. "Also Polynesians are much much more recent" Granted, and we can see where haplogroups associated with the original migration have been diluted by later movement. However this is not really possible for the presumably earlier migration through the region to New Guinea/Australia. To me there is something wrong there, unless the appropriate haplogroups are derived from something like M7. "All these lineages were surely expanding in their destinations in maybe 10,000 years after the general M expansion in South Asia". But none survived anywhere along any possible route through island SE Asia. "erasing in some cases the smaller 'leftover' lineages that could document the migrations as you imagine should be the case". Generally speaking I would guess that the first lineages to reach a so far uninhabited region would rapidly expand and fill it, making it difficult for later ones to replace them unless they had a far suprior technology or used some different aspect of the environment. Therefore we would expect haplogroups from the first migration throutg the region to remain.

"To me there is something wrong there, unless the appropriate haplogroups are derived from something like M7".No, that thought is wrong unless you presume that the scatter was quite slow and/or you just do not conceive the coastal route at all. Of course, it is still possible that some small and ill-studied haplogroups under M are in need of improved understanding and that some may be eventually found to be joined in somewhat larger haplogroups (the opposite is also true in some cases, at least one lineage that I listed as "double" in 2009 is now two distinct basal clades of M), but there is no particular reason for M7 (which is tested for normally) or M9 (much more important, specially as E, and also tested for generally) to be ancestral to these small clades. I think it's all perfectly understandable in terms of rather rapid migration, founder effect and drift. "But none survived anywhere along any possible route through island SE Asia".Seems not. The populations involved were very small and drift was intense, picking this or that clade here or there. However I am unsure of the mtDNA pool of Wallacea and Filipino Negritos, so possibly there are surprises over there. Particularly there should be "Papuan" and maybe "Australian" lineages in Wallacea, the same that there are Y-DNA ones and preserved linguistic affinities. I really see no reason for a small, quickly moving, lineage to leave any trace, drift alone should erase them easily. In several thousand years (average for each new mtDNA mutation), in the context of a very dynamic expansion, there is a high chance that each new "colony" evolved its own variant of M (or several). So we can easily conclude that soon after the M explosion there were dozens of new populations, many in South Asia but others in East Asia or Near Oceania. The trace of M basal sublineages is just other M basal sublineages, all them are "millennial sisters". 
And all them explain together M-root, which was once the only lineage besides (pre-)N (at least that has survived) in the Eurasian migrant population.

"Generally speaking I would guess that the first lineages to reach a so far uninhabited region would rapidly expand and fill it"…Agreed but if a population ends up with 70% A and 30% B and another sister population ends up with 30% A and 70% B, the most likely outcome after some drift (assuming isolation, small size and enough time) is that P1 ends up as 100% A and P2 ends up as 100% B. So it's not like someone else comes and erases by superimposition but that the erasure happens within each of the populations, more or less randomly. Normally the low frequency lineages would be erased in X generations, and often one of the more common lineages, maybe two or three, would end up as the only one(s) remaining. If a "tribe" has an effective pop. size of 20 women, for instance, and maybe 7 initial lineages, most will be restricted to one or two women and are likely to be "broken" by chance in few generations (for example, W has only boys, or dies without any descendants). Probably one alone will remain after some generations. However the smaller the population, the greater the effect of randomness; so we cannot make solid predictions, just describe statistically reasonable expectations. In any case you have to think the early expansion as that of M-root, with whatever private sublineages, flourishing eventually as small or larger haplogroups… or not at all.

"If a 'tribe' has an effective pop. size of 20 women, for instance, and maybe 7 initial lineages, most will be restricted to one or two women and are likely to be 'broken' by chance in few generations" But the population at any ancient margin of human expansion would already have been subject to considerable drift. So it is extremely likely it would already consist of just one male and one female haplogroup even if it consists of 20 women (or men). To me this is the most likely explanation for the regional distribution of haplogroups. "I really see no reason for a small, quickly moving, lineage to leave any trace, drift alone should erase them easily". The original population may well have been small and quickly moving, but once that 'small' population reached an uninhabited island, for example, it would have expanded in numbers rapidly on that island while other members kept on moving. Therefore it is difficult to see how a haplogroup could dissappear, unless the whole population of that island became extinct. "So we can easily conclude that soon after the M explosion there were dozens of new populations, many in South Asia but others in East Asia or Near Oceania". Yes, but we should still be able to detect their order of mutation, just as we can with the presumably even more rapidly expanding Lapita/Polynesian people. Y-hap C2 in Tengarra, C2a in the Admiralty Islands, C2a1 in Polynesia. Mitochondrial DNA B4a in Taiwan and the Philippines out to B4a1a1a in Polynesia. "In any case you have to think the early expansion as that of M-root, with whatever private sublineages, flourishing eventually as small or larger haplogroups… or not at all". It looks very much as though the first M-root haplogroups to reach New Guinea/Australia did not survive at all, which is surprising.

"But the population at any ancient margin of human expansion would already have been subject to considerable drift. So it is extremely likely it would already consist of just one male and one female haplogroup even if it consists of 20 women (or men)".That is M and pre-N: two haplogroups. Then the expansion was fast: it probably moved between regions in a matter of one or few millennia and most people carried the M-root lineage without further modifications (or maybe an L3-other or L0 that never really made it – or pre-N, which survived). "The original population may well have been small and quickly moving, but once that 'small' population reached an uninhabited island, for example, it would have expanded in numbers rapidly on that island while other members kept on moving".Sure: island -> lineage #1, others who keep migrating -> lineage #2, etc… All derivatives of M. "Yes, but we should still be able to detect their order of mutation".No, because a mtDNA mutation takes on average (my estimate) several millennia to happen and consolidate. This expansion was faster than the effective mutation rate. "Y-hap C2 in Tengarra, C2a in the Admiralty Islands, C2a1 in Polynesia".First, Y-DNA is brutally much larger than mtDNA and so is the chance of mutation. Y-DNA accumulates mutations much much faster than mtDNA (another thing is whether we know about them). Second, C2 maybe 50 Ka ago, C2a1 recently, hence C2a maybe 30 or 25 Ka ago. My opinion (and you knew it already)."Mitochondrial DNA B4a in Taiwan and the Philippines out to B4a1a1a in Polynesia".Are you telling me that there is not B4a1a1a anywhere but in Polynesia? I am sure that all those mutations have ot evolved since Polynesias sailed out from Philippines. I understand that B4a1a1a is in Philippines or otherwise in an original area of Polynesian scatter, just that geneticists do not bother listing it or even testing for it most of the time. 
"It looks very much as though the first M-root haplogroups to reach New Guinea/Australia did not survive at all, which is surprising".Of course they did: they are now Q and the other M lineages! Nothing remains static: every single actual lineage accumulates some mutations eventually, specially, as you seem to agree with, when the population is small.But you cannot expect the people of Boungaiville to evolve the same mutations as the people in Japan, hence ones are M13 and the others M8a, for instance. The mutations evolved in situ, at least for the small lineages in remote places – larger lineages must have experienced secondary expansions after coalescence and hence are more difficult to attribute an specific center.

"That is M and pre-N: two haplogroups". Fits my statement exactly. Drift reduced the OoA mtDNA haplogroups to just two. But that wouldn't be the end of the effect. As the population containing the two haplogroups expanded drift would again act on the margins of that expansion reducing the haplogroup diversity. The most likely scenario is that the two haplogroups expanded along different routes. "Of course they did: they are now Q and the other M lineages!" Sorry. I meant along the way. Obviously they survived in New Guinea. "This expansion was faster than the effective mutation rate". Faster than the expansion of the Polynesians with their advance boating technology? "Are you telling me that there is not B4a1a1a anywhere but in Polynesia?" Austronesian at least, although some variants of B4a1a1a are confined to the eastern Pacific as shown in the diagram here: Madagascar has its own variant. http://massey.genomicus.com/publications/Razafindrazaka_2010_EurJHumGenet_v18_p575.pdfGoing back to an earlier comment: "In several thousand years (average for each new mtDNA mutation), in the context of a very dynamic expansion, there is a high chance that each new 'colony' evolved its own variant of M (or several). So we can easily conclude that soon after the M explosion there were dozens of new populations, many in South Asia but others in East Asia or Near Oceania". Interestingly you have vehemently opposed any of my comments proposing just such a phenomenon in relation to N's expansion: a spread of populations along the migration route.

Another link regarding haplogroups M and B in the New Guinea/Melanesian region: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0000248 Figure 4 is the relevant one for the distribution of B4a1a1. "As shown in figure 4a and 4b, haplogroup B4a1a1 is rare in Island Southeast Asia" The authors go on to point out difficulties with it being 'the Polynesian motif' but these difficulties are easily explained. "What is clear is that precursors of the 'Motif' originated to the west of Wallacea in the early Holocene; that the full 'Motif' with the transition at 14022 developed in eastern Island Southeast Asia or Near Oceania; that its frequency varies a great deal across Island Southeast Asia, Near Oceania, and sections of Remote Oceania before becoming very common in central Polynesia"

"Austronesian at least, although some variants of B4a1a1a are confined to the eastern Pacific as shown in the diagram here: Madagascar has its own variant". Confined or found only there? There are too many mutations in each of those lines to have happened all since Austronesians expanded. The reasonable explanation is that they existed (finished or not) before expansion and that they thrived after each local founder effect in their destinies, while in Borneo or Philippines they probably also exist (or existed in the past), mutation up or down, but are just not even looked for. Mutations do not need to happen precisely upon migration, actually that is a most rare coincidence, they normally exist prior to it or evolve after migration. But drift may and often does clear up them from the source population. Your understanding is overly simplistic. "Interestingly you have vehemently opposed any of my comments proposing just such a phenomenon in relation to N's expansion: a spread of populations along the migration route".You do not seem to be understanding what I say. This M case is the same as the N case: either mutations evolve after migration (from the core area, normally more or less radial migration) or they evolved before but were drifted out except in destination (founder effect). As you know, I reconstruct the origin area based on where basal derived lineages are found, not imaginary fantasy routes I may conceive on whatever odd pseudo-logic. That's why I think that M exploded in South Asia and N in SE Asia (and not anywhere else).

"Figure 4 is the relevant one for the distribution of B4a1a1".Relevant for B4a, because they make no distinction between B4a1a1 and B4*"The “Motif” has also been confirmed in central and eastern Indonesian populations in low frequencies"…And it is surely even more common in Philippines, where B4a reaches frequencies of more than 11% (Hill 2006, reproduced in Supplementary table 3). B4a1a1 is anyhow a misleading category:"the transition at 16247 had been used to identify the “Motif” in many earlier studies, but it is hypermutable in our series and therefore is not reliable".So B4a1a is the real haplogroup here. I must insist that it is most unlikely that most of those mutations arose in situ, specially the coding region ones. You can see that the Malagassy motiff shares one coding region mutation (1473) but downstream there are only HVS ones (and not many). Most probably the proto-Malagassy already had the transition at 1473 some 3000 or 2000 years ago when they colonized Madagascar. It is possible, very likely IMO, that an exhaustive survey of Indonesians and Filipinos would produce all or most of those lineages at low or middling frequencies. But this research is yet to be made. They might even be found in Taiwan Aborigines.

"Mutations do not need to happen precisely upon migration, actually that is a most rare coincidence, they normally exist prior to it or evolve after migration". Obviously migrations involve more than just one woman, so obviously the haplogroup is varied by the time of any migration although it can still consist of basically one haplogroup. Hence the discrepancy between the diversification dates and the archeological evidence for arrival. "It is possible, very likely IMO, that an exhaustive survey of Indonesians and Filipinos would produce all or most of those lineages at low or middling frequencies". Quite possibly so, but probably from back-migration. "So B4a1a is the real haplogroup here". The Polynesian motif is B4a1a1a. A downstream mutation. "This M case is the same as the N case" Totally agree. That's why their distribution is completely different. "they evolved before but were drifted out except in destination (founder effect)". That is the more likely scenario except that if you actually look at the evidence you see that they were not 'drifted out' along the route they took from just outside Africa.

"… probably from back-migration".Surely not. You cannot even be serious here: what impact could have the whole population of Guam in Philippines? Zero! Also we know that the (efficient) migration happened in one direction: from the Philippines (and other Malay archipelago islands maybe) outwards to the Ocean (and its tiny islands – sure NZ and Madagascar are big but it's extremely far away). Surely all the B4a1a1 phenomenon originated in a dynamic subpopulation in Philippines (most likely), from where all those populations scattered through the islands of the Pacific Ocean ultimately sprung. "The Polynesian motif is B4a1a1a. A downstream mutation"…My bad, I missed an item in the name. What the authors of the PLoS ONE paper say is that B4a1a1a is fake (hypermutability of the "defining" HVS-I site) and that the real one is B4a1a1, defined by the 14022 site. "That is the more likely scenario except that if you actually look at the evidence you see that they were not 'drifted out' along the route they took from just outside Africa".I do not understand this.

"Surely not. You cannot even be serious here" what a ridiculous outburst. So ridiculous I wasn't going to bother replying. My comment was based on what one of the authors had written: "The 'Motif' has also been confirmed in central and eastern Indonesian populations in low frequencies [42], and it could have originated either there or in Near Oceania [see 33]". Do you really believe migrations are always unidirectional? Even many of your conquistadors returned. The classic back-migration, by your own reckoning, is that of several major haplogroups back west through India from SE Asia (mtDNA R and Y-hap MNOPS). Obviously the whole population doesn't keep moving. Individuals remain behind and survive to form populations. At times these remaining populations are able to expand. There is no reason at all why they need follow on behind the original migration. They are just as likely to move back in the direction from which they came.

"My comment was based on what one of the authors had written:"The 'Motif' has also been confirmed in central and eastern Indonesian populations in low frequencies"…"Precisely. And I'm almost certain that the "motif" is found at even greater frequencies in Philippines, specially in some subpopulations, and that careful study of Filipino haplotypes should show that it is ancestral there. Borneo should also be considered. I do not think that Polynesians went back to ISEA in any numbers sufficient to make an impact. The fact that the same motif is found in Madagascar also speaks in favor of an origin that pre-dates the expansion into Oceania: an origin in ISEA, most probably in Philippines and/or Borneo.

"And I'm almost certain that the 'motif' is found at even greater frequencies in Philippines, specially in some subpopulations, and that careful study of Filipino haplotypes should show that it is ancestral there". And here is just such a study: http://mbe.oxfordjournals.org/content/27/1/21.fullQuote: "Analysis of hypervariable segment I sequence variation within individual mtDNA haplogroups indicates a general decrease in the diversity of the most frequent types (B4a1a, E1a1a, and M7c3c) from the Taiwanese aborigines to the Philippines and Sulawesi" Back to you: "I do not think that Polynesians went back to ISEA in any numbers sufficient to make an impact". I'm not talking about 'Polynesians', I'm talking a back movement from before they actually formed way out beyong Fiji. And another comment from the paper: "Haplogroup B4a1a is highly diverse in Taiwan, but the subclade (B4a1a1) characterized by a mutation at np 14,022 is absent there. The identification of haplogroup B4a1a1 in the Philippines may indicate a stage of development of the Polynesian Motif along the north to south pathway proposed in the general Out of Taiwan model for the Austronesian population expansion. This apparently completes a series of genetic links from Taiwan (where the B4a1a motif may have originated), through the Philippines (where the np 14,022 mutation might have evolved) and finally to Indonesia (where the full Polynesian Motif first occurs). However, the observation of a B4a1a1 sample in the Philippine population is not necessarily incompatible with models that argue for an extended development period for the Polynesian Motif in ISEA, if the proposed area of development of the motif is expanded to include the Philippines. Another alternative explanation is that the B4a1a1 lineages might have been brought to the Philippines by a back migration from Indonesia". Note the last sentence. 
Talking of back-migration; we have the well-accepted evidence that the Polynesians did actually move back in a westerly direction. The languages spoken in the 'Polynesian Outliers' in parts of the Solomons and in Vanuatu are a product of movement from somewhere round Samoa. http://en.wikipedia.org/wiki/Polynesian_outlier

Thanks very much for that link. I think it is at least mildly supportive of the hypothesis of B4a1a1 coalescing in the Philippines or elsewhere in ISEA (Sulawesi? Borneo is missing from the study). Other data is also interesting. "Note the last sentence". You can't discard anything, I guess. Not until the substructure of B4a1a1 is analyzed in depth, also in the context of ISEA, which it seems to me this paper does not address. "Talking of back-migration; we have the well-accepted evidence that the Polynesians did actually move back in a westerly direction. The languages spoken in the 'Polynesian Outliers' in parts of the Solomons and in Vanuatu are a product of movement from somewhere round Samoa". Maybe. But how safe is that linguistic reconstruction? Those islands are Melanesian, rather than Polynesian, so they should have been settled from Near Melanesia, not Samoa. Anyhow, I think it's reasonably safe to discard back-migration from the tiny scattered islands of Oceania causing any impact. Very specially as the same motif is found in Madagascar, which is only related via ISEA.

"But how safe is that linguistic reconstruction?" Completely accepted. No arguments against it are offered these days, although it was once thought they were remnants of the original Polynesian migration. "Those islands are Melanesian, rather than Polynesian, so they should have been settled from Near Melanesia, not Samoa". But language can be independent of genes to a surprising extent. The Polynesian outliers are mainly on offshore islands though, rather than on the bigger ones. "Very specially as the same motif is found in Madagascar, which is only related via ISEA". I jotted down some further thoughts on the subject earlier today: The 'Polynesian motif' is probably much better termed an 'Austronesian motif', but it is certainly not the only one. It happens to be the main surviving one carried out into the eastern extremity of the Austronesian distribution, Polynesia. At the western end of the distribution, Madagascar, we find not only the 'Polynesian motif' but M23, a subclade of M23'77. So M23 could be called an Austronesian motif as well. The link pointed out that some E was also carried east with the Austronesians, however it didn't get as far as B4a1a1a did. Concerning back-migration. The link suggested the possibility that some Melanesian haplogroups have made it to the Philippines, most probably some time before the Austronesians developed. To me this expansion seems linked to mtDNA R's expansion. Haplogroup P diversified just one mutation later than R did, and reached New Guinea, Australia and the Philippines. If the P in the Philippines is not a back movement, but a remnant left there in passing, it raises the question of whether the various other OZ/NG Ms crossed Wallacea with it. These M and P haplogroups may have been the first to people New Guinea, perhaps 45,000 years ago. The question then arises as to whether haplogroups O, S, N13 and N14 also all arrived in Australia at that time.
If they did so, it would provide some support for your SE Asian N origin theory. To me the fact that no basal N haplogroups survive in New Guinea argues for a separate, and earlier, occupation of Australia by mtDNA N and Y-hap C, perhaps 55,000 years ago. MtDNAs M and P, and Y-haps K and M, crossed Wallacea some time later, perhaps just a short time later though. Varieties of these later mtDNA M and Y-hap F haplogroups made it to both Australia and New Guinea/Melanesia. So N's expansion is earlier than that of R, and possibly not from SE Asia. It could have originated a long way from there. The fact that R is just one mutation removed from basal N tells us little about N's origin. However we can be sure that R's expansion was rapid. As well as the P haplogroup, haplogroups as far away geographically as R0 are also just one mutation removed from basal R. Surely you're not claiming that R's expansion over that distance was instantaneous? On the other hand the SE Asian haplogroup R14 has 23 mutations before it diversified. Did it travel some huge distance before it diversified?

Well, you may be right about the Polynesian outlier islands. I did not grasp the concept well at first. They seem to be enclaves in otherwise Melanesian territory. "The link suggested the possibility that some Melanesian haplogroups have made it to the Philippines"… I noticed. Not too surprising, IMO. And hard to discern possible origins and timelines… with the available data (too many clades, too many islands, too low resolution). "If the P in the Philippines is not a back movement, but a remnant left there in passing"… It's an interesting idea to explore. It'd be nice if we'd finally get to know something about Filipino Negrito mtDNA. That could help moving a step beyond speculation. "… no basal N haplogroups survive in New Guinea"… No O?, no S? "… a separate, and earlier, occupation of Australia by mtDNA N and Y-hap C"… Could be two routes as well. I see no particular reason in mtDNA to think that Australia was settled earlier but maybe it was partly by a different route. For example, in my latest reconstruction attempt, it came as follows (each step is one "SNP-time-equivalent"):
1. Melanesia: M29'Q (this is right after the M "explosion")
2. Australia: M14 and M42 (this is coincident with the N "explosion")
3. Australia: S (misplaced in the map), Melanesia: M27 (coincident with the R "explosion")
4. "Oceania": O (seems to be the old N12, so probably Australian), Melanesia: P
It seems like a single process embedded in the larger process of Eurasian expansion. Clades "rain" on Australasia soon after their parent clades expand in Asia. However they make differential founder effects depending on where and when they end up. I'd guess that Y-DNA C and MNOPS operated the same way. I would agree that M and S should correspond with mtDNA P specifically (as there is a rather strong correlation between Y-DNA MNOPS and mtDNA R overall) but C should correspond with M and N(xR) equally, the same as in East Asia (where we also find D associated with these mtDNA macro-haplogroups).
"… haplogroups as far away geographically as R0 are also just one mutation removed from basal R. Surely you're not claiming that R's expansion over that distance was instantaneous?" But it was fast: in a few millennia. Say five millennia (maybe just three or even less). It's enough time (some 100-250 generations, roughly) and it's just an estimate, but it means that this population tended to move for whatever reason. That is about a third of the time that the "out of Africa" migration took between Africa and India. "On the other hand the SE Asian haplogroup R14 has 23 mutations before it diversified. Did it travel some huge distance before it diversified?" This is not something I decide based on the length of the stem but on the geography of basal diversity. So I guess not. Taken as you say it, with no other information but itself, it could have traveled a lot or stayed put – impossible to say from a single lineage alone.
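For what it's worth, the "100-250 generations" figure above can be reproduced with simple arithmetic. The sketch below assumes a generation length of 20 to 30 years; that range is an assumption made here for illustration, it is not stated anywhere in the discussion, and the function name is made up.

```python
def generations(years, generation_length):
    """Number of generations that fit into a span of years."""
    return years / generation_length

# Assumed bounds (not from the discussion): 20 to 30 years per generation.
# Lower bound: the shorter span (three millennia) with long generations.
low = generations(3000, 30)
# Upper bound: the longer span (five millennia) with short generations.
high = generations(5000, 20)

print(f"roughly {low:.0f} to {high:.0f} generations")
```

With a different assumed generation length the bounds shift accordingly, so this is only a plausibility check on the order of magnitude.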

"It'd be nice if we'd finally get to know something about Filipino Negrito mtDNA. That could help moving a step beyond speculation". More than a year ago Spencer Wells claimed he was about to release the results of such a study he'd completed for National Geographic. He has not yet done so, and I'm beginning to wonder why. "No O?, no S?" No. "I see no particular reason in mtDNA to think that Australia was settled earlier". It used to be accepted that the evidence supported that position, but with the idea of single origins the time of settlement of the two regions has been moved together. No evidence that they were settled at the same time though. Of course settlement dates for both regions are very insecure. "1. Melanesia: M29'Q (this is right after the M 'explosion') 2. Australia: M14 and M42 (this is coincident with the N 'explosion') 3. Australia: S (misplaced in the map), Melanesia: M27 (coincident with the R 'explosion') 4. 'Oceania': O (seems to be the old N12, so probably Australian), Melanesia: P". I'd be inclined to make it: 1. Australian Ns, including S, O and the two Ns: N13 and N14. Then: 2. New Guinea Q, possibly a little before 3. 3. Melanesian M27, M28 and M29, the last derived from Q, so it derives from New Guinea. And the Australian Ms: M14, M15 and M42. 4. N-derived P, possibly at the same time as 3 and, like them, spreading through both Australia and New Guinea. Then: 5. The Austronesian haplogroups from (dare I say it) around Wallacea: B4a1a and E. Of course the history of the region may be even more complicated, with haplogroups drifting in over a long period of time. However it is quite possible that the Wallacea crossings (apart from the Austronesian) were confined to periods of lowered sea level.

"No evidence that they were settled at the same time though". I think that there is evidence, in the mtDNA at least, as described above. "I'd be inclined to make it"… But you are "inclined" to that for no particular reason. I have reasons (for a more or less simultaneous colonization) and I have already said why.

Hi Maju, the article is very interesting. I would note that I believe the grey circles in Figure 1 were noted as Asian Indian, rather than American, as changed here. In the article, the authors distinguished between Native Americans and Asian Indians. In the article, haplogroups C4a1b, C4a2a2a, C4a2b, C4a2a2, and C7a1a were particularly noted to be Asian Indian. This was also referred to in this article: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2757894/. I believe the Native American Indian haplogroup subclades C1b, C1c, C1d, and C4c are not shown in Figure 1.

Hi Maju, the article is very interesting. I would note that I believe the grey circles in Figure 1 were noted as Asian Indian, rather than American, as changed here. In the article, the authors distinguished between Native Americans and Asian Indians. In the article, C4a1b, C4a2a2a, C4a2b, C4a2a2, and C7a1a were particularly noted to be Asian Indian. This was also referred to in this article: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2757894. The Native American Indian subclades C1a, C1b, C1d, and C4c appear not to be included in Figure 1.