August 24, 2010

Social selection in Y-chromosome haplogroup C3 clusters

There was another paper on Y-chromosome haplogroup C recently. The authors of the current paper also present dates using both the evolutionary and the genealogical mutation rates. They write:

The age of accumulated STR variation within hg C3, estimated using the method of Zhivotovsky et al. (2004), is about 14.9 ky or 4.1 ky depending on the mutation rate values selected for calculations (Table 3). The older time estimate is most compatible with the view that hg C3 haplotypes were present in Siberia during the Last Glacial Maximum from where the ancestors of C3b Native Americans migrated to the Beringia (Karafet et al., 2002; Zegura et al., 2004).

However, as I noted in my review of the earlier paper:

A case in point is haplogroup C3b-P39; according to the authors' date, this ought to be related to the early arrival of the ancestors of Amerindians, but haplogroup C in the Americas has a strong relationship with Na-Dene speakers such as Athapaskans, and it seems to me that a late spread of this haplogroup is more consistent with its limited geographical distribution and strong linguistic associations.

If C3 had spread into the Americas "early" (together with the other main haplogroup, Q), then we would expect to see it today all over the Americas (perhaps lost here and there due to drift in small populations), and in all language groups. However, this is not what we see. Hence, I am inclined to believe in the more recent spread of C3 into the Americas, together with Na-Dene speakers.

From the paper:

The median joining network of subcluster C3c appears to be complex, with several common haplotypes present in different populations (Fig. 2). Our analysis revealed that the age of this subcluster is about 5.9 ky or 1.6 ky, whereas the age of subcluster C3d appears to be younger – about 2.0 ky or 0.5 ky, depending on the mutation rate values selected.

As I noted in the Cohen Modal Haplotype paper, age estimation must harmonize with both the observed Y-STR variation and known demography. The key questions are: has there been enough time for a given haplogroup to accumulate so much variation, and to grow to such a population size?

Younger haplogroup ages often hit a stumbling block in trying to explain demography (hence my reservations on the CMH paper). Yet, it is much easier to consider massive demographic growth/social selection in Mongols and associated peoples, as the evidence for that growth and expansion is a part of history, and it also harmonizes with what we know about nomadic peoples.

In any case, the real age could differ from the point estimates, due both to the limitations of Y-STR markers and to the potential presence of outliers, in which the recently expanded group within a haplogroup dominates the age estimate.
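The dependence of these dates on the chosen rate can be checked with simple arithmetic. The sketch below uses only the paired point estimates quoted above and assumes that the estimator is of the "accumulated variance divided by mutation rate" type, so that the age scales inversely with the rate. The helper `rescale_age` and the two rate constants are illustrative assumptions, not values taken from the paper.

```python
# Paired point estimates in ky, as quoted from the paper above:
# (evolutionary-rate age, genealogical-rate age).
reported_ages_ky = {
    "C3": (14.9, 4.1),
    "C3c": (5.9, 1.6),
    "C3d": (2.0, 0.5),
}

# Ratio of the two published ages for each cluster.
ratios = {name: evo / gen for name, (evo, gen) in reported_ages_ky.items()}


def rescale_age(age_ky, rate_used, rate_new):
    """Re-express an age under another mutation rate.

    Assumes age is inversely proportional to the per-locus rate,
    which holds for variance-over-rate estimators but ignores
    saturation, outliers and multi-step mutations.
    """
    return age_ky * rate_used / rate_new


# ASSUMED rates, for illustration only (not taken from the paper's Table 3).
EVOLUTIONARY_RATE = 6.9e-4  # per locus per generation, hypothetical input
GENEALOGICAL_RATE = 2.5e-3  # per locus per generation, hypothetical input

for name, ratio in ratios.items():
    print(f"{name}: evolutionary / genealogical = {ratio:.2f}")

rescaled_c3 = rescale_age(14.9, EVOLUTIONARY_RATE, GENEALOGICAL_RATE)
print(f"C3 rescaled with the assumed rates: {rescaled_c3:.1f} ky")
```

The ratios come out roughly constant across clusters, which is what one would expect if the two sets of dates differ only by the choice of rate; the small differences are plausibly due to rounding in the published figures.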

The authors also deal with the Genghis Khan "star cluster" which is part of the paragroup C3*. Here is what they have to say:

It is suggested that this subcluster appears to have originated in Mongolia about 1 ky ago, taking into account the genealogical mutation rate (Zerjal et al., 2003). Our present and previous (Derenko et al., 2007b) studies have shown that the highest frequency of the “star cluster” in C3* is observed in Mongols (35%), whereas in Siberia it varies from 8% in Altaian Kazakhs and 6.5% in Buryats to less than 3% in Tuvinians, Altaians and Shors (Table S2). According to our data, the age of the “star cluster” in C3* is 2.8 ± 1.0 or 0.78 ± 0.27 ky, based on the evolutionary and genealogical mutation rates, respectively.

Genghis Khan flourished about 0.8 ky ago. If we accept (as I do) the genealogical rate as closer to the truth, then the "star cluster" is related to Genghis and his male relatives; otherwise it is a completely unrelated phenomenon.
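As a rough check of that statement, one can ask whether 0.8 ky falls inside each reported range. The sketch below treats the "±" values as a simple symmetric interval around the point estimate, which is an assumption about how the paper's uncertainties should be read; the names `interval` and `contains` are illustrative helpers.

```python
GENGHIS_KY = 0.8  # approximate time of Genghis Khan, in ky before present

# "Star cluster" ages quoted from the paper: (point estimate, +/- value), in ky.
star_cluster_ky = {
    "evolutionary": (2.8, 1.0),
    "genealogical": (0.78, 0.27),
}


def interval(mean_ky, spread_ky):
    """Return the symmetric range (low, high) around a point estimate."""
    return (mean_ky - spread_ky, mean_ky + spread_ky)


def contains(bounds, value):
    """True if value lies within the closed range given by bounds."""
    low, high = bounds
    return low <= value <= high


# For each rate, does the reported range include Genghis Khan's time?
compatible = {
    label: contains(interval(mean, spread), GENGHIS_KY)
    for label, (mean, spread) in star_cluster_ky.items()
}

for label, ok in compatible.items():
    low, high = interval(*star_cluster_ky[label])
    print(f"{label}: {low:.2f} to {high:.2f} ky, includes 0.8 ky: {ok}")
```

Under this reading, only the genealogical-rate range overlaps Genghis Khan's lifetime, which is the point made above.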

Annals of Human Genetics DOI: 10.1111/j.1469-1809.2010.00601.x

Phylogeography of the Y-chromosome haplogroup C in northern Eurasia

Boris Malyarchuk et al.

To reconstruct the phylogenetic structure of Y-chromosome haplogroup (hg) C in populations of northern Eurasia, we have analyzed the diversity of microsatellite (STR) loci in a total sample of 413 males from 18 ethnic groups of Siberia, Eastern Asia and Eastern Europe. Analysis of SNP markers revealed that all Y-chromosomes studied belong to hg C3 and its subhaplogroups C3c and C3d, although some populations (such as Mongols and Koryaks) demonstrate a relatively high input (more than 30%) of yet unidentified C3* haplotypes. Median joining network analysis of STR haplotypes demonstrates that Y-chromosome gene pools of populations studied are characterized by the presence of DNA clusters originating from a limited number of frequent founder haplotypes. These are subhaplogroup C3d characteristic for Mongolic-speaking populations, “star cluster” in C3* paragroup, and a set of DYS19 duplicated C3c Y-chromosomes. All these DNA clusters show relatively recent coalescent times (less than 3000 years), so it is probable that founder effects, including social selection resulting in high male fertility associated with a limited number of paternal lineages, may explain the observed distribution of hg C3 lineages.

38 comments:

"If C3 had spread into the Americas "early" (together with the other main haplogroup, Q), then we would expect to see it today all over the Americas (perhaps lost here and there due to drift in small populations), and in all language groups. However, this is not what we see. Hence, I am inclined to believe in the more recent spread of C3 into the Americas, together with Na-Dene speakers."

Scholars tend to make a big deal out of the fact that C3 is "associated" with Na-Dene speakers in North America. They tend to overlook the fact that 1) C3 is found in such a geographically wide group of populations as Na-Dene, Eskimos, Muskogean, Siouan speakers in North America as well as in some populations in central America (all at low frequencies); and 2) in Asia C3 is also "associated," and to a much stronger degree as the gene frequencies indicate, with one language family, namely Altaic. There're of course Koreans, Japanese, Ainu, Koryaks and few other non-Altaic populations that carry C3 chromosomes, but they are geographically contiguous with the Altaics and may represent the minor ripple effects from the major Altaic "splash."

Overall the haplogroup does look like a recent "admixture" but I wouldn't automatically assume that it came from Asia to America after the initial colonization of the New World. Instead, it may represent a backflow from America into Asia, with a corresponding increase in population size in Asia associated with the Altaic expansion, which is reflected in the high frequencies of this hg in Asia. The emergence of C3 in Asia may be a genetic signature of the formation of the "Mongoloid" craniological complex. On its possible Amerindian roots, see Peter Brown (1999), "The First Modern East Asians? Another Look at Upper Cave 101, Liujiang, and Minatogawa". Peter Brown noticed that "Mongoloid" skulls in America appear earlier than their counterparts in Asia. The dates for hg C3 also fit the Holocene back migration from America to Asia.

So you are saying that proto-Mongoloids inhabiting East Asia for tens of thousands of years never specialized towards the Mongoloid morphology, but an offshoot of them went into the New World and miraculously differentiated themselves into Mongoloids all within a matter of a few thousand years? And not only that, they had to overcome the limited genetic flow, and instead of following the fastest coastal route to Japan they traversed all the way to the central Chinese plains and became fully Mongoloid agriculturalists. I suggest you revise whatever it is you believe.

"Alliances and intermarriage between the Blackfoot Sioux and Sarcee (Tsuu T'ina Nation), a Dene speaking people, are clearly described within their own history."

Good. Do you care about the fact that the Blackfoot are Algonquin-speakers? Maybe you confused Sihasapa, a band of Teton Dakota, whose name does translate as "Black Feet," with the Blackfoot tribe in Alberta and Montana, but I doubt that you've heard about the Teton band. In any case, Algonquins also carry C3 lineages. Just like the Iroquoians. See "Asymmetric Male and Female Genetic Histories among Native Americans from Eastern North America," by Bolnick et al.

There was gene flow between some of these language groups - no doubt - but overall the case is clear that C3 in America has a very wide geographic distribution and is associated with a number of entirely unrelated language families.

"The movement of Na Dene speakers from Canada into the American Southwest is well documented. No mystery there."

The American Southwest is not Central America, and there were no migrations of Na-Dene speakers to Central America.

"Looking at the distribution maps for C3b, it certainly looks like they are congruent with the Dene migrations and Dene speakers."

The frequencies of C3 in Na-Dene are indeed more elevated than elsewhere in North America, but this is nothing in comparison to the frequencies of the same lineage in Altaic-speakers. Plus Na-Dene have typical Amerindian Q lineages and no other Siberian-specific Y-DNA lineages, hence they can't be divorced from the rest of Amerindians into a separate later migration. (In mtDNA, Na-Dene show elevated frequencies of hg A, but nobody's saying that hg A came to America in a separate migration associated with Na-Dene speakers because there're plenty of A's elsewhere in North America.)

Both mtDNA and Y-DNA data are consistent with Na-Dene, Eskimo and "Amerind" speakers stemming from the same ancestral population. Hence, C3 in Siberia looks like an admixture from America, and not the other way around.

Kumarwe:

"So you are saying that proto-Mongoloids inhabiting East Asia for tens of thousands of years never specialized towards the Mongoloid morphology..."

The earliest skulls in America are non-Mongoloid/proto-Mongoloid/pre-Mongoloid, etc. The date of arrival of humans in America is unknown. It was definitely "pre-Clovis," which could be 20K or 100K YBP. Mongoloid morphology emerges in America earlier than in Asia (read the Brown paper I referenced). Mongoloid morphology in Asia is of Holocene age (read all the literature available). Linguistic diversity in North America is much greater than linguistic diversity in Siberia/East Asia, so if C3 is associated with the spread of the Altaic family with subsequent outward gene flow, then it's easy to imagine it deriving from any of the dozens of families found in North America. The formation of such a unique morphological complex as "Mongoloids" in a different ecological niche from pre-Mongoloids in Asia is a very natural process. If you want to derive "Mongoloids" from Asian pre-Mongoloids, then your task is harder than mine: why did it suddenly emerge in the past 5-10,000 years? Why did it emerge independently in Asia and in America? The origin of the Mongoloid morphological complex hasn't been resolved precisely because nobody can answer those tough questions. A backflow from America explains everything rather neatly.

See also Kozintsev AG, et al. Collateral relatives of American Indians among the Bronze Age populations of Siberia? // Am J Phys Anthropol. 1999 Feb;108(2):193-204.

We're all aware of your position by now: You believe that modern humans do not originate in Africa.

Alliances and intermarriage between Algonquian speakers and Dene speakers are well documented. No surprise then that some Algonquian speaking groups have the C3c Y-chromosome.

You put up some examples yesterday of Algonquian speaking C3c carriers as an argument that C3c does not originate with a Dene migration. That doesn't hold up, based on the fact of Algonquian-Dene intermarriage.

I'm not going to go group by group to demonstrate this. If you were open minded enough, you could do this. However, you are not.

By the way, the Blackfoot and Blackfeet are a closely related people, separated only at the time of the drawing of the American-Canadian border.

So perhaps now, we can move along with trying to put a date on the Dene migration.

1 or 2 kya seems too recent. It wouldn't account for the cultural differentiation between, for instance, the Sarcee and the Tlingit.

I'd like to know how Edward Vajda came up with his date for the Dene migration.

Marnie, there is a very wide gap between you and me in the knowledge of basic facts, hence this exchange won't be useful to either of us.

Examples:

"The Blackfoot and Blackfeet are a closely related people, separated only at the time of the drawing of the American-Canadian border."

Blackfoot and Blackfeet are indeed two names for the same Algonquian-speaking population. Not Siouan, as you began claiming. However, there's a band within Teton Dakota, a Siouan speaking group, that's called Sihasapa, or "Blackfeet," which has never lived in close proximity to the Sarcee.

"I'm not going to go group by group to demonstrate this. If you were open minded enough, you could do this. However, you are not."

If you believe that Na-Dene Indians contributed C3 to Eskimos, Siouans, Muskogeans, Iroquoians, Algonquians as well as brought it to Central America and even to the Wayuu Indians of South America, you're welcome to do so. There're studies on gene flow among speakers of some North American language families but no one has ever claimed that all the instances of C3 in North America are ultimately traceable to Na-Dene. You can't just declare that C3 in, say, Chippewa/Anishinabe came from a Na-Dene-speaking group just because "Na-Dene and Algonquians" intermarry. You need to find a study that actually proves this to be the case for this tribal group and this haplogroup. The only way to establish it is if it's demonstrable that Chippewa share with some Na-Dene populations only the derived versions of this haplogroup, whereas those Na-Dene groups have both derived and ancestral sequences.

"I'd like to know how Edward Vajda came up with his date for the Dene migration."

Vajda didn't come up with any date for the Na-Dene migration. He just established the linguistic kinship between Na-Dene and Ket on the basis of the verbal paradigm and different lexical items. He may have speculated about it on general grounds, but his own linguistic data is currently incapable of answering this question.

"There was gene flow between some of these language groups - no doubt - but overall the case is clear that C3 in America has a very wide geographic distribution and is associated with a number of entirely unrelated language families".

You make the same mistake Maju keeps making at his blog in assuming an intimate connection between language group and haplogroup. C3 is simply a northwest North American haplogroup which has spread unevenly south as far as Mexico.

"So you are saying that proto-Mongoloids inhabiting East Asia for tens of thousands of years never specialized towards the Mongoloid morphology, but an offshoot of them went into the New World and miraculously differentiated themselves into Mongoloids all within a matter of a few thousand years?"

Like you I am very sceptical about that scenario. The haplogroups actually suggest the original Americans were basically a hybrid between Central and East Asian human groups. The Central Asian element accounts for the early so-called 'Caucasian' features.

"You make the same mistake Maju keeps making at his blog in assuming an intimate connection between language group and haplogroup. C3 is simply a northwest North American haplogroup which has spread unevenly south as far as Mexico."

Terry, I'm not making a general assumption about an "intimate connection between language groups and haplogroup."

However, in the specific cases I have pointed out, there was clearly a tradition of contact, cultural absorption, and intermarriage.

Most of the people of the tribes at the Dene-Algonquian boundary are still living people who could tell you of their inter-relationship themselves.

"You make the same mistake Maju keeps making at his blog in assuming an intimate connection between language group and haplogroup. C3 is simply a northwest North American haplogroup which has spread unevenly south as far as Mexico."

I agree with you, Terry, but your idea shouldn't get in the way of another idea, namely that ideally, under vertical transmission of genes and languages, there will be close correspondence between a genetic "pattern" and a language "family." A genetic pattern could come from Y-DNA or mtDNA, and it can encompass one hg or several depending on the demographic process underlying population divergence. Usually geography and language compete as the two best predictors of genetic distance between populations. E.g., "Both language and geography explained a significant proportion of the genetic variance, but differences exist between and within the language families (table S5 and fig. S33, A to C) (4). For example, among the Niger-Kordofanian speakers, with or without the Pygmies, more of the genetic variation is explained by linguistic variation (r2 = 0.16 versus 0.11, respectively; P < 0.0001 for both) than by geographic variation (r2 = 0.02 for both; P < 0.0001 for both)." (Tishkoff et al. The Genetic Structure and History of Africans and African Americans).

What we don't know (but speculate a lot about) is the age of haplogroups and language families. Linguists tend to believe that language families are relatively young. Although I have a lot of respect for the 200 years of observation of language change, the recency of linguistic families is not a reliable assumption. Some of them may be very old. So, the task is really to apply the classic comparative method in order to identify the precise correlates of existing language families and stocks in population genetic patterns, and then to optimize both glottochronology and the molecular clock to determine the age of the associated demographic events. This process will naturally require sorting out borrowings on the language side and gene flow on the genetic side. So, it's possible, for instance, that the Austronesian family maps onto Y-DNA C2 with all other lineages (M, O, etc.) coming later as gene flow and mutational events. Or, that the Bantu-Pygmy divergence maps on the Y-DNA B2 and mtDNA L1 lineages. These are just examples - so no need to argue with them here.

"Terry, I'm not making a general assumption about an "intimate connection between language groups and haplogroup."

Marnie, you confuse everything: haplogroups, languages and now commenters.

"Like you I am very sceptical about that scenario. The haplogroups actually suggest the original Americans were basically a hybrid between Central and East Asian human groups. The Central Asian element accounts for the early so-called 'Caucasian' features."

Terry, I know you picked up this idea from a recent paper, but there's a slew of others in which the "Caucasian" element in Amerindians is explained differently. These are all hypervariable opinions, and the stable facts are buried underneath them. What's important, IMO, is that in addition to the fact that "Mongoloid" skulls pop up at earlier dates in the New World, Amerindians show the generalized aspects of a "Mongoloid" phenotype but lack some very specific ones (such as epicanthus) found in Asia. This again supports Peter Brown's hypothesis.

The Sarcee evidently drifted to the Saskatchewan River from the north and, as Jenness (1938) thinks, "possibly towards the end of the seventeenth century." They are first mentioned by Matthew Cocking in 1772-73, but the erection of a trading fort at Cumberland House, followed by others farther up North Saskatchewan River, soon made them well-known to the traders. Early in the nineteenth century the Indians of the section acquired horses and guns, intertribal warfare was increased to such an extent that several tribes united for mutual protection, and the Sarcee allied themselves for this purpose with the Blackfoot. Nevertheless, they continued to suffer from attacks of the Cree and other tribes, and their numbers were still farther reduced by epidemics, particularly the smallpox epidemics of 1836 and 1870 and one of scarlet fever in 1856. In 1877, along with the Blackfoot and Alberta Assiniboine, they signed a treaty ceding their hunting grounds to the Dominion Government, and in 1880 submitted to be placed upon a reservation, where they declined steadily in numbers until 1920.

Connection in which they have become noted:

The Sarcee are noted as the only northern Athapascan band which is known to have become accustomed to life on the Plains, though it is probable that they merely represent a recent case of Plains adaptation such as took place at an earlier period with the Apache and Kiowa Apache successively.

"The Chipewyan (Denésoliné or Dënesųłiné) are a Dene Aboriginal people in Canada, whose ancestors were the Taltheilei."

"Historically, the Denesuline were allied to some degree with the southerly Cree[Algonquian speakers], and warred against Inuit and other Dene peoples to the north of Chipewyan lands."

According to this wiki reference, alliances with Dene speakers did exist. The Cree are one of the largest aboriginal groups in Central and Eastern North America. Certainly, gene flow from Dene groups such as the Chipewyan into the Cree could account for the substratum of C3c that appears in non-Dene speakers in Central and Eastern Canada.

I wouldn't take the information on this wiki site at face value, but it's a good place to start. The Chipewyan wiki site contains extensive references.

Re-examining the map of C distribution from the Zhong et al. 2010 paper, I note that C is absent or at very low incidence east and south of Hudson's Bay, coincident with the eastern and southern limit of the range of the Cree.

"So, it's possible, for instance, that the Austronesian family maps onto Y-DNA C2 with all other lineages (M, O, etc.) coming later as gene flow and mutational events."

Terry, I just re-read the comments you made on Luis's website, and it looks like you'll be willing to consider seriously the idea that there was a "backflow" of populations from Wallacea/Melanesia/PNG up north associated with C*, C2, K, MNOPS and with Austronesian languages. So, Austronesians may have dispersed from Wallacea/Melanesia not only to Polynesia but also west and north into Southeast Asia, all the way to Taiwan. So you may want to consider my methodological "example" above as a testable hypothesis.

"you'll be willing to consider seriously the idea that there was a 'backflow' of populations from Wallacea/Melanesia/PNG up north associated with C*, C2, K, MNOPS and with Austronesian languages. So, Austronesians may have dispersed from Wallacea/Melanesia not only to Polynesia but also west and north into Southeast Asia, all the way to Taiwan".

I think it's indisputable that the Austronesian languages originated in Taiwan, or at least their major source was there, although I accept backflow from the Philippines to Taiwan. As Dienekes' posts on R1b show, the movement of haplogroups is far more complicated than usually assumed. I agree that haplogroups were picked up along the way during the expansion, C2 in Wallacea for example and K in, or just offshore from, New Guinea. But MNOPS had surely already long diversified by the Austronesian expansion and I certainly see just a very ancient connection between Wallacean C2, Australian C4 and Northeast Asian C3.

"I think it's indisputable that the Austronesian languages originated in Taiwan, or at least their major source was there, although I accept backflow from the Philippines to Taiwan."

From a linguistic point of view, it's reasonably disputable, Terry. Although the majority of Austronesianists will probably side with Blust, who most famously argued that Taiwan harbors 9 primary branches of Austronesian, some (e.g., Ilia Peiros) will argue back saying that in fact all these 9 branches are just the sub-branches of a single branch called "Formosan" and this branch is opposed to the non-Taiwanese branch called "Malayo-Polynesian" (having a myriad of subdivisions). It all depends on the definition of an ancestral state for some of the key Austronesian phonemes. E.g., some Formosan branches show a /t/-/c/ split, whereas all Malayo-Polynesian languages have the /t/ fixed. It's possible that Formosan languages are more diverse in the reflexes of a proto-Austronesian phoneme, hence they form primary branches. But it's also possible that the split was caused by the position of the accent in a word and that was a Formosan innovation. This will make the Malayo-Polynesian branch "purer" than the Formosan branch.

If there're only two primary branches of Austronesian, Malayo-Polynesian and Formosan, then the homeland could be either on or outside of Taiwan. There're no branches of Malayo-Polynesian in Taiwan, so the division between Formosan and Malayo-Polynesian is very abrupt. If Blevins's Ongan-Austronesian connection gets to be well-socialized through the Austronesianist community, then it will create another problem for the out-of-Taiwan idea. We'll end up having Austronesian groups distributed both east (Taiwan and Polynesia) and west (Andaman Islands and Madagascar) of South China/Sunda/Wallacea. Hence, the homeland will likely be somewhere along the north-south axis from South China to Wallacea.

It's also important to realize that Malayo-Polynesian languages in Melanesia are very diverse grammatically, which could be interpreted as the effects of a Papuan substratum or as the sign of antiquity of these languages. A substratum effect could also be hypothesized for the Formosan languages, as the high level of linguistic diversity within one family concentrated in a small island is suspicious.

So, just like with genetic phylogenies and the out of Africa theory, the situation with the Austronesian expansion is unclear if looked at through the lens of critical thinking.

Was C2 "picked up" from a non-Austronesian substratum in Melanesia before expanding as C2a to Polynesia? Yes, it's possible. But the whole C lineage is very old and there's little-to-no C in Papua New Guinea or among Tai-Kadai or Austroasiatic groups further up north and west. So if this unknown "substratum" existed and it was unrelated to the Papuan or any other historical groups, it will make the Austronesian attribution of C2 a more parsimonious interpretation.

Although the mtDNA connection between Polynesia and Taiwan is very strong, it's worth pointing out that the 9bp deletion wasn't detected in the ancient Teouma Lapita remains. If Lapita was a springboard for the Polynesian expansion, and the 9bp deletion is very frequent in Polynesia, it's somewhat odd that it didn't pop up in the Lapita remains. So we don't really know what was "picked up" by Polynesians: mtDNA B or Y-DNA C.

"But MNOPS had surely already long diversified by the Austronesian expansion"

But C*/C2 is phylogenetically ancestral to MNOPS, and it's C2 that's so uniquely associated with Austronesians....

Based on the information shown in Table 1 of this paper, which you have referred to, I'd say that your statement in the first post of this thread is at best careless, and at worst, deliberately incorrect:

"C3 is found in such a geographically wide group of populations as Na-Dene, Eskimos, Muskogean, Siouan speakers in North America"

C3b (C-M130) is found in one of four Muskogean groups at a rate of 6.7%. The other three Muskogean groups that were sampled have no C-M130 at all.

I see you've dropped the Iroquois from the list of C3b carriers that you posted in February, as they showed no more than a 3.7% incidence of C3b.

So it is indeed the Dene that have the highest incidence of C3b, then Algonquian and Siouan speakers, who are known to have at times allied with the Dene. C3b is not widely distributed across North America and certainly not further south, contrary to what you have continued to assert in this long and miserable thread.

"Based on the information shown in Table 1 of this paper, which you have referred to, I'd say that your statement in the first post of this thread is at best careless, and at worst, deliberately incorrect."

Marnie, as I pointed out before, you tend to confuse all the possible basic facts - haplogroups, languages, geographic regions, commenters. Instead of stopping right there and revisiting the reasons why you pursue this research, you decided to launch another attack on my statements. And unload another barrage of confused statements of your own.

Marnie: "I see you've dropped the Iroquois from the list of C3b carriers that you posted in February, as they showed no more than a 3.7% incidence of C3b."

German, right above: "In any case, Algonquins also carry C3 lineages. Just like the Iroquoians."

Marnie: "So it is indeed the Dene that have the highest incidence of C3b, then Algonquian and Siouan speakers, who are known to have at times allied with the Dene."

German, right above: "The frequencies of C3 in Na-Dene are indeed more elevated than elsewhere in North America, but this is nothing in comparison to the frequencies of the same lineage in Altaic-speakers." And then: "You can't just declare that C3 in, say, Chippewa/Anishinabe came from a Na-Dene-speaking group just because "Na-Dene and Algonquians" intermarry. You need to find a study that actually proves this to be the case for this tribal group and this haplogroup. The only way to establish it is if it's demonstrable that Chippewa share with some Na-Dene populations only the derived versions of this haplogroup, whereas those Na-Dene groups have both derived and ancestral sequences."

What's important is not only that C3 is found in Eskimos, Na-Dene, Muskogeans, Siouans, Iroquoians and Algonquians in North America, but that in Asia, the Altaic family has much higher frequencies than Na-Dene. This has never been noted in the literature.

"C3b is not widely distributed across North America and certainly not further south..."

"In any case, Algonquins also carry C3 lineages. Just like the Iroquoians"

My response:

No. Table 1 of the Bolnick et al. 2006 paper speaks for itself.

We're not discussing Altaic speakers. I know you're dying to bring Altaic speakers into the discussion, to confuse the issue further, but we needn't do that.

Your comment:

"What's important is not only that C3 is found in Eskimos, Na-Dene, Muskogeans, Siouans, Iroquoians and Algonquians in North America, but that in Asia, the Altaic family has much higher frequencies than Na-Dene. This has never been noted in the literature"

My comment:

No. What is important, as I have already stated, is that C3b is found at highest incidence in Dene speakers, and after that, for the most part, among groups who are known to have allied with the Dene.

Your comment:

"You can't just declare that C3 in, say, Chippewa/Anishinabe . . ."

I haven't said anything at all about Chippewa/Anishinabe peoples in this discussion. I have mentioned the Chipewyan, a Dene people, who are an entirely separate people from the Chippewa. You can check the Chipewyan wiki page on that. So the whole paragraph above regarding the need to prove a relationship between the Chippewa and the Dene is nonsense. The Chipewyan ARE a Dene people.

As to the Zhong paper and C in Central America, I'll leave that to someone else.

Marnie: your absolute inability to follow a very simple argument is obvious in every phrase. You can't even notice that when I mention Chippewa I give them as an example ("You can't just declare that C3 in, say, Chippewa..." - "say" stands for "for example"), not as a literal reference to any of your statements.

"I know you're dying to bring Altaic speakers into the discussion, to confuse the issue further, but we needn't do that."

I started this discussion with the Altaics. Then you tried to steer it away into North America just to make a whole host of simple mistakes.

FYI: A divergent C3b haplotype was detected in the Wayuu Indians of South America. See "The haplotype for the 2 Wayu (15-13-13-30-25-10-11-13-11-11) exhibited 6 mutational step differences from the C-P39 modal haplotype (15-13-13-28-23-9-11-12-11-11), reflecting its marked divergence from the predominant Native American C-haplogroup." (Zegura et al. 2004. High Resolution SNPs and Microsatellite Haplotypes Point to a Single, Recent Entry of Native American Y Chromosomes into the Americas.)

This completely falsifies your and others' argument for the special connection between C3b and Na-Dene speakers. It also corrects Dienekes's original assertion "If C3 had spread into the Americas "early" (together with the other main haplogroup, Q), then we would expect to see it today all over the Americas (perhaps lost here and there due to drift in small populations), and in all language groups. However, this is not what we see."

The fact is that we do see it at low frequencies but all over the Americas. Sampling is a usual issue in the Americas, so it's likely that more C3 variants will be found as time goes by. For comparison, another variant of X2, a low frequency mtDNA lineage with a similar distribution to Y-DNA C lineage, was recently found in North America. It was coded as X2g.
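The "6 mutational step differences" quoted above from Zegura et al. can be checked by hand or with a few lines of Python. This is only a minimal sketch: it assumes a simple stepwise model in which each one-repeat difference at a locus counts as one step, and that the two haplotypes list the same ten loci in the same order. The function names are made up for the illustration and are not from the paper.

```python
def parse_haplotype(text):
    """Turn a dash-separated string of STR repeat counts into a list of ints."""
    return [int(value) for value in text.split("-")]


def step_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences across loci.

    Assumption: stepwise mutation model, loci given in the same order.
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same number of loci")
    return sum(abs(a - b) for a, b in zip(hap_a, hap_b))


# Haplotypes as quoted from Zegura et al. 2004
wayuu = parse_haplotype("15-13-13-30-25-10-11-13-11-11")
modal = parse_haplotype("15-13-13-28-23-9-11-12-11-11")

print(step_distance(wayuu, modal))  # 6
```

The differences sit at four loci (2 + 2 + 1 + 1 repeats), which matches the six steps reported in the paper.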

"If there're only two primary branches of Austronesian: Malayo-Polynesian and Formosan, then the homeland could be either on or outside of Taiwan".

Thanks for that perspective on the development of the Austronesian languages. I know that the old idea was that several Taiwanese languages were a product of back-movement from the Philippines, although I note that idea is now generally dismissed. This is a fairly recent study which does show some small level of such back-movement:

"But the whole C lineage is very old and there's little-to-no C in Papua New Guinea or among Tai-Kadai or Austroasiatic groups further up north and west. So if this unknown "substratum" existed and it was unrelated to the Papuan or any other historical groups"

C2's homeland appears to be in South Wallacea, the Lesser Sunda Islands and Malukus, not in New Guinea.

"But C*/C2 is phylogenetically ancestral to MNOPS"

Not really. MNOPS descends from K which, in turn descends from F, so C*'s connection to MNOPS is about as distant as is possible.

"in Asia, the Altaic family has much higher frequencies than Na-Dene".

But languages are generally much more mobile than haplogroups. It's quite possible those Asian populations that carry a great deal of C3 have adopted several Altaic languages. After all if we turn to Y-hap Q we see it is almost universal in America but is most common in Central Asia in groups that speak Ket (possibly a language isolate) and the Selkups (who speak a Finno-Ugric language). Neither of these languages are spoken in America so in one or other case the language spoken by the Y-hap Q population must have changed. My bet is that the language of both the Asian groups has changed, been adopted from later, incoming groups.

"Not really. MNOPS descends from K which, in turn descends from F, so C*'s connection to MNOPS is about as distant as is possible."

C and F are sister clades and the CF node is dated at 68K vs. the K node at 47K (Karafet, New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree). Without sticking to the exact dates, it does appear that Austronesians in the Wallacean-Melanesian nexus seem to uniquely "own" a few basal branches of C (C2 and C*). Between C3, C1, C5 in the north and C4 in Australia there's a gap occupied only by Austronesians. I don't have any other option than to attribute C2 to Austronesians and assume their great antiquity in Wallacea and Melanesia. Malayo-Polynesian languages are very conservative lexically. So they may look like one subgroup out of 10 but here diversity measures may not be as reliable as stability measures, which phylogeneticists in linguistics and genetics often overlook. Then, K-M526 (*MNOPS) is found in the Philippines, again among Austronesian speakers, at high frequencies, as well as in Melanesia (Austronesian and not). Taiwan doesn't have K or C. But they do have downstream O lineages. I could imagine an old migration out of "Wallacea" in the direction of the Philippines and Taiwan, with a much later backflow from Taiwan and South China back south. The first one was associated with Austronesian speakers; the second one with some genetic and phenotypic characteristics.

"But languages are generally much more mobile than haplogroups."

You're prejudiced against languages, Terry. Check out this guy. He would also disagree with you. http://www.mpi.nl/people/dunn-michael/publications.

"Neither of these languages are spoken in America so in one or other case the language spoken by the Y-hap Q population must have changed."

Ket was shown to be related to Na-Dene. Na-Dene have C and Q. Ket doesn't have C, which means they probably lost it through drift. Correspondingly, Q rose in frequency. The fact that no Finno-Ugric languages are found in the Americas doesn't mean that they were adopted by earlier populations in Asia. It only means that linguists haven't established a secure genetic connection between Finno-Ugric and New World languages. The most robust proposal to date is Uralo-Siberian (look up Uwe Seefloth), which connects Uralic and Eskimo.

"it does appear that Austronesians in the Wallacean-Melanesian nexus seem to uniquely 'own' a few basal branches of C (C2 and C*)".

Try telling Maju that. And then stand well clear.

"Between C3,C1, C5 in the north and C4 in Australia there's a gap occupied only by Austronesians".

Not 'only' Austronesians. Chinese (Y-hap O) and Indians (Various Y-haps but especially L, H and R). I'm more than happy to accept that the Chinese haplogroups have expanded south in the last few thousand years. The disconnection in India is harder to account for but I have my own theory.

"I don't have any other option than to attribute C2 to Austronesians and assume their great antiquity in Wallacea and Melanesia".

I'm prepared to agree with Maju, in that C2 is probably not an 'original' Austronesian haplogroup, but was picked up early in the expansion. I agree it has a 'great antiquity' in Wallacea.

"Ket doesn't have C, which means they probably lost it through drift".

It's also quite possible they never had it. C3's origin may lie to the south of the mountains.

"The fact that no Finno-Ugric languages are found in the Americas doesn't mean that they were adopted by earlier populations in Asia".

But it may also indicate their expansion is more recent than human arrival in the Americas. In which case the Selkup population, containing Y-hap Q, could well have adopted the language.

"The most robust proposal to date is Uralo-Siberian (look up Uwe Seefloth), which connects Uralic and Eskimo".

But most would not place Eskimos in America until 4-5000 years ago. And the Finno-Ugric languages basically have a far north distribution. So the connection is quite possible.

"I'm prepared to agree with Maju, in that C2 is probably not an 'original' Austronesian haplogroup, but was picked up early in the expansion."

Politically, this is by far the easiest solution, the one that doesn't lead to the questioning of the current version of the Austronesian linguistic phylogeny. However, this kind of approach is also not particularly scientific. As the facts stand, Austronesians are uniquely associated with a basal C branch and there are no extant Papuan groups from which this C2 lineage could be derived by gene flow. If this turns out to be true as more samples become available, then the tight alignment between genetics and linguistics may overturn the orthodoxy based on superficial similarities between linguistics and archaeology. Currently there are scholars who see Austronesian history as a very complex process with lots of unknowns, and not a straightforward out of Taiwan "express train." See "Farming and Language in Island Southeast Asia: Reframing Austronesian History," by Mark Donohue and Tim Denham, 2010. In the field of kinship, scholars also question some old notions of Austronesian kinship. See, e.g., "Alternative Pasts: Reconstructing Proto-Oceanic Kinship," by James West Turner, Ethnology, 2010. And Taiwan aborigines do not look like a good candidate for providing a model for the earliest Austronesian social structure, while some Oceanic groups are.

I meant that Austronesians are the only ones, between Altaics in the north and Australian aborigines in the very south, that carry a distinct C lineage.

"But it may also indicate their expansion is more recent than human arrival in the Americas. In which case the Selkup population, containing Y-hap Q, could well have adopted the language."

This needs to be demonstrated by identifying substratum effects in the Selkup language. So far I'm familiar with none of those. Gene flow and drift are more likely to distort original populational connections. Languages are much harder to adopt and it takes much doing to impose.

"there are no extant Papuan groups from which this C2 lineage could be derived by gene flow".

That's what I said. It doesn't come from Papuans. It's from Wallacea, probably southern Wallacea. I agree that C* is distributed around the South China Sea, but by no means exclusively in Austronesian-speaking people.

"Currently there are scholars who see Austronesian history as a very complex process with lots of unknowns, and not a straightforward out of Taiwan 'express train'".

And I've been arguing exactly that with Maju for weeks. The 'express train' applies to the migration some time after the original Austronesian development, hence the early inclusion of Wallacean Y-haplogroups.

"Languages are much harder to adopt and it takes much doing to impose".

Although the general consensus is that Indo-European languages are not indigenous to western Europe, yet they are certainly widespread there.

"Although the general consensus is that Indo-European languages are not indigenous to western Europe, yet they are certainly widespread there."

I don't have a good idea of how and when IE languages spread in Europe (I'm still monitoring the data as it comes in), but your statement that "Indo-European languages are not indigenous to western Europe, yet they are certainly widespread there" can just as easily be applied to European haplogroups, as recent ancient DNA studies showed.

"The 'express train' applies to the migration some time after the original Austronesian development, hence the early inclusion of Wallacean Y-haplogroups."

I see it exactly in the same way.

"I agree that C* is distributed around the South China Sea, but by no means exclusively in Austronesian-speaking people."

C* are indeed sporadically found there, but they are also found in Wallacea and Melanesia. C2, however, is very specific and the argument that it came from Papuan speakers needs to be supported by some good frequencies and broad dispersal range in PNG. But we don't see it. At some point, I also entertained the possibility that the expansion of the Trans-New Guinean family associated with the spread of agriculture around 10,000 YBP in PNG could have some continuation in the form of Austronesian expansion but I didn't go very far with this idea.

It doesn't contradict the idea that the Austronesians formed the majority of the Bornean population. And we know of haplogroups that were most likely pre-Austronesian. But these form a minority, especially of Y-chromosomes. Mitochondrial DNA haplogroup M7 is almost certainly pre-Austronesian and forms about 30% of the mtDNA haplogroups.

"It doesn't contradict the idea that the Austronesians formed the majority of the Bornean population. And we know of haplogroups that were most likely pre-Austronesian. But these form a minority, especially of Y-chromosomes. Mitochondrial DNA haplogroup M7 is almost certainly pre-Austronesian and forms about 30% of the mtDNA haplogroups."

I would be careful with assigning haplogroups to the pre-Austronesian substratum. It's very possible that at some point, in pre-agricultural times, Austronesian foragers co-existed with non-Austronesian foragers, but if we hold on to the observation that C2 is old, southeastern, and Austronesian-specific, then our time frame for the Austronesian language dispersal changes. If we benchmark C2 against Ongan Y-DNA D, which looks Paleolithic and follow Blevins's Ongan-Austronesian connection, then we may hypothesize that some low-frequency mtDNA and Y-DNA that are attributed to the pre-Austronesian substratum are in fact retentions from the time when Austronesian was a small language stock. The Ongan-Austronesian link will further suggest a coastal migration from Wallacea-Melanesia to the Andaman islands, which is a much better solution for the origin of the Andamanese than the coastal migration out of Africa.

We have moved away considerably from C3, but Dienekes has not yet complained so I'll try to make this my last mention of anything else here.

"suggest a coastal migration from Wallacea-Melanesia to the Andaman islands, which is a much better solution for the origin of the Andamanese than the coastal migration out of Africa".

I'm actually very much inclined to agree with that to some extent. Although Y-hap D is very common in the Andamans and, as far as I'm aware, unknown in Wallacea-Melanesia.

"I would be careful with assigning haplogroups to the pre-Austronesian substratum".

I think we can confidently eliminate any of the Y-hap Os as being an ancient component of the substratum. That narrows it down quite a bit.

"The Ongan-Austronesian link"

Is far more likely to be an Ongan-Austroasiatic link, especially if Bellwood is correct and all the Southern Chinese and SE Asian languages descend from those spoken in the Chinese Neolithic.

"but if we hold on to the observation that C2 is old, southeastern, and Austronesian-specific"

I'm very much disinclined to hold that idea. The evidence seems to me to very much indicate that Austronesian did spread as part of the Early Chinese Neolithic. I very much suspect Austronesian was introduced to, or developed in, Taiwan by members of Y-hap O1, and probably mtDNA B. So C2 is far from being the ancestral source of the language. It picked the language up.

Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.