August 01, 2012

Dates of major clades of the Y-chromosome phylogeny

Important note (Aug 9): Thanks to argiedude for pointing out a numerical error in these calculations. My message on the GENEALOGY-DNA-L:

You are right. Alleles were coded in 0/2 format in the plink .raw copy I was working with, and I divided by 2 to account for the two lines of descent from the common ancestor, but not by another 2 to account for the 0/2 format. This means that the age estimates posted in the blog [1] ought to be halved. This would make "Root" about 80ky, and CT about 31.5ky. These appear quite different from previous estimates, including the ones I originally got [2].

Looking at the original experiment [2] where I did divide by 2 twice, it seems that there are many more variant sites per pair of chromosomes; this is why the "Root" estimates for the two experiments end up similar.
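To make the factor-of-two issue concrete, here is a minimal Python sketch of the calculation described above. The function name, the mutation rate `mu_per_site_per_year` and the number of compared sites `n_sites` are illustrative assumptions; neither value is given in this post, and this is not the actual code used for the estimates.

```python
def tmrca_years(raw_a, raw_b, mu_per_site_per_year, n_sites, halve_for_coding=True):
    """Rough age (in years) of the common ancestor of two Y chromosomes.

    raw_a, raw_b: per-site allele codes in 0/2 format, as in a plink .raw file.
    mu_per_site_per_year, n_sites: hypothetical inputs, not taken from the post.
    halve_for_coding=False reproduces the numerical error described above.
    """
    # In 0/2 coding every differing site contributes |0 - 2| = 2.
    coded_diff = sum(abs(a - b) for a, b in zip(raw_a, raw_b))
    # First division by 2: turn the coded difference into a count of variant sites.
    variant_sites = coded_diff / 2 if halve_for_coding else coded_diff
    # Second division by 2: mutations accumulate on both lines of descent.
    mutations_per_lineage = variant_sites / 2
    return mutations_per_lineage / (mu_per_site_per_year * n_sites)


# Toy data: the two chromosomes differ at two of four sites.
a = [0, 2, 0, 2]
b = [0, 0, 2, 2]
correct = tmrca_years(a, b, 1e-3, 4)
doubled = tmrca_years(a, b, 1e-3, 4, halve_for_coding=False)
```

Whatever rate and region length are plugged in, dropping the coding correction doubles the estimate, which is why the posted ages ought to be halved.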

Regressing the pairwise divergences of [1] on [2], I obtain:
Number of variant sites per pair of chromosomes using [1] = 215 + 0.07 * Number of variant sites per pair of chromosomes using [2]

So, it does appear that there is an excess of variant sites on average in [2] compared to [1].
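The fitted line can be evaluated directly. The following is a minimal Python sketch that uses only the two coefficients quoted above; the function and variable names are mine, and the break-even point is simple algebra on the fit rather than an additional result.

```python
INTERCEPT = 215.0  # constant term of the fit quoted above
SLOPE = 0.07       # coefficient on the count from [2]


def predicted_sites_in_1(sites_in_2):
    """Variant sites per pair expected in [1], given the count observed in [2]."""
    return INTERCEPT + SLOPE * sites_in_2


# Count at which the fit predicts equal numbers in both datasets:
# x = 215 + 0.07 * x  =>  x = 215 / (1 - 0.07)
break_even = INTERCEPT / (1.0 - SLOPE)
```

Under this fit, any pair with more than roughly 231 variant sites in [2] is predicted to have fewer in [1], and the gap widens as the count in [2] grows.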
Unfortunately, there is very little meta-data on the 1000 Genomes ftp site, so I don't know what to trust. If the phase1 data are to be trusted, then age estimates will appear considerably smaller than those published by Karafet and Cruciani; but, perhaps, there was a little aggressive pruning of variants between the working data and the phase1 data? I guess we'll have to wait until the 1000 Genomes Project publishes on the topic to see which is the case. Thankfully, it seems that a lot of people are working on the Y-chromosome phylogeny [3], so I'm sure we'll find out more soon.

So, while these calculations are valid in terms of relative ages of haplogroups, we'll have to wait for a formal publication that explains the variant filtering process better until we can get some good absolute estimates.

End note
This represents my final take on the dating of major clades of the Y-chromosome phylogeny using 1000 Genomes SNPs from the phase1 dataset. When new data are added to the project, more haplogroups will turn up and the estimates will probably improve. Some of these may be slightly different than the original ones I posted, since there was a slight bug in the code I was using, and not all pairs between clades were compared.

I have also added the sample standard deviations for the age estimates; these do not take into account uncertainty regarding the mutation rate. Overall, it seems to me that with more samples, deep sequencing of the Y (as opposed to the low coverage of the 1000G Project), sequencing of larger regions of the Y, as well as a better estimate of the mutation rate (which can be achieved when many father-son full genomes become available), we are going to have a pretty crisp estimate of the ages of all the branching points of the tree.

There are, of course, some observations one can make:

The greatest contrast in the human species is between the BT clade and the most basal clade within this dataset: BT is less than half the age of the latter. There are even more basal Y-chromosome lineages in African hunter-gatherers.

Most modern humans, including most Africans, belong to the BT haplogroup (~70k years). One can probably correlate the emergence of this haplogroup with a variety of archaeological events: arid periods in the Sahara and Arabian deserts, and the Toba eruption. The geographical origin of BT is currently unknown; the recent discovery of B haplogroup chromosomes in Iran and Afghanistan ought to give us pause about automatically assuming an African origin; the same can also be applied to Y*(xBT) chromosomes that also exist outside Africa in unusual places. Such chromosomes may represent recent gene flow and/or relics of great antiquity: they should be made the object of a study that will compare them to their African relatives.

Marginally younger is the CT haplogroup, which includes the CF (Eurasian) subclade and the DE subclade (Afrasian). All of these have an age of around 60k years. If an ecological calamity is responsible for the coalescence of human Y-chromosomes at around this time, then it is easy to imagine how subsets of modern humans from the original homeland may have acquired their own unique clades. The CF clade may correspond to people who went to the Levant or Central Asia, from which they would launch the Upper Paleolithic, while the DE clade may correspond to people who stayed in the South, heading west (E) or east (D).

The Upper Paleolithic/Later Stone Age transition was probably made by these people 10-20 thousand years after they split up; I would wager that they already had the necessary mental "hardware" for the task, and the stress of the ecological calamity that befell them spurred them to develop the necessary "software" as well, and so did the fact that they found themselves in unfamiliar surroundings and had to quickly adapt. Hence, the appearance of full behavioral modernity sensu Klein. The major lineages that continue to dominate most humans (IJK and E) trace their origin to this crucial transitional period (50k years).

The West Eurasian clade IJ seems to have emerged earlier (37k years) than the East Eurasian clade NO (33k years) and the Central Asian-Siberian (?) clade P (also 33k years). This may be suggestive of the route taken by modern humans in Eurasia as they moved to the east.

There seems to be a plethora of important lineages that coalesce around the Last Glacial Maximum (I, J, O, R1). Perhaps this too was a stressful period for humans, resulting in a loss of diversity and the survival of a few lineages from the early arrivals. In the case of haplogroup I, which is truly European in distribution, it is tempting to imagine that it represents the emergence of the Gravettians, while its sibling J stayed in the Near East, where IJ* chromosomes have recently been found.

34 comments:

I noted in the previous post, and will just mention in passing here, that the dates make a much more comfortable fit to probably corresponding dates from archaeology and mtDNA if they are all calibrated to be 50% longer.

I also note that the C1-C3 estimate made previously is not found here, and wonder if I should interpret that as a coding or software error in the original version of that estimate. That number had seemed very low the first time around.

"I'd say that if we included some Indian and Australian C, this would increase even more".

The new date you have for the CF split (56 kya. as against 47 kya.) makes more sense. That makes it quite possible that C carried mt-DNA N into Australia, allowing mt-DNA P to diversify there from 55 kya. The new date also now corresponds more closely with that of mt-DNA N at 59 kya. (Behar et al):

http://www.cell.com/AJHG/retrieve/pii/S0002929712001462

"My guess is that if we ever date these lineages it will be revealed that the D folk came first, followed by the C folk, followed by the NO folk".

That is what Eurologist suggests too. I would be reasonably sure that NO moved north from the hill country bordering NE India/S China though. That movement followed MNOPS's arrival in SE Asia.

"Could you expand on this?" [This may be suggestive of the route taken by modern humans in Eurasia as they moved to the east.]

I have long been suggesting both C and D (with mt-DNA N) moved east through Central Asia at a time of moderated climate. Is that what you think as well?

Why don't you use the following samples as well, to derive the tMRCAs for intermediate clades?

9 G2a3-L30*s and also possibly 2 G1a-P20 GIHs available
1 IJK* M89+ L16+ P128- P123- GIH which may be an "Indian F*"
8 T1-P77s
1 L-M22 available
A large number of Q1a3a1-M3s, but also two Q-M45+ M242+ M207- KHV Vietnamese

"Why don't you use the following samples as well, to derive the tMRCAs for intermediate clades?"

They're not part of the 1000 Genomes Project. This dating approach depends on having the same regions of the Y chromosome sequenced in the studied chromosomes. But, there are more chrY releases from the 1KG coming, so I'll probably do more age estimates in the future.

The age of DE makes it look much like a descendant of CT that stayed behind in Africa. Asian DE may descend from the same population that spread CT into Eurasia, in a DE segment of the population that managed to linger on. CF looks like the marker of a more bottlenecked Eurasian population, perhaps with some cultural/technological innovations.

"The age of DE makes it look much like a descendant of CT that stayed behind in Africa. Asian DE may descend from the same population that spread CT into Eurasia, in a DE segment of the population that managed to linger on. CF looks like the marker of a more bottlenecked Eurasian population, perhaps with some cultural/technological innovations."

CF is Eurasian and DE is Afrasian. So, it seems that CT was originally Asian, and spawned CF and DE.

D has a relic distribution in the Asian periphery, and E is in Africa. So, it might make sense if DE was part of a southern population that got split up as Arabia became more desert-like, causing some remaining DE folk to migrate west and some east. These groups of escapees coalesced into E in Africa and D somewhere to the east.

The other set of escapees (CF) moved north out of Arabia, so they never got the chance to go to Africa. One group of escapees headed north+east (C), and another stayed behind (F). It is probably this F group that started the UP revolution in Asia, and the E group that started the LSA one in Africa.

"D has a relic distribution in the Asian periphery, and E is in Africa. So, it might make sense if DE was part of a southern population that got split up as Arabia became more desert-like, causing some remaining DE folk to migrate west and some east. These groups of escapees coalesced into E in Africa and D somewhere to the east."

Indeed. If CT is Asian, then this explanation makes the most sense.

One thing that makes it uncertain is that DE is apparently almost just as old as CT. This means it's actually quite possible that DE migrated out of Africa at the same time as CT. In other words, DE either migrating into Africa, or out of Africa at a later date, are no longer the only possibilities to consider. There needs to be a shift in the focus of the debate.

"It is probably this F group that started the UP revolution in Asia, and the E group that started the LSA one in Africa."

If E carriers initiated the African LSA, it is truly remarkable that they failed to leave a genetic mark among the modern hunter-gatherers of Central-Southern Africa. There are no signs of any ancient E lineages in the Khoisan and Pygmy. Incidentally, mtDNA L3 is also lacking in these populations.

"If E carriers initiated the African LSA, it is truly remarkable that they failed to leave a genetic mark among the modern hunter-gatherers of Central-Southern Africa. There are no signs of any ancient E lineages in the Khoisan and Pygmy."

More specifically, I should add that when I refer to the people that "initiated" the LSA, that includes the early LSA people of Border Cave. I don't rule out that LSA may trace its origins to more northerly DE populations. However, if that's truly the case, then it seems to have spread through cultural diffusion. Because there is no sign of their DNA in Central-Southern Africa.

"If E carriers initiated the African LSA, it is truly remarkable that they failed to leave a genetic mark among the modern hunter-gatherers of Central-Southern Africa. There are no signs of any ancient E lineages in the Khoisan and Pygmy. Incidentally, mtDNA L3 is also lacking in these populations."

E1b1 is about 43,000 years old. I don't have data on the dates of the chromosomes found in Khoe-San. There are other E types as well.

In Europe, we've seen mtDNA U levels drop from ~100% to ~10% in the span of a few thousand years. So, I don't anticipate great continuity anywhere on the planet, and Reich's recent comments hint to that effect.

Also, it's quite possible that some part of the spread could have been mostly cultural. I'm not a big believer in a genetic origin of the 50ka revolution; even if there were such advantages, they need not be neurological in nature (could be related to life history or other factors). So, I wouldn't be surprised at all if Y*(xBT) people living in South Africa could adopt new techniques from other BT Africans with little immigration. However, I am inclined to accept some level of migration, primarily due to the Hofmeyr skull.

Hi there, I am interested in my family genealogy and in a DNA test. But it's hard to figure what to choose. 23andMe is very expensive, 450 bucks or so, is it that good? What should I look for in a good test? What is the best DNA test in your opinion? Or, if you prefer, what are the bad ones I should stay away from?

"E1b1 is about 43,000 years old. I don't have data on the dates of the chromosomes found in Khoe-San. There are other E types as well."

Dates are not really needed when we know of Bantu admixture in Khoisan (represented by E1b1a and E2). No unique E lineages have been found among them. E-M293 has been associated with relatively recent East African expansions, as it's much more diverse in East Africa, and the recent ancestors of E-M293 are believed to have originated there.

"In Europe, we've seen mtDNA U levels drop from ~100% to ~10% in the span of a few thousand years. So, I don't anticipate great continuity anywhere on the planet, and Reich's recent comments hint to that effect."

That's comparing apples to oranges. In Europe, Neolithic farmers replaced the hunter-gatherer natives, leaving only a portion of their DNA in the farmers as a result of assimilation.

In Africa, the situation is different. There are several hunter-gatherer relics that survived; the Bantu did not completely wipe them out. Also, the technology used by Khoisan hunter-gatherers bears striking similarities to the early LSA from 40-50 kya at Border Cave.

That's not enough to argue there should be genetic continuity. What further supports some degree of isolation for the Khoisan and Pygmy is the fact that they do not carry mtDNA L3, and if not for recent admixture, they would not carry Y-DNA E. So, even if hunter-gatherers are the result of some recent expansion, they did not arrive from a Y-DNA DE or mtDNA L3 source. It does not seem possible that the majority of their ancestry within the past tens of thousands of years can be traced outside the Central-Southern Africa region (including parts of East Africa, and excluding the rainforest).

"So, I wouldn't be surprised at all if Y*(xBT) people living in South Africa could adopt new techniques from other BT Africans with little immigration. However, I am inclined to accept some level of migration, primarily due to the Hofmeyr skull."

Does this mean that you now believe BT is also of Eurasian origin?

In comparison to DE, it is actually quite possible that B is related to the spread of LSA technology. Y-DNA B is found widely among sub-Saharan hunter-gatherers. Interestingly, the most widely distributed subclade (B2b) is dated to >55 kya. B2b peaks in frequency and diversity in the Hadza hunter-gatherers in the central Rift Valley, which was part of the early LSA.

"Dates are not really needed when we know of Bantu admixture in Khoisan (represented by E1b1a and E2). No unique E lineages have been found among them."

They don't have to be unique. Being part of the same lineage doesn't mean that there cannot be substantial time depth. You saw it recently in R-M343, they all belonged to the "same lineage" and yet one of them was much more anciently diverged than the rest.

"In Africa, the situation is different. There are several hunter-gatherer relics that survived; the Bantu did not completely wipe them out. Also, the technology used by Khoisan hunter-gatherers bears striking similarities to the early LSA from 40-50 kya at Border Cave."

There is no reason to think that Khoisan are direct descendants of Border Cave people; they could be (in part), but that's far from proven.

When Hofmeyr was tested against UP Europeans and recent Khoesan, it aligned with the former.

French, Greek, Lebanese, and Egyptian fishermen in the Mediterranean used traditionally pretty much the same material culture, but that did not mean they were genetically related. Material culture persists even if there is massive genetic change, provided it is useful in the local environment.

"Does this mean that you now believe BT is also of Eurasian origin?"

I don't believe anything with respect to either BT or to Y-chromosome Adam himself. There are B and Y*(xBT) in both Africa and Eurasia. Their relationships should be studied.

It is wrong to assume that the persistence of B and Y*(xBT) in neat little packages in African hunter-gatherers has anything to say about Africa's priority with respect to either BT or the root of the phylogeny as a whole.

If the assimilation/genocide of hunter-gatherers continues for a little longer, Africa will end up just like Eurasia, where the B and Y*(xBT) chromosomes occasionally turn up in large samples.

At present, I'm willing to entertain different ideas: someone ought to compare e.g., the B in Iran with the B found in different African populations: if it's a recent immigrant, then they'll be separated by a few thousand years at most; but they may not be.

When this was done in the case of L*(xM,N) lineages in Europe, some of them turned out to be very old:

If two populations start exactly the same (50% X and 50% Y) they may follow quite different trajectories and turn out:

A: (50% X and 50% Y)
B: (1% X and 99% Y)

All of a sudden, it now appears that X is "recent gene flow" from A to B. So, the paucity of B and Y*(xBT) in Eurasia does not necessarily indicate that these are all recent migrants; it could very well indicate that they were pushed to the side earlier, so they no longer exist in relic populations (as they do in Africa), but only as occasional outliers, having long before been assimilated.
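The divergence between A and B does not need anything beyond chance. Here is a minimal sketch in Python of neutral drift in the Wright-Fisher style, where each generation resamples lineage X from the previous generation's frequency. The function name, population size, number of generations and seeds are arbitrary illustrative choices, not estimates for any real population.

```python
import random


def drift(freq_x, pop_size, generations, seed):
    """Frequency of lineage X after random drift in a fixed-size population.

    Each generation, every one of the pop_size male lines independently
    inherits X with probability equal to the current frequency of X.
    """
    rng = random.Random(seed)
    for _ in range(generations):
        carriers = sum(rng.random() < freq_x for _ in range(pop_size))
        freq_x = carriers / pop_size
    return freq_x


# Several populations that all start at 50% X and 50% Y.
# Only the random seed differs between them.
outcomes = [drift(0.5, pop_size=200, generations=100, seed=s) for s in range(5)]
```

Printing `outcomes` shows how far identical starting populations can wander apart without any migration at all; a low frequency of X in one of them is not by itself evidence of recent gene flow.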

"The greatest contrast in the human species is between the BT clade which is less than half the age of the most basal clade within this dataset; there are even more basal Y-chromosome lineages in African hunter-gatherers"

Is there really nothing between the most basal clade and the BT clade?

I have in mind the recent study by Cruciani et al (2011) which stated that the haplogroup A was paraphyletic. So logically there should be "intermediate clades" between the most basal clade and the BT clade, right?

That's good. When the evidence supports an African spread into Eurasia, it's better not to "believe anything". Stick with the far-fetched hypotheses when they concern Eurasian influence in Africa.

The evidence does not support anything. Only by making an additional assumption (that Y*(xBT) and B in Eurasia is always the result of recent African origin) does the evidence support an African origin of the phylogeny.

The Iran paper found 7/938 haplogroup B chromosomes in Iran, a country of 75,000,000. That's potentially a quarter million B chromosomes in just one Eurasian country. I wouldn't dismiss these as easily as you seem to.
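The "quarter million" figure is a straightforward extrapolation, sketched below in Python. It assumes that roughly half the population is male (only males carry a Y chromosome) and that the 938 samples are representative of the whole country; both are simplifying assumptions of mine, and the function name is hypothetical.

```python
def estimated_male_carriers(observed, sample_size, population, male_fraction=0.5):
    """Scale an observed sample frequency up to a whole population.

    male_fraction is an assumed share of males in the population.
    """
    sample_frequency = observed / sample_size
    return sample_frequency * population * male_fraction


# Figures quoted above: 7 of 938 sampled chromosomes, 75,000,000 people.
carriers_in_iran = estimated_male_carriers(7, 938, 75_000_000)
```

Under these assumptions the estimate comes out at about 280,000, in line with the rough figure above. It would shrink considerably if the carriers are concentrated in one region rather than spread evenly.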

It is well known that Iran has been a recipient of sub-Saharan lineages. I don't know how it exactly translates into the later nomenclature but Hammer's haplogroup 5 (a sublineage of haplogroup E) is found at a moderate frequency in Iran while it is virtually absent in all of Europe and Asia.

So I would wager that B's in Iran came along with "haplogroup 5" through Horn of Africa.

I have long argued that CT is Asian in origin and E represents a back migration to Africa. However, claiming the same for BT is pushing it a bit too far.

"One thing that makes it uncertain is that DE is apparently almost just as old as CT. This means it's actually quite possible that DE migrated out of Africa at the same time as CT".

Actually CT includes DE. CT split into DE and CF. Next both haplogroups split into two, DE almost immediately. Of the four resulting haplogroups just E is African. That alone is enough to expect that it was CT that emerged from Africa.

"I don't rule out that LSA may trace its origins to more northerly DE populations. However, if that's truly the case, then it seems to have spread through cultural diffusion. Because there is no sign of their DNA in Central-Southern Africa".

Many would claim that technology migrates faster than genes. So it is quite possible that, although the technology would be passed from father to son or uncle to nephew, the technology could spread ahead of the Y-DNA. Besides which, if haplogroup E was originally a minority haplogroup in the populations that became the Khoi-San it could eventually have drifted out.

"the B in Iran with the B found in different African populations: if it's a recent immigrant, then they'll be separated by a few thousand years at most; but they may not be".

That testing is absolutely essential before we can draw any conclusions concerning any extra-Africa origins for those basal Y-DNAs. Until then we are probably safer in assuming they are more recent arrivals outside Africa.

There is a huge patch of isolated and mysterious R1b1c in Northern Cameroon. I always thought that the R1b in northern Cameroon was recent colonial from Western Europe, French or something. But as R1b1c it tells a completely different story.

R1b1a includes the main Western European R1b.

R1b1b is rumoured to be found in Anatolia (I could not confirm this), and the observation of this subclade in Anatolia is part of the evidence for a West Asian location for R1b1. The idea is that R1b1b stayed behind when R1b1a or subclades of it travelled north of the med to the Atlantic.

It's hard to fit R1b1c in the northern Cameroon into this scenario (expansion north of the med from Central Asia).

The simplest idea when three subclades have different locations is to look for a geographically likely source location for the parent clade (R1b1), able to spread to all three locations. The obvious one here is North Africa.

That would translate as:
R = Central Asia
R1a = branch that travelled west north of the med.
R1b = branch that travelled west south of the med.
R1b1 = North Africa

R1b1a = the branch that headed west from Africa and crossed north at Gibraltar.
R1b1b = the branch that headed back east to Anatolia.
R1b1c = the branch that headed south from North Africa.

However another possible explanation is that a relict R1b1 haplogroup later made its way to the Northern Cameroon and expanded.

IMO 23andMe is much better value. But if you get serious about tracing your paternal relatives (a process that takes decades of waiting) you will need to test STRs at Family Tree DNA and join a Surname project also.

"Why don't you use the following samples as well, to derive the tMRCAs for intermediate clades?"

"They're not part of the 1000 Genomes Project. This dating approach depends on having the same regions of the Y chromosome sequenced in the studied chromosomes."

Aren't these in fact low-coverage Y full sequences - about the same degree of coverage as the 1K Genomes samples - so they are basically equivalent to the 1K Genomes low-coverage Ys? What's the difference?

"The Iran paper found 7/938 haplogroup B chromosomes in Iran, a country of 75,000,000. That's potentially a quarter million B chromosomes in just one Eurasian country. I wouldn't dismiss these as easily as you seem to."

To be more specific, B was found in 7/192 (3.6%) of samples from Hormozgan, one of 31 provinces in Iran, and one of 16 sampled provinces in the recent paper. This is a region with recent, known connections with Africa, and it even hosts an Afro-Iranian community (although Grugni et al. only included 12 Afro-Iranian samples). It's telling that in 2/3 of the native Iranian groups where E1b1a1 was found, B was found as well.

If you find it so difficult to make assumptions based on recognizing the most probable option, then maybe you should refrain from ever ascribing a haplogroup to any particular region, ethnolinguistic group, or innovators of technology, unless the evidence is completely unambiguous? But I know you don't actually have a problem with making assumptions (E - LSA, F - UP, J2a - Indo-European), only in certain "special cases". I'm done here.

It also turns up in various places in Europe and West Asia in commercial testing unaccompanied by E, and certainly not in the dominant E/weak A ratios expected of modern African populations.

And, there appears to be evidence that A was more frequent in the past than it is today in parts of Africa

http://etd2.uofk.edu/content/html/pdf/en/en.4312.pdf

If the "A" dates to that earlier period, then it would explain why it is found unaccompanied by E.

So, while recent African introgression can be used to explain some results, there is also evidence of much older links. These should be investigated.

Also, proposing links between lineages and archaeological phenomena is not "making assumptions", it's brainstorming. For example, the link of E with the LSA is not set in stone. E very clearly moved north-south in recent times (the San are a partial relic of this), and it also apparently moved to the Sudan since Neolithic times. West African E types are derivative from East Africa. So, it might be quite possible that E spread much later than the LSA.

The fact is, that African populations have >100ka population "splits" with Eurasians, even when using the fast mutation rate calibrated with chimps. Coalescence ages of L3, BT, and DE, on the other hand are much younger. There is also evidence of late persistence of archaic humans in Africa and late admixture with them. My Back-to-Africa theory attempts to reconcile different strands of evidence into a new synthesis. Perhaps I am wrong but there's no reason to cling to a naive recent Out-of-Africa paradigm when it is becoming increasingly unbelievable.

Yes. The increased frequency of A in the ancient past explains the occasional A in West Eurasia, despite the absence of E1b1a. To this day, A is predominant in South Sudanese Nilotes, where E1b1a is absent (Hassan 2008). aDNA has shown that the predominance of A extended to Northern Sudan in Neolithic times, and it may have extended to parts of North Africa as well. From the outliers I've seen who tested with commercial testing companies, their A matches the subclade found in Sudan (A3b2).

"The fact is, that African populations have >100ka population 'splits' with Eurasians, even when using the fast mutation rate calibrated with chimps. Coalescence ages of L3, BT, and DE, on the other hand are much younger. There is also evidence of late persistence of archaic humans in Africa and late admixture with them. My Back-to-Africa theory attempts to reconcile different strands of evidence into a new synthesis. Perhaps I am wrong but there's no reason to cling to a naive recent Out-of-Africa paradigm when it is becoming increasingly unbelievable."

Several factors explain the >100 kya divergence time, despite the great success of (likely African) lineages dating to 60-70 kya.

1) The majority of West African mtDNA does not belong to mtDNA L3, but more divergent lineages (L2, L1). Nilotic/Chadic groups, partially of recent East African extraction, are an exception to this rule; they are mostly L3

Based on palaeoarchaeological evidence, the region, where anatomically modern humans have likely originated, is comprised of a vast territory from Central Europe in the west to the Russian Plain in the east to Levant in the south. Each of these regions is renowned for discoveries of the oldest skeletal remains of modern humans dating back to 42,000 - 44,000 ybp. To date, none of these sub-regions has clear and unequivocal advances in this regard.

You said "It's hard to fit R1b1c in the northern Cameroon into this scenario (expansion north of the med from Central Asia)."

It's really not that hard to fit. There are attested pre-historic and historic interactions between Africa and the Near East (the Central Asian association for R1b is a red herring) which are fully compatible with everything else we know about R1b.

For one thing, the first major split under R-M343 is between R-L389 and R-V88. Within the Near East, the first has a more northerly distribution while the second is more southerly: entirely consistent with the existence of R-V88 in the parts of Africa where we observe it.

Dienekes, what do we see that seems to synchronize with the Younger Dryas Impact at 12.9 Kya?

Ted,

I can't answer for Dienekes, but I would think that the method employed basically folds some error distribution over the actual dates that is wider than the Younger Dryas. On the flip side, if one knows the exact form of this function, it could (theoretically) be possible to deconvolve the results to get back to a much finer temporal resolution.

R1b-V88 corresponds very strongly to the Afro-Asiatic linguistic subfamily called "Chadic", a pastoralist Sahel people whose largest ethnic group are the Hausa. This Y-DNA haplogroup has diffused very little via exogamous admixture with geographically adjacent groups until very recently (Muslim Sahel pastoralists of multiple linguistic populations began to form a sense of shared identity with each other, probably starting sometime in the 20th century, which has probably facilitated greater admixture in the last few generations). This distinctiveness, despite geographical proximity and similar food production modes, is suggestive of the possibility that R1b-V88 may have been the last significant pre-Colonial arrivals to the population genetic mix in Africa.

The R1b-V88 case, which is predominantly Chadic Afro-Asiatic, also adds to the confusion about the linguistic identity of the carriers of R1b during its expansion period. R1b (in haplogroups common in other European populations as well) is a leading Y-DNA haplogroup among the quintessentially linguistically non-Indo-European and non-Afro-Asiatic Basque. But, uniparental genetic markers, autosomal genetics, and specific genetic trait markers (blood type, lactose intolerance) as well as the history that is available, all disfavor a scenario in which the Basque experienced language shift from a population that was previously Indo-European or Afro-Asiatic linguistically, or a scenario in which a larger linguistically "Greater Basque language family" population shifted to speak Indo-European languages with no identifiable genetic contribution from Indo-European language speakers.

When we have Indo-European language speakers with distinctively non-Basque genetic elements, Basques and Chadic linguistic populations that all bear some kind of R1b Y-DNA, it is hard to make any strong statements about the linguistic origins of the source population for all three.

For reasons of phylogeny, I think we can safely say that these three populations all have a common Y-DNA ancestor somewhere in the general vicinity of the Caucasus or Pontic-Caspian steppe area.

I also think it is most plausible that R1b-V88 founding populations developed an Afro-Asiatic language somewhere en route from this homeland to the Sahel, perhaps via imperfect or intentionally differentiated dialects of Coptic or Berber or Cushitic or even Semitic or Omotic (internal organization of the Afro-Asiatic linguistic family is hotly disputed). There are plausible routes by which almost any of the main Afro-Asiatic linguistic families could have influenced a migrating population that would come to speak Chadic languages on that route.

On balance, the notion of a Greater Basque population at the source area that experiences language shift accompanied by some demic contributions from an Indo-European superstrate after arriving in Western Europe from their homeland seems the most plausible scenario to me. But, given the thinness of the historical evidence, any plausible migration and language shift scenario could explain the data and more complicated scenarios are possible too.

Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.

Feel free to send e-mail to Dienekes Pontikos, or follow @dienekesp on Twitter.