January 20, 2010

Neolithic origin of European Y chromosomes

My comments on this important paper will follow here after I read it.

UPDATE

My observations:

The non-use by the paper of the "evolutionary mutation rate" results in low age estimates and hence a "Neolithic" time frame. The authors are to be commended for going against the flow of the last 7 years of population genetics literature.
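To make the effect of the rate choice concrete, here is a minimal sketch, assuming the simplest stepwise-mutation relationship in which expected per-locus microsatellite variance grows linearly with time (variance = rate x generations). The two rate values, the 25-year generation and the variance figure are illustrative assumptions of mine, not numbers taken from the paper.

```python
def tmrca_years(mean_variance, mutation_rate, years_per_generation=25.0):
    """Rough age in years from mean per-locus STR variance: t = V / mu generations."""
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    return mean_variance / mutation_rate * years_per_generation


# Assumed, illustrative rates (per locus per generation of 25 years).
GERMLINE_RATE = 2.0e-3       # order of magnitude of pedigree-observed rates
EVOLUTIONARY_RATE = 6.9e-4   # order of magnitude of the "evolutionary" rate

# Hypothetical mean variance across loci, chosen only for the arithmetic.
variance = 0.5

age_germline = tmrca_years(variance, GERMLINE_RATE)
age_evolutionary = tmrca_years(variance, EVOLUTIONARY_RATE)
ratio = age_evolutionary / age_germline

print(f"germline-rate age:     {age_germline:,.0f} years")
print(f"evolutionary-rate age: {age_evolutionary:,.0f} years")
print(f"ratio:                 {ratio:.2f}")
```

Under these assumptions the same variance yields an age roughly three times older with the slower rate, which is the whole difference between a "Neolithic" and a "Paleolithic" label.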

However, the inference that a "Neolithic" TMRCA = "Neolithic" time of arrival into Europe is flawed, as I have argued elsewhere. In short: a particular TMRCA is consistent with the arrival of the lineage in a particular geographical area either long before or long after the TMRCA.

Equally flawed is the inference that R1b1b2 is clinal (Figure 2A). Microsatellite variance is not significantly higher in Turkey than in Europe -- even if one makes the questionable assumption that modern Anatolian Turks are patrilineal descendants of Neolithic Anatolians. The significance of the regression line disappears if 1 or 2 data points are excluded, and the plot has a quite visible "gap" between Turkey and Italy corresponding to the entirety of eastern Europe and the Balkans, i.e. the routes by which any putative Neolithic lineages would have entered Europe.
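The kind of sensitivity check meant here can be sketched as a leave-one-out test: recompute the significance of the variance-versus-longitude correlation with each population dropped in turn. This is only a sketch under stated assumptions: the data points are invented placeholders (not the values in Figure 2A), and an exact permutation test on Pearson's r stands in for whatever regression test the paper used.

```python
from itertools import permutations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def permutation_p(xs, ys):
    """Exact two-sided permutation p-value for the correlation (small n only)."""
    observed = abs(pearson_r(xs, ys))
    hits = total = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance so the unshuffled ordering always counts as a hit.
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical (label, longitude, STR variance) points; the last one plays
# the role of a single easternmost population.
points = [
    ("pop_A", -8, 0.30), ("pop_B", -3, 0.34), ("pop_C", 2, 0.29),
    ("pop_D", 9, 0.33), ("pop_E", 12, 0.31), ("pop_F", 15, 0.35),
    ("pop_G", 35, 0.45),
]

xs = [lon for _, lon, _ in points]
ys = [var for _, _, var in points]
p_full = permutation_p(xs, ys)

# Drop each population in turn and recompute the p-value.
leave_one_out = []
for i, (label, _, _) in enumerate(points):
    kept = points[:i] + points[i + 1:]
    p = permutation_p([lon for _, lon, _ in kept], [var for _, _, var in kept])
    leave_one_out.append((label, p))

print(f"all points: p = {p_full:.3f}")
for label, p in leave_one_out:
    print(f"without {label}: p = {p:.3f}")
```

If the p-value jumps above the usual threshold when a single population is removed, the cline rests on that one point rather than on a gradient across the whole sample.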

In conclusion: the paper makes an important but inconclusive contribution to the question of R1b1b2 origins. We will have to wait for more ancient DNA before this question can be settled.

The relative contributions to modern European populations of Paleolithic hunter-gatherers and Neolithic farmers from the Near East have been intensely debated. Haplogroup R1b1b2 (R-M269) is the commonest European Y-chromosomal lineage, increasing in frequency from east to west, and carried by 110 million European men. Previous studies suggested a Paleolithic origin, but here we show that the geographical distribution of its microsatellite diversity is best explained by spread from a single source in the Near East via Anatolia during the Neolithic. Taken with evidence on the origins of other haplogroups, this indicates that most European Y chromosomes originate in the Neolithic expansion. This reinterpretation makes Europe a prime example of how technological and cultural change is linked with the expansion of a Y-chromosomal lineage, and the contrast of this pattern with that shown by maternally inherited mitochondrial DNA suggests a unique role for males in the transition.

75 comments:

Vizachero is rejoicing in posting the paper of Barbujani (hereon Farfugliani) on “Genealogy-dna”. Farfugliani, the preferred pupil of Cavalli Sforza. Ydna R1b1b2 would come from Middle East, the old theory of farming expansion, demic diffusion and, why not, the Renfrew’s oddity of the Indo-European from Asia Minor. 1) Asia Minor isn’t Middle East. There is a fault between Indo-European Asia Minor and Semite Middle East: linguistic, cultural, chromosomal. 2) From there would come Y but not mt. Do you imagine a horde of Asian Don Juans to break a lot of European hearts? And under the eyes of Mesolithic hunter gatherers with their murderous bows? 3) Anyway a little joy for Vizachero: this would have happened not 4000ya as he is thinking but at least double that: 8000ya. Something doesn’t square. 4) The science of Farfugliani is all pro domo sua (self-serving): Italy, which has at least a 30% of its R1b1b2 as R1b1b2a (DYS393=12), has in his data almost only the more recent subclades: and Italy has R1b1*, R1b1a, R1b1b2/L23-, R1b1b2/L23*/150- (unique in the world) etc. 5) We are yet waiting for the Rozen’s SNPs, with all due respect to Farfugliani. Farfugliani sleeps in peace.

From where it might have been spread by the Cardium Culture or the Megalithic Culture:

It would seem that they might have originated in Southern Armenia, in the Syunik area, as there is a huge ancient megalithic structure there, called Karahunj (aka Zorats Karer, Angelakot) - http://www.nationmaster.com/encyclopedia/Karahunj - which may date to 8,000 yo, it seems to have also served as an observatory - much like Newgrange in Ireland or Stonehenge in England.

It would also seem that the population of this Southern Armenia area is 40-45% R1b: http://www.ucl.ac.uk/tcga/tcgapdf/Weale-HG-01-Armenia.pdf

I think they're referring to numbers within the population rather than microsatellite variance. The actual comment, 'increasing in frequency from east to west' seems to imply that interpretation.

"This reinterpretation makes Europe a prime example of how technological and cultural change is linked with the expansion of a Y-chromosomal lineage"

By no means the only example though. In fact I see technological change behind the expansion of virtually all Y-haps. Their distribution certainly makes more sense when interpreted in this way. But occasionally they take their own women with them.

The dating issue raised yet again by Mr.DP, God bless him, is just a red herring.

The exact age of the haplogroup in question is not important. What is important is: 1) frequency and 2) genetic diversity. I have read that Mr.DP does not agree with the higher genetic diversity of the haplogroup in Anatolia, but that report is just confirming, yet again to the point of boredom, that the haplogroup gets more genetically diverse the further east one moves from its centre of high frequency in western Europe. Those researchers may all be wrong and have agendas, but they have access to all the data by which they came to that conclusion. A similar conclusion can be made about other haplogroups, mine in particular, the despised, Middle Eastern J1. The groups who have the highest frequency, to the point of thinking those people must have made inbreeding and consanguinity a dogma in their groups, show a lesser amount of genetic diversity. Genetic diversity shows where haplogroups originate, or where the original bearers immigrated in a large enough group to maintain high genetic diversity.

I have to say "I told you so". I have maintained that the selection of certain over publicised haplogroups in Europe as being a sign of Paleolithic or Mesolithic Europeans was wrong. I have said often that just about every haplogroup found in Europe has a Middle Eastern or Central Asian origin point. In fact I have said there are no Y chromosome haplogroups native to Europe.

Cheddar Man, a man killed and placed in a cave in Britain more than 9,000 ybp and before agriculture was introduced to Britain. Therefore that mito haplogroup has to be considered a native European one, i.e. existing prior to the Neolithic farming crowd moving to Britain. However, very few authenticated and proven SNPs have been derived from the bones of Paleolithic or Mesolithic Europeans. The CRS HVS1 result in a Paleolithic European from Italy proves nothing. Lots of haplogroups including the African Ls can have the CRS in the HVS1. More ancient DNA is needed to prove the haplogroups, i.e. SNPs in their mito DNA, before making rash judgements.

I'm still reading it but there are various problems I can already see:

1. MC age estimates are just a guessing game with no guarantees whatsoever.

2. They do not study R1b1b2a1, which is the truly European-specific haplogroup but a larger clade that of course has West Asian origin (when? MC estimates can't but speculate).

3. The phylogenetic structure of R1b1b2 and in particular of R1b1b2a1 is not considered. All haplogroups and phylogenetic layers are treated as one simple amorphous object.

In this sense, the phylogenetic graph is almost funny and shows the lack of precision achieved by mere haplotype (STR) analysis: what looks like an almost amorphous gigantic starlike structure has in fact little correlation with the highly structured true phylogenetic tree of R1b1b2.

However, I believe I can still spot how Anatolian phylogenetic diversity was achieved: by back-migration. One branch does look as specifically Anatolian (mostly red, probably representing R1b1b2(xR1b1b2a1)) but all the other Turkish haplotypes (indeed very diverse) are clearly derived from European ones.

4. No archaeologically consistent spread pattern is proposed other than the vague term "Neolithic", as if all Neolithic cultures were one and the same thing.

As a side note the PLoS Biology site has been under maintenance for the last many hours, so I'm accessing the paper via PMC, something that other readers may find useful while the main site goes up again.

I find it intriguing that so many European samples (Bosnia, France 5 and 6, Poland, Russia, Portugal, Spain 1 and 3, this last meaning Basques, and Slovenia) have been excluded from all analysis and from the supplemental material (frustrating).

As expected, most of the Turkish haplotypes are DYS393=12, which means they are not part of R1b1b2a1, which is what really matters. This, together with all DYS393>12 (R1b1b2a1)

The other possible source of "Neolithic" haplogroups in Europe could be the Balcans, however, while the Balcans seem to share some R1b1b2* haplotypes with Turkey (sign of Neolithic arrival probably but only to that area), they do not appear as central to the R1b1b2a1 haplogroup, that is what really matters here.

For starters, check fig.3, that shows the haplotype structure of the overall R1b1b2 in Europe and Turkey. You will notice that it has a very marked star-like structure and that one (and only one) of the many branches has most of the red-coded circles, meaning Turkish samples. This is the root and is also identical to R1b1b2(xR1b1b2a1).

Then there is a big circle in the middle with very little (but some) red. That is the modal haplotype of R1b1b2 and also the modal and root of R1b1b2a1. It does not look Turkish, Balcanic nor Italian. Of the rest only the Iberian area (but excluding Portugal, East Andalusia and, notably, Basques) is marked, comprising some 15% of the "cake". The rest is "others", meaning French (incl. Basques), German, English, Irish and Danish. Together they make up about 80% of the node (though I'd like to know in greater detail and for that reason I'll have to check manually through the supplementary material - I wouldn't mind having France and Germany marked in this graph, but nope).

This means that R1b1b2a1 (i.e. most of R1b1b2 globally and nearly all R1b1b2 in Europe) has a European origin in the Magdalenian region.

Erratum: Iberian share of the central (modal) cake is more like 20% and others therefore make more like 70%, not 80%. But this doesn't change anything, as the Turkey-Balcans-Italy area still only make up like 10%.

And another very useful exercise to confirm or reject the validity of the claim that R1b1b2 or R1b1b2a1 diversity is highest in Turkey, or to see where it is. Just go to figure 3 again and count the circles including each color (each circle means a single haplotype). I did and these are my results (subject to minor error because of manual count):
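The same count can be done from tabulated haplotypes rather than by eye from the figure. The snippet below is a rough sketch with made-up population labels and three-marker STR haplotypes (placeholders, not the paper's data); it simply tallies how many distinct haplotypes each population carries, alongside the sample size, since a raw count grows with the number of men sampled.

```python
from collections import defaultdict


def distinct_haplotypes(samples):
    """Map each population to (number of samples, number of distinct haplotypes)."""
    sizes = defaultdict(int)
    seen = defaultdict(set)
    for population, haplotype in samples:
        sizes[population] += 1
        seen[population].add(tuple(haplotype))
    return {pop: (sizes[pop], len(seen[pop])) for pop in sizes}


# Hypothetical (population, STR repeat counts) records.
samples = [
    ("pop_A", (13, 24, 14)),
    ("pop_A", (13, 24, 14)),
    ("pop_A", (13, 25, 14)),
    ("pop_B", (12, 24, 14)),
    ("pop_B", (12, 24, 15)),
    ("pop_B", (12, 25, 14)),
    ("pop_B", (12, 24, 14)),
]

counts = distinct_haplotypes(samples)
for pop, (n, k) in sorted(counts.items()):
    print(f"{pop}: {k} distinct haplotypes in {n} samples")
```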

I just was able to look at the paper (after a 4-hour publisher delay) and basically agree with Maju.

In particular, the agricultural map is outdated, and the haplogroup analysis is extremely simple-minded. The crux is when and where R1XXX and R1bXXX migrated out of South Asia, and when its children started to dominate Europe. Seems like the authors don't really know what they are up against.

Maju, I appreciate your analysis, but remember what I have always said: 1) probably R1b1* came to Italy from Western Europe (the Cantabrian Refugium?) before the Younger Dryas, but the last paper of Cruciani has found 3 R1b1* in Italy and none in Western Europe, so we can hypothesize that R1b1* was present in Italy before and not elsewhere; 2) after the Younger Dryas the subclades till R/L23+/L150+ migrated from Italy to East and to West; 3) Spain has R1b1b2a1b, but lacks all upstream clades, and has above all as subclades R-M153 and R-M167. No R-L21 and a few R-U152, due probably to back migration. Then the origin of subclades of R1b1b2 has a double face: one from West and one from East-Central Europe, and the origin in Italy I think is always the best explanation. Also the R1b1/V88+ in Africa presupposes an origin from North (Italy by sea) and not from East, and probably neither from West, as Spain lacks the clades upstream of it.

Gioello: I have recently (in light of the new R1b1a data) considered the possibility that Italy could be at the origin of R1b1 as a whole, as by haplogroup plus "asterisk" haplotype count it might seem to be slightly ahead in basal diversity. The matter is not too clear cut but there is some evidence that could support your claim.

However this issue is one or two steps further down the phylogenetic tree: as regards R1b1b2, Italy appears as rather secondary - and Anatolia instead as the actual urheimat. In turn, R1b1b2a1 seems quite detached from either Turkey or Italy and to be an essentially West/North European matter.

Hence it'd rather look like the pattern could be: Italy (R1b/R1b1) -> West Asia (R1b1b2) -> Central/West Europe (R1b1b2a1). However West Asia is also a decently good candidate for the R1b/R1b1b urheimat and the slightly lower basal diversity after Italy might be due to undersampling.

The matter is complex but looks anything but Neolithic (except maybe for some minor branches).

And another very useful exercise to confirm or reject the validity of the claim that R1b1b2 or R1b1b2a1 diversity is highest in Turkey, or to see where it is. Just go to figure 3 again and count the circles including each color (each circle means a single haplotype). I did and these are my results (subject to minor error because of manual count):

No need to count from the figure: all the data is in the supplemental file.

But the important points, which contradict your argument, are two.

One is that the haplotypes (and haplogroups) of NW Europe are a subset of SE Europe/SW Asia and not the other way around. Of course the frequency of the derived alleles has surfed to higher frequency as the wave progressed, but this is well attested and explained by both simulation and observed data elsewhere.

The second, related, point is that haplotype diversity is higher in the east (especially southeast) than in the NW. You could add data from other studies which tested the same markers (e.g. Cyprus, Syria, and Lebanon from El-Sibai et al. (2009) and the southern Balkans from Bosch et al. (2006)) to the map and the conclusion would be the same.

To these I also added data from Armenia and Hungary from FTDNA, but you can see the basic shape of the diversity cline is intact.
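For readers who want to check a diversity comparison like this themselves, here is a minimal sketch of one standard measure, Nei's unbiased haplotype diversity, H = n/(n-1) x (1 - sum of squared haplotype frequencies). Whether the paper or the map above used exactly this statistic is an assumption on my part, and the haplotype labels in the example are placeholders.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of hashable haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies in the sample.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # The n / (n - 1) factor corrects for finite sample size.
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical samples: identical men, a mixed sample, all-different men.
print(haplotype_diversity(["a", "a", "a"]))
print(haplotype_diversity(["a", "a", "b", "c"]))
print(haplotype_diversity(["a", "b", "c", "d"]))
```

The statistic runs from 0 (every man carries the same haplotype) to 1 (every man carries a different one), and the sample-size correction makes populations with different numbers of samples roughly comparable.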

Maju, of course I can’t think that R was born in Italy. Probably some R (R1, R1b) arrived to Italy from somewhere, in the first expansion that happened from a center all around. But now, from the latest calculations of Klyosov re. the haplogroups’ nodes (see “Genealogy-dna”) and the observation of Margaliot (perhaps the first time I find something to agree with him) re. the possibility that the origin isn’t in hg. A but in CT/AB (then the origin wouldn’t be in Africa but in Middle East), perhaps our theories should take into consideration other scenarios than those taken till today. For mtDNA Italy not only has that (probably) mtDNA H from Paglicci (28,000YBP) but I have personally cases of an ancient presence in Italy of other very ancient haplogroups: the R0a of a relative of mine in Tuscany (also found from a paper on Etruscans), the U2d of an Italian-American asked me from Palanichamy to test for an FGS (unfortunately I wasn’t yet able so far to find someone for a sample)… Nobody, neither this last paper, doubts about the importance of the Cantabrian Refugium re. European mtDNA (H, but also U, and my K come from U8 found above all in the Basque country). Of course I agree completely with you that the times are older than those thought by Klyosov, Nordtvedt, Vizachero etc.

No need to count from the figure: all the data is in the supplemental file.

Sure but it's much more difficult to count. I've tried that and gave up.

One is that the haplotypes (and haplogroups) of NW Europe are a subset of SE Europe/SW Asia and not the other way around.

I agree with that... in what regards to R1b1b2 as a whole. My whole point is that I don't know why I need to care about R1b1b2 and completely ignore that there is a huge starlike haplogroup which is not R1b1b2 nor R1b1b2a but R1b1b2a1, whose greatest basal diversity is in Western or Central Europe.

If we are to consider R1b1b2, why not R1b1b or R1b1? Or R1 or R or P or...?

My whole point is that R1b1b2a is a distinct subclade that should be dealt with on its own regardless of higher level phylogenies. And also lower level phylogenies should be addressed if possible because they are also informative.

Of course the frequency of the derived alleles has surfed to higher frequency as the wave progressed...

There is no single wave: these are two distinct processes: R1b1b2 pre-R1b1b2a1 coalesced in Anatolia (or somewhere nearby) but R1b1b2a1 is a unique (and huge) explosion on its own right. Explosion that did not have its center in Turkey or the Balcans and that can't be explained in vague anti-archaeological terms of "Neolithic expansion" (which never existed as any single process anyhow but as several distinct ones, not reflected at all in the hypothesis).

The second, related, point is that haplotype diversity is higher in the east (especially southeast) than in the NW.

Not for R1b1b2a1. Only for upstream stages, which are not at the core of the demic explosion reflected in this STR phylogeny (and any phylogeny I could ever find).

Why don't you stop dribbling the bull and deal with the facts? Why do you keep skipping R1b1b2a1, as if it did not exist and were not the very star of this phenomenon on its own merits, and trying to distort the matter by only dealing with R1b1b2 as a whole? They are two related but clearly distinct matters.

Nobody, neither this last paper, doubts about the importance of the Cantabrian Refugium...

I am not interested in defending R1b1b2a1 or any other lineage as original from the Franco-Cantabrian region. In spite of being Basque, if our origins would be in Congo two centuries ago, I'd accept that fact with total normality. In fact I have been considering that Central Europe might have been more central... maybe. Or Italy or Turkey or India... or whatever, each one at the stage that corresponds if the data signifies it.

I do not have an agenda (unlike others, it seems). My only agenda is finding out the truth or getting as close to it as I can.

Just to be clear. And in fact I'm 25% Italian by ancestry, though I have nearly no cultural connection with The Boot in practical terms. As Basque and as European and as Human, I am interested in clarifying our origins but I have no particular preferences (and if I would have, I'd put them aside).

And don't forget, Maju, what I have said in my first posting, that Italian R1b1b2s (or R1b1b2a-s) are underrepresented: the Italian samples are from North East (Ladins, but probably the mixed Friulans and not the more conservative Rhaetians) and from North West. It lacks completely the rest of Italy, that has a 30% of its R1b1b2s as R1b1b2a (DYS393=12). But also with these haplotypes we have the slowest mutating marker among these (DYS388) in Italy with 15 value (two times), which demonstrates the greater antiquity also of these haplotypes over all those taken into consideration.

Glad of your quarter of nobility. I, like you, am searching only for truth, and if this truth puts my country in the center I don't mind, but I'd do nothing against truth. As Spinoza said: "Veritas se aperit" (or something similar). For me it is already enough that this paper says that 3 were the refugia: Spain, Italy, Balkans. Only a few years ago Italy didn't appear.

Sure, Gioello. This paper's data is not the last word. Not only Italy is undersampled, the Balcans are too (they sampled Bosnia and then did not include the data in the analysis, when Bosnia could in principle support or not their Neolithic hypothesis, as it was part of the core area of CP). But from what I've seen in other materials Italy still doesn't look central for R1b1b2a1.

Whatever the case, someone in the academic community should get serious about R1b and in particular R1b1b2a1 and do decent research taking into account both SNP phylogeny as basic and STR data as complement.

My whole point is that I don't know why I need to care about R1b1b2 and completely ignore that there is a huge starlike haplogroup which is not R1b1b2 nor R1b1b2a but R1b1b2a1, whose greatest basal diversity is in Western or Central Europe.

No need to ignore anything. Better, I think, would be to look at all the evidence before you with an unbiased eye.

The genetic evidence tells us that R1b1b2 spread into Europe from SW Asia. Forget for a moment the absolute "when". This was, by any measure, a very swift expansion. The TMRCA for R-M269 is no more than 1,000 years greater than the TMRCA for R-L11 and is maybe half that.

I understand your desire to focus on the most "European" clades of R1b1b2, but the timing and spread of this haplogroup prevent the treatment of one clade in isolation from the others. That is what makes the difference here versus R-M73 etc.

It is true that we see very little R-P312 or R-U106 in SW Asia, but that is exactly the point! That's what we'd expect under the scenario advanced by this paper. There is a wave, starting in the Near East and spreading towards NW Europe. At the beginning, there is very little R-L11 and no R-P312 or R-U106 at all. As the wave rolls across Europe, R-P312 and R-U106 are "born" (in the sense that the P312 and U106 mutations arose on the wave front) and their frequency is amplified as the wave progresses. In a very short span of time (dozens of generations, max) the frequency of P312 and U106 went from zero to, well, something much more than zero.

So when I talk about R1b1b2 I am not "ignoring" L11 or P312 or U106. Heck, I was one of the first to report some of those markers publicly. Rather, I am holding R1b1b2, R1b1b2a, R1b1b2a1, R1b1b2a1a, etc. all in my head at the same time.

There is no single wave: these are two distinct processes: R1b1b2 pre-R1b1b2a1 coalesced in Anatolia (or somewhere nearby) but R1b1b2a1 is a unique (and huge) explosion on its own right.

Well, R1b1b2(xL51) never really made much dent in Europe at all. So in that sense, I guess I agree that with regards to Europe that "R1b1b2a1 is a unique (and huge) explosion on its own right". But that won't change the story: R1b1b2 expanded rapidly into Europe from SW Asia, probably during or after the neolithic transition.

Maju, I'm curious how Bosnia, a country of a mere 4 million people in a largely mountainous region could support anything about the Neolithic. The Bosnian population seems to resemble mostly the Ukrainian one, and the people speak a S. Slavic language and are part of one of the youngest clades of the I2 haplogroup found in Russia today. I'm stumped how this has anything to do with 5-7 thousand years ago.

The genetic evidence tells us that R1b1b2 spread into Europe from SW Asia.

Agreed.

Forget for a moment the absolute "when".

Agreed. It should be the last thing we'd look at in fact.

This was, by any measure, a very swift expansion. The TMRCA for R-M269 is no more than 1,000 years greater than the TMRCA for R-L11 and is maybe half that.

So immediately after deciding to ignore the "when" you go for another hypothetical measure of time: the "how fast".

I don't really believe in the molecular clock (not at least as anything that is factual, just conjectural), so this argument has only very weak validity for me.

So I'll also ignore the "how fast" until we stumble into a clear signature of really fast expansion: a starlike structure. This starlike structure is not rooted in R1b1b2 nor R1b1b2a but in R1b1b2a1.

This point is crucial to the debate.

I understand your desire to focus on the most "European" clades of R1b1b2...

Don't get me wrong. I'd like to know all about everything genetic and prehistoric... everywhere.

My reason to change the focus is that the very structure of the haplogroup does not have a signature of dramatic expansion (a starlike structure), much less in Europe, at the R1b1b2 or R1b1b2a stage... but only at the R1b1b2a1 stage. Stage that clearly does not show any Anatolian nor Balcanic core, at least from what can be seen in this paper.

As there is no other Neolithic origin in Europe than the Balcans (or presumably Anatolia at a previous stage), this simple fact invalidates any "Neolithic" hypothesis for R1b1b2a1 and for R1b, R1b1, R1b1b, R1b1b2 and R1b1b2a too.

It is true that we see very little R-P312 or R-U106 in SW Asia, but that is exactly the point! That's what we'd expect under the scenario advanced by this paper.

No. This paper totally ignores these two haplogroups and in general any distinction of the structure downstream of R1b1b2-M269.

And that's its very failure: what makes it rather useless and confusing. Confused from the very starting point in fact: the authors seem to be more interested in "demonstrating" something than in exploring the complexity of the haplogroup. They totally miss the point.

There is a wave, starting in the Near East and spreading towards NW Europe.

No. There was a single founder effect. Nowhere in that structure a "wave" is apparent: there's no gradation of any sort between the last R1b1b2a node (the largest cake with a major Turkish participation - anyhow comparable to the NW European or Iberian ones: they are 1/3 each) and the first R1b1b2a1 node (the large central cake).

[You'll visualize better at the annotated diagram I posted at Leherensuge]

This founder effect was eventually followed by a rapid demic expansion, a starlike explosion of sorts that happened in Europe, somewhere in the Other-Iberian area (i.e. the Magdalenian cultural area). I can't discern exactly where from this paper's data or other materials either. Sadly enough this issue has not been explored sufficiently, but I tend to think it was either the Franco-Cantabrian region or Central Europe (Germany and surroundings).

None of these areas (Iberia, the Franco-Cantabrian region or the Rhine-Danube region) played any central role in Neolithic spread, much less one that would have affected the others. There is just no way, absolutely no way, that R1b1b2a1 can fit with Neolithic spread patterns.

There must be therefore another explanation. And all possible ones are pre-Neolithic: Aurignacian, Gravettian, Magdalenian or Tardenoisian/Geometric Epipaleolithic are the candidates. I have different reasons to favor and oppose each one of them, so I'm uncertain but what is clear is that it's not Neolithic.

This is consistent with the starlike expansion from a modal haplotype, oddly enough shared by so many phylogenetic nodes (i.e. absolutely central across R1b1b2a1 and downstream).

It would be just another reason to put the horses before the cart and analyze STR structure within and not without SNP one, which is both safer and probably more clarifying.

However many areas in Europe have never been tested for this structural organization. Basques for example have some of this and that low level subclade but most of the lineage is R1b1b2a1* as far as research can tell (mostly it only says "R1b" or "R1b1" but we infer from haplotypes it must be R1b1b1a or R1b1b1a1 or something not yet described downstream of these nodes). Same for all SW Europe (France, Iberia, etc.) that remains outrageously understudied for the structure of R1b (as for I too - too much I* in all studies so far for me to be happy).

So when I talk about R1b1b2 I am not "ignoring" L11 or P312 or U106.

You do. Maybe not intentionally but you are ignoring the most clear structure we have: the one based on SNPs.

I guess I agree that with regards to Europe that "R1b1b2a1 is a unique (and huge) explosion on its own right".

Good. :)

But that won't change the story...

It changes all. The center of the demic explosion is only at that node, not upstream. And it happened somewhere between the Oder and Gibraltar, not at the Aegean.

Your obsession with MC conjectures only confuses you and whoever you manage to persuade. Look at the hard data first and foremost (structure, geography, archaeology). Only when you understand that and all its "timeless" implications you should deal with more speculative matters as age estimates.

So immediately after deciding to ignore the "when" you go for another hypothetical measure of time: the "how fast".

I don't really believe in the molecular clock (not at least as anything that is factual, just conjectural), so this argument has only very weak validity for me.

Surely you can grasp the difference between absolute and relative. And your inability to understand the concept of a molecular clock may be charming to some, but not to me.

So I'll also ignore the "how fast" until we stumble into a clear signature of really fast expansion: a starlike structure. This starlike structure is not rooted in R1b1b2 nor R1b1b2a but in R1b1b2a1.

My reason to change the focus is that the very structure of the haplogroup does not have a signature of dramatic expansion (a starlike structure), much less in Europe, at the R1b1b2 or R1b1b2a stage... but only at the R1b1b2a1 stage. Stage that clearly does not show any Anatolian nor Balcanic core, at least from what can be seen in this paper.

It is all the same story. A man born in Anatolia moves to the Balkans and has a son. The Balkan man moves to Hungary and has a son. The Hungarian man has twelve sons, who move to France and England and Germany and Italy and Spain and .... Each of those sons have twelve sons, who stay in their homeland and prosper. The story of the family does not begin and end with the twelve far-flung men. You can see that, surely.

As there is no other Neolithic origin in Europe than the Balcans (or presumably Anatolia at a previous stage), this simple fact invalidates any "Neolithic" hypothesis for R1b1b2a1 and for R1b, R1b1, R1b1b, R1b1b2 and R1b1b2a too.

No. This paper totally ignores these two haplogroups and in general any distinction of the structure downstream of R1b1b2-M269.

This paper is not the history of the world. And the paper's hypothesis results in predictions re: U106 and P312 which we (being men, not sheep) can check against data from other sources. Lo and behold, the hypothesis is consistent with the extra data too. That's science.

No. There was a single founder effect. Nowhere in that structure a "wave" is apparent:

I think you just don't want there to be waves, just like you don't want there to be molecular clocks.

Maju, I'm curious how Bosnia, a country of a mere 4 million people in a largely mountainous region could support anything about the Neolithic.

Simple: Cardium pottery originated in a region very similar in shape to the Roman province of Dalmatia or the pre-Roman country of Illyria. This core area included most of modern Bosnia-Herzegovina, all Dalmatia, all Montenegro and coastal Albania (highland Albania was rather in the Sesklo-related Neolithic cultural group of the rest of the Balcans).

Among all sampled areas only BiH belonged to this core area of the Cardium Pottery culture, that spread Neolithic through Mediterranean Europe. But somehow it was excluded from the analysis, along with other regions (southern Basque Country and others arguably less central).

This area seems most important in the spread of I2a and probably also in the Mediterranean spread of E-V13 and other "Neolithic" lineages like J2b, however I have no reason to think it played any important role in the spread of R1b.

... the people speak a S. Slavic language...

Languages come and go. They say "gore" for up/upwards, just as Basques say "gora" (etymologically 100% Basque: goi-ra: 'to high'). There are rivers over there that have the same names as rivers in Iberia (Ibar, Hevrus, compare to Iberus/Ebro and the related Basque words "ibai", river, and "ibar", river bank). If Basque language has a Neolithic origin (maybe totally unrelated to any genetic) it's a trail worth exploring. The opposite (a west to east flow) can also be true. And of course I may be totally wrong in my linguistic suspicions... but maybe not.

Languages are like clothes, genes are like body parts. You can change your clothes, sometimes you must change them, but normally not your liver or your heart.

"This area seems most important in the spread of I2a and probably also in the Mediterranean spread of E-V13 and other "Neolithic" lineages like J2b, however I have no reason to think it played any important role in the spread of R1b."

Is there any reason to suspect that Neolithic spread into Europe by a single source? I have seen other academics suggest at least two fronts. Perhaps this might answer the riddle.

The relationship between Cardium pottery and the coastal regions of Southern Europe has been noted before in the past. It would be nice to see deeper analysis in these regions.

With respect to I2a2 in the Balkans, some might have a problem with you suggesting it is 5-7 thousand years old. It appears to come from W. Russia and is one of the youngest clades of I2. Nearly the only one found in S.E Europe. Its presence in Turkey doesn't seem to extend beyond Istanbul from earlier studies that are readily available.

It's plain as day that R1b1b2 was already in SE Europe and Anatolia from an early point, irrespective of any back migration which although is historically attested, seems to have only left marginal trace in the grand scheme of things.

I see your suggestion that R1b1b2/R1b1b2a being a pocket in SE Europe and the R1b1b2a1 being the Western branch. However, I don't believe modern science would support a Magdalenian 10-18 thousand year separation point of these two groups. If we're going to use that basis, let's be consistent here and use it for all other groups. The bottom line is...it doesn't add up.

... your inability to understand the concept of a molecular clock may be charming to some, but not to me.

I can imagine it bothers you but it's not my problem.

I understand well the molecular clock conjecture and that's the reason why I don't trust it. Very specially I don't trust it as anything factual. It is not C14 and I doubt it will ever be anything of the like.

It is a fashion and as such will be put in its place in due time.

A man born in Anatolia moves to the Balkans and has a son. The Balkan man moves to Hungary and has a son.

What we see here is, metaphorically, a man born in Anatolia moves to West Europe (the Oder-to-Gibraltar area for our discussion) directly and has a son. This son has many many sons, and these have more many sons, etc. The explosion only happens once in West Europe, not in Anatolia nor the Balcans. And is very much uniform for all West Europe.

And there is no Neolithic culture with that extension. Nothing at all.

Your disdain or ignorance of factual (archaeological) prehistory is certainly not amusing to me. There are facts (archaeology, SNP phylogeny) and there are speculations (molecular clock, etc.) You can speculate but you need to have a grasp of the hard facts, and take them in account first and foremost, in order for your speculation to make any sense.

You need a reality check, seriously.

And the paper's hypothesis results in predictions re: U106 and P312 which we (being men, not sheep) can check against data from other sources.

I must have missed something. Can you explain yourself?

But revising the paper I noticed that they even dare to claim that E-M78, most of which has a North African spread, is also a Neolithic product and not Capsian, when in North Africa Capsian culture clearly evolves into Neolithic with minimal outside influences (some Cardium in a few coastal spots and little more) and when this haplogroup has a clear NE African origin and not West Asian.

It's crazy! Makes me feel like re-reading Erasmus' "The Praise of Folly" just to laugh at all this nonsense.

I think you just don't want there to be waves...

There is no evidence for your wave. The expansion begins afterwards, not in that conjectural wave.

You just believe too many things that are not factual.

You (and others, as this paper sadly evidences) need a serious reality check. Facts first, models and conjectures only after the hard facts.

Is there any reason to suspect that Neolithic spread into Europe by a single source? I have seen other academics suggest at least two fronts.

At least two fronts. Absolutely.

They might have an unclear common origin at Thessaly, where some very early Cardium Pottery is found along Sesklo cultural items in some locations but that would be their last common origin. They are clearly split at the mainland Balcans, with most being Sesklo-derived (or related, not everybody thinks "Sesklo first" strictly speaking) and the "Illyrian" country being Cardium instead. Since that point there are clearly two fronts.

Another issue are the Neolithic cultures of the Atlantic, mostly unrelated to either wavefront.

The relationship between Cardium pottery and the coastal regions of Southern Europe has been noted before in the past. It would be nice to see deeper analysis in these regions

Indeed. CP anyhow, unlike the Balcano-Danubian wavefront, does not appear to be just a mere colonizing force but it has some localities that are clearly colonized and many others that show clear continuity with local Epipaleolithic traditions and very specially toolkits. This is almost 100% true of SE France and also of very large chunks of Mediterranean Iberia and peninsular Italy. Secondary colonizations are also possible, for example Corsica and Sardinia were clearly colonized from Central Italy.

With respect to I2a2 in the Balkans, some might have a problem with you suggesting it is 5-7 thousand years old. It appears to come from W. Russia and is one of the youngest clades of I2.

I'm not suggesting that. I'm suggesting that possibly was at the origin of the expansion of I2a westward through the Mediterranean. For what I know it could have been there since the Big Bang, so to say.

Are you saying that I2a is younger in the Balcans than Neolithic? Why? I can perfectly accept that I as a whole has an Eastern European origin (I have flirted with that idea all the time but never found enough support) but I seriously doubt it's post-Neolithic in the Western Balcans. Notice that there are derived clades in the Western Mediterranean, particularly in Sardinia, which seem very old and that could hardly have arrived there after Neolithic or Chalcolithic at the latest.

It's plain as day that R1b1b2 was already in SE Europe and Anatolia from an early point, irrespective of any back migration which although is historically attested, seems to have only left marginal trace in the grand scheme of things.

Yes. I mean back-migration of the R1b1b2a1 found in Anatolia, which is quite clearly derived from European clades, not ancestral.

R1b1b2(xR1b1b2a1) [i.e. most Turkish R1b but very little European R1b] is clearly Anatolian. I have no doubt about that. I have serious doubts about the timeframes that some want to promote and that make no sense whatsoever.

However, I don't believe modern science would support a Magdalenian 10-18 thousand year separation point of these two groups.

Science has not advanced enough as to make clear statements in this matter of haplogroup ages (and I doubt it will ever be able to say anything with any certainty - barring accurate aDNA testing maybe). And in fact most papers published to date have claimed that, logically, or have remained wisely silent.

What we see here is, metaphorically, a man born in Anatolia moves to West Europe (the Oder-to-Gibraltar area for our discussion) directly and has a son. This son has many many sons, and these have more many sons, etc. The explosion only happens once in West Europe, not in Anatolia nor the Balcans. And is very much uniform for all West Europe.

You have a tendency to ramble on about irrelevant details, but when you say something concrete (as this) it is wrong.

The variance data are quite clear in revealing that the diversity cline does not run west to east (as you claim) but rather east to west.

R1b1b2 did not spread from western Europe into central Europe, but rather the opposite.

Once researchers are able to disentangle Greek/Phrygian/Roman/Galatian colonization of Anatolia, Central Asian Turkish movements into Anatolia, not to mention resettlements of Balkan Muslims into Anatolia from any pre-Phrygian "Neolithic" component, I will take arguments about the supposed "diversity" of R1b1b2 in Anatolia seriously.

My guess is that the great diversity of R1b1b2 in Anatolia (but not statistically higher than in Europe!) is largely due to the fact that the peninsula was settled by at least 5 different IE stocks and 1 Altaic one.

The Italian samples are actually from the Ladin region of Alto Adige/Sud Tyrol, but probably from the mixed region which had a German colonization. I have no time now, but this afternoon I'll compare these data with those of Pichler et alii. Anyway previous studies gave to the most conservative Ladin zone an age of more than 11,000ya, like only a similar ancientness in the most conservative regions of the Caucasus. I agree completely of course with Dienekes. If these two R1b1b2 (European and Turkish) are related will be demonstrated I think from the Rozen's SNPs.

The variance data are quite clear in revealing that the diversity cline does not run west to east (as you claim) but rather east to west.

Because they are measuring the wrong set. Someone please do the same study (or hopefully a better one) ONLY with R1b1b2a1 and set things straight.

R1b1b2 did not spread from western Europe into central Europe, but rather the opposite.

If you mean R1b1b2a1, I may agree. However the research so far has not been sufficiently conclusive and we have to rely on mere STR data.

It's even possible that R1b1b2a1a1 and R1b1b2a1a2 represent two different, yet related spreads. All this structure has gone unresearched, at least in this paper, which so outrageously ignores the phylogenetic structure of the haplogroup.

But in any case there is absolutely no Neolithic pattern that could explain such Central to SW Europe migration. I can think of Epipaleolithic migrations, I can think of the dark origins of Magdalenian (coalescing in Aquitaine but seemingly derived from NW European Aurignacian remnants) and I can think of the two oldest cultural waves of Paleolithic Europe: Aurignacian and Gravettian (but these also affected Eastern Europe and Italy, unlike Magdalenian and epi-Magdalenian ethno-cultural flows).

But I can't think of anything Neolithic that fits that pattern: neither from Central Europe to SW Europe nor vice versa (unless it's Megalithic cultural complex but that would mean a core in Portugal and, secondarily, Brittany).

Dienekes: I don't necessarily agree with your historicist/recentist interpretation but certainly I agree that there is some clear back-migration into Anatolia from Europe.

All the rest is local (it's upstream in the SNP phylogeny) and means local processes that, sure, are hard to disentangle.

The cline is weak and worthless statistically given that the data points are not exact measurements but estimates with hefty confidence intervals around them that overlap in most cases.

Also.
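The fragility of a cline fitted to a handful of points can be checked by dropping each point in turn and recomputing the correlation. A minimal sketch, assuming one (distance, variance) pair per population; the numbers below are made up purely for illustration and are not the paper's data, and `pearson_r` and `leave_one_out_r` are hypothetical helper names:

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 points")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the sequences is constant")
    return sxy / sqrt(sxx * syy)


def leave_one_out_r(xs, ys):
    """Correlation recomputed with each single point removed in turn."""
    xs, ys = list(xs), list(ys)
    return [
        pearson_r(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]


# Made-up illustration: four clustered points plus one distant outlier.
distance = [0, 1, 2, 3, 10]
variance = [1, 2, 1, 2, 5]

full_r = pearson_r(distance, variance)
loo_r = leave_one_out_r(distance, variance)
print(round(full_r, 2), [round(r, 2) for r in loo_r])
```

If the correlation collapses when a single population is removed, the apparent cline rests on that one point rather than on a trend shared by all the samples.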

But the main problem is what are they measuring. They arbitrarily decided to ignore the SNP structure (too much work?, inconvenient for their pre-determined goal?) and considered the haplogroup as an amorphous clade. This is a major fault of this research and that's why I posted their, otherwise interesting, STR phylogeny not as such but with a much needed note on what is R1b1b2a1 (where the real action is) and what is the Anatolian upstream local branch.

...

Gioello:

Anyway previous studies gave to the most conservative Ladin zone an age of more than 11,000ya...

That's the difference between the pedigree school and the till recently much more commonly accepted evolutionary one.

Another arbitrary decision the authors of this study make is to pick that rate even though there is quite clear evidence that it fails before 5000 BP.

Not that I think that the MCH can say too much nor much less have the last word in these matters in any case, but just for the record.

The skeletal remains of these people stretched across mainland Greece, the islands, and extended as far as Illyria, the Balkans, Belgium, Switzerland, Denmark, and Anatolia. Evidence of their survival into the Bronze Age civilization of Crete is evident: Early Minoan I bones from a rock shelter at Hagios Nikolaos (24 women) (are) described as being of pygmy dimensions [Bushman]. ...

Maju, the problem is that posed by Dienekes, the difficulty to discern what is ancient and what is recent. Among the Ladins we have very recent haplotypes of German origins and very ancient from the first Rhaetians. If I take this haplotype:

DYS19, DYS388, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS439

14 15 13 16 24 11 13 12 21(14)

Do you believe to the path proposed by Klyosov?

14, 12→13→14→15, 13, 16, 10→11, 13, 12, 19(12)→20(13)→21(14)

Or do you believe to me?:

14→13→14→15→14, 12→13→14→15, 13→12→13→14→13, and so on.

"I'm not suggesting that. I'm suggesting that possibly was at the origin of the expansion of I2a westward through the Mediterranean. For what I know it could have been there since the Big Bang, so to say. "

- I was referring specifically to I2a2-Balkan/Russia, not I2-M26 found in Sardinia and Iberia/France..etc

"Another issue are the Neolithic cultures of the Atlantic, mostly unrelated to either wavefront."

How can you be sure? They are still "Neolithic". The argument of the article states M269+ which was spread like a wave front with agriculture as the catalyst. The data supports this, as do the TMRCA estimates.

I was referring specifically to I2a2-Balkan/Russia, not I2-M26 found in Sardinia and Iberia/France..etc.

I will need good documentation to believe you in this, Aaron, for example check this older post here at Dienekes, where it's obvious that half of Pyrenean and generally Iberian I is I2a2-M423: not the Sardinian clade but the West Balcanic one. In other studies only I in general is tested for and in the classical paper of Semino, a good deal of Southwesterner I is other clades:

- I2b1-M223: among French and Portuguese
- I1 and I* among French and Spanish
- I2a*-P37 among Portuguese

The only not reported lineage in the area, of the ones tested by Semino, is I1b-227, which is concentrated in Eastern Europe and the Balcans seemingly.

The structure of haplogroup I in SW Europe probably needs some clarification and research, now that the haplogroup is better understood, but is not in any case one of a single clade but a complex scenario.

"Another issue are the Neolithic cultures of the Atlantic, mostly unrelated to either wavefront."

How can you be sure? They are still "Neolithic".

Precisely: Neolithic is an almost meaningless tag. Africans are "Neolithic" (mostly) but they are not West Asians. Indians are "Neolithic" but they are not West Asians either.

Of course Neolithic had some minor demic impact in those areas (J2b in India for example) but you don't see mass population replacements.

In Europe we have to at least take in account the archaeology and this one tells us that, while maybe influenced at some level, Danish Neolithic for example (a well studied case) is not Danubian Neolithic by any means but displays continuity from the Paleolithic substratum. Each regional case in the Atlantic is unique and should be carefully analyzed but when even Cardium Pottery is clearly in most cases a techno-cultural continuity with pots, goats and lentils, the case for demic replacement in the Atlantic is even less clear.

The argument of the article states M269+ which was spread like a wave front with agriculture as the catalyst.

That's what they claim. But they are ignoring totally:

1. The dual quality of the wave (continental and coastal), each belonging to a totally different culture.

2. The Balcanic coalescence of the European Neolithic.

3. The unique and highly diverse nature of the Atlantic Neolithics.

There is nothing in fig. 3, which summarizes the data well (though it should have some haplogroup annotations for greater clarity), that suggests that dual wavefront pattern nor the central Balcanic role.

The data does not resemble at all anything that could be in agreement with what we know of Neolithic Europe.

The data supports this, as do the TMRCA estimates.

The data does not support that. The data says that there was some single large demic explosion somewhere in Western/Central Europe at some time in the past: there is that big cake at the center and is not something happening in Turkey nor the Balcans.

There is nothing in the data that supports the conclusions. And the MC exercises are nothing but more or less elegant speculations that can prove nothing on their own, much less when they insist in treating R1b1b2 as an unstructured amorphous haplogroup and not one that has a very defined structure, critically between the West Asian R1b1b2* clade and the European R1b1b2a1 one.

"It's even possible that R1b1b2a1a1 and R1b1b2a1a2 represent two different, yet related spreads".

I realise you're firmly convinced that local ecology has played no role whatsoever in our evolution. And that 'modern' humans have been able to swiftly adapt to any environment that opens up to them. But the distribution of the two haplogroup branches suggests that U106 evolved somewhere on the North European Plain, probably in Northern France or in Germany (or further east). P312 evolved somewhere in the hilly country to the south and eventually spread through Alpine Germany, Switzerland, the Pyrenees, and out along the Atlantic coast from Spain to Ireland. But I don't think this Spain/Ireland connection owes anything to boating. To me it seems to date to before the rise in sea level at the end of the last Ice Age.

"But in any case there is absolutely no Neolithic pattern that could explain such Central to SW Europe migration".

As you wrote previously, 'the Neolithic cultures of the Atlantic, mostly unrelated to either wavefront'. The first farmers to Ireland and Scotland were not actually either Cardial or Danubian. They probably date to before the English Channel first formed. Y-hap R could well be Neolithic in Europe, but we're looking for an earlier Neolithic than the Cardial or Danubian.

Evidently 95% of men on the west coast of Ireland with native surnames belong to Y-hap R1b1b2a1a2f (L21). To me this indicates that Y-hap R is the oldest surviving Irish Y-chromosome haplogroup. In earlier times it was presumably more widespread, but has been replaced further east by later migrations, including that of R1b1b2a1a1 across the English Channel (by boat).

I'll concede that in Ireland Y-hap R may have drifted out earlier Y-haps, perhaps even some of the remaining west coast 5%. But to me it seems more likely that these other Y-haps, such as I, are later arrivals.

Although technology and genes are not necessarily closely connected I suspect that Y-hap I is associated with Danubian. This puts J as a candidate for the Cardial Pottery expansion. And then Y-hap E and R-V88 joined in the fun in the Mediterranean. But I still think it was Y-hap T who had introduced the requisite boating technology to the Mediterranean.

Maju: With respect to I2a2, just a quick look at the FTDNA I2a project and you will see a very large gap between West European I2-M423, and east European I2-M423. The same gap cannot be seen between R1b1b2a and R1b1b2a1. If you have no faith in the science, at least you should substantiate why this would be so. The spread of agriculture, not necessarily its inception explains this nicely.

If all that existed in Europe before agriculture were hunter gatherers, then all groups should have an equal chance of survival. What are the odds that hunter-gatherers would become dominant (by pretty large margins I might add) in every single pocket they inhabited in Europe, and the introducers of agriculture would not leave a single trace...?

... that 'modern' humans have been able to swiftly adapt to any environment that opens up to them.

Not necessarily. I understand that cold countries are clearly hostile to our fur-less species, so well adapted to the tropics.

But the distribution of the two haplogroup branches suggests that U106 evolved somewhere on the North European Plain, probably in Northern France or in Germany (or further east).

Maybe. So?

Anyhow I'm still in wait for a good comprehensive study of the structure and distribution of R1b1b2a1. As I have said before, SW Europe and in particular France is under-researched. We need more comprehensive research and less speculations.

But I don't think this Spain/Ireland connection owes anything to boating.

Surely it's because they used helicopters... meh!

To me it seems to date to before the rise in sea level at the end of the last Ice Age.

Not likely because Ireland was totally under an ice sheet back then. Anyhow boats were known and journeys were short, so this should be no problem to anybody (except you of course).

The first farmers to Ireland and Scotland were not actually either Cardial or Danubian. They probably date to before the English Channel first formed.

This is confusing. Even if the first settlers of Great Britain (Ireland is still another sea away) reached by land (what about the huge river that was there?) they did not know of agriculture yet surely. Their descendants may have been the first farmers... or not... or partly yes and partly not.

Anyhow Ireland is not the center of the universe.

Y-hap R could well be Neolithic in Europe, but we're looking for an earlier Neolithic than the Cardial or Danubian.

Magdalenian Neolithic? Uh-oh!

I can take (and have even argued in favor of that possibility at times) that Magdalenian peoples might have domesticated horses, they certainly had dogs... but nothing, absolutely nothing, points to farming or animal husbandry other than that.

If you know of any evidence, tell me because that would be really interesting.

Otherwise the archaeological state of the art says that Balcano-Danubian and Cardial were, together with Dniepr-Don in East Europe, the first agricultural phenomenons in the continent.

Although technology and genes are not necessarily closely connected I suspect that Y-hap I is associated with Danubian.

Really? And how come it's most common in non-Danubian areas like Scandinavia or the West Balcans and most diverse in non-Danubian areas like East Europe (it seems)?

I as a whole (and I2 as well) is too old by all accounts to be Neolithic, regardless that Neolithic peoples, Indoeuropeans or whatever other flow also moved the super-lineage around.

This puts J as a candidate for the Cardial Pottery expansion.

IMO, E-V13, I2a and maybe other clades like T and G, as well some sublineages of R1b1b2*/R1b1b2a* also participated in the phenomenon. But they do not seem to have caused strong founder effects with specific sublineages but in very specific places (E-V13 in Greece and Albania, I2a1 in Sardinia and maybe others harder to discern).

Even a region that was clearly colonized only in Neolithic times such as Galicia and North Portugal (an excellent control case), shows a variegated array of haplogroups, not any single founder effect.

Maju: With respect to I2a2, just a quick look at the FTDNA I2a project...

Link please. You should not assume that everybody is acquainted with your favorite commercial testing company.

Whatever the case, those databases are not too good for just "a quick look" though they do help indeed to people willing to work hard in deciphering the haplotype structure and making sense out of it.

However, they also have a very strong sample bias in favor of NW Europe, notably English-speaking countries, what makes them less useful than more balanced research materials normally.

... you will see a very large gap between West European I2-M423, and east European I2-M423.

I'll check that but per Semino'04, I2a(xI2a1), which should be the same, this lineage is found in Italy (though not Germany nor Switzerland), suggestive that the CP origin at the West Balcans that I suggested above makes some good sense.

The same gap cannot be seen between R1b1b2a and R1b1b2a1.

Not that same gap obviously because R1b in general is very low in East Europe. In fact the two haplogroups compare very badly and must have spread largely through distinct processes probably.

But there is a massive gap between R1b1b2a1 and R1b1b2(xR1b1b2a1). The first is almost only found in West Europe (incl. Germany and Denmark), where it is massively dominant, while the latter is almost only found in West Asia and the Balcans, being clearly dominant in Anatolia as well. The distinction is so clear that ignoring it is abhorrent and can only produce confusion.

If you have no faith in the science...

I don't have faith in anything. I trust Science and I strongly favor the scientific method (in which doubt and criticism play a central role). I certainly cannot have "faith" or rather trust some so-called scientists.

Just having a university title doesn't make you automatically closer to wisdom. They are not priests that have to be believed blindly, just humans like you and me with some expertise that they can make productive or fail to.

If all that existed in Europe before agriculture were hunter gatherers, then all groups should have an equal chance of survival. What are the odds that hunter-gatherers would become dominant (by pretty large margins I might add) in every single pocket they inhabited in Europe, and the introducers of agriculture would not leave a single trace...?.

Neolithic peoples left clear marks. About 10% of Spanish autosomal DNA seems of East Mediterranean origin, and you can also see similar traces in Y-DNA and mtDNA.

But from 10% to 100% there is a big difference.

Per archaeology, the chances that pre-Neolithic natives transformed into agriculturalists with Cardium Pottery is very high. Most sites show clear continuity, with only some pockets looking as true colonies. The case in Central Europe is different and more difficult to understand but you can't happily extrapolate Danubian specifics to all the rest of Europe, much less when the Danubian area and the R1b1b2a1 area only overlap somewhat (continental Germanics, North French...)

This discussion reminds me of the association of the spread of agriculture from Anatolia with Indo-European. It suited the proponents fine as long as the dating was ~5,000 to 6,000 years ago, or so. At before 8,000 years ago - not so much. Especially, if then you come awfully close to the second wave of post LGM (after the interstadial)...

"My MC says 5,000 years ago, or so - what do we have, there ...?"

>> Great, I heard NW Europe first received agriculture, then. <<

"Case closed!"

Never mind that agriculture entered Europe more than 8,000 years ago, and then spread via two completely different routes and cultures.

As to I2, I would like to see some references of what people are swinging around, here. To me, it looks like I2 is closely correlated to R1a, in Europe. Except, it appears at the fringe of the other, and at the boundary to other haplogroup's dominance. A characteristic of something that once was more established and spread much earlier... and thus point to before agriculturalists.

As to R1b[something] diversification in Anatolia: there are many documented migrations and settlements originating from Central Europe and the Balkans. Anatolia has been at the cross roads for 50 millennia - albeit always with a relatively small population, due to limited resources and poor climate. Not necessarily a spot I would trust to evaluate today to gain wisdom.

I've finished estimating ht15 and ht35 diversity, it's taken me 2 days to do this, and I think the results are amazing. The diversity clines of ht15 and ht35 are almost polar opposites. I think these results seriously call into question the conclusion of the Balaresque study, which is what prompted me to look into this. There's no surfing-on-an-expanding-wave phenomenon occurring with ht15. It has its lowest variance in the supposed origination point: Anatolia.
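For readers who want to reproduce this kind of comparison, one common way to summarize microsatellite diversity is the variance of repeat counts at each STR locus, averaged over loci. The sketch below assumes that estimator (the comment does not say exactly which one was used) and uses made-up haplotypes, not real samples; `mean_str_variance` is a hypothetical helper name:

```python
from statistics import pvariance


def mean_str_variance(haplotypes):
    """Average, over loci, of the variance of repeat counts across samples.

    `haplotypes` is a list of equal-length tuples, one repeat count per locus.
    """
    if not haplotypes:
        raise ValueError("need at least one haplotype")
    n_loci = len(haplotypes[0])
    if any(len(h) != n_loci for h in haplotypes):
        raise ValueError("all haplotypes must cover the same loci")
    per_locus = [pvariance([h[i] for h in haplotypes]) for i in range(n_loci)]
    return sum(per_locus) / n_loci


# Made-up four-locus haplotypes for two hypothetical regions.
region_a = [(14, 24, 11, 13), (14, 24, 11, 13), (14, 25, 11, 13), (14, 24, 10, 13)]
region_b = [(14, 24, 11, 13), (15, 23, 11, 12), (14, 25, 10, 13), (13, 24, 12, 14)]

print(mean_str_variance(region_a), mean_str_variance(region_b))
```

Computing this separately for ht15 and ht35 samples in each region, rather than for R1b1b2 as a whole, is what allows the two clines to be compared.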

Instead, as Maju pointed out brilliantly, the study's tree diagram of their R1b1b2 haplotypes seems the result of 2 separate events, not a single wave diffusion. And he was absolutely right.

The diversity of ht35 doesn't form a gradually decreasing cline. It seems to be uniformly similar from Iran to west Iberia, or at least up to Italy, because there are issues with the validity of the North African and Iberian data (small sample size in one case and confusion with ht15 samples, in the other). Its cline seems to be more north-south than diagonally from southwest to northeast. East European countries have the same ht35 diversity as West Europe, with the special consideration of the west Iberian results.

Some technical details to keep in mind. Ht15 can be differentiated from ht35 by barely 2 markers: 393 and 461. Few studies test 461, so most of the samples I used were chosen on the basis of 393 alone. But about 3% of ht15 and 10% of ht35 have the "wrong" value, becoming confused with the other group. This usually doesn't matter, except in countries where there is an overwhelming ratio difference between the frequencies of both groups, such as in Iberia, France, Britain, Netherlands, Anatolia, and the Levant. In these extreme cases, I've included, where possible, 2 pairs of results. The top pair uses samples predicted as narrowly as possible, by using both 393 and 461. The bottom pair is the standard prediction using just 393. The top pair should be more accurate, but they tend to lack in sample size, so then again, maybe not. Notice in the case of Iberia, that the less restrictive result changes drastically from the more restrictive result, and results in identical values to Iberia's ht15 diversity estimate, suggesting most of the samples are in fact ht15 samples that are being confused for ht35 because they had a mutation in 393 to the modal value of ht35 on that marker. Curiously, this didn't happen in France, where I was only able to use the less accurate method (393 alone), and yet the result is notably low and different from France's ht15 diversity. I'd seriously take North Africa's high ht35 result (0,30) with a military-issue teaspoon of salt, it's just 5 samples. On the other hand, its notoriously high ht15 diversity (0,28) is pretty solid.
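The reason the confusion only matters where one group overwhelms the other can be shown with a little arithmetic. A rough sketch using the 3% and 10% error rates quoted above; the 95:5 and 50:50 ratios are assumptions chosen for illustration, not measured frequencies, and `predicted_ht35_contamination` is a hypothetical helper name:

```python
def predicted_ht35_contamination(ht15_share, ht15_error=0.03, ht35_error=0.10):
    """Fraction of samples predicted as ht35 (by 393 alone) that are really ht15.

    ht15_share: true share of ht15 among the R1b1b2 samples of a region.
    ht15_error: share of ht15 carrying the ht35-like value at 393.
    ht35_error: share of ht35 carrying the ht15-like value at 393.
    """
    if not 0.0 <= ht15_share <= 1.0:
        raise ValueError("ht15_share must be between 0 and 1")
    ht35_share = 1.0 - ht15_share
    false_ht35 = ht15_share * ht15_error          # ht15 mistaken for ht35
    true_ht35 = ht35_share * (1.0 - ht35_error)   # ht35 correctly predicted
    total = false_ht35 + true_ht35
    if total == 0:
        raise ValueError("no samples would be predicted as ht35")
    return false_ht35 / total


# Assumed ratios, for illustration only.
print(round(predicted_ht35_contamination(0.95), 2))  # ht15-dominated region
print(round(predicted_ht35_contamination(0.50), 2))  # evenly mixed region
```

Under the assumed 95:5 ratio, well over a third of the "ht35" predictions would actually be ht15, whereas with an even mix the contamination is only a few percent. That is consistent with the suspicion that the less restrictive Iberian ht35 result is mostly measuring ht15.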

To recap, Balaresque and all geneticists are stuck in a time warp, they're back in 2003, thinking R1b is just R1b. What a waste, after going through all the effort of collecting and processing the samples, to not have had the sense to test for a few extra key mutations that define some major subdivisions of R1b1b2 and are well known for more than 5 years. The conclusions they reached would then have been very different.

"Anyhow I'm still in wait for a good comprehensive study of the structure and distribution of R1b1b2a1".

This is a good start. Some kind person has compiled a diagram of R1b1 subclades and even notated where each is found:

http://en.wikipedia.org/wiki/Haplogroup_R1b_(Y-DNA)

"Surely it's because they used helicopters... meh!"

Surely you're aware that the English Channel didn't exist until around 10000 years ago, after humans had reached Britain. So, if, 'Not likely because Ireland was totally under an ice sheet back then' how come they could live there? Not Ireland, admittedly, but much the same problem:

http://en.wikipedia.org/wiki/Star_Carr

Quote, 'It belongs to the early Mesolithic and was occupied from around 8770 BC until about 8460 BC'. Close enough to 10,000 years.

"Even if the first settlers of Great Britain (Ireland is still another sea away) reached by land (what about the huge river that was there?) they did not know of agriculture yet surely".

It seems you are correct in this. The first settlers as the ice retreated were not farmers.

"And how come it's most common in non-Danubian areas like Scandinavia or the West Balkans and most diverse in non-Danubian areas like East Europe (it seems)?"

It obviously existed somewhere before the Danubian, and the Y-chromosome is quite capable of spreading beyond any expansion of farming. It's quite possible it's Gravettian of course.

Argiedude: thanks again for excellent work! Logically it must have taken you some hard work to tabulate all those haplotypes, so very special thanks, really. :)

I insist you should have your own blog (and Eurologist too). If only you published a map like this every 6 months or each year, it'd be still very much worth it.

With your implicit permission I'll borrow and post this one at Leherensuge, copying the observations you make here. With due attribution, of course. If you have any problem with this, just say so and I'll delete.

The diversity of ht35 doesn't form a gradually decreasing cline. It seems to be uniformly similar from Iran to west Iberia, or at least up to Italy, because there are issues with the validity of the North African and Iberian data (small sample size in one case and confusion with ht15 samples, in the other). Its cline seems to be more north-south than diagonally from southwest to northeast.

The Galician case of high ht35 diversity should be considered, IMO, as probably a product of Neolithic colonization, because this particular region of Iberia shows no signs of H. sapiens presence until the Neolithic (except for some small border areas in the Epipaleolithic already).

Guess this Neolithic origin could also be argued for Italy but The Boot was already inhabited in the UP, so the case is much less clear.

Only careful dissection of that diversity can give us some better clues, I imagine.

Some technical details to keep in mind. Ht15 can be differentiated from ht35 by barely 2 markers: 393 and 461.

I've been recently chewing on this and on the fact that the modal haplotype seems to be shared by R1b1b2a1 and derived R1b1b2a1a (if not further downstream haplogroups). My provisional thought is that STR mutation rates may be misleading because they would be heavily delayed in practice by mere drift if the population was small: the chance that any mutation at any single locus (as STRs are) survives and becomes more or less fixed in such a scenario is really tiny; this would be less true for mutations along the whole length of the Y chromosome (as SNPs are), because the sheer size of the DNA chain alone would counter the highly adverse odds for each locus. Of course, studying the whole Y chromosome or even a representative fraction of it is not really viable.

In these extreme cases, I've included, where possible, 2 pairs of results. The top pair uses samples predicted as narrowly as possible, by using both 393 and 461.

A very interesting annotation, thanks.

Notice in the case of Iberia, that the less restrictive result changes drastically from the more restrictive result, and results in identical values to Iberia's ht15 diversity estimate, suggesting most of the samples are in fact ht15 samples that are being confused for ht35 because they had a mutation in 393 to the modal value of ht35 on that marker.

Should this increase the ht15 diversity significantly?

I'd seriously take North Africa's high ht35 result (0,30) with a military-issue teaspoon of salt, it's just 5 samples. On the other hand, its notoriously high ht15 diversity (0,28) is pretty solid.

Very interesting too. Do you think this may reflect the Gravetto-Solutrean influence on Oranian/Iberomaurusian, as I think it is the case of mtDNA H (and other mtDNA lineages probably)?

To recap, Balaresque and all geneticists are stuck in a time warp: they're back in 2003, thinking R1b is just R1b. What a waste, after going through all the effort of collecting and processing the samples, to not have had the sense to test for a few extra key mutations that define some major subdivisions of R1b1b2 and have been well known for more than 5 years. The conclusions they reached would then have been very different.

Totally agree. Curiously, my comment at the article itself was censored because I used the words "waste" and "misleading" in my short concluding sentence. Enfin...

Argiedude: thanks again for excellent work! Logically it must have taken you some hard work to tabulate all those haplotypes, so very special thanks, really. :)

I second that. What an effort, and timely, too.

I insist you should have your own blog (and Eurologist too).

Thanks for the encouragement. I am just getting into all of this - perhaps if my retirement plans from my real job materialize (in about six years, or so...). ;)

I do at times feel that many young researchers in this field are too narrowly educated and don't know about other disciplines, nor the big picture. I am sure Gioiello will agree with me here - even though more than half the time I have no clue what he's talking about, even after trying to re-translate Italian phrases and figures of speech to English... ;)

I was thinking that Ebizur too should start his own blog. Though maybe a team blog is easier? Just an idea, but I really feel it's a pity that your excellent efforts at understanding and explaining genetics are so scattered and lack more solid publishing support. Blogger is really easy to manage and you can always post only when you feel like it, have time, or have something to say.

As for the researchers, in this case, they are clearly too concerned about "demonstrating" some pet theory instead of actually researching the matter. They are also too blinded by the fashion of the molecular clock as the ultimate measure of all things genetic, when it has almost no proving weight on its own in fact.

Now, a question for Argiedude: would the "false ht35 positives" add up significantly to the ht15 diversity in Iberia?

Maju, I have written many times, in private and on some forums, about the “false ht35 positives” in Iberia. It is thanks to these studies that I elaborated my theories. Studying the data of my Brazilian friend (and of Ricardo Costa de Oliveira) who writes on “sc_gen”, Alberto Durao Coelho, who has DYS393=12, and of those closest to him, an American Souza and a Brazilian tested by SMGF who has DYS461=11, I understood that they had had back mutations, as Alberto, tested by FTDNA, is R1b1b2a1b. For this and other reasons I have always thought that Iberia lacks the most ancient haplogroups, unlike Italy.

Eurologist, certainly I agree with you. What is it that you haven’t understood? I have an e-mail address (gioiello.tognoni@gmail.com) and everybody can write to me. I always answer.

Now, a question for Argiedude: would the "false ht35 positives" add up significantly to the ht15 diversity in Iberia?

Thanks for the compliments, Maju. I presume you meant to write ht35 at the end of the sentence, correct? And yes, I think so. In the Adams study of Spain, which tested 461, we can see that there are 39 R1b1b2 samples with 393=12 (ht35's modal). But only 15 of them have 461=11 (ht35's modal). Ergo, perhaps just 40% of these R1b1b2 that carry ht35's modal value are truly ht35; the majority are ht15 who happened to back-mutate in 393 to ht35's modal. Notice the same thing happens in England/Wales. Its result, using both 393 and 461, is 0,21, but when using only 393 to predict ht35, the result climbs to 0,23, and what a coincidence, that's England's ht15 diversity, too. So I'm guessing England's R1b1b2 samples with 393=12 are mostly ht15 samples which back-mutated to 393=12, as in Iberia. The problem disappears quickly, though. Even in North Italy, where ht15 is about 45% and ht35 is 5%, this still results in just 20% of its R1b1b2 samples with 393=12 being ht15 samples that back-mutated. Incidentally, this could mean that North Italy's ht35 diversity, already just barely beneath Anatolia's, might in fact be slightly greater than Anatolia's diversity, because my estimate must have included a big minority of North Italian ht15 samples, and North Italy's ht15 have very low diversity, bringing down the final result.
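The North Italy figure can be roughly checked with a back-of-the-envelope calculation. The following is only a minimal Python sketch: it assumes the 3% (ht15) and 10% (ht35) "wrong value" rates mentioned earlier in the thread apply uniformly, and the function name `back_mutated_share` is made up for illustration.

```python
def back_mutated_share(ht15_freq, ht35_freq, ht15_wrong=0.03, ht35_wrong=0.10):
    """Share of samples predicted as ht35 (393=12) that are really ht15.

    ht15_wrong / ht35_wrong are the assumed fractions of each group
    carrying the other group's modal value at 393.
    """
    ht15_looking_like_ht35 = ht15_freq * ht15_wrong      # ht15 that back-mutated to 393=12
    true_ht35_kept = ht35_freq * (1.0 - ht35_wrong)      # ht35 still showing 393=12
    return ht15_looking_like_ht35 / (ht15_looking_like_ht35 + true_ht35_kept)

# North Italy, using the frequencies quoted above: ht15 about 45%, ht35 about 5%.
north_italy = back_mutated_share(0.45, 0.05)
print(f"North Italy: {north_italy:.0%}")  # North Italy: 23%
```

Under these assumptions the share comes out at roughly 23%, in the same range as the "just 20%" quoted above. The share grows quickly as the ht15/ht35 ratio becomes more lopsided, which is why the problem is concentrated in places like Iberia and England.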

I've been reading your blog posts, and just wanted to say thanks for your comments on U5b1b, something I'm pretty interested in. So now the picture becomes of a presence of U5b1b stretching from Finland through Europe all the way to Anatolia. Instead of the former perception of 2 disconnected locations, in Finland and North Africa/Senegal. Very interesting, thanks.

Also, from your blog posts, about R1b1*. The huge Contu study of Sardinia apparently found 2 lineages of R1b1. One of them is the already known M18+ lineage, but a second one appeared in her study. Its SNP result was:

R1(xR1a1a,R1b1a,R1b1b2) >>>> M173(xM17,M18,M269)

It could technically be, using current nomenclature, R1b1*, R1b1b1 (M73), or even R1a*, R1a1*, R1b*. But its haplotype suggests it's probably R1b1*. It makes up 1% of Sardinian y-dna (same as R1b1a1-M18), and its haplotype makes it different from any known cluster of R1b1* (it has 385a/b=14/14). We thus have 2 lineages of R1b1* floating around in Sardinia. It's looking to me like Sardinia is an island refuge of lineages that have since disappeared in the mainland. And that would point to R1b1* being a longterm inhabitant of at least Italy, instead of a recent historic movement, as per the Balaresque study. Gioiello has written to some of these authors pointing out this interesting case, and hopefully perhaps Cruciani will think it's worthwhile to make an addendum to his study and test some of these Contu R1b1 for his newly discovered SNPs.

Maju wrote in his blog: "What does this say? That even between perfectly comparable regions such as Turkey, Italy and Iberia, the highest diversity for R1b1b2a1 is in the West. If you look again at figure 3 you'll notice that most Turkish haplotypes of this clade are derived from European ones, what implies back-migration after the formation and spread of R1b1b2a1, which must have happened in Western or Central Europe."

PS: I'm not going to write a blog. I don't know how you manage, but I can hardly keep up with the explosion of information constantly coming out. I still have a backlog of pdf files that I have to go through.

I presume you meant to write ht35 at the end of the sentence, correct?

I don't think so. I meant to ask whether the "false ht35 positives" in fact belong to ht15 (i.e. R1b1b2a1) and whether, in that case, adding them to the total number of ht15 haplotypes would significantly raise the ht15 diversity, in Spain in particular (I see that Portugal seems to show the opposite phenomenon).

In Spain, once you remove the false positives, ht35 diversity falls from .24 to .19, a very significant drop. But these discarded haplotypes would then belong to ht15 per your previous explanation and could hence raise the diversity for ht15. By how much?

These Spanish false positives are a very large number of individuals (n=68), which should raise the diversity of the ht15 sample (n>100), as they should include some haplotypes not considered previously.

In the English case, with ht15 n>1000, I imagine they will not change the diversity figure that much. But in the Spanish case they seem to be a very large fraction of the local R1b1b2a1.

I've been reading your blog posts, and just wanted to say thanks for your comments on U5b1b, something I'm pretty interested in. So now the picture becomes of a presence of U5b1b stretching from Finland through Europe all the way to Anatolia. Instead of the former perception of 2 disconnected locations, in Finland and North Africa/Senegal. Very interesting, thanks.

That was in some other thread, right? Damn, I can't recall where we discussed that... though I remember the discussion, sure.

It could technically be, using current nomenclature, R1b1*, R1b1b1 (M73), or even R1a*, R1a1*, R1b*. But its haplotype suggests it's probably R1b1*. It makes up 1% of Sardinian y-dna (same as R1b1a1-M18), and its haplotype makes it different from any known cluster of R1b1* (it has 385a/b=14/14). We thus have 2 lineages of R1b1* floating around in Sardinia. It's looking to me like Sardinia is an island refuge of lineages that have since disappeared in the mainland.

That's possible, though I feel inclined to think that it has rather accumulated founder effects from mainland Italy (and maybe other localities: SE France, West Balkans). Whatever the case, a place to watch.

And that would point to R1b1* being a longterm inhabitant of at least Italy...

I am more and more inclined to seriously consider that possibility. However it's not totally clear, especially because Italy is better sampled than West Asia and both regions "compete" for this "honor". What I do think is that it's not impossible that R1b1 arrived in Europe with the Gravettian (assuming that the Gravettian is exogenous to Europe, which I don't think has been proven in the least). This would explain pretty well the current distribution, with Italy keeping an "older type" of diversity, rather not seen in north or west of the Alps. However it would also send to hell the molecular clock in all the variants I have ever read about.

But otherwise it makes good sense.

It would have been great if Balaresque and co. had not wasted their time on molecular clock exercises and had instead studied in depth the SNP and STR structure of the haplogroup, something that has not yet been done with this kind of extensive sample.

But having the haplotypes without the haplogroups makes analysis almost impossible beyond what you did of splitting apart ht15 and ht35.

But I can see circa 10 major branches in that starlike structure beyond the root one leading to "Turkish" ht35. It would be really good to know something more about them: haplogroup affiliation where known and geographic affiliation in the "other" group too.

Superb, you nailed it, just using the study's own graphs.

Thanks. It was way too easy to realize once I had done some previous homework on the R1b1b2 structure.

But that's what really corrodes me: that the Balaresque team decided to ignore that obvious structure, which they must have known about, only to push forward their pet theory of "Neolithic expansion". I feel that's not honest science. Certainly not serious science.

I'm not going to write a blog. I don't know how you manage, but I can hardly keep up with the explosion of information constantly coming out.

Apart from being somewhat "crazy", I can't keep up with it all either. I have to be selective... sometimes just capricious. It's my blog and I write about what I want, when I want. I try to avoid letting it put pressure on me, though admittedly I'm sort of an information (and discussion) addict too.

What if I open a second blog and make you and the other mentioned people team members? No pressure, just publish whenever you have something. Merely a reference blog. I hate to work in teams but if that helps some really nice materials like this map to have a platform, a stable reference site, I'm open to give it a push.

"This would explain pretty well the current distribution, with Italy keeping an 'older type' of diversity, rather not seen in north or west of the Alps".

I think Gravettian fits, but the 'older type of diversity' in Italy may be the result of stronger post-Gravettian selection north and west of the Alps. This would have the effect of lowering diversity there until the climate warmed and the surviving haplogroups could expand.

I meant to ask whether the "false ht35 positives" in fact belong to ht15 (i.e. R1b1b2a1) and whether, in that case, adding them to the total number of ht15 haplotypes would significantly raise the ht15 diversity, in Spain in particular (I see that Portugal seems to show the opposite phenomenon).

Ah, now I see. I entered this into the spreadsheet, and it doesn't seem to change things much. In the case of Spain, adding the R1b1b2 samples with 393=12 and 461=12 (in other words, probable ht15 samples), changes ht15's diversity from 0,239 downwards to 0,237.

In the case of Portugal, the old figure was 0,243, adding the ht15 samples with 393=12 and 461=12 changes this result to 0,239, downwards again.

Keep in mind these ht15 with 393=12 only make up 3% of the total ht15 samples.

But these discarded haplotypes would then belong to ht15 per your previous explanation and could hence raise the diversity for ht15.

But wouldn't we expect their diversity to be the same as the rest of ht15? If you're referring to the fact that all of them already have 1 mutational difference from the ht15 modal, 393=12, then I agree, but I've purposefully excluded 393 from the STRs I used to make the variance estimates precisely because of the bias involved in my using this marker to pre-select the samples. This may not produce perfect absolute estimates of diversity, but since everyone is equally affected by the selection bias, it should produce accurate relative diversity estimates, which is what's really important.
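To make the point about excluding 393 concrete, here is a minimal Python sketch of a mean per-marker variance estimate. It is an illustration under stated assumptions, not the spreadsheet method actually used: the haplotypes are invented, the choice of population variance (`statistics.pvariance`) is an assumption, and `mean_str_variance` is a hypothetical helper name.

```python
from statistics import pvariance

def mean_str_variance(haplotypes, exclude=("DYS393",)):
    """Average the repeat-count variance over all markers not in `exclude`."""
    markers = [m for m in haplotypes[0] if m not in exclude]
    per_marker = [pvariance([h[m] for h in haplotypes]) for m in markers]
    return sum(per_marker) / len(per_marker)

# Invented example haplotypes, all pre-selected on DYS393=12.
samples = [
    {"DYS393": 12, "DYS19": 14, "DYS390": 24, "DYS391": 11},
    {"DYS393": 12, "DYS19": 14, "DYS390": 23, "DYS391": 11},
    {"DYS393": 12, "DYS19": 15, "DYS390": 24, "DYS391": 10},
    {"DYS393": 12, "DYS19": 14, "DYS390": 25, "DYS391": 11},
]

print(round(mean_str_variance(samples), 3))
```

Because DYS393 is left out of the average, the estimate does not depend on the value used to pre-select the samples, which is the reason given above for dropping that marker.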

These Spanish false positives are a very large number of individuals (n=68), which should raise the diversity of the ht15 sample (n>100)

I should have been specific when labelling the number of samples. The Spanish samples are actually more than 500. But the Spanish false positives aren't 68, they're 23. I'm referring to the Spanish R1b1b2 that have both 393=12 and 461=12. [In my previous post I mistakenly said there were 15 R1b1b2 with 393=12 and 461=11, but there are 14, and together with the other 23 they add up to 37 R1b1b2 with 393=12... hmm, I had also wrongly stated they added up to 39, but no, they're 37]
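As a quick check of the corrected counts, a few lines of Python (the function name `shares` is just illustrative) confirm the total and the resulting proportions among the Spanish R1b1b2 samples with 393=12:

```python
def shares(true_ht35, false_positives):
    """Return the total and the share of each group among samples with 393=12."""
    total = true_ht35 + false_positives
    return total, true_ht35 / total, false_positives / total

# 14 samples with 393=12 and 461=11 (true ht35),
# 23 samples with 393=12 and 461=12 (probable ht15 back-mutations).
total, true_share, false_share = shares(true_ht35=14, false_positives=23)
print(total)                 # 37
print(f"{true_share:.0%}")   # 38%
print(f"{false_share:.0%}")  # 62%
```

So with the corrected numbers, about 38% of these samples would be true ht35, still close to the "perhaps just 40%" estimate given earlier.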

.........

All great observations in your second post. Especially the first paragraph about the sampling bias: notice how the land between Italy and Anatolia seems void of ancient R1b lineages. It was untested in Cruciani's V-88 study, and no study has looked at their basal level SNPs the way Cinnioglu did for Turkey.

But wouldn't we expect their diversity to be the same as the rest of ht15?

Makes sense if you count not the raw diversity but the diversity/n. I've seen both ways of counting, sometimes side by side in the same paper.

Anyhow, I was curious about the matter.

But the Spanish false positives aren't 68, they're 23.

Then the map appears to have an error because it states n=80 for one measure and n=12 for the other (80-14=66 - not 68 as I said).

... notice how the land between Italy and Anatolia seems void of ancient R1b lineages.

Well, it also looks rather void of samples anyhow. Romania has rather high diversity for R1b1b2a1 (though n is very small). Greece in turn looks rather high in R1b1b2*, only slightly under Turkey and Italy.

Still it's pretty curious that the number of really measurable false positives (23) is almost double that of true ht35 haplotypes (14). This is something to take into account when looking at the Iberian R1b, because looking only at DYS393 may be very misleading in this particular region.

I took the 15 SNP-tested ht35 samples of the Italy DNA Project and calculated their variance using the same method as I did when making the maps of ht15/ht35 diversity. The result was 0,30. The results I obtained with predicted ht35 samples, for Italy, were 0,29, 0,30, and 0,31, for different regions of Italy. The results for Turkey's ht35 were 0,31, and for the Levant 0,30.

This can all be seen in the map I posted in a previous post in this thread.

A lot of your problems and low frequencies stem from your apparent neglect of the fact that the people in a current sample of genes from Turkey are NOT the original inhabitants of that land; they settled there only recently. So if you reassess your data and it comes out pointing to Armenians, with everything pointing to Mt. Ararat, it should be no big surprise. Even someone who perceives himself to be an intellectual, as you do, will realize that the original inhabitants of Anatolia, or the Armenian highlands, were the Hurrians to the left and the Hayasa to the right. The two tribes are very well recorded; they were probably rival powers at first, but very close family members. If I could pull that much without analyzing a large genetic pool, I wonder what else you researchers missed. Dienekes was very correct in the beginning, as he was thinking without influence, but Maju is obviously thinking in limited terms. The oldest written British text states that the original settlers were Armenians, who most likely mixed with the nearby Basques; another suggestion would be that the 9000-year-old cave finding is an antediluvian human. Could be anything.

Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.