April 01, 2011

Genetic structure in North-Central Europe with the Galore approach (revisited)

This is an update of a previous post, but with a much larger sample: 416 individuals from 26 populations.

The sources of the data are:

FIN and GBR from the 1000 Genomes Project

Populations with _D ending from the Dodecad Project

Populations with _H ending from the HGDP

Populations with _B ending from Behar et al. (2010)

20 clusters were inferred with 14 MDS dimensions retained. Below is the number of individuals assigned from each population to each cluster:

Some details on the clusters:

#1-3 contain all 100 Finns plus 2 Swedes

#4 is clearly Balto-Slavic

#5 is clearly Russian

#6 is Norwegian-Swedish

#8 is British Isles

#9 is also British Isles but also encompasses all 3 Danes and one Dutch individual

#10 is French dominated

#11 is Central European (German-Hungarian)

#13-14 are British-Orcadian
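The population-by-cluster counts described above can be tabulated from per-individual assignments. What follows is a minimal sketch, assuming each individual is recorded as a (population, cluster) pair; the function names `crosstab` and `row_shares` and the toy labels are hypothetical and are not the post's data or code.

```python
from collections import Counter, defaultdict

def crosstab(assignments):
    """Count individuals per (population, cluster) pair.

    `assignments` is an iterable of (population_label, cluster_id) tuples,
    one tuple per individual.
    """
    table = defaultdict(Counter)
    for population, cluster in assignments:
        table[population][cluster] += 1
    return table

def row_shares(table, population):
    """Fraction of one population's individuals that falls in each cluster."""
    counts = table[population]
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}

# Toy, made-up assignments purely for illustration (not the post's data).
toy = [("PopA", 1), ("PopA", 1), ("PopA", 2), ("PopB", 2), ("PopB", 2)]
table = crosstab(toy)
for population in sorted(table):
    print(population, dict(table[population]))
```

The row shares are the kind of per-population percentages that make it easy to compare how two populations divide across the same pair of clusters.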

Some points of interest:

A single Estonian groups with Balto-Slavs

A single Austrian groups with a Hungarian, French, and British

By definition, a cluster can be inferred only if 2 or more individuals belong to it! Hence, singleton project participants are likely to be grouped with some broader group irrespective of their closeness to it. We should therefore not conclude that, e.g., the Estonian is indistinguishable from Balto-Slavs, but rather that any genetic distinctiveness of the Estonian population must await more Estonian samples.
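The singleton effect can be illustrated with a toy example. This is only a sketch: it uses nearest-centroid assignment in one dimension as a stand-in, not the model-based clustering on MDS dimensions used for the actual analysis, and all coordinates are made up.

```python
def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (1-D toy case)."""
    return min(range(len(centroids)), key=lambda i: abs(point - centroids[i]))

# Hypothetical 1-D coordinates: two well-sampled groups and one lone individual.
group_a = [0.0, 0.1, -0.1, 0.05]
group_b = [5.0, 5.1, 4.9]
singleton = 2.0  # far from both groups, but with no second sample to pair with

centroids = [sum(group_a) / len(group_a), sum(group_b) / len(group_b)]
label = nearest_centroid(singleton, centroids)
distance = abs(singleton - centroids[label])

# The singleton still receives a label, even though its distance to that
# centroid is much larger than the spread within the group it joins.
print(label, round(distance, 2))
```

The lone individual is assigned to the nearer group simply because no cluster of its own can be formed, which says nothing about how similar it really is to that group.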

It is also interesting that the hitherto distinctive Finnish and British Isles populations have split into several clusters. This is the power of numbers, and I anticipate that this will occur for other population groups with large sample sizes.

On the flip side, the inclusion of a wide array of Balto-Slavic populations has tended to make them all fall into a single cluster. Belonging to a single cluster does not mean that there is no population differentiation, but rather that this does not take the form of separate "blobs" of individuals that an algorithm working on unlabeled individuals can uncover.

This also suggests an explanation for the mega-British Isles/American White cluster discovered in the most recent analysis of Project participants: the inclusion of multiple admixed individuals probably served to fill in the gaps within the general population of that origin. The current analysis, by contrast, included only individuals of a single origin, together with what is presumably a good geographical sampling of the GBR population, and this has made population structure more visible.

UPDATE (Apr 4): It has come to my attention that the single "Hungarian" and the single "Austrian" joined the project under false pretenses and are the relatives of other Project members. In retrospect it is not surprising that they failed to join the German and Hungarian clusters respectively.

22 comments:

The split of the Irish and British populations into Clusters 8 and 9 is interesting. 76% of the Irish are in Cluster 8, whereas with British_D the split is 35% in Cluster 8 and 50% in Cluster 9. Given that Cluster 9 includes Dutch and Danes, could we be seeing a split between an "Insular Celtic" population and "Germanic" incomers?

We need to get some specifically Welsh and Scottish people to join Dodecad; it would be interesting to try to break out the variation across the island of Britain. I know that in past studies Scots have often overlapped with both Irish and English.

In the GBR group, which obviously excludes the Welsh, the level of Cluster 8 drops from the 35% in British_D to about 21%.

We would expect at least two subgroups in Finns - true Finns and Swede-Finns. With three groups, one would suspect a strongly Saami admixed Finn, a not strong Saami admixed Finn and Swede-Finn mix, or alternately, a Finn, a Slavic admixed, and a Swede-Finn category.

As I have pointed out a number of times, in pretty much all autosomal studies Germans and Hungarians rank very close.

Germans are still represented a bit sparsely, here - so there are missing matches with Swedes, Danes, or Dutch that would probably occur in a larger population sample. It is still interesting to see that eight of the ten Germans group with (18 out of 21) Hungarians. What a strong "Danubian" connection!

Regarding the GBR set, is there any data on how many of the 90 were in Scotland and how many in England?

About 1/10th of the total population of Britain have at least one Irish grandparent (6 million); if you cover all immigration since the famine, the number goes up to about 1/4th of the population with some Irish ancestry/admixture.

The cluster overlap with the Danes will be connected with the group of Vikings that came that way, and Danish rule of Northern England.

And yes, the lack of significant German overlap is striking. But this agrees with haplogroup data that indicates that the impact with groups like the Saxons was minimal.

IMO the similarities between the Germans and British Isles populations are related to a much older underlying structure combining a southern-origin component (related to the Sardinians) and a northern structure (related to Lithuanians).

I don't like this cluster-only data, as I think it can be misinterpreted. Can we have the graphic also? It really adds to the reality of the situation.

Thinking about it further the Danish/Dutch cluster looks to just illustrate the similarity between these people and the British. If it were Danish rule/Viking input it would not be so dominant a cluster. Neither made much inroad further south in England.

Looking at the sample map from the PDF that Debbie provided, it looks like they have quite a poor sampling rate for most of Scotland. Most Scots samples are from the east and south, as well as Orcadian. There are not a lot of samples from the west, in the traditional Gàidhlig-speaking areas.

In comparison they got quite a good sample rate from Wales, Cornwall and Northern Ireland.

Given that Cluster 9 includes Dutch and Danes, could we be seeing a split between an "Insular Celtic" population and "Germanic" incomers?

In addition to Irish and British samples, cluster 9 includes Danish, Dutch and even French samples but no German or Scandinavian ones. I think that this makes it unlikely for cluster 9 to be Germanic. I think that cluster 9 is a Celtic one that was germanized in Denmark and the Netherlands and latinized in France. From what I understand Y-chromosome R1b - which is more common in Celtic populations than in Germanic ones - is more common in Denmark and the Netherlands than it is in Germany and Scandinavia which may be evidence of a significant Celtic survival in the Dutch and Danish populations.

@Rafael Thanks for the info, but was the guy Italian-Swiss (a small minority)? The Italian Swiss have already been shown to be quite different from the German and French Swiss in that one 2008 study. In fact, they did cluster with northern Italians.

The one German Swiss I was thinking of, swissgirl, clustered with northern Europeans and I am almost sure most Swiss would also.


Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.
