June 24, 2010

Population structure in Ireland and Britain (O'Dushlaine et al. 2010)

From the paper:

Eigensoft PCA analysis across all seven of our European and European-ancestry populations broadly identified four sub-groups consisting of (i) Bulgarian, (ii) Portuguese, (iii) Swedish and (iv) Irish/British/Utah populations (see Figure 1a, Supplementary Figure S1). The first two principal components (PCs) separate out northern from southern, and western from eastern European ancestry, respectively. The Europe-wide PCA analysis positions the Scottish population (Aberdeen) intermediate between the Irish and English populations. We further explored this observation by restricting our PC analysis to residents of Ireland, Scotland (Aberdeen) and south/southeast England (Figure 1b, Supplementary Figure S1). This analysis confirms the observation that the Scottish population is intermediate between the Irish and English cohorts on the first principal component (this time dividing west from east). Although more subtle, the Scottish cohort is also shifted slightly from the other two on PC2.

The distinction between Britons and Swedes was also noted in an earlier study. It's nice to see Bulgarians and Portuguese sampled, as they have been rather neglected in genomic studies. Unfortunately, none of their neighbors or any other intermediate populations were included, which is understandable as the study focused on British Isles populations. Bulgarians and Portuguese served as "anchor points" to re-create the well-known correlation of the first two PCs of European genetic variation with longitude/latitude.

The intermediate position of Scottish populations relative to the Irish and English is not surprising, given the Gaelic connection between Scotland and Ireland.

The paper also has haplotype diversity data that can be compared with those recently published by Auton et al.

The authors observe:

In summary, our results illustrate a subtle genetic structure across Britain and Ireland in the context of the comparatively homogenous nature of the European genetic pool. We have observed slightly elevated levels of LD and genome-wide homozygosity in Ireland and Sweden compared with neighbouring British and European populations, although these levels do not approach those of traditional population isolates. Similarly, we have illustrated a decrease in HD in Britain and Ireland, more so in Scotland and Ireland than in England.

Finally, the authors present results of frappe analysis (Figure S2):

At K=2 we see a distinction between northern and southern Europeans.

At K=3 a distinction between British Isles and Sweden appears. The absence of the Western European component in Bulgarians is noteworthy and expected.

At K=4 the Bulgarian component is identified.

At K=5 a Portuguese component is identified.

British Isles populations are dominated by the "Northwestern" green component with variable "Scandinavian" white (which is higher in England as expected) and both "Iberian" and "Balkan" minority elements.

European Journal of Human Genetics doi: 10.1038/ejhg.2010.87

Population structure and genome-wide patterns of variation in Ireland and Britain

Colm T O'Dushlaine et al.

Abstract

Located off the northwestern coast of the European mainland, Britain and Ireland were among the last regions of Europe to be colonized by modern humans after the last glacial maximum. Further, the geographical location of Britain, and in particular of Ireland, is such that the impact of historical migration has been minimal. Genetic diversity studies applying the Y chromosome and mitochondrial systems have indicated reduced diversity and an increased population structure across Britain and Ireland relative to the European mainland. Such characteristics would have implications for genetic mapping studies of complex disease. We set out to further our understanding of the genetic architecture of the region from the perspective of (i) population structure, (ii) linkage disequilibrium (LD), (iii) homozygosity and (iv) haplotype diversity (HD). Analysis was conducted on 3654 individuals from Ireland, Britain (with regional sampling in Scotland), Bulgaria, Portugal, Sweden and the Utah HapMap collection. Our results indicate a subtle but clear genetic structure across Britain and Ireland, although levels of structure were reduced in comparison with average cross-European structure. We observed slightly elevated levels of LD and homozygosity in the Irish population compared with neighbouring European populations. We also report on a cline of HD across Europe with greatest levels in southern populations and lowest levels in Ireland and Scotland. These results are consistent with our understanding of the population history of Europe and promote Ireland and Scotland as relatively homogenous resources for genetic mapping of rare variants.

Dienekes, is it possible to obtain a better pic of the PC graph? It seems, and this is very iffy because of the pic, that the Aberdeen samples are actually farther "west" than the English samples, that is, less "Swedish". We'd expect the opposite if Aberdeen, way up in the northeast corner of Scotland, geographically as close as you can get to Scandinavia from continental Scotland, were hugely affected by the Viking invasions from which Scotland supposedly derives its 6% R1a, which would imply about 20% Scandinavian ancestry (because Norway has 30% R1a, 20% Norwegian ancestry produces 6% R1a). It doesn't seem like it in the graph.

On a lesser note, judging from the PC1 axis and the FST results from McEvoy, 2009, who found a distance between CEU and Sweden of 0,00092, we can make a gross extrapolation that CEU-Portugal might be around 0,0030 and CEU-Bulgaria around 0,0025. This is a very gross extrapolation, because FST looks at all the SNPs tested, but a PC graph shows a purposefully selected 1% or 2% of these SNPs, but notice how it fits rather well with Heath's 2008 results: CEU-Spain 0,0026 and CEU-Romania 0,0028.

I answered my own question in the jpg image above. And effectively, Aberdeen is squarely in the mid-range between England and Ireland in the PC2 (Eigenvector 2) dimension, which seems to measure Scandinavian/Swedish ancestry. I do NOT see the Scandinavian influence in Scotland. Aberdeen is situated perfectly in the middle between England and Ireland.

What most caught my attention is the detection of a clearly distinct Scandinavian or Swedish component (and quite early in the cluster analysis, in spite of Swedes not being oversampled). Generally in pan-European PC analysis Swedes tend to cluster, albeit loosely, with NW Europeans, so this specific cluster is a discovery. I wonder if it is (essentially) the same component as the "Finnish" one in Bauchet'07 or a different one.

"The intermediate position of Scottish populations relative to the Irish and English is not surprising, given the Gaelic connection between Scotland and Ireland".

I have to partly question this. In the cluster analysis it is clear that what divides Hiberno-British populations is not a particular link between Scots and Irish but the greater or lesser presence of the "Swedish" component among them. It's well known that the English are more influenced historically by flows from Scandinavia and nearby areas, followed by the Scots, while the Irish were less affected by such migrations.

"British Isles populations are dominated by the "Northwestern" green component with variable "Scandinavian" white (which is higher in England as expected) and both "Iberian" and "Balkan" minority elements".

Right, but in the Irish case the "Balkan" component appears almost as important as the "Scandinavian" one.

I'd dare say that the "Balkan" component in all other populations reflects the main Neolithic gene flow, which is significant but minor. The "Iberian" component might also represent a secondary Neolithic-Chalcolithic flow in the context of Megalithism (quite minor in any case).

ArgieDude said: Aberdeen samples are actually farther "west" than the English samples, that is, less "Swedish". We'd expect the opposite if Aberdeen, way up in the northeast corner of Scotland, geographically as close as you can get to Scandinavia from continental Scotland, were hugely affected by the Viking invasions from which supposedly Scotland derives its 6% R1a

This is not correct. Viking Scotland is the Western Isles, Orkney and Shetland and a small part of Northern Scotland (aka Sutherland). Most Scottish Vikings were ultimately from southwestern Norway (aka Rogaland).

Aberdeen would be in the heart of Pictland, and if anything have some influence from the Angles of South East Scotland.

It is not surprising that the British Isles cluster with the Northern and Central Europeans and not the Southwestern Europeans.

The thing that I find interesting is that the British Isles form a very good continuum with the CEU while the Swedes differentiate from the CEU much more rapidly. From the looks of things you can't separate the British from the CEU. It's as if there were no strait.

Also the whole R1b thing probably didn't leave a great genetic impact.

"In the cluster analysis is clear that what divides Hiberno-British populations is not a particular link between Scots and Irish but the greater or lesser presence of the "Swedish" component among them."

"Up to this period the Norsemen from Scandinavia, or the Vikings, i.e. men of the voes or bays, as they were termed, had confined their ravages to the Baltic; but, in the year 787AD they for the first time appeared on the east coast of England. Some years afterwards they found their way to the Caledonian shores, and in 795 made their first attack on Iona[Western Islands], which frequently afterwards, along with the rest of the Hebrides, suffered grievously from their ravages. In 839AD the Vikings entered the Pictish territories[Lowlands]. A murderous conflict ensued between them and the Picts under Uen their king, in which both he and his only brother Bran, as well as many of the Pictish chiefs, fell. This event, no doubt, hastened the downfall of the Pictish monarchy; and as the Picts were unable to resist the arms of Kenneth, the Scottish king, Kenneth carried into execution, in the year 843, a project he had long entertained: uniting the Scots and Picts, and placing both crowns on his head. That anything like a total extermination of the Picts took place is now generally discredited, although doubtless there was great slaughter both of princes and people. Skene asserts indeed that it was only the Southern Picts who became subject to Kenneth, the Northern Picts [including those of Aberdeen] remaining for long afterwards independent of, but sometimes in alliance with, the Scots. This is substancially the opinion of Mr E.W. Robertson, who says, "the modern shires of Perth, Fife, Stirling, and Dumbarton, with the greater part of the county of Argyle, may be said to have formed the actual Scottish kingdom to with Kenneth succeeded". The Picts were recognised as a distict people even in the tenth century, but before the twelfth they lost their characteristic nominal distinction by being amalgamated with the Scots, their conquerors."

A population of 100 males, of which 50 are R1b and 50 are R1a, can turn to 100% R1b or 100% R1a in 5 generations while being totally isolated.

The effect was simulated by computer. The result is that it ALWAYS happens: in the end, one haplogroup reaches 100%. It is only a matter of time. The smaller and more isolated the population, the fewer generations it takes for one haplogroup to reach 100%.

That's why one usually observes the total domination of one haplogroup on islands.
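The kind of simulation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the simulation the commenter refers to: it assumes a neutral Wright-Fisher-style model in which each generation has a fixed number of males and every son draws his haplogroup at random from the previous generation. The function names and parameters are made up for this sketch.

```python
import random

def generations_to_fixation(n_males, start_freq=0.5, rng=None,
                            max_generations=1_000_000):
    """Count generations until one of two haplogroups reaches 100%.

    Assumption: neutral Wright-Fisher resampling with a constant number
    of males and no migration, mutation or selection.
    """
    rng = rng or random.Random()
    count_a = round(n_males * start_freq)  # carriers of haplogroup A
    generations = 0
    while 0 < count_a < n_males and generations < max_generations:
        p = count_a / n_males
        # Each son inherits haplogroup A with probability p.
        count_a = sum(1 for _ in range(n_males) if rng.random() < p)
        generations += 1
    return generations

def mean_generations(n_males, trials=100, seed=0):
    """Average time to fixation over several independent runs."""
    rng = random.Random(seed)
    total = sum(generations_to_fixation(n_males, rng=rng)
                for _ in range(trials))
    return total / trials
```

Comparing mean_generations(10) with mean_generations(100) shows the pattern described in the comment: under this model one haplogroup always ends up fixed, and the smaller population gets there in fewer generations on average.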

I would think that hidden in this Swedish/CEU conundrum is the lack of samples from the coast between the Netherlands and Denmark. That is a group that would be very close to both people migrating to the Islands and to Sweden. In other words, variable "Swedish" contributions are masked by variable Frisian-to-Danish contributions that make up the majority in England (rather than true Swedish/Norwegian impact).

"I would think that hidden in this Swedish/CEU conundrum is the lack of samples from the coast between the Netherlands and Denmark. That is a group that would be very close to both people migrating to the Islands and to Sweden. In other words, variable "Swedish" contributions are masked by variable Frisian-to-Danish contributions that make up the majority in England (rather than true Swedish/Norwegian impact)."

I think so too. Only a few decades ago, people believed "all" the Germanic tribes of Germany had originated in Scandinavia. This was based on the various tribal sagas.

The Saxons, for example, have such a saga. This saga claims "Nordland" (the area around Trondheim in Norway) as the original homeland of the Saxons, claiming they came by ship to Denmark, and from Denmark to the North Sea coast of Germany, from which Saxon mercenaries finally headed for England, with Saxon kingdoms in southern England as well as northern Germany.

Even if I don't really believe this story, it shows how Scandinavian DNA can reach Britain in an indirect way.

I think it is because most people in Utah are of British, German and/or Irish ancestry. It is a pity that the researchers did not include Norwegians and Danes in the research since these were the populations that provided the Viking invaders that attacked Britain and Ireland.

"From the looks of things you can't separate the British from the CEU".

That's not surprising at all considering that Utah white self-reported ancestry is essentially English with some Danish and other British.

CEU, as you may know, is the HapMap sample of Utah residents with northern and western European ancestry (from the CEPH collection), a very unlikely sample to represent Europeans as a whole, especially considering the peculiar Mormon-related founding history of this state. Nowadays a second sample from Tuscany is also being used, but not so frequently.

...

"there is an effect called "Genetical Drift"".

But it does not apply (significantly) to large post-Neolithic populations. It's essentially something that used to happen in the Paleolithic and among marginal small isolated populations thereafter.

"It was only a matter of time".

But this time tends to infinity in large populations. It's like flipping coins. If you flip them in groups of 10, there's a good likelihood that at some point they will be all tails or heads, but if you flip them in groups of a million the chance is practically zero.

Not sure what you mean, but what I pointed to was what divides the English from the Irish (and to a lesser extent the Scots). And in this case it's clear that it's almost only the importance of the "Swedish" component.
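The coin analogy can be put into numbers. A minimal sketch, assuming fair, independent coins and looking at a single round of flips; the function name is made up for illustration. The probability that all n coins land the same way is 2 * (1/2)^n, and the code works in log10 because the value for a million coins is far too small for an ordinary float.

```python
import math

def log10_prob_all_same(n_coins):
    """log10 of the chance that n fair coins all show the same face.

    P(all heads or all tails) = 2 * (1/2)**n = 2**(1 - n)
    """
    return (1 - n_coins) * math.log10(2)

for n in (10, 1_000_000):
    print(n, "coins: about 10 to the power", round(log10_prob_all_same(n), 1))
```

For 10 coins this is 1 in 512 per round, so repeated rounds will eventually produce a uniform result; for a million coins the exponent is so large and negative that no realistic number of rounds would.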

"Yes, but in any case, the Scandinavian contribution to Scotland is smaller than to the English"...

Which is exactly my point.

...

"Why exactly is Utah part of the mix?"

For a totally arbitrary reason, when HapMap was first conceived, the populations chosen to represent the world were: Japanese from Tokyo, Chinese from Beijing, Yoruba from Nigeria and Whites from Utah.

Utah whites are self-reportedly mostly of English ancestry (much unlike everything around them which is of self reported mostly German ancestry).

argiedude: Aberdeen samples are actually farther "west" than the English samples, that is, less "Swedish". We'd expect the opposite if Aberdeen, way up in the northeast corner of Scotland, geographically as close as you can get to Scandinavia from continental Scotland, were hugely affected by the Viking invasions from which supposedly Scotland derives its 6% R1a

pcconroy: This is not correct. Viking Scotland is the Western Isles, Orkney and Shetland and a small part of Northern Scotland (aka Sutherland). Most Scottish Vikings were ultimately from southwestern Norway (aka Rogaland).

Aberdeen would be in the heart of Pictland, and if anything have some influence from the Angles of South East Scotland.

Whatever, Scotland has a relatively high rate of R1a (5,5%) compared to England (3,5%) and Ireland (2,0%), and this phenomenon most likely occurs throughout continental Scotland. The PC graph shows English, Scots, and Irish, and the supposedly higher Scandinavian ancestry of Scots (equal to their extra 2% or 3% R1a divided by 30% R1a in Norway = 8% more Scandinavian ancestry in Scots than in English or Irish) doesn't show up in the graph.
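The back-of-envelope calculation above can be written out explicitly. This is only a sketch of the commenter's arithmetic: it assumes that any excess R1a comes from a single source population and that haplogroup frequency scales linearly with ancestry, both of which are strong simplifications. The function name is hypothetical.

```python
def implied_ancestry(local_freq, source_freq, baseline_freq=0.0):
    """Ancestry share implied by a marker excess, using the naive ratio
    (local - baseline) / source from the comment above."""
    if source_freq <= 0:
        raise ValueError("source_freq must be positive")
    return (local_freq - baseline_freq) / source_freq

# 6% R1a in Scotland against 30% in Norway
print(f"{implied_ancestry(0.06, 0.30):.0%}")   # 20%
# an excess of about 2.5 points over England or Ireland
print(f"{implied_ancestry(0.025, 0.30):.0%}")  # 8%
```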

Yes, it is that easy, this is Scotland, not a tiny enclave of Ashkenazi Jews in a neighborhood in the Rhineland or a couple hundred Sardinians in an isolated village in the mountains. Christ. At a minimum, their population was a quarter million during the Viking age.

......................

I would think that hidden in this Swedish/CEU conundrum is the lack of samples from the coast between the Netherlands and Denmark.

What conundrum could there be? What's hidden? They picked certain spots to test and these are the results. I presume if they picked an intermediate spot it would be located in an intermediate position in the graph. So what? In fact, we actually already know this thanks to previous excellent autosomal tests by Heath (2008) and McEvoy (2009). Yes, those regions are intermediate between England and Sweden.

......................

Can someone who has access to the study inform us if it includes FST genetic distance results?

If you look at the data label, the "English" data is just for the south/southeast of England. This area is known to be very different from the rest of England, being the only area where Norman conquest haplotypes gained any kind of foothold. The Normans were displaced Vikings who mixed in with the local French population on their way to Britain. Thus they could be expected to be similar to the CEU in general, which is exactly what the results show. In fact any European influx can be expected to have its greatest effect in south east England, the main point of entry.

The Vikings of Scandinavia should be expected to impact Ireland and Northern Britain more than southern England. Mostly by settlement.

The Normans and Saxons affected south east England mostly.

Parts of England above the Danelaw line can be expected to have different genetics to those south of it.

It is a shame more regions of Britain were not displayed. I imagine the results would be different to genetically deviant south east England.

This study says only one thing about England. That the point of contact with the rest of Europe is similar to the rest of Europe. Huge surprise (not).

Incidentally there is a far better version of this PCA analysis in the 23andMe Global Similarity advanced view. This is only accessible to 23andMe customers.

It does not show Sweden or Norway, but it does show the Irish, English, German, French and most interestingly, Orcadians. The Orcadians lie on a 45 degree line leading up and left to what must be the Scandinavian cluster, shown in this paper, similar to the black dots trailing up to the Scandinavian cluster.

About 50% of the Irish overlap with the English box. There is a smaller area of German/English overlap. The French box is broader and encompasses a lot of the German box and some of the English box. The Orcadian line bites into the English box.

The Scots from the Scottish Lowlands are all Saxons. Get to know some Scottish history. There are a number of ethnic groups in Scotland. The Scotti from Ireland, the Picti up north, the Saxons in the SE Lowlands, the Welsh in the SW and finally the Norse, mostly the islands to the north.

In the K=5, what exactly do the colors represent? White=Finn admixture, Blue=Balkan SE Mediterranean, Yellow=SW Mediterranean, Green=NW European, and the Red=Neolithic farmers? Whatever the Red is, it is consistent in all the Europeans.

"But this time tends to infinite in large populations. It's like flipping coins. If you flip them in groups of 10, there's a good likelihood that at some point they will be all tails or heads, but if you flip them in groups of a million the chance is practically zero. "

No chance is ever "practically zero".

We exist because a couple of chances you call "practically zero" did happen in a row after all.

I have concerns too over the locations where the data were collected. The Viking settlement of Dublin was on the North side of the river Liffey; today the North side is the poorer side and is populated by people from all over the country, especially the West. The South side is the wealthier side and was (is?) a bastion of the Anglo-Irish Ascendancy – Merrion Square/Sandymount etc. From a previous study (Capelli?), the town of Rush, just to the North of Dublin, has some Viking influence. So depending on which part of Dublin was sampled, you could get results that are reflective of the population substructure in the city.

Thanks Annie Mouse. That's helpful. I'd also like to see the Belgians and Dutch in the mix.

Just for the record, I think you've got the area of Saxon and Norman influence about right. There's also an area of England on the east coast that is believed to have either Saxon or Viking genetic influence.

Yes, Scottish Presbyterians were settled in Ireland during the Ulster Plantation.

On the other hand, Dublin city, Dublin county and surrounding areas of Kildare and Wicklow had been part of the British realm since the time of the Norman invasion of 1167. The area was known as "The Pale" - hence the expression "Beyond the Pale" for someone/something uncouth or barbaric.

So you are totally incorrect to suggest that many British didn't move to Dublin, they did! These British settlers were mostly English and many were from the port city of Bristol.

Technically it means that there's X chance that the individual is more similar to any other within that group, right? Europeans are much alike, so I guess it's OK, and it should rather just indicate other common European ancestry.

...

@Annie: Anglo-Saxon influence was clearly more intense in England than elsewhere in the islands. And genetically (and geographically) Vikings and Anglo-Saxons were about the same. Also I'd think that the genetic impact of the Anglo-Saxons was much, much greater than that of the Vikings, and that impact, as measured by Y-DNA, was much greater in England (with an East-West gradient) than in Scotland, Wales and surely Ireland too.

But anyhow there might be other elements of affinity, hard to tell without more populations of reference.

@Joe: "Shouldn't the researchers have picked an area that had fewer British immigrants such as the west of Ireland?"

Totally agreed.

...

@Fanty: "No chance is ever "practically zero"".

Then you probably play the lottery... and lose.

There are chances that are not significantly different from zero.

"We exist because a couple of chances you call "practically zero" did happen in a row after all".

Fair enough (if you mean "we" as individuals), but that does not apply to drift statistics because pure flukes cannot be inferred statistically.

...

@PConroy: "Just remember that Joseph Smith, the founder of Mormonism was R-M222 = North West Irish, so I expect a lot of Irish descent among the Mormons"

You expect too much from a Y-DNA lineage. Smith is an English surname and I presume that when Utahns report mostly English ancestry, that is relevant and likely to be true in any case.

That cannot be correct either because, as the hunter-gatherer substrate is older and from a broader area, it should show up as different regional components. It's just noise of the method saying that there's a cluster that includes all Europeans equally, because Europeans are quite homogeneous and surely the frappe algorithm allows for such noise to happen.

Unless people from the source regions of Angles, Saxons and Jutes in the European mainland and all ethnic and sub-ethnic groups of Britain, and preferably also of Ireland, are tested together, it is futile to talk about English and also historically Anglic speaking Scottish genetic origins.

Btw, I used the term "Anglic" in my above post as just a language classification, not as an ethnic term; so historically Anglic speaking Scots may or may not be more Saxon and/or Jute descended than Angle.

A last addendum: The Anglic variety native to Scotland isn't called Anglic, English or something similar, but has a variety of native names like Braid Scots, Lallans (meaning Lowlands), Buchan Claik, Doric and Teri.

The point is that SSE England is not genetically representative of England, not even most of it. So the paper says almost nothing about England. We already know that Anglosaxon and Norman influence was greatest in SSE England.

So the authors chose the most admixed population in England and chose to make that representative of England? That makes no sense.

Maju said: @PConroy: "Just remember that Joseph Smith, the founder of Mormonism was R-M222 = North West Irish, so I expect a lot of Irish descent among the Mormons"

You expect too much from a Y-DNA lineage. Smith is an English surname and I presume that when Utahns report mostly English ancestry, that is relevant and likely to be true in any case.

Smith is as much an English name as it is an Irish name and a German name - in the US. In Irish Gaelic, Gabhann=Smith, and so the last name McGabhain (aka McGowan, like singer Shane McGowan) is usually anglicized as Smith or an older version, Smythe.

I have lived in Sweden for one year of my life. I have lived with a Swedish woman for 10 years. I have danced to Dancing Queen at more than one Swedish wedding. I have travelled widely in the Nordic countries and the British Isles. My eyes tell me that the populations of Scotland and the populations of Bergen, Norway have some degree of overlap. Most obvious is the presence of red hair and the type of skin pigmentation found in both populations. Swedes tell me that Finns are ugly but after spending some time in Helsinki I must say that many Finns look a lot like Swedes (and they are both the opposite of ugly). They obviously share a large amount of genetic material. Go to any of these northern cities and there are large population samples right before your eyes, and what one sees tends to confirm what modern genetic studies are suggesting.

The "regional sampling" in Scotland (Aberdeen) is hardly representative of the Scottish population. Aberdeen is an oil city with a large population of "incomers". I wouldn't be at all surprised if most of those sampled there were English.

Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.
