Abstract

Population structure can provide novel insight into the human past, and recognizing and correcting for such stratification is a practical concern in gene mapping by many association methodologies. We investigate these patterns, primarily through principal component (PC) analysis of whole-genome SNP variation, in 2099 individuals from populations of Northern European origin (Ireland, United Kingdom, Netherlands, Denmark, Sweden, Finland, Australia, and HapMap European-American). The major trends (PC1 and PC2) demonstrate an ability to detect geographic substructure, even over an area as small as the British Isles, and this information can then be applied to finely dissect the ancestry of the European-Australian and European-American samples. These trends also point to the importance of considering population stratification even in what might be considered a small, homogeneous region. There is evidence from F(ST)-based analysis of genic and nongenic SNPs that differential positive selection has operated across these populations despite their short divergence time and relatively similar geographic and environmental range. The pressure appears to have been focused on genes involved in immunity, perhaps reflecting a response to infectious disease epidemics. Such an event may explain a striking selective sweep centered on the rs2508049-G allele, close to the HLA-G gene on chromosome 6. Evidence of the sweep extends over an 8-Mb/3.5-cM region. Overall, the results illustrate the power of dense genotype and sample data to explore regional population variation, the events that have shaped it, and their implications both for explaining disease prevalence and for mapping disease genes by association.