The HapMap project has provided a unique tool for the analysis of human genetic variation, providing reference information for allele frequency and genotype distributions as well as linkage disequilibrium patterns of Single Nucleotide Polymorphisms (SNPs) across the entire genome. The latest release of HapMap phase 3 data provides genotypes for millions of SNPs in 11 populations from around the world, with Europe being represented by the CEU (originating from Northwestern Europe) and the TSI populations (Tuscan Italians from Southern Europe). Although initial studies support the fact that the CEU can be used as reference for the selection of tagging SNPs in other European populations, a critical step in the design of genetic association studies, this hypothesis has not been extensively studied across Europe and in particular in Southern Europe. We set out to explore the extent to which the HapMap populations can be used as reference for a previously unstudied population of South-Eastern Europe, the Greek population. To do so we studied genomic variation in 1,813 SNPs, genotyped by our group in 56 individuals of Greek origin, and compared them to the CEU and TSI genotypes (1,813 SNPs from the CEU HapMap dataset and 1,205 from the TSI dataset). The studied SNPs are spread over 13 autosomal chromosomes and 26 regions, ranging in size from 120Kb to more than 4Mb. Genotype, allele frequency, and pairwise LD measures were compared across all three populations. PCA was used in order to identify those markers that are responsible for the observed inter-sample variance. Tagging SNPs were selected in the CEU and TSI samples and their transferability to the Greek population was tested, using both the r2 metric as well as the efficiency of genotype imputation of the non-selected SNPs. 
Our results demonstrate that, although the CEU population can to some extent be used as reference for the Greek population, it is preferable to use as reference a European population of closer genetic ancestry, like the TSI. These results are applicable in medical genetics, in order to inform the design of genetic association studies, as well as in studies of evolutionary relationships of Southern European populations.
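
The tag-SNP transferability test described in the abstract can be illustrated with a minimal sketch. It assumes phased haplotypes coded 0/1 and a threshold of r2 >= 0.8; the abstract gives neither the input format nor the cutoff, so both are assumptions, and the function names and toy data are hypothetical rather than the authors' pipeline.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased chromosome,
    each allele coded 0 or 1 (hypothetical input format).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


def tag_transfers(target_haplotypes, threshold=0.8):
    """True if a tag chosen in a reference panel still captures the
    other SNP in the target population (threshold is an assumption)."""
    return ld_r2(target_haplotypes) >= threshold


# Toy data: the tag is perfect in the reference, uninformative in the target.
reference = [(0, 0), (1, 1), (0, 0), (1, 1)]
target = [(0, 0), (1, 1), (0, 1), (1, 0)]
print(ld_r2(reference), ld_r2(target), tag_transfers(target))
```

A tag SNP picked in the CEU or TSI would count as transferable only if its r2 with the SNPs it is meant to capture stays above the chosen threshold in the Greek sample.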

One of the great problems of Eurasian anthropology is whether the Uralic populations are simply variable admixtures of Caucasoids and Mongoloids or they contain a tertium quid in the form of a Proto-Uralic element. The latter need not be distinct from the other two, as it can also be an old or stabilized blend of the two major Eurasian races that later admixed with more recent groups on either side. The abstract does not seem promising in this respect, i.e., in identifying a common core of ancestry among Uralic speakers in addition to their variable east-west admixture, but it would be nice to see if anything like that exists in the paper.

For about the last two decades, the examination of uniparentally inherited genetic marker systems revealing the variation embedded in mtDNA and the Y chromosome has been the main tool in the studies of human genetic origins. Within the last few years, the analysis of genome-wide SNP data of individuals from different populations has started to give promising new insights in the field of human population genetics. The uniparentally inherited markers have shown slightly different demographic scenarios for the maternal and paternal lineages of North Eurasian, particularly of European Uralic-speaking populations. The geographical location of a population has evidently been the most important component that dictates the proportion of western and eastern mtDNA types in the gene pool of Uralic-speakers. Thus, the palette of maternal lineages of the Uralic-speakers resembles that of their geographically close European or Western Siberian Indo-European and/or Altaic-speaking neighbours, respectively. At the same time, the most frequent North Eurasian Y chromosome type N1c, which is also a common link between almost all Uralic-speakers, is with few exceptions rare, if present at all, among Indo-European-speakers of Western and Southern Europe. Here we combine genome-wide high density SNP data (650,000 SNPs, Illumina) with uniparentally inherited mtDNA and Y-chromosome variation of 16 Uralic-speaking populations to assess their place on the genetic landscape of North Eurasia. By the use of principal component and structure-like analysis on the autosomal data we show that the proportions of western and eastern ancestry components among the Uralic-speakers are determined mostly by geographical factors. The westernmost populations from Europe, both Uralic- and Indo-European speakers, are similar in their pattern of ancestry components and show low levels (less than 10%) of the eastern component.
Conversely, the eastern ancestry component is dominant (60-70%) in the gene pool of the Siberian Uralic-speakers. In general, the genome-wide analyses corroborate the results of mtDNA analysis and do not reflect the common genetic characteristics between western and eastern Uralic-speakers at the level seen in the case of N1c. Interestingly, among the Saami from North Europe, who are often considered "outliers" in genetic studies, the dominant western component is accompanied by a 30% eastern component, making them more similar to Volga-Uralic populations than to their closest neighbours.

This seems to validate my thoughts on relics and their importance in age estimation.

Genetic evidence based on mitochondrial DNA (mtDNA) has recently revealed the existence of additional founding lineages that have contributed to the first peopling of America’s double-continent in addition to the more popular five Native American haplogroups (A2, B2, C1, D1 and X2a), and has demonstrated as well the need for additional sampling and analysis to be performed for some of the already known but poorly characterized lineages. One paradigmatic example is represented by the pan-American haplogroup C1. Two of its sub-branches (C1b and C1c) harbor ages and geographical distributions that are indicative of an early arrival from Beringia about 15-17,000 years ago, concomitantly with the other currently accepted Paleo-Indian founders. However, the estimated age of C1d - the third Native American subset of C1 - is only 8-10,000 years, which is suggestive of a much later entry and spread in the Americas. In this study, we shed light on the origin of this enigmatic Native American branch of C1 by completely sequencing a large number of C1d mitochondrial genomes from a wide range of geographically diverse, mixed and indigenous American populations. The revised phylogeny shows that the age previously reported for C1d was heavily underestimated and indicates that C1d is ancient enough to be among the founding Paleo-Indian mtDNA lineages. Moreover, our results reveal that there were two C1d founder genomes for Paleo-Indians that most likely arose early (~16kya), either in the dynamic Beringian gene pool, or at a very initial stage of the Paleo-Indian southward migration. This brings the recognized maternal founding lineages of Native Americans to the unexpected number of 15, and indicates that the overall number of Beringian or Asian founder mitochondrial genomes will probably continue to increase as more Native American haplogroups reach the same level of phylogenetic resolution as we obtained here for C1d.
Additionally, we have confirmed a nearly identical geographic distribution pattern for haplogroup C1d when comparing samples collected in the general mixed population with those from native tribal groups, as it was also reported previously for haplogroups X2a and D4h3. This substantiates the validity of searching large public mtDNA databases (such as the one available through the Sorenson Molecular Genealogy Foundation, www.SMGF.org) for novel founder candidates able to reveal unknown details concerning the ancient human history of the Americas.

Another interesting abstract. I've written before about the association of Y-chromosome haplogroups with the spread of Semitic speakers and the agreement with language phylogenetics.

The origin of the modern Mesopotamia marsh people, which are locally called “Ma’dan” or “Marsh’s Arabs”, is a question of great interest. Based on their life-style (living in reed houses, grazing of water buffalo and other aspects) and local archaeological sites, many historians and archaeologists believe they may have Sumerian ancestry. Although little is known about the origin of Sumerians themselves, two main hypotheses have been advanced in this regard. According to the first, Sumerians were a group of populations which migrated from the “South East” following a seashore route through the Arabian Gulf, and settled down in the southern marshes of Iraq. According to the second, the advancement of the Sumerian civilization is the result of migration from the mountainous area of Anatolia to the southern marshes of Iraq where they settled, absorbing previous populations. In order to shed some light on the genetic origin of the Mesopotamia marsh population, we investigated the male gene pool of 145 DNA samples of modern Mesopotamia people, still living in marshes in the south of Iraq. The analyses of Single Nucleotide Polymorphisms (SNPs) and Short Tandem Repeats (STRs) of the paternally transmitted Male Specific region of the Y chromosome (MSY) revealed that more than 80% of marsh Y chromosomes belong to (Hg) J1-M267, the autochthonous haplogroup of Middle Eastern/Semitic speakers with possible recent expansion and/or founder effect reflected by the reduced STRs variability. In particular, 90% of them were assigned to the J1e-M267-PAGE08 sub-haplogroup, which is the predominant Y chromosome lineage among Middle Eastern Arab populations (Yemen, Qatar, UAE, and Levant). Thus, these findings testify, at least from the paternal side, a strong Semitic Arabian component in the contemporary Mesopotamia marshes population, whereas no clear Anatolian and/or South Asian genetic evidence has been detected.

Salar is a small Western-Turkish-speaking population living mostly in Qinghai province of China. The most similar languages to Salar are all spoken far away, in Turkmenistan. Historical records suggested that they may be descendants of the Turkic nomadic tribes in Central Asia. In this study, 141 Salar Y chromosomes were analyzed for 39 SNP and 14 STR markers to investigate the potential imprints of their western ancestors. The most frequent haplogroup (hg) in this population sample is Hg R, comprising 40% of all Y chromosomes. Most of these Hg R samples belong to R1a1 (M17), which is distributed across a wide geographic region including South Asia, East Europe, Central Asia, and South Siberia. Four other Western Eurasian haplogroups (G-2%, H-5%, I-3%, J-3%) were also found in the Salar Y chromosome gene pool. These paternal lineages of Salar are absent in their East Asian neighbors but frequent in Central Asia. Y-STR-based analyses also grouped Salar with Central Asians. On the other hand, Salar also has low frequencies of the East Asian specific Hg D and Hg O, suggesting possible gene flow from their neighboring populations. This Y chromosome study demonstrated that the Salar have retained the Western Eurasian paternal lineages of their Central Asian ancestors, although they may have migrated to Central China about 800 years ago.

I wish that more "people pairs" would be studied this way, as it would give us some good insight of how migration affects gene pools (allele frequency changes, founder effects, possible social selection etc.)

The Basques are an ancient people, considered by many anthropologists to represent the oldest extant European population. Because of this, they have been the subject of numerous sociological and biological investigations. The Basque Diaspora, a relatively recent demographic expansion of the Basque population, has until now been overlooked in genetic studies. Samples were taken from 53 individuals with Basque ancestry in Boise, Idaho, and the mitochondrial DNA (mtDNA) sequence variation of the first and second hypervariable regions were determined. Thirty-six mtDNA haplotypes were detected in the sample. Comparing the genetic diversity in the Idaho sample with other Basque populations, signatures of founder effects were observed, consistent with both the recent and ancient history of Basque mitochondrial lineages. There has been a marked alteration of haplogroup frequency and diversity, and there is a slight reduction in other measures of diversity in the NW Basque population compared to the native Basque population. We have found a relatively high percentage of the Cambridge Reference Sequence (rCRS) haplotype for hypervariable regions I and II, which is absent in previous studies of Basque mtDNA, and rare in other Spanish populations. The amount of nucleotide diversity is consistent with a sample that is predominantly haplogroup H, which is especially common in the Basque regions of Europe, due to ancient migrations and expansions out of glacial refugia. This is the first report of mtDNA diversity in an immigrant Basque population, and we find that the diversity in NW Basques can be explained by the recent history of migration, as well as the phylogeography and diversity of the major European haplogroups.

The first major interaction between Native Americans and Europeans is documented historically and occurred less than 550 years ago. This recent time frame provides an excellent opportunity to investigate the effects of admixture between two populations that were previously separated for hundreds of generations. To characterize European admixture in Native American populations, we sampled and analyzed a group of isolated Totonac agriculturists from tropical Mexico near Veracruz and a group of native Bolivians predominantly from the mountainous region near La Paz, Bolivia. Mitochondrial sequencing of HVS1 showed that all samples had pre-Columbian mtDNA haplogroups (A, B, C, and D). Using a panel of 48 STRs or 12 Y-chromosome SNPs, Totonac Y-chromosome lineages were all assigned to the pre-Columbian haplogroup Q1a3a, and Bolivian Y-chromosome lineages were assigned to haplogroups Q1a3a, R1, and J2. Haplogroups R1 and J2 are common in European populations. Principal components analysis (PCA) using >800K autosomal SNPs typed in 24 Totonacs and 23 Bolivians showed that all Totonacs and 14 Bolivians clustered distinctly from Eurasian individuals. Nine Bolivians, however, were positioned between the New World and European PCA clusters. Admixture analysis showed that these nine samples had 21 - 33% European admixture using a European reference population. All three observed Y-chromosome haplogroups, including the well-studied pre-Columbian haplogroup Q1a3a, occurred in the admixed individuals. Two of the nine admixed individuals had pre-Columbian mtDNA and Y-chromosome haplogroups but 21-23% European ancestry. This result demonstrates that Y-chromosome and mtDNA haplogroups are only partial indicators of an individual’s complete ancestry.

Readers of the blog know that I don't agree with the scenario presented in the following abstract. The serial founder effect idea is used by geneticists to explain the overall reduced genetic diversity of our species (that we appear to be young, in evolutionary terms). Personally, I don't see how a smart, expanding species that all of a sudden had access to the resources of the landmass of Eurasia went through these extreme bottlenecks. I think that the alternative of a larger human population, genetic diversity reduced across the species by ongoing climate- and culture-mediated selection, and admixture within Africa itself (where a particular expanding H. sapiens group must've co-existed with pre-existing hominids, anatomically modern or not) has merit.

J. Long et al. Evidence for archaic admixture in contemporary non-African human populations

Analyses of large-scale genetic data sets show evidence for a series of founder effects that occurred as modern humans left Africa and settled the rest of the world. Nonetheless, research on modern humans has not ruled out the possibility that other processes, such as local gene flow, or mixing between archaic and modern humans, have also contributed to modern human diversity. Recent analyses of the Neanderthal genome make archaic admixture a salient issue because they show evidence for mixing between Neanderthals and out-of-Africa migrants. The present study examines evidence for archaic admixture in genotypes for 619 microsatellite loci collected from over 2,000 individuals from 100 human populations. We obtained these data from the Marshfield Clinic collection. The populations analyzed represent all inhabited continents of the world. In our analysis, we formulate the serial founder effects (SFE) model as a special case of a phylogenetic model promoted by Cavalli-Sforza and his associates. In this light, the SFE process makes four predictions: 1) A tree of descent according to the pattern of fissions. 2) The root of the tree lies in Africa. 3) The length of each branch is proportional to the ratio of evolutionary time to effective population size. 4) The gene identity between all pairs of populations that share the same most recent common ancestor is equal in expectation. Using hypothesis tests based on generalized hierarchical statistical models, we find good agreement between the SFE predictions and diversity within and between African populations, and we find good agreement between the SFE predictions and diversity between non-African populations. However, there is more diversity within the non-African populations than the SFE model can account for. This makes for greater genetic distance between Africans and non-Africans than otherwise expected. How and where did the non-Africans obtain this diversity?
A simple explanation for the finding is that the earliest migrants out-of-Africa mixed with an archaic population such as Neanderthals prior to their expansion throughout Europe and Asia. Coalescent based computer simulations of the SFE model with mixing support our interpretation. The time and place that we detect mixing coincides perfectly with that detected in a recent examination of Neanderthal genome sequences. Our study shows that genomic diversity in modern humans still reflects ancient events and processes.

Using ancestry informative markers (AIMs) allows reducing the number of markers needed for population stratification adjustments in association studies. As few as 100 AIMs are sufficient to adjust for the largest European axis of differentiation (i.e. EuroAIMs). However, their use for ancestry inference and adjustment in association studies in atypical European populations such as the Canary Islanders, a recently African-admixed population from Spain, needs to be addressed. We aimed to explore whether EuroAIMs were suitable both for the inference of Spanish and Northwest African admixture proportions and for ancestry adjustments in association studies including samples from Canary Islanders. We analyzed samples from Canary Islanders, mainland Spanish (IBE) and Northwest Africans (NWA) for 93 EuroAIMs and compared the data with CEU and YRI from HapMap, Basques and Mozabite from HGDP, as well as from previously analyzed European samples. The major genetic difference was observed between NWA and all European populations, preserving the northwest-to-southeast differentiation of European populations in the second axis. Analyses revealed that Canary Islanders were intermediate between IBE and NWA, and that direct sub-Saharan African influences were negligible. Assessment of individual admixtures without prior population information clearly identified two subpopulations corresponding to NWA and IBE, while Canary Islanders were admixed with an average of 17.4% Northwest African contribution varying largely among individuals (range 0-95.7%). As few as 23 EuroAIMs correctly estimated population membership to IBE and NWA, while 69 EuroAIMs were required to accurately estimate individual admixture proportions in Canary Islanders. Ancestry estimates based on a subset of 69 EuroAIMs also controlled significant allele frequency differences between IBE and Canary Islanders.
These data suggest that a handful of EuroAIMs would be useful to control false-positives in association studies performed in Spanish populations. Supported by FUNCIS 23/07 and grants from the Spanish Ministry of Science and Innovation PI081383 and EMER07/001 to CF.

As I have mentioned before, the Maasai (and many other east Africans in various degrees) are intermediate between Negroids and Caucasoids, and hence admixture estimates considering Yoruba Nigerians would tend to underestimate the African element. It's important to remember that extant Africans are not uniform, ranging from Caucasoids to Negroids, Pygmies, and Khoi-San, with multiple identifiable clusters within the major Negroid group itself, and all sorts of between-group gene flow on a regional basis. It is always useful (as is the case e.g., with African Americans) to both use historical knowledge about population sources, and also to validate historical narratives with the genetic evidence.

R. L. Raaum et al. Autosomal African admixture in Yemeni populations.

Approximately 30% of mtDNA lineages in South Arabian samples are African L haplotypes, whose origin has usually been attributed to migration and assimilation of African females into the Arabian population over approximately the last 2,500 years. In contrast, few Y chromosome lineages of clear recent sub-Saharan African origin have been found in Southern Arabian populations. This bias in maternal and paternal lineages is in accord with historical accounts of the female bias in the Middle Eastern slave trade. In order to evaluate autosomal African ancestry, we collected high-resolution SNP genotype data from a geographically representative set of 62 Yemenis selected from a collection of 552 samples acquired in the Spring of 2007. The ancestry of chromosomal segments in the Yemeni population was estimated using a haplotype-based local ancestry estimation method, HAPMIX. The HAPMIX method is based on a two way admixture model that requires two phased reference populations; we used the HapMap Yoruba in Ibadan, Nigeria (YRI), Luhya in Webuye, Kenya (LWK), Maasai in Kinyawa, Kenya (MKK), and CEPH US residents with ancestry from northern and western Europe (CEU) samples. The three African reference populations include two Bantu-speaking groups (YRI and LWK) and one Nilotic-speaking group (MKK). We estimated local ancestry in the Yemeni sample with all three European-African reference population combinations (CEU-YRI, CEU-LWK, CEU-MKK). The correlations among African ancestry calculated using all three reference population combinations are high (r > 0.98 in all pairwise correlations). Furthermore, there is no significant difference between the average proportion of African ancestry in Yemenis calculated using either of the two Bantu-speaking reference populations: CEU-YRI (mean 0.062, sd 0.044) and CEU-LWK (mean 0.076, sd 0.049) (p=0.13, two-tailed Welch two sample t-test).
However, the average African ancestry calculated using the Maasai reference population (CEU-MKK, mean 0.148, sd 0.060) is significantly greater than that calculated using either the Yoruba or Luhya reference populations (p < 0.0001 in both comparisons, two-tailed Welch two sample t-test). These data suggest that the source population for the African ancestry of the Yemeni population is more similar to the contemporary Maasai population than either the Luhya or Yoruba.
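
The Welch comparisons quoted in the abstract can be approximated from the reported summary statistics alone. The sketch below is only that: it assumes n = 62 for every estimate (the abstract reports 62 genotyped Yemenis but does not state the n used in each test), it works from rounded means and standard deviations, and it stops at the t statistic and degrees of freedom, so it should not be expected to reproduce the reported p-values exactly.

```python
import math


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations and sample sizes."""
    v1 = sd1 ** 2 / n1   # squared standard error of group 1
    v2 = sd2 ** 2 / n2   # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


N = 62  # assumption: all 62 Yemenis contribute to every estimate

# Means and SDs as reported in the abstract (rounded values).
t_lwk_yri, df_lwk_yri = welch_from_summary(0.076, 0.049, N, 0.062, 0.044, N)
t_mkk_yri, df_mkk_yri = welch_from_summary(0.148, 0.060, N, 0.062, 0.044, N)

print(f"LWK vs YRI: t = {t_lwk_yri:.2f}, df = {df_lwk_yri:.1f}")
print(f"MKK vs YRI: t = {t_mkk_yri:.2f}, df = {df_mkk_yri:.1f}")
```

Under these assumptions the Maasai-based contrast yields a far larger statistic than the Luhya-versus-Yoruba contrast, which matches the direction of the abstract's conclusion.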

The study of prehistoric human population history is often fraught with controversy owing to incongruent evidence among various markers of present-day genetic and cultural diversity. While archaeological evidence can be used to calibrate the conclusions drawn from present-day diversity, the fickle nature of the fossil record leaves some migration histories unresolved. Our work analyzes the potential of music - in particular, vocal music - to serve as a novel migration marker, bolstering established migration work and shedding light on regions of the world whose settlement history is contested. One such migration is the recent expansion of Austronesian-speaking peoples across the Pacific within the last 6000 years. The dominant hypothesis posits a recent origin in Taiwan, with a rapid movement southwards and eastwards to populate Polynesia during the following 3500 years. While this model is strongly supported by both archaeological evidence and the present-day distribution of linguistic diversity, our goal was to analyze whether music could serve as a novel line of evidence in the study of Pacific prehistory. A critical concern regarding any migration marker is its time depth. In order to examine this for music, we analyzed correlations between musical diversity and mitochondrial-DNA diversity in 9 Taiwanese aboriginal tribes for which both types of data were available. A sample of 226 choral songs was analyzed using 39 binary characters representing significant structural features of music (e.g., rhythm, interval size, melodic contour, etc.). The musical samples were restricted to ritual musics, which constitute the most conservative (i.e., slowly changing) component of a culture’s repertoire. Mantel tests showed a significant correlation between musical distance and genetic distance among these 9 tribes, suggesting that music may have a time depth comparable to widely-used genetic markers like mitochondrial DNA.
This work demonstrates that music has the potential to enrich the conclusions drawn from other markers, and establishes methods for employing it as a tool in the study of prehistoric human movements throughout the world. At the same time, we want to capitalize on music’s own unique dynamics of change over time and place, particularly its capacity for admixture. In other words, music might not only be able to support the narratives told by other migration markers but shed new light on the histories of population movement and cultural contact.
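
For readers unfamiliar with the Mantel test mentioned in the abstract, here is a minimal sketch of the basic idea: correlate the off-diagonal entries of two distance matrices, then permute the population labels of one matrix to judge significance. The abstract does not say which variant was used, so the Pearson correlation and the one-sided permutation p-value are assumptions, and the 4 x 4 matrices are invented toy data, not the Taiwanese measurements.

```python
import math
import random


def _upper(matrix, order):
    """Upper-triangle entries of a symmetric matrix after relabelling rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p-value."""
    identity = list(range(len(dist_a)))
    x = _upper(dist_a, identity)
    observed = _pearson(x, _upper(dist_b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)                      # relabel the populations of dist_b
        if _pearson(x, _upper(dist_b, perm)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical distances among four populations (toy data).
musical = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
genetic = [[0, 2, 5, 7], [2, 0, 3, 4], [5, 3, 0, 2], [7, 4, 2, 0]]
r, p = mantel(musical, genetic)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Permuting labels rather than individual cells matters because the entries of a distance matrix are not independent; shuffling whole rows and columns together preserves that structure under the null.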

The bolded part in the following abstract makes sense, as it indicates (i) the distinctiveness of Ashkenazi Jews compared to CEU Europeans, and (ii) the fairly recent widespread formation of admixed individuals (in the last couple of generations), which generated individuals that are 1/4, 1/2, and 3/4 AJ genomically.

Studies of complex genetic disorders may benefit from focusing on population isolates, such as Ashkenazi Jews (AJ). However, in order to truly exploit the advantages of reduced genetic diversity the self-declared AJ ancestry of study participants should be independently confirmed with available genetic data. We investigate whether the AJ cohorts display genetic heterogeneity, such as e.g. different rate of admixing in cases and controls, which could potentially confound disease association studies. We applied principal component analysis (PCA) to AJ cohorts ascertained in Israel and the US East Coast with the goal of characterizing population structure. As described previously, when compared to the HapMap samples with CEU, YRI and CHB/JPT ancestry, virtually all AJ samples cluster with the CEU. Similar analysis done on CEU and Jewish HapMap samples from Ashkenazi, Sephardic and Middle Eastern Jewish communities revealed that 97.8% of AJ samples cluster along the AJ-CEU axis, with modes at AJ and CEU cluster centers and at approximately quartile distances between them. We postulate that these groups correspond to 100-0, 75-25, 50-50, 25-75, and 0-100% AJ-CEU admixtures. Notably, only 91.7% of self-reported AJ individuals fall into the reference JHapMap panel AJ cluster, with 1.6, 3.3, 0.5 and 0.7% in the admixed modes ordered by decreasing fraction of AJ ancestry. We also observe admixing with the non-AJ Jewish communities: 0.7% of samples fall within the non-AJ clusters and 1.4% at a subgroup approximately halfway between the AJ and non-AJ cluster centers. In our dataset we found that when compared to the sample as a whole or only to controls, individuals with Crohn’s disease (CD) show significantly more admixing: 78.1, 3.1, 8.5, 2.0 and 0.9% in the 100, 75, 50, 25 and 0% AJ subgroups respectively. Also, CD samples show more admixing with non-AJ groups (2.8 and 1.0% in the 50-50 and 0-100 AJ-non-AJ subgroups). 
Isolates typically exhibit a greater amount of cryptic relatedness compared to outbred populations, which motivates an orthogonal method for verifying AJ ancestry based on identity-by-descent (IBD). The high background level of IBD within the Ashkenazi Jewish community can be used to estimate degree of AJ ancestry by averaging the IBD between a sample under study and the AJ individuals in the JHapMap panel. Our preliminary results show that this method recapitulates the high-level results from the PCA analysis and provides better resolution.
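
The "quartile modes" along the AJ-CEU axis can be pictured with a small sketch: project a sample's principal-component coordinates onto the line joining the two cluster centers and snap the position to the nearest quarter. This is an illustration under stated assumptions, not the authors' procedure: the coordinates are invented, the function name is hypothetical, and treating position on a PC axis as an ancestry fraction is only an approximation (the abstract itself says the authors "postulate" the correspondence).

```python
def nearest_aj_fraction(sample, aj_center, ceu_center, step=0.25):
    """Project a sample onto the AJ-CEU axis and return the nearest
    AJ fraction on a grid of 1.0, 0.75, 0.5, 0.25, 0.0."""
    axis = [c - a for a, c in zip(aj_center, ceu_center)]
    offset = [s - a for s, a in zip(sample, aj_center)]
    # Position along the axis: 0 at the AJ center, 1 at the CEU center.
    t = sum(o * x for o, x in zip(offset, axis)) / sum(x * x for x in axis)
    t = min(1.0, max(0.0, t))                  # clamp to the segment
    ceu_fraction = round(t / step) * step      # snap to the nearest quarter
    return 1.0 - ceu_fraction


# Hypothetical PC1/PC2 coordinates.
aj_center = (0.0, 0.0)
ceu_center = (4.0, 0.0)
print(nearest_aj_fraction((2.1, 0.3), aj_center, ceu_center))   # midway sample
print(nearest_aj_fraction((0.2, -0.1), aj_center, ceu_center))  # near the AJ center
```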

15 comments:

I found the Marsh Arab extract interesting. I don't understand the idea of either a Sumerian origin or a South Asian one for the Arabs. Why? Because they don't ride camels, have Water Buffaloes, and live in reed dwellings? To my eyes they have all the indicators of a recent movement of desert Bedouin to an alternate lifestyle in the marshes with its easier life, and the movement of these unusual Bedouins into the marshes has produced an unusual people, founder effects but still unmistakeably Arabs.

I cannot see any association with those hypotheses of the Semitic languages, ca 6,000 years old, the "Arabs" or any other "Semitic" people, ca 2,000 years old with J1 whether M267 or "Page08" which is older than either the language group or the ethnic group.

I say produce proof rather than suppositions regarding language, haplogroup and ethnic group associations rather than just looking at the presence of those languages, haplogroups and ethnic groups today. One day, all will be extinct like the Etruscans, the Iberian language, and mtDNA K1otzie, and there will be nothing to make suppositions about.

Regarding Sumerians, it isn't clear where they came from; they might just have been a native ethnic group of southern Mesopotamia for thousands of years before the first appearance of the Sumerian language in the historical record.

I think to understand the quantity of pre-Arabian genes in the Marsh Arabs they should compare them with the non-Muslim populations of that region like Mandeans.

"One of the great problems of Eurasian anthropology is whether the Uralic populations are simply variable admixtures of Caucasoids and Mongoloids or they contain a tertium quid in the form of a Proto-Uralic element."

It's remarkable that by way of answering the question of the Uralic homeland linguists produced several widely divergent theories. These theories cover pretty much the whole geographic range in which Uralic languages are found from modern Poland in the West to the Volga-Kama region (where I conducted both archaeological and ethnological fieldwork back in the early 1990s) and finally to western Siberia. In the case of other well-studied language families such as Austronesian and Indo-European there's a general agreement that Austronesian languages mostly expanded from north to south/east and Indo-European languages from south and east to north and west. In the case of Uralic, it can go either way.

It's also noteworthy that the traditional binary-split, hierarchical structure of the Uralic family has been questioned. See http://www.helsinki.fi/~tasalmin/kuzn.html. It's possible that all Uralic branches (Finno-Saamic, Volga Finnic, Ob-Ugrian and Samoyedic) derive directly and independently from proto-Uralic. From the kinship studies perspective, proto-Uralic kinship system seems to be best preserved (almost intact) in the Saami (see Dziebel, "The Genius of Kinship"). This is the tertium quid. It supports a "western origin" for Uralic and is consistent with some genetic evidence (e.g., U5b1b in the Saami, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1181943/) of the origin of the Saami in the course of the post-LGM repopulation of Northern Europe. But this comes in conflict with craniology (per Moiseyev quoted by Dienekes) that detects the proto-Uralic morphological type in Ob-Ugrians.

I don't believe in the "hybrid" origin of Uralic-speakers on purely methodological grounds. The way comparative method works is that you gotta be able to reconstruct a proto-type (genetically, linguistically, craniologically, etc.) and then explain other interferences as later gene flow, borrowing, etc., or earlier substratum. But Uralic is one of those challenging cases (aren't they all?) when it's hard to say what genetic layer maps onto what linguistic layer, etc.

The Marsh Arab abstract caught my eye as well. It seems odd to talk about possible "Semitic Arab" origins for the Sumerians, when we know from historical records (the oldest anywhere) that Sumerians were antecedent to Semitic peoples and that the Sumerians ultimately adopted the language of the incoming Semitic peoples.

It should hardly be a surprise that Southern Iraqi Marsh Arabs have a similar J1 haplotype to Middle Eastern Arabs generally. But modern DNA samples of Marsh Arabs don't shed a lot of light on the genetic origins of Sumerians, despite their common geographic location, in a place that has seen so many sweeps of history (at least half a dozen plausible ones, and probably more) in the past 5500 years, a whole host of which could be the source of the predominant Y-DNA types. Maybe these guys are the direct descendants of the Sumerians, but I don't see how this evidence establishes anything.

Interesting post. Where do you see the origin - let's say, of the Saami and Finnish-speaking people (not the people who make up most of their current genetic make-up, of course)?

From what I have gathered recently, there seems to be some mild consensus that the area between the Baltic language homeland (between Eastern Ukraine and Southern Russia) and the southern Urals may be the best candidate. While most think of this as rather recent, I am envisioning a connection to a few millennia after The Great Melt (~8,000 - 10,000 years ago) that left the area between and north of The Two Great Lakes ;) rather inaccessible, because much of it was either flooded or marshland.

That is, Uralic people were isolated (concerning the NW direction) and couldn't easily move NW shortly after the LGM, and when they could, they found much of the region already inhabited. Conversely, they could move East more easily and earlier, which may explain some of the seeming contradictions.

I like the way you've approached it. We definitely need to add past ecology into the mix. I'll add more abstract data into the mix before we arrive at a solution. Although I suspect your "centrist" position on the Uralic homeland (between Ukraine and Southern Urals) is largely accurate.

There's an oldish paper by Richard Rogers "Language, Human Subspeciation and Ice Age Barriers in Northern Siberia" (1986) in which Eurasian language families, especially Uralic and Altaic, and the Mongoloid-Caucasoid phenotypic divergence are mapped against Pleistocene, post-LGM ecological zones. I don't know how to embed it here, but I can e-mail it to you. One of the observations that the paper makes is that the melting of the glacier removed the natural barriers separating several refuge populations (let's say in the east, in the west and somewhere in the center) and created this powerful Mongoloid-Caucasoid mingling that's attested in the cranial morphology and the genes of Uralic peoples. The Saami share their U5 branch with none other than the Berbers, suggesting that their makeup was affected by the post-LGM movement out of such a far western refugium as the Franco-Cantabrian one.

Now, the tertium quid theory suggests that the central component of this pre-Great Melt Eurasia (not Caucasoid and not Mongoloid but "phenotypic Uralic" per Moiseyev) has survived in some "linguistic Uralic" populations such as Saami and Ob-Ugrians (Khanty and Mansi). This is what makes them Uralic. Finns, Mordvinians, Udmurts and Mari experienced gene flow from the Caucasoid west, while Samoyeds experienced gene flow from or absorbed some Siberian Mongoloid groups of Altaic or Paleoasiatic provenance. Even later some Uralic groups west of the Urals got Altaicized through a population movement out of South Siberia (e.g., the Chuvash in http://dienekes.blogspot.com/2010/03/abstracts-from-aapa-2010.html). In the case of the Chuvash, we don't see any genes moving westward, only language, but the presence of such a South Siberian Y-DNA haplogroup as N1c across the Uralic groups west of the Urals suggests that this westward movement was demic, not just cultural.

One more player to consider is the Yukaghirs in the far northeast. Their linguistic affiliation with Uralic is currently beyond doubt but their geographic location adds complexity, rather than giving a straightforward answer to the old questions. If Uralic-Yukaghir constitutes the primary split in the Uralic linguistic phylogeny, then, geographically, the Yukaghirs reinforce the northern pole of the Uralic distribution comprised of such bearers of the Uralic tertium quid as the Saami and the Ob-Ugrians, while at the same time adding weight to the eastern pole. Then what do we do with such a strong far western, Franco-Cantabrian association of Saami mtDNA? Linguists consider the Uralic-Yukaghir split as organic, namely not substratum induced. This makes it difficult to explain the Yukaghir factor away as a classic Uralic language, ultimately of European origin, imposed on a Mongoloid population in the Far North-East.

I can't say where I read it, but I remember reading long ago that a major ice sheet covered the whole of the low-lying region east of the Urals (the West Siberian Plain) until something like 6-7000 years ago: the lower Ob and Yenisei valleys. If true, this would place the Samoyed arrival in the region more recently than that.

"a major ice sheet covered the whole of the low-lying region east of the Urals (the West Siberian Plain) until something like 6-7000 years ago: the lower Ob and Yenisei valleys."

Only the northern part of the West Siberian Plain was covered in ice. One can't exclude the possibility that there were some refugia such as the Taymyr one, which could host the ancestors of Uralic-speakers. In any case, I'm still going back and forth on the southern, western, eastern and northern axes of the Uralic-Yukaghir range as potential homelands. It's possible that the high frequencies of mtDNA U5 in Saami and U4 in Ob-Ugrians reflect the amount of post-glacial gene flow from west to east, while mtDNA C, Z and D5, found at low frequencies in the Saami, may represent that tertium quid, with northeastern origins, almost obliterated by later admixture.

"It's possible that the high frequencies of mtDNA U5 in Saami and U4 in Ob-Ugrians reflect the amount of post-glacial gene flow from west to east, while mtDNA C, Z and D5, found at low frequencies in the Saami, may represent that tertium quid, with northeastern origins, almost obliterated by later admixture".

mtDNA Z in the Saami and Finns is claimed to be as recent as 2,700 years old, and it derives from the Volga-Ural region, with further links to Northeast Siberia and Japan: http://www.nature.com/ejhg/journal/v15/n1/abs/5201712a.html

Interestingly, when the two M267 subclades, J1-M267* and J1e are considered, differential frequency trends emerge. The less represented J1-M267* primarily diffuses towards North East Mesopotamia and shows its maximum frequency in the northern area (Assyrian). In contrast, the most frequent J1e accounts for almost all the J1 distribution in South West Mesopotamia, reaching its highest value in the Marshes. By considering the STR variance associated with the two different subsets of J1 chromosomes, the highest variance for both J1-M267* and J1e is registered in the northern Mesopotamia area. ... The lower variance value (0.118) registered in the Marsh Arabs is in agreement with a recent expansion event which, itself, clearly emerges from the network analysis. The presence of Y chromosomes belonging to the M267* paragroup suggests a long persistence of this haplogroup in the Mesopotamia Marsh area.
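
For anyone unfamiliar with the "STR variance" argument in the passage above, here is a rough sketch of the idea: take the repeat counts at each Y-STR locus, compute the variance across chromosomes, and average over loci. A lower average is usually read as a younger or more recently expanded lineage. The function name, the use of the sample variance, and the toy haplotypes are all assumptions made for illustration; they are not the paper's data or necessarily its exact estimator.

```python
from statistics import mean, variance

def mean_str_variance(haplotypes):
    """Average, over loci, of the sample variance in STR repeat counts.

    `haplotypes` is a list of equal-length tuples, one per chromosome,
    each holding the repeat count at every locus (hypothetical layout).
    """
    loci = list(zip(*haplotypes))  # regroup the counts locus by locus
    return mean(variance(locus) for locus in loci)

# Invented toy data, NOT taken from the paper: three loci per chromosome.
diverse_group = [(14, 23, 11), (15, 24, 11), (13, 22, 12), (16, 23, 10)]
uniform_group = [(14, 23, 11), (14, 23, 11), (15, 23, 11), (14, 23, 11)]

print(mean_str_variance(diverse_group))
print(mean_str_variance(uniform_group))
```

In this made-up example the second group, where nearly all chromosomes share one haplotype, gives the smaller value, which is the pattern the quoted passage reports for the Marsh Arabs relative to northern Mesopotamia.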

Thank you very much, handschar.

Historically, the Mesopotamian Marsh Arab area was, a long time ago, home to the Sumerian and Akkadian civilizations.

Handschar:

Do you think the long persistence of J1* in the Mesopotamian Marsh Arab area and amongst Assyrians in North East Mesopotamia is due to their association with the Akkadians?

Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.