October 05, 2011

Y-chromosomes of Marsh Arabs

What do the Marsh Arabs have to do with ancient Sumer? Nothing that can be determined on the basis of this data. There are plenty of ancient Sumerian skulls, so how about we study them directly?

As far as I can see, the only link between Marsh Arabs and Sumerians presented in this paper comes from dating Y-STR variation of their major J1-Page08 group using the evolutionary mutation rate, with a divergence time of 4.5 +/- 2.6 ky. Even if that mutation rate were correct (it is not) and the assumptions on which the confidence interval is based were exhaustive (they are not), we would still have +/- 2.6 ky of leeway to deal with, which spans not only the Sumerians but plenty more besides.
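To make the width of that interval concrete, here is a minimal Python sketch that turns the paper's 4.5 +/- 2.6 ky estimate into a range of years before present. The function name and the reading of +/- 2.6 ky as a plain symmetric interval are assumptions made for illustration, not the paper's own procedure.

```python
def age_interval(mean_years, plus_minus_years):
    """Return (youngest, oldest) age in years before present."""
    return mean_years - plus_minus_years, mean_years + plus_minus_years


# 4.5 +/- 2.6 ky, expressed in whole years to avoid float rounding.
youngest, oldest = age_interval(4500, 2600)
span = oldest - youngest
print(f"{youngest} to {oldest} years before present (width {span} years)")
```

On these numbers the interval is wider than the point estimate itself, which is why the estimate cannot single out any one period.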

Not to mention that the evolutionary mutation rate is wrongly applied to every case under the sun, and that Y-STR based age estimation in general has been conclusively shown to be a rather futile exercise.

Nonetheless, the paper does have value in demonstrating the paucity of J2 and R1 in the Marsh Arabs compared to the more cosmopolitan general Iraqi population:

Different from the Iraqi control sample, the Marsh Arab gene pool displays a very scarce input from the northern Middle East (Hgs J2-M172 and derivatives, G-M201 and E-M123), virtually lacks western Eurasian (Hgs R1-M17, R1-M412 and R1-L23) and sub-Saharan African (Hg E-M2) contributions.

Rather than "Sumerian", it seems that the Marsh Arabs have preserved a more pristine Semitic patrilineal gene pool compared to the cosmopolitan Iraqi samples that have absorbed pre-Arab and pre-Semitic population elements.

BMC Evolutionary Biology 2011, 11:288 doi:10.1186/1471-2148-11-288

In search of the genetic footprints of Sumerians: a survey of Y-chromosome and mtDNA variation in the Marsh Arabs of Iraq.

Nadia Al-Zahery et al.

Abstract (provisional)

Background: For millennia, the southern part of the Mesopotamia has been a wetland region generated by the Tigris and Euphrates rivers before flowing into the Gulf. This area has been occupied by human communities since ancient times and the present-day inhabitants, the Marsh Arabs, are considered the population with the strongest link to ancient Sumerians. Popular tradition, however, considers the Marsh Arabs as a foreign group, of unknown origin, which arrived in the marshlands when the rearing of water buffalo was introduced to the region.

Results: To shed some light on the paternal and maternal origin of this population, Y chromosome and mitochondrial DNA (mtDNA) variation was surveyed in 143 Marsh Arabs and in a large sample of Iraqi controls. Analyses of the haplogroups and sub-haplogroups observed in the Marsh Arabs revealed a prevalent autochthonous Middle Eastern component for both male and female gene pools, with weak South-West Asian and African contributions, more evident in mtDNA. A higher male than female homogeneity is characteristic of the Marsh Arab gene pool, likely due to a strong male genetic drift determined by socio-cultural factors (patrilocality, polygamy, unequal male and female migration rates).

Conclusions: Evidence of genetic stratification ascribable to the Sumerian development was provided by the Y-chromosome data where the J1-Page08 branch reveals a local expansion, almost contemporary with the Sumerian City State period that characterized Southern Mesopotamia. On the other hand, a more ancient background shared with to Northern Mesopotamia is revealed by the less represented Y-chromosome lineage J1-M267*. Overall our results indicate that the introduction of water buffalo breeding and rice farming, most likely from the Indian sub-continent, only marginally affected the gene pool of autochthonous people of the region. Furthermore, a prevalent Middle Eastern ancestry of the modern population of the marshes of southern Iraq implies that if the Marsh Arabs are descendants of the ancient Sumerians, also the Sumerians were most likely autochthonous and not of Indian or South Asian ancestry.

It's interesting that everyone still accepts Haplogroup G as a "Near Eastern" or "Northern Middle Eastern" or "Caucasian" haplogroup.

I think it is now clear that Haplogroup G was present all over Europe during ages past.

Significantly, it has been found in far-flung Western European ancient samples, from Treilles, to the Alps (Oetzi), etc.

Significantly, it is now found in large quantities in all isolated relic populations: The Alps? Check. The Pyrenees? Check. The Southern Apennines? Check. Islands like Sardinia? Check. The Caucasus? Check.

As Dienekes has pointed out, we can no longer assume that the modern most extensive range (Ossetia in the Caucasus for G) is the origin of the Haplogroup.

I think G represents a formerly widespread Hg in Southern Europe, now pushed into the fringes. Interesting that it stayed out of Scandinavia.

Am I wrong to assume that the author has an answer and is searching for proof? Didn't the Germans try this not too long ago?

The Arabs do have a truly interesting gene pool. Mideast/Semitic research should be directed toward discovery rather than proof of legend. I would think that now is the best time to apply the greatest effort and not wait for a more "cosmopolitan" future.

They are irrelevant since the Gulf Oasis occurred much earlier than anything having to do with Marsh Arabs

@mooreisbetter

We will not know for sure until we have pre-Neolithic data from Europe and any data from West Asia.

The way things look now, the occurrence of G in European Neolithic contexts together with its modern distribution and phylogeography is suggestive that it arrived in Europe with early Neolithic colonists from West Asia.

Almost *all* haplogroups, with the exception of maybe Hg I, arrived in Europe through West Asia.

Some took a more northerly route, and some took a more southerly route, but that statement describes Hgs from J to R.

What I am saying is that I would not be surprised if we later discovered that F mutated to G in Europe. In other words, that F populated Europe, and that G's initial population and territorial expansion was from individuals already in Europe.

Only a widespread "layer" of far-flung aboriginal-types can explain the current distribution from Sardinia to the Caucasus.

If you want to look for modern refugia (not from cold, but from invasions); if you want to look for modern "relic" populations, you just look for G.

It is high up in the Alps, high up in the Pyrenees, high up in the Apennines, high up in the Caucasus, and on islands like Sardinia and the Balearics. Basques, Sardinians, Caucasus Mountain Caucasians, certain Italians: those are all well-documented sites of pre-Indo-European, "aboriginal" populations.

Indeed, G almost mirrors the distribution of speakers of non-IE languages that survived into the modern era in Europe.

From the distribution, I see an old relic G population in Europe that is very widespread but thin (G2a*), perhaps pre-dating LGM, and one that spread from the Balkans and middle Danube during introduction of agriculture (G2a3b1*, G2a3b1a1* and G2a3b*). It may very well have resided there at the time due to the easy Black Sea connection.

"On the other hand, a more ancient background shared with to Northern Mesopotamia is revealed by the less represented Y-chromosome lineage J1-M267*"

That's interesting, because just today I was reading a book on the Middle East (chief consultant Dr. Stephen Bourke). Quote:

"Excavations at 'Tell Awayli, an Ubaid site near Larsa in Southern Iraq, has revealed a predecessor culture with affinities to the earlier northern Samarra culture. A late Samarran site, Choga Mami, has evidence of canal irrigation as early as 6000 BCE."

The author of that section suggests this may indicate an origin in the north for the Sumerian culture.

I find the higher prevalence of J1*(xPage08) among Marsh Arabs to be most interesting. Sumerian is a split-ergative and agglutinative language like some of those from NW Iran and the Caucasus where J1* is also prevalent. The article is consistent with a migration from the north of J1* bringing Sumerian to the region. The expansion time of J1* among the Marsh Arabs is also older than that of J1-Page08 and more consistent with Sumerian being at least 5000 to 6000 years old.

I really see no evidence that G-M201 arose on European soil. Such a hypothesis would have a hard time explaining the present-day distribution of this haplogroup.

Your hypothesis would require:
- 2 movements: pre-G into Europe and G into Asia
- substantial reduction of G in Europe but not in West Asia

It is less parsimonious than the alternative hypothesis, that G originated in Asia and spread into Europe. Actually, this is also supported by phylogeography, as European G belongs mostly to a single branch of the G phylogeny, with the greatest variety found in the Caucasus, eastern Anatolia, Syria.

The article is consistent with a migration from the north of J1* bringing Sumerian to the region.

The article is wrong, because it puts too much faith on age estimation of Y-chromosome lineages. Apparently, some scientists who write in peer-reviewed journals still haven't gotten the message that "amateurs" have been broadcasting for years, that the entire "evolutionary mutation rate" approach belongs in the dustbin.

Incidentally, corrected for the 3x fudge factor of the evolutionary rate, the Marsh Arabs' J-P58 is dated to ~1.5ky. So, they're basically an Arabian tribe, and their Y-STR diversity corresponds to the well-known expansion of the Arabian tribes. Due to their distinctive lifestyle and isolation, they did not pick up much from pre-Arabian Mesopotamians that all other Iraqis have.
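The arithmetic behind that ~1.5ky figure can be sketched as follows. The sketch assumes a simple variance-based estimator in which age is inversely proportional to the assumed mutation rate, so a rate three times faster gives an age three times younger. The function and constant names are made up for illustration. The factor of 3 is the one quoted above, and 4.5 ky is the paper's estimate for J1-Page08.

```python
# The "3x" factor between evolutionary and genealogical rates quoted above.
EVOLUTIONARY_TO_GENEALOGICAL_FACTOR = 3


def rescale_age(evolutionary_age_ky, factor=EVOLUTIONARY_TO_GENEALOGICAL_FACTOR):
    """Convert an age computed with the evolutionary rate to the genealogical rate.

    Assumes age = variance / rate, so a rate `factor` times faster
    yields an age `factor` times younger.
    """
    return evolutionary_age_ky / factor


print(rescale_age(4.5))  # 1.5
```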

Incidentally, corrected for the 3x fudge factor of the evolutionary rate, the Marsh Arabs' J-P58 is dated to ~1.5ky. So, they're basically an Arabian tribe, and their Y-STR diversity corresponds to the well-known expansion of the Arabian tribes. Due to their distinctive lifestyle and isolation, they did not pick up much from pre-Arabian Mesopotamians that all other Iraqis have.

But aren't all those mutation rates completely unreliable according to the latest studies? So your "correction" should be unreliable as well.

My correction isn't a correction at all, it's the use of the genealogical rate. It is rather the evolutionary rate that is a correction, and a misguided one at that.

As for the unreliability of Y-STRs, of course I haven't changed my mind on that, but the recent Busby et al. paper dealt with the problem of non-linear accumulation of Y-STR variation, which is likely NOT a big problem within a 1.5ky timespan. While nothing can be excluded for certain, the odds of Marsh Arab J-P58 being of Sumerian age are not good.

I said that J1*, not J1-Page08, reflects the Sumerian migration to the marshlands. J1* has substantially higher variance than J1-Page08. The accuracy of the evolutionary vs. the pedigree rate strongly depends on demographic processes such as rapid population growth sustained over millennia and certain male lineages over generations not disappearing to extinction. Such may be the case for J1-Page08 on the Arabian Peninsula over the past two thousand years, but clearly not for all Y lineages. As everyone knows, only complete sequence enumeration of Y SNPs and/or ancient DNA studies will help elucidate the accuracy of these YSTR rates of variance accumulation. If Otzi is truly L91, then it will be difficult to use the pedigree rate to get L91 expansion times back to 3000 BCE.

It indeed seems that J-P58 amongst Marsh Arabs is anything but Sumerian (I would rather head for J1* w/DYS388=12 to 14 for the Sumerian, Hurrian and Urartean populations).

If it is not of Arabic tribal background (which I doubt greatly), we would then have to associate it with other pastoralist peoples of the region such as Amorites or Arameans (but that doesn't seem to be the case anyway, and an Arabian origin seems much more likely than an earlier one).

The Sumerian stuff is completely out of context, it looks like Az-zahery is trying to back an assumption with genetic data!

The accuracy of the evolutionary vs. the pedigree rate strongly depends on demographic processes such as rapid population growth sustained over millennia and certain male lineages over generations not disappearing to extinction.

That is incorrect. Rapid population growth sustained over millennia is NOT necessary for the applicability of the pedigree rate. Modest population growth over millennia (as evidenced since the beginning of the Neolithic), and/or rapid population growth at an early stage, followed by no population growth both suffice. Both cases are treated in the Zhivotovsky et al. (2006) paper that has sought an explanation for the "evolutionary rate".

What we _do_ know is that the conventionally used evolutionary rate applies to a very specific case (constant population size) that is false on dual empirical grounds (human populations have grown by orders of magnitude since the onset of the Neolithic, and present-day haplogroup sizes _cannot_ have been produced by random drift alone in a constant-sized population).

In short: the evolutionary mutation rate is wrongly applied in almost every case. It might be applicable to some small hunter-gatherer tribes that haven't experienced population growth or reached high numbers, but it is _not_ applicable to the majority of cases where it has been applied.

If Otzi is truly L91, then it will be difficult to use the pedigree rate to get L91 expansion times back to 3000 BCE.

Otzi is irrelevant to the issue, because calculations of the TMRCA of modern-day haplotypes do not include all the side branches of the tree that went extinct.

The L91+ ancestor of all living L91+ men was NOT the only L91+ man at the time when he lived, nor was he necessarily the first L91+ man. So, the discovery of an L91+ man before the time assessed by the pedigree rate is irrelevant as to its validity.

It _is_ possible that pedigree-based age estimates have systematic underestimates, but this is because of mutation saturation as described in the recent Busby et al. paper and has NOTHING to do with demography as proposed by Zhivotovsky et al.

Don’t know that much about Marsh Arabs, but if I am not mistaken Assyrians view them as pre-Islamic. I think the fact that Romans employed “Marsh Arabs” from the Tigris in Roman Britain as Rafters may represent an ancient pre-Islamic Semitic layer and lifestyle. http://en.wikipedia.org/wiki/Arbeia

I am not saying you are necessarily incorrect, but I have never heard this. The Assyrians consider the Iraqi Mandaeans of S Iraq --a population with whom they share (or shared) a very similar tongue-- as pre-Islamic. Autosomally, at least based on the two participants in the Dodecad project, they are not much different from E Assyrians. Compare my v3 K=12 values(DOD134) with those of the two Iraqi Mandaean participants (DOD460 and DOD786).

If Otzi is truly L91, then it will be difficult to use the pedigree rate to get L91 expansion times back to 3000 BCE.

Otzi is irrelevant to the issue, because calculations of the TMRCA of modern-day haplotypes do not include all the side branches of the tree that went extinct.

The L91+ ancestor of all living L91+ men was NOT the only L91+ man at the time when he lived, nor was he necessarily the first L91+ man. So, the discovery of an L91+ man before the time assessed by the pedigree rate is irrelevant as to its validity.

I beg to differ. The virtue of the evolutionary rate is that it tracks the increase in YSTR variance along a specific lineage and projects back to the time when the variance would be zero. Otzi's L91 date of 3000 BCE is a terminus ante quem for the appearance of L91. The first L91 male would be at least as old as Otzi. More recent examples of the problem with the pedigree rate in a non-hunter gatherer population would be the Hutterite paper which demonstrated that among Hutterites in the last 300-400 years the evolutionary rate better fits the accumulation of YSTR variance, or even, as we reflect, on some of our own lineages: my father had two sons--I am the only one who had a son and my son has only one son. If my father has a unique Y SNP, then the YSTR variance on that SNP has remained 0 until the generation of his great-grandson. This is exactly, as you know, the difference between the evolutionary and the pedigree rates estimating the initial event of 0 YSTR variance on a Y SNP.

I don't see how J1 or J2 are anything but non-semitic. Of course this tends to hurt feelings, but E1b1b1 are the strongest candidates for the original language bearers. That said, I don't think it's possible to prove any language family had only 1 broad hg assignment to it. I know Dienekes' view is that of Semites coming out of southern Arabia around the time of the Phoenicians as the Greeks noted heavy immigration/settlement from that region. This doesn't mean they were semitic speakers, and unless the Greeks were unaware, Proto-NW Semitic seems to have been spoken 500-1000 years earlier among the Amorites of Syria who were recorded as warrior elites among the Sumerians, rather than people who replaced the population. Again, this hurts feelings because there is racist sentiment, even among *certain* groups that feel E1b1b1 can't be anything but black African. It's a possible scenario that Arabian/Yemen E1b1b1/J1 were already immigrating to the north in prehistory and settling in Syria, but again, we have no proof.

There were certainly significant numbers of Arabs in southern Mesopotamia, in southern Greater Syria and in Sinai already during pre-Islamic times (probably at least since the early Iron Age, which is also the time when the word "Arab" is first attested); Arabs may have even constituted the majority of the population in at least parts of those regions already during pre-Islamic times; historical records tell us these. So Marsh Arabs may be mostly descended from pre-Islamic Christian Arabs of southern Mesopotamia who were Islamized during Islamic times.

I beg to differ. The virtue of the evolutionary rate is that it tracks the increase in YSTR variance along a specific lineage and projects back to the time when the variance would be zero. Otzi's L91 date of 3000 BCE is a terminus ante quem for the appearance of L91. The first L91 male would be at least as old as Otzi.

You are wrong. L91 defines a lineage, and there are sublineages within the L91 lineage. Modern L91+ men are descended from a man (MRCA) who lived _after_ (by definition) the first L91 man (L91-FOUNDER).

YSTR variance grew from 0 to whatever value is currently observed from MRCA to present-day L91+ men, but in the tree linking modern L91+ men with their MRCA.

All the rest of the tree is NOT recoverable by looking at modern L91+ men. In particular, lineages that branched off at any point from the path leading from L91-FOUNDER to MRCA are completely ignored by variance calculations on living men.

There is no reason to think that the L91-FOUNDER and the MRCA of living men lived at the same time, or even in the same time period.
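The distinction between L91-FOUNDER and the MRCA of living men can be illustrated with a toy genealogy. Everything in the sketch below is hypothetical: the names, the tree shape, and the number of generations are invented purely to show that the MRCA recovered from living men can sit several generations below the founder, with extinct side branches invisible to the calculation.

```python
# Hypothetical toy genealogy, stored as child -> father. All names are made up.
FATHER = {
    "founder": None,   # first man carrying the mutation
    "a": "founder",    # side branch with no living descendants
    "b": "founder",
    "c": "b",
    "x": "c",          # last shared ancestor of the living men
    "d": "x",
    "e": "x",
    "living_1": "d",
    "living_2": "e",
}
LIVING = ["living_1", "living_2"]


def paternal_line(man):
    """Return [man, father, grandfather, ...] up to the founder."""
    line = []
    while man is not None:
        line.append(man)
        man = FATHER[man]
    return line


def most_recent_common_ancestor(men):
    """Return the most recent man who appears in every paternal line."""
    common = set(paternal_line(men[0]))
    for man in men[1:]:
        common &= set(paternal_line(man))
    # Walking up from any one man, the first shared ancestor met is the MRCA.
    return next(p for p in paternal_line(men[0]) if p in common)


mrca = most_recent_common_ancestor(LIVING)
generations_founder_to_mrca = paternal_line(mrca).index("founder")
print(mrca, generations_founder_to_mrca)
```

In this toy tree the living men coalesce in "x", three generations after "founder", and branch "a" leaves no trace among the living.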

The so-called evolutionary rate attempts to date L91-FOUNDER, but the solution it arrives at (the 3x fudge factor) is applicable ONLY to the case of a constant population size. In that case, the expected haplogroup size after G generations is 0.5G. In other words, e.g., for a 400-generation old haplogroup it is 200 men.
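As a quick check of that figure, the 0.5G expectation can be written out directly. This is only a sketch of the number quoted above for the constant-population-size case, and the function name is an assumption for illustration.

```python
def expected_haplogroup_size(generations):
    """Expected number of living men under the 0.5 * G figure quoted above.

    Applies only to the constant-population-size case.
    """
    return 0.5 * generations


print(expected_haplogroup_size(400))  # 200.0
```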

The so-called evolutionary rate has been used for everything under the sun, when it is clearly not applicable

http://mbe.oxfordjournals.org/content/23/12/2268.full

See Fig. 2 which shows how variance accumulates under different demographic scenaria. The 3x fudge factor actually represents the most extreme case (slowest accumulation of variance), and one which is completely misapplied for real human populations.

I have listed several cases in which the pedigree rate trumps the evolutionary rate:

The evolutionary rate fiasco is actually a good example of the failure of peer-reviewed science, as it has been uncritically used for almost 10 years now "because it has been published and other people have used it".

It is true, however, that both Greek and Strabo note that the Phoenicians originated in Arabia.

Arabia and the southernmost parts of the Levant are the most likely Proto-Semitic homelands: they are the most proximal to Africa where most Afroasiatic speakers live, they haven't been inhabited as far as we can tell by any non-Semitic groups in historical times, and so on.

The short explanation is that the expected number of generations given a variance measurement is _higher_ for lower variance and _lower_ for higher variance. At the limit of variance=0, the posterior age estimate is not 0, so indeed, for very recent clades Y-STR variance _underestimates_ age.
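The point about variance=0 can be sketched with a deliberately simple Bayesian toy model. All of it is an assumption for illustration: two men separated by 2*G meioses, independent loci with a hypothetical per-meiosis mutation probability of 0.002, and a uniform prior on G. It is not the calculation behind the comment above, only a demonstration that observing zero differences does not give a posterior age of zero.

```python
def posterior_mean_generations(mu, loci, max_generations):
    """Posterior mean of G given zero observed Y-STR differences.

    Toy model (all assumptions): two men separated by 2*G meioses,
    `loci` independent markers each mutating with probability `mu`
    per meiosis, and a uniform prior on G over 1..max_generations.
    """
    generations = range(1, max_generations + 1)
    # Likelihood of seeing no mutation at any locus along 2*G meioses.
    weights = [(1.0 - mu) ** (2 * g * loci) for g in generations]
    total = sum(weights)
    return sum(g * w for g, w in zip(generations, weights)) / total


# Hypothetical values chosen only to make the example concrete.
estimate = posterior_mean_generations(mu=0.002, loci=10, max_generations=1000)
print(round(estimate, 1))
```

Under these assumed values the posterior mean comes out at a few dozen generations, even though the observed variance is zero.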

Applying "calibrations" based on recent genealogies to prehistoric events is wrong, both on Bayesian grounds, and because of the linearity problems described by Busby et al.

@AWood, I think you need to check Chiaroni and Roy King's papers on the matter (especially his 2009 article about human migrations during the neolithic), it seems that E1b1b1c1 positive men were Afroasiatic speakers and that this group invented pastoralistic lifestyle during the PPNB along with J1c3 (when this one was migrating from the northern parts of the Levant). This scheme tends to fit quite neatly with Juris Zarins' Circum Arabian Nomadic Pastoral Complex which is, according to him, responsible for the emergence of Proto-Semitic speakers.

So we have at least two clades (J-P58 and E-M34) for whom the odds are that they will turn out to have played a significant part in the spread of Semitic languages (there are other clades which we can associate to this linguistic family, aside from the two mentioned, clades such as R-V88 or T1a). Don't forget also that pre-hispanic Guanches had J-M267 positive men, and this can't be explained by Semitic migrations... So J1* might've played a role in the spread of Afroasiatic and pastoral lifestyle in Africa along with R1b1c and E-M35 clades (this would explain the Caucasian words found in African branches of Afroasiatic too).

@Dienekes,

Proto-Semitic has a word for Naphta, Bitumen, Ice and Oak which are found only in the northern parts of the Levant (so the Proto-Semitic homeland ought to be situated near Syria somehow).

or does its continued publicity and utilization indicate an intentional premeditated stratagem in steering a controlled "inertia" in this industry?

There are papers that use the evolutionary rate, and there are papers that don't. And, there are papers that use both without making an attempt to take a stand.

Methinks it is convenient to "pick a rate from the box" that matches one's theory and not think too hard about the mathematics. But it is disheartening to see that a whole field has allowed a 3-fold discrepancy in age estimates to persist unresolved in the literature for so long.

Don't forget also that pre-hispanic Guanches had J-M267 positive men, and this can't be explained by Semitic migrations.

Why not? The Canary Islands were visited by the Phoenicians-Carthaginians

Then we would have to explain the lack of J2a and R1b1c along with I2a1 which all seem to have followed Phoenician migrations. Don't get me wrong, you could be right after all... But the absence of these markers makes this scheme look rather odd.

As for the picture of Amman, I think the Proto-Semitic urheimat would be situated in the mountainous regions which border Israel, Syria and Jordan (and that isn't far from Amman). Probably further south compared to Kitchen's proposal.

But anyway, denying that J-P58 had anything to do with Semitic languages is far-fetched.

Also, I don't know if you noticed, but in the article it is said that "the M365 was observed in two J1-Page08 Y chromosomes"... That's also odd, considering that if Page08 is equivalent to P58, then we shouldn't find any M365 downstream as it is negative to P58 (J1b in current nomenclature).

Perhaps they also were first to make the transatlantic voyage. The Assyrians relied on the seafaring expertise of the Phoenicians when they took to the water. ;)

I am not seriously suggesting there is an actual link to Mesopotamia. It will be an interesting read for some, however: http://indiancountrytodaymedianetwork.com/2011/03/where-did-chief-joseph-get-his-mesopotamian-tablet/

"Chief Joseph’s pendant was not the only Mesopotamian tablet found in North America. [A]nother cuneiform tablet that was found in 1963 in Georgia...was written in the Sumerian language..."

Then we would have to explain the lack of J2a and R1b1c along with I2a1 which all seem to have followed Phoenician migrations.

I know of no evidence (*) that J2a was ever a major lineage in either Proto-Semites or ancient Phoenicians. Semites have high J1/J2 ratios compared to non-Semites, and groups of Semites that had lower opportunity to intermix with others (such as Arabians and Ge'ez) have higher still. J2a reflects non-Semitic admixture from northern Near Eastern groups (primarily Indo-Europeans and Caucasian(-like) speakers).

(*) Wishful thinking by some researchers that the modern Lebanese are descendants of Phoenicians and extrapolation from them to ancient Phoenicians does not count as evidence for me.

It even occasionally snows in Saudi Arabia like in the Asir/Hejaz Mountains to the West Coast of the Red Sea. Also the Oak is the national Tree of Jordan, I wouldn’t be surprised if Oaks existed even more to the South in milder times. As to Bitumen, the historical center of trade is Mesopotamia.

Wishful thinking by some researchers that the modern Lebanese are descendants of Phoenicians and extrapolation from them to ancient Phoenicians does not count as evidence for me.

If we do not have enough ancient DNA and/or craniofacial data of them, I think we should refrain from making inferences or speculations about the genetic relationship of extinct (by identity) populations to each other and to extant populations. So if we currently do not have enough ancient DNA and/or craniofacial data of Phoenicians (I don't know the current extent of such data), we can at most calculate how much Muslim Lebanese are descended from pre-Islamic Christian Lebanese and how much from original Muslim Arab migrants to Lebanon by comparing the genetic data of modern Muslim Lebanese with those of modern inland (thus away from Crusader and other outside genetic influence) Christian Lebanese and those of the putative original Muslim Arab source population using a carefully-chosen modern Arabian group of samples as a proxy for it, but we cannot calculate how much Christian Lebanese, Muslim Lebanese or Lebanese as a whole are descended from Phoenicians.


Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.
