
Shared and Unique Components of Human Population Structure and Genome-Wide Signals of Positive Selection in South Asia. The American Journal of Human Genetics, Volume 89, Issue 6, 731-744, 9 December 2011: Summing up, our results confirm both ancestry and temporal complexity shaping the still on-going process of genetic structuring of South Asian populations. This intricacy cannot be readily explained by the putative recent influx of Indo-Aryans alone but suggests multiple gene flows to the South Asian gene pool, both from the west and east, over a much longer time span. We highlight a few genes as candidates of positive selection in South Asia that could have implications in lipid metabolism and etiology of type 2 diabetes. Further studies on data sets without ascertainment and allele frequency biases such as sequence data will be needed to validate the signals for selection.

The findings of this paper do not rule out the occurrence of Indo-Aryan migration into the subcontinent, but instead suggest that Indian ancestral components are the result of a more complex demographic history than was previously thought, that can only be explained by multiple gene flows over the course of thousands of years, in addition to Indo-Aryan expansion into the subcontinent.

Therefore, I propose that the final passage be written thus:

However, a 2011 study published in the American Journal of Human Genetics[1] indicates that Indian ancestral components are the result of a more complex demographic history than was previously thought. It found that South Asia harbours two major ancestral components, one of which is spread at comparable frequency and genetic diversity in populations of South and West Asia, Middle East, Near East and the Caucasus; the other component is more restricted to South Asia. Both the ancestry components that dominate genetic variation in South Asia demonstrate much greater genetic diversity than those that predominate in West Eurasia. These findings do not rule out the possibility of Indo-Aryan migration, but suggest that the genetic affinities of both Indian ancestral components are the result of multiple gene flows over the course of thousands of years, with Indo-Aryan expansion into the subcontinent but one of many complex demographic episodes. The study authors write:

Summing up, our results confirm both ancestry and temporal complexity shaping the still on-going process of genetic structuring of South Asian populations. This intricacy cannot be readily explained by the putative recent influx of Indo-Aryans alone but suggests multiple gene flows to the South Asian gene pool, both from the west and east, over a much longer time span. We highlight a few genes as candidates of positive selection in South Asia that could have implications in lipid metabolism and etiology of type 2 diabetes. Further studies on data sets without ascertainment and allele frequency biases such as sequence data will be needed to validate the signals for selection.[2]— Preceding unsigned comment added by Āryāvastra (talk • contribs) 15:24, 23 September 2012 (UTC)

This is my second revision, pared down and brought in line with the latest research:

A 2011 study published in the American Journal of Human Genetics[3] indicates that Indian ancestral components are the result of a more complex demographic history than was previously thought. According to the researchers, South Asia harbours two major ancestral components, one of which is spread at comparable frequency and genetic diversity in populations of South and West Asia, the Middle East, the Near East and the Caucasus; the other component is more restricted to South Asia. However, rather than ruling out the possibility of Indo-Aryan migration, these findings suggest that the genetic affinities of both Indian ancestral components are the result of multiple gene flows over the course of thousands of years, with Indo-Aryan expansion into the subcontinent but one of many complex demographic episodes. The study authors write:

"Summing up, our results confirm both ancestry and temporal complexity shaping the still on-going process of genetic structuring of South Asian populations. This intricacy cannot be readily explained by the putative recent influx of Indo-Aryans alone but suggests multiple gene flows to the South Asian gene pool, both from the west and east, over a much longer time span."[4]

Copy of text for future reference, in case it is again vandalized by Hindutvadis:

According to The Indian Genome Variation Consortium (2005),[5] the population of the subcontinent can be divided into four morphological types: Caucasoids in the north, Mongoloids in the northeast, Australoids in the western, central and southern regions of the country and Negritos largely restricted to the Andaman Islands; however, these groups tend to overlap because of admixture. The majority of genetic differences among Indians appears to be distributed along caste lines, rather than along ethnic lines, although genetic differences do exist between predominantly Indo-European-speaking northern and predominantly Dravidian-speaking southern Indian populations, as was also observed by Reich in a recent 2009 study.[6]

In 2008, The Indian Genome Variation Consortium produced another study, this time emphasizing the significant genetic differentiation which exists between Dravidian-speaking, Indo-European-speaking, Tibeto-Burman-speaking and Austro-Asiatic-speaking populations. The researchers write: "Thus, although there are no clear geographical grouping of populations, ethnicity (tribal/nontribal) and language seem to be the major determinants of genetic affinities between the populations of India. This is concordant with an earlier finding based on allele frequencies at blood group, serum protein and enzyme loci (Piazza et al. 1980)." The authors further observe that "it is contented that the Dravidian speakers, now geographically confined to southern India, were more widespread throughout India prior to the arrival of the Indo–European speakers (Thapar 1966). They, possibly after a period of social and genetic admixture with the Indo–Europeans, retreated to southern India, a hypothesis that has been supported by mitochondrial DNA analyses (Basu et al. 2003). Our results showing genetic heterogeneity among the Dravidian speakers further supports the above hypothesis. The Indo–European speakers also exhibit a similar or higher degree of genetic heterogeneity possibly because of different extents of admixture with the indigenous populations over different time periods after their entry into India. It is surprising that in spite of such a high levels of admixtures, the contemporary ethnic groups of India still exhibit high levels of genetic differentiation and substructuring."[7]

In a major study (2009) using over 500,000 biallelic autosomal markers, Reich hypothesized that the modern Indian population was the result of admixture between two genetically divergent ancestral populations dating from the post-Holocene era. These two "reconstructed" ancient populations he termed "Ancestral South Indians" (ASI) and "Ancestral North Indians" (ANI). According to Reich: "ANI ancestry is significantly higher in Indo-European than Dravidian speakers, suggesting that the ancestral ASI may have spoken a Dravidian language before mixing with the ANI."[6] Furthermore, Reich observes: "It is tempting to assume that the population ancestral to ANI and CEU spoke 'Proto-Indo-European', which has been reconstructed as ancestral to both Sanskrit and European languages, although we cannot be certain without a date for ANI–ASI mixture."

"The historical record documents an influx of Vedic Indo-European-speaking immigrants into northwest India starting at least 3500 years ago. These immigrants spread southward and eastward into an existing agrarian society dominated by Dravidian speakers. With time, a more highly-structured patriarchal caste system developed. India is now broadly characterized by Indo-European (e.g. Hindi, Urdu, and Punjabi) speaking populations found in the central and northern regions and by Dravidian (e.g. Tamil, Telugu, and Kannada) speaking populations in the southern and southeastern regions. ... Although other interpretations may be possible, our data are consistent with a model in which nomadic populations from northwest and central Eurasia intercalated over millennia into an already complex, genetically diverse set of subcontinental populations. As these populations grew, mixed, and expanded, a system of social stratification likely developed in situ, spreading to the Indo-Gangetic plain, and then southward over the Deccan plateau. A strong patrilineal social structure, accompanied by a developing practice of caste endogamy, may have contributed to an asymmetric apportioning of Y-chromosome, autosomal, and to a lesser extent, mtDNA lineages."[8]

The geneticist PP Majumder (2010) has recently argued that the findings of Reich et al. (2009) concerning Indo-Aryan expansion into the Indian subcontinent are in remarkable concordance with previous research using mtDNA and Y-DNA:

"Central Asian populations are supposed to have been major contributors to the Indian gene pool, particularly to the northern Indian gene pool, and the migrants had supposedly moved into India through what is now Afghanistan and Pakistan. Using mitochondrial DNA variation data collated from various studies, we have shown that populations of Central Asia and Pakistan show the lowest coefficient of genetic differentiation with the north Indian populations, a higher differentiation with the south Indian populations, and the highest with the northeast Indian populations. Northern Indian populations are genetically closer to Central Asians than populations of other geographical regions of India... . Consistent with the above findings, a recent study using over 500,000 biallelic autosomal markers has found a north to south gradient of genetic proximity of Indian populations to western Eurasians. This feature is likely related to the proportions of ancestry derived from the western Eurasian gene pool, which, as this study has shown, is greater in populations inhabiting northern India than those inhabiting southern India. In general, the Central Asian populations are genetically closer to the higher-ranking caste populations than to the middle- or lower-ranking caste populations... . Among the higher-ranking caste populations, those of northern India are, however, genetically much closer than those of southern India. Phylogenetic analysis of Y-chromosomal data collated from various sources yielded a similar picture. Higher-ranking caste populations have been the torch-bearers of the Hindu caste system that was formalized by the Indo-European immigrants. It is likely, therefore, that there was a greater proportion of admixture between higher-ranking caste populations and Indo-Europeans. 
The fact that high-ranked caste populations inhabiting southern India do not exhibit as much affinity with central Asian populations as those of northern India may be explained by the recent finding that the south Indian, Dravidian speaking, populations may have admixed with north Indian populations bearing ancestral signatures of the western Eurasian gene pool more recently."[9]

The author summarizes his findings by stating that:

"Within India, consistent with social history, extant populations inhabiting northern regions show closer affinities with Indo-European speaking populations of central Asia that those inhabiting southern regions. Extant southern Indian populations may have been derived from early colonizers arriving from Africa along the southern exit route. The higher-ranked caste populations, who were the torch-bearers of Hindu rituals, show closer affinities with central Asian, Indo-European speaking, populations. ..."

Further building on Reich et al.'s characterization of the South Asian population as historically based on admixture of ANI (Ancestral North Indian) and ASI (Ancestral South Indian) populations, a 2011 session paper by Moorjani et al. states that a "major ANI-ASI mixture occurred in the ancestors of both northern and southern Indians 1,200-3,500 years ago, overlapping the time when Indo-European languages first began to be spoken in the subcontinent."[10] This evidence supports the idea of a significant introgression identifiable with the arrival of speakers of Indo-Aryan languages in the proto-historical period.

A 2011 study published in the American Journal of Human Genetics[11] indicates that Indian ancestral components are the result of a more complex demographic history than was previously thought. According to the researchers, South Asia harbours two major ancestral components, one of which is spread at comparable frequency and genetic diversity in populations of South and West Asia, the Middle East, the Near East and the Caucasus; the other component is more restricted to South Asia. However, rather than ruling out the possibility of Indo-Aryan migration, these findings suggest that the genetic affinities of both Indian ancestral components are the result of multiple gene flows over the course of thousands of years, with Indo-Aryan expansion into the subcontinent but one of many complex demographic episodes. The study authors write:

"Summing up, our results confirm both ancestry and temporal complexity shaping the still on-going process of genetic structuring of South Asian populations. This intricacy cannot be readily explained by the putative recent influx of Indo-Aryans alone but suggests multiple gene flows to the South Asian gene pool, both from the west and east, over a much longer time span."[12]

My edits sought to undo cherry–picking of studies (Autosomal DNA variation) and researchers (Reich and Basu) in the section. Post my edits, which you reverted as mass vandalism,[1] the section incorporated a variety of studies (mtDNA variation, Y Chromosome variation) and researchers, giving due weight to each. All the content added is sourced and the two links removed in the lead were dead. You'll have to give better reasons than mass vandalism to revert my edits. Correct Knowledge«৳alk» 05:11, 16 December 2012 (UTC)

You're not supposed to push edits through without consulting others. Anyway, the info you posted is practically worthless because the section is concerned with population substructure, which can only be accurately assessed using autosomal DNA. YDNA/mtDNA is already dealt with elsewhere in the article, but it's obvious you don't know how to read. --Āryāvastra (talk) 03:38, 17 December 2012 (UTC)

The section was titled Reconstructing Indian Population History till, without consulting anyone, you changed its title.[2] The section is obviously different from the other sections, which give details on mtDNA, Y and Autosomal studies, in the sense that it summarizes the conclusions from various studies on the origin of Indian sub–populations. So, your argument that YDNA/mtDNA is already dealt with doesn't hold much water. The section on Autosomal studies needs to be written like the other sections. Currently, the section only contains an abstract copied from just one study (Basu et al.) which uses mtDNA and Y along with Autosomal markers to arrive at its conclusions. So, technically even that one copied paragraph is unjustified in the section because the study is not purely based on Autosomal markers. Your other argument that population substructure... can only be accurately assessed using autosomal DNA is simply wrong. Autosomal DNA has always been unpopular among migration studies on India because unlike mtDNA and Y-Chromosome it does not show a simple direct lineage of inheritance.

Coming back to my original point, which you have skirted, the section as it stands cherry–picks researchers and studies and takes it to absurd limits. The quote from Watkins et al. (2008): The historical record documents... mtDNA lineages., which looks like the conclusion of his study, is actually just the background to his paper. The section of the paper which is just written to introduce and define the scope of the research is, well, cherry–picked over the far more important/reliable sections on process/results/conclusion. The article should not pick and choose winners or POVs. Incidentally, in the previous edit wars over this article, editors from two different sides were accustomed to picking and choosing their studies and researchers and exaggerating their claims. The section as it stood after my edits, as opposed to as it stands now, incorporated many more scholars, was better referenced and presented all sides of the debate, even if it wasn't perfect. Please also note, your disruption of my talk page,[3] changing the title of this section,[4] and one of your comments here weren't exactly civil.

Are you illiterate "Correct Knowledge"? YDNA/mtDNA are already dealt with in the previous sections. If you have something to add, put it there. Otherwise, the section on autosomal DNA and IE migration are concerned with comprehensive population structure which can only be assessed with autosomal DNA. Adding YDNA/mtDNA here would be out of context and redundant.Āryāvastra (talk) 16:11, 17 December 2012 (UTC)

You obviously didn't read anything I wrote above. Let me try again. Autosomal DNA and The Genetics of Indo-Aryan Migration (now that you've retitled this section) are two different sections. And the section on The Genetics of Indo-Aryan Migration already uses a mix of the three studies. The problems with this section are it cherry–picks researchers, uses studies older than 1990, quotes background of research papers as the conclusion of the study, does not reflect all POVs, overemphasizes some studies etc. etc. Not that this matters much, but Autosomal is not as reliable as Y–chromosome or mtDNA studies for tracing ancestral lineage.[5][6] The reasons for this are obvious. Y-chromosome and mtDNA are uniparental, Autosomal is not. Correct Knowledge«৳alk» 18:54, 17 December 2012 (UTC)

Do I have to repeat myself over and over again? The final passages serve as comprehensive reconstructions of population history, which can only be assessed using autosomal DNA. This is because autosomal DNA is biparental and therefore reflective of both male and female ancestries (total ancestry). Besides, YDNA/mtDNA have already been discussed elsewhere. Now go read a book, dunce, and save yourself from further humiliation. You're becoming tiresome. Āryāvastra (talk) 22:50, 17 December 2012 (UTC)

"which can only be assessed using autosomal DNA"... really? The 2011 study of American Journal of Human Genetics that you added to the Population reconstruction/Indo–Aryan migration section[7] refers to the previous mtDNA, Y–chromosome studies and tries to compare its results with them. It's impossible that you didn't already know this. You are obviously just being disruptive here. Leaving aside the reliability of Autosomal DNA, there is an argument to be made from WP:NPOV (neutrality) to include different types of studies and researchers. Even in the section as it stood after my edits Autosomal had the maximum length. So, I am not really arguing to have it removed. Please also read the links given in my previous post. The fact that Autosomal DNA is biparental makes the contribution of ancestors separated by 2 generations or more probabilistic. There is a reason why mtDNA and Y-chromosome are called lineage markers. Correct Knowledge«৳alk» 18:57, 18 December 2012 (UTC)

All your concerns regarding the unreliability of mtDNA/Y-chromosome have been adequately addressed. I've made my points on why diverse studies and researchers should be included in the Population reconstruction/Indo–Aryan migration section. If you have no other objections we can safely go back to the version of the article before you started reverting. Correct Knowledge«৳alk» 11:00, 20 December 2012 (UTC)

Please try to understand each other.--Andrew Lancaster (talk) 11:26, 20 December 2012 (UTC) A second remark: it appears some of the reverting going on is "knee jerk" to the extent that even things like formatting fixes are being reverted? That seems to show that proper care is not being taken to edit in a collegial way.--Andrew Lancaster (talk) 14:27, 21 December 2012 (UTC)

The last two paragraphs of the article mention the Indo-Aryan Migration theory. However, considering this is now considered a defunct theory of Western Nationalism and makes no mention of a study on caste or genetics it is best to remove it from the article. — Preceding unsigned comment added by 81.100.22.136 (talk) 02:24, 23 April 2013 (UTC)

Latest Study by Harvard Medical School: India’s Fragmented Society Was Once a Melting Pot

The preamble of this study says the following: India has been underrepresented in genome-wide surveys of human variation. We analyse 25 diverse groups in India to provide strong evidence for two ancient populations, genetically divergent, that are ancestral to most Indians today. One, the ‘Ancestral North Indians’ (ANI), is genetically close to Middle Easterners, Central Asians, and Europeans, whereas the other, the ‘Ancestral South Indians’ (ASI), is as distinct from ANI and East Asians as they are from each other. By introducing methods that can estimate ancestry without accurate ancestral populations, we show that ANI ancestry ranges from 39–71% in most Indian groups, and is higher in traditionally upper caste and Indo-European speakers. Groups with only ASI ancestry may no longer exist in mainland India. However, the indigenous Andaman Islanders are unique in being ASI-related groups without ANI ancestry. Allele frequency differences between groups in India are larger than in Europe, reflecting strong founder effects whose signatures have been maintained for thousands of years owing to endogamy. We therefore predict that there will be an excess of recessive diseases in India, which should be possible to screen and map genetically.

In recent years, genetic studies of modern Indians have provided a host of new insights into the ancient history of this sprawling nation, which harbors nearly one-sixth of the world’s population. A key finding, reported in 2009 by a team led by geneticist David Reich of Harvard Medical School in Boston, was that most Indians today are descendants of two major population groups: Ancestral North Indians (ANI), who probably migrated into the subcontinent 8000 or more years ago from the Middle East, Central Asia, and Europe; and Ancestral South Indians (ASI), who were native to the region and had been there much longer. The study also showed that these two groups began to mix at some point in the past, although just when was not clear.

Reich and his colleagues teamed up with researchers from the Council of Scientific and Industrial Research Centre for Cellular and Molecular Biology in Hyderabad, India, to take a much closer look at the genetics of modern Indians. Using both newly generated and previously published genetic data from 571 people representing 73 ethnic and language groups, 71 from India and two from Pakistan (which prior to Indian independence from British rule in 1947 was considered part of India), the team analyzed the genetic differences among the subjects using several powerful statistical methods. The analysis included nearly 500,000 genetic markers on the subjects’ DNA.

“There was a major demographic transformation in India from a region where mixture was pervasive to one in which it is very rare because of a shift to endogamy,” says lead author Priya Moorjani, a geneticist at Harvard Medical School.

The traces of this alternating pattern can be clearly seen in the genomes of modern Indians today, the study finds. For example, the percentage of ANI ancestry ranges from a high of 71% in the Pathan ethnic group of northern India to a low of 17% in the Paniya group of southwest India, meaning that the degree of ancient admixture is still measurable and significant in even the most isolated and endogamous ethnic groups.

“The most remarkable aspect of the ANI-ASI mixture is how pervasive it was, in the sense that it has left its mark on nearly every group in India,” Moorjani and her co-workers write.

What accounts for this pattern? The team points out that the period of intermarriage overlaps with a time of huge social upheavals in India, including the collapse of the ancient Indus civilization—which thrived on the Indian subcontinent between about 2600 B.C.E. and 1900 B.C.E.—as well as large-scale population movements and the rise of the Vedic religion, the predecessor of modern Hinduism. But after 1900 years ago, India’s caste system became a major cultural force, the team concludes, based on its new genetic findings and confirmed by evidence from ancient religious texts. The system rigidly defined four social classes, with the Brahmans at the top and the Sudras at the bottom. Intermarriage was not allowed between them. The Rig-Veda, India’s oldest surviving text and a founding document of ancient Hinduism, does not mention the caste system in its earliest sections, probably written some 3000 years ago; only much later are references to it found.

“The bulk of the Rig-Veda describes a society in which there is substantial movement among groups,” Moorjani points out. The four-caste system is only mentioned in an appendix written much later, she says, consistent with the genetic evidence.203.38.204.210 (talk) 04:37, 15 December 2013 (UTC)

By measuring the lengths of the chromosome segments of ANI and ASI ancestry in Indian genomes, the authors were thus able to obtain precise estimates of the age of population mixture, which they infer varied about 1,900 to 4,200 years, depending on the population analysed.

“Only a few thousand years ago was the Indian population structure vastly different from today,” says co-senior author David Reich, professor of genetics at Harvard Medical School.

“The caste system has been around for a long time but not forever. Prior to about 4,000 years ago there was no mixture. After that, widespread mixture affected almost every group in India, even the most isolated tribal groups. And finally, endogamy set in and froze everything in place,” he said.

“The fact that every population in India evolved from randomly mixed populations suggests that social classifications like the caste system are not likely to have existed in the same way before the mixture,” said co-senior author Lalji Singh, currently vice-chancellor of Banaras Hindu University in Varanasi and former director of CCMB. “Thus, the present-day structure of the caste system came into being only relatively recently in Indian history.”

While the findings show that no groups in India are free of such mixture, the researchers did identify a geographic element. “Groups in the north tend to have more recent dates and southern groups have older dates. This is likely because the northern groups have multiple mixtures,” co-first author Priya Moorjani said.

But once established, the caste system became genetically effective, the researchers observed. Mixture across groups became very rare.

“An important consequence of these results is that the high incidence of genetic and population-specific diseases that is characteristic of present-day India is likely to have increased only in the last few thousand years when groups in India started following strict endogamous marriage,” Thangaraj added.203.38.204.210 (talk) 04:42, 15 December 2013 (UTC)