Abstract

Indians undergoing socioeconomic and lifestyle transitions will be maximally affected by epidemic of type 2 diabetes (T2D). We conducted a two-stage genome-wide association study of T2D in 12,535 Indians, a less explored but high-risk group. We identified a new type 2 diabetes–associated locus at 2q21, with the lead signal being rs6723108 (odds ratio 1.31; P = 3.32 × 10−9). Imputation analysis refined the signal to rs998451 (odds ratio 1.56; P = 6.3 × 10−12) within TMEM163 that encodes a probable vesicular transporter in nerve terminals. TMEM163 variants also showed association with decreased fasting plasma insulin and homeostatic model assessment of insulin resistance, indicating a plausible effect through impaired insulin secretion. The 2q21 region also harbors RAB3GAP1 and ACMSD; those are involved in neurologic disorders. Forty-nine of 56 previously reported signals showed consistency in direction with similar effect sizes in Indians and previous studies, and 25 of them were also associated (P < 0.05). Known loci and the newly identified 2q21 locus altogether explained 7.65% variance in the risk of T2D in Indians. Our study suggests that common susceptibility variants for T2D are largely the same across populations, but also reveals a population-specific locus and provides further insights into genetic architecture and etiology of T2D.

Type 2 diabetes (T2D) has developed into a major health problem, responsible for early morbidities and mortality that affects over a billion people worldwide (1). Developing countries such as India will be maximally affected by the T2D epidemic, both in terms of morbidity/mortality and socioeconomic loss in the coming decade (1,2). India, with one-sixth of the world’s population, typical risk phenotypes, and rapid socioeconomic transitions, provides an important resource for understanding the pathogenesis of T2D (1,3).

Genome-wide association studies (GWAS) and subsequent meta-analyses have identified >56 susceptibility loci for T2D that collectively explain ∼10% of the disease risk (4–8). Although GWAS have greatly improved our understanding of the genetic basis of T2D, most of these studies have been performed in Europeans (9), and the studies involving South Asians are very limited (7,10). Interpopulation differences in allele frequencies and effect sizes have yielded the discovery of new loci in different populations (11). The relatively unexplored and high-risk Indian population provides an opportunity for genetic dissection of T2D and other metabolic disorders (3).

In this study, we performed a two-staged GWAS of T2D involving a total of 12,535 Indians. The study involved a genome scan of 2,465 subjects and replication of top signals in two ethnic groups of India comprising 7,221 Indo-Europeans and 2,849 Dravidian subjects. These two ethnic groups of India are genetically diverse due to different ancestral backgrounds (12); hence, data for these two ethnic groups have been provided separately with the primary focus on Indo-Europeans. Our study identified a new T2D–associated locus on 2q21, with a lead signal at rs6723108 near TMEM163. Imputation analysis and genotyping revealed a stronger signal on 2q21 at rs998451 within TMEM163. Further analysis suggested a probable mechanism of influence of the TMEM163 variants on T2D susceptibility through insulin secretion.

RESEARCH DESIGN AND METHODS

The present GWAS was performed in a two-stage study design (Fig. 1), involving a total of 12,535 Indians comprising 6,738 T2D patients and 5,797 nondiabetic control subjects. We performed a genome scan for 2,465 subjects including 1,256 T2D patients and 1,209 nondiabetic control subjects of Indo-European ethnicity in stage 1. Stage 2 comprised of replication of 708 single nucleotide polymorphisms (SNPs; stage 1, P < 1.0 × 10−3) not reported for association with T2D by previous studies and 25 SNPs of previously reported T2D-associated loci in 7,221 Indo-Europeans (3,937 T2D patients and 3,284 control subjects) and 2,849 Dravidians (1,545 T2D patients and 1,304 control subjects).

All subjects of the current study are part of the Indian Diabetes Consortium (INDICO) (13), a sample repository of two major ethnic groups of India, namely Indo-European and Dravidian (14). Informed written consent was obtained from all of the participants of the study. The study was approved by the ethics committees of the participating institutes (13) and carried out in accordance with the principles of the Helsinki Declaration.

Study participants

Stage 1.

Consecutive T2D patients attending the outpatient Department of Endocrinology clinic of All India Institute of Medical Sciences, New Delhi, India, prior to September 2008, were enrolled as case subjects. T2D was diagnosed according to the World Health Organization criteria (i.e., fasting plasma glucose level >126 mg/dL [7.0 mmol/L] or 2-h postglucose load of >200 mg/dL [11.1 mmol/L]) (15). Control group included subjects who met the following criteria: 1) ≥40 years of age; 2) HbA1c level ≤6.0%; 3) fasting plasma glucose level <110 mg/dL (6.1 mmol/L); 4) no family history of diabetes in first-degree relatives; and 5) urban dwellers. All of the control subjects were collected from Diabetes Awareness Camps organized in various localities of Delhi and nearby regions as described previously (13).

Stage 2.

Stage 2 comprised subjects from two ethnic groups: Indo-European and Dravidian. The ethnicity of the subject was defined primarily on the basis of geographical location of the recruitment along with their birth place and place of origin of both the parents. Indo-European subjects were recruited from northern part of the India that included mainly Delhi and neighboring states, whereas Dravidian subjects were collected from Chennai, located in southern part of India.

Indo-European case subjects included consecutive T2D patients recruited from the outpatient departments of collaborating hospitals: All India Institute of Medical Sciences (New Delhi), Guru Teg Bahadur Hospital (Delhi), and Sawai Man Singh Hospital (Jaipur). T2D patients recruited from the All India Institute of Medical Sciences (New Delhi) for stage 2 included only those subjects who attended the clinic after September 2008. Subjects with self-reported diabetes and/or taking medication for diabetes and newly diagnosed T2D patients were also enrolled from Diabetes Awareness Camps. T2D was diagnosed according to the World Health Organization criteria. Control subjects were recruited based on the criteria as in stage 1, except that the fasting glucose level <100 mg/dL (5.6 mmol/L) was considered as the cutoff for subjects with missing information about HbA1c levels. HbA1c levels were not available for 1,198 subjects in stage 2 of the study.

Subjects in the Dravidian group were recruited from an ongoing epidemiology study, Chennai Urban Rural Epidemiological Study, on a representative population (>20 years of age) of Chennai (16). Subjects with self-reported diabetes taking drug treatment for diabetes or confirmed by oral glucose tolerance test to have a 2-h postload plasma glucose level >200 mg/dL (11.1 mmol/L) were classified as T2D patients. Control subjects included randomly selected subjects with a 2-h plasma glucose level <140 mg/dL (7.8 mmol/L) and fasting plasma glucose <100 mg/dL (5.6 mmol/L). All recruited subjects underwent anthropometric and clinical measurements. Height, weight, and waist and hip circumferences were measured using standard methods as described previously (13,16). Levels of glucose, HbA1c, insulin, C-peptide, total cholesterol, triglycerides, HDL cholesterol, LDL cholesterol, and blood pressure were measured as described previously (13,16). Homeostatic model assessment of insulin resistance (HOMA-IR) was calculated as provided by Matthews et al. (17). Anthropometric and clinical characteristics of the subjects are provided in Supplementary Table 1.

Stage 1 genotyping and analysis.

A genome scan was performed using Illumina Human610-Quad BeadChips (Illumina Inc., San Diego, CA). Genotype calls were assigned using the GenCall algorithm as implemented in Illumina GenomeStudio (version 2010.3; Illumina Inc.). Stringent Quality Control (QC) criteria for filtering SNPs and samples for analyses were applied separately in T2D case and control subjects (Supplementary Fig. 1). Samples with low call rate (<95%), sex discrepancies, and cryptic relatedness (pi_hat >0.1875) were removed. Cryptic relatedness was determined by identity-by-descent analysis using PLINK v1.07 (http://pngu.mgh.harvard.edu/~purcell/plink) (18). SNPs with minor allele frequency (MAF) <0.01, either in case or control subjects, were excluded (Supplementary Fig. 1). Further, from the SNPs with a MAF 0.01–0.05, those with a call rate <99% and Hardy-Weinberg equilibrium (HWE) of P < 5.7 × 10−4 were excluded, whereas from SNPs with a MAF >0.05, those with call rate <95% and HWE of P < 1 × 10−7 were removed.

Principal component analysis was performed in stage 1 samples (without inclusion of HapMap samples) to assess population structure and identify outliers with 116,631 independent pruned SNPs using SMARTPCA as implemented in EIGENSOFT version 3.0 (Supplementary Fig. 2). Pruning of SNPs was carried out with successfully genotyped autosomal SNPs using the –indep-pairwise option of PLINK v1.07, with a window size of 50 SNPs, step size of 5, and r2 <0.2. We also removed SNPs within regions of known high linkage disequilibrium (LD) (19). The first two principal components (PC1 and PC2) had maximum contribution with eigenvalues of 8.41 and 4.33, respectively. A total of 89 samples were identified as outliers that were removed using the outlier removal option of SMARTPCA (σ threshold of 6 with 5 iterations along first 10 principal components). Principal component analysis was also performed using our samples with reference samples from HapMap Phase III populations CEU (Utah residents with Northern and Western European ancestry from the Centre d'Etude du Polymorphisme Humain collection), GIH (Gujarati Indians in Houston, TX), YRI (Yoruban in Ibadan, Nigeria), JPT (Japanese in Tokyo, Japan), and CHB (Han Chinese in Beijing, China) using first two eigenvectors.

Association of QC-passed SNPs with T2D was tested by logistic regression analysis under a log-additive model adjusting for age, sex, and potential population stratification using PLINK v1.07. The first two principal components (PC1 and PC2) that accounted for the maximum variance were used as covariates to adjust for residual population stratification. Deviation between observed P values for association and the expected P values under the null hypothesis was investigated by constructing quantile–quantile plot. The −log10 of P values observed for the association of SNPs under the additive model adjusted for age, sex, and first two principal components was plotted against the theoretical −log10 P values expected under the null hypothesis using STATA version 10.1 (Stata Corporation, College Station, TX). The genomic inflation factor (λ) was calculated from the average χ2 statistics using PLINK v1.07.

Stage 2 genotyping and analysis.

Genotyping of the selected SNPs in stage 2 was performed using the Illumina Golden Gate assay following the manufacturer’s instructions (http://support.illumina.com/documents/MyIllumina/2bbe5885-34be-4e2e-b655-4848b7d3e3d7/GoldenGate_Genotyping_Assay_Guide_15004065_B.pdf; Illumina Inc.). SNPs with a genotype confidence score <0.25, call frequency <0.90, GenTran score <0.60, cluster separation score <0.40, minor allele frequency <0.01, and HWE of P < 1.0 × 10−5 were excluded. All of the QC criteria for SNPs were applied separately in case and control subjects, and only those SNPs that qualified QC both in case and control subjects were taken forward for analysis. Samples with a <0.90 call rate were also removed. A total of 256 samples (∼3%) were genotyped in duplicates, which showed >99% concordance in genotype calls. Association of SNP T2D was assessed separately in Indo-Europeans and Dravidians under the additive model using logistic regression analysis adjusting for age and sex.

Results of stages 1 and 2 were combined by inverse variance method under fixed effect model using METAL (http://www.sph.umich.edu/csg/abecasis/Metal/) (20). Association of the top four loci with quantitative clinical traits including measures of adiposity, dysglycemia, and dyslipidemia was tested using linear regression analysis under the additive model adjusted for age, sex, and BMI as appropriate. Only nondiabetic control subjects from stages 1 and 2 were considered for association analysis with clinical traits. Pairwise LD between SNPs at the associated loci was determined using Haploview 4.0 software (21). Haplotype association analyses were adjusted for age and sex and assessed at 10,000 permutations using PLINK v1.07.

Imputation analysis.

SNPs within a 2-Mb region around four lead signals were imputed in final post-QC genotype data using a 1000 Genomes Phase 1 (interim) panel that consists of combined phased haplotypes of 1,094 individuals from Africa, Asia, Europe, and America (build 37) using IMPUTE v2.1.2. (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html). Association was tested using SNPTEST v2.2.0 (https://mathgen.stats.ox.ac.uk/genetics_software/snptest/snptest.html) adjusting for age, sex, PC1, and PC2. Association analysis was done for SNPs with a call rate >99%, threshold posterior probability of 0.90, and proper_info >0.50.

Genotyping and analysis of the of the top imputed SNP.

The SNP rs998451 that showed most significant association after imputation was also confirmed by genotyping. Genotyping was performed in a total of 5,262 subjects (2,538 T2D patients and 2,724 nondiabetic control subjects) that included 2,036 subjects (1,087 T2D patients and 949 nondiabetic control subjects) of stage 1 and 3,226 subjects (1,451 T2D patients and 1,775 nondiabetic control subjects) of stage 2. Genotyping was carried out by single-nucleotide extension of SNP-specific probes using a SNaPshot ddNTP primer extension kit. The extended probes were then electrophoresed on an ABI Prism 3100 genetic analyzer (Applied Biosystems, Foster City, CA), and genotypes were determined using GeneMapper Software v4.0. Call rate of the SNP was >99%, and genotypes did not deviate from HWE (P > 0.05 in control subjects) in stage 1 or stage 2 study populations. Association analyses were performed as mentioned above for stage 1 and stage 2.

In silico replication.

Associations of four top loci obtained in meta-analysis of Indo-European samples were tested in Europeans by in silico replication using GWAS phase data from the DIAGRAM+ study. DIAGRAM+ comprises GWAS phase data of 8,130 individuals with T2D and 38,987 control subjects of European descent with an effective sample size of 22,570 subjects (22). Meta-analysis by combining summary statistics of results across the studies was performed by inverse variance method under fixed-effect model using METAL.

Calculation of phenotypic variance explained by selective SNPs.

The GCTA tool (23) was used for calculating the variance explained by combination of 56 SNPs from the 56 known loci of T2D and the SNP rs6723108 near TMEM163 identified by the current study. The analysis was performed assuming a disease prevalence of 10% and adjusting for sex, age, PC1, and PC2. Assuming a disease prevalence of 10% would mean the rest of the subjects (90%) are control subjects and should represent the remaining liability distribution, which is not true in the current study, as control subjects were defined based on certain other criteria in addition to being unaffected with T2D. Thus, results of this analysis are approximate.

Power calculation.

Statistical power of the study was calculated for a range of allele frequencies from 0.01 to 0.50, with odds ratios (ORs) ranging from 1.10 to 1.40, using Quanto software (http://hydra.usc.edu/gxe/), assuming a log-additive model of inheritance and 10% prevalence of disease at α = 0.05 (Supplementary Fig. 3).

RESULTS

After stringent QC, a successfully genotyped dataset for 1,101 T2D patients and 1,027 control subjects at 536,420 autosomal SNPs was obtained in stage 1 (Supplementary Fig. 1). The quantile–quantile plot showed good agreement with null distribution (Supplementary Fig. 4). The genomic control inflation factor (λ) was 1.09. Consistent with previous reports, TCF7L2 was identified as the strongest signal in Indians as well (OR 1.56; P = 5.01 × 10−11) (Fig. 2). In stage 2, 708 SNPs (stage 1, P < 1.0 × 10−3) not reported for association with T2D by previous studies and 25 SNPs of previously reported T2D-associated loci were genotyped. QC filtering yielded data for 653 SNPs in 6,552 Indo-European subjects and 635 SNPs in 2,256 Dravidians. We tested association of SNPs with T2D separately in Indo-Europeans and Dravidians and combined the results of two stages for Indo-European subjects by meta-analysis using the fixed-effect inverse variance method (20).

Manhattan plot of the association P values for T2D. The −log10 P values for association of directly genotyped SNPs are plotted as a function of genomic position (National Center for Biotechnology Information Build 37). The P values were determined using logistic regression adjusted for age, sex, PC1, and PC2 in stage 1 analysis. Each chromosome (Chr) has been represented with a unique color.

Replication analysis of 708 previously unreported SNPs in the independent study population of Dravidians showed association of 19 of them (P < 0.05) with consistent direction with Indo-Europeans (Supplementary Table 6). Among the top four loci in Indo-Europeans (P < 5.0 × 10−6; Table 1), only rs1929752 (FLJ35379) was nominally associated, but with opposite effect. SNPs rs10461617 (MAP3K1) and rs11165354 (TGFBR3) showed consistency in direction but were not associated. The observed discrepancies in association results might be due to differences in allele frequencies or effect sizes of these variants between Indo-Europeans and Dravidians (Supplementary Table 6); however, a larger study population of Dravidian origin would be required to state anything conclusively. We carried out in silico replication of top four loci—TMEM163, MAP3K1, TGFBR31, and FLJ35379 in DIAGRAM+—and performed meta-analysis (Supplementary Fig. 6). The SNP rs6723108 (TMEM163) had the same direction of effect in DIAGRAM+; however, the association was not statistically significant. At the MAP3K1 locus (rs10461617), suggestive association with consistent direction was observed in DIAGRAM+. Nominal association of rs11165354 of TGFBR3 in DIAGRAM+ was found (P = 0.036), but there was inconsistency in direction of effect. The observed differences in association results might be due to differences in allele frequencies and LD structures in Europeans and Indians.

Further, we investigated association of 56 T2D loci identified by previous GWAS and assessed heterogeneity in effect of the present and previous studies (Tables 2 and 3). A total of 49 loci showed consistency in direction of association with similar effect sizes with previous reports (P-heterogeneity >0.009, P value corrected for multiple testing), and association for 25 of these loci was also observed (P < 0.05). The seven loci HNF1B, DGKB-TMEM195, SRR, KCNQ1, GLIS3, GCC1-PAX4, and PSMD6 were not consistent in direction, two of which (HNF1B and SRR) also showed significant heterogeneity of effect even when our study was sufficiently powered (>98%) to detect the associations. In Dravidians, association analysis of 22 known loci that were genotyped in stage 2 (Supplementary Table 7) showed that 17 loci have consistency in direction and similar effect size, with 7 of these attaining statistical significance (P < 0.05).

To explore the possible functional mechanisms of the identified loci, we performed association analysis with quantitative traits. The risk allele of rs6723108 was associated with decrease in fasting plasma insulin (β= −5.71 pmol/L; P = 3.7 × 10−3) and HOMA-IR (β= −0.21; P = 7.4 × 10−3) in Indo-Europeans (Supplementary Table 8). Similarly, rs998451 was also associated with decrease in fasting plasma insulin (β= −5.54 pmol/L; P = 0.02) and HOMA-IR (β= −0.20; P = 0.04). This suggests that TMEM163 variants might modulate T2D susceptibility by affecting insulin secretion. Previous studies have shown that at this 2q21 region, rs6723108 and a variant in RAB3GAP1 (rs6730157) are associated with waist circumference (24) (P = 6.2 × 10−5) and total cholesterol (P = 8.5 × 10−8) (25), respectively. However, variants from this locus were not associated with these traits in our study, which could be due to the small effect size of these variants for these traits in the current study population or lower power of the current study to detect them.

DISCUSSION

GWAS has enormously contributed to the identification of susceptibility genes for T2D and many other complex disorders. The discovery of new susceptibility loci in GWAS of different ethnic groups emphasizes the need for conducting more GWAS in populations of diverse ethnic groups. Our study also shows the potential of identifying new signals through GWAS in population of different ethnicity. In this study, we performed a GWAS in Indians and identified a new signal for T2D on chromosome 2q21.

The strongest signal on newly identified locus 2q21 mapped to TMEM163. The association of TMEM163 variants (rs6723108 and rs998451) with reduced fasting plasma insulin levels and HOMA-IR indicates that it might modulate susceptibility to T2D by affecting insulin secretion. Previously, rs6723108 has also been suggested to be associated with waist circumference. These data indicate the biological relevance of the identified locus with T2D. TMEM163 encodes a synaptic vesicle membrane protein with six putative transmembrane helices that may function as a vesicular transporter. TMEM163 is expressed in select brain regions and contained in subpopulations of nerve terminals (26,27). This region also harbors RAB3GAP1 and ACMSD, which are shown to be involved in neurologic disorders (28). RAB3GAP1 encodes a catalytic subunit of a Rab GTPase-activating protein involved in regulated exocytosis of neurotransmitters and hormones. Mutations in RAB3GAP1 are associated with neurodevelopmental abnormalities known as Warburg micro syndrome (OMIM 600118). ACMSD codes for an enzyme involved in picolinic and quinolinic acid homeostasis and NAD biosynthesis (29,30). ACMSD activity and expression have been shown to be elevated in streptozotocin-induced diabetic rats and suppressed by insulin injection and is suggested as a potential therapeutic avenue for diabetes treatment (30). Thus, our results suggest the involvement of neurologic component in the etiology of T2D, which needs to be confirmed through further investigations.

The signal on 2q21 observed in Indo-Europeans was not detected in Dravidians and Europeans from DIAGRAM+. The observed differences in association results might be due to differences in allele frequencies and LD structures or statistical power to detect the association. In Dravidians, the minor allele frequency of the index variant rs6723108 is very low (0.03) and differs significantly from that of Indo-Europeans, and hence, to capture the association of this variant, if any, a larger sample size is required. Thus, the statistical power of the Dravidian study population might not be sufficient to detect the observed association in Indo-Europeans (Supplementary Fig. 3). Among Europeans from DIAGRAM+ study, direction of effect was same as found in Indo-Europeans, but the association could not reach statistical significance.

MAP3K1 was identified as a promising candidate for T2D, which has not been reported earlier, but needs further investigation. MAP3K1 codes for serine/threonine protein kinase and has been shown to have an essential role in stress-induced β-cell death (31) and inhibits transcription of insulin (32). We observed suggestive association at rs10461617 in the 5′ flank of MAP3K1. The rs10461617 was also observed to have nominal association with T2D in Dravidians and DIAGRAM+ with consistency in direction of effect. Although the association of MAP3K1 variant could not reach genome-wide significance, these results and biological roles of MAP3K1 suggest it as an important candidate for T2D.

Most of the previously reported T2D-associated loci from earlier GWAS showed consistency in direction of association with similar effect sizes in Indo-European subjects of the present study, of which 25 were also associated at P < 0.05. Association at other known loci could not be detected mainly due to the low power of the current study to detect those associations, as most of these loci were identified through meta-analysis of GWAS, including large study populations. Only at loci PTPRD, HNF1B, THADA, and SRR, we did not observe association even when our study was sufficiently powered (>98%) to detect the previously reported effect sizes. The effect sizes at these loci in our study population could be very small; thus, a larger study population would be required to confirm their association status in our population. We also found consistent results as in the recent GWAS in South Asians. Of the six loci identified in the recent GWAS in South Asians (7), HNF4A has been shown to be associated in our previous study (33) involving study subjects from the current study and was also associated in the current study (P = 0.004). The loci HMG20A was also associated in the current study (P = 0.02) while the other four loci (GRB14, ST6GAL1, VPS26A, and AP3S2) showed consistency in direction and effect size. All of the previously associated loci in Indo-Europeans accounted for 7.17% of the variance in the risk of T2D in our study population, which is similar to the observations in previous studies (5 to 10%). The newly identified 2q21 locus alone contributed 0.48% of the variance, and addition of 2q21 locus to the list increased the total variance explained to 7.65%. Thus, our analysis provides further evidence for the belief that underlying disease mechanisms involving common variants are largely similar among populations (11).

In conclusion, we identified a new locus associated with T2D on the 2q21 region. The identified region 2q21 harbors the genes TMEM163, RAB3GAP1, and ACMSDI, and incidentally, all three are involved in neurologic processes, suggesting a neurologic component in the etiology of T2D. The findings provide new insights into the genetic architecture and pathophysiology of T2D; however, further studies exploring the functional and clinical significance of the identified locus are warranted.

ACKNOWLEDGMENTS

The major funding for this work comes from Council for Scientific and Industrial Research, Government of India, in the form of the grant “Diabetes mellitus—New drug discovery R&D, molecular mechanisms, and genetic and epidemiological factors” (NWP0032-19). R.T. received a postdoctoral fellowship from the Fogarty International Center and the Eunice Kennedy Shriver National Institute of Child Health and Human Development at the National Institutes of Health (D43-HD-065249).

No potential conflicts of interest relevant to this article were reported.

R.T., G.C., and O.P.D. researched data, contributed to discussion, and wrote the manuscript. A.M., A.J., I.K., K.B., T.S., B.J.M., Y.P., M.I.M., R.V., and V.M. researched data and contributed to discussion. M.C., A.S., S.C., S.S., L.R., P.V., S.K.A., S.G., D.P., R.K.S., M.S., M.B., S.M., A.Bh., V.N.S., S.V.M., R.K.M., A.Ba., and V.S. researched data. N.T. and D.B. researched data, contributed to discussion, and reviewed and edited the manuscript. D.B. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

The authors thank all of the study subjects for participation. The authors also thank A.K. Sharma (Council for Scientific and Industrial Research-Institute of Genomics and Integrative Biology) for helping in sample collection and Prof. Greg Gibson (Georgia Institute of Technology, Atlanta, Georgia) for critical evaluation of the manuscript.