Abstract

Type 2 diabetes mellitus has been at the forefront of human diseases and phenotypes studied by new genetic analyses. Thanks to genome-wide association studies, we have made substantial progress in elucidating the genetic basis of type 2 diabetes. This review summarizes the concept, history, and recent discoveries produced by genome-wide association studies for type 2 diabetes and glycemic traits, with a focus on the key notions we have gleaned from these efforts. Genome-wide association findings have illustrated novel pathways, pointed toward fundamental biology, confirmed prior epidemiological observations, drawn attention to the role of β-cell dysfunction in type 2 diabetes, explained ~10% of disease heritability, tempered our expectations with regard to their use in clinical prediction, and provided possible targets for pharmacotherapy and pharmacogenetic clinical trials. We can apply these lessons to future investigation so as to improve our understanding of the genetic basis of type 2 diabetes.

Introduction

During the last decade there has been an outpouring of studies providing clues into the genetic architecture underlying complex diseases. Type 2 diabetes mellitus (T2D) has been at the forefront of human diseases and traits studied by new genetic analyses. Prior to genome-wide association studies (GWAS), the primary methods used to establish a link between genotype and phenotype were linkage analysis and candidate gene approaches. Linkage analysis relies on shared DNA segments inherited from common ancestors coupled with phenotypic information. This method was useful in identifying familial genetic variants with large effects, such as those giving rise to maturity-onset diabetes of the young (MODY).1 When applied to common T2D, Reynisdottir et al. identified segments in chromosomes 5 and 10 with suggestive linkage to T2D.2 The chromosome 10 region harbored TCF7L2, one of the T2D-associated genes with strongest effects, although not sufficiently high to explain the original linkage signal.3

Linkage analysis can detect rare genetic loci that strongly influence a disease, but proved limited in unveiling common genetic variants with a more modest impact on complex diseases (Figure 1). Association methods could detect genetic variants causing smaller effects. Unlike linkage analysis, in their earlier form association methods could only be applied to candidate genes: they were therefore biased in assuming that a specific locus caused disease based on biological plausibility, and limited in that they depended on prior knowledge with a consequent disregard for intergenic regions. From this work, PPARG and KCNJ11 emerged as two candidate genes both of which encode targets for anti-diabetes medications, and harbor missense variants associated with T2D.4-6

Thanks to the completion of the Human Genome and International HapMap Projects (see below), the novel approach of searching for genetic associations in a “genome-wide” fashion came to fruition (Figure 2). Thus, scientists embarked on GWAS, which allowed them to discover multiple gene variants with individually small effects. Once a specific polymorphism is associated with a disease, it is usually annotated by naming the gene in closest proximity to it. However, this does not necessarily mean that the variant in question is the molecular defect responsible for the phenotype, nor does it implicate the nearest gene; it simply flags a genomic region that harbors the causal variant, which may itself be acting at a certain distance, for instance by modulating expression of a far-away gene. Therefore, while association signals are often identified by gene names, only in a few cases has a causal relationship been demonstrated, typically via fine-mapping and functional approaches.

T2D GWAS have substantially advanced our knowledge of the genetic basis of T2D, but a large portion of the heritability remains unexplained. This review summarizes the concept, history, and recent discoveries produced by GWAS for T2D and glycemic traits. It will emphasize what we have learned from the explosion of data in the last ten years, and how we can apply our knowledge to future investigation to gain further understanding of the genetic basis of T2D.

Launching GWAS

The ability to interrogate the entire genome was made possible by two key advances: the Human Genome Project, with a draft sequence in 2001 and near-complete sequence in 2003, and the International HapMap project, with its first phase completed in 2005 and currently in its third phase.7-10 Progress in high-throughput and affordable genotyping technology, analytical tools to assist in the data mining, cleaning and interpretation of large databases, and the assembly of international collaborations combining well-phenotyped cohorts made utilizing these advances possible.

Following the completion of the Human Genome Project, a search for genetic variation that might explain phenotypic diversity and an individual’s risk of disease ensued. Assaying single nucleotide polymorphisms (SNPs) became a mainstream way to study the association of genetic variation and disease. A SNP is a single nucleotide DNA sequence variant in the genome that differs between members of the same species or between the paired chromosomes of an individual. SNPs occur on average every 300 base pairs, have a low rate of recurrent mutation, and are most often binary in nature. Several million SNPs were discovered and deposited in public databases.11 Initially, the HapMap genotyped 3.9 million SNPs in 270 DNA samples among four different ethnic groups and defined the underlying patterns of the inheritance of genetic variation. The inheritance pattern is quantified by linkage disequilibrium (LD), which represents the likelihood that alleles of nearby SNPs will stay together and preserve their linear arrangement on a haplotype during meiosis. This likelihood is dependent on recombination rates, with recombination events more likely to separate alleles that lie further apart. In this manner, two SNPs in strong LD will be inherited together more frequently than two SNPs in weak LD. By knowing this correlation structure, investigators and chip manufacturers only have to query a smaller subset of SNPs, or “tag” SNPs, to design genotyping arrays and conduct association analyses that essentially capture the majority of remaining common genomic variation. Genetic variants that are not directly genotyped can then be imputed from the genotyped “tag” SNP subset. Imputation infers the allele at an ungenotyped SNP from its degree of LD with alleles at directly genotyped variants.12
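As a minimal illustration of how LD between two biallelic SNPs can be quantified, the sketch below computes D, |D′| and r2 from phased haplotype counts. The function name `ld_stats` and the counts used with it are hypothetical and chosen only for illustration; they are not HapMap data, and real analyses typically have to estimate haplotype frequencies from unphased genotypes.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, abs_D_prime, r2) for two biallelic SNPs.

    n_AB counts haplotypes carrying allele A at SNP 1 and allele B at SNP 2,
    n_Ab counts haplotypes carrying A and b, and so on (hypothetical inputs).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total               # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total       # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total       # frequency of allele B at SNP 2

    # D: departure of the haplotype frequency from independence
    D = p_AB - p_A * p_B

    # D' rescales D by its maximum attainable magnitude given allele frequencies
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    abs_D_prime = abs(D) / d_max if d_max > 0 else 0.0

    # r2: squared correlation between the alleles at the two SNPs
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = D * D / denom if denom > 0 else 0.0
    return D, abs_D_prime, r2


# Two SNPs whose alleles always travel together versus two independent SNPs
perfect = ld_stats(50, 0, 0, 50)
independent = ld_stats(25, 25, 25, 25)
```

An r2 near 1 means that one SNP is a near-perfect proxy ("tag") for the other, which is the property that tag-SNP selection and imputation rely on.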

Initial GWAS for T2D

GWAS have led to the discovery of 38 SNPs associated with T2D, in addition to nearly two dozen SNPs associated with glycemic traits (Table 1). Notably, although these SNPs have been associated with the trait of interest beyond reasonable statistical doubt, save a few exceptions they have not yet led to the identification of the specific causal variant(s).

The first GWAS for T2D was conducted in a French discovery cohort composed of 661 cases of T2D (body mass index [BMI] <30 kg/m2, first-degree family history of T2D) and 614 non-diabetic controls. In total, 392,935 SNPs culled from two different genotyping platforms were analyzed for association with T2D. Although two associations were not reproducible in follow-up studies (LOC387761, EXT2), this study identified novel and reproducible association signals at SLC30A8 and HHEX, and validated the well-known association at TCF7L2.13 Investigators from the Icelandic company deCODE and their collaborators confirmed the association of loci SLC30A8 and HHEX with T2D and identified an additional signal in CDKAL1.14 On the same day, three collaborating groups, the Wellcome Trust Case Control Consortium (WTCCC), the Finland-United States Investigation of NIDDM Genetics (FUSION) group, and the Diabetes Genetics Initiative (DGI), published their findings replicating SLC30A8 and HHEX, and independently discovering novel associations at CDKAL1, IGF2BP2 and CDKN2A/B.15-17
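The basic unit of a case-control GWAS is a per-SNP association test. The sketch below shows one simple form, an allelic test on a 2×2 table of allele counts, returning the odds ratio and a 1-degree-of-freedom chi-square P value. The function name and the counts are hypothetical, and this is not a description of the specific statistics used in the studies cited above, which typically involve additional modeling and quality control.

```python
import math


def allelic_test(case_risk, case_other, ctrl_risk, ctrl_other):
    """Allelic odds ratio and Pearson chi-square test (1 df) from allele counts.

    Each argument is a count of alleles (two per genotyped individual).
    """
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d

    # Odds of carrying the risk allele in cases relative to controls
    odds_ratio = (a * d) / (b * c)

    # Pearson chi-square statistic for a 2x2 table
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical allele counts: 60/40 in cases versus 40/60 in controls
odds_ratio, chi2, p_value = allelic_test(60, 40, 40, 60)
```

In a genome-wide scan this test is repeated for every SNP, which is why a stringent significance threshold is required.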

More power to detect low-effect size common variants with increasing sample size

WTCCC, FUSION, and DGI ultimately combined their data to form the Diabetes Genetics Replication and Meta-analysis (DIAGRAM) consortium, which led to a substantial increase in sample size and, thus, power to detect common genetic variants with low effect size.18 Table 1 summarizes the findings of each of these cohorts and consortia.
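To make the link between sample size and power concrete, the following sketch uses a rough normal approximation for a Wald test on the log odds ratio of an allelic 2×2 table. All inputs (risk-allele frequency of 0.3 in controls, odds ratio of 1.15, the two sample sizes) are hypothetical, the function name `approx_power` is illustrative, and the approximation ignores the negligible opposite tail; it is meant only to show the trend, not to reproduce any published power calculation.

```python
import math
from statistics import NormalDist


def approx_power(n_cases, n_controls, ctrl_freq, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided allelic test at significance level alpha."""
    # Risk-allele frequency in cases implied by the allelic odds ratio
    odds_ctrl = ctrl_freq / (1 - ctrl_freq)
    odds_case = odds_ratio * odds_ctrl
    case_freq = odds_case / (1 + odds_case)

    # Expected allele counts (two alleles per person)
    a = 2 * n_cases * case_freq
    b = 2 * n_cases * (1 - case_freq)
    c = 2 * n_controls * ctrl_freq
    d = 2 * n_controls * (1 - ctrl_freq)

    # Standard error of the log odds ratio and the expected test statistic
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_effect = abs(math.log(odds_ratio)) / se

    # Critical value for the chosen two-sided significance level
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z_effect - z_crit)


# Same modest effect, ten-fold difference in sample size (hypothetical values)
power_small = approx_power(2_000, 2_000, ctrl_freq=0.3, odds_ratio=1.15)
power_large = approx_power(20_000, 20_000, ctrl_freq=0.3, odds_ratio=1.15)
```

Under these assumptions the smaller study has very little chance of reaching genome-wide significance for a variant of this size, whereas the larger one is very likely to detect it.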

Most T2D genetics cohorts have now coalesced to form DIAGRAM+, which achieved an effective sample size of over 22,000 subjects of European origin. In a recent report, 2,426,886 imputed and genotyped autosomal SNPs, with additional interrogation of the X-chromosome, were examined for association with T2D as a categorical phenotype. Fourteen signals, also shown in Table 1, reached genome-wide significance (P<5×10−8) in association with T2D (this threshold is determined by correction for the estimated one million independent tests that capture common genomic variation in non-African populations19).
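The genome-wide significance threshold quoted above follows from a Bonferroni-style correction: a conventional family-wise error rate of 0.05 divided by roughly one million independent tests. The short sketch below works through that arithmetic; the function name is hypothetical, and the 0.05 starting level is the usual convention rather than a value stated in the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# 0.05 divided by about one million independent common-variant tests
genome_wide_threshold = bonferroni_threshold(0.05, 1_000_000)
```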

In addition to nine novel loci, this most recent meta-analysis allowed for the confirmation of other loci previously associated with T2D among smaller cohorts including IRS1, MTNR1B and KCNQ1.20-25 In independent work, IRS1 (encoding the insulin receptor substrate-1) had been associated with T2D, development of hyperglycemia, insulin resistance by homeostasis model assessment (HOMA-IR26), fasting glucose, and fasting insulin.27 Several GWAS for fasting glucose as a quantitative trait, described in more detail below, had already identified MTNR1B as a locus influencing fasting hyperglycemia, making it a candidate locus for association with T2D.21-23 A third SNP, rs231362, is located in an intron of KCNQ1 in chromosome 11 which overlaps the KCNQ1OT1 transcript, thought to influence expression of CDKN1C which regulates β-cell development;28 an independent signal in KCNQ1 had been associated with T2D in Japanese, Korean, Chinese and European populations.24,25 Two additional loci were near genes that had been linked with T2D in previous studies, but not at genome-wide significance: HNF1A harbors rare mutations that account for MODY, and BCL11A had shown suggestive association not reaching genome-wide significance in the first DIAGRAM discovery meta-analysis.18

GWAS for continuous glycemic traits

Initial GWAS interrogated the genetic determinants of T2D as a dichotomous phenotype (disease vs. no disease), rather than examining continuous glycemic traits. Prior to the development of GWAS, Weedon et al. had examined GCK, a gene in which rare mutations cause a defect in the rate-limiting enzyme that drives glucose metabolism in the β cell, resulting in MODY2. Using the candidate gene approach, they demonstrated that two common variants, rs1799884 and rs3757840, are associated with fasting glucose, surpassing or approaching genome-wide significance (1×10−9 and 8×10−7, respectively).29

The association of GCK with fasting glucose and the success of categorically-driven GWAS primed the field to examine the genetic factors that contribute to the inter-individual variation in glycemic measures in normoglycemic subjects. Using a genome-wide approach G6PC2 and MTNR1B were associated with fasting glucose. It is postulated that G6PC2 modulates the glycolytic pathway and insulin secretion by dephosphorylating glucose-6-phosphate generated by the β-cell glucose sensor, glucokinase. In a French cohort, carriers of the A allele for rs560887 in the third intron of G6PC2 had decreased fasting plasma glucose and lower risk of developing mild hyperglycemia in 9-year follow-up.30 Shortly following the publication of these findings, a study of nondiabetic individuals from Finland and Sardinia reported an association between SNP rs563694 and fasting glucose concentrations. SNP rs563694 is in high LD (r2=0.729) with and only 11 kb away from the SNP discovered in the French cohort, rs560887.31

In early 2009, three groups concurrently described the influence of MTNR1B on quantitative glycemic traits and further characterized the functional significance of one of these variants, rs10830963.21-23 MTNR1B encodes the melatonin receptor MT2, whose endogenous ligand melatonin is a neurohormone that mediates circadian rhythmicity and appears to influence insulin secretion and glucose levels.32-34 The largest of the three studies was conducted by the Meta-Analysis of Glucose and Insulin-related traits Consortium (MAGIC). MAGIC represents an international collaborative effort to combine data from multiple GWAS to identify additional loci that affect glycemic and metabolic traits. This initial study focused on the top association signals from four individual consortia meta-analyses, with each of these consortia providing mutual replication of the top SNPs. This meta-analysis of 10 cohorts, comprising 40,735 individuals, described the association of rs10830963 at MTNR1B with fasting glucose (P=3.2×10−50) and T2D (P=3.3×10−7) and confirmed the previously known associations of GCK and G6PC2 with fasting glucose.21,29-31 In an independent French cohort, a second variant near MTNR1B (rs1387153) was associated with T2D, fasting glucose, β-cell function by homeostasis model assessment (HOMA-B26), and glycated hemoglobin (HbA1C).22 Lyssenko et al. further characterized the physiological ramifications of carrying the risk allele in rs10830963. Carriers of the G allele showed decreased insulin secretion, increasing fasting glucose levels and a higher risk of developing T2D in an average 23.5 year follow-up period, as well as a higher proinsulin-to-insulin ratio.23

In follow-up to these initial investigations, a large-scale meta-analysis of all genome-wide data for continuous fasting and 2-hour post-glucose load traits in non-diabetic participants was performed.35,36 Twenty-one GWAS of European descent including 46,186 participants analyzed for ~2.5 million genotyped or imputed SNPs coalesced around MAGIC in order to examine fasting glucose, fasting insulin, and fasting indices of β-cell function and insulin resistance (HOMA-B and HOMA-IR respectively).35 Lead SNPs were replicated among 76,558 additional individuals from 34 additional cohorts (7 of which had undergone GWAS genotyping and contributed in silico replication). The joint analysis for fasting glucose uncovered nine new loci including SNPs in or near ADCY5, MADD, CRY2, ADRA2A, FADS1, PROX1, SLC2A2, GLIS3, and C2CD4B and one SNP upstream from IGF1 in association with fasting insulin and HOMA-IR. The meta-analysis also confirmed prior associations for glycemic traits with SNPs in or near DGKB-TMEM195, GCKR, G6PC2, MTNR1B, and GCK. The 14 fasting glucose-associated loci explained a substantial proportion (10%) of the inherited variation in fasting glucose in the Framingham cohort. Additionally, an aggregate genotype score, constructed by summing the risk variants, showed a difference in fasting glucose of about 7.2 mg/dl between groups with the highest and lowest scores.35 Contrary to straightforward expectations, only five of the novel fasting glucose-associated SNPs were associated with T2D, which underscores that the mechanism by which glucose is raised, rather than elevated glucose alone, may have contributed to the progression to T2D. The three loci with the largest effect sizes on fasting glucose, G6PC2, MTNR1B, and GCK, were found to have a significant association with HbA1C at genome-wide significance thresholds;37 interestingly, G6PC2 does not increase risk of T2D.
In terms of insulin resistance measures, IGF1 (rs35767) and GCKR (rs780094) were significantly associated with HOMA-IR and fasting insulin.35
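An aggregate genotype score of the kind described above can be sketched as a simple count of risk alleles across loci. The example below assumes an unweighted score in which each locus contributes 0, 1 or 2 risk alleles; the function name and the genotypes shown are hypothetical, and whether the published score applied any weighting is not specified here.

```python
def genotype_score(risk_allele_counts):
    """Unweighted genotype score: total number of risk alleles across loci.

    risk_allele_counts maps a locus name to the number of risk alleles
    (0, 1 or 2) carried by one individual at that locus.
    """
    for locus, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{locus}: expected 0, 1 or 2 risk alleles, got {count}")
    return sum(risk_allele_counts.values())


# Hypothetical individual typed at three fasting glucose-associated loci
person = {"G6PC2": 2, "MTNR1B": 1, "GCK": 0}
score = genotype_score(person)
```

Individuals can then be grouped by score, and the mean trait value compared between the highest and lowest groups.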

In order to examine glucose and insulin levels 2 hours after an oral glucose tolerance test (OGTT), nine GWAS (n=15,234) were compiled with replication in 17 studies (n=30,620).36 This study found an association of SNP rs10423928 in GIPR (encoding the gastric inhibitory peptide receptor) with glucose levels 2 hours after an oral glucose challenge.36 Gastric inhibitory peptide is an incretin hormone released from the intestine in response to oral glucose, and it stimulates insulin secretion by the pancreatic β cell.38 GIP response is decreased in T2D.39 In further analyses supporting the known role of this pathway in glucose regulation, this intronic SNP was associated with a decreased early-phase insulin secretion, decreased insulin response, and lower insulin levels 2 hours after an oral, but not intravenous glucose load.

Genetic variants and physiological parameters

Assessing more detailed physiological parameters influenced by newly discovered genetic variants lends credibility to the original associations, besides offering a better characterization of the mechanism by which these variants might modulate glucose and insulin levels and cause T2D. Consequently, the MAGIC investigators systematically examined the influence of the genetic variants newly associated with fasting glucose, fasting insulin and/or 2-hour glucose with measures that define insulin processing, insulin secretion, and insulin sensitivity.40 With these measures, loci were categorized into five groups based on their hypothesized mechanism of action: 1) abnormal insulin processing, 2) higher proinsulin and lower insulin secretion, 3) abnormalities in early insulin secretion, 4) reduced insulin sensitivity, 5) no obvious effect on insulin processing, secretion or sensitivity. MADD (encoding a death domain-containing adaptor protein that propagates apoptotic signals) was strongly associated with proinsulin levels, but not other parameters. This indicates that the locus is implicated in isolated insulin processing only. TCF7L2, SLC30A8, GIPR and C2CD4B were associated with higher proinsulin and lower insulin secretion. The findings for TCF7L2 and SLC30A8 confirmed previous results, which appear biologically plausible given the proposed function of these genes. For instance, genetic variants in TCF7L2 are thought to adversely affect β-cell responsiveness to incretins and insulin granule exocytosis, which would impair insulin processing and decrease insulin secretion. In turn, variants at MTNR1B, FADS1, DGKB, and GCK were only associated with a lower insulinogenic index (which measures the initial phase of glucose-stimulated insulin release, by dividing the change in insulin over the first 30 minutes of an OGTT by the change in glucose over the same time frame), suggesting that variants in or near these genes impair early insulin secretion. 
For instance, FADS1 encodes a key enzyme in metabolism of unsaturated fatty acids, and insulin secretion differs in response to meals with varying fatty acid composition.41 The lower insulinogenic index caused by genetic variation in FADS1 indicates that variants in this gene may alter an individual’s β-cell response to meals composed of certain fatty acid content. SNPs at GCKR and IGF1 were found to influence insulin sensitivity by multiple methods whose calculation includes measurements beyond fasting glucose and insulin. A functional variant at GCKR (encoding the glucokinase regulatory protein) may act by inhibiting glucokinase in the liver, leading to increased hepatic glucose production.42 Glucose lowering by IGF1 occurs through binding of insulin receptors and stimulation of glucose transport into fat and muscle, which lowers glucose and suppresses insulin.43 Unlike GCKR, the influence of the index SNP near IGF1 on glucose homeostasis is not yet understood. This study confirmed that these variants influence insulin resistance measures that go beyond the fasting state, possibly implicating extra-hepatic tissues in the process.
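The insulinogenic index described above is a simple ratio, and can be sketched as follows. The function name, argument names and example values are hypothetical; the source does not state measurement units, so the numbers are purely illustrative and only show the arithmetic.

```python
def insulinogenic_index(insulin_0, insulin_30, glucose_0, glucose_30):
    """Change in insulin over the first 30 minutes of an OGTT divided by
    the change in glucose over the same interval."""
    delta_glucose = glucose_30 - glucose_0
    if delta_glucose == 0:
        raise ValueError("glucose did not change; the index is undefined")
    return (insulin_30 - insulin_0) / delta_glucose


# Hypothetical OGTT: insulin rises from 10 to 60, glucose from 90 to 140
index = insulinogenic_index(insulin_0=10, insulin_30=60, glucose_0=90, glucose_30=140)
```

A lower value indicates a smaller early insulin response for a given rise in glucose.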

What have we learned from GWAS?

In a short span of five years, GWAS investigating the links between genetics and complex traits have transformed our knowledge. As we strive to decipher how to navigate these new roads, it may be instructive to look back at what we have learned to know how to journey forward.

They illustrate novel pathways

GWAS have implicated novel pathways in the development of diabetes in humans. One of the earliest and most illustrative examples was the discovery that the variant rs13266634, which encodes an R→W change at position 325 in the β-cell zinc transporter ZnT-8 (encoded by SLC30A8), was strongly associated with T2D (OR 1.26 for the major C allele, P=5.0×10−7) in the first published GWAS for T2D.13 This association was confirmed by the UK WTCCC (P=10−3),15 FUSION (P=10−5),16 and DGI (P=5.3×10−8).17

SLC30A8 is expressed almost exclusively in pancreatic islets with low levels in the cortex and thyroid.44 Its protein product, ZnT-8, is a zinc transporter localized in secretion vesicle membranes that transports zinc from the cytoplasm into insulin secretory vesicles.45 Insulin is stored as a hexamer bound to two zinc ions, and ZnT-8 provides zinc to allow for insulin storage and secretion.46 ZnT-8 thus appears to be a critical component of the final biosynthetic pathway of insulin production and secretion. Interestingly, overexpression of SLC30A8 in insulinoma cells increases glucose-stimulated insulin secretion.47 Other in vitro studies demonstrate that inflammatory cytokines noted to be elevated in T2D downregulate ZnT-8 in β cells and alter β-cell function.20,48 The development of a Slc30a8 null mouse shortly after the GWAS publications further defined how these genetic variants may affect glycemic parameters in vivo, although with varying results depending on the mouse model.49,50 In the mouse models, the size and number of islets as well as insulin sensitivity were comparable between knock-out (KO) mice and wild-type littermates, though the KO mice showed decreased islet zinc staining. Some of the KO models demonstrated impaired glucose tolerance and fasting glucose depending on age and gender, and one of the mouse models also showed decreased plasma insulin and decreased glucose-stimulated insulin secretion. Thus, though the role of this gene in insulin packaging and storage was known from prior work, GWAS and the experiments spawned by them revealed that coding variation in SLC30A8 may increase the risk of developing T2D by causing an insulin secretory defect.

New findings point to fundamental biology

While one can easily hypothesize about the biological consequences of missense SNPs, a potential molecular mechanism is not readily apparent for most other associated SNPs, which lie in intronic or intergenic DNA segments. Therefore, GWAS have reinforced the fundamental quest to characterize the biological relevance of non-coding genomic regions.

One pertinent example concerns the intronic SNP in the gene TCF7L2, rs7903146. This SNP has the strongest effect size on T2D demonstrated for a common variant thus far (odds ratio ~1.40 per copy of the risk allele).51,52 Using a new method called FAIRE-seq (formaldehyde-assisted isolation of regulatory elements coupled with high-throughput sequencing), Gaulton et al. identified open chromatin sites in human pancreatic islets.53 Open chromatin sites are evolutionarily conserved and can be used to tag regions that are bound by regulatory factors.54 Of 350 SNPs in strong LD with variants associated with T2D or fasting glycemia, 38 SNPs from ten loci (TCF7L2, CDKAL1, CDKN2A/CDKN2B, IGF2BP2, CDC123/CAMK1D, THADA, FTO, SLC30A8, HNF1B, and G6PC2) were associated with an open chromatin region in islets. Upon further examination of TCF7L2, SNP rs7903146 was localized to an islet-selective open chromatin site; more importantly, the high-risk T allele is associated with a “more open” chromatin state in human islets and greater enhancer activity, as revealed by allele-specific luciferase reporter assays in two β-cell lines.53 Given that the T allele associates with an increased risk of T2D, the more open chromatin state may imply increased transcription of TCF7L2, and thus confirms prior observations that the T allele is correlated with a 5-fold increase in TCF7L2 transcripts in human islets from donors with diabetes when compared to controls.55 By pairing fundamental biology, such as open chromatin regions indicative of regulatory binding sites, with current genetic association datasets, this study demonstrated a mechanism by which an intronic polymorphism may contribute to the development of a human disease.

Genetic discoveries support prior epidemiological observations

The GWAS method has not only led us to discover new or confirm known biology, but has also underscored prior epidemiological observations by providing potential genetic links between lipid dysregulation and glycemia (FADS1, GCKR, HNF1A), circadian rhythmicity and metabolic derangements (MTNR1B, CRY2), and low birth weight with subsequent T2D risk (ADCY5). A potential connection between genetic determinants of T2D that simultaneously lower cancer risk (HNF1B, JAZF1) is discussed at length elsewhere.56

Fatty acid desaturases (FADS) convert polyunsaturated fatty acids into cell signaling metabolites that, in turn, can affect circulating lipid levels.57 Different GWAS have associated FADS1 with fasting glucose and HOMA-B (rs174550, P=2×10−15 and 5×10−13 respectively), HDL cholesterol and triglyceride levels (rs174547, P=2×10−12 and 2×10−14 respectively), and LDL cholesterol levels (rs174546, P=1×10−7).35,57,58 All of these SNPs are in complete LD (r2=1) and therefore serve as perfect proxies for each other. In a large meta-analysis of glycemic traits the T allele at rs174550 was associated with higher fasting glucose, lower HOMA-B, increased HDL and LDL cholesterol, and decreased triglyceride levels.35 Classically, T2D and impaired fasting glucose have been clustered with dyslipidemia in the metabolic syndrome; the discovery of a genetic variant that is associated with both measures provides support for this clinical correlation. More recently FADS1 was found to be associated with a statin-induced response in triglyceride and HDL cholesterol levels (P=2.6×10−6 and 6.8×10−6, respectively).59 Given that unsaturated fatty acids influence insulin secretion, this genetic variant can tightly link hyperglycemia and dyslipidemia.41 Although the same SNP appears to influence various metabolic parameters, each of these measures could actually be influenced by different SNPs in high LD with the index SNP; elucidation of molecular mechanism requires functional experiments.

HNF1A (encoding hepatocyte nuclear factor-1α) illustrates a locus that contributes to the inheritance of both a rare Mendelian disorder and common polygenic disease, while also having pleiotropic effects. Classically, MODY3 is caused by a mutation in the HNF1A gene, which impairs the dimerization of this transcription factor and thus promotes metabolic dysregulation resulting in diabetes mellitus.60 Like FADS1, SNPs in the HNF1A region have been associated with multiple related traits including T2D (P=2×10−8), C-reactive protein (P=2×10−8 to 1×10−30), coronary artery disease (P=5×10−7), and LDL cholesterol (P=2×10−8).57,61-65 In terms of lipid metabolism and inflammation, HNF1A regulates numerous genes involving lipoprotein metabolism and inflammatory markers in the liver.66 These genetic variants create another molecular link between related metabolic derangements.

MTNR1B is an equally compelling story that helps provide a genetic link between circadian rhythmicity and glucose metabolism. As discussed above, MTNR1B variants have been associated with glycemic quantitative traits including T2D, fasting glucose, HOMA-B, and HbA1C.21-23 A second circadian gene, CRY2, has also been implicated in regulating fasting glucose in non-diabetic individuals.35 Epidemiological and physiological studies in humans, as well as experimental work in animals, have established a clear relationship between alterations of the circadian system and metabolic derangements. Homozygous mutant Clock mice, which are deficient in a key circadian rhythm transcription factor, were noted to have a variation in nighttime activity, in addition to metabolic disturbances including hyperphagia, obesity, hyperlipidemia, hepatic steatosis, hyperglycemia and hypoinsulinemia.67 Furthermore, wild-type islets appear to possess a self-sustained oscillation of a circadian gene (Per2) and circadian transcription factors (CLOCK and BMAL1) that is not seen in the islets of Clock null mice. In vivo studies in Clock mutant mice and in vitro studies in islets from whole-body Clock and Bmal1 mutant mice, as well as islets from pancreas-specific Bmal1 mutant mice, demonstrated impairment in insulin secretion.68 In order to explain the higher glucose intolerance noted in the morning versus night, Boden et al. studied normal human volunteers undergoing a hyperglycemic clamp titrated to various levels of glucose control for 24 hours. By measuring insulin secretion, the investigators demonstrated that, particularly at higher glucose levels, insulin is secreted in a circadian rhythm.32 In addition, people with T2D demonstrate a decreased diurnal serum melatonin level.34 The link between genetic variants in MTNR1B and CRY2 with T2D and glycemic measures lends further credence to the relationships observed in these animal and human studies.

Finally, genetic variants in ADCY5 (encoding adenylate cyclase type 5) have been robustly associated with fasting glucose, 2-hour glucose post-OGTT and T2D.35,36 In recent work confirmed by an independent group, the same locus has been associated with low birth weight.69,70 This finding provides another molecular connection between the observed predictive power of reduced size at birth with regard to future metabolic derangements, including risk of T2D.71

Most loci point to the β-cell

It is now clear that the risk of developing T2D is a combination of genetic risk for β-cell dysfunction superimposed on genetic and environmental factors (e.g. obesity, Western diet, sedentary lifestyle) that promote insulin resistance. Recent genetic discoveries have identified numerous variants that appear to influence insulin secretion rather than insulin resistance. For example, risk variants of CDKAL1 were associated with insulin secretion defects and impaired insulin response to both oral and IV glucose tolerance tests (IVGTT).14,72,73 Variants in SLC30A8 were associated with insulin secretion defects following OGTT.14,74 HHEX variants were linked with impaired β-cell function in response to OGTT and hyperinsulinemic-euglycemic clamp.72,74,75 Finally, CDKN2A/B variants have been associated with impaired glucose-induced insulin release in healthy subjects.75 Of the ten novel variants discovered in association with fasting glycemic traits in a large meta-analysis, nine were associated with fasting glucose/HOMA-B and only one was associated with fasting insulin/HOMA-IR, measures of insulin resistance.35

There are several potential reasons for the dearth of common genetic variants related to insulin resistance found by the GWAS approach. These include study design, different heritability estimates for each of the traits, and the allelic frequency spectrum of these variants. One of the five initial high-density GWAS selected diabetic cases with BMI<30 kg/m2, thus excluding obese individuals;13 another matched cases and controls for BMI.17 These deliberate study designs were chosen to maximize the likelihood of detecting variants that increase T2D risk directly, rather than through the mediation of adiposity. Because loci that achieve genome-wide significance result from the meta-analyses of various scans, some loci that truly cause T2D via insulin resistance may show a trend toward association that fails to reach a threshold that marks them for follow-up. In this way, the studies may not identify loci associated with insulin resistance related to adiposity, but rather, detect loci related only to β-cell function. This issue has become less of a concern as more recent meta-analyses incorporate larger numbers of discovery cohorts that are not ascertained by BMI.

Investigators may bypass this limitation by performing GWAS for insulin resistance as a quantitative trait, rather than basing the discovery of insulin resistance genes in scans for T2D as a categorical trait. One measure that is readily available in large population cohorts is HOMA-IR. However, a meta-analysis of GWAS cohorts in which this phenotype was available identified many more loci influencing β-cell function than those regulating HOMA-IR, suggesting that other reasons exist for this observed discrepancy.
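
As a point of reference, both fasting indices discussed here can be computed from a single fasting sample. The sketch below uses the original closed-form HOMA approximations (glucose in mmol/L, insulin in µU/mL). Whether the cohorts in the cited meta-analysis used these equations or a later computer-model version of HOMA is not stated here, so the function names and example values are illustrative assumptions only.

```python
def homa_ir(fasting_glucose_mmol_l, fasting_insulin_uu_ml):
    """Original HOMA approximation of insulin resistance (unitless index)."""
    return fasting_glucose_mmol_l * fasting_insulin_uu_ml / 22.5


def homa_b(fasting_glucose_mmol_l, fasting_insulin_uu_ml):
    """Original HOMA approximation of beta-cell function (percent)."""
    # The closed-form equation breaks down at or below 3.5 mmol/L glucose.
    if fasting_glucose_mmol_l <= 3.5:
        raise ValueError("HOMA-B is undefined for glucose at or below 3.5 mmol/L")
    return 20.0 * fasting_insulin_uu_ml / (fasting_glucose_mmol_l - 3.5)


# Hypothetical individual: glucose 5.0 mmol/L, insulin 10 µU/mL.
example_ir = homa_ir(5.0, 10.0)
example_b = homa_b(5.0, 10.0)
```

For this hypothetical individual, HOMA-IR is about 2.2 and HOMA-B about 133%. Because both indices are built from the same two measurements, differences in their GWAS yield cannot be attributed to one trait being measured in more people than the other.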

The degree to which each of the two traits is heritable may also contribute to the difficulties in identifying insulin resistance genes. The insulinogenic index is, on average, 10% more heritable than HOMA-IR.76 While HOMA-IR still displays substantial heritability (0.44 among the Framingham Offspring population77), there may be better insulin resistance phenotypes to examine in order to identify genetic associations. For example, in the Insulin Resistance and Atherosclerosis Study Family Study, the insulin sensitivity index (determined from IVGTT) was twice as heritable as HOMA-IR.78 This suggests that the insulin sensitivity index may be a more robust trait to examine when assessing the genetic contribution of insulin resistance; this must be balanced against the difficulty of performing the necessary physiology experiments in a sufficiently large number of individuals.

The distribution of P values in the similarly powered GWAS for HOMA-B and HOMA-IR (both derived from fasting glucose and insulin) suggests a different genetic architecture for the two traits.35 In those meta-analyses, there were many more P values that deviated from the null expectation at the top of the HOMA-B distribution than there were for HOMA-IR (Figure 3). For instance, insulin resistance variants may be fewer in number, rarer in frequency, or have a lower effect size. In addition, there may be a stronger impact of environmental interactions (e.g. diet, physical activity) with the genetic background. Exploring the lower range of the allele frequency spectrum (which requires denser genotyping arrays), integrating environmental factors in the analyses, and increasing the sample size could potentially improve the power to detect genetic variants related to insulin resistance.

Figure 3. Quantile-quantile plots for HOMA-B and HOMA-IR in a large meta-analysis.
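
The comparison of observed P values against the null expectation that underlies a quantile-quantile plot can be sketched as follows. This is a minimal illustration, not the pipeline used in the cited meta-analysis: the plotting position `(rank - 0.5) / n`, the function names, and the one-log-unit `margin` used to flag departures are all assumptions chosen for clarity.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, most significant first."""
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("P values must lie in (0, 1]")
    observed = sorted(p_values)  # smallest P value first
    n = len(observed)
    points = []
    for rank, p in enumerate(observed, start=1):
        # Under the null, P values are uniform; this is an assumed plotting position.
        expected_p = (rank - 0.5) / n
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points


def count_excess(points, margin=1.0):
    """Count points whose observed value exceeds expectation by more than margin."""
    return sum(1 for expected, observed in points if observed - expected > margin)


# Made-up P values: one strong signal among otherwise null-like results.
toy_points = qq_points([0.05, 0.5, 0.95, 1e-8])
toy_excess = count_excess(toy_points)
```

In a real scan, a trait with many true associations shows a long tail of points above the diagonal, whereas a trait with few detectable associations tracks the diagonal almost to the end.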

Nevertheless, several convincing associations with surrogate measures of insulin resistance (HOMA-IR and fasting insulin) have been identified, in part due to larger sample sizes and a novel study design. IRS1 is a highly attractive candidate gene for the development of insulin resistance. Rung et al. demonstrated that SNP rs2943641, 502 kb upstream of IRS1, is associated with T2D, fasting insulin, and HOMA-IR. In further functional analyses, the risk allele was associated with reduced levels of IRS1 protein expression and decreased downstream effects noted by phosphatidylinositol-3-OH kinase (PI(3)K) activity (which is activated by IRS1 binding to IGF-1 receptors) in human skeletal muscle biopsies. Such functional experiments were critical in establishing that this variant, located at a non-trivial distance from IRS1, likely acts through IRS1 itself. These findings confirm what is already known biologically about the action of IRS1 in mice: null Irs1 mice lack PI(3)K activity and demonstrate impaired glucose tolerance and insulin resistance.79

The most recent meta-analysis for fasting glycemic traits uncovered the variant rs35767, 1.2 kb upstream of IGF1, as associated with fasting insulin and HOMA-IR.35 IGF1 is another excellent biological candidate for insulin resistance: it encodes the insulin-like growth factor-1, which has been shown to bind insulin receptors and enhance glucose transport in adipose and muscle tissue, while inhibiting hepatic glucose production. Both a null IGF1 mouse and a human with an IGF1 gene deletion exhibit insulin resistance that improves with IGF1 therapy.43 Therefore, it seems intuitive that genetic variation in IGF1 may perturb the expression or function of IGF1, and thereby increase insulin resistance.

In the same large meta-analysis, GCKR was associated with insulin resistance at genome-wide significance levels. Originally, the T allele of rs780094, an intronic SNP in the GCKR gene, was found to be associated with higher triglyceride levels (P=3.7×10−8), and carriers of the T allele were noted to have a trend toward lower glucose, decreased insulin resistance (HOMA-IR), and decreased T2D risk.15 Two follow-up studies confirmed these results.80,81 The directions of these trends (higher triglyceride levels but lower fasting glucose and lower insulin resistance) may seem counterintuitive in light of what we have seen in human epidemiologic studies. This may be explained by the inherent biological mechanism of GCKR, where the functional variant may downregulate gluconeogenesis, increase VLDL-triglyceride synthesis, and also upregulate glucose utilization.42,81

It is clear that, as far as the genetic architecture of insulin resistance is concerned, we have only visualized the tip of the iceberg. A potentially large proportion of the genetic foundation of T2D remains hidden owing to our difficulties in discovering insulin resistance-related genes. With well-powered samples, more refined phenotyping, assessment of gene-environment interactions, a wider range of BMI, and novel study designs, this genetic information may soon rise to the surface.

Genetic variants identified only explain ~10% of T2D heritability

T2D GWAS have been successful in identifying specific loci that contribute to the causation of the complex disease, but only roughly 10% of the heritability can be accounted for by these variants, suggesting that much remains to be discovered. In the search for the “missing heritability”, the accuracy of the original heritability estimates must first be considered. The familial component caused by a shared early uterine and post-natal environment and latent epigenetic changes (inherited changes in gene expression that are not caused by genetic sequence) may produce a contribution to heritability that is not removed sufficiently by these estimates.82 In addition, GWAS use commercially available genotyping platforms whose early-generation arrays fail to adequately capture nearly 20% of common SNPs, structural variants (such as copy number variants), and variants unique to non-European populations.83 Improving upon these limitations of first-generation genotyping platforms will undoubtedly uncover other loci contributing to the disease. In addition, since a common SNP represents a large segment of genetic material, closer scrutiny of these areas by fine mapping for rare variants is required to find the potential “causal” gene that could have a larger effect than that conferred by the index variant. Through purifying natural selection, high-penetrance deleterious variants should be selectively removed from the population, so we would not expect to find common variants contributing large effect sizes to common disease, but may find large effect sizes in less common variants. Less costly high-throughput whole-genome sequencing methods have been developed to discover rarer variants in the human genome. The 1,000 Genomes Project (www.1000genomes.org) is an international initiative designed to catalog all variants with minor allele frequencies greater than 1% in at least 1,000 genomes, thus pushing the envelope of captured shared variation. Initial pilot analyses have successfully identified more than 9 million new polymorphisms, many insertions/deletions, and some large structural variants.84

The Metabochip is another effort to develop a single custom-made chip encompassing 200,000 SNPs culled from deeper layers of the P-value distribution in GWAS for cardiovascular disease, obesity, T2D and related traits. As the Metabochip is deployed across thousands of patients, new and rare variants whose P values are near the top but below threshold will be examined in association with these phenotypes. Researchers are also incorporating novel bioinformatics to examine non-additive gene × gene interactions (commonly referred to as epistasis). For example, two variants may individually affect genetic risk of T2D only mildly, but together they could increase the risk significantly. In addition, examining gene × environment interactions might lead to insights into biological pathways and guide researchers to novel genes that act synergistically with environmental factors, such as physical activity or dietary composition.85 If successful, this strategy may explain why some genetic associations do not replicate across all populations, hinting at different environmental exposures. Lastly, by incorporating the prior probabilities afforded by the mechanistic knowledge garnered through functional experiments in cellular or animal models, investigators may rescue variants that did not make the list of associated loci on purely statistical grounds.
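
The notion of a non-additive gene × gene interaction can be made concrete with a small logistic-model sketch. Everything here is hypothetical: the coefficients are invented for illustration, carrier status is coded 0/1 at each of two unnamed loci, and "non-additive" is taken to mean a departure from additivity on the log-odds scale, which is one common convention rather than the only one.

```python
import math


def log_odds(g1, g2, b0=-2.0, b1=0.10, b2=0.10, b12=0.50):
    """Log-odds of disease given carrier status (0 or 1) at two loci.

    b1 and b2 are main effects; b12 is the interaction (epistasis) term.
    All default coefficients are made-up values for illustration.
    """
    return b0 + b1 * g1 + b2 * g2 + b12 * g1 * g2


def odds_ratio(g1, g2, **coefficients):
    """Odds ratio relative to an individual carrying neither variant."""
    reference = log_odds(0, 0, **coefficients)
    return math.exp(log_odds(g1, g2, **coefficients) - reference)


single_carrier = odds_ratio(1, 0)              # mild effect of one variant alone
double_carrier = odds_ratio(1, 1)              # both variants, with interaction
double_no_epistasis = odds_ratio(1, 1, b12=0)  # both variants, purely additive
```

With these invented coefficients each variant alone gives an odds ratio near 1.1, the purely additive model gives about 1.2 for carriers of both, and the interaction term raises that to about 2.0. A single-SNP scan tests only the marginal effect of each variant, which is why such pairs can be missed.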

Common genetic variants are not yet useful for clinical prediction

In practical terms, genetic information has been expected to enable clinicians to predict an individual’s risk of developing disease. However, thus far the clinical usefulness of genetic information has been limited. Meigs et al. combined 18 loci associated with T2D into a genotype score, tested its association with T2D incidence in the Framingham Offspring Study, and generated C statistics to assess how well it predicts an individual’s future risk of diabetes.86 While the genotype score predicted diabetes better than gender alone, when adjusted for other easily obtainable clinical parameters (age, sex, family history, BMI, fasting glucose level, systolic blood pressure, HDL cholesterol, and triglyceride levels), predictive power did not improve significantly. A similar study in a larger Swedish cohort examined 16 SNPs associated with T2D and generated largely comparable results.87 In subset analyses, both studies highlighted that younger patients may benefit more from genetic testing before they manifest clinical characteristics of the disease. An updated age-stratified analysis re-examined the genotype score including 40 SNPs associated with T2D in Framingham. The researchers noted that the new genotype score improved the ability to predict the onset of T2D in participants younger than 50 years. Diabetes risk prediction using the genetic variants still needs to be confirmed by other studies, but appears potentially promising in a younger population when medical and lifestyle intervention may be the most feasible and beneficial.88
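
To illustrate what a genotype score and a C statistic measure, the sketch below counts risk alleles across loci and computes the probability that a randomly chosen case outscores a randomly chosen control. It rests on several assumptions: the score is an unweighted allele count (the cited studies may have weighted alleles by effect size), the C statistic is computed from the score alone rather than from a regression model with clinical covariates, and the genotypes are made-up toy data.

```python
def genotype_score(risk_allele_counts):
    """Unweighted score: total risk alleles (0, 1, or 2 per locus) across loci."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each locus must carry 0, 1, or 2 risk alleles")
    return sum(risk_allele_counts)


def c_statistic(case_scores, control_scores):
    """Fraction of case-control pairs in which the case scores higher (ties = 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Toy genotypes at three hypothetical loci (not real study data).
cases = [genotype_score(g) for g in ([2, 1, 1], [1, 1, 0])]
controls = [genotype_score(g) for g in ([1, 0, 0], [1, 1, 0], [0, 0, 0])]
toy_c = c_statistic(cases, controls)
```

A C statistic of 0.5 means the score discriminates no better than chance and 1.0 means perfect separation; the toy data give about 0.92 only because they were constructed to separate well. The clinically relevant question raised in the text is how much the C statistic increases when the score is added to a model that already contains routine clinical variables.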

An alternative clinical use of genetic information is determining whether genetic information may affect a person’s behavior and thus modify the risk of developing diabetes. The Chinese Da Qing Diabetes Prevention Study, the Finnish Diabetes Prevention Study, the US Diabetes Prevention Program (DPP) and the Indian Diabetes Prevention Program showed that behavioral modification can reduce one’s risk of progressing to diabetes.89-91 (Ramachandran, 2006) For instance, the DPP demonstrated that lifestyle modification led to a 58% reduction in the incidence of diabetes compared to placebo.91 In further genetic investigation in the DPP, lifestyle modification attenuated the risk of diabetes conferred by the risk variant at TCF7L2: while the placebo participants who carried the homozygous risk genotype at rs7903146 (TT) had 80% higher risk of diabetes development, those randomized to the lifestyle intervention group had no increased risk despite carrying the risk genotype, indicating that lifestyle intervention may trump the genetic risk conferred by this variant.92 This argues that if genetic information could motivate a high-risk individual to change his/her behavior, it may help prevent the eventual onset of disease. However, whether reassuring genetic information may be counterproductive in de-motivating those at lower genetic risk from engaging in healthy behaviors should be empirically tested.

The concept of disease-risk information changing one’s behavior was examined among 150 non-diabetic primary care patients.93 Those with an accurate perception of their higher risk of diabetes (as documented by the Framingham Heart Study diabetes risk score) did not intend to modify their lifestyle any more than patients with low perceived risk. Whether genetic information (rather than perceived or measured risk factors) will have greater influence is being tested. Grant et al. examined how genetic information might influence an individual’s motivation for lifestyle change.94 After questioning patients regarding how they would respond to information containing their genetic risk of T2D development, 71% of those surveyed indicated that such testing would motivate them to adopt lifestyle change. Conversely, only 1.3% of the patients reported they would be less motivated by genetic information indicating they were at low risk. Therefore, despite the limited ability of genetic testing to predict T2D when adjusted for common risk factors, genetic information may be more powerful in persuading a patient to change his or her behavior. Further studies, including more surveyed individuals and prospective studies, are needed to determine if the person’s behavior would in fact change.

The underlying biological mechanisms which GWAS have unveiled could be targets for new pharmacotherapy. For example, prior biological knowledge regarding SLC30A8 and new genetic evidence have drawn our attention to a possible therapeutic target. A pharmacological agent that enhances the function of this transporter could hypothetically increase insulin secretion and improve β-cell function. Given the novelty of the GWAS findings, there has yet to be a therapeutic that has specifically evolved from these discoveries, but it seems that there is a wide breadth of possibilities on the horizon.

Studies to examine the pharmacogenetics of T2D have been encouraged by findings of variable pharmacotherapy response among individuals with monogenic forms of diabetes. One of the most impressive stories lies in the discovery of activating mutations in KCNJ11 (encoding the Kir6.2 subunit of the ATP-sensitive potassium channel), which cause the channels to remain open in the presence of glucose, thereby reducing insulin secretion and giving rise to permanent neonatal diabetes (PNDM).95 Infants carrying these mutations present with marked hyperglycemia or diabetic ketoacidosis at less than six months of age. Three individuals harboring mutations in KCNJ11 were successfully transitioned from insulin to a sulfonylurea with an improvement in glycemic control compared to insulin therapy.96 These findings were later confirmed by larger study populations of patients with PNDM.95,97 Similarly, the enhanced efficacy of sulfonylureas in patients with MODY3 was described with a fivefold greater response to sulfonylurea versus metformin therapy.98

Preliminary pharmacogenetic studies in polygenic diabetes have been initiated. These studies have primarily involved PPARG, KCNJ11, and TCF7L2, whose variants have been associated with T2D at genome-wide levels of significance (as previously mentioned, the first two encode known targets of antidiabetic medications).

Given its therapeutic relevance in PNDM and MODY3, the influence of KCNJ11 variants has been examined in response to sulfonylurea treatment in T2D. Sesti et al. conducted a prospective trial of 524 European T2D patients who were treated with a progressive antidiabetic medication regimen which escalated with each medication failure. They started with glibenclamide, followed by metformin and then insulin. Carriers of the T2D risk K allele had a relative risk of 1.45 compared to E23E homozygotes for sulfonylurea failure.99 Feng et al. found an association of rs757110 (a SNP in the adjacent gene ABCC8, which encodes the sulfonylurea receptor SUR1) with response to sulfonylurea treatment in Chinese patients with T2D, measured by a decrease in mean fasting glucose.100 The missense mutation at rs757110, S1369A, is in high LD (r2=0.87) with the E23K polymorphism, such that they are statistically indistinguishable. In contrast to Sesti et al., in the Chinese study carriers of the risk genotype showed greater responsiveness to gliclazide, which is consistent with the PNDM findings; whether such responsiveness also heralds a greater risk of long-term sulfonylurea failure must be tested in longer pharmacogenetic trials.
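
The r2 statistic quoted above measures how well the genotype at one variant predicts the genotype at another. A minimal sketch of its calculation from haplotype and allele frequencies follows; the function name and the example frequencies are hypothetical and are not the frequencies observed for S1369A and E23K.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B together
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    for frequency in (p_a, p_b):
        if not 0.0 < frequency < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies: alleles that always travel together, then independently.
perfect_ld = ld_r_squared(0.3, 0.3, 0.3)
no_ld = ld_r_squared(0.12, 0.3, 0.4)
```

An r2 of 1 means the two variants are perfect proxies for each other and 0 means they are independent. At a value as high as 0.87, an association signal at one variant is reproduced almost entirely at the other, so statistical evidence alone cannot indicate which of the two is functional.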

PPARG is an attractive genetic target because it is the known drug target of thiazolidinediones, such as troglitazone, rosiglitazone, and pioglitazone.101 Initial studies reporting that genotype at the index variant in PPARG (rs1801282, P12A) or at other nearby variants predicts a better response to troglitazone therapy102,103 have not been substantiated.104-106

Lastly, TCF7L2 has also been under investigation with regard to response to antidiabetes medication, despite its mechanism not being as well understood as that of KCNJ11/ABCC8 or PPARG. The risk variant in this gene confers a defect in insulin secretion, and possibly affects GLP-1 metabolism.107 The Go-DARTS study group has examined the variation in medication response based on TCF7L2 genotype in two settings. Firstly, they noticed an overrepresentation of the T risk allele in people with T2D treated with insulin versus diet therapy alone. In a second study, they demonstrated that carriers of the risk TCF7L2 variants were more likely to fail sulfonylurea therapy than metformin therapy.108,109

Which road do we take from here?

GWAS have led the way toward an exciting journey into understanding the genetic architecture of T2D and related traits. Critical building blocks have been established. With the development of HapMap and 1,000 Genomes, we have the ability to use imputation to combine different genotyping platforms and continue international collaborations involving multiple cohorts. Advancing technology has enabled large-scale genotyping with high-quality data. Innovative statistical analyses, often publicly available, have allowed the field to analyze this growing wealth of data. Strategies to identify causal variants include deploying next-generation sequencing techniques, examining the less common allelic spectrum, exploiting genetic pleiotropy, fine-mapping loci with biological relevance, and applying biological insight to the discovered associations. Ultimately, investigators strive to translate this genetic knowledge to the clinician’s bedside. Above all, the collaborative spirit that has permeated the field may be the most significant achievement catalyzing our enhanced understanding of the genetics underlying this complex disease.

Box 1. What have we learned from GWAS?

They illustrate novel pathways.

The association of a missense mutation rs13266634 in SLC30A8 (encoding Zn2+ Transporter, ZnT-8) with type 2 diabetes has highlighted the importance of Zn2+ transport in the β cell, the variant’s influence on insulin packaging and secretion, and this pathway’s potential relevance as a drug target.

New findings point to fundamental biology.

The intronic SNP rs7903146 in TCF7L2 is located in an open chromatin site in β cells; its risk T allele is correlated with increased transcription in human islets and with increased expression in cellular luciferase assays.

Genetic discoveries support prior epidemiological observations.

The T allele of rs17550 in FADS1 is associated with higher fasting glucose, lower HOMA-B, increased LDL and HDL cholesterol, and decreased triglycerides.

Translating genetic knowledge to the clinician’s bedside with targeted treatment algorithms and risk assessment tools that may influence patient’s behavior based on risk.

Leveraging the welcome collaborative spirit that has permeated the field to catalyze large international studies to enhance our understanding of the genetics of T2D.

Acknowledgements

L.K.B. is supported by NIH training grant 5 T32 DK007028-35. J.C.F. is supported by the Massachusetts General Hospital and a Clinical Scientist Development Award by the Doris Duke Charitable Foundation. We thank Erik Billings for help in adapting Figure 1. We thank the members of the Florez research group and the larger diabetes genetics community for fruitful and engaging discussions.

104. Bluher M, Lubben G, Paschke R. Analysis of the relationship between the Pro12Ala variant in the PPAR-γ2 gene and the response rate to therapy with pioglitazone in patients with type 2 diabetes. Diabetes Care. 2003;26:825–831.

105. Snitker S, et al. Changes in insulin sensitivity in response to troglitazone do not differ between subjects with and without the common, functional Pro12Ala peroxisome proliferator-activated receptor-γ2 gene variant: results from the Troglitazone in Prevention of Diabetes (TRIPOD) study. Diabetes Care. 2004;27:1365–1368.