Abstract

Background: The CpG island methylator phenotype (CIMP) is a major molecular pathway in colorectal cancer. Approximately 25% to 60% of CIMP tumors are microsatellite unstable (MSI-H) due to DNA hypermethylation of the MLH1 gene promoter. Our aim was to determine if the distributions of clinicopathologic factors in CIMP-positive tumors with MLH1 DNA methylation differed from those in CIMP-positive tumors without DNA methylation of MLH1.

Methods: We assessed the associations of age, sex, tumor site, MSI status, BRAF and KRAS mutations, and family colorectal cancer history with MLH1 methylation status in a large population-based sample of CIMP-positive colorectal cancers defined by a 5-marker panel. Unconditional logistic regression was used to estimate the odds of MLH1 methylation by study variables.

Results: Subjects with CIMP-positive tumors without MLH1 methylation were significantly younger, more likely to be male, and more likely to have distal colon or rectal primaries and the MSI-L phenotype. CIMP-positive MLH1-unmethylated tumors were significantly less likely than CIMP-positive MLH1-methylated tumors to harbor a BRAF V600E mutation and significantly more likely to harbor a KRAS mutation. MLH1 methylation was associated with significantly better overall survival (HR, 0.50; 95% confidence interval, 0.31–0.82).

Conclusions: These data suggest that MLH1 methylation in CIMP-positive tumors is not a completely random event and imply that there are environmental or genetic determinants that modify the probability that MLH1 will become methylated during CIMP pathogenesis.

Introduction

There are multiple molecular phenotypes of colorectal cancer (1). The CpG island methylator phenotype (CIMP) is one of these, present in about 15% of colorectal cancers and characterized by widespread aberrant DNA hypermethylation across the genome of cancer cells (2–6). About 25% to 60% of CIMP-positive tumors, depending on the study population, display high levels of microsatellite instability (MSI-H), mainly due to transcriptional silencing of the MLH1 DNA mismatch repair gene by somatic DNA methylation of the MLH1 gene promoter region (4–7). Studies of multiple populations, including this study population (8, 9), have reported that somatic MLH1 DNA methylation is associated with several clinicopathologic variables including older age, female gender, proximal tumor location, and the BRAF V600E mutation (reviewed in ref. 10). These differences have been defined in comparisons of CIMP-positive to CIMP-negative tumors. However, there is much less data on the comparison of MLH1 methylated to MLH1-unmethylated tumors within the CIMP-positive tumor subset. We are aware of only one study comparing CIMP-positive MLH1-unmethylated to CIMP-positive MLH1 methylated tumors and this small study was limited to MSI-H tumors (11).

DNA methylation of the MLH1 promoter region, reported in 22% to 49% of CIMP-positive tumors (6, 11–13), is a key event for determining colorectal cancer phenotype for two reasons. First, the methylation-induced loss of MLH1 protein expression is likely to be the underlying cause for somatically altered MSI-H tumors, given that the methylation is predominantly associated with loss of its protein expression. Second, the methylation of the MLH1 promoter almost always occurs in the context of CIMP (5, 6, 14, 15) and, therefore, may also be a consequence of the same pathogenetic mechanisms that are most commonly associated with CIMP. Given this, a key question is whether or not the MLH1 methylation is a random event unrelated to the presence of one or more risk factors, for example age or sex, in which case future etiologic studies of CIMP can ignore MLH1 methylation status. Alternatively, if there are distinct risk factors within CIMP-positive tumors, these studies will need to consider MLH1 methylation status in the analysis.

In this analysis, we assessed the correlations between MLH1 promoter region DNA methylation and age, sex, tumor site, and MSI status, as well as BRAF and KRAS mutations and family history of colorectal cancer, in a large population-based sample of CIMP-positive colorectal tumors. We reasoned that if the probability that the MLH1 promoter-region DNA becomes methylated during CIMP pathogenesis is due solely to a random process within CIMP pathogenesis, there will be no differences between the MLH1-unmethylated and MLH1-methylated tumors in the genetic or environmental CIMP-associated factors (e.g., age, sex, and tumor location), whereas the different phenotypes (MSS versus MSI-H) will affect pathologic findings (e.g., infiltrating lymphocytes or signet ring cells) in the two tumor subsets. Finally, we assessed overall survival in CIMP-positive tumors by MLH1 methylation status, in general and stratified by the presence of BRAF and KRAS mutations. No tested tumors classified as CIMP-negative were included in the analysis. We did not assess pathologic characteristics of the tumors because it is likely that many of these are consequences rather than causes of the MSI-H phenotype.

Materials and Methods

Subjects

Data for this study were obtained through the Colon Cancer Family Registry (C-CFR), a National Cancer Institute funded registry of colorectal cancer cases, family members, and population-based controls, which utilized standardized methods for data collection and genotyping. Detailed information about the C-CFR can be found elsewhere (16) and at coloncfr.org. Recruitment at individual C-CFR sites has been described previously (16). Participants were recruited from six C-CFR centers: the University of Southern California (USC) Consortium (Arizona, Colorado, New Hampshire, Minnesota, North Carolina, and Los Angeles, California), the University of Hawaii (Honolulu), Fred Hutchinson Cancer Research Center (FHCRC, Seattle, WA), Mayo Clinic (Rochester, MN), Cancer Care Ontario (Toronto, Canada), and University of Melbourne (Victoria, Australia) using population-based ascertainment strategies. All centers except FHCRC oversampled case probands with first-degree relatives reporting colorectal cancer, or colorectal cancer case probands diagnosed under age 50 years to target families with increased colorectal cancer risk. More than 80% of recruited subjects were Caucasian. We used sampling weights that reflected the sampling probability that the case proband was recruited into the registry, accounting for family history, age, and race, relative to the base population. First-degree and some second-degree relatives with colorectal cancer were also recruited from families with multiple colorectal cancer cases. In this study, we included only colorectal cancer cases recruited from 1997 to 2002 (16). Each institution's Institutional Review Board (IRB) approved the study protocol and all subjects signed a written informed consent approved by their IRB. Only subjects who completed the risk factor questionnaire (RFQ) within 5 years of their colorectal cancer diagnosis were included.

Tumor blocks

Primary colorectal cancer formalin-fixed, paraffin-embedded (FFPE) tissue from the Jeremy Jass Memorial Tissue Bank was collected and processed as previously described (9). Briefly, we requested blocks from all population-based case probands recruited in 1997 to 2002 as well as their colorectal cancer–affected first-, second-, and third-degree relatives. This totaled 3,970 specimens, out of which we received 3,732 (94%) sets of slides. Specifically, we received two unstained 5-μm tissue sections embedded in paraffin from each tumor on positively charged “plus” glass slides without coverslips. Slides were microdissected to enrich for tumor cells and DNA was extracted as described (17). Proteinase K was inactivated by heating at 100°C for 10 minutes. Tissues were randomized before being analyzed.

CIMP determination

The method for CIMP analysis is described in detail in Weisenberger and colleagues (9). Briefly, all samples were bisulfite converted using the Zymo EZ-96 DNA Methylation Kit (Zymo Research) as specified by the manufacturer. The DNA methylation levels of individual loci were assessed using MethyLight technology as described (9). CIMP status in each sample was determined using a five-gene MethyLight-based signature (CACNA1G, IGF2, NEUROG1, RUNX3, and SOCS1) described previously (6). All MethyLight CIMP assays were performed using a control reaction specific for Alu repeats as a means of normalizing for input bisulfite DNA amounts. MethyLight data were organized as percent of methylated reference (PMR) value. Genes were considered methylated if the PMR value was ≥10. Tumors with methylation in ≥3 of these five genes were classified as CIMP-positive. Those with two or fewer methylated genes were considered CIMP-negative and excluded from this analysis. Out of the 3,732 samples processed, 46 (1.2%) failed the assay. MLH1 methylation status was determined using the MLH1-M2 MethyLight assay as previously described (6) and classified as methylated for PMR ≥ 10.
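The classification rule described above can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used in the study: the function and constant names are hypothetical, and the PMR values are assumed to have been computed already by the MethyLight assay.

```python
from typing import Mapping

# Hypothetical names; the gene panel and thresholds are those stated in the text.
CIMP_PANEL = ("CACNA1G", "IGF2", "NEUROG1", "RUNX3", "SOCS1")
PMR_CUTOFF = 10.0    # a locus counts as methylated when PMR >= 10
MIN_METHYLATED = 3   # CIMP-positive when >= 3 of the 5 panel genes are methylated


def is_methylated(pmr: float) -> bool:
    """Apply the PMR >= 10 rule to a single locus (also used for MLH1)."""
    return pmr >= PMR_CUTOFF


def classify_cimp(pmr_by_gene: Mapping[str, float]) -> str:
    """Count methylated panel genes and apply the >= 3 of 5 rule."""
    n_methylated = sum(is_methylated(pmr_by_gene[gene]) for gene in CIMP_PANEL)
    return "CIMP-positive" if n_methylated >= MIN_METHYLATED else "CIMP-negative"
```

Under these assumptions, a tumor with PMR values of 10 or more at exactly three panel genes is called CIMP-positive, and MLH1 status is obtained by passing the MLH1-M2 PMR value to `is_methylated`.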

Risk factors

We obtained risk factor data (age, sex, and family colorectal cancer history) from the completed baseline RFQ available at coloncfr.org. Age at the time of enrollment was categorized as a three-category variable: ≤50, 51 to 69, and ≥70 years. Family history of colorectal cancer was self-reported and was considered positive if the case reported colorectal cancer in one or more first-degree family members (e.g., parents, siblings, or children).
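As a small sketch of the variable coding described above (hypothetical function names; age is assumed to be recorded in whole years, and the family history input is assumed to be a count of affected first-degree relatives):

```python
def age_category(age_years: int) -> str:
    """Three-category age variable: <=50, 51 to 69, >=70 years."""
    if age_years <= 50:
        return "<=50"
    if age_years <= 69:
        return "51-69"
    return ">=70"


def family_history_positive(n_affected_first_degree: int) -> bool:
    """Positive when one or more first-degree relatives had colorectal cancer."""
    return n_affected_first_degree >= 1
```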

Tumor site

Tumor site was abstracted from pathology reports and/or state or provincial cancer registries and coded using International Classification of Diseases for Oncology (ICD-O), third edition codes. Tumors were labeled as proximal colon if located in the cecum, ascending colon, hepatic flexure, transverse colon, or splenic flexure. Tumors were labeled as distal colon if located in the descending colon or sigmoid colon, or if they overlapped the colon and rectum. Tumors were labeled as rectal if located in the rectum or rectosigmoid junction.
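The site grouping can be expressed as a simple lookup. The sketch below keys on subsite names rather than ICD-O codes, because the specific codes used are not listed in the text; the key strings and function name are hypothetical.

```python
from typing import Optional

# Subsite-to-group mapping as stated in the text (key spellings are assumptions).
SITE_GROUP = {
    "cecum": "proximal colon",
    "ascending colon": "proximal colon",
    "hepatic flexure": "proximal colon",
    "transverse colon": "proximal colon",
    "splenic flexure": "proximal colon",
    "descending colon": "distal colon",
    "sigmoid colon": "distal colon",
    "overlapping colon and rectum": "distal colon",
    "rectosigmoid junction": "rectal",
    "rectum": "rectal",
}


def tumor_site_group(subsite: str) -> Optional[str]:
    """Return the three-level site group, or None for an unrecognized subsite."""
    return SITE_GROUP.get(subsite.strip().lower())
```

Returning `None` for unrecognized subsites mirrors the exclusion of tumors with missing site data rather than guessing a group.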

KRAS and BRAF mutation testing

DNA from each tumor sample was tested for BRAF and KRAS mutations. The somatic T>A mutation at nucleotide 1799 causing the p.V600E mutation in BRAF was determined using a fluorescent allele-specific PCR assay that amplified a 97 bp product for the mutant allele (A1799) and a 94 bp product for the wild-type allele (T1799), as previously described (18). Positive controls were run in each experiment and 10% of samples were replicated with 100% concordance. KRAS mutation analysis of codons 12 and 13 was performed using direct Sanger sequencing of a 169 bp PCR amplified product as previously described (19). The larger amplicon size for KRAS analysis compared with BRAF V600E contributed to a slightly higher proportion of samples failing to amplify for the KRAS assay compared with BRAF V600E assay.

MSI testing

MSI was tested using DNA from tumor and matched normal tissue as described in ref. 20, using 10 microsatellite loci (BAT25, BAT26, BAT40, BAT34C4, D5S346, D17S250, ACTC, D18S55, D10S197, and MYCL). Samples were classified as MSI-H if >30% of markers showed instability, MSS if no markers showed instability, and MSI-L otherwise. Tumor classification required ≥4 interpretable markers.
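The MSI rule can be sketched as below. One point is an assumption on our part: the text does not state the denominator for the 30% threshold, so this sketch uses the number of interpretable markers. Names are hypothetical.

```python
from typing import Optional, Sequence


def classify_msi(marker_unstable: Sequence[Optional[bool]]) -> Optional[str]:
    """Classify a tumor from per-marker calls.

    Each entry is True (unstable), False (stable), or None (not interpretable).
    Returns None when fewer than 4 markers are interpretable.
    """
    interpretable = [call for call in marker_unstable if call is not None]
    n = len(interpretable)
    if n < 4:
        return None
    n_unstable = sum(interpretable)
    if n_unstable == 0:
        return "MSS"
    # ">30% unstable", written with integers to avoid floating-point edge cases.
    # Assumption: the denominator is the number of interpretable markers.
    if 10 * n_unstable > 3 * n:
        return "MSI-H"
    return "MSI-L"
```

With all 10 markers interpretable, three unstable markers (exactly 30%) give MSI-L and four give MSI-H under this reading.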

MMR gene mutation carriers

Identification of individuals with germline MMR gene mutations was determined for MSI-H tumors as previously described (21, 22). Briefly, all tumors with a missing mismatch repair protein by immunohistochemistry were tested for germline mutations in the corresponding mismatch repair gene. MLH1, MSH2, and MSH6 mutations were identified using Sanger sequencing or denaturing high-performance liquid chromatography (dHPLC), followed by confirmatory DNA sequencing. Large duplication and deletion mutations including those involving EPCAM were detected by multiplex ligation-dependent probe amplification (MLPA) according to the manufacturer's instructions (MRC Holland). PMS2 mutations were identified as previously described (22, 23), where exons 1 to 5, 9, and 11 to 15 were amplified in three long-range PCRs followed by nested exon-specific PCR/sequencing. The remaining exons (6, 7, 8, and 10) were amplified and sequenced directly from genomic DNA. Large-scale deletions in PMS2 were detected using the P008-A1 MLPA Kit according to manufacturer's specifications (MRC Holland). Eight individuals with germline MMR gene mutations were identified in the data set, one of which was CIMP-positive and excluded from the analysis.

Statistical analyses

All analyses included only tumors tested and classified as CIMP. No tested tumors classified as non-CIMP were included. Contingency tables and Chi-square analysis were used to assess the prevalence of patient and tumor characteristics in CIMP-positive tumors by MLH1 methylation status. Unconditional logistic regression was used to compare tumor subsets while mutually controlling for all studied variables except for MSI status, which was too highly associated with MLH1 methylation to estimate an odds ratio. Pearson correlation coefficients were used to assess correlations between the raw PMR values of the CIMP markers. All analyses were weighted based on the inverse of the sampling probability that the case proband was recruited into the registry to ensure the numbers represent the entire population of colorectal cancer cases at each study site. Subjects were included from all C-CFR sites except Hawaii, because their sampling design precluded this type of weighted analysis. Frequencies are based on the weighted number of tumors in each category.
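To illustrate the weighting scheme, the sketch below builds a weighted 2×2 table in which each tumor contributes the inverse of its sampling probability, then computes a crude odds ratio. It is not the mutually adjusted logistic regression fitted in the study; the record layout, function names, and toy sampling probabilities are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[bool, bool, float]  # (exposed, MLH1 methylated, sampling probability)


def weighted_table(records: Iterable[Record]) -> Dict[Tuple[bool, bool], float]:
    """Sum inverse-probability weights within each exposure-by-outcome cell."""
    table: Dict[Tuple[bool, bool], float] = defaultdict(float)
    for exposed, methylated, sampling_prob in records:
        table[(exposed, methylated)] += 1.0 / sampling_prob
    return table


def crude_odds_ratio(table: Dict[Tuple[bool, bool], float]) -> float:
    """Odds of MLH1 methylation in exposed versus unexposed, from weighted cells."""
    a, b = table[(True, True)], table[(True, False)]
    c, d = table[(False, True)], table[(False, False)]
    return (a * d) / (b * c)


# Toy data only: a case sampled with probability 0.25 stands in for four cases.
toy = [(True, True, 0.5), (True, False, 1.0), (False, True, 1.0), (False, False, 0.25)]
toy_or = crude_odds_ratio(weighted_table(toy))
```

Because undersampled cases carry weights above 1, weighted frequencies can exceed the unweighted number of tumors.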

Overall survival

We used Cox proportional hazards regression to evaluate the association between MLH1 methylation status and overall survival among individuals with CIMP-positive colorectal cancer, accounting for sampling weights, using time since diagnosis as the time axis. Survival analyses were adjusted for age, sex, tumor site (proximal vs. distal/rectal), and family history of colorectal cancer in at least one first-degree relative (yes, no); analyses were further adjusted for study site via stratification of the baseline hazards. Including tumor stage in the model did not modify the results. In addition to this primary analytic model, we conducted sensitivity analyses adjusting for MSI, BRAF, and KRAS mutation status as well as analyses stratifying on these tumor attributes.

For all analyses, statistical significance was defined as a P-value ≤ 0.05 in a two-sided test. Statistical analyses for the clinicopathologic variables (age, sex, tumor site, MSI status, family colorectal cancer history, and BRAF and KRAS mutations) were performed using SAS 9.3 (SAS Institute Inc.) and the survival analysis was conducted using Stata 13 (Stata Corp.).

The processed samples yielded a total of 3,660 colorectal cancers with CIMP classification: 3,544 primary colorectal cancers from case probands and 116 colorectal cancers from affected relatives. Of these primary colorectal cancers, 108 case probands (3.0%) were excluded for having been interviewed more than 5 years after diagnosis, 203 (5.7%) for missing RFQ data, and 104 (2.9%) for missing tumor site data or sampling weights. The final data set included 3,119 (unweighted) primary colorectal cancers from case probands, 411 (unweighted) of which were classified as CIMP-positive and included in this analysis. After applying the sampling weights and deleting one proven Lynch Syndrome subject, a total of 786 tumors were included in the analysis.

Results

The overall risk factor distribution and the prevalence of DNA methylation for the five CIMP markers and MLH1 in all CIMP-positive tumors included in this analysis are presented in Table 1. More than 90% of the CIMP-positive study population was older than age 50 and 59% were female. Eighty-one percent (81%) of all CIMP-positive tumors were in the proximal colon and approximately half were MSI-H. BRAF and KRAS mutations were present in 62% and 18% of tumors, respectively, and 19% of all subjects reported a history of colorectal cancer in one or more first-degree relatives. MLH1 was methylated in 51% of CIMP-positive tumors.

We next determined the associations between clinicopathologic factors and MLH1 methylation status in both univariate and multivariate analyses (Table 2). In univariate analysis, MLH1-unmethylated tumors occurred significantly more often in those under 50 years of age at diagnosis, in men, and in distal and rectal tumors (all P-values <0.0001). Individuals with MLH1-unmethylated tumors were significantly less likely to report a colorectal cancer diagnosis in a first-degree relative (P = 0.0226). Overall, 7.5% of MLH1-unmethylated tumors were also MSI-H. In addition, MLH1-unmethylated tumors were significantly more likely than tumors with MLH1 methylation to be classified as MSI-L (20.7% vs. 0.25% respectively; P < 0.0001). CIMP-positive tumors with MLH1 methylation were mainly classified as MSI-H (98%), as expected. Tumors without MLH1 methylation were significantly less likely to have a BRAF V600E mutation than MLH1 methylated tumors (42.3% vs. 86.2% respectively; P < 0.0001) and were more likely to have a mutation in KRAS codon 12 or codon 13 (39.5% vs. 2.1% respectively; P < 0.0001). Including MLH1 in the panel defining CIMP status did not change any results (39.5% mutated vs. 3.0% mutated, respectively). Except for family colorectal cancer history, these associations were not altered in a multivariate analysis mutually controlling for all the variables except for MSI status. The odds ratio (OR) for family colorectal cancer history was essentially null after multivariate adjustment. There was a low to moderate correlation between the PMRs of all markers with MLH1 methylation and each other. The correlation coefficients ranged from 0.017 to 0.48.

Table 3 shows the association between MLH1 methylation and overall survival in multivariate analyses. Patients with CIMP-positive, MLH1 DNA methylated tumors were 50% less likely to die during the observation period of up to 15 years in univariate analysis [hazard ratio (HR), 0.50; 95% confidence interval (CI), 0.31–0.82]. This association was no longer significant after adjustment for MSI status, but the point estimate did not change materially. Adjustment for BRAF or KRAS mutation status also did not substantially change the estimates. In stratified analyses, the HR associated with MLH1 methylated status was significantly decreased only for cases with a mutated BRAF (HR, 0.41; 95% CI, 0.22–0.77) or a wild-type KRAS (HR, 0.46; 95% CI, 0.25–0.84).
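As a reading aid for the reported interval, the sketch below back-calculates the standard error of the log hazard ratio and an approximate two-sided P-value from a published HR and 95% CI. This is not a reanalysis: it assumes the interval is a symmetric Wald interval on the log scale, it is limited by the rounding of the published values, and the function name is hypothetical.

```python
import math


def wald_summary(hr: float, ci_low: float, ci_high: float, z_crit: float = 1.959964):
    """Approximate SE of log(HR), z statistic, and two-sided P from a 95% CI.

    Assumes the CI was computed as exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_two_sided


# Reported overall estimate: HR 0.50, 95% CI 0.31 to 0.82.
se, z, p = wald_summary(0.50, 0.31, 0.82)
```

A quick consistency check under the same assumption is that the geometric mean of the interval limits, `math.sqrt(0.31 * 0.82)`, should be close to the reported point estimate.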

Discussion

In this large population-based sample, MLH1 methylation in CIMP-positive tumors was associated with the characteristics commonly observed for MSI-H and CIMP-high tumors relative to non-CIMP tumors (10). CIMP-positive MLH1-unmethylated tumors were more prevalent in men, and significantly more common in younger age groups and in distal tumors than CIMP-positive, MLH1-methylated tumors. The prevalence of BRAF and KRAS mutations also differed significantly between the MLH1-unmethylated and MLH1-methylated tumors.

Overall, our results are consistent with those of other studies. Kim and colleagues observed a highly significant difference in the sex ratio in individuals with CIMP-positive, MLH1-methylated tumors (55% female) compared with individuals with CIMP-positive tumors where MLH1 was unmethylated (9% female), in a Korean population (11). An increase in the prevalence of distal tumors among those without MLH1 methylation compared with those with MLH1 methylation was also seen in that study, as well as a significantly lower prevalence of BRAF mutation and higher prevalence of KRAS mutations in unmethylated tumors, as we observed here. However, all tumors were MSI-H in that study and the prevalence of Lynch Syndrome in the small set of CIMP-positive MLH1-unmethylated tumors was not provided. Similar differences were observed in studies comparing CIMP/MSI-H tumors (as a proxy for MLH1 DNA methylation) to non-CIMP tumors (4, 7, 24) and studies that assessed risk factors for MLH1 DNA methylation in CIMP-positive tumors compared with non-CIMP tumors (12, 25).

The different age distributions in subjects with MLH1-unmethylated tumors compared with those with an MLH1-methylated tumor suggest a role for age-associated epigenetic drift as a factor promoting MLH1 methylation in our study population (26, 27). A positive association between age and MLH1 DNA methylation has been reported (28, 29). Looking only at CIMP-positive MLH1-methylated or CIMP-positive MSI-H tumors, an association between MLH1 methylation or MSI-H and older age was observed in multiple studies (4, 7, 11, 30), supporting this conclusion.

A preponderance of females in CIMP-positive tumors is not a universal finding. Several studies report a higher CIMP prevalence in male subjects (13, 31–34) as we observed for the CIMP-positive, MLH1-unmethylated tumors in our population. It may be significant that four of these five studies were of Asian populations. In addition, in a previous analysis of this population, we observed that several risk factors for a CIMP-positive tumor were significantly associated with a CIMP-positive tumor only among females (9). Taken together, these data suggest that female sex is a proxy for more biologically relevant exposures that vary between populations.

Several previous studies of potential CIMP markers have reported that there may be more than one type of CIMP-positive tumor (35–40). In these studies, using significantly more marker genes, tumors designated as CIMP could be further differentiated statistically into a CIMP-Low/CIMP2/Intermediate (IME) or low methylation (LME) group and a CIMP-High/CIMP1/High methylation (HME) group with different marker gene sets. Tumors classified as CIMP-Low were significantly more likely to be MSS and have a KRAS mutation in these populations compared with tumors classified as CIMP-High. This is similar to what we observed in our data after stratifying on MLH1 methylation alone, even though all tumors used in this analysis were classified as CIMP-positive using a marker panel for what is variably called CIMP-H (35, 37, 39), CIMP1 (38), or HME (36, 40). We note that MLH1 methylation status was not a criterion for establishing CIMP status in the current and other analyses (35, 38, 39). However, redefining CIMP status using a six marker panel that included MLH1 methylation did not change any of the observed associations. Future studies are required to assess directly whether MLH1 methylation is associated with CIMP-Low or if different types of CIMP tumors must be identified using separate marker panels.

Establishing that a tumor has an MSI-H phenotype has important prognostic value (41) and may also be predictive of treatment response (42). Data from a meta-analysis concluded that the MSI-H phenotype, due mainly to MLH1 methylation, is associated with a better prognosis than tumors with the MSS phenotype (41) so identifying the two different types of CIMP-positive tumors, those in which the MLH1 promoter region DNA is not methylated (MSS), and those with MLH1 methylation (MSI-H) may be of clinical relevance. In a recent meta-analysis, CIMP was associated with a bad prognosis in both MSI-H and MSS tumors (43). In our data, using only CIMP-positive tumors, overall survival was decreased in those without MLH1 methylation compared with those with MLH1 methylation. This is consistent with the close association between MLH1 methylation and MSI-H and the reduction in significance after controlling for MSI status. However, in our population, the better prognosis was limited to subjects with a tumor that was KRAS-wild-type, most of whom also had tumors that were BRAF mutated. The presence of the BRAF V600E mutation was associated with a poor prognosis overall in a meta-analysis (44) whereas data from several studies suggest that the poor prognosis may be ameliorated by the presence of MSI-H (45–49). In one of those studies survival was significantly better in MSI-H, BRAF mutated tumors relative to MSS, BRAF wild-type tumors (46). Reaching a consensus about the modifying effect of BRAF mutation on prognosis in CIMP-positive MLH1 methylated tumors will require further study.

The strengths of our study include the facts that it is the largest study to date of CIMP-positive colorectal cancer tumors and that we utilized a population-based sample weighted for age, race, and family history so that our results can be generalized. We limited the current analysis to CIMP-positive tumors so that unmeasured causal factors were equally relevant for both unmethylated and methylated tumors and there can be no confounding by differential effects of unmeasured exposures relevant mainly to non-CIMP tumors. Our study also has some weaknesses. Although we used a set of well-characterized markers to define CIMP status (6), an eight-marker CIMP panel has been described (3), which may have eliminated some tumors from this analysis had it been used instead (50). Alternatively, some CIMP tumors in the parent data set may not have been identified as CIMP using our marker panel and so were wrongly excluded from the analysis. In addition, it is possible that some MLH1-unmethylated tumors should have been classified as CIMP-Low as discussed above. It is unclear how this misclassification may have biased the results. Studies using genome-wide methylation techniques to improve CIMP classification, which we were unable to do, may decrease this type of misclassification in the future. Our assay for MLH1 methylation was limited to eight CpG sites in the previously described C region of the promoter (51), which might have caused us to misclassify some methylated tumors as unmethylated, biasing our results toward null values. We were unable to control for MSI status in our multiple regression analysis due to the small number of MLH1-methylated MSS tumors, so there may be residual confounding by this variable in the multivariate ORs. It is unclear how this may have biased our adjusted ORs. KRAS mutation data were missing for 16% of tumors, although approximately equally in MLH1-methylated (15.2%) and -unmethylated subsets (17%), and we only looked at KRAS mutations in codons 12 and 13. Thus, our data for KRAS mutations may be biased toward the null value. Finally, we were not able to assess or control for the association between MLH1 methylation and the MLH1-93GA genotype (rs1800734) because only about 25% of the study population was genotyped for that SNP.

In conclusion, our analysis suggests that there are significant differences between CIMP-positive tumors with MLH1 DNA methylation and CIMP-positive tumors without such methylation for variables classically associated with CIMP. The differences included exposures (e.g., age, sex, and tumor site) that cannot be consequences of the CIMP phenotype. These results, in the context of data from other populations, are consistent with the hypothesis that MLH1 methylation in CIMP-positive tumors is not a completely random event and imply that there are environmental or genetic determinants that modify the probability that MLH1 will become methylated during CIMP pathogenesis. This suggests that etiologic studies of the CIMP pathway may need to stratify on MLH1-methylation status.

Disclosure of Potential Conflicts of Interest

D.J. Ahnen is a consultant/advisory board member for EXACT Sciences Inc. and Cancer Prevention Pharmaceuticals. D.J. Weisenberger is a consultant/advisory board member for Zymo Research Corporation and has ownership interest (including patents). Zymo did not contribute to this work and has no interest in the outcome of this research. No potential conflicts of interest were disclosed by the other authors.

Disclaimer

The content of this article does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating centers in the Colon Cancer Family Registry (CCFR), nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. government or the CCFR.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Acknowledgments

The authors thank the members of the Colon Cancer Family Registry for their contributions and dedicated work on this project and all the subjects who provided their time and effort in providing the data.