Abstract

Therapies that target estrogen signaling have made a very considerable contribution to reducing mortality from breast cancer. However, resistance to tamoxifen remains a major clinical problem. Here we have used a genome-wide functional profiling approach to identify multiple genes that confer resistance or sensitivity to tamoxifen. Combining whole-genome shRNA screening with massively parallel sequencing, we have profiled the impact of more than 56,670 RNA interference reagents targeting 16,487 genes on the cellular response to tamoxifen. This screen, along with subsequent validation experiments, identifies a compendium of genes whose silencing causes tamoxifen resistance (including BAP1, CLPP, GPRC5D, NAE1, NF1, NIPBL, NSD1, RAD21, RARG, SMC3, and UBA3) and also a set of genes whose silencing causes sensitivity to this endocrine agent (C10orf72, C15orf55/NUT, EDF1, ING5, KRAS, NOC3L, PPP1R15B, RRAS2, TMPRSS2, and TPM4). Multiple individual genes, including NF1, a regulator of RAS signaling, also correlate with clinical outcome after tamoxifen treatment.

Approximately 70% of breast tumors express estrogen receptor α (ER) (1), which binds and mediates many of the effects of the hormone estrogen. Estrogen signaling is known to modulate several processes relevant to tumorigenesis mostly by the activity of ER as a transcription factor (2). After binding estrogen ER interacts with coactivators, resulting in the regulation of histones and gene expression (3). Of particular importance are the effects of estrogen/ER on cyclin D1 (4) and c-Myc expression (5), which are likely drivers of estrogen-stimulated cellular proliferation.

The dependence of a significant proportion of breast cancers upon estrogen signaling has been studied since the 1890s (6) and culminated in the development of pharmacological agents that inhibit ER signaling, including tamoxifen (7). Tamoxifen has gone on to become the most widely used drug in managing breast cancer. However, as with many cancer treatments, resistance to tamoxifen is a significant issue, and up to 40% of early-stage breast cancer patients who receive tamoxifen as an adjuvant therapy eventually relapse with tamoxifen-resistant disease (8).

Despite intense study, the molecular alterations that underlie endocrine therapy resistance are not fully understood, and this has limited the development of effective approaches for preventing and overcoming resistance. Nevertheless, two general mechanisms have been proposed to explain the development of resistance: (i) continued ER signaling in the presence of ER antagonists or the absence of estrogen (9), and (ii) the use of non-ER pathways that circumvent the reliance upon ER signaling (10). The activity of signal-transducing kinases has been implicated in both of these mechanisms (11, 12), and considerable effort has been made to characterize the role of individual genes in endocrine therapy resistance, with the notable findings that PAK1 and AKT activation can cause resistance to tamoxifen (13, 14). However, although candidate-based studies have been informative, a complementary approach is to interrogate the entire genome to uncover potential unique mechanisms of resistance, and high-throughput RNAi screening allows such systematic analysis to be performed (15).

Here we identified multiple genes that modulate the cellular response to tamoxifen by carrying out an unbiased genome-wide functional screen, coupling an shRNA interference library to pool deconvolution by massively parallel sequencing.

Before MCF7 cell infection, we divided the genome-wide shRNA library into six pools (each pool comprising 9,600 shRNAs) and infected cells with each pool separately. Cells were infected at multiplicity of infection of 0.7, and on average 2,000 cells per shRNA construct were infected (SI Appendix, Fig. S2). Seventy-two hours after infection, cells were selected in puromycin for 2 d to remove the nontransduced fraction and then divided into two samples: one subsequently exposed to 4OHT solubilized in ethanol, the other exposed to ethanol alone. Five days after initial infection, drug exposure was initiated and sustained for 21 d to model chronic tamoxifen exposure used clinically. A final 4OHT concentration of 500 nM was used, which caused 40% inhibition of cell survival over the time course of the screen (surviving fraction 60%, SF60). To identify shRNA constructs that modulated the response to 4OHT, we estimated shRNA frequency in 4OHT and vehicle-treated cultures at the end of this 21-d period using massively parallel sequencing.

After PCR products were sequenced (SI Appendix, Table S1), each short read was matched to the corresponding RNAi target in the reference library. We then used the frequency of each individual short-read sequence to estimate the frequency of shRNAs in each surviving population. To identify shRNAs that conferred either 4OHT resistance or sensitivity effects, we compared the representation of shRNAs in the 4OHT-treated samples with those in vehicle-treated samples. First, data from each of three biological replicate screens were normalized to account for experimental intrascreen variation (variation between different viral pools). In doing this, we also corrected biases associated with differential starting representations of shRNAs. Second, we applied an interreplica screen rank normalization and used these values to ascribe each shRNA a drug effect (DE) score that represented the magnitude of 4OHT resistance (positive DE) or sensitivity (negative DE) (SI Appendix, Fig. S1B). Here overrepresentation of an shRNA in a 4OHT-treated sample indicated a resistance-causing effect, whereas underrepresentation indicated a sensitization effect.

Identification of Screen Hits.

To define 4OHT-modulating effects from the screen data, we used three parallel strategies previously used in RNA interference screens. In the first instance we selected significant effects according to the variance of the entire dataset. We calculated the median absolute deviation (MAD) to estimate the variance of the normalized data (16) and defined resistance-causing hits as those shRNAs that gave DE scores >2 × MAD (Z score >2), a threshold approximately equal to 2 SDs from the median. Sensitization effects were defined as shRNAs that returned DE scores of Z < −2. In addition to this approach we also used RNAi Gene Enrichment Ranking (RIGER) (17), as implemented in the Broad Institute's GENE-E software package. In brief, RIGER calculated the weighted sum of the two top-ranked shRNAs for each gene on the basis of the log fold change between three biological replicates for each condition, and provided a normalized enrichment score per gene. Finally, we also used RNAi Set Analysis (RSA), a modification of Gene Set Analysis (http://www-stat.stanford.edu/∼tibs/GSA/) that uses maximum–mean statistics to identify significantly enriched or depleted shRNA sets.

In total, Z score threshold identified 1,049 resistance-causing genes and 1,126 sensitization genes, RIGER generated a list of 504 resistance-causing and 498 sensitization genes with a P value of <0.05, and RSA generated a list of 443 resistance-causing genes and 477 sensitization genes with a false discovery rate approaching zero. Given the limitations of each method, we took a pragmatic approach and considered a subset of the genes identified by all three methods for further examination. This intersection approach identified 121 candidate genes mediating sensitization and 131 candidate genes mediating resistance to tamoxifen (Fig. 1A and SI Appendix, Table S2).

Detection of tamoxifen sensitization and resistance-causing effects. (A) Venn diagrams indicating the number of candidate hits defined by three parallel analysis methods. (B) Plot of shRNA DE Z scores ranked by size of effect. Data from nonsilencing shRNAs is highlighted, as are data from shRNA targeting PTEN or ESR1, the gene encoding ERα.

We examined screen performance using positive and negative controls. As expected, nonsilencing control shRNAs did not modulate the cellular response to 4OHT, whereas four shRNAs targeting ESR1, the gene encoding ERα, caused sensitivity to 4OHT (Fig. 1B). In vitro functional studies as well as clinical data suggest that reduced PTEN activity causes resistance to tamoxifen (17). Five shRNAs targeting PTEN caused 4OHT resistance in the screen (Fig. 1B), and three PTEN shRNAs chosen for further investigation caused 4OHT resistance as assessed using a GFP competition assay (SI Appendix, Fig. S3 A and C). Similarly, an siRNA pool targeting PTEN was able to recapitulate the resistance phenotype in MCF7 cells treated with either 4OHT or a pure antiestrogen, ICI182,780, using a previously validated 96-well arrayed method (15) (SI Appendix, Fig. S3 B and C). As the PTEN effect identified in the genome-wide screen was confirmed using two orthogonal assays, we reasoned that the screen itself and the analysis methods used were able to identify true endocrine therapy-modulating effects.

Validation of Screen Effects.

Despite their utility, RNAi screens are prone to artifacts such as off-target and screen format-specific effects (18). To overcome these issues, we selected genes identified in the genome-wide shRNA screen and examined them using an independent mode of RNA interference (siRNA) and a 96-well arrayed method. The target sequences for siRNAs used in this validation step did not overlap with the target sequences of shRNAs present in the initial screen (SI Appendix, Table S3), and this allowed us to minimize the impact of off-target effects.

Validation of individual tamoxifen resistance- or sensitivity-causing effects. (A) Effect of siRNA on survival in 4OHT. MCF7s were transfected with siRNA duplexes in replica plates and 48 h later exposed to 1 μM 4OHT or drug vehicle. Seven days later cell viability was determined using Cell Titer Glo. Surviving fraction for each siRNA was calculated using the calculation SFgene x = luminescence in 4OHT-treated wells/luminescence in similarly transfected vehicle-treated wells. (B) Individual dose–response curves for 11 resistance-causing genes. Experimental setup was as in A but using a 4OHT titration. (C) Individual dose–response curves for 11 sensitivity-causing genes.

Network analysis using STRING (19) identified a number of well-established molecular associations among these validated genes (Fig. 3A). These included the identification of both components of the UBA3/NAE1 heterodimer, which mediates the conjugation of NEDD8, an ubiquitin-like protein, to substrates (20). siRNAs targeting either UBA3 or NAE1 (SI Appendix, Fig. S4), in addition to causing resistance to 4OHT, also caused resistance to ICI182,780 (21). The resistant phenotype was also reproduced by siRNA silencing of NEDD8 itself (Fig. 3B), thus validating our screen observations. Interestingly, members of this neddylation pathway, and the UBA3/NAE1 complex in particular, have previously been reported to mediate the degradation of ER (22).

We also demonstrated that a number of cohesion-associated/chromatin remodeling proteins modulated response to both 4OHT and ICI182,780. The cohesin protein complex is involved in maintaining sister chromatid proximity in dividing cells and has recently been implicated in mediating transcriptional insulation and control via interactions with CTCF (23). siRNA targeting RAD21 (24), SMC3, and NIPBL (25), caused 4OHT and ICI182,780 resistance (Figs. 2B and 3B). The precise mechanism whereby cohesin components affect tamoxifen sensitivity is not yet clear, but it seems possible that imbalance of these proteins alters the transcriptional profile of ER+ tumor cells, enabling continued growth in the face of endocrine therapy (23).

Finally, a number of proteins involved in RAS signaling were also found to modulate the response to 4OHT. Both shRNA and siRNAs targeting NF1 (neurofibromin), a GTPase tumor suppressor protein, caused 4OHT resistance (Figs. 2B and 3B). NF1 mediates the inactivation of classic RAS proteins (e.g., KRAS and HRAS) but also nonclassic RAS proteins, including RRAS2 (26). siRNAs targeting KRAS and RRAS2 both caused 4OHT sensitivity, and KRAS silencing also caused moderate ICI182,780 sensitivity (Figs. 2C and 3B), as did the downstream mediator of RAS signaling, c-RAF (RAF1) (Fig. 3B). To verify that silencing of NF1 affected RAS signaling in the face of 4OHT treatment, we measured levels of active RAS (GTP-RAS) and phosphorylated ERK in MCF7 cells treated with 4OHT, and these were indeed elevated with NF1 silencing (Fig. 4A). Furthermore, silencing of c-Raf in MCF7 cells with stable NF1 silencing alleviated tamoxifen resistance (Fig. 4B), supporting the hypothesis that these effects were, in part at least, via RAS/RAF signaling.

NF1 and tamoxifen response. (A) NF1 RNAi reagents that cause 4OHT resistance activate RAS and ERK in MCF7 cells treated with 4OHT. MCF7 cells were transfected with the indicated RNAi reagents and treated with 4OHT (500 nM for 24 h), after which levels of GTP-Ras and phosphorylated ERK were detected by immunoblotting. (B) Suppression of downstream RAS effector, c-RAF (RAF1), restores sensitivity to tamoxifen in MCF7s with stable NF1 knockdown. MCF7 cells were infected with NF1 shRNA and, after puromycin selection, subsequently transfected with nontargeting or RAF1 siRNAs. The response to 4OHT was then monitored using a 96-well format assay (15).

Clinical Significance.

Having identified a compendium of genes that modulated the response to endocrine therapy, we assessed whether they were associated with clinical response to endocrine therapy. To do this we interrogated publicly available microarray gene expression profiles from tumors of patients subsequently treated with adjuvant tamoxifen. Specifically, we used five patient datasets for this analysis: Stockholm GSE1456, n = 87 (STOCK in Fig. 5) (27); Oxford GSE6532, n = 109 (OXFT) (28); Karolinska Institute GSE3494, n = 72 (KIT) (29); Guy's Hospital GSE9195, n = 77 (GUYS77) (30); and Guy's Hospital GSE6532, n = 87 (GUYS87) (28). In each of these datasets, biopsies were taken from ER+ breast tumors before tamoxifen treatment. We obtained the normalized gene expression levels for each of the 23 functionally validated genes and compared these with a surrogate marker of tamoxifen response, the time to distant relapse. To quantify associations between gene expression and the likelihood of relapse/tamoxifen resistance, we calculated Cox proportional log hazard ratios (HRs) (31) for each gene as the relative risk of distant relapse in terms of gene expression. This analysis suggested that reduced expression of NF1, a gene whose silencing caused tamoxifen resistance in the functional screen, was associated with a statistically significant higher risk to distant relapse when considering all datasets (P < 0.05; Fig. 5A). Reduced NF1 expression did not, however, correlate with well-established prognostic factors (SI Appendix, Table S6), suggesting that it was a relatively independent marker of response to tamoxifen. A similar analysis of the validated sensitivity-causing effects indicated that reduced expression of EDF1 was associated with a statistically significant lower risk to distant relapse in a combined series of independent cohorts (P < 0.05; SI Appendix, Fig. S5).

Clinical significance of genes identified in the functional screen. (A) Low NF1 expression correlates with poor outcome in tamoxifen-treated patients. Forest plot indicates log HRs for increase in NF1 expression as estimated by Cox regression. Dots represent the weighted median effect and lines the 95% confidence interval (CI) for each of the five studies considered. Log HRs refer here to the relative risk of distant relapse in relation to gene expression, whereby positive and negative values represent decreasing or increasing risk, respectively, as gene expression decreases (zero suggests no correlation between gene expression and relative risk). A global negative log HR indicates that reduced expression of NF1 correlates with a poor outcome (high risk) in tamoxifen-treated breast cancer patients. As diamond limits define CI for the overall effect, the combined effect for NF1 is significant at P < 0.05. The most significant individual study is presented as a Kaplan-Meier survival curve in SI Appendix, Fig. S6. (B) Low expression of eight resistance-causing genes identified in the functional screen correlates with poorer outcome in tamoxifen-treated patients. Genes included in this analysis were BAP1, CDK10, NF1, NIPBL, PTEN, RARG, SMC3, and UBA3. (C) Low expression of six sensitivity-causing genes identified in the functional screen correlates with a favorable outcome in tamoxifen-treated breast cancer patients. Genes included in this analysis were EDF1, KRAS, PDPK1, RAF1, TMPRSS2, and TPM4.

Although interrogating the relationship between individual gene expression and clinical outcome parameters in tamoxifen-treated patients can provide some insight into the clinical relevance of the genetics of tamoxifen resistance, it is also possible that a constellation of individually modest gene effects combine to generate the final resistant phenotype. Given this, we tested whether aggregate measures of expression from groups or modules of genes (metagenes) also correlated with time to distant relapse in patients treated with tamoxifen. Using the same patient datasets and established methodology (32), we defined two functional metagenes for this analysis: one populated by genes shown in this or our previous study (15, 33) to cause resistance to tamoxifen, and a second similarly curated gene set defined by sensitivity-causing effects that also correlated with outcome data. In total, these modules comprised eight genes functionally predicting resistance [BAP1, RARG, PTEN, SMC3, NF1, NIPBL, UBA3, and CDK10 (15)] (Fig. 5B) and six genes functionally predicting sensitivity [EDF1, KRAS, RAF1, TMPRSS2, TPM4, and PDPK1 (33)] (Fig. 5C). This analysis suggested that both “resistance” and “sensitivity” metagenes correlated with clinical outcome (P < 0.05) and gave moderately greater predictions of outcome than the use of individual genes such as NF1.

Discussion

Here, we report a genome-wide functional screen to identify determinants of tamoxifen sensitivity. This screen and the subsequent validation experiments define a constellation of genes that modulate the cellular response to this widely used drug. Notable among the set of genes identified are multiple components of a neddylation complex, a series of genes that modulate cohesion-associated/chromatin remodeling and also multiple members of the RAS signaling cascade. Of these, the mRNA expression levels of genes such as NF1 appear to correlate with clinical outcome, as do two metagenes that encompass genes functionally shown to modulate the response to tamoxifen.

Our functional screen provides additional evidence for a key role of RAS/RAF/MEK/ERK signaling in determining the response to endocrine agents. Previously, we demonstrated that silencing of the kinase CDK10 was sufficient to cause 4OHT and ICI182,780 resistance, an effect likely mediated by the loss of CDK10 suppression on RAF1 expression and subsequent up-regulation of RAS/RAF/MEK/ERK signaling (15). Furthermore, other studies have demonstrated that transfection of constitutively active MEK1 or RAF1 cDNA expression constructs into MCF7 cells results in hyperactivation of p42/p44 MAPK and acquisition of antiestrogen resistance (34). In the genome-wide screen reported here shRNA targeting NF1 (neurofibromin), a GTPase that suppresses tumor formation by inhibiting RAS activation, caused 4OHT resistance, and RNAi targeting of KRAS caused 4OHT sensitivity, thus adding to the known nodal points within this signaling cascade that can modulate drug response. In addition, RRAS2 (TC21) silencing also resulted in tamoxifen sensitivity. RRAS2 is a RAS family member that shares more than 50% amino acid sequence identity with classic RAS proteins (HRAS, NRAS, and KRAS). RRAS2 encompasses almost identical functional domains to these latter proteins and accordingly shares a number of effectors, including RAF1 (35). A link between RRAS2 and tamoxifen response has already been suggested. Specifically, high RRAS2 protein expression in breast tumor biopsies has been shown to correlate with a shorter time to relapse in tamoxifen-treated patients, as does possession of a RRAS2 allele carrying the −582C→T RRAS2 promoter polymorphism present in ≈34% of European Caucasians (36). Our functional data support the case for RRAS2 playing a critical role in determining the response to tamoxifen and strengthens the argument for examining the functional effects of RRAS2 polymorphisms and how these relate to clinical outcome.

Genome-wide functional screens, such as that described here, should provide annotated gene lists that may act as a starting point for the in-depth dissection of clinically relevant phenotypes, in much the same way as microarray profiling has allowed an understanding of gene expression alterations associated with disease. As RNAi screening technology and reagents mature, the functional screening of selected gene subsets is now within the means of most laboratories. Furthermore, the experimental setup applied here, with the deconvolution of pooled shRNAs by massively parallel sequencing, allows genome-wide screens to be performed at a fraction of the cost and time compared with genome-wide plate arrayed screens and will likely enable the dissection of additional disease-based phenotypes.

Materials and Methods

Pooled shRNA Screen Method.

As a screening library, we used the OpenBiosystems GIPZ human genome-wide shRNA library. Lentiviral DNA was generated according to the manufacturer's instructions and then pooled to generate six pools, each containing 9,600 different shRNA constructs. Virus was packaged in 293T cells and MCF7 cells transduced with a final representation of ≈1,000 cells per shRNA (20,000,000 per shRNA pool). Transduction efficiency was estimated by GFP detection using FACS 72 h after infection. Cells were then cultured in 1 mg/L puromycin for 2 d to enrich for transduced cells and then continuously cultured in media containing 500 nM 4OHT or 0.5% (vol/vol) ethanol for 21 d. Media with drug was replenished every 2 to 3 d. MCF7 cells were cultured in phenol red-free RPMI-1640 media supplemented with 10% (vol/vol) charcoal-dextran stripped FBS. Massively parallel sequencing for shRNA retrieval and screen data analysis are described in SI Appendix, SI Methods and Materials.

Clinical Analysis.

This was performed using tumor samples from five independent datasets of ER+ patients subject to tamoxifen adjuvant therapy. Raw Affymetrix U133A microarray expression files were downloaded from the Gene Expression Omnibus repository (accession codes per study indicated in main text) and processed using the RMA function from the Bioconductor Affy package for R (37). Log–rank analyses and Kaplan-Meier plots were produced using the “survdiff” and “plot.survfit” S-plus (TIBCO Software) functions, respectively. Normalized gene expression values were partitioned into quartiles. Cox proportional hazards regression was carried out using the “coxph” S-plus function. Here we used gene expression values as the sole continuous predictor variable. Random effects metaanalysis was carried out using the β values estimated by this regression.

Acknowledgments

This work was funded by Breakthrough Breast Cancer, The Breast Cancer Research Foundation, and American Association for Cancer Research/Stand Up To Cancer. We acknowledge National Health Service funding to the Royal Marsden Hospital National Institute for Health Research Biomedical Research Centre.

Footnotes

↵1To whom correspondence may be addressed. E-mail: alan.ashworth{at}icr.ac.uk or Chris.Lord{at}icr.ac.uk.

(2010) Assessment of an RNA interference screen-derived mitotic and ceramide pathway metagene as a predictor of response to neoadjuvant paclitaxel for primary triple-negative breast cancer: A retrospective analysis of five clinical trials. Lancet Oncol11:358–365.

Researchers report biparental inheritance of mitochondrial DNA in 17 members of three unrelated multigeneration families, paving the way for insights into alternative mechanisms for the treatment of inherited mitochondrial diseases.

Researchers report a machine-learning approach to identify land plants at risk of extinction, suggesting that the approach can be used to guide policies aimed at allocating resources for biodiversity conservation.

A study explores how cats groom fur using fine structures called papillae on the surface of the tongue and presents a biologically inspired hairbrush to remove allergens from cat fur and apply medications on cat skin.