Since tissues and tumors are heterogeneous populations containing different cell types, their transcriptomes are blends of multiple mRNA expression profiles. Although fluorescence-activated cell sorting (FACS) allows isolation of individual cell types, RNA isolation and quantification remain problematic for rare subsets, such as tissue stem cells. Likewise, identification of transcriptional changes relevant to the tumorigenic potential of mammalian cells while they are actively growing as colonies in soft agar is hampered by limited amounts of starting material. Here we describe a convenient method that fills the gap between single-cell and whole-tissue mRNA analysis, enabling mRNA quantification for individual colonies picked from soft agar. Our method involves direct lysis, reverse transcription and quantitative PCR (RT-qPCR) on 500 sorted cells or a single soft agar colony, thus allowing evaluation of up to 20 transcripts in functionally distinct subpopulations without the need for RNA isolation or amplification.

Limitations in RNA isolation protocols when working with small numbers of cells currently restrict routine analysis of transcriptional changes in rare cell populations or in single-cell clones. For instance, tissue stem or progenitor cells are present at low frequencies, and although fluorescence-activated cell sorting (FACS) is routinely used to isolate these cellular subsets, standard RNA isolation protocols still require a relatively large number of cells. Alternatively, protocols geared toward low cell numbers or even single cells require RNA amplification unless only a few abundant transcripts are analyzed (1), are limited by the use of oligo-dT or gene-specific reverse-transcriptase (RT) primers, or require expensive microfluidic devices (reviewed in Reference 2). For our purposes, single-cell resolution was not required, but we found that yields from rare primary cell populations (maximum 10,000 cells) were too low for reliable quantitative reverse transcriptase-polymerase chain reaction (qRT-PCR) due to the manipulation and transfer steps during standard RNA isolation. Moreover, when directly investigating the transcriptional status of tumorigenic clones of several hundred cells growing in soft agar, these issues are exacerbated by the relative inaccessibility of cells in a solid matrix. To our knowledge, no protocol has been described in the literature that allows transcriptional profiling of mammalian cell clones directly from soft agar without subsequent culturing on plastic. Combining tissues from several mice or collecting hundreds of soft agar clones could increase the cell number to more than 50,000, for which optimized RNA recovery methods are described (3). However, besides increasing cost and labor, this introduces noise due to mouse-to-mouse variability and precludes detection of biologically relevant variations between clones.
A protocol that avoids RNA isolation by using osmotic lysis and direct reverse transcription of RNA for the non-quantitative detection of Bcr-abl transcripts from leukemic cells has been previously presented (4). Here we modified this osmotic lysis method by using a reverse transcription buffer containing a helper protein for more efficient cDNA synthesis. We show that our method for direct reverse transcription on 500 sorted cells allows increased resolution by interrogating small cellular subsets instead of the tissue as a whole, and is also applicable to individual clones growing in soft agar. Importantly, it is no slower or more laborious than standard RNA isolation/RT protocols, and its costs are comparable.

Method summary

Here we describe a modified quantitative PCR methodology that enables efficient expression analyses from a limited number of cells (500 or fewer). Our approach, which is validated using FACS-sorted cells and single soft agar colonies, fills a methodological gap between single-cell approaches and whole-tissue analyses.

We are interested in how the molecular wiring of distinct cell types in the mammary gland collaborates with particular oncogenes to give rise to different subtypes of breast cancer. Therefore, we use primary cell isolations from mouse mammary glands for cell sorting and a defined model of transforming human mammary epithelial cells for soft agar assays. Briefly, mammary glands from an 8- to 12-week-old mouse were collected in L15 medium and processed to single cells according to a published protocol (5) with minor modifications (Figure 1; protocol available as supplementary material). In particular, cell isolation was scaled down from a pool of 20 animals to enzymatic digestion of mammary glands from an individual animal. Single cells were stained with DAPI to exclude dead cells, with fluorescently labeled CD45 antibodies to exclude lymphocytes, and with labeled CD24 (heat stable antigen) and CD49f (α6 integrin) antibodies to identify mammary epithelial cells. The CD24 and CD49f cell surface markers separate the mammary epithelial population into two lineages: basal and luminal (Figure 1). Functional assays have found that most tissue stem cell activity resides in the cell population at the tip of the basal cloud (MRU, mammary repopulating units, which is the functional definition of mammary stem cells), whereas progenitor activity is enriched at the tip of the luminal cloud (CFC, colony forming cells) (6, 7).

Mammary epithelial cells are relatively large and are therefore sorted with a 100 µm nozzle at 20 psi, using a modified FACS collection strategy that allows two populations to be collected in PCR tubes in parallel (see detailed protocol in supplementary material). Sorting 500 cells with this nozzle size equates to approximately 2 µl of sheath fluid containing cells, which is sorted into tubes pre-filled with 10 µl lysis solution. Cell lysis is based on osmotic shock; therefore, the lysis solution is predominantly water with the addition of 0.15% Tween and an RNase inhibitor to prevent RNA degradation. Variations in lysis buffer were not tested for this application, but alternative lysis methods are summarized in Reference 8. We routinely sort three replicates of 500 cells per population of interest plus an additional tube with 500 cells for a minus RT control. Next, a mixture of RT enzyme and buffer is added to a final volume of 20 µl and the tubes are placed into a PCR machine for reverse transcription. We found that the Superscript III (Invitrogen, Grand Island, NY) reverse transcriptase formulation that contains a proprietary helper protein (Vilo) required up to six fewer cycles to pass the cycle threshold (Ct) compared with other RT enzymes we tested (data not shown). cDNA can be generated with the accompanying mastermix that contains random hexamers, or with a modular mastermix based on iScript Select buffer (Bio-Rad, Hercules, CA) that allows the separate addition of oligo-dT or gene-specific primers. We routinely use the iScript Select buffer with both oligo-dT and random hexamers (see supplementary protocol). Since DNase inactivation following digestion of genomic DNA is incompatible with subsequent PCR, and RNA purification results in unavoidable template loss, we used intron-straddling primers along with no-RT controls to check for genomic background signal.
If required, the genomic DNA contribution to the qPCR signal can be estimated using genomic DNA reference samples and primer sets (9). Quantitative RT-PCR by SYBR Green (Bio-Rad) incorporation was performed with 1 µl of cDNA in a 10 µl reaction, thus allowing up to 20 transcripts to be analyzed per sample of 500 cells.
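
The volume arithmetic behind these figures can be made explicit with a short sketch. It uses only the volumes quoted above (approximately 2 µl sheath fluid, 10 µl lysis solution, 20 µl final RT volume, 1 µl cDNA per qPCR reaction); the function names are illustrative and not part of the protocol, and the calculation assumes single reactions per transcript with no pipetting dead volume.

```python
# Volume and reaction budget for one tube of 500 sorted cells.
# Volumes are those quoted in the text; function names are illustrative only.

def rt_mix_volume_ul(sheath_ul=2.0, lysis_ul=10.0, final_rt_ul=20.0):
    """Volume of RT enzyme/buffer mix needed to reach the final RT volume."""
    return final_rt_ul - (sheath_ul + lysis_ul)


def qpcr_budget(cells=500, final_rt_ul=20.0, cdna_per_reaction_ul=1.0):
    """Return (maximum qPCR reactions, cell equivalents per reaction).

    Assumes no dead volume and one reaction per transcript.
    """
    max_reactions = int(final_rt_ul // cdna_per_reaction_ul)
    cells_per_reaction = cells * cdna_per_reaction_ul / final_rt_ul
    return max_reactions, cells_per_reaction


if __name__ == "__main__":
    print("RT mix to add (µl):", rt_mix_volume_ul())
    reactions, cell_equivalents = qpcr_budget()
    print("Maximum qPCR reactions:", reactions)
    print("Cell equivalents per reaction:", cell_equivalents)
```

Under these assumptions each qPCR reaction contains cDNA from the equivalent of 25 cells, and running technical replicates would reduce the number of transcripts that can be assayed per tube accordingly.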

We validated our sorting strategy using qPCR for keratin markers on cDNA derived from triplicate sorts of the basal and luminal populations (CD24loCD49fhi and CD24hiCD49flo, see Figure 1) and from the stem cell (MRU) and progenitor (CFC) populations. As shown in Figure 2, the luminal marker cytokeratin 8 (CK8) was robustly detected in the luminal populations with minimal variation between three independent collections of 500 cells (Figure 2A), while cytokeratin 14 (CK14) was detected specifically in the basal populations (data not shown). To determine whether our method is sensitive enough to detect regulatory genes, which are usually transcribed at much lower levels than genes encoding structural proteins like keratins, we examined the expression of Bmi1, p63 and Elf5. The polycomb gene Bmi1 is often considered a marker of tissue stem cells such as hematopoietic and intestinal stem cells (10, 11), but we previously found that in the mammary gland, Bmi1 is not only required for stem cell function but also prevents premature differentiation of the luminal lineage (12). Accordingly, our data indicate that even though Bmi1 is transcribed by the stem cell-containing basal lineage, its expression is higher in the luminal lineage (Figure 2B). In contrast, expression of the transcription factor p63 was specific for the basal lineage (Figure 2C), consistent with protein expression data (13). Unfortunately, no unique markers have yet been identified for the stem cell population, but the transcription factor Elf5 was enriched in the luminal progenitor cell population as described previously (14) (Figure 2D). Taken together, these results show that regulatory genes could be detected reproducibly with Ct values around 27 to 30 in the population with the highest expression, demonstrating that our method is sensitive enough to investigate genes of interest that are expressed at modest levels.
The data presented are normalized to hypoxanthine phosphoribosyltransferase (HPRT), chosen for its modest expression level and minimal variation between the different cell types (Figure 2E). The high reproducibility of housekeeping gene expression is an indication of consistent RNA quality between samples. It should be noted that with three different housekeeping genes the signal was detected at earlier cycles in the CFC population, suggesting that the RNA content of these cells is higher compared to the other cell types (Figure 2E).
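
Normalization to a reference gene such as HPRT is commonly expressed as 2^-(Ct_target - Ct_reference). The sketch below illustrates that calculation for replicate sorts. It assumes this standard delta-Ct formula and a PCR efficiency close to 100%, neither of which is stated in the text; the Ct values in the example are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Expression of a target relative to a reference gene (e.g., HPRT).

    Computes 2^-(Ct_target - Ct_reference), assuming ~100% PCR efficiency.
    """
    return 2.0 ** -(ct_target - ct_reference)


def summarize(target_cts, reference_cts):
    """Mean and standard deviation of relative expression across replicates.

    Each target Ct is paired with the reference Ct from the same tube.
    """
    values = [relative_expression(t, r)
              for t, r in zip(target_cts, reference_cts)]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical Ct values for three replicate sorts of 500 cells.
    target = [28.0, 28.2, 27.9]
    hprt = [25.0, 25.1, 25.0]
    avg, sd = summarize(target, hprt)
    print(f"relative expression: {avg:.3f} +/- {sd:.3f}")
```

Pairing each target Ct with the reference Ct measured in the same tube means that tube-to-tube differences in cell number or RT efficiency largely cancel out. It does not correct for a population-wide difference in RNA content such as that suggested for the CFC population.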