scTrio-seq identifies colon cancer lineages

To better design treatments for cancer, it is important to understand the heterogeneity in tumors and how this contributes to metastasis. To examine this process, Bian et al. used a single-cell triple omics sequencing (scTrio-seq) technique to examine the mutations, transcriptome, and methylome within colorectal cancer tumors and metastases from 10 individual patients. The analysis provided insights into tumor evolution, linked DNA methylation to genetic lineages, and showed that DNA methylation levels are consistent within lineages but can differ substantially among clones.

Abstract

Although genomic instability, epigenetic abnormality, and gene expression dysregulation are hallmarks of colorectal cancer, these features have not been simultaneously analyzed at single-cell resolution. Using optimized single-cell multiomics sequencing together with multiregional sampling of the primary tumor and lymphatic and distant metastases, we developed insights beyond intratumoral heterogeneity. Genome-wide DNA methylation levels were relatively consistent within a single genetic sublineage. The genome-wide DNA demethylation patterns of cancer cells were consistent in all 10 patients whose DNA we sequenced. The cancer cells’ DNA demethylation degrees clearly correlated with the densities of the heterochromatin-associated histone modification H3K9me3 of normal tissue and those of repetitive element long interspersed nuclear element 1. Our work demonstrates the feasibility of reconstructing genetic lineages and tracing their epigenomic and transcriptomic dynamics with single-cell multiomics sequencing.

Colorectal cancer (CRC), a major cause of mortality, is characterized by heterogeneous features of genomic, epigenomic, and transcriptomic alterations (1–4), which are not separate events, as multiple cellular processes may interact to promote tumorigenesis (5, 6). Intratumoral heterogeneity (ITH) across multiple layers of molecular features is a barrier for effective diagnosis and treatment (7). However, studies have been limited to analysis of bulk cells which consist of non–tumor cells and complex subclones and only reflect the average profiles of tumor samples. Single-cell genome and transcriptome sequencing have revealed ITH in several cancer types (8–12). However, single-cell sequencing has been limited in the ability to characterize multiple layers of molecular features of each genetic lineage.

Global SCNA patterns (250-kb resolution) of CRC01. Each row represents an individual cell. The subclonal SCNAs used for identifying genetic sublineages were marked and indexed; for details, see fig. S6B. On the top of the heatmap, the amplification or deletion frequency of each genomic bin (250 kb) of the non-hypermutated CRC samples from the TCGA Project and patient CRC01’s cancer cells are shown.

Most cancer cells from six of our study patients (CRC01, CRC03, CRC04, CRC06, CRC09, and CRC11) were assigned to the CMS2 group (fig. S3, A to C), a canonical CRC group with abnormal activation of the WNT/β-catenin and MYC signaling pathways, frequent SCNAs, and nonhypermutation (2). Whole-genome sequencing (WGS) verified a low frequency of somatic single-nucleotide variations (SNVs) in tumors from CRC01 (fig. S3D) (1) and identified an oncogenic mutation of NRAS and inactivating mutations of APC and SMAD4, consistent with the features of non-hypermutated CRC (1).

For the 10 patients for whom we had DNA methylation data, we performed SCNA profiling of individual cells at 250-kilobase (kb) resolution (Fig. 1 and figs. S4A and S5). Significant focal SCNAs and probable gene targets were identified (fig. S4B and table S2) (15). Additionally, the SCNA profiles estimated from the transcriptome data were consistent with those estimated from DNA methylation data (fig. S4C). The scTrio-seq2 data confirmed the presence of one cancer cell with homozygous deletions of several whole chromosomes (fig. S4C).

Genomic alterations in tumors provide markers for lineage tracing (16, 17). Clonal variants occur in early tumorigenesis, whereas subclonal SCNAs indicate the emergence of sublineages (Fig. 1 and figs. S6 and S7). As whole-chromosome or arm-level events are more likely to be acquired independently in different lineages (18, 19), we mainly used subclonal breakpoints within chromosome arms to identify genetic lineages (fig. S6). For five patients for whom we had methylation data (for >90 cells), cancer cells were classified into several genetic sublineages (Fig. 1 and figs. S5 to S7). For CRC01, we identified 12 sublineages originating from two distinct lineages (A and B) on the basis of 21 subclonal breakpoints; each sublineage was supported by 4 to 8 subclonal breakpoints (Fig. 1 and figs. S6 and S7). The sublineage A5 of CRC01 was detected in both LN (17%) and ML (87%), indicating that these metastases had a common origin (20). In all five patients, the primary tumors showed more complex subclonal structures than the metastases, indicating that metastases tend to be clonal (fig. S7, B and C).

(A) Single-cell global DNA methylation levels (1-kb tile) of each sublineage for patient CRC01. Colors represent sampling regions. Lines represent median values. (B) One representative DMR on chromosome 16 between genetic lineages A and B in the PT sample of CRC01. Each row shows one individual cell sorted by the mean methylation levels in the region. Each column shows one single CpG site. The bar plot on the right shows each cell’s mean methylation levels in the region. (C) The correlations between gene expression levels and DNA methylation levels across gene bodies and their 15-kb flanking regions (Spearman correlations). The gray lines represents individual cells. The blue line represents the mean value for each patient. TSS, transcription start site; TES, transcription end site.

We further traced the dynamics of DNA methylation and gene expression during metastasis within one single lineage each for CRC01 and CRC10. Global DNA methylation levels were relatively stable during metastasis and accompanied by changes in focal regions, such as promoters (fig. S11, A to C). We did not observe obvious changes pre- or postmetastasis in the expression levels or DNA methylation levels of epithelial–mesenchymal transition–related genes (fig. S11, D to G). Furthermore, we did observe molecular associations between the methylome and transcriptome of individual cells. Globally, the correlations between DNA methylation levels and gene expression levels were negative in promoter regions but positive in gene body regions in individual cells (Fig. 2C). The transcriptome groups were consistent with genetic lineages and DNA methylation groups for CRC04 but not for CRC01, CRC10, and CRC11 (fig. S12, A and B). Additionally, some promoters of genes associated with tumor proliferation and migration contained DMRs (21) (fig. S12, C and D).

Cancer cells experienced variable demethylation degrees across the whole genome, which were concordant within each genetic sublineage but varied in degree among different sublineages (Fig. 3A and fig. S13A). The relative demethylation degrees of cancer cells were correlated with the absolute DNA methylation levels of NC cells across the genome (fig. S13B), and this comparison suggested that the regions with higher DNA methylation levels in NC cells tended to undergo stronger demethylation in cancer cells.

(A) Unsupervised hierarchical clustering of relative DNA methylation levels in single cancer cells compared with NC cells (10-Mb tile). In the upper panel, the black points indicate the average DNA methylation levels of each genomic bin of normal cells, and the blue dashed line represents the mean level. (B) Box plot showing the Pearson correlations between the DNA demethylation levels of genomic bins in individual cancer cells and the densities of genomic features across a range of resolutions.

DNA demethylation showed positive correlations with the densities of L1 and H3K9me3 modifications (of NC cells) but negative correlations with the densities of the H3K4me3 modifications and open chromatin (of NC cells) (Fig. 3B). Similar correlations were also observed between SNVs and chromatin state in the two patients for whom there were WGS data (fig. S13C), consistent with chromatin organization influencing regional SNV frequencies (22). Interestingly, L1, which is evolutionarily younger and more active than LINE-2 (L2) (23), experienced significantly stronger DNA demethylation than L2 in cancer cells for all patients (P < 2.7 × 10−4, Wilcoxon rank-sum test) (fig. S13, D and E). This is in contrast to DNA demethylation during embryonic development, where L1 tends to retain higher DNA methylation levels than L2 (24, 25). These results suggest that during tumorigenesis and progression, L1 and heterochromatin regions experience aberrant DNA demethylation, breaking the rules for normal development.

We further explored the chromosomal patterns of aberrant DNA methylation and genome instability of CRC. Chromosomal demethylation degrees were variable, with six chromosomes (4, 5, 8, 13, 18, and X) exhibiting stronger DNA demethylation than others (Figs. 4, A and B, and fig. S14A). These six chromosomes were also significantly enriched for stronger DNA demethylation in most patients (P < 0.05, hypergeometric test) (fig. S14B). Additionally, population analyses of the non-hypermutated CRC samples from the TCGA Project and our study showed that three of the hypomethylated chromosomes recurrently generated SCNAs (chromosomes 8, 13, and 18) (Fig. 4C and fig. S14C). Using WGS data, we found that five of the hypomethylated chromosomes were also significantly enriched for SNVs (chromosomes 4, 5, 8, 13, and X) (P < 0.05, hypergeometric test) (fig. S14D).

(A) The chromosomal DNA methylation levels of NC cells and lineage B of CRC01. Chromosomes are ranked by their mean methylation levels within each lineage. The lines link the same chromosomes. The six chromosomes with the lowest DNA methylation levels in lineage B are highlighted in red. (B) The residual ratios of chromosomal methylation levels of lineage B compared with those in NC cells of CRC01. (C) Circos plot showing the amplification or deletion frequency of each genomic bin (250-kb resolution) of the nine CRC patients in our study. The minimum scale of coordinates is 18 Mb. Some oncogenes and tumor suppressor genes are labeled.

Unambiguously assigned cancer cell lineages enable characterization of their DNA methylation and gene expression features. We found that DNA methylation profiles are relatively stable within a single genetic lineage or sublineage. The differences in DNA methylation levels between the primary tumors and metastases could be mainly caused by the differences in the compositions of sublineages, but not de novo methylation or demethylation during metastasis. The DNA demethylation patterns of individual cancer cells were consistent among the 10 CRC patients whose DNA we sequenced. Thus, single-cell multiomics sequencing provides insights and resources for understanding the molecular alterations that occur during CRC progression and metastasis.

Acknowledgments: We thank I. Bruce for polishing our manuscript. Funding: F.T. was supported by grants from the National Natural Science Foundation of China (81561138005, 31625018, and 81521002). This work was supported by the Beijing Advanced Innovation Center for Genomics at Peking University. Author contributions: F.T., J.Q., and W.F. conceived the project. S.B., Y.H., and F.T. wrote the manuscript with help from all authors. Y.H., X.Z., X.L., Jun.Y., Y.W., W.W., Jia.Y., H.G., S.G., and Y.M. performed the experiments. S.B., X.L., B.H., J.D., and P.Z. conducted the bioinformatics analyses. Competing interests: The authors declare they have no competing interests. Data and materials availability: The scTrio-seq2 data have been submitted to the NCBI Gene Expression Omnibus (GEO) under accession number GSE97693. WGS data have been deposited in the European Genome-phenome Archive (EGA) under the accession number EGAS00001003242. Analysis code is available at https://github.com/bianshuhui/CRC_code.