Abstract

Metastasis is a complex biological process that has been difficult to delineate in human colorectal cancer (CRC) patients. A major obstacle in understanding metastatic lineages is the extensive intra-tumor heterogeneity at the primary and metastatic tumor sites. To address this problem, we developed a highly multiplexed single-cell DNA sequencing approach to trace the metastatic lineages of two CRC patients with matched liver metastases. Single-cell copy number or mutational profiling was performed, in addition to bulk exome and targeted deep-sequencing. In the first patient, we observed monoclonal seeding, in which a single clone evolved a large number of mutations prior to migrating to the liver to establish the metastatic tumor. In the second patient, we observed polyclonal seeding, in which two independent clones seeded the metastatic liver tumor after having diverged at different time points from the primary tumor lineage. The single-cell data also revealed an unexpected independent tumor lineage that did not metastasize, and early progenitor clones with the "first hit" mutation in APC that subsequently gave rise to both the primary and metastatic tumors. Collectively, these data reveal a late-dissemination model of metastasis in two CRC patients and provide an unprecedented view of metastasis at single-cell genomic resolution.

Single-cell and bulk population experimental workflow. (A) The frozen primary tumors and liver metastases from two CRC patients were dissociated into nuclear suspensions and stained with DAPI. (B) Single nuclei and populations of cells were gated and flow-sorted by ploidy distribution. (C) To detect mutations, single nuclei were amplified by multiple displacement amplification (MDA), and libraries were captured using the T1000 cancer gene panel. Copy number detection was performed on single nuclei using degenerate oligonucleotide-primed PCR (DOP-PCR). Millions of cells were isolated in parallel for standard exome sequencing. Barcoded libraries were constructed and captured for targeted cancer gene panels or exome panels. Libraries were pooled for next-generation sequencing on the Illumina platform.

Concordance of mutations in bulk primary and metastatic tumors. (A,B) Scaled Venn diagrams reflect the total number of mutations (synonymous and nonsynonymous) identified by exome sequencing of the bulk flow-sorted tumor cells from the primary and metastatic tumors. (C,D) Dot plots showing the variant allele frequencies of the nonsynonymous mutations in the primary and metastatic tumors. (E) Targeted deep amplicon sequencing of the metastasis-specific mutations in the primary tumor and matched normal tissue. Significance of the mutations based on the variant read counts was determined using deepSNV and a Bayesian hypothesis test (Methods).
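As a minimal sketch of the comparison summarized in panels A-D, the following Python snippet partitions mutation calls into shared, primary-specific, and metastasis-specific sets (the three regions of a two-set Venn diagram) and computes a variant allele frequency as variant reads divided by total reads. The function names are hypothetical, and the mutation identifiers and read counts are invented placeholders, not data from this study.

```python
def partition_mutations(primary, metastasis):
    """Split two sets of mutation calls into the three Venn regions."""
    return {
        "shared": primary & metastasis,
        "primary_only": primary - metastasis,
        "metastasis_only": metastasis - primary,
    }


def variant_allele_frequency(variant_reads, total_reads):
    """VAF = variant reads / total reads; None when the site has no coverage."""
    if total_reads == 0:
        return None
    return variant_reads / total_reads


# Hypothetical mutation calls (placeholders, not taken from the paper).
primary_calls = {"geneA:c.100C>T", "geneB:c.200G>A", "geneC:c.300A>G"}
metastasis_calls = {"geneA:c.100C>T", "geneB:c.200G>A", "geneD:c.400T>C"}

regions = partition_mutations(primary_calls, metastasis_calls)
counts = {name: len(muts) for name, muts in regions.items()}
```

In this toy input, a metastasis-only call is the kind of site that panel E would then interrogate by targeted deep sequencing of the primary tumor.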

Single-cell mutational profiling of matched primary and metastatic tumors. Targeted cancer gene panel (T1000) sequencing data of point mutations in 372 single cells from the primary colon and liver metastatic tumors from patients CRC1 and CRC2. (A,B) Multidimensional scaling analysis, in which each dot represents a single cell. Cells are colored by the flow-sorting distribution from which they were isolated. (C,D) Two-dimensional clustered heat maps of the single-cell mutation data (T1000), with clusters labeled by color above. Nonsynonymous mutations are labeled in bold, while synonymous mutations are labeled in regular text. Populations of flow-sorted aneuploid tumor cells that were sequenced on the T1000 panel from the primary and metastatic tumors are shown on the right-hand side and labeled as “pop.” Blue bars represent mutations, light gray bars represent reference alleles, dark gray bars represent false positives, and white bars represent sites with low or no coverage (NA).
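Multidimensional scaling and clustering both start from a cell-by-cell distance matrix. The sketch below shows one way such a matrix could be built from binary mutation calls while skipping sites with low or no coverage (NA). The distance definition is an illustrative assumption, not necessarily the metric used in this study, and the cell names and calls are invented.

```python
def mutation_distance(cell_a, cell_b):
    """Fraction of differing sites, counting only sites covered in both cells.

    Each profile is a list of 1 (mutation), 0 (reference), or None (NA).
    Returns None if the two cells share no covered sites.
    """
    compared = 0
    mismatches = 0
    for a, b in zip(cell_a, cell_b):
        if a is None or b is None:
            continue  # skip sites that are NA in either cell
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        return None
    return mismatches / compared


def distance_matrix(profiles):
    """Pairwise distances keyed by (cell, cell) name pairs."""
    names = list(profiles)
    return {
        (x, y): mutation_distance(profiles[x], profiles[y])
        for x in names
        for y in names
    }


# Hypothetical single-cell calls over four sites (placeholders only).
profiles = {
    "cell_1": [1, 1, 0, None],
    "cell_2": [1, 1, 0, 1],
    "cell_3": [0, None, 1, 1],
}
matrix = distance_matrix(profiles)
```

Restricting each comparison to mutually covered sites keeps allelic dropout and coverage gaps from being counted as differences, at the cost of distances that rest on different numbers of sites per cell pair.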

Early APC progenitor cells detected in CRC1. Raw sequencing reads and variant alleles are plotted for three diploid APC progenitor cells (PD16, PD41, PDD93) and one representative primary tumor cell (PA74) from the major tumor population at genomic regions where mutations were detected in APC, KRAS, TP53, and TCF7L2. Plots and read counts were generated using the Integrative Genomics Viewer (IGV).

Single-cell copy number profiling of primary and metastatic tumors. (A,B) Multidimensional scaling (MDS) plots of single-cell copy number profiles from patients CRC1 and CRC2. (C,D) Hierarchical one-dimensional clustered heat maps of single-cell integer copy number profiles from patients CRC1 and CRC2. Heat map colors correspond to the integer copy number values in the single cells. Clusters of cells with similar profiles are labeled in colored bars on the left-hand side, and cancer genes are annotated on the x-axis.