Abstract

We and others have recently shown that the vast majority of primary tumors are mosaics of clonal populations of varying sizes, different genetic makeup and distinct phenotypes. If subsets of these clones evolve the ability to migrate from the primary tumor and to survive in blood or lymphatic circulation, they can seed distant metastases. This study has two goals. First, we identify clones with metastatic phenotypes and characterize the somatic mutations that distinguish them from non-metastatic clones. Second, we use these mutation signatures to learn to recognize metastatic clones and to calculate the likelihood that a primary tumor will metastasize.

The identification of clones with metastatic phenotypes among heterogeneous tumor populations has so far been limited because only single samples are available per tumor. We overcome this limitation using our previously published algorithm, EXPANDS, to identify clones present at >10% cell frequency within single tumor samples across eight types of carcinoma. We use TCGA's exome-sequencing data to characterize the size and genetic content of clones in 453 primary and 23 metastatic tumors.

To quantify the metastatic potential of clones, we compare clone size between primary and metastatic tumors. Next, we model the metastatic potential of clones as a function of their specific point mutations and copy number variations, and use principal component analysis to select metastasis gene candidates. Finally, we calculate the likelihood that a primary tumor will metastasize from the number and size of primary tumor clones with high metastatic potential, hereafter referred to as metastatic clones. We validate the prognostic significance of metastatic clone presence in two independent exome-sequencing datasets: a cross-sectional cohort consisting of 683 primary tumors, and a longitudinal cohort consisting of six matched primary and metastatic tumors from three patients.
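
As a rough illustration of the final step, the sketch below summarizes a primary tumor by the number and combined size of its metastatic clones. It is not the scoring model used in the study. The `Clone` record, the per-clone `metastatic_potential` value, the `potential_cutoff` of 0.5 and the function names are all hypothetical; only the >10% cell-frequency detection threshold comes from the text above. Summing clone sizes also assumes the clones do not overlap, which is a simplification.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Detection threshold stated in the abstract: clones at >10% cell frequency.
MIN_CELL_FREQUENCY = 0.10


@dataclass
class Clone:
    """Hypothetical record for one clone detected in a tumor sample."""
    cell_frequency: float        # fraction of tumor cells in the clone (0..1)
    metastatic_potential: float  # assumed per-clone score in [0, 1]


def detectable(clones: List[Clone]) -> List[Clone]:
    """Keep only clones above the >10% cell-frequency detection threshold."""
    return [c for c in clones if c.cell_frequency > MIN_CELL_FREQUENCY]


def metastatic_clone_summary(
    clones: List[Clone], potential_cutoff: float = 0.5
) -> Tuple[int, float]:
    """Return (number, combined size) of detectable clones whose assumed
    metastatic potential reaches the illustrative cutoff."""
    metastatic = [
        c for c in detectable(clones)
        if c.metastatic_potential >= potential_cutoff
    ]
    return len(metastatic), sum(c.cell_frequency for c in metastatic)


# Made-up example tumor with four clones.
example_tumor = [
    Clone(cell_frequency=0.60, metastatic_potential=0.20),
    Clone(cell_frequency=0.25, metastatic_potential=0.90),
    Clone(cell_frequency=0.08, metastatic_potential=0.95),  # below detection threshold
    Clone(cell_frequency=0.15, metastatic_potential=0.70),
]
n_metastatic, combined_size = metastatic_clone_summary(example_tumor)
```

In this made-up example, two clones count as metastatic. The third clone is excluded despite its high assumed potential, because it falls below the detection threshold. A downstream model would then have to map the count and combined size to a likelihood of metastasis.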