Introduction

For the vast majority of patients who die from solid malignancies, lethality can be directly traced to the propensity of their tumor cells to metastasize. Paget's seminal seed and soil hypothesis proposed that the colonization of distant sites by primary seed tumor cells is dependent on a compatible environment in the secondary soil site (1). Development of this central idea over the years has led to the prevailing view that metastases are founded by rare single cells that escape from the primary site. A key advantage of this view is that it provides an explanation for the relative rarity of clinical metastasis formation in the general cancer population.

A body of evidence has subsequently accumulated that supports this model of tumor dissemination. Some of the early works include a study of spontaneously arising lung metastases in mouse models of melanoma, where cells uniquely tagged with random irradiation-induced karyotypic markers unequivocally indicated that metastases originated from a single progenitor cell (2). Follow-on experiments showed that when mixtures of two distinct melanoma cell lines were injected intravenously, subsequent lung metastases were derived from only one line and not admixtures of the two cell lines (3).

More recently, models of human metastasis have been updated, especially with regard to the timing of spread (4). Largely responsible for this shift is the application of next-generation sequencing to matched primary and metastatic samples. By identifying sets of shared and private mutations, sample relatedness can be observed and an approximate evolutionary relationship determined. Studies of human colorectal cancer (5), pancreatic cancer (6), melanoma (7), and neuroblastoma (8) have shown that spread can occur late in the evolution of the primary disease, revealing a linear evolutionary relationship between the primary tumor and metastasis. Conversely, in renal cancer (9), metastatic progression has been shown to occur early, with both primary tumor and metastasis having private mutations and, thus, evolving in parallel. However, follow-up studies in pancreatic (10) and other cancers (11) have shown examples of both early and late spread, suggesting that timing patterns are not necessarily tumor specific.
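The shared/private comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, not the method of any cited study: the function name `classify_spread` and the mutation labels are hypothetical, and the sketch assumes error-free mutation calls from a single primary and a single metastatic sample.

```python
def classify_spread(primary_mutations, metastasis_mutations):
    """Label the primary/metastasis relationship from mutation sets.

    Assumption (illustrative): mutations are hashable identifiers and
    the calls are free of sequencing or sampling error.
    """
    primary = set(primary_mutations)
    metastasis = set(metastasis_mutations)

    shared = primary & metastasis
    private_primary = primary - metastasis
    private_metastasis = metastasis - primary

    if not shared:
        # No common ancestry is detectable from these calls.
        pattern = "unrelated"
    elif private_primary and private_metastasis:
        # Both lesions kept evolving after divergence: early spread.
        pattern = "parallel"
    else:
        # One sample's mutations are nested inside the other's: late spread.
        pattern = "linear"

    return {
        "pattern": pattern,
        "shared": shared,
        "private_primary": private_primary,
        "private_metastasis": private_metastasis,
    }


if __name__ == "__main__":
    late = classify_spread({"m1", "m2", "m3"}, {"m1", "m2", "m3", "m4"})
    early = classify_spread({"m1", "m2", "p1"}, {"m1", "m2", "q1"})
    print(late["pattern"], early["pattern"])
```

In real data, incomplete sampling of the primary tumor can make a late-diverging metastasis look as if the primary carries no private mutations, so a presence/absence call of this kind is only a first approximation.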

Studies across multiple metastases from the same patient have also revealed that asynchronous spread can occur from the primary tumor to multiple distant metastatic sites in colorectal cancer (11), and that one metastatic site can seed another in a cascading manner in prostate cancer (12), ovarian cancer (13), and pancreatic cancer (10).

One limitation of these studies is that the clonal composition of each sample is determined using the presence or absence of private and shared mutations. This type of modeling does not allow for the estimation of clonal frequencies, which is vital for accurate evolutionary reconstruction and identification of more than two clones per sample. In an attempt to adopt a more detailed modeling strategy, algorithms have been developed that model the clonal composition within a tumor using mutation variant allele frequencies. These algorithms have vastly improved our ability to model and understand metastatic spread. The first use of such an algorithm appeared in a study of primary breast cancers, where it was used to accurately identify the clonal makeup of a tumor and infer the evolutionary history of its clones (genetically distinct populations of tumor cells; ref. 14). Since then, a rapidly developing field has emerged that uses high-coverage exome, capture, amplicon, and/or whole-genome tumor sequence data to trace clone lineages and infer phylogenetic relationships within and between lesions from individual patients (15, 16).

A subset of recent studies have used these algorithms to infer the evolutionary relationship of clones in matched primary and metastatic samples (17–26), revealing patterns of metastasis only observable using this type of quantitative analysis. A recent review outlined the implications of these studies on treatment, including a summary of the potential underlying genetic determinants of spread (27). Here we focus specifically on how subclonal modeling of multiple samples from the same individual has shaped our understanding of metastasis in humans.

Subclonal Modeling of Metastasis

By comparing the constituent subclonal mutations between pairs of primary and metastatic samples, it is possible to derive the ancestral relationships between tumor clones rather than between tumor samples. This type of modeling has allowed confirmation of existing patterns of metastatic spread at increased (subclonal) resolution and has yielded new insights into the patterns and timing of tumor cell spread, which we articulate below.

Timing of spread

Seeding from an ancestral clone early during disease development (Fig. 1A) results in a branched evolution pattern, where primary and metastasis samples evolve in a “parallel” manner (28). This has been shown at subclonal resolution in two lung cancer cases (22), two glioblastoma cases (26), one ovarian case (17), seven prostate cases (18, 20), as well as in mouse models, where evolutionary analysis of skin cancer demonstrated that the majority of tumors adopt a parallel mode of evolution (29). Much debate exists, however, as to whether particular tumor types have a dominant mode of evolution in humans. Spread occurring late in the evolution of the primary tumor in a linear fashion (Fig. 1B) has been observed in one oral cancer case (25), eight melanoma cases (24), and four glioblastoma cases (in these cases from residual tumor cells; ref. 26). Although sample sizes across these studies are not yet sufficient to determine whether certain tumor types are enriched for late or early spread, examples of both have been seen in a study of 86 patients with brain metastases originating from various primaries (23), as well as across 11 cases of head and neck cancer (19).

Figure 1. Examples of subclonal modeling of metastasis in human tumors. This figure summarizes the findings of several recent studies that sequenced the DNA of matched primary and metastatic tissue from patients with lung cancer, breast cancer, melanoma, and prostate cancer. In each case, mutations were used to infer the evolutionary history of the disease as a clonal tree, where each node in the tree represents a genetically distinct population of cells, or clone. A schematic of the clonal spread in each patient is shown, along with a simplified version of the clonal evolution tree reported in the original studies. A, A lung cancer patient (PP4) from Paik and colleagues (22) who showed early metastasis to the brain, resulting in a branched clonal tree with parallel evolution of both the primary tumor and metastasis. B, A lung cancer patient (308) from Brastianos and colleagues (23) who showed late spread to the brain, resulting in a linear clonal tree. C, A melanoma patient (H) from Sanborn and colleagues (24) who showed a distant brain metastasis seeded from a single clone present in a leg lesion. D, A prostate cancer patient (177) from Hong and colleagues (20) who showed an early and late spread of two clones to the ilium in an asynchronous polyclonal manner (left). A prostate cancer patient (A32) from Gundem and colleagues (18) who showed the spread of two clones to two separate metastatic locations in a synchronous manner (right). E, A prostate cancer patient (299) from Hong and colleagues (20) with multiple regions of the primary tumor sequenced showing a single, extraprostatic clone as the source of the shoulder metastasis. F, A breast cancer patient from Murtaza and colleagues (21) showing a cascade of spread from primary to brain metastasis, then brain to ovary.

Seed composition

All tumor types studied at subclonal resolution mentioned in this review showed at least one example of monoclonal seeding, where a single clone escapes the primary to found a metastatic deposit (Fig. 1C).

New data in mouse models of cancer metastasis have challenged the predominant monoclonal model of how metastases are constituted, positing that some metastases are composed of mixtures of distinct tumor clones seeded in a polyclonal manner (30–32). Furthermore, it has also been argued that clones present in polyclonal mixtures are not necessarily indifferent to one another but may actually cooperate to seed a secondary lesion, suggesting that cooperation between distinct clones exists (33). The evidence for such oncogenic cooperation in different model systems has recently been covered extensively in an excellent review (34).

The key distinguishing feature required to confirm the existence of polyclonal seeding in bulk sequencing of human samples is the presence of subclonal clusters of mutations across multiple tumors from distinct locations. A mutation is considered subclonal if it appears in only a fraction of the tumor cells in a sample. Sets of mutations appearing subclonally in two or more metastases can arise under two potential scenarios: (i) the same sets of mutations occur independently in each sample; (ii) two distinct founder cells containing the sets of mutations spread to each location together. Although convergent evolution could give weight to scenario (i), it is extremely unlikely statistically given the sizeable sets of subclonal mutations observed in the studies discussed here. Scenario (ii) is therefore by far the more plausible explanation for these subclonal clusters. It is this reasoning that has allowed the determination of the existence of polyclonal seeding in humans.
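The criterion in this paragraph, a mutation cluster that is subclonal in two or more samples, can be expressed as a simple filter. The sketch below is illustrative only: the thresholds `CLONAL_THRESHOLD` and `DETECTION_THRESHOLD`, the function names, and the example cluster and sample labels are assumptions rather than values taken from the studies discussed, and the input is assumed to be per-cluster cancer cell fractions that have already been estimated.

```python
# Illustrative cut-offs (assumptions, not values from the cited studies).
CLONAL_THRESHOLD = 0.9      # at or above this fraction, treat a cluster as clonal
DETECTION_THRESHOLD = 0.05  # below this fraction, treat a cluster as absent


def is_subclonal(ccf):
    """True if a cluster is present in only a fraction of tumor cells."""
    return DETECTION_THRESHOLD <= ccf < CLONAL_THRESHOLD


def polyclonal_seeding_candidates(cluster_ccfs, min_samples=2):
    """Return clusters that are subclonal in at least `min_samples` samples.

    `cluster_ccfs` maps cluster name -> {sample name -> cancer cell fraction}.
    """
    candidates = {}
    for cluster, by_sample in cluster_ccfs.items():
        samples = sorted(s for s, ccf in by_sample.items() if is_subclonal(ccf))
        if len(samples) >= min_samples:
            candidates[cluster] = samples
    return candidates


if __name__ == "__main__":
    # Hypothetical cluster frequencies in two metastatic samples.
    example = {
        "trunk": {"met_1": 1.0, "met_2": 1.0},       # clonal everywhere
        "shared_sub": {"met_1": 0.6, "met_2": 0.3},  # subclonal in both
        "private": {"met_1": 0.4, "met_2": 0.0},     # subclonal in one only
    }
    print(polyclonal_seeding_candidates(example))
```

A cluster passing this filter is only a candidate: the argument against convergent evolution depends on the cluster containing many mutations, so the number of mutations per cluster should be checked alongside the frequencies.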

Many of the studies discussed here have gone a step beyond subclonal clustering and inferred the evolutionary relationship between clones. This process facilitates finer understanding of polyclonal seeding and begins to help us determine whether polyclonal spread occurs synchronously, with both cells transiting in unison, or asynchronously, with multiple waves of spread to the same location. Although evidence has yet to accumulate to unequivocally determine synchronicity, the clonal evolution trees determined from multiple studies tend to favor one or the other.

Synchronous polyclonal seeding is a plausible explanation for the patterns of spread observed in six separate studies across five tumor types: oral, breast, glioblastoma, melanoma, and prostate (Fig. 1D). In these studies, similar mixes of clones were detected in multiple samples from the same individual: Wood and colleagues reconstructed the clonal evolution of a matched oral primary tumor and metastasis in patient PG030, showing that the same mix of clones was present in both samples (25); Murtaza and colleagues observed two subclonal mutation clusters present at varying frequencies across five distant metastatic sites from a single breast cancer patient (21); two patients (C and E) showed evidence of polyclonal seeding in a study of melanoma (24); two cases of glioblastoma revealed clusters of mutations present at subclonal fractions in both primary and recurrent disease (26); and two separate studies of prostate cancer revealed multiple cases of polyclonal seeding (18, 20). Although it is feasible that the mix of clones observed across these cases could have arisen asynchronously, evidence seen in studies of circulating tumor cell clusters lends weight to a synchronous model of spread. For example, a recent study of clusters of circulating tumor cells versus single cells showed that cell clusters had up to 50-fold increased metastatic potential compared with single cells (35). Interestingly, however, in a study of 86 brain metastasis cases arising from various primary tumors, no evidence of polyclonal seeding was found (23), even though the authors explicitly searched for it. This difference could be attributed to the metastatic niche, whereby the blood–brain barrier prevents clusters of cells from transiting but allows single-cell spread. This suggests that the ability of multiple clones to colonize a site could be heavily dependent on the metastatic niche.

Despite the preference for a synchronous model of spread, asynchronous polyclonal seeding has been shown to be a more likely explanation for at least two patients from the aforementioned prostate studies (Fig. 1D; refs. 18, 20). In patient 177 from Hong and colleagues (20), unusual mutant allele frequency patterns combined with structural variant allele frequencies indicate that the most likely explanation for the polyclonal makeup of a metastasis is an early spread from the primary tumor, followed by a late spread of a further evolved clone (Fig. 1D). In patient A32 from Gundem and colleagues (18), a left supraclavicular lymph node was seeded twice from the primary tumor. In the first wave of metastatic seeding, all four of the metastatic sites in this patient were seeded with a particular clone; however, in a subsequent round of spread, a second, distinct metastasizing clone spread to the left supraclavicular lymph node only and not the other three metastatic sites. These findings raise important questions as to whether some tumor clones act as pathfinders colonizing distant sites, which then act as beacons to attract subsequent waves of metastatic colonization in the nascent metastatic niche. Properties of the metastatic niche itself are also likely to contribute to metastatic subclonal seeding and expansion, as evidenced by patient A32. They also suggest that for some patients, at least, removal of the primary tumor even after distant metastases have been detected may still be clinically warranted, as the primary tumor may continue to serve as an incubator of further metastatic tumor cell dissemination. This concept is now supported by a growing body of clinical evidence, suggesting that treatment of primary tumors in patients with synchronous metastases can provide clinical benefits, including improvements in overall survival (36–39). Further along these lines, one could postulate that polyclonal seeding may occur more often at terminal disease stages where natural defense mechanisms are strained, facilitating easier colonization by multiple tumor clones.

Seed source

Subclonal modeling of multiregional primary prostate tumor samples allowed Hong and colleagues (20) to precisely pinpoint the clone that gave rise to a distant metastasis (Fig. 1E). Furthermore, by defining each clone in the primary, they were able to interrogate its presence in circulating tumor DNA and found that, in addition to the (expected) detection of metastatic clones, clones (presumed) exclusive to the primary tumor were also detected, despite the primary tumor having been removed 2 years prior. These clones had not seeded any clinically obvious metastases, suggesting that all clones had colonized distant sites, some of them occult.

As well as seeding from the primary tumor, cells from one metastasis can seed another metastasis, resulting in what is known as an evolutionary cascade (Fig. 1F). This phenomenon has been seen at subclonal resolution from lymph node to distant metastasis in mouse models of skin cancer (30), single cases of human breast cancer (21) and melanoma (24), and multiple prostate cases (18, 20). In one of these prostate cases, cross-metastatic–site seeding appeared to occur directly in response to the onset of targeted treatment, with marked remodeling of the original subclonal composition at an iliac crest metastatic site within 12 weeks of the patient starting androgen deprivation therapy. Similar subclonal remodeling has also been shown in response to chemotherapy in ovarian cancer (17) and leukemia (40).

Detecting Polyclonal Seeding

Patterns of polyclonal seeding can only be detected using algorithms that identify the subclonal makeup of multiple tumor samples from a given patient (41–45). Although a number of different computational techniques exist for inferring subclonal structure, the majority of studies covered in this review have used a statistical clustering algorithm known as a Bayesian Dirichlet Mixture Model. Therefore, to illustrate how polyclonal seeding is detected, we adapt an example from Gundem and colleagues (ref. 18; Fig. 2). We look at two samples from patient A22: a bladder metastasis (G) and a pelvic lymph node metastasis (H). First, using copy number, tumor purity, and tumor ploidy, the mutant allele fraction of each mutation is converted to the fraction of tumor cells harboring the mutation, also known as the cancer cell fraction (black dots in Fig. 2A; for conversion details, see Nik-Zainal and colleagues; ref. 14). A Bayesian Dirichlet Mixture Model is then used to group mutations into clusters based on their frequencies in both samples (Fig. 2A, red shading). These clusters subsequently help define the distinct populations of cells that arose from clonal expansions during the evolution of the tumor. The cluster of mutations present in all tumor cells in both samples represents the founding clone (Fig. 2A, dark blue circle). Clusters of mutations that are in tumor cells across both samples represent founding cells of the metastases (Fig. 2A, dark blue and purple circles). Clusters that are unique to one of the two samples represent the clones that are emerging at each site (Fig. 2A, orange, light blue, and green circles; for simplicity, two clones belonging to the same metastasis with the same ancestor are colored green). The frequencies of the clusters combined with the pigeonhole principle (14) can then be used to reconstruct the most likely clonal evolution tree (Fig. 2B). As the purple cluster is present at subclonal frequencies in both samples, both cells from this clone and cells from the ancestral clone (dark blue circle) must have founded the metastatic site G in a polyclonal manner. The resulting clonal makeup can be represented by color-coded, nested ovals reflecting the evolutionary relationship between clones (Fig. 2C, white space represents normal cell admixture). Finally, an overall schematic of the clonal spread can be derived (Fig. 2D).
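Two steps of this procedure lend themselves to a short worked sketch: converting a mutant allele fraction to a cancer cell fraction, and applying the pigeonhole principle to cluster frequencies. The code below is a simplified illustration under stated assumptions, not the implementation used by Gundem and colleagues or Nik-Zainal and colleagues. It assumes a commonly used form of the conversion in which the normal copy number is 2 and the mutation multiplicity (number of mutated copies per cell) is known; the function names and example numbers are hypothetical. The Dirichlet mixture clustering step itself is not shown.

```python
def cancer_cell_fraction(vaf, purity, tumor_copy_number,
                         multiplicity=1, normal_copy_number=2):
    """Convert a variant allele fraction into a cancer cell fraction.

    Simplified assumption: the expected VAF is
        purity * multiplicity * CCF
        / (purity * tumor_copy_number + (1 - purity) * normal_copy_number)
    and this function inverts that relationship for CCF.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    total_copies = (purity * tumor_copy_number
                    + (1 - purity) * normal_copy_number)
    return vaf * total_copies / (purity * multiplicity)


def can_be_siblings(ccf_a, ccf_b, parent_ccf=None, tolerance=0.05):
    """Pigeonhole check on two clusters across samples.

    `ccf_a` and `ccf_b` map sample name -> cancer cell fraction.
    If, in any sample, the two fractions add up to more than the parent's
    fraction (1.0 by default), some cells must carry both clusters, so the
    clusters cannot sit on separate branches and one must descend from
    the other. The tolerance is an illustrative allowance for noise.
    """
    for sample in ccf_a:
        limit = 1.0 if parent_ccf is None else parent_ccf[sample]
        if ccf_a[sample] + ccf_b.get(sample, 0.0) > limit + tolerance:
            return False
    return True


if __name__ == "__main__":
    # A heterozygous mutation in a diploid region of a 50% pure sample.
    print(cancer_cell_fraction(vaf=0.25, purity=0.5, tumor_copy_number=2))

    # Hypothetical cluster fractions in samples G and H.
    nested = can_be_siblings({"G": 0.7, "H": 0.3}, {"G": 0.5, "H": 0.2})
    branched = can_be_siblings({"G": 0.4, "H": 0.3}, {"G": 0.5, "H": 0.2})
    print(nested, branched)
```

Ruling out sibling relationships in this way narrows the set of clone trees compatible with the data; when several trees remain, additional evidence such as the frequency patterns across further samples is needed to choose among them.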

Figure 2. Detecting polyclonal seeding. This figure illustrates how polyclonal seeding can be detected using a Dirichlet Mixture modeling approach (see main text for a detailed description). A, A density plot showing the cancer cell fractions of mutations (black dots) in two metastatic samples from a prostate cancer patient. The red shading represents the posterior probability of a cluster, as determined using a Dirichlet Mixture Model. The colored circles show the defined mutation clusters. B, A clone tree where each node represents a tumor clone with a distinct genotype. The shaded ellipses show the clone membership for the samples from this patient. C, An “easter egg” plot showing clone membership and ancestry as a series of embedded ellipses. The size of the ellipses is approximately proportional to the number of cells in the sample from that clone. D, A schematic showing the clonal composition of the primary tumor and metastases.

Discussion

The application of whole-genome sequencing and new computational methods to multiple metastatic samples has enabled exciting insights into the process of metastatic seeding, with the presence of polyclonal seeding being the most significant. However, there are now many open questions around the underlying mechanisms behind this observation. Do clones transit as polyclonal clusters or as single cells? If as clusters, are they cooperating within the cluster to survive blood transit and eventual seeding of distant sites? Do they form clusters within the blood or within the primary tumor site?

Some headway has been made through animal models of breast cancer, with a recent study showing that clusters of tumor cells have a much higher capacity to induce metastasis formation despite being present at much lower frequency than single cells (35). Furthermore, tumor cell clusters did not form in the blood but rather appeared to form within the site of tumor cell inoculation. Another important question is whether and to what extent specific clones may be involved in establishing premetastatic niches conducive to subsequent waves of tumor cell inoculation. Evidence in favor of this comes from the observation that extracellular vesicles secreted by tumor cells can be sequestered by bone marrow–derived cells, enhancing their capacity to form a metastatic niche (46–48). Also, specific clones might be able to modify the metastatic potential of surrounding less metastatic clones through transfer of metastatic extracellular vesicles, as has been recently demonstrated in animal models of breast cancer (49). Further application of subclonal modeling to this question in humans is likely to yield greater insight.

The polyclonal seeding observed in multiple sites across the cases discussed in this review may be indicative of intimate crosstalk occurring between metastatic clones and suggests that targeted disruption of these interactions might be productive in obstructing metastasis formation. Certain patterns of metastasis may be targeted by particular treatment regimes. However, these insights are currently limited by the availability of samples, so predicting which pattern is likely to occur in a given tumor subtype is not yet feasible. Further studies incorporating the subclonal analysis of multiple primary and multiple metastases from individual patients are required not only to answer fundamental questions as to how tumor cells metastasize but also to provide insights into how this process may be disrupted.

Disclosure of Potential Conflicts of Interest

F. Markowetz reports receiving speakers bureau honoraria from Bayer Berlin. No potential conflicts of interest were disclosed by the other authors.

Grant Support

This work was supported by: a Federal grant for the Australian Prostate Cancer Research Centres and by NHMRC grants 1047581 and 1104010 (to C.M. Hovens and N.M. Corcoran), the Cancer Research UK (grants A15973 and A15601d; to G. Macintyre), the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001202), the UK Medical Research Council (FC001202), and the Wellcome Trust (FC001202; to P. Van Loo), The University of Cambridge, Cancer Research UK, Hutchinson Whampoa Limited, CRUK core grant C14303/A17197 (CRUK CI Institute core grant; to F. Markowetz and G. Macintyre), and A19274 (F. Markowetz lab grant). D.C. Wedge is funded by the Li Ka Shing foundation.