Department of Medicine, University of Chicago, Chicago, Illinois, United States of America.

Abstract

Computational models in biomedicine rely on biological and clinical assumptions. The selection of these assumptions contributes substantially to modeling success or failure. Assumptions used by experts at the cutting edge of research, however, are rarely explicitly described in scientific publications. One can directly collect and assess some of these assumptions through interviews and surveys. Here we investigate diversity in expert views about a complex biological phenomenon, the process of cancer metastasis. We harvested individual viewpoints from 28 experts in clinical and molecular aspects of cancer metastasis and summarized them computationally. While experts predominantly agreed on the definition of individual steps involved in metastasis, no two expert scenarios for metastasis were identical. We computed the probability that any two experts would disagree on k or fewer metastatic stages and found that any two randomly selected experts are likely to disagree about several assumptions. Considering the probability that two or more of these experts review an article or a proposal about metastatic cascades, the probability that they will disagree with elements of a proposed model approaches 1. This diversity of conceptions has clear consequences for advance and deadlock in the field. We suggest that strong, incompatible views are common in biomedicine but largely invisible to biomedical experts themselves. We built a formal Markov model of metastasis to encapsulate expert convergence and divergence regarding the entire sequence of metastatic stages. This model revealed stages of greatest disagreement, including the points at which cancer enters and leaves the bloodstream. The model provides a formal probabilistic hypothesis against which researchers can evaluate data on the process of metastasis. This would enable subsequent improvement of the model through Bayesian probabilistic update. 
Practically, we propose that model assumptions and hunches be harvested systematically and made available for modelers and scientists.

Tumor development starts with growth at the primary location (primary tumor). In metastatic progression, some cells from the primary tumor detach from the colony (detachment), enter blood or lymph vessels (intravasation), and travel within the body (migration/transport). Next, the traveling cells exit blood or lymph vessels (extravasation) and colonize new sites in the body. There, they divide and at first form tiny colonies (micrometastasis). Further cell proliferation and the recruitment of blood vessels (angiogenesis) then supply the small colonies with sufficient nutrients to develop into large tumors (macrometastasis). It is currently unclear whether secondary colonies can re-metastasize to form tertiary and quaternary colonies (the dotted line indicates a cyclic process).

Visualization of expert views about the importance of canonical metastatic stages within the actual process of metastasis.

While commenting on the suggested diagram (see Figures 1 and 2A), most experts were not confident that certain stages proposed are part of the metastatic process observed in the laboratory or clinic. Font size represents the number of experts voting for inclusion of the stage represented by the corresponding phrase. (A) The schematic that we presented to experts as canonical. (B–D) Subgroups of experts: PhDs only (B), MDs only (C), MD/PhDs (D). (E) The distribution of certainty across all experts.

(A) The cascade that we presented to the experts as a “textbook” cascade. Experts suggested reordering stages, removing certain stages and/or adding new ones. (B) Expert-specific depiction of metastasis progression suggested during interviews. Note that every scenario is distinct. Expert 3 did not suggest any scenario, commenting that we have insufficient knowledge; four experts suggested two possibilities (depicted as a pair of scenarios grouped by vertical bars). The ellipses (…) indicate that experts agreed with the prior or posterior sequence that we showed them. (Supplementary Figure S1 demonstrates further variation provided by experts in renaming stages.) (C) Additional stages/concepts that were not present in the original schematic (A) but were suggested by experts.

(A) A hypothetical example of scenarios produced by two experts (experts 1 and 2). Expert 1 suggested two alternative scenarios (ABC and AFDC), while expert 2 suggested only one (ACBD). We define the agreement/similarity between experts as twice the number of stage pairs they share, out of all possible ordered pairs of stages from each scenario. We define the disagreement/dissimilarity as the total number of unmatched pairs; see the example. (B) Cumulative probabilities that two experts agree at least at level x (similarity of x or greater), and that two experts disagree at level x or less. The probability of agreement drops rapidly as the number of statements from each expert increases, while the probability of disagreement grows gradually up to a rather large number of pairwise disputes. (C) A hypothetical four-expert “regulatory deadlock” could occur if any one of the experts insisted on disagreeing with all the others.
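The pair-based comparison in panel (A) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`ordered_pairs`, `expert_pairs`, `compare_experts`) are hypothetical, an expert's alternative scenarios are assumed to be pooled into one set of ordered pairs, and "unmatched" is assumed to mean pairs held by exactly one of the two experts. The numbers it produces therefore need not coincide with those in the figure.

```python
from itertools import combinations


def ordered_pairs(scenario):
    """All ordered (earlier, later) stage pairs within one scenario."""
    # combinations() preserves the left-to-right order of the input.
    return set(combinations(scenario, 2))


def expert_pairs(scenarios):
    """Pool the ordered pairs of every scenario an expert proposed (assumption)."""
    pairs = set()
    for scenario in scenarios:
        pairs |= ordered_pairs(scenario)
    return pairs


def compare_experts(scenarios_a, scenarios_b):
    """Return (agreement, unmatched) for two experts.

    agreement = twice the number of shared ordered pairs;
    unmatched = number of pairs held by exactly one expert (assumption).
    """
    a, b = expert_pairs(scenarios_a), expert_pairs(scenarios_b)
    return 2 * len(a & b), len(a ^ b)


# Scenarios from the caption: expert 1 gave ABC and AFDC, expert 2 gave ACBD.
agreement, unmatched = compare_experts(["ABC", "AFDC"], ["ACBD"])
print(agreement, unmatched)
```

Under these assumptions the two experts share the pairs (A,B), (A,C) and (A,D). Pairs such as (B,C) versus (C,B) count as unmatched because the stages appear in opposite order.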

We grouped the comments of experts into several topics. Within each topic (represented here by a row), an expert may have mentioned several ideas he or she agrees with, disagrees with, or is unsure about. This heat map depicts the main topics for modeling the process and how many ideas under each topic every expert (columns: 1–28) discussed, along with their agreement. An asterisk (*) indicates a topic whose frequency of mention was statistically significant compared to the null model of a uniform distribution of comments (see Table S1). Abbreviations: Def = Definition, Dev = Development, Env = Environment, Evol = Evolution, Met = Metastasis, Mic = Micro, Trop = Tropism.

A map of comments provided by experts with details on expert backgrounds.

The ideas discussed by the experts for three main interview topics were (A) what acquisition of metastasis requires, (B) when metastasis is acquired, and (C) what factors affect preferential choice of specific tissues/organs by metastatic cells (tropism). Color-filled squares indicate that the corresponding idea (one per row) was discussed by the corresponding expert (one per column). The responses have been grouped to illustrate the variance for “PhDs vs. MDs vs. MD/PhDs” and for experts who received the first of these degrees “before 1986 vs. between 1986 and 1995 vs. after 1995.” At the top is the expert ID number, followed by four rows of color-coded squares indicating four classifications associated with each expert: (1) “The-University-of-Chicago vs. elsewhere”; (2) when the expert received his or her first professional degree: “before 1986 vs. between 1986 and 1995 vs. after 1995”; (3) “men vs. women”; and (4) the nature of the expert's advanced degrees: “PhDs vs. MDs vs. MD/PhDs.”

Three hypothetical scenarios involving five experts, each providing a story containing five stages.

The three scenarios illustrate situations of (A) “complete agreement”, (B) “moderate agreement”, and (C) “random agreement.” Panels D, E and F illustrate the probability of agreement and disagreement between two randomly chosen experts on at least k statements for each scenario (see Figure 4B). Panels G, H and I render heat maps to illustrate the transition probability matrices after a single Markov chain transition under each scenario. Each stochastic matrix is square, non-negative and organized in the following way. The ith row of the matrix provides probabilities of transitions from state i to all other states of the model. The sum of probabilities in each row is therefore equal to 1.
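To make the row-stochastic layout concrete, the sketch below builds a one-step transition matrix from a handful of expert stories. It assumes, purely for illustration, that each transition probability is the relative frequency with which one stage directly follows another across the stories, and that every story is bracketed by "Start" and "End" states. The function name `transition_matrix` and the example stories are hypothetical and are not taken from the figure.

```python
from collections import defaultdict


def transition_matrix(scenarios):
    """Estimate a row-stochastic matrix from stage sequences (assumed method)."""
    counts = defaultdict(lambda: defaultdict(int))
    for stages in scenarios:
        path = ["Start", *stages, "End"]
        # Count each direct transition between consecutive states.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1

    stage_names = sorted({s for stages in scenarios for s in stages})
    states = ["Start", *stage_names, "End"]

    matrix = []
    for src in states:
        total = sum(counts[src].values())
        if total == 0:
            # No outgoing transitions (the "End" state): make it absorbing.
            row = [1.0 if dst == src else 0.0 for dst in states]
        else:
            row = [counts[src][dst] / total for dst in states]
        matrix.append(row)
    return states, matrix


# Five hypothetical experts, each telling a five-stage story.
complete = ["ABCDE"] * 5
moderate = ["ABCDE", "ABCDE", "ABCDE", "ACBDE", "ABDCE"]

states, P_complete = transition_matrix(complete)
_, P_moderate = transition_matrix(moderate)

for name, row in zip(states, P_moderate):
    print(name, [round(p, 2) for p in row])
```

With complete agreement every row holds a single 1, whereas divergent stories spread the probability mass of a row over several columns. Each row still sums to 1.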

Heat maps that visualize the transition probability matrices in our Markov model of metastasis after n Markov chain transitions (n = 1, 2, 10), and a plot of the probabilities associated with reaching each state after n chain transitions.

(A) Stochastic matrices are organized as in Figure 7G, H and I: each stochastic matrix is square, non-negative and organized such that the ith row of the matrix provides probabilities of transitions from state i to all other states in the model, and the sum of each row is equal to 1. The three matrices show the transition probabilities from state i to state j after one ($P^{1}$), two ($P^{2}$), and ten ($P^{10}$) transition steps of the Markov process. The labels “Start” and “End” correspond to the generating and absorbing states of the Markov chain, respectively. The rest of the labels indicate metastasis stages that were suggested by the authors (in blue) and by individual experts (in red). (B) The probability of finding the Markov process in a given state at transition n (n = 1, 2, …, 50) is given by the corresponding entry of the row vector $\Lambda P^{n}$. Here $\Lambda$ is the distribution over all states at the beginning of the chain, and $P^{n} = [p_{ij}(n)]$ is the transition probability matrix after the nth transition (obtained by raising matrix $P$ to the nth power). This figure demonstrates the decreasing probability that expert intuition about metastasis lands in any particular state at any particular stage. Note that even after 50 transitions there is a substantial probability that the chain would not reach the “End” state.
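The state probabilities in panel (B) can be reproduced in principle by repeatedly multiplying a row vector of state probabilities by the one-step matrix. The sketch below does this in plain Python for a toy three-state chain. The matrix values, the state names and the helper names (`step`, `distribution_after`) are hypothetical and chosen only to show the mechanics; they are not the values of the metastasis model.

```python
def step(dist, P):
    """One transition: multiply the row vector dist by the matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def distribution_after(dist, P, n):
    """Distribution over states after n transitions, starting from dist."""
    for _ in range(n):
        dist = step(dist, P)
    return dist


# Toy chain (hypothetical values): Start -> Stage, and Stage either
# repeats or moves to the absorbing End state with equal probability.
states = ["Start", "Stage", "End"]
P = [
    [0.0, 1.0, 0.0],  # from Start
    [0.0, 0.5, 0.5],  # from Stage
    [0.0, 0.0, 1.0],  # from End (absorbing)
]
start = [1.0, 0.0, 0.0]  # the chain begins in Start with certainty

for n in (1, 2, 10):
    print(n, distribution_after(start, P, n))
```

In this toy chain the probability of still being outside "End" halves with each additional transition, so absorption is fast. A chain with loops among many intermediate states, as in the figure, can retain probability mass outside "End" for far longer.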