Intratumor heterogeneity (ITH) is observed at different stages of tumor progression, metastasis and recurrence, and can be important for clinical applications. We used RNA-sequencing data from tumor samples and measured the level of ITH in terms of biological network states. To model complex relationships among genes, we used a protein interaction network to account for gene-gene dependency. ITH was measured with nJSD, an entropy-based distance metric between two networks built on the Jensen-Shannon divergence (JSD). With nJSD, we defined transcriptome-based ITH (tITH). The effectiveness of tITH was extensively tested on ITH-related issues using real biological data sets. Human cancer cell line data and single-cell sequencing data were investigated to verify our approach. We then analyzed 6,320 patients from the TCGA pan-cancer data. Our results agreed with widely used genome-based ITH inference methods, while showing better performance in survival analysis. Analysis of mouse clonal evolution data further confirmed that our transcriptome-based ITH was consistent with genetic heterogeneity at different clonal evolution stages. Additionally, we found that cell cycle-related pathways contribute significantly to increasing heterogeneity on the network during clonal evolution. We believe that the proposed transcriptome-based ITH is useful for characterizing the heterogeneity of a tumor sample at the RNA level.
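
To make the entropy-based distance concrete, the following is a minimal sketch of the plain Jensen-Shannon divergence between two discrete distributions, for example two expression vectors normalized to sum to one. It is not the network-based nJSD itself, whose construction over the protein interaction network is not spelled out in this text; the function names `normalize`, `shannon_entropy` and `jsd` and the toy expression values are illustrative assumptions.

```python
import math


def normalize(weights):
    """Scale non-negative weights (e.g. expression values) to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jsd(p, q):
    """Jensen-Shannon divergence: H(M) - (H(P) + H(Q)) / 2, with M = (P + Q) / 2.

    With base-2 logarithms the value lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# Hypothetical toy expression profiles over the same four genes.
sample_a = normalize([10, 20, 30, 40])
sample_b = normalize([40, 30, 20, 10])
divergence = jsd(sample_a, sample_b)
```

The divergence is symmetric, equals zero for identical distributions, and reaches one (in bits) for distributions with disjoint support, which makes it convenient as a bounded distance-like score.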

f5: Pan-cancer survival analysis of gITH and tITH. We divided patients into two groups at the median gITH or tITH value. (A,B) were analyzed with the same patient group, for whom the number of subclones had been reported by other research. (A) Kaplan-Meier plot of the two groups based on the subclone number in 5-year censored data, and (B) based on tITH in 5-year censored data. (C) Kaplan-Meier plot of pan-cancer patients in 12 different cancer types.

Mentions:
The first experiment investigated the prognostic power of tITH at the whole-cancer level (n = 606) using the Cox regression model, in comparison with that of the gITH information predicted by EXPAND. To test the prognostic power of gITH and tITH for each cancer type, we built two univariate Cox models: one with gITH and 5-year survival information, the other with tITH and 5-year survival information. The univariate Cox model using gITH was not significant (p-value = 0.3370, c-index = 0.543), while the model using tITH was statistically significant (p-value = 0.0006, c-index = 0.604). The mutation-based subclone number has been reported to have a nonlinear association with survival [24], which may explain why the gITH model did not separate patients into good and poor prognosis groups in the Kaplan-Meier survival analysis. In contrast, tITH has a linear association with survival, and the tITH model successfully separated the patient groups (Fig. 5A,B). Next, we performed a Cox proportional hazards test on a larger data set that included patients without subclone information, a total of 5,628 patients. The pan-cancer univariate Cox model using tITH values was statistically significant (c-index = 0.64, p < 2e-16), suggesting clinical utility. Kaplan-Meier survival analysis of the poor and good groups also showed a significant separation of the two groups (Fig. 5C).
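
The survival workflow described above (median split, 5-year censoring, Kaplan-Meier curves, concordance) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the analysis code behind the reported numbers: it does not fit a Cox model, and it assumes that a higher score means higher risk. For a univariate model the concordance index depends only on how the score ranks patients, so Harrell's c-index is computed directly from the score. All function names and the toy data are hypothetical.

```python
from statistics import median


def median_split(scores):
    """Label each sample 'high' if its score exceeds the median, else 'low'."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def censor_at(times, events, horizon):
    """Administratively censor follow-up at `horizon` (e.g. 5 years)."""
    out_times, out_events = [], []
    for t, e in zip(times, events):
        if t > horizon:
            out_times.append(horizon)
            out_events.append(0)  # event happened after the horizon: censored
        else:
            out_times.append(t)
            out_events.append(e)
    return out_times, out_events


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def concordance_index(times, events, risk):
    """Harrell's c-index; a pair is comparable if the earlier time is an event."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: follow-up in years, event flag (1 = death), score.
toy_times = [1.0, 2.0, 3.0, 7.0]
toy_events = [1, 1, 1, 1]
toy_scores = [0.9, 0.7, 0.4, 0.1]

groups = median_split(toy_scores)
times5, events5 = censor_at(toy_times, toy_events, horizon=5.0)
curve = kaplan_meier(times5, events5)
c_index = concordance_index(times5, events5, toy_scores)
```

In practice a dedicated survival library would also provide the Cox fit, the log-rank test for the two groups, and confidence intervals; the sketch only shows how the quantities reported in the text are defined.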
