Menu

Is a doctoral student in Department of Biostatistics, Yale University. Xingjie

Is a doctoral student in Department of Biostatistics, Yale University. Xingjie Shi is a doctoral student in biostatistics currently under a joint training program by the Shanghai University of Finance and Economics and Yale University. Yang Xie is Associate Professor at Department of Clinical Science, UT Southwestern. Jian Huang is Professor at Department of Statistics and Actuarial Science, University of Iowa. BenChang Shia is Professor in Department of Statistics and Information Science at FuJen Catholic University. His research interests include data mining, big data, and health and economic studies. Shuangge Ma is Associate Professor at Department of Biostatistics, Yale University.?The Author 2014. Published by Oxford University Press. For Permissions, please email: [email protected] et al.Consider mRNA-gene expression, methylation, CNA and microRNA measurements, which are commonly available in the TCGA data. We note that the analysis we conduct is also applicable to other datasets and other types of genomic measurement. We choose TCGA data not only because TCGA is one of the largest publicly available and I-BRD9 chemical information high-quality data sources for cancer-genomic studies, but also because they are being analyzed by multiple research groups, making them an ideal test bed. Literature review suggests that for each individual type of measurement, there are studies that have shown good predictive power for cancer outcomes. For instance, patients with glioblastoma H-89 (dihydrochloride) chemical information multiforme (GBM) who were grouped on the basis of expressions of 42 probe sets had significantly different overall survival with a P-value of 0.0006 for the log-rank test. In parallel, patients grouped on the basis of two different CNA signatures had prediction log-rank P-values of 0.0036 and 0.0034, respectively [16]. DNA-methylation data in TCGA GBM were used to validate CpG island hypermethylation phenotype [17]. The results showed a log-rank P-value of 0.0001 when comparing the survival of subgroups. And in the original EORTC study, the signature had a prediction c-index 0.71. Goswami and Nakshatri [18] studied the prognostic properties of microRNAs identified before in cancers including GBM, acute myeloid leukemia (AML) and lung squamous cell carcinoma (LUSC) and showed that srep39151 the sum of jir.2014.0227 expressions of different hsa-mir-181 isoforms in TCGA AML data had a Cox-PH model P-value < 0.001. Similar performance was found for miR-374a in LUSC and a 10-miRNA expression signature in GBM. A context-specific microRNA-regulation network was constructed to predict GBM prognosis and resulted in a prediction AUC [area under receiver operating characteristic (ROC) curve] of 0.69 in an independent testing set [19]. However, it has also been observed in many studies that the prediction performance of omic signatures vary significantly across studies, and for most cancer types and outcomes, there is still a lack of a consistent set of omic signatures with satisfactory predictive power. Thus, our first goal is to analyzeTCGA data and calibrate the predictive power of each type of genomic measurement for the prognosis of several cancer types. In multiple studies, it has been shown that collectively analyzing multiple types of genomic measurement can be more informative than analyzing a single type of measurement. There is convincing evidence showing that this isDNA methylation, microRNA, copy number alterations (CNA) and so on. A limitation of many early cancer-genomic studies is that the `one-d.Is a doctoral student in Department of Biostatistics, Yale University. Xingjie Shi is a doctoral student in biostatistics currently under a joint training program by the Shanghai University of Finance and Economics and Yale University. Yang Xie is Associate Professor at Department of Clinical Science, UT Southwestern. Jian Huang is Professor at Department of Statistics and Actuarial Science, University of Iowa. BenChang Shia is Professor in Department of Statistics and Information Science at FuJen Catholic University. His research interests include data mining, big data, and health and economic studies. Shuangge Ma is Associate Professor at Department of Biostatistics, Yale University.?The Author 2014. Published by Oxford University Press. For Permissions, please email: [email protected] et al.Consider mRNA-gene expression, methylation, CNA and microRNA measurements, which are commonly available in the TCGA data. We note that the analysis we conduct is also applicable to other datasets and other types of genomic measurement. We choose TCGA data not only because TCGA is one of the largest publicly available and high-quality data sources for cancer-genomic studies, but also because they are being analyzed by multiple research groups, making them an ideal test bed. Literature review suggests that for each individual type of measurement, there are studies that have shown good predictive power for cancer outcomes. For instance, patients with glioblastoma multiforme (GBM) who were grouped on the basis of expressions of 42 probe sets had significantly different overall survival with a P-value of 0.0006 for the log-rank test. In parallel, patients grouped on the basis of two different CNA signatures had prediction log-rank P-values of 0.0036 and 0.0034, respectively [16]. DNA-methylation data in TCGA GBM were used to validate CpG island hypermethylation phenotype [17]. The results showed a log-rank P-value of 0.0001 when comparing the survival of subgroups. And in the original EORTC study, the signature had a prediction c-index 0.71. Goswami and Nakshatri [18] studied the prognostic properties of microRNAs identified before in cancers including GBM, acute myeloid leukemia (AML) and lung squamous cell carcinoma (LUSC) and showed that srep39151 the sum of jir.2014.0227 expressions of different hsa-mir-181 isoforms in TCGA AML data had a Cox-PH model P-value < 0.001. Similar performance was found for miR-374a in LUSC and a 10-miRNA expression signature in GBM. A context-specific microRNA-regulation network was constructed to predict GBM prognosis and resulted in a prediction AUC [area under receiver operating characteristic (ROC) curve] of 0.69 in an independent testing set [19]. However, it has also been observed in many studies that the prediction performance of omic signatures vary significantly across studies, and for most cancer types and outcomes, there is still a lack of a consistent set of omic signatures with satisfactory predictive power. Thus, our first goal is to analyzeTCGA data and calibrate the predictive power of each type of genomic measurement for the prognosis of several cancer types. In multiple studies, it has been shown that collectively analyzing multiple types of genomic measurement can be more informative than analyzing a single type of measurement. There is convincing evidence showing that this isDNA methylation, microRNA, copy number alterations (CNA) and so on. A limitation of many early cancer-genomic studies is that the `one-d.