Abstract

Lung cancer is one of the most common causes of cancer-related death in men and women worldwide. An appropriate statistical model for survival analysis of lung cancer can provide a precise prognosis for treatment planning. Traditionally, prognostic decisions are based purely on pathologists’ subjective evaluations. It has been shown that the accuracy and objectivity of diagnosis and prognosis increase substantially when assisted by computational algorithms. In this paper, we develop a novel prediction model called LC-Morph, which comprises cell detection, cell segmentation, and a statistical model for survival analysis. Images from 122 lung cancer patients, extracted from The Cancer Genome Atlas (TCGA) data set, are used in this study. A robust seed detection-based cell segmentation algorithm is proposed to accurately segment each individual cell in the image. Based on the cell segmentation results, a comprehensive set of cellular features is extracted using efficient image feature descriptors. To build a prognostic image signature for patient overall survival, the study data set is randomly split into a training data set (82 patients) and a testing data set (40 patients). Based on the training data, univariate Cox models are used to identify informative image features. A lasso-penalized Cox model is then used to derive an image feature-based prognostic model and to calculate the corresponding risk score (the LC-Morph score), which is used to evaluate the patient’s survival. This prediction score is independently validated using the testing data set. We also stratify patients into high- and low-risk groups based on the LC-Morph score and find significantly longer survival time in the low-risk group than in the high-risk group (log-rank P = 0.013), which indicates the efficacy of the LC-Morph score in estimating the survival rates of lung cancer patients.
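
The final stratification and log-rank comparison can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. It assumes that the risk score is the usual Cox linear predictor (a weighted sum of image features) and that patients are split at the median score; the abstract does not state the cutoff. All function names and the toy data are hypothetical.

```python
import math
from statistics import median


def risk_score(features, coefficients):
    """Cox linear predictor: weighted sum of a patient's feature values.

    Assumption: the coefficients would come from a fitted (e.g. lasso-penalized)
    Cox model; here they are simply passed in.
    """
    return sum(b * x for b, x in zip(coefficients, features))


def stratify_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score.

    Assumption: a median cutoff; other cutoffs are equally plausible.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    groups : group label per patient (exactly two distinct labels)
    """
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("log-rank sketch expects exactly two groups")
    ref = labels[0]

    observed = expected = variance = 0.0
    # Walk through each distinct time at which at least one event occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)  # at risk overall
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == ref)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(
            1
            for ti, e, g in zip(times, events, groups)
            if ti == t and e and g == ref
        )
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance of d1 at this event time
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Toy data only: ten hypothetical patients, one feature, coefficient 1.0.
    toy_features = [[v] for v in (2.0, 1.9, 1.8, 1.7, 1.6, 0.5, 0.4, 0.3, 0.2, 0.1)]
    toy_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    toy_events = [1] * 10
    scores = [risk_score(f, [1.0]) for f in toy_features]
    labels = stratify_by_median(scores)
    chi2, p = logrank_test(toy_times, toy_events, labels)
    print(f"chi2={chi2:.2f}, p={p:.4f}")
```

In practice the coefficients would be estimated on the training patients only, and the same coefficients and cutoff would then be applied unchanged to the testing patients before running the log-rank comparison.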