Abstract

Purpose: The early detection of lung cancer in heavy smokers by low-dose CT (LDCT) can reduce mortality. However, LDCT screening increases the number of indeterminate solitary pulmonary nodules (SPN) detected in asymptomatic individuals, leading to overdiagnosis. Making a definitive preoperative diagnosis of malignant SPNs remains a clinical challenge. We have previously demonstrated that sputum miRNAs could provide potential biomarkers for lung cancer. Here, we aimed to develop sputum miRNA biomarkers for the diagnosis of malignant SPNs.

Experimental Design: Using quantitative RT-PCR (qRT-PCR), we evaluated the expression of 13 sputum miRNAs, previously identified as sputum miRNA signatures of lung cancer, in a training set of 122 patients with either malignant (n = 60) or benign SPNs (n = 62) to define a panel of biomarkers. We then validated the biomarker panel in an internal testing set of 136 patients with either malignant (n = 67) or benign SPNs (n = 69), and in an external testing cohort of 155 patients with either malignant (n = 76) or benign SPNs (n = 79).

Results: In the training set, a panel of three miRNA biomarkers (miRs21, 31, and 210) was developed, producing 82.93% sensitivity and 87.84% specificity for identifying malignant SPNs. The sensitivity and specificity of the biomarkers were 82.09% and 88.41% in the internal testing cohort and 80.52% and 86.08% in the external testing cohort, confirming the diagnostic value.

Translational Relevance

The early detection of lung cancer in heavy smokers by low-dose CT (LDCT) can reduce mortality. However, LDCT produces a substantial number of indeterminate solitary pulmonary nodules (SPN), leading to a high level of overdiagnosis. Making a definitive preoperative diagnosis of malignant SPNs is a clinical challenge. Using a training set of cases and controls, we developed a panel of three miRNA biomarkers (miRs21, 31, and 210) that could diagnose early-stage lung cancer among SPNs with 82.93% sensitivity and 87.84% specificity. We then confirmed the diagnostic performance of the biomarkers in two independent testing cohorts. The results indicate that sputum miRNA biomarkers may have potential utility in risk-stratifying indeterminate SPNs and in improving LDCT screening for lung cancer in heavy smokers.

Introduction

Non–small cell lung cancer (NSCLC) is the major type of lung cancer, which is the number one cancer killer for men and women. NSCLC mainly consists of two histologic categories: adenocarcinoma and squamous cell carcinoma (SCC). Cigarette smoking is the most common cause of the disease. The NCI National Lung Screening Trial showed that the early detection of lung cancer by low-dose CT (LDCT) in heavy smokers, followed by appropriate treatment, significantly reduced mortality (1). Therefore, many national medical societies now recommend lung cancer screening in heavy smokers by LDCT (2). However, LDCT increases the number of indeterminate solitary pulmonary nodules (SPN) detected in asymptomatic individuals, whereas only a small fraction of SPNs are lung tumors (2). Lung cancer screening in heavy smokers with LDCT could therefore lead to a substantial amount of overdiagnosis (2). Radiology-based noninvasive and biopsy-based invasive techniques are used for the management of indeterminate SPNs (3). However, the noninvasive approaches may lead to unnecessary procedures, radiation exposure, anxiety, and cost. Furthermore, biopsies carry risks of pneumothorax, hemorrhage, and false-negative results. The development of noninvasive biomarkers that can preoperatively identify malignant SPNs, and hence reduce the overdiagnosis associated with CT scans, is urgently needed (2).

Sputum is a noninvasively obtained and easily accessible body fluid that contains exfoliated bronchial epithelial cells (4). Sputum cytology can identify morphologic abnormalities of the bronchial epithelium of patients with lung cancer (5). Yet it has poor sensitivity for the diagnosis of lung cancer (5, 6). Molecular study of sputum could detect cells containing lung tumor-associated molecular aberrations, thus providing a noninvasive approach for the diagnosis of lung cancer (5). Numerous sputum molecular markers have been identified. However, none has proved acceptable for clinical use in the diagnosis of lung cancer (5).

Dysregulation of miRNAs plays crucial roles in tumorigenesis (7, 8). Specific over- or underexpressions of miRNAs have been found to associate with particular tumor types, and thus open up a new field for molecular diagnosis of cancer (8–10). We have, for the first time, demonstrated that endogenous miRNAs are resistant to freeze–thaw action and stably exist in sputum (9). Using microarray-based platforms to profile expression signatures of 818 human mature miRNAs on NSCLC and the paired normal lung tissues, we identified a set of 12 miRNAs (miRs21, 31, 126, 139, 182, 200b, 205, 210, 375, 429, 486, and 708) that displayed dysregulation in NSCLC (11–13). We further showed 10 of the 12 miRNAs (miRs21, 31, 126, 182, 200b, 205, 210, 375, 486, and 708) whose abnormal expressions in sputum were related to lung cancer (11, 12). Furthermore, Roa and colleagues directly defined sputum miRNA profiling of lung cancer, and found that expressions of five sputum miRNAs (miRs21, 143, 155, 210, and 372) were related to the disease (14). So far, there are 13 sputum miRNAs (miRs21, 31, 126, 143, 155, 182, 200b, 205, 210, 372, 375, 486, and 708) showing promise as biomarkers of NSCLC. Moreover, the previous studies indicated that the 13 miRNAs could be reproducibly and specifically measured in sputum by using quantitative RT-PCR (qRT-PCR), providing rationale for developing sputum miRNA biomarkers for preoperative diagnosis of malignant SPNs.

On the basis of the earlier findings, we aimed to identify and characterize sputum miRNAs that could be used for identifying lung cancer in CT-discovered SPNs. We first evaluated expressions of the 13 sputum miRNAs in a training set of 122 patients with either malignant or benign SPNs to define a panel of biomarkers. We then validated the biomarker panel in an internal testing set of 136 patients with either malignant or benign SPNs, and an external testing cohort of 155 patients with either malignant or benign SPNs.

Materials and Methods

Patient cohorts

The study protocols were approved by the Institutional Review Boards of the University of Maryland Medical Center (UMMC; Baltimore, MD) and the Baltimore VA Medical Center (BVAMC; Baltimore, MD). All subjects were selected on the basis of the presence of SPNs on chest CT scan when they visited the SPN clinics in the two medical centers, and all gave informed consent. Final clinical diagnoses were confirmed with histopathologic examinations of specimens obtained by CT-guided transthoracic needle biopsy, transbronchial biopsy, video-assisted thoracoscopic surgery, or surgical resection. Of the 258 subjects recruited from UMMC, 127 had malignant SPNs and were diagnosed with early-stage NSCLC (stage I or II), and 131 had benign SPNs. The 131 subjects with benign SPNs were diagnosed with granulomatous inflammation (n = 75), nonspecific inflammatory changes (n = 33), or lung infections (n = 23). The 258 cases were randomly split into a training set and an internal testing set. The training set consisted of 60 individuals with malignant SPNs and 62 individuals with benign SPNs (Table 1). The internal testing set comprised 67 subjects with malignant SPNs and 69 individuals with benign SPNs (Table 2). Of the 155 patients recruited from BVAMC, 76 had malignant SPNs (stage I or II NSCLC) and 79 had benign SPNs. This set of cases and controls was used as an external and independent testing cohort (Table 3). All participants with benign SPNs remained cancer free for a minimum follow-up of 2 years. The demographic and clinical variables of the three cohorts, including information about the size range of the nodules, are shown in Tables 1–3.

Table 3. The demographic and clinical variables of an external testing set of patients with malignant SPNs and patients with benign SPNs

Sputum collection, preparation, and sputum cytologic study

The subjects were instructed to cough up sputum spontaneously as previously described (6, 9, 11–25), before receiving any treatment (e.g., surgery, preoperative adjuvant chemotherapy, and radiotherapy). Thirty percent of the participants (mainly former smokers and nonsmokers) were not able to produce sputum spontaneously; these subjects underwent sputum induction using a Lung Flute (Medical Acoustics)-based technique as described in our previous work (19). Sputum was collected in a sterile cup and centrifuged at 1,000 × g for 15 minutes. Cytospin slides were prepared and underwent Papanicolaou staining to evaluate whether the specimens were representative of deep bronchial cells. All sputum samples were of lower respiratory origin, as indicated by the presence of macrophages and bronchial epithelial cells. Cytologic diagnosis was then performed on the cytospin slides from sputum using the classification of Saccomanno (4). Positive cytology included both carcinoma in situ and invasive carcinoma (15, 16). The cell pellet from each sample was resuspended in Sputolysin (Calbiochem) for 15 minutes at 37°C. The cell pellets were then washed in PBS (Sigma-Aldrich) and stored at −80°C until tested.

The analysis of miRNAs in sputum by qRT-PCR

RNA was extracted from cell pellets of sputum as previously described (9, 11–13, 18, 19). The purity and concentration of RNA were determined by OD260/280 readings using a dual beam UV spectrophotometer (Eppendorf AG). RNA integrity was determined by capillary electrophoresis using the RNA 6000 Nano Lab-on-a-Chip kit and the Bioanalyzer 2100 (Agilent Technologies). The expression levels of the 13 sputum miRNAs (miRs21, 31, 126, 143, 155, 182, 200b, 205, 210, 372, 375, 486, and 708) were determined by using qRT-PCR with TaqMan miRNA assays (Applied Biosystems) as previously described (9, 11–13, 18, 19). Two internal control genes, U6 and miR16, were also analyzed in parallel by qRT-PCR in the specimens. Relative expression of a targeted miRNA in a given sample was computed using the equation 2^(−ΔCt), where ΔCt = Ct (targeted miRNA) − Ct (internal control gene). Ct values were defined as the fractional cycle number at which the fluorescence crossed the fixed threshold. All assays were performed in triplicate. Furthermore, two interplate controls and one no-template control were carried along in each experiment. The no-template control for RT was RNase-free water instead of RNA sample input, and the no-template control for PCR was RNase-free water instead of RT product input.
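The relative-expression calculation described above can be illustrated with a minimal Python sketch. The Ct readings are hypothetical, not study data, and averaging the triplicate Ct values before taking the difference is an assumption made here for illustration; the function name `relative_expression` is likewise illustrative.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference):
    """Return 2^(-dCt), where dCt = mean Ct(target) - mean Ct(reference).

    Averaging replicate Ct values first is an assumption of this sketch.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)

# Hypothetical triplicate Ct readings for one sputum sample (not study data).
ct_target_mirna = [24.0, 24.5, 23.5]   # targeted miRNA
ct_control_mir16 = [26.0, 26.5, 25.5]  # internal control gene

rel = relative_expression(ct_target_mirna, ct_control_mir16)
print(f"relative expression = {rel:.2f}")
```

Because Ct is measured on a log2 scale, a target that crosses the threshold two cycles earlier than the control corresponds to a fourfold higher relative expression.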

Statistical analysis

On the basis of one sample with binomially distributed outcomes, we required 45 patients with lung cancer and 45 subjects with benign SPNs in a training set at the 5% significance level with 80% power to discover a panel of biomarkers. To estimate the sample size of a testing set for the validation of the biomarkers, we used area under the ROC curve (AUC) analysis. The AUC of H0 (the null hypothesis) was set at 0.5. H1 represented the alternative hypothesis. To have a high reproducibility with adequate precision, we required 60 subjects per group in the testing set. With this sample size, we would have 90% power to detect an AUC of 0.75 at the 2% significance level. Furthermore, we used Pearson correlation analysis to evaluate the association between miRNA expression and the demographic and clinical characteristics of the patients with either malignant or benign SPNs. The clinicopathologic results were used as the reference standards to determine the diagnostic value of each miRNA biomarker. We used ROC curve and AUC analyses to determine the sensitivity, specificity, and corresponding cutoff value of each miRNA. Sensitivity and specificity indicated the accuracy of the biomarkers. In addition, positive predictive value (PPV) and negative predictive value (NPV), which indicate the probability of disease, were calculated as previously described (26). We further used logistic regression (13) to develop composite panels of biomarkers and to identify an optimal panel that could distinguish malignant from benign SPNs with the highest sensitivity and specificity. All analyses, including correlation coefficient, Wilcoxon test, logistic regression, ANOVA, and t test, were performed using log-transformed data.
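As an illustration of the ROC-based quantities named above, the following Python sketch computes an AUC as the fraction of case-control pairs ranked correctly (the Mann-Whitney formulation) and the sensitivity and specificity at a given cutoff. The expression values, the cutoff, and the function names are hypothetical; this is not the software used in the study.

```python
def auc_mann_whitney(pos, neg):
    """AUC = P(score_pos > score_neg) + 0.5 * P(tie), over all case-control pairs."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def sens_spec(pos, neg, cutoff):
    """Call a sample positive when its score is at or above the cutoff."""
    sensitivity = sum(p >= cutoff for p in pos) / len(pos)
    specificity = sum(n < cutoff for n in neg) / len(neg)
    return sensitivity, specificity

# Hypothetical log-transformed expression values (not study data).
malignant = [2.1, 3.4, 1.8, 2.9, 0.9]
benign = [0.5, 1.2, 0.7, 2.0, 0.3]

auc = auc_mann_whitney(malignant, benign)
se, sp = sens_spec(malignant, benign, cutoff=1.5)
print(f"AUC = {auc:.2f}, sensitivity = {se:.2f}, specificity = {sp:.2f}")
```

An AUC of 0.5 corresponds to the null hypothesis of no discrimination, which is why H0 is set at that value in the sample-size calculation.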

Results

Developing a panel of sputum miRNA biomarkers for diagnosis of malignant SPNs in a training cohort of specimens

All 13 targeted miRNAs had Ct values ≤32 in each sputum sample of the training set, and therefore were reliably detectable in the specimens by qRT-PCR assay. No product was synthesized in the negative control samples. Of the two evaluated internal control genes (miR16 and U6), miR16 displayed a mean Ct value of 26 (mean ± SD, 26 ± 1.3) across all 122 sputum samples. U6 had Ct values ≤32 in 95.1% (116/122) of the sputum samples; however, its Ct values were 36 or higher in the remaining 4.9% (6/122). This finding suggested that U6 might not be reliably detectable in some of the tested specimens. Therefore, in the present study, we used miR16 as an internal control to normalize the data of the 13 targeted miRNAs. As shown in Table 4, the 13 miRNAs displayed significantly different levels between patients with lung cancer and individuals with benign diseases (all P < 0.05). Furthermore, the individual miRNAs exhibited AUC values of 0.64 to 0.85 in distinguishing malignant from benign SPNs (Table 4). We used logistic regression models with constrained parameters, as in LASSO, to develop a panel of miRNA biomarkers for malignant SPNs. miRs21, 31, and 210 were selected as the best biomarkers (all P < 0.001). The expression levels of the three sputum miRNAs were significantly higher in patients with lung cancer compared with subjects with benign SPNs (Table 4). The cutoff value for each of the three sputum miRNAs was selected at the point of the highest Youden Index. The cutoffs for miR21, 31, and 210 were 30.38, 1.62, and 36.56, respectively. Combined use of the three miRNAs produced an AUC of 0.92 (Table 4; Fig. 1). Furthermore, Pearson correlation analysis indicated that the estimated correlations among expression levels of the three miRNAs in sputum were low (all P > 0.05), implying that the diagnostic values of the miRNAs were complementary to each other. Accordingly, the use of the three miRNAs in combination generated 82.93% sensitivity and 87.84% specificity.
Sputum cytology had 43.33% sensitivity and 90.32% specificity. Therefore, the sputum biomarkers had a higher sensitivity (82.93%) compared with sputum cytology (43.33%), while maintaining a similar specificity. The three miRNAs did not display statistically significant differences in sensitivity and specificity between stages (stage I vs. stage II; P > 0.05). The changes of the three miRNAs were associated with the size of SPNs (P < 0.05). The expression of miR21 in sputum was more closely associated with adenocarcinoma (P < 0.05), whereas miR210 was related to SCC (P < 0.05). However, overall, the panel of three biomarkers did not exhibit a specific association with the histologic type of the NSCLC cases, or with the age, gender, and ethnicity of the participants (all P > 0.05). The expression level of miR31 was associated with the smoking history of patients with lung cancer at the edge of significance (P = 0.05).
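The cutoff selection by Youden Index mentioned above can be sketched in Python as follows. The index is J = sensitivity + specificity − 1, and the cutoff that maximizes J is retained. The expression values and the function name `youden_cutoff` are hypothetical, and scanning every observed value as a candidate cutoff is an assumption of this sketch rather than a description of the software used in the study.

```python
def youden_cutoff(pos, neg):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Every observed value is tried as a candidate cutoff; a sample is called
    positive when its value is at or above the cutoff. Ties keep the lowest cutoff.
    """
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(pos) | set(neg)):
        sensitivity = sum(p >= c for p in pos) / len(pos)
        specificity = sum(n < c for n in neg) / len(neg)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j

# Hypothetical relative-expression values (not study data).
malignant = [2.1, 3.4, 1.8, 2.9]
benign = [0.5, 1.2, 0.7, 2.0, 0.3]

cutoff, j = youden_cutoff(malignant, benign)
print(f"optimal cutoff = {cutoff}, Youden index = {j:.2f}")
```

In this toy example the chosen cutoff classifies all malignant samples correctly and misclassifies one of the five benign samples.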

Figure 1. ROC curve analysis of three sputum miRNAs (miR21, 31, and 210) in a training set of 122 patients with either malignant (n = 60) or benign SPNs (n = 62). The area under the ROC curve (AUC) for each miRNA conveys its accuracy for discriminating malignant from benign SPNs. The individual miRNAs produce AUC values of 0.789 to 0.853 (A–C). Combined analysis of the three miRNAs yields an AUC value of 0.919 (D), which is significantly higher than that of any single miRNA used alone (all P < 0.05).

Table 4. The expression difference of sputum miRNAs between patients with either malignant or benign SPNs, the AUC values and corresponding sensitivity and specificity in distinguishing malignant from benign SPNs

Validating the panel of sputum miRNA biomarkers in internal and external testing cohorts of specimens

The diagnostic value of the panel of sputum miRNA biomarkers was validated in the internal testing cohort (Table 2) in a blinded fashion by using the optimal thresholds established in the training set. The panel of the three miRNAs had 82.09% sensitivity and 88.41% specificity, yielding 87.30% PPV and 83.56% NPV in differentiating malignant from benign SPNs (Table 5). The three miRNAs were further tested in an independent testing set of sputum samples (Table 3) collected from a different medical center. The panel of the sputum biomarkers could discern lung cancer from benign diseases with 80.52% sensitivity, 86.08% specificity, 84.93% PPV, and 81.93% NPV (Table 5). Taken together, the results of this validation confirmed the potential of the miRNAs as sputum biomarkers for the early detection of NSCLC among CT-found SPNs.
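The relationship among the four reported measures can be checked with a short Python sketch. The confusion-matrix counts below are not stated in the text; they are inferred here, as an assumption, from the internal testing set sizes (67 malignant, 69 benign) and the reported percentages, and they reproduce those percentages exactly.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all malignant SPNs
        "specificity": tn / (tn + fp),  # true negatives among all benign SPNs
        "ppv": tp / (tp + fp),          # malignant among all positive calls
        "npv": tn / (tn + fn),          # benign among all negative calls
    }

# Counts inferred (assumption) for the internal testing set:
# 67 malignant = 55 detected + 12 missed; 69 benign = 61 negative + 8 positive.
metrics = diagnostic_metrics(tp=55, fn=12, tn=61, fp=8)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of malignant cases in the cohort, so these values apply to cohorts with a similar case mix.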

Table 5. The diagnostic values of the panel of three sputum miRNA biomarkers in an internal testing set and an external testing set of specimens

Discussion

In the present study, we developed a panel of three sputum miRNA biomarkers (miRs21, 31, and 210) that can discriminate early-stage NSCLCs from benign SPNs with 82.93% sensitivity and 87.84% specificity. The biomarker panel has a significantly higher sensitivity (82.93% vs. 65.20%) compared with our previously developed two miRNA biomarkers that mainly distinguished patients with NSCLC from cancer-free smokers (13). Furthermore, the biomarker panel has a higher sensitivity (82.93% vs. 43.33%) compared with sputum cytology. The validations of the biomarkers in two different testing sets with large sample sizes confirm their performance for the diagnosis of malignant SPNs, producing more than 84% PPV and 81% NPV. The higher PPV (84%) of the biomarkers, as compared with the PPV of only 2% for LDCT, suggests that the biomarkers would result in much less overdiagnosis. The positive cases detected by the biomarkers in CT-found SPNs are likely to be malignant SPNs and would warrant prompt surgical treatment. Furthermore, the negative cases identified by the biomarkers in CT-found SPNs are likely to be benign growths and might be spared a 2-year follow-up with harmful and expensive approaches. Therefore, the future application of the biomarkers may substantially decrease CT scan-related overdiagnosis and lead to more personalized therapy by sparing individuals with benign growths from radiation exposure and unnecessary surgical resections or biopsies.

The present study has some limitations. First, the sputum samples used in this study were obtained from individuals with SPNs that were found by contrast-enhanced CT rather than LDCT. These individuals might not be representative of the subjects in an LDCT screening setting. Therefore, a larger-scale validation study of the biomarkers across multiple centers with a population screened by LDCT is required. It would also be interesting to know whether there is any difference in the expression level of the sputum miRNAs between patients with benign SPNs and healthy subjects. Second, the panel of three miRNA biomarkers was selected from only 13 sputum miRNA biomarker candidates. Other important miRNAs might not have been included in this study. Therefore, the diagnostic efficiency (82.93% sensitivity and 87.84% specificity) is still not sufficient for use in clinical settings. Applying whole-genome next-generation sequencing to globally analyze primary lung tumor tissues, we recently identified 68 miRNA signatures of stage I NSCLC (27). The comprehensively identified miRNA signatures would provide new biomarker candidates for lung cancer. Our ongoing efforts are to identify additional miRNA biomarkers from the new signatures that can improve the overall accuracy of the sputum test.

In summary, we report the development of a panel of sputum miRNAs that may provide potential biomarkers for a definitive preoperative diagnosis of SPNs primarily found by CT scan. However, a multicenter clinical trial in a large population to prospectively and rigorously validate the biomarkers is required before they can be translated into routine clinical practice.

Grant Support

This work was supported in part by NCI R01CA161837, VA merit Award I01 CX000512, and LUNGevity/Upstage Foundation Early Detection Award (to F. Jiang).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.