Abstract

Introduction

Analyses of microRNA expression have revealed that microRNAs are abnormally expressed in gastric cancer and correlate with tumorigenesis and progression. We propose a data mining approach applied to microRNA expression microarrays obtained from publicly available databases to evaluate their predictive power in different disease conditions, and we compare this approach with traditional methods of statistical comparison.

Methods

MicroRNA expression patterns were compared between non-tumor mucosa and cancer samples, classified by histological type (diffuse and intestinal) and by progression factors (clinical stage and depth of invasion), using three supervised analysis methods: Random Forest, Decision Trees and Support Vector Machines (SVM), all implemented in R. We estimated the prediction error of each model using leave-one-out cross-validation, and used the important genes identified by the first two models as a feature selection strategy for SVM. Several microRNAs involved in the immune response to Helicobacter pylori (146, 155, 21, 199, 125a, 100 and 106) were assessed with real-time PCR (RT-PCR) in an independent group of 20 samples to compare their fold changes with the data obtained from microarray platforms and to assess their role in the intestinal histological type.
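The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration in Python, not the R implementation used in the study: a nearest-centroid classifier stands in for Random Forest, Decision Trees and SVM, and a simple scaled mean-difference score stands in for the importance rankings of the first two models. The function names (`top_features`, `nearest_centroid_predict`, `loocv_error`) and the toy expression matrix are hypothetical. The point it shows is that features are re-selected inside every fold, so the held-out sample never influences which microRNAs are chosen.

```python
from statistics import mean, pstdev


def top_features(X, y, k):
    """Return indices of the k features with the largest class-mean gap,
    scaled by the overall spread of the feature (two classes assumed)."""
    classes = sorted(set(y))
    assert len(classes) == 2, "this sketch handles two-class comparisons only"
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        a = [v for v, lab in zip(col, y) if lab == classes[0]]
        b = [v for v, lab in zip(col, y) if lab == classes[1]]
        spread = pstdev(col) or 1.0  # avoid dividing by zero on flat features
        scores.append((abs(mean(a) - mean(b)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def nearest_centroid_predict(X, y, x_new, feats):
    """Assign x_new to the class whose centroid (on feats) is closest."""
    centroids = {}
    for lab in set(y):
        rows = [row for row, l in zip(X, y) if l == lab]
        centroids[lab] = [mean(r[j] for r in rows) for j in feats]

    def sq_dist(center):
        return sum((x_new[j] - c) ** 2 for j, c in zip(feats, center))

    return min(sorted(centroids), key=lambda lab: sq_dist(centroids[lab]))


def loocv_error(X, y, k=2):
    """Leave-one-out error rate; feature selection is repeated per fold."""
    errors = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        feats = top_features(X_train, y_train, k)
        if nearest_centroid_predict(X_train, y_train, X[i], feats) != y[i]:
            errors += 1
    return errors / len(X)


# Toy data (invented for illustration): rows are samples, columns are microRNAs.
X = [
    [5.1, 1.0, 3.0],
    [4.9, 1.2, 2.0],
    [5.3, 0.9, 3.5],
    [1.0, 1.1, 2.9],
    [1.2, 0.8, 2.1],
    [0.8, 1.0, 3.4],
]
y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
error_rate = loocv_error(X, y, k=1)
```

Selecting features on the full data set before cross-validation would tend to give optimistically low error estimates, which is why the selection step sits inside the loop in this sketch.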

Results

A total of 353 gastric samples were used to run class prediction algorithms and determine whether microRNA expression patterns could accurately differentiate between cancer (n = 184) and non-tumor mucosa (n = 169). The most relevant microRNAs in this regard include 181b, 181d, 375, 93, 21, 148, 181a and 181c, some of which are up- or down-regulated, as confirmed by t-test-based microarray analysis. These genes are among the top 10 or 20 genes generated by traditional methodologies such as SAM (Significance Analysis of Microarrays). Comparison between each histological type (diffuse, n = 103; intestinal, n = 81) and normal mucosa (n = 84 and n = 75, respectively) highlighted 181b and 18d in the diffuse type and 133a and 148 in the intestinal type. The two histological subtypes of gastric cancer also differed in expression when compared with each other, including microRNAs 373, 498 and 494. Support Vector Machines exhibited the lowest error rates when applied to the test samples in all comparisons, ranging from 5% to 10%. All algorithms showed good predictive ability, with error rates below 30%. Using the same procedure, we identified microRNAs correlated with depth of invasion or disease stage, although with lower accuracy. Expression of the microRNAs implicated in the immune response to Helicobacter pylori infection, studied in the intestinal type by RT-PCR, showed small fold changes, ranging from 0.2 to 2.5 units, consistent with those seen on the microarray platforms.

Conclusion

We propose the use of different classification methods to predict the participation of microRNAs in gastric cancer tumorigenesis and progression. This strategy produces reliable signatures that partly coincide with, and partly differ from, those of other expression analyses, and it could be a starting point to guide wet-lab experiments. The set of microRNAs evaluated by RT-PCR showed very low, basal levels of expression and seems to contribute little to tumorigenesis.