Abstract

Although the UK cervical screening programme has reduced mortality associated with invasive disease, a high-throughput predictive methodology that is cost-effective and robust could greatly support the current system. We combined analysis of cervical cytology by attenuated total reflection Fourier-transform infrared spectroscopy with the self-learning classifier eClass. This predictive algorithm can cope with vast amounts of multidimensional data with variable characteristics. Using a characterised dataset [set A: UK cervical specimens designated as normal (n=60), low-grade (n=60) or high-grade (n=60)] and one further dataset (set B) consisting of n=30 low-grade samples, we set out to determine whether this approach could be robustly predictive. Extending the training set (set A) with set B data in various ways produced good classification rates with three two-class cascade classifiers. However, a single three-class classifier was equally efficient, producing a user-friendly, applicable methodology with improved interpretability (i.e., better classification with only one set of fuzzy rules). As data from set B were added incrementally to the training set, the model learned and evolved. Additionally, monitoring the results for the set B specimens (known to be low-grade cervical cytology) provided the opportunity to explore the possibility of distinguishing patients likely to progress towards invasive disease. eClass exhibited remarkably robust predictive power in a user-friendly fashion (i.e., high throughput, ease of use) compared to other classifiers (k-nearest neighbours, support vector machines, artificial neural networks). Development of eClass to classify such datasets for applications such as screening shows robustness in identifying a dichotomous marker of invasive disease progression.