Abstract

Data mining techniques play a major role in healthcare, where they help analyze large volumes of data. For diabetes patients, a blood glucose level that deviates from the typical range can lead to serious complications, so patients must be monitored regularly to detect any critical variations. A predictive model for monitoring the glucose level would enable patients to take preventive measures. This paper describes a solution for the early detection of diabetes that applies various data mining techniques to generate informative structures trained on specific data. The main goal of the research is to generate clear and understandable pattern descriptions in order to extract the knowledge and information stored in the dataset. We investigate the relative performance of several classifiers: Naive Bayes, a Support Vector Machine (SVM) trained with SMO, Decision Tree, and a Neural Network (multilayer perceptron). The ensemble data mining approaches are improved by means of the classification algorithms. The experimental results show that, with the splitting technique (ST), Naive Bayes achieves the best accuracy, 83.5%, when the dataset is split in a 70–30 ratio. With cross-validation (CV), the decision tree gives the best result, 78.3%, compared with the other classifiers. The experiments are performed in the Weka tool on the diabetes dataset from the UCI repository. The study shows the potential of an ensemble predictive model for predicting instances of diabetes using the UCI repository diabetes data. The results are compared across the classifiers, and the accuracy on the test data is measured.
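The two evaluation protocols named above, a 70–30 split and cross-validation, can be illustrated with a minimal sketch. It is not the Weka setup used in the study: the Gaussian Naive Bayes below is a simplified pure-Python stand-in, the data are synthetic (two made-up numeric features, not the UCI diabetes data), and the helper names, the 10-fold setting, and the random seeds are assumptions chosen for illustration. The accuracies it prints say nothing about the figures reported here.

```python
import math
import random


def fit_gnb(X, y):
    """Fit Gaussian Naive Bayes: per-class prior, feature means, variances."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(y), means, variances)
    return model


def predict_gnb(model, x):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(model, X, y):
    """Fraction of samples whose predicted class matches the true class."""
    hits = sum(predict_gnb(model, x) == t for x, t in zip(X, y))
    return hits / len(y)


def holdout_accuracy(X, y, train_fraction=0.7, seed=0):
    """Splitting technique: train on one shuffled part, test on the rest."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    cut = int(train_fraction * len(idx))
    train, test = idx[:cut], idx[cut:]
    model = fit_gnb([X[i] for i in train], [y[i] for i in train])
    return accuracy(model, [X[i] for i in test], [y[i] for i in test])


def cross_val_accuracy(X, y, k=10, seed=0):
    """k-fold cross-validation: each fold is held out once; scores are averaged."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_gnb([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(model, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / k


def make_synthetic(n=200, seed=1):
    """Synthetic two-class data (assumption: NOT the UCI diabetes dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(100 + 40 * label, 15), rng.gauss(25 + 8 * label, 5)])
        y.append(label)
    return X, y


X, y = make_synthetic()
st_acc = holdout_accuracy(X, y)
cv_acc = cross_val_accuracy(X, y)
print(f"70-30 split accuracy: {st_acc:.3f}")
print(f"10-fold CV accuracy:  {cv_acc:.3f}")
```

The split estimate depends on a single random partition, while cross-validation averages over every fold, which is one reason the two protocols can rank classifiers differently.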