Discriminant Analysis Triumphantly Ends Comparison of Classifiers

Finally, the last classifier. Will it be the best one in the end? Yes! This time the SVM classification from the previous post is replaced by a discriminant analysis model. Discriminant analysis in short: its principle is very similar to the naive Bayes approach, a mix of probabilities, statistics, and optimization.
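To make the principle concrete, here is a minimal from-scratch sketch of linear discriminant analysis in Python (the two-class synthetic data, the equal-priors assumption, and the NumPy implementation are my own illustration, not the MATLAB code used in this series): each class is modeled as a Gaussian with its own mean but a shared covariance matrix, and a sample goes to the class with the higher discriminant score.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two synthetic classes sharing the same covariance but with different means
X0 = rng.normal(loc=[0.0, 0.0], size=(100, 2))
X1 = rng.normal(loc=[3.0, 3.0], size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

# Statistics: per-class means and a pooled (shared) covariance matrix
means = np.array([X[y == k].mean(axis=0) for k in (0, 1)])
pooled = sum(np.cov(X[y == k].T) * (np.sum(y == k) - 1) for k in (0, 1)) / (len(y) - 2)
inv_cov = np.linalg.inv(pooled)

def scores(x):
    # Linear discriminant score of each class, assuming equal priors;
    # the sample is assigned to the class with the higher score
    return [x @ inv_cov @ m - 0.5 * m @ inv_cov @ m for m in means]

print(np.argmax(scores(np.array([2.5, 2.8]))))  # a point near class 1's mean
```

The shared covariance is what makes the decision boundary linear; letting each class keep its own covariance gives the quadratic variant.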

The approach is the same as always: a discriminant analysis model is created from training data, and the reliability of the model is then tested on the remaining samples. The graph is also the same as always, only the results differ. The tables below compare the reliability of all five classifiers (with 70 % of the data used for training) and the computational times that MATLAB needs to create them. All data except the last column, the results of the discriminant analysis models, were mentioned in previous posts.

| Mode | Classification tree | k-nearest neighbors | Naive Bayes | Multiclass SVM | Discriminant analysis |
|---|---|---|---|---|---|
| without optimization | 91.2 % | 73.4 % | 96.5 % | 96.6 % | 98.2 % |
| 'auto' optimization | 89.2 % | 94.9 % | 96.9 % | 97.4 % | 98.1 % |
| 'all' optimization | 90.9 % | 96.3 % | 97.2 % | 97.4 % | 98.3 % |

| Mode | Time per one tree | Time per one k-NN | Time per one NB | Time per one SVM | Time per one DA |
|---|---|---|---|---|---|
| without optimization | 0.01 s | 0.02 s | 0.03 s | 0.07 s | 0.02 s |
| 'auto' optimization | 46.3 s | 58.6 s | 97.2 s | 112.4 s | 50.7 s |
| 'all' optimization | 65.5 s | 94.3 s | 108.9 s | 145.6 s | 84.3 s |
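The train-then-test procedure described above can be sketched in a few lines. This is a stand-in in Python with scikit-learn rather than the MATLAB code behind these numbers, and the wine dataset is my own choice for illustration:

```python
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

# 70 % of the samples train the model, the remaining 30 % test its reliability
X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=0)

model = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
reliability = model.score(X_te, y_te)  # fraction of correctly classified test samples
print(f"reliability: {reliability:.1%}")
```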

Nice, the lines finally approach 100 % reliability. The mean values are around 98 %, and there are quite a few cases where the model reaches exactly 100 % reliability. The only imperfection is that I'm not 100 % sure how it works internally; an unknown sample is somehow assigned to a class based on a probability.
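In the usual formulation, that "somehow" is a posterior probability: the model computes one for every class and assigns the sample to the class with the highest value. A quick sketch in Python with scikit-learn (the iris dataset is an assumed stand-in, not the data from this series):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis().fit(X, y)

# Posterior probability of each of the three classes for one sample;
# the probabilities sum to 1, and the prediction is simply their argmax
posterior = lda.predict_proba(X[[0]])
print(posterior.round(3))
print(lda.predict(X[[0]])[0] == np.argmax(posterior))
```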

To sum everything up, I would like to compare all five classifiers at once. I chose the mode without hyperparameter optimization, because in most cases it did not improve the reliability of the models in any way; it only cost a lot of time [1]. This allowed me to increase the number of created models from 50 to 2500 (for each percentage of training data, in steps of 10 %), and thereby improve the accuracy of the estimates. All data were standardized.
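A sketch of such a comparison in Python with scikit-learn. The wine dataset, the particular classifier settings, 20 repetitions, and the three training fractions shown are my own assumptions standing in for the MATLAB setup (the post used 2500 models per 10 % step):

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
classifiers = {
    "tree": DecisionTreeClassifier(),
    "k-NN": KNeighborsClassifier(),
    "NB": GaussianNB(),
    "SVM": SVC(),
    "DA": LinearDiscriminantAnalysis(),
}

# Mean reliability over repeated random splits for each training fraction
results = {}
for frac in (0.1, 0.5, 0.9):
    for name, clf in classifiers.items():
        scores = []
        for seed in range(20):
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, train_size=frac, random_state=seed, stratify=y)
            model = make_pipeline(StandardScaler(), clf)  # standardized data
            scores.append(model.fit(X_tr, y_tr).score(X_te, y_te))
        results[(frac, name)] = float(np.mean(scores))
        print(f"train={frac:.0%}  {name:5s}  {results[(frac, name)]:.1%}")
```

Averaging over many random splits is what smooths the reliability curves; a single split can easily be a few percent off in either direction.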

The result speaks for itself. Discriminant analysis is clearly the best, followed by naive Bayes. Poor trees: they have the nicest visualization, but being outperformed by every other classifier is not very pleasant for them. On the other hand, waiting for the results of the optimized multiclass SVM classifiers tested my patience the most (let me remind you once again, it took more than 32 hours). A final note: the results are valid only for this dataset, so they may differ for other data. Testing the classifiers on various datasets could therefore be a next challenge.

1. I’m not saying that hyperparameter optimization can’t be useful, just that it is not very suitable for cases where thousands of models are needed ↩