Confusion Matrix
Online Calculator

About

A confusion matrix is a popular way to represent the performance of
classification models. The matrix (table) shows the number of correctly
and incorrectly classified examples, compared to the actual outcomes
(target values) in the test data. One advantage of using a confusion
matrix as an evaluation tool is that it allows more detailed analysis
(such as whether the model is confusing two classes) than the simple proportion
of correctly classified examples (accuracy), which can give misleading
results if the dataset is unbalanced (i.e. when there are large differences
in the number of examples between classes).
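
To illustrate why accuracy alone can mislead on unbalanced data, here is a minimal Python sketch. The dataset (95 negatives, 5 positives) and the always-negative "model" are hypothetical, chosen only for illustration.

```python
# Hypothetical, deliberately unbalanced test set: 95 negatives, 5 positives.
actual = ["neg"] * 95 + ["pos"] * 5

# A trivial "model" that always predicts the majority class.
predicted = ["neg"] * len(actual)

# Accuracy: proportion of examples where prediction matches the actual class.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# How many of the actual positives did the model find?
positives_found = sum(a == "pos" and p == "pos" for a, p in zip(actual, predicted))

print(f"accuracy = {accuracy:.2f}")
print(f"positives found = {positives_found} of 5")
```

The accuracy looks high even though the model never identifies a single positive example, which is exactly the kind of failure a confusion matrix makes visible.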

The matrix is n by n, where n is the number of classes. The simplest
classifiers, called binary classifiers, have only two classes:
positive/negative, yes/no, male/female…
The performance of a binary classifier is summarized in a confusion matrix
that cross-tabulates predicted and observed examples into four outcomes:
true positives, false positives, true negatives, and false negatives.
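
The cross-tabulation described above can be sketched in a few lines of Python. The function name `confusion_matrix`, the row/column orientation (rows are actual classes, columns are predicted classes), and the six example labels are assumptions made for this illustration, not part of the calculator.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Return an n-by-n list of lists.

    Rows are actual classes and columns are predicted classes,
    both in the order given by `labels`.
    """
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical observed and predicted classes for six examples.
actual    = ["yes", "yes", "no", "no",  "no", "yes"]
predicted = ["yes", "no",  "no", "yes", "no", "yes"]

# Listing the positive class first gives the layout [[TP, FN], [FP, TN]].
matrix = confusion_matrix(actual, predicted, labels=["yes", "no"])
(tp, fn), (fp, tn) = matrix

print(matrix)
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
```

Passing more than two labels yields the general n by n matrix; with two labels the four cells are the true positives, false negatives, false positives, and true negatives.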

Measures


Example
Evaluating a Spam Classifier

How can we use these metrics, and what can we read from the
confusion matrix? For instance, consider the classic
problem of predicting spam and non-spam email using a binary
classification model. Our dataset consists of 50 emails that
are Spam and 105 emails that are Not Spam. To evaluate
the performance of our model, which labels emails as
Spam or Not Spam, we can use a confusion matrix, in which the outcomes
are laid out in a 2×2 contingency table.

Altogether, the classifier made 100 predictions
(100 emails were classified as either Spam or Non-Spam).

Out of 100 emails, our model correctly classified 95 emails:
85 were correctly classified as Non-Spam,
and 10 of them were correctly classified as Spam.
This results in 95% accuracy.

Further, 5 out of 100 emails were misclassified:
5 emails that were actually Spam were not predicted as Spam (False Negatives).
More importantly, no email was falsely predicted as Spam (False Positive),
which is highly desirable in this case.

We can observe that our model is very conservative when it comes to predicting Spam.
Therefore, the precision of this model is very high: 1.0, since every email it predicted as Spam was actually Spam.
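
The accuracy and precision quoted above follow directly from the four counts in the example. A short Python sketch of the arithmetic, using the standard definitions (the variable names are chosen here for illustration):

```python
# Counts taken from the spam example above.
tp = 10   # Spam correctly predicted as Spam
fn = 5    # Spam not predicted as Spam
fp = 0    # Non-Spam falsely predicted as Spam
tn = 85   # Non-Spam correctly predicted as Non-Spam

total = tp + fn + fp + tn

# Accuracy: share of all predictions that were correct.
accuracy = (tp + tn) / total

# Precision: share of emails predicted as Spam that really were Spam.
precision = tp / (tp + fp)

print(f"total predictions = {total}")
print(f"accuracy  = {accuracy:.2f}")
print(f"precision = {precision:.2f}")
```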

By computing additional measures (also called rates) from the confusion
matrix, we can gain additional insight into our model.
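
As a sketch of what such additional rates look like, the Python snippet below computes a few commonly used ones from the spam example's counts. The helper names `safe_div` and `rates` are hypothetical, and the selection of measures is an assumption; the calculator itself may report a different set.

```python
def safe_div(num, den):
    """Return num / den, or None when the denominator is zero (rate undefined)."""
    return num / den if den else None

def rates(tp, fn, fp, tn):
    """Compute a few standard rates from the four confusion-matrix counts."""
    return {
        "recall": safe_div(tp, tp + fn),              # true positive rate
        "specificity": safe_div(tn, tn + fp),         # true negative rate
        "false_negative_rate": safe_div(fn, tp + fn),
        "false_positive_rate": safe_div(fp, tn + fp),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),     # harmonic mean of precision and recall
    }

# Counts from the spam example: 10 TP, 5 FN, 0 FP, 85 TN.
spam_rates = rates(tp=10, fn=5, fp=0, tn=85)

for name, value in spam_rates.items():
    print(name, "undefined" if value is None else f"{value:.3f}")
```

Here the recall is only about 0.667: although the model never mislabels a legitimate email, it lets a third of the actual Spam through, which the 95% accuracy figure alone does not reveal.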