
Confusion matrix

In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix,[4] is a
specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is
usually called a matching matrix). Each row of the matrix represents the instances in a predicted class while each column represents the instances in an
actual class (or vice versa).[2] The name stems from the fact that it makes it easy to see if the system is confusing two classes (i.e. commonly mislabeling
one as another).

It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical sets of "classes" in both dimensions (each
combination of dimension and class is a variable in the contingency table).

Contents
Example
Table of confusion
References
External links

Example
If a classification system has been trained to distinguish between cats, dogs and rabbits, a confusion matrix will summarize the results of testing the
algorithm for further inspection. Assuming a sample of 27 animals — 8 cats, 6 dogs, and 13 rabbits, the resulting confusion matrix could look like the
table below:

                        Actual class
                        Cat    Dog    Rabbit
Predicted    Cat          5      2       0
class        Dog          3      3       2
             Rabbit       0      1      11

In this confusion matrix, of the 8 actual cats, the system predicted that three were dogs, and of the six dogs, it predicted that one was a rabbit and two were cats. We can see from the matrix that the system in question has trouble distinguishing between cats and dogs, but can make the distinction between rabbits and other types of animals pretty well. All correct predictions are located in the diagonal of the table, so it is easy to visually inspect the table for prediction errors, as they will be represented by values outside the diagonal.
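
A minimal Python sketch of how such a matrix can be tallied (the label names and per-pair counts are taken from the example above; in practice a library routine such as scikit-learn's sklearn.metrics.confusion_matrix would normally be used):

    from collections import Counter

    # Tally (actual, predicted) pairs that reproduce the example above:
    # 8 cats, 6 dogs and 13 rabbits.
    labels = ["cat", "dog", "rabbit"]
    pairs = (
        [("cat", "cat")] * 5 + [("cat", "dog")] * 3
        + [("dog", "cat")] * 2 + [("dog", "dog")] * 3 + [("dog", "rabbit")] * 1
        + [("rabbit", "dog")] * 2 + [("rabbit", "rabbit")] * 11
    )
    counts = Counter(pairs)  # keyed by (actual, predicted)

    # Print the matrix with predicted classes as rows and actual classes as columns.
    print("predicted\\actual " + " ".join(f"{a:>7}" for a in labels))
    for pred in labels:
        row = " ".join(f"{counts[(act, pred)]:>7}" for act in labels)
        print(f"{pred:<17} {row}")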

Table of confusion
In predictive analytics, a table of confusion (sometimes also called a confusion matrix) is a table with two rows and two columns that reports the
number of false positives, false negatives, true positives, and true negatives. This allows more detailed analysis than mere proportion of correct
classifications (accuracy). Accuracy is not a reliable metric for the real performance of a classifier, because it will yield misleading results if the data set
is unbalanced (that is, when the numbers of observations in different classes vary greatly). For example, if there were 95 cats and only 5 dogs in the data,
a particular classifier might classify all the observations as cats. The overall accuracy would be 95%, but in more detail the classifier would have a 100%
recognition rate (sensitivity) for the cat class but a 0% recognition rate for the dog class. F1 score is even more unreliable in such cases, and here would
yield over 97.4%, whereas informedness removes such bias and yields 0 as the probability of an informed decision for any form of guessing (here always
guessing cat).
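
To make these figures concrete, the following Python sketch treats "cat" as the positive class and reproduces the 95% accuracy, the 100% and 0% per-class recognition rates, the roughly 97.4% F1 score, and the informedness of 0 for a classifier that always guesses cat:

    # 95 cats and 5 dogs, with a degenerate classifier that predicts "cat" for everything.
    actual = ["cat"] * 95 + ["dog"] * 5
    predicted = ["cat"] * 100

    tp = sum(a == "cat" and p == "cat" for a, p in zip(actual, predicted))  # 95
    fp = sum(a == "dog" and p == "cat" for a, p in zip(actual, predicted))  # 5
    fn = sum(a == "cat" and p == "dog" for a, p in zip(actual, predicted))  # 0
    tn = sum(a == "dog" and p == "dog" for a, p in zip(actual, predicted))  # 0

    accuracy = (tp + tn) / len(actual)                          # 0.95
    recall_cat = tp / (tp + fn)                                 # 1.0 -> 100% for cats
    recall_dog = tn / (tn + fp)                                 # 0.0 -> 0% for dogs
    precision = tp / (tp + fp)                                  # 0.95
    f1 = 2 * precision * recall_cat / (precision + recall_cat)  # ~0.974
    informedness = recall_cat + recall_dog - 1                  # 0.0

    print(accuracy, recall_cat, recall_dog, round(f1, 3), informedness)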

Assuming the confusion matrix above, its corresponding table of confusion, for the cat class, would be:

                        Actual class
                        Cat                    Non-cat
Predicted    Cat        5 True Positives       2 False Positives
class        Non-cat    3 False Negatives      17 True Negatives
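
These four counts can be derived mechanically from the 3×3 matrix by treating "cat" as the positive class and the other labels as negative. A short Python sketch (the nested-dictionary layout is just one possible representation of the matrix):

    # matrix[predicted][actual] holds the counts from the cat/dog/rabbit example.
    matrix = {
        "cat":    {"cat": 5, "dog": 2, "rabbit": 0},
        "dog":    {"cat": 3, "dog": 3, "rabbit": 2},
        "rabbit": {"cat": 0, "dog": 1, "rabbit": 11},
    }

    def one_vs_rest(matrix, positive):
        """Collapse a multi-class matrix into (TP, FP, FN, TN) for one class."""
        labels = list(matrix)
        tp = matrix[positive][positive]
        fp = sum(matrix[positive][a] for a in labels if a != positive)
        fn = sum(matrix[p][positive] for p in labels if p != positive)
        tn = sum(matrix[p][a] for p in labels for a in labels
                 if p != positive and a != positive)
        return tp, fp, fn, tn

    print(one_vs_rest(matrix, "cat"))  # (5, 2, 3, 17), matching the table above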


The final table of confusion would contain the average values for all classes combined.

Terminology and derivations from a confusion matrix

Let us define an experiment from P positive instances and N negative instances for some condition. The four outcomes can be formulated in a 2×2 confusion matrix, as follows:

condition positive (P)
the number of real positive cases in the data
condition negative (N)
the number of real negative cases in the data
true positive (TP)
eqv. with hit
true negative (TN)
eqv. with correct rejection
false positive (FP)
eqv. with false alarm, Type I error
false negative (FN)
eqv. with miss, Type II error

sensitivity, recall, hit rate, or true positive rate (TPR)
TPR = TP / P = TP / (TP + FN)
specificity, selectivity or true negative rate (TNR)
TNR = TN / N = TN / (TN + FP)
precision or positive predictive value (PPV)
PPV = TP / (TP + FP)
negative predictive value (NPV)
NPV = TN / (TN + FN)
miss rate or false negative rate (FNR)
FNR = FN / P = 1 − TPR
fall-out or false positive rate (FPR)
FPR = FP / N = 1 − TNR
false discovery rate (FDR)
FDR = FP / (FP + TP) = 1 − PPV
false omission rate (FOR)
FOR = FN / (FN + TN) = 1 − NPV
accuracy (ACC)
ACC = (TP + TN) / (P + N)
F1 score
is the harmonic mean of precision and sensitivity: F1 = 2TP / (2TP + FP + FN)
Matthews correlation coefficient (MCC)
MCC = (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))
Informedness or Bookmaker Informedness (BM)
BM = TPR + TNR − 1
Markedness (MK)
MK = PPV + NPV − 1

Sources: Fawcett (2006), Powers (2011), and Ting (2011) [1][2][3]

                                   True condition
                                   Condition positive               Condition negative
Predicted condition positive       True positive (Power)            False positive (Type I error)
Predicted condition negative       False negative (Type II error)   True negative

The rates and values derived from this table are:

Prevalence = Σ Condition positive / Σ Total population
Accuracy (ACC) = (Σ True positive + Σ True negative) / Σ Total population
Positive predictive value (PPV), Precision = Σ True positive / Σ Predicted condition positive
False discovery rate (FDR) = Σ False positive / Σ Predicted condition positive
False omission rate (FOR) = Σ False negative / Σ Predicted condition negative
Negative predictive value (NPV) = Σ True negative / Σ Predicted condition negative
True positive rate (TPR), Recall, Sensitivity, probability of detection = Σ True positive / Σ Condition positive
False positive rate (FPR), Fall-out, probability of false alarm = Σ False positive / Σ Condition negative
False negative rate (FNR), Miss rate = Σ False negative / Σ Condition positive
Specificity (SPC), Selectivity, True negative rate (TNR) = Σ True negative / Σ Condition negative
Positive likelihood ratio (LR+) = TPR / FPR
Negative likelihood ratio (LR−) = FNR / TNR
Diagnostic odds ratio (DOR) = LR+ / LR−
F1 score = 2 / (1/Recall + 1/Precision)
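
As a cross-check, the following Python sketch computes these derived values from the four cell counts, applied here to the cat-class table of confusion given earlier (TP = 5, FP = 2, FN = 3, TN = 17); it assumes none of the denominators is zero:

    # Derived rates computed from the four cell counts (TP, FP, FN, TN).
    def derived_rates(tp, fp, fn, tn):
        p, n = tp + fn, fp + tn                  # condition positive / negative
        tpr = tp / p                             # sensitivity, recall
        tnr = tn / n                             # specificity
        ppv = tp / (tp + fp)                     # precision
        npv = tn / (tn + fn)
        fnr, fpr = 1 - tpr, 1 - tnr              # miss rate, fall-out
        return {
            "prevalence": p / (p + n),
            "accuracy": (tp + tn) / (p + n),
            "TPR": tpr, "TNR": tnr, "PPV": ppv, "NPV": npv,
            "FNR": fnr, "FPR": fpr, "FDR": 1 - ppv, "FOR": 1 - npv,
            "LR+": tpr / fpr, "LR-": fnr / tnr,
            "DOR": (tpr / fpr) / (fnr / tnr),
            "F1": 2 / (1 / tpr + 1 / ppv),
            "informedness (BM)": tpr + tnr - 1,
            "markedness (MK)": ppv + npv - 1,
        }

    # Cat-class table of confusion from the example: TP=5, FP=2, FN=3, TN=17.
    for name, value in derived_rates(5, 2, 3, 17).items():
        print(f"{name}: {value:.3f}")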

References
1. Fawcett, Tom (2006). "An Introduction to ROC Analysis" (http://people.inf.elte.hu/kiss/11dwhdm/roc.pdf) (PDF). Pattern Recognition Letters. 27 (8): 861–874. doi:10.1016/j.patrec.2005.10.010 (https://doi.org/10.1016%2Fj.patrec.2005.10.010).
2. Powers, David M. W. (2011). "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation" (http://www.flinders.edu.au/science_engineering/fms/School-CSEM/publications/tech_reps-research_artfcts/TRRA_2007.pdf) (PDF). Journal of Machine Learning Technologies. 2 (1): 37–63.
3. Ting, Kai Ming (2011). Encyclopedia of Machine Learning (https://link.springer.com/referencework/10.1007%2F978-0-387-30164-8). Springer. ISBN 978-0-387-30164-8.
4. Stehman, Stephen V. (1997). "Selecting and interpreting measures of thematic classification accuracy". Remote Sensing of Environment. 62 (1): 77–89. doi:10.1016/S0034-4257(97)00083-7 (https://doi.org/10.1016%2FS0034-4257%2897%2900083-7).

External links
Theory about the confusion matrix
GM-RKB Confusion Matrix concept page

Retrieved from "https://en.wikipedia.org/w/index.php?title=Confusion_matrix&oldid=865598510"
