
Confusion Matrix in Machine Learning
What is a Confusion Matrix?
The confusion matrix in machine learning is the easiest way to measure the performance of a classification model whose output can be two or more classes. It is simply a table with two dimensions, "Actual" and "Predicted"; the four cells of the table hold the counts of "True Positives (TP)", "True Negatives (TN)", "False Positives (FP)", and "False Negatives (FN)", as shown below −
|  | Actual Positive (1) | Actual Negative (0) |
|---|---|---|
| Predicted Positive (1) | True Positives (TP) | False Positives (FP) |
| Predicted Negative (0) | False Negatives (FN) | True Negatives (TN) |
For a better understanding, take the example of classifying emails as "spam" or "not spam". Here a spam email is labeled as "positive" and a legitimate (not spam) email is labeled as "negative".
Explanations of the terms associated with the confusion matrix are as follows −
True Positives (TP) − The case when both the actual class and the predicted class of a data point are 1. The classification model correctly predicts the positive class label for the data sample. For example, a "spam" email is classified as "spam".
True Negatives (TN) − The case when both the actual class and the predicted class of a data point are 0. The model correctly predicts the negative class label for the data sample. For example, a "not spam" email is classified as "not spam".
False Positives (FP) − The case when the actual class of a data point is 0 and the predicted class is 1. The model incorrectly predicts the positive class label for the data sample. For example, a "not spam" email is misclassified as "spam". This is known as a Type I error.
False Negatives (FN) − The case when the actual class of a data point is 1 and the predicted class is 0. The model incorrectly predicts the negative class label for the data sample. For example, a "spam" email is misclassified as "not spam". This is known as a Type II error.
We use the confusion matrix to find correct and incorrect classifications −
- Correct classification − TP and TN are correctly classified data points.
- Incorrect classification − FP and FN are incorrectly classified data points.
We can use the confusion matrix to calculate different classification metrics such as accuracy, precision, recall, etc. But before discussing these metrics, let's understand how to create a confusion matrix with the help of a practical example.
Confusion Matrix Practical Example
Let's take a practical example of classifying emails as "spam" or "not spam". Here we represent the class of a spam email as positive (1) and that of a not spam email as negative (0). So each email is classified as either −
- spam (1) − positive class label
- not spam (0) − negative class label
The actual and predicted classes/categories are as follows −
| Email # | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Actual Classification | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 |
| Predicted Classification | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 |
With the above results, let's determine whether each particular classification falls under TP, TN, FP, or FN. Look at the table below −
| Email # | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Actual Classification | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 |
| Predicted Classification | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 |
| Result | TN | TP | TN | TP | FN | FP | TN | FN | TP | TP |
When we compare the actual classifications with the predicted classifications in the above table, we observe four different types of outcomes −
- True positive (1,1) − the actual classification is positive and the predicted classification is also positive; the classifier has correctly identified a positive sample.
- False negative (1,0) − the actual classification is positive but the predicted classification is negative; the classifier has identified a positive sample as negative.
- False positive (0,1) − the actual classification is negative but the predicted classification is positive; a negative sample is incorrectly identified as positive.
- True negative (0,0) − the actual and predicted classifications are both negative; a negative sample is correctly identified as negative.
Let's find the total number of samples in each category; a quick Python check of these counts follows the list below −
- TP (True Positive): 4
- FN (False Negative): 2
- FP (False Positive): 1
- TN (True Negative): 3
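As that check, here is a minimal Python sketch (not part of the tutorial's own code) that tallies the four counts directly from the actual and predicted labels of the example −
# Actual and predicted labels from the example above
y_actual = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

TP = sum(1 for a, p in zip(y_actual, y_pred) if a == 1 and p == 1)
FN = sum(1 for a, p in zip(y_actual, y_pred) if a == 1 and p == 0)
FP = sum(1 for a, p in zip(y_actual, y_pred) if a == 0 and p == 1)
TN = sum(1 for a, p in zip(y_actual, y_pred) if a == 0 and p == 0)

print(TP, FN, FP, TN)   # 4 2 1 3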
Let's now create the confusion matrix as follows −
|  | Actual Positive (1) | Actual Negative (0) |
|---|---|---|
| Predicted Positive (1) | 4 (TP) | 1 (FP) |
| Predicted Negative (0) | 2 (FN) | 3 (TN) |
So far we have created the confusion matrix for above problem. Let's infer some meaning from the above matrix −
- Amongst 10 emails, four "spam" emails are correctly classified as "spam" (TP).
- Amongst 10 emails, two "spam" emails are incorrectly classified as "not spam" (FN).
- Amongst 10 emails, one "not spam" email is incorrectly classified as "spam" (FP).
- Amongst 10 emails, three "not spam" emails are correctly classified as "not spam" (TN).
- So amongst 10 emails, seven emails are correctly classified (TP & TN) and three emails are incorrectly classified (FP & FN).
Classification Metrics Based on Confusion Matrix
We can define many classification performance metrics using the confusion matrix. We will consider the above practical example and calculate the metrics using its values. Some of them are as follows −
- Accuracy
- Precision
- Recall or Sensitivity
- Specificity
- F1 Score
- Type I Error Rate
- Type II Error Rate
Accuracy
Accuracy is the most common metric used to evaluate a classification model. It is the ratio of the total number of correct predictions to the number of all predictions made. Mathematically, we can use the following formula to calculate accuracy −
$$\mathrm{Accuracy = \frac{TP + TN}{TP + FP + FN + TN}}$$
Let's calculate the accuracy −
$$\mathrm{Accuracy = \frac{4 + 3}{4 + 1 + 2 + 3}= \frac{7}{10} = 0.7}$$
Hence the model's classification accuracy is 70%.
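This value can also be reproduced with scikit-learn's accuracy_score() function. Below is a minimal sketch (not part of the original example), assuming the same label lists as above −
from sklearn.metrics import accuracy_score

y_actual = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

# (TP + TN) / total = 7 / 10
print(accuracy_score(y_actual, y_pred))   # 0.7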
Precision
Precision measures the proportion of true positive instances out of all predicted positive instances. It is calculated as the ratio of the number of true positive instances to the sum of true positive and false positive instances.
$$\mathrm{Precision = \frac{TP}{TP + FP}}$$
Let's calculate the precision −
$$\mathrm{Precision = \frac{4}{4 + 1} = \frac{4}{5} = 0.8}$$
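Hence the precision is 0.8. The same value can be obtained with scikit-learn's precision_score() function; a minimal sketch, assuming the same label lists as above −
from sklearn.metrics import precision_score

y_actual = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

# TP / (TP + FP) = 4 / 5
print(precision_score(y_actual, y_pred))   # 0.8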
Recall or Sensitivity
Recall (also called sensitivity) measures the proportion of actual positive instances that the classifier correctly identifies as positive. We can calculate it with the help of the following formula −
$$\mathrm{Recall = \frac{TP}{TP + FN}}$$
Let's calculate recall −
$$\mathrm{Recall = \frac{4}{4 + 2} = \frac{4}{6} = 0.667}$$
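Hence the recall is about 0.667. For reference, a minimal sketch using scikit-learn's recall_score() function (same label lists as above) −
from sklearn.metrics import recall_score

y_actual = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

# TP / (TP + FN) = 4 / 6
print(recall_score(y_actual, y_pred))   # 0.666...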
Specificity
Specificity, in contrast to recall, measures the proportion of actual negative instances that the classifier correctly identifies as negative. We can calculate it with the help of the following formula −
$$\mathrm{Specificity = \frac{TN}{TN + FP}}$$
Let's calculate the specificity −
$$\mathrm{Specificity = \frac{3}{3 + 1} = \frac{3}{4} = 0.75}$$
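Hence the specificity is 0.75. scikit-learn has no dedicated specificity function, but it can be derived from the confusion matrix; a minimal sketch, assuming the same label lists as above −
from sklearn.metrics import confusion_matrix

y_actual = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

# confusion_matrix returns [[TN, FP], [FN, TP]] for binary labels 0/1
tn, fp, fn, tp = confusion_matrix(y_actual, y_pred).ravel()
print(tn / (tn + fp))   # 3 / 4 = 0.75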
F1 Score
F1 score is a balanced measure that takes into account both precision and recall. It is the harmonic mean of precision and recall.
We can calculate F1 score with the help of following formula −
$$\mathrm{F1 \: Score = 2 \times \frac{(Precision \times Recall)}{Precision + Recall}}$$
Let's calculate F1 score −
$$\mathrm{F1 \: Score = 2 \times \frac{(0.8 \times 0.667)}{0.8 + 0.667} = 0.727}$$
Hence, F1 score is 0.727.
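Likewise, scikit-learn's f1_score() function gives the same result; a minimal sketch (same label lists as above) −
from sklearn.metrics import f1_score

y_actual = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

# 2 * (precision * recall) / (precision + recall)
print(f1_score(y_actual, y_pred))   # 0.7272...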
Type I Error Rate
A Type I error occurs when the classifier predicts the positive class but the actual class is negative. The Type I error rate is calculated as −
$$\mathrm{Type \: I \: Error \: Rate = \frac{FP}{FP + TN}}$$
$$\mathrm{Type \: I \: Error \: Rate = \frac{1}{1 + 3} = \frac{1}{4} = 0.25}$$
Type II Error Rate
A Type II error occurs when the classifier predicts the negative class but the actual class is positive. The Type II error rate can be calculated as −
$$\mathrm{Type \: II \: Error \: Rate = \frac{FN}{FN + TP}}$$
$$\mathrm{Type \: II \: Error \: Rate = \frac{2}{2 + 4} = \frac{2}{6} = 0.333}$$
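Both error rates can be verified with a few lines of Python using the counts from our example; a minimal sketch, not part of the original tutorial code −
# Counts from the example above
TP, FN, FP, TN = 4, 2, 1, 3

type_i_error_rate = FP / (FP + TN)    # 1 / 4
type_ii_error_rate = FN / (FN + TP)   # 2 / 6
print(type_i_error_rate, type_ii_error_rate)   # 0.25 0.333...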
How to Implement Confusion Matrix in Python?
To implement the confusion matrix in Python, we can use the confusion_matrix() function from the sklearn.metrics module of the scikit-learn library.
Note: Please note that the confusion_matrix() function returns a 2D array that corresponds to the following confusion matrix −
|  | Predicted Negative (0) | Predicted Positive (1) |
|---|---|---|
| Actual Negative (0) | True Negative (TN) | False Positive (FP) |
| Actual Positive (1) | False Negative (FN) | True Positive (TP) |
Here is a simple example of how to use the confusion_matrix() function −
from sklearn.metrics import confusion_matrix

# Actual values
y_actual = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1]

# Predicted values
y_pred = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

# Confusion matrix
cm = confusion_matrix(y_actual, y_pred)
print(cm)
In this example, we have two arrays: y_actual contains the actual values of the target variable, and y_pred contains the predicted values of the target variable. We then call the confusion_matrix() function, passing in y_actual and y_pred as arguments. The function returns a 2D array that represents the confusion matrix.
The output of the code above will look like this −
[[3 1]
 [2 4]]
Compare the above result with the confusion matrix we created above.
- True Negative (TN): 3
- False Positive (FP): 1
- False Negative (FN): 2
- True Positive (TP): 4
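If you need the four counts as separate variables, the 2D array returned by confusion_matrix() can be unpacked with ravel(); a short sketch continuing from the code above −
# Unpack the confusion matrix computed above into its four counts
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)   # 3 1 2 4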
We can also visualize the confusion matrix using a heatmap. Below is how we can do that using the heatmap() function from the seaborn library −
import seaborn as sns
import matplotlib.pyplot as plt

# Plot confusion matrix as heatmap
sns.heatmap(cm, annot=True, cmap='summer')
plt.show()
This will produce a heatmap that shows the confusion matrix −

In this heatmap, the x-axis represents the predicted values, and the y-axis represents the actual values. The color of each square in the heatmap indicates the number of samples that fall into each category.
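In recent versions of scikit-learn (1.0 and later), a similar plot can also be produced without seaborn using the ConfusionMatrixDisplay class; the following is a minimal sketch, not part of the original example −
from sklearn.metrics import ConfusionMatrixDisplay
import matplotlib.pyplot as plt

y_actual = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

# Compute and plot the confusion matrix in one step
ConfusionMatrixDisplay.from_predictions(y_actual, y_pred)
plt.show()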