ROC Curve in Python

The Receiver Operating Characteristic (ROC) curve is a fundamental tool in the field of machine learning for evaluating the performance of classification models. In this context, we'll explore the ROC curve and its associated metrics using the breast cancer dataset, a widely used dataset for binary classification tasks.
What is the ROC Curve?
ROC stands for Receiver Operating Characteristic. The ROC curve is an evaluation metric for classification tasks: it is a probability curve that plots the sensitivity (true positive rate) against the false positive rate. In other words, the ROC curve plots two parameters:
 True positive rate
 False positive rate
The ROC curve can also be defined as a graphical representation of the performance or behavior of a classification model at all possible threshold levels, and it is used primarily for binary classification in machine learning. To work with the ROC curve we first need to be familiar with the terms specificity and sensitivity.
 Specificity: The proportion of negative instances that are correctly predicted as negative; in other words, the true negative rate. The false positive rate can be obtained from the specificity by subtracting it from one (FPR = 1 − specificity).
 Sensitivity: The proportion of positive instances that are correctly predicted as positive, i.e. the true positive rate (TPR). Sensitivity is also called recall, and the terms are often used interchangeably. The formula for TPR is:
TPR = TP / (TP + FN)
where TPR = true positive rate, TP = true positives, FN = false negatives.
False positive rate: Conversely, the false positive rate is the proportion of negative instances that are incorrectly predicted as positive. In other terms, the false positive rate can also be written as 1 − specificity.
FPR = FP / (FP + TN)
where FPR = false positive rate, FP = false positives, TN = true negatives.
The ROC curve is often compared with the precision-recall curve, but it is different because it plots the true positive rate (which is also called recall) against the false positive rate. The curve is drawn by computing the values of TPR and FPR at many distinct threshold values applied to the model's scores rather than to hard class labels; for a probabilistic classifier, the predicted probability of the positive class is used as the score.
Types of ROC Curve
There are two types of ROC Curves:
 Parametric ROC Curve: The parametric method plots the curve from an assumed score distribution fitted by maximum likelihood estimation. This type of ROC curve is smooth and provides every sensitivity/specificity pair, but it has drawbacks: information in the actual data can be discarded, and the computation is more complex.
 Non-Parametric ROC Curve: The non-parametric (empirical) method does not need any assumptions about the data distribution. It gives unbiased estimates, the plot passes through all the observed data points, and the computation is simple. A sketch contrasting the two approaches follows this list.
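To make the distinction concrete, here is a minimal sketch (not part of the original text) that contrasts the two approaches on made-up scores: the non-parametric curve is the empirical ROC from sklearn.metrics.roc_curve, while the parametric curve assumes the scores in each class are normally distributed (the "binormal" model) and is drawn from maximum-likelihood normal fits.
Python3
import numpy as np
from scipy.stats import norm
from sklearn.metrics import roc_curve

# Made-up scores: negatives centred at 0, positives centred at 1.5
rng = np.random.default_rng(0)
neg_scores = rng.normal(loc=0.0, scale=1.0, size=200)
pos_scores = rng.normal(loc=1.5, scale=1.0, size=200)
y_true = np.r_[np.zeros(200), np.ones(200)]
y_score = np.r_[neg_scores, pos_scores]

# Non-parametric (empirical) ROC: passes through the observed points
fpr_emp, tpr_emp, _ = roc_curve(y_true, y_score)

# Parametric (binormal) ROC: fit a normal to each class, then sweep thresholds
mu0, sd0 = norm.fit(neg_scores)
mu1, sd1 = norm.fit(pos_scores)
thresholds = np.linspace(y_score.min(), y_score.max(), 200)
fpr_par = norm.sf(thresholds, mu0, sd0)   # P(score > t | negative)
tpr_par = norm.sf(thresholds, mu1, sd1)   # P(score > t | positive)
Plotting (fpr_emp, tpr_emp) and (fpr_par, tpr_par) with matplotlib would show a jagged empirical curve alongside a smooth parametric one.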
What is the AUC-ROC curve?
The AUC-ROC curve, or Area Under the Receiver
Operating Characteristic curve, is a graphical
representation of the performance of a binary
classification model at various classification
thresholds. It is commonly used in machine
learning to assess the ability of a model to
distinguish between two classes, typically the
positive class (e.g., presence of a disease) and the
negative class (e.g., absence of a disease).
Let’s first understand the meaning of the two
terms ROC and AUC.
 ROC: Receiver Operating Characteristics
 AUC: Area Under Curve
Receiver Operating Characteristics (ROC) Curve
ROC stands for Receiver Operating Characteristics,
and the ROC curve is the graphical representation
of the effectiveness of the binary classification
model. It plots the true positive rate (TPR) vs the
false positive rate (FPR) at different classification
thresholds.
Area Under the Curve (AUC):
AUC stands for Area Under the Curve and, here, refers to the area under the ROC curve. It measures the overall performance of the binary classification model. Because both TPR and FPR range between 0 and 1, the area always lies between 0 and 1, and a greater AUC value denotes better model performance. Our main goal is to maximize this area so that the model achieves the highest TPR and lowest FPR at a given threshold.
The AUC measures the probability that the model
will assign a randomly chosen positive instance a
higher predicted probability compared to a
randomly chosen negative instance.
It represents the probability with which our
model can distinguish between the two classes
present in our target.

[Figure: ROC-AUC classification evaluation metric]
Key terms used in AUC and ROC Curve
1. TPR and FPR
This is the most common definition you will encounter when you search for AUC-ROC. The ROC curve is a graph that shows the performance of a classification model at all possible thresholds (a threshold is the value beyond which a point is assigned to a particular class). The curve is plotted between two parameters:
 TPR – True Positive Rate
 FPR – False Positive Rate
Before looking at TPR and FPR in detail, let us quickly review the confusion matrix.
Confusion Matrix for a Classification Task
 True Positive: Actual Positive and Predicted as
Positive
 True Negative: Actual Negative and Predicted
as Negative
 False Positive (Type I Error): Actual Negative but predicted as Positive
 False Negative (Type II Error): Actual Positive but predicted as Negative
In simple terms, you can call False Positive a false
alarm and False Negative a miss. Now let us look
at what TPR and FPR are.
2. Sensitivity / True Positive Rate / Recall
Basically, TPR/Recall/Sensitivity is the ratio of
positive examples that are correctly identified. It
represents the ability of the model to correctly
identify positive instances and is calculated as
follows:
TPR = TP / (TP + FN)
Sensitivity/Recall/TPR measures the proportion of
actual positive instances that are correctly
identified by the model as positive.
3. False Positive Rate
FPR is the ratio of negative examples that are
incorrectly classified.
FPR = FP / (FP + TN)
4. Specificity
Specificity measures the proportion of actual
negative instances that are correctly identified by
the model as negative. It represents the ability of
the model to correctly identify negative instances
Specificity = TN / (TN + FP) = 1 − FPR
And as said earlier ROC is nothing but the plot
between TPR and FPR across all possible
thresholds and AUC is the entire area beneath this
ROC curve.
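As a quick illustration of these formulas, here is a minimal sketch (with small made-up label vectors) that reads TP, FP, TN and FN off a confusion matrix and derives TPR, FPR and specificity from them.
Python3
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]       # actual classes
y_pred = [1, 0, 1, 0, 1, 1, 0, 0, 1, 1]       # hard predictions at some threshold

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)            # sensitivity / recall
fpr = fp / (fp + tn)            # false positive rate
specificity = tn / (tn + fp)    # equals 1 - FPR

print(f"TPR={tpr:.2f}, FPR={fpr:.2f}, specificity={specificity:.2f}")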
[Figure: sensitivity versus false positive rate plot]
Relationship between Sensitivity, Specificity, FPR,
and Threshold.
Sensitivity and Specificity:
 Inverse Relationship: sensitivity and
specificity have an inverse relationship. When
one increases, the other tends to decrease.
This reflects the inherent trade-off between
true positive and true negative rates.
 Tuning via Threshold: By adjusting the threshold value, we can control the balance between sensitivity and specificity. Lower thresholds lead to higher sensitivity (more true positives) at the expense of specificity (more false positives). Conversely, raising the threshold boosts specificity (fewer false positives) but sacrifices sensitivity (more false negatives); see the sketch after this list.

Threshold and False Positive Rate (FPR):


 FPR and Specificity Connection: False Positive
Rate (FPR) is simply the complement of
specificity (FPR = 1 – specificity). This signifies
the direct relationship between them: higher
specificity translates to lower FPR, and vice
versa.
 FPR Changes with TPR: The True Positive Rate (TPR) and FPR are also linked through the threshold. Lowering the threshold increases TPR (more true positives) but generally also raises FPR (more false positives); raising the threshold lowers both (fewer true positives, fewer false positives).
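The trade-off described above can be seen directly by sweeping a threshold over some made-up predicted probabilities: as the threshold drops, sensitivity rises and specificity falls. This is only an illustrative sketch; the labels and probabilities are invented.
Python3
import numpy as np

y_true = np.array([1, 0, 1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.8, 0.3, 0.6, 0.2, 0.7, 0.9, 0.4, 0.1, 0.75, 0.55])

for threshold in (0.7, 0.5, 0.3):
    y_pred = (y_prob > threshold).astype(int)   # predict positive above the threshold
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    print(f"threshold={threshold:.1f}  sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")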
How does AUC-ROC work?
We have looked at the geometric interpretation, but that is still not quite enough to build intuition about what an AUC of 0.75 actually means, so let us now look at AUC-ROC from a probabilistic point of view. Let us first state what AUC does and then build our understanding on top of that.
AUC measures how well a model is able to distinguish between classes.
An AUC of 0.75 means that if we take two data points belonging to different classes, there is a 75% chance the model will be able to segregate them, or rank-order them correctly, i.e. the positive point has a higher predicted probability than the negative point (assuming a higher predicted probability means the point ideally belongs to the positive class). Here is a small example to make things clearer.

Index   Class   Probability
P1      1       0.95
P2      1       0.90
P3      0       0.85
P4      0       0.81
P5      1       0.78
P6      0       0.70

Here we have six points, where P1, P2, and P5 belong to class 1 and P3, P4, and P6 belong to class 0, with their corresponding predicted probabilities in the Probability column. As stated above, if we take two points belonging to different classes, what is the probability that the model rank-orders them correctly? We take all possible pairs in which one point belongs to class 1 and the other to class 0; there are 9 such pairs in total, and all of them are listed below.

Pair        isCorrect
(P1, P3)    Yes
(P1, P4)    Yes
(P1, P6)    Yes
(P2, P3)    Yes
(P2, P4)    Yes
(P2, P6)    Yes
(P3, P5)    No
(P4, P5)    No
(P5, P6)    Yes

Here the isCorrect column tells whether the pair is correctly rank-ordered based on the predicted probabilities, i.e. whether the class 1 point has a higher probability than the class 0 point. In 7 out of these 9 possible pairs, class 1 is ranked higher than class 0, so there is roughly a 78% chance (7/9) that if you pick a pair of points belonging to different classes, the model will be able to distinguish them correctly.
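The pair-counting argument can be checked in a few lines: count the class-1/class-0 pairs that are ranked correctly and compare the fraction with scikit-learn's AUC. The labels and probabilities below are the six made-up points from the table.
Python3
from itertools import product
from sklearn.metrics import roc_auc_score

labels = [1, 1, 0, 0, 1, 0]                   # P1..P6
probs = [0.95, 0.90, 0.85, 0.81, 0.78, 0.70]

pos = [p for y, p in zip(labels, probs) if y == 1]
neg = [p for y, p in zip(labels, probs) if y == 0]

# Count pairs where the class 1 point outranks the class 0 point
correct = sum(p > n for p, n in product(pos, neg))
total = len(pos) * len(neg)

print(f"{correct} of {total} pairs ranked correctly -> {correct / total:.3f}")  # 7 of 9 -> 0.778
print("roc_auc_score:", roc_auc_score(labels, probs))                           # same value, 0.778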
EXAMPLE

Let’s consider an example to illustrate how ROC curves are generated for different thresholds and how a particular threshold corresponds to a confusion matrix. Suppose we have a binary classification problem with a model predicting whether an email is spam (positive) or not spam (negative).
Consider the following hypothetical data, where a predicted probability strictly greater than the threshold is classified as positive (1) and anything else as negative (0):
True Labels: [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
Predicted Probabilities: [0.8, 0.3, 0.6, 0.2, 0.7, 0.9,
0.4, 0.1, 0.75, 0.55]
Case 1: Threshold = 0.5

True Labels   Predicted Probabilities   Predicted Labels (Threshold = 0.5)
1             0.8                       1
0             0.3                       0
1             0.6                       1
0             0.2                       0
1             0.7                       1
1             0.9                       1
0             0.4                       0
0             0.1                       0
1             0.75                      1
0             0.55                      1

Confusion matrix based on the above predictions:

             Prediction = 0   Prediction = 1
Actual = 0   TN = 4           FP = 1
Actual = 1   FN = 0           TP = 5

Accordingly,
 True Positive Rate (TPR):
The proportion of actual positives correctly identified by the classifier is
TPR = TP / (TP + FN) = 5 / (5 + 0) = 1.0
 False Positive Rate (FPR):
The proportion of actual negatives incorrectly classified as positives is
FPR = FP / (FP + TN) = 1 / (1 + 4) = 0.2
So, at a threshold of 0.5:
 True Positive Rate (Sensitivity): 1.0
 False Positive Rate: 0.2
The interpretation is that, at this threshold, the model correctly identifies 100% of actual positives (TPR) while incorrectly classifying 20% of actual negatives as positives (FPR).
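The Case 1 numbers can be verified with a short sketch, using the same hypothetical data and the "probability strictly greater than the threshold means positive" convention used in the table above.
Python3
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.8, 0.3, 0.6, 0.2, 0.7, 0.9, 0.4, 0.1, 0.75, 0.55]

y_pred = (np.array(y_prob) > 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(tp, fn, fp, tn)                 # 5 0 1 4
print("TPR:", tp / (tp + fn))         # 1.0
print("FPR:", fp / (fp + tn))         # 0.2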
Proceeding in the same way for the other thresholds, we get:
Case 2: Threshold = 0.7

True Labels   Predicted Probabilities   Predicted Labels (Threshold = 0.7)
1             0.8                       1
0             0.3                       0
1             0.6                       0
0             0.2                       0
1             0.7                       0
1             0.9                       1
0             0.4                       0
0             0.1                       0
1             0.75                      1
0             0.55                      0

Confusion matrix based on the above predictions:

             Prediction = 0   Prediction = 1
Actual = 0   TN = 5           FP = 0
Actual = 1   FN = 2           TP = 3

Accordingly,
 True Positive Rate (TPR):
The proportion of actual positives correctly identified by the classifier is
TPR = TP / (TP + FN) = 3 / (3 + 2) = 0.6
 False Positive Rate (FPR):
The proportion of actual negatives incorrectly classified as positives is
FPR = FP / (FP + TN) = 0 / (0 + 5) = 0
Case 3: Threshold = 0.4

True Labels   Predicted Probabilities   Predicted Labels (Threshold = 0.4)
1             0.8                       1
0             0.3                       0
1             0.6                       1
0             0.2                       0
1             0.7                       1
1             0.9                       1
0             0.4                       0
0             0.1                       0
1             0.75                      1
0             0.55                      1

Confusion matrix based on the above predictions:

             Prediction = 0   Prediction = 1
Actual = 0   TN = 4           FP = 1
Actual = 1   FN = 0           TP = 5

Accordingly,
 True Positive Rate (TPR):
The proportion of actual positives correctly identified by the classifier is
TPR = TP / (TP + FN) = 5 / (5 + 0) = 1.0
 False Positive Rate (FPR):
The proportion of actual negatives incorrectly classified as positives is
FPR = FP / (FP + TN) = 1 / (1 + 4) = 0.2
Case 4: Threshold = 0.2

True Labels   Predicted Probabilities   Predicted Labels (Threshold = 0.2)
1             0.8                       1
0             0.3                       1
1             0.6                       1
0             0.2                       0
1             0.7                       1
1             0.9                       1
0             0.4                       1
0             0.1                       0
1             0.75                      1
0             0.55                      1

Confusion matrix based on the above predictions:

             Prediction = 0   Prediction = 1
Actual = 0   TN = 2           FP = 3
Actual = 1   FN = 0           TP = 5

Accordingly,
 True Positive Rate (TPR):
The proportion of actual positives correctly identified by the classifier is
TPR = TP / (TP + FN) = 5 / (5 + 0) = 1.0
 False Positive Rate (FPR):
The proportion of actual negatives incorrectly classified as positives is
FPR = FP / (FP + TN) = 3 / (3 + 2) = 0.6
Case 5: Threshold = 0.85
True Labels   Predicted Probabilities   Predicted Labels (Threshold = 0.85)
1             0.8                       0
0             0.3                       0
1             0.6                       0
0             0.2                       0
1             0.7                       0
1             0.9                       1
0             0.4                       0
0             0.1                       0
1             0.75                      0
0             0.55                      0

Confusion matrix based on the above predictions:

             Prediction = 0   Prediction = 1
Actual = 0   TN = 5           FP = 0
Actual = 1   FN = 4           TP = 1

Accordingly,
 True Positive Rate (TPR):
The proportion of actual positives correctly identified by the classifier is
TPR = TP / (TP + FN) = 1 / (1 + 4) = 0.2
 False Positive Rate (FPR):
The proportion of actual negatives incorrectly classified as positives is
FPR = FP / (FP + TN) = 0 / (0 + 5) = 0
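Collecting the five cases above in one loop traces out the (FPR, TPR) points of the ROC curve for this hypothetical data. The sketch below simply automates the hand calculations.
Python3
import numpy as np

y_true = np.array([1, 0, 1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.8, 0.3, 0.6, 0.2, 0.7, 0.9, 0.4, 0.1, 0.75, 0.55])

for threshold in (0.85, 0.7, 0.5, 0.4, 0.2):
    y_pred = (y_prob > threshold).astype(int)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    print(f"threshold={threshold:.2f}  FPR={fp / (fp + tn):.1f}  TPR={tp / (tp + fn):.1f}")
Plotting these (FPR, TPR) pairs, together with the end points (0, 0) and (1, 1), gives the ROC curve for this model.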

Receiver Operating Characteristic (ROC)


Example of Receiver Operating Characteristic
(ROC) metric to evaluate classifier output quality.
ROC curves typically feature true positive rate on
the Y axis, and false positive rate on the X axis.
This means that the top left corner of the plot is
the “ideal” point - a false positive rate of zero, and
a true positive rate of one. This is not very
realistic, but it does mean that a larger area under
the curve (AUC) is usually better.
The “steepness” of ROC curves is also important,
since it is ideal to maximize the true positive rate
while minimizing the false positive rate.
ROC curves are typically used in binary
classification to study the output of a classifier. In
order to extend ROC curve and ROC area to multi-
class or multi-label classification, it is necessary to
binarize the output. One ROC curve can be drawn
per label, but one can also draw a ROC curve by
considering each element of the label indicator
matrix as a binary prediction (micro-averaging).
Note
See also sklearn.metrics.roc_auc_score and the scikit-learn example "Receiver Operating Characteristic (ROC) with cross validation".

PROGRAM
Step 1: Importing the required libraries
In scikit-learn, the roc_curve function is used to compute the points of the Receiver Operating Characteristic (ROC) curve, while the auc function calculates the Area Under the Curve (AUC) from those points.
AUC is a scalar value representing the area under the ROC curve, quantifying the classifier's ability to distinguish between positive and negative examples across all possible classification thresholds.
Python3
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc

Step 2: Loading the dataset

Python3
# Load the breast cancer dataset and separate the features (X) and target variable (y)
data = load_breast_cancer()
X = data.data
y = data.target

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

Step 3: Training and testing the model


Python3
# Train a logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)
# Predict probabilities on the test set
y_pred_proba = model.predict_proba(X_test)[:, 1]

Step 4: Plot the ROC Curve


 The roc_curve function is used to calculate the
False Positive Rates (FPR), True Positive Rates
(TPR), and corresponding thresholds with true
labels and the predicted probabilities of
belonging to the positive class as inputs.
 plt.plot([0, 1], [0, 1], 'k--', label='No Skill') is
used to plot a diagonal dashed line
representing a classifier with no discriminative
power (random guessing).
Python3
# Calculate the ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
roc_auc = auc(fpr, tpr)

# Plot the ROC curve
plt.figure()
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--', label='No Skill')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve for Breast Cancer Classification')
plt.legend()
plt.show()
Output:
The plotted curve is the ROC curve for the classifier, while the dashed diagonal line is the no-skill baseline. An AUC of 1.00 signifies essentially perfect classification, meaning the model can distinguish malignant from benign tumors almost flawlessly at any threshold.
What Does an Ideal Curve Look Like?
An ideal ROC curve would be as close as possible
to the upper left corner of the plot, indicating high
TPR (correctly identifying true positives) with low
FPR (incorrectly identifying false positives). The
closer the curve is to the diagonal baseline, the
worse the classifier's performance.
The AUC score provides a quantitative measure of
the classifier's performance, with a value of 1
indicating perfect classification and a value of 0.5
indicating no better than random guessing.
Advantages of ROC Curve
 Threshold-independent: The ROC Curve
provides an all-inclusive view of a model's
performance across distinct classification
thresholds, and they are threshold-
independent.
 Performance Comparison: The ROC Curve is
not dependent on the class imbalance in our
data, and it helps to compare the
performances of various models on the same
data sets.
 Clear and precise: The ROC Curve gives a
detailed visualization for distinguishing
between normal and abnormal test results.
 Visual representation: It also shows the
sensitivity and specificity at all threshold
values, so the data does not need to be
grouped to plot the graph.
Disadvantages of ROC Curve
 Can be perplexing: The ROC curve can be confusing because it is based on binary classification, where there are only two possible outcomes (yes/no, present/absent), and similar-looking curves can arise from quite different data.
 Not smooth: For small sample sizes the ROC curve may not be smooth and can appear jagged.
 Can be deceptive: The ROC curve can sometimes be misleading, and it cannot be used directly in more complex situations where more than two classes are involved.
 Suitability: The ROC curve is suitable for binary classification but not, without modification, for multiclass classification tasks, and it can paint an overly optimistic picture when the classes are highly imbalanced.
Precision-Recall
Example of Precision-Recall metric to evaluate
classifier output quality.
Precision-Recall is a useful measure of success of
prediction when the classes are very imbalanced.
In information retrieval, precision is a measure of
the fraction of relevant items among actually
returned items while recall is a measure of the
fraction of items that were returned among all
items that should have been returned. ‘Relevancy’
here refers to items that are positively labeled, i.e., true positives and false negatives.
Precision (P) is defined as the number of true positives (Tp) over the number of true positives plus the number of false positives (Fp):
P = Tp / (Tp + Fp)
Recall (R) is defined as the number of true positives (Tp) over the number of true positives plus the number of false negatives (Fn):
R = Tp / (Tp + Fn)
The precision-recall curve shows the tradeoff
between precision and recall for different
thresholds. A high area under the curve
represents both high recall and high precision.
High precision is achieved by having few false
positives in the returned results, and high recall is
achieved by having few false negatives in the
relevant results. High scores for both show that
the classifier is returning accurate results (high
precision), as well as returning a majority of all
relevant results (high recall).
A system with high recall but low precision returns
most of the relevant items, but the proportion of
returned results that are incorrectly labeled is
high. A system with high precision but low recall is
just the opposite, returning very few of the
relevant items, but most of its predicted labels are
correct when compared to the actual labels. An
ideal system with high precision and high recall
will return most of the relevant items, with most
results labeled correctly.
The definition of precision (Tp / (Tp + Fp)) shows that
lowering the threshold of a classifier may increase
the denominator, by increasing the number of
results returned. If the threshold was previously
set too high, the new results may all be true
positives, which will increase precision. If the
previous threshold was about right or too low,
further lowering the threshold will introduce false
positives, decreasing precision.
Recall is defined as Tp / (Tp + Fn), where Tp + Fn does
not depend on the classifier threshold. Changing
the classifier threshold can only change the
numerator, Tp. Lowering the classifier threshold
may increase recall, by increasing the number of
true positive results. It is also possible that
lowering the threshold may leave recall
unchanged, while the precision fluctuates. Thus,
precision does not necessarily decrease with
recall.
The relationship between recall and precision can
be observed in the stairstep area of the plot - at
the edges of these steps a small change in the
threshold considerably reduces precision, with
only a minor gain in recall.
Average precision (AP) summarizes such a plot as
the weighted mean of precisions achieved at each
threshold, with the increase in recall from the
previous threshold used as the weight:
AP = Σn (Rn − Rn−1) Pn
where Pn and Rn are the precision and recall at the nth threshold. A pair (Rk, Pk) is referred to as an operating point.
AP and the trapezoidal area under the operating
points (sklearn.metrics.auc) are common ways to
summarize a precision-recall curve that lead to
different results.
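The difference between the two summaries can be seen with a few made-up labels and scores: average_precision_score implements the weighted-mean definition of AP given above, while auc applied to the precision-recall points gives the trapezoidal area. The two values are generally close but not identical.
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

y_true = [0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.5, 0.9, 0.3, 0.75, 0.2, 0.6]

precision, recall, _ = precision_recall_curve(y_true, y_score)

print("Average precision (AP):", average_precision_score(y_true, y_score))
print("Trapezoidal PR area:   ", auc(recall, precision))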
Precision-recall curves are typically used in binary
classification to study the output of a classifier. In
order to extend the precision-recall curve and
average precision to multi-class or multi-label
classification, it is necessary to binarize the
output.

PROGRAM
Step 1: Import Packages

First, we’ll import the necessary packages:


from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt
Step 2: Fit the Logistic Regression Model
Next, we’ll create a dataset and fit a logistic
regression model to it:
#create dataset with 4 predictor variables
X, y = datasets.make_classification(n_samples=1000,
                                    n_features=4,
                                    n_informative=3,
                                    n_redundant=1,
                                    random_state=0)

#split dataset into training and testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state=0)

#fit logistic regression model to dataset
classifier = LogisticRegression()
classifier.fit(X_train, y_train)

#use logistic regression model to make predictions
y_score = classifier.predict_proba(X_test)[:, 1]
Step 3: Create the Precision-Recall Curve
Next, we’ll calculate the precision and recall of the
model and create a precision-recall curve:
#calculate precision and recall
precision, recall, thresholds = precision_recall_curve(y_test, y_score)

#create precision recall curve
fig, ax = plt.subplots()
ax.plot(recall, precision, color='purple')

#add axis labels to plot
ax.set_title('Precision-Recall Curve')
ax.set_ylabel('Precision')
ax.set_xlabel('Recall')

#display plot
plt.show()
Additional information
Machine learning models are the mathematical engines that drive Artificial Intelligence and thus are highly vital for successful AI implementation. In fact, you could say that your AI is only as good as the machine learning models that drive it.

So, now convinced of the importance of a good machine learning model, you apply yourself to the task, and after some hard work, you finally create what you believe to be a great machine learning model. Congratulations!

But wait. How can you tell if your machine learning model is as good as you believe it is? Clearly, you need an objective means of measuring your machine learning model's performance and determining whether it is good enough for implementation. This is where the ROC curve comes in.

This article has everything you need to know about ROC curves. We will define ROC
curves and the term “area under the ROC curve,” how to use ROC curves in
performance modeling, and a wealth of other valuable information. We begin with
some definitions.

What Is a ROC Curve?

A ROC (which stands for "receiver operating characteristic") curve is a graph that shows a classification model's performance at all classification thresholds. It is a probability curve that plots two parameters, the True Positive Rate (TPR) against the False Positive Rate (FPR), at different threshold values, and it separates the so-called 'signal' from the 'noise.'

The ROC curve plots the True Positive Rate against the False Positive Rate at different classification thresholds. If the user lowers the classification threshold, more items get classified as positive, which increases both the False Positives and True Positives.

ROC Curve

An ROC (Receiver Operating Characteristic) curve is a graphical representation used to evaluate the performance of a binary classifier. It plots two key metrics:

1. True Positive Rate (TPR): Also known as sensitivity or recall, it measures the proportion of actual positives correctly identified by the model. It is calculated as:
TPR = True Positives (TP) / (True Positives (TP) + False Negatives (FN))
2. False Positive Rate (FPR): This measures the proportion of actual negatives incorrectly identified as positives by the model. It is calculated as:
FPR = False Positives (FP) / (False Positives (FP) + True Negatives (TN))

The ROC curve plots TPR (y-axis) against FPR (x-axis) at various threshold settings. Here's a
more detailed explanation of these metrics:

 True Positive (TP): The instance is positive, and the model correctly classifies it as positive.

 False Positive (FP): The instance is negative, but the model incorrectly classifies it as positive.

 True Negative (TN): The instance is negative, and the model correctly classifies it as negative.

 False Negative (FN): The instance is positive, but the model incorrectly classifies it as negative.

Interpreting the ROC Curve

 A curve closer to the top left corner indicates a better-performing model.

 The diagonal line (from (0,0) to (1,1)) represents a random classifier.

 The area under the ROC curve (AUC) is a single scalar value that measures the model's overall
performance. It ranges from 0 to 1, and a higher AUC indicates a better-performing model.

Area Under the ROC Curve (AUC)

The Area Under the ROC Curve (AUC) is a single scalar value that summarizes the overall
performance of a binary classification model. It measures the ability of the model to
distinguish between the positive and negative classes. Here's what you need to know about
AUC:

Key Points About AUC

Range of AUC:

 The AUC value ranges from 0 to 1.

 An AUC of 0.5 indicates a model that performs no better than random chance.

 An AUC closer to 1 indicates a model with excellent performance.

Interpretation of AUC Values:

 0.9 - 1.0: Excellent


 0.8 - 0.9: Good

 0.7 - 0.8: Fair

 0.6 - 0.7: Poor

 0.5 - 0.6: Fail

Advantages of Using AUC:

 Threshold Independent: AUC evaluates the model's performance across all possible classification
thresholds.

 Scale Invariant: AUC measures how well the predictions are ranked rather than their absolute
values.

Calculation of AUC:

 AUC is typically calculated using numerical integration methods, such as the trapezoidal rule,
applied to the ROC curve.

 In practical terms, libraries like Scikit-learn in Python provide functions to compute AUC directly from model predictions and true labels, as the short sketch below illustrates.
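For example, with made-up labels and scores, roc_auc_score computes AUC directly, and the same value is obtained by applying the trapezoidal auc function to the ROC curve points. This is only an illustrative sketch.
from sklearn.metrics import auc, roc_auc_score, roc_curve

y_true = [0, 1, 1, 0, 1, 0, 1, 0, 1, 0]
y_score = [0.2, 0.9, 0.6, 0.3, 0.8, 0.4, 0.7, 0.1, 0.35, 0.45]

fpr, tpr, _ = roc_curve(y_true, y_score)

print(roc_auc_score(y_true, y_score))   # direct computation
print(auc(fpr, tpr))                    # trapezoidal rule over the ROC points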


Key Terms Used in AUC and ROC Curve

1. True Positive (TP)

 Definition: The number of positive instances correctly identified by the model.

 Example: In a medical test, a TP is when the test correctly identifies a person with a disease as
having the disease.

2. True Negative (TN)

 Definition: The number of negative instances correctly identified by the model.

 Example: In a spam filter, a TN is when a legitimate email is correctly identified as not spam.

3. False Positive (FP)

 Definition: The number of negative instances incorrectly identified as positive by the model.

 Example: In a fraud detection system, an FP is when a legitimate transaction is incorrectly flagged as fraudulent.

4. False Negative (FN)

 Definition: The number of positive instances incorrectly identified as negative by the model.
 Example: In a cancer screening test, an FN is when a person with cancer is incorrectly identified as
not having cancer.

5. True Positive Rate (TPR)

 Definition: Also known as sensitivity or recall, it measures the proportion of actual positives that
are correctly identified by the model.

 Formula: TPR = TP / (TP + FN)

 Example: A TPR of 0.8 means 80% of actual positive cases are correctly identified.

6. False Positive Rate (FPR)

 Definition: It measures the proportion of actual negatives that are incorrectly identified as
positive by the model.

 Formula: FPR = FP / (FP + TN)

 Example: An FPR of 0.1 means 10% of actual negative cases are incorrectly identified.

7. Threshold

 Definition: The value at which the model's prediction is converted into a binary classification. By
adjusting the threshold, different TPR and FPR values can be obtained.

 Example: In a binary classification problem, a threshold of 0.5 might mean that predicted
probabilities above 0.5 are classified as positive.

8. ROC Curve

 Definition: A graphical plot that illustrates the diagnostic ability of a binary classifier as its
discrimination threshold is varied. It plots TPR against FPR at various threshold settings.

 Example: An ROC curve close to the top left corner indicates a better-performing model.

9. Area Under the Curve (AUC)

 Definition: A single scalar value that summarizes the overall performance of a binary classifier
across all possible thresholds. It is the area under the ROC curve.

 Range: 0 to 1, where 1 indicates perfect performance and 0.5 indicates no better than random
guessing.

 Example: An AUC of 0.9 indicates excellent performance.


10. Precision

 Definition: The proportion of positive identifications that are actually correct.

 Formula: Precision = TP / (TP + FP)

 Example: A precision of 0.75 means 75% of the instances classified as positive are actually
positive.

11. Recall (Sensitivity)

 Definition: Another term for True Positive Rate (TPR), measuring the proportion of actual
positives correctly identified.

 Example: A recall of 0.8 means 80% of actual positive cases are correctly identified.

12. Specificity

 Definition: The proportion of actual negatives correctly identified by the model.

 Formula: Specificity = TN / (TN + FP)

 Example: A specificity of 0.9 means 90% of actual negative cases are correctly identified.

What Is a ROC Curve: How Do You Evaluate Model Performance?

AUC is a valuable tool for assessing model performance. An excellent model has an AUC close to 1, indicating a good measure of separability. Conversely, a poor model's AUC leans closer to 0, showing the worst measure of separability; in fact, an AUC near 0 means the model is reciprocating the result, predicting the negative class as positive and vice versa, showing 0s as 1s and 1s as 0s. Finally, an AUC of 0.5 shows that the model has no class separation capacity at all.

So, when we have a result of 0.5 < AUC < 1, there is a high likelihood that the classifier can distinguish the positive class values from the negative class values. That is because the classifier detects more True Positives and True Negatives than False Negatives and False Positives.
The Relation Between Sensitivity, Specificity, FPR, and
Threshold

Before we examine the relation between Specificity, FPR, Sensitivity, and Threshold, we
should first cover their definitions in the context of machine learning models. For that, we'll
need a confusion matrix to help us to understand the terms better. Here is an example of
a confusion matrix:

[Figure: example confusion matrix]

TP stands for True Positive, and TN means True Negative. FP stands for False Positive, and
FN means False Negative.

 Sensitivity: Sensitivity, also termed "recall," is the metric that shows a model's ability to predict
the true positives of all available categories. It shows what proportion of the positive class was
classified correctly. For example, when trying to figure out how many people have the flu,
sensitivity, or True Positive Rate, measures the proportion of people who have the flu and were
correctly predicted as having it.

Here’s how to mathematically calculate sensitivity:

Sensitivity = (True Positive)/(True Positive + False Negative)

 Specificity: The specificity metric evaluates a model's ability to predict true negatives of all available categories. It shows what proportion of the negative class was classified correctly. For example, in our flu scenario, specificity measures the proportion of people who don't have the flu and were correctly predicted as not suffering from it.

Here’s how to calculate specificity:

Specificity = (True Negative)/(True Negative + False Positive)

 FPR: FPR stands for False Positive Rate and shows what proportion of the negative class was
incorrectly classified. This formula shows how we calculate FPR:

FPR= 1 – Specificity

 Threshold: The threshold is the specified cut-off point for an observation to be classified as either 0 or 1. Typically, 0.5 is used as the default threshold, although this is not always the case.

Sensitivity and specificity are inversely proportional, so if we boost sensitivity, specificity drops, and vice versa. Furthermore, we net more positive values when we decrease the threshold, thereby raising the sensitivity and lowering the specificity.

On the other hand, if we boost the threshold, we will get more negative values, which results
in higher specificity and lower sensitivity.

And since the FPR is 1 – specificity, when we increase TPR, the FPR also increases and vice
versa.

How AUC-ROC Works

The AUC-ROC (Area Under the Curve - Receiver Operating Characteristic) is a performance
measurement for classification problems at various threshold settings. Here's how it works:

Threshold Variation:

 The ROC curve is generated by plotting the True Positive Rate (TPR) against the False Positive Rate
(FPR) at various threshold levels.

 By varying the threshold, different pairs of TPR and FPR values are obtained.

Plotting the ROC Curve:

 True Positive Rate (TPR), also known as Sensitivity or Recall, is plotted on the y-axis. It is the ratio
of true positives to the sum of true positives and false negatives.
 False Positive Rate (FPR) is plotted on the x-axis. It is the ratio of false positives to the sum of false
positives and true negatives.

Calculating AUC:

 The area under the ROC curve (AUC) quantifies the overall ability of the model to discriminate
between positive and negative classes.

 An AUC value ranges from 0 to 1. A value of 0.5 suggests no discrimination (random performance), while a value closer to 1 indicates excellent model performance.

When to Use the AUC-ROC Evaluation Metric?

The AUC-ROC metric is particularly useful in the following scenarios:

1. Binary Classification Problems: It is primarily used for binary classification tasks with only two
classes.

2. Imbalanced Datasets: AUC-ROC is beneficial when dealing with imbalanced datasets, providing an
aggregate performance measure across all possible classification thresholds.

3. Model Comparison: It is useful for comparing the performance of different models; a higher AUC value indicates a better-performing model (a short sketch follows this list).

4. Threshold-Independent Evaluation: When you need a performance metric that does not depend
on selecting a specific classification threshold.
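As an illustration of point 3 above, the sketch below trains two different models on the same synthetic dataset and compares their AUC values; the dataset and model choices are arbitrary, purely for illustration.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (LogisticRegression(max_iter=1000), DecisionTreeClassifier(random_state=0)):
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]   # probability of the positive class
    print(type(model).__name__, roc_auc_score(y_test, scores))
The model with the higher AUC separates the two classes better on this particular test set.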

Understanding the AUC-ROC Curve

1. ROC Curve Interpretation

 Closer to Top Left Corner: A curve that hugs the top left corner indicates a high-performing model
with high TPR and low FPR.

 Diagonal Line: A curve along the diagonal line (from (0,0) to (1,1)) indicates a model with no
discrimination capability, equivalent to random guessing.

2. AUC Value Interpretation

 0.9 - 1.0: Excellent discrimination capability.


 0.8 - 0.9: Good discrimination capability.

 0.7 - 0.8: Fair discrimination capability.

 0.6 - 0.7: Poor discrimination capability.

 0.5 - 0.6: Fail; the model performs little better than random guessing.

How to Use the AUC - ROC Curve for the Multi-Class Model

We can use the One vs. All methodology to plot N AUC-ROC curves for N classes when working with a multi-class model. One vs. All gives us a way to leverage binary classification: if you have a classification problem with N possible solutions, One vs. All provides one binary classifier for each possible outcome.

So, for example, you have three classes named 0, 1, and 2. You will have one ROC for 0
that’s classified against 1 and 2, another ROC for 1, which is classified against 0 and 2, and
finally, the third one of 2 classified against 0 and 1.

We should take a moment and explain the One vs. ALL methodology to better answer the
question “what is a ROC curve?”. This methodology is made up of N separate binary
classifiers. The model runs through the binary classifier sequence during training, training
each to answer a classification question. For instance, if you have a cat picture, you can train
four different recognizers, one seeing the image as a positive example (the cat) and the other
three seeing a negative example (not the cat). It would look like this:

 Is this image a rutabaga? No

 Is this image a cat? Yes

 Is this image a dog? No

 Is this image a hammer? No

This methodology works well with a small number of total classes. However, as the number
of classes rises, the model becomes increasingly inefficient.
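Here is a minimal sketch of the One vs. All idea for a three-class problem: the labels are binarized and one ROC curve (with its AUC) is drawn per class. The synthetic dataset and logistic regression classifier are illustrative choices, not part of the original article.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_prob = clf.predict_proba(X_test)                    # one column of scores per class
y_test_bin = label_binarize(y_test, classes=[0, 1, 2])

plt.figure()
for k in range(3):
    fpr, tpr, _ = roc_curve(y_test_bin[:, k], y_prob[:, k])
    plt.plot(fpr, tpr, label=f"class {k} vs rest (AUC = {auc(fpr, tpr):.2f})")
plt.plot([0, 1], [0, 1], 'k--', label='No Skill')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.legend()
plt.show()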

Are You Interested in a Career in Machine Learning?

There’s a lot to learn about Machine Learning, as you can tell from this “what is a ROC
curve” article! However, both machine learning and artificial intelligence are the waves of the
future, so it’s worth acquiring skills and knowledge in these fields. Who knows? You could
find yourself in an exciting machine learning career!

If you want a career in machine learning, Simplilearn can help you on your way. The AI and
ML Certification offers students an in-depth overview of machine learning topics. You will
learn to develop algorithms using supervised and unsupervised learning, work with real-time
data, and learn about concepts like regression, classification, and time series modeling. You
will also learn how Python can be used to draw predictions from data. In addition, the
program features 58 hours of applied learning, interactive labs, four hands-on projects, and
mentoring.

And since machine learning and artificial intelligence work together so frequently, check out
Simplilearn’s Artificial Intelligence Engineer Master’s program, and cover all of your bases.

According to Glassdoor, Machine Learning Engineers in the United States enjoy an average
annual base pay of $133,001. Payscale.com reports that Machine Learning Engineers in India
can potentially earn ₹732,566 a year on average.

So visit Simplilearn today, and explore the rich possibilities of a rewarding vocation in the
machine learning field!

FAQs

1. What does a perfect AUC-ROC curve look like?

A perfect AUC-ROC curve reaches the top left corner of the plot, indicating a True Positive
Rate (TPR) of 1 and a False Positive Rate (FPR) of 0 for some threshold. This means the
model perfectly distinguishes between positive and negative classes, resulting in an AUC
value of 1.0.

2. What does an AUC value of 0.5 signify?

An AUC value of 0.5 signifies that the model's performance is no better than random
guessing. It indicates that the model cannot distinguish between positive and negative classes,
as the True Positive Rate (TPR) and False Positive Rate (FPR) are equal across all thresholds.
3. How do you compare ROC curves of different models?

To compare ROC curves of different models, plot each model's ROC curve on the same
graph and examine their shapes and positions. The model with the ROC curve closest to the
top left corner and the highest Area Under the Curve (AUC) value generally performs better.

4. What are some limitations of the ROC curve?

Some limitations of the ROC curve include:

 It can be less informative for highly imbalanced datasets, as the True Negative Rate (specificity)
might dominate the curve.

 It does not account for the cost of false positives and false negatives, which can be crucial in some
applications.

 Interpretation can be less intuitive compared to precision-recall curves in certain contexts.

5. What are common metrics derived from ROC curves?

Common metrics derived from ROC curves include:

 True Positive Rate (TPR): Also known as sensitivity or recall, it measures the proportion of actual
positives correctly identified.

 False Positive Rate (FPR): Measures the proportion of actual negatives incorrectly identified as
positives.

 Area Under the Curve (AUC): Summarizes the model's overall performance across all thresholds.

 Optimal Threshold: The threshold value that maximizes the TPR while minimizing the FPR.
