
LOSS FUNCTIONS

Dr. Umarani Jayaraman


What is a loss function?
 It provides a measure of how well the model is performing on the training data (and on validation data) with respect to its objective.
 In the context of deep learning and optimization, a loss function is a measure of how well a model's predictions match the true values (ground truth) of the dataset.
Various Loss Functions
• Regression Loss Functions
  • Mean Squared Error Loss
  • Mean Absolute Error Loss
  • Huber Loss
• Binary Classification Loss Functions
  • Binary Cross-Entropy
  • Hinge Loss
• Multi-class Classification Loss Functions
  • Multi-class Cross-Entropy Loss
  • Kullback–Leibler Divergence Loss
Loss Functions by Task
 1. Regression (continuous output)
   1. MSE (Mean Squared Error)
   2. MAE (Mean Absolute Error)
   3. Huber loss
 2. Classification
   1. Binary cross-entropy
   2. Categorical cross-entropy
 3. Auto-Encoder
   1. KL Divergence
 4. GAN
   1. Discriminator loss
   2. Minimax GAN loss
 5. Object detection
   1. Focal loss
 6. Word embeddings
   1. Triplet loss
Regression Loss
 Mean Squared Error / Squared loss / L2 loss
 Penalizes large errors more than small errors.
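Its standard definition, for n samples with true values y_i and predictions ŷ_i:

\[
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
\]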
Regression Loss
 Mean Absolute Error / L1 loss
 Less sensitive to outliers compared to MSE.
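Its standard definition, with the same notation as above:

\[
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
\]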
Huber Loss
 Huber Loss: a combination of MSE and MAE, useful for handling outliers.
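In its standard form, with residual a = y − ŷ and a threshold δ, Huber loss is quadratic (MSE-like) for small errors and linear (MAE-like) for large ones:

\[
L_\delta(a) =
\begin{cases}
\frac{1}{2} a^2 & \text{if } |a| \le \delta \\
\delta \left( |a| - \frac{1}{2}\delta \right) & \text{otherwise}
\end{cases}
\]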
Understanding Sensitivity to Outliers
 Since MSE squares the error, even a single large error (an outlier) can significantly increase the total loss. This happens because squaring amplifies large values more than small ones.
 Example: Effect of an Outlier
 Consider a dataset with actual values (y) and two sets of predictions (ŷ):
 • One set has only normal (small) errors.
 • The other contains an outlier.
MSE Calculation
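The slide's original numbers are not preserved, so the following is a minimal sketch with illustrative values (y, y_hat_normal, and y_hat_outlier are assumptions) showing how a single outlier inflates MSE far more than MAE:

```python
import numpy as np

# Illustrative data (assumed values, not from the original slide)
y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])              # actual values
y_hat_normal = np.array([12.0, 18.0, 33.0, 38.0, 52.0])   # small errors only
y_hat_outlier = np.array([12.0, 18.0, 33.0, 38.0, 90.0])  # last prediction is an outlier

def mse(y, y_hat):
    # Mean of squared errors: squaring amplifies large residuals
    return np.mean((y - y_hat) ** 2)

def mae(y, y_hat):
    # Mean of absolute errors: every unit of error counts equally
    return np.mean(np.abs(y - y_hat))

print("Normal  -> MSE:", mse(y, y_hat_normal), " MAE:", mae(y, y_hat_normal))
print("Outlier -> MSE:", mse(y, y_hat_outlier), " MAE:", mae(y, y_hat_outlier))
# MSE jumps from 5.0 to 324.2 (~65x), while MAE only rises from 2.2 to 9.8 (~4.5x):
# squaring lets the single outlier dominate the loss.
```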
A graph comparing MSE and MAE with outliers (figure):
 MSE (red curve) grows much faster for large errors because of squaring.
 MAE (blue curve) increases linearly, meaning it does not exaggerate large errors.
Why This Is a Problem
 MSE overreacts to outliers because the squared error makes big errors dominate the loss.
 The model may prioritize reducing a few large errors instead of improving overall performance.
Alternatives to Reduce Outlier Sensitivity
 Use MAE or Huber loss, which penalize large errors less aggressively.
Next: Classification…
Classification Loss: Binary Cross Entropy Loss
 Let us start by understanding the term 'entropy'.
 Generally, we use entropy to indicate disorder or uncertainty.
 For a random variable X with probability distribution p(X), it is measured as:

H(X) = − Σ_x p(x) log p(x)
Binary Cross Entropy Loss
 The negative sign is used to make the overall quantity positive (since log p(x) ≤ 0 whenever p(x) ≤ 1).
 A greater value of entropy for a probability distribution indicates greater uncertainty in the distribution.
 Likewise, a smaller value indicates a more certain distribution.
Binary Cross Entropy Loss
 This makes binary cross-entropy suitable as a loss function – you want to minimize its value.
 We use binary cross-entropy loss for classification models which output a probability p:
 • Probability that the element belongs to class 1 (the positive class) = p
 • Probability that the element belongs to class 0 (the negative class) = 1 − p
Binary Cross Entropy Loss
 Then, the cross-entropy loss for output label y (which can take the values 0 and 1) and predicted probability p is defined as:

L = − [y log(p) + (1−y) log(1−p)]
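A quick worked example (values chosen for illustration, using natural logs): for a positive example (y = 1) predicted with p = 0.9, the loss is −log(0.9) ≈ 0.105; if the same example is predicted with p = 0.1, the loss is −log(0.1) ≈ 2.303. Confident wrong predictions are penalized heavily.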


Binary Cross Entropy Loss
 This is also called Log-Loss.
 To calculate the probability p, we can use the sigmoid function, where z is a function of our input features:

p = σ(z) = 1 / (1 + e^(−z))
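A minimal sketch in NumPy (the function names are my own, not from the slides) tying the pieces together: sigmoid turns a raw score z into a probability, and binary cross-entropy scores that probability against the label:

```python
import numpy as np

def sigmoid(z):
    # Maps any real-valued score z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y, p, eps=1e-12):
    # -[y*log(p) + (1-y)*log(1-p)], with p clipped to avoid log(0)
    p = np.clip(p, eps, 1.0 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

z = np.array([2.0, -1.0, 0.5])   # raw model scores (logits)
y = np.array([1.0, 0.0, 1.0])    # ground-truth labels
p = sigmoid(z)                   # predicted probabilities, e.g. [0.881 0.269 0.622]
print(binary_cross_entropy(y, p))         # per-example losses
print(binary_cross_entropy(y, p).mean())  # average loss over the batch
```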
Binary Cross Entropy Loss
 The sigmoid function outputs values in the open interval (0, 1), which makes it suitable for calculating a probability.
Categorical Cross Entropy Loss (Multi-class Cross-Entropy)
 Softmax converts logits into probabilities. The purpose of cross-entropy is to take the output probabilities (P) and measure the distance from the truth values, as shown in the sketch below.
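A minimal sketch (my own function names; the label is assumed to be one-hot) of softmax followed by categorical cross-entropy, CE = −Σ_k y_k log(p_k):

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def categorical_cross_entropy(y_true, p, eps=1e-12):
    # -sum_k y_k * log(p_k), where y_true is a one-hot vector
    return -np.sum(y_true * np.log(np.clip(p, eps, 1.0)))

logits = np.array([2.0, 1.0, 0.1])   # raw scores for 3 classes
y_true = np.array([1.0, 0.0, 0.0])   # one-hot ground truth (class 0)

p = softmax(logits)                  # e.g. [0.659 0.242 0.099]
print(categorical_cross_entropy(y_true, p))  # -log(0.659) ≈ 0.417
```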
Questions
 Why is cross-entropy used more commonly than MSE for classification problems?
Thank you
