
Types Of Loss Functions

 Regression Loss Functions :

In machine learning, loss functions are critical components used to evaluate how well a model's predictions match the actual data. For regression tasks, where the goal is to predict a continuous value, several loss functions are commonly used. Each has its own characteristics and is suitable for different scenarios. Here, we will discuss four popular regression loss functions:
 Mean Squared Error (MSE) Loss
 Mean Absolute Error (MAE) Loss
 Huber Loss
 Log-Cosh Loss

Mean Squared Error :


 The Mean Squared Error (MSE) Loss is one of the most
widely used loss functions for regression tasks. It
calculates the average of the squared differences
between the predicted values and the actual values.

 MSE = (1/n) ∑_{i=1}^{n} (y_i − ŷ_i)²

 Advantages :
 Simple to compute and understand.
 Differentiable, making it suitable for gradient-based
optimization algorithms.

 Disadvantages :

 Sensitive to outliers because the errors are squared, which can disproportionately affect the loss.

Mean Absolute Error :


 The Mean Absolute Error (MAE) Loss is another
commonly used loss function for regression. It
calculates the average of the absolute differences
between the predicted values and the actual values.

 MAE = (1/n) ∑_{i=1}^{n} |y_i − ŷ_i|

 Advantages:

 Less sensitive to outliers compared to MSE.


 Simple to compute and interpret.

 Disadvantages:

 Not differentiable at zero, which can pose issues for some optimization algorithms.
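
As an illustration (not part of the original notes), both MSE and MAE can be written in a few lines of NumPy; y_true and y_pred are placeholder arrays of actual and predicted values.

import numpy as np

def mse_loss(y_true, y_pred):
    # Mean of squared differences; large errors dominate because they are squared.
    return np.mean((y_true - y_pred) ** 2)

def mae_loss(y_true, y_pred):
    # Mean of absolute differences; every error contributes linearly.
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(mse_loss(y_true, y_pred), mae_loss(y_true, y_pred))  # 0.375 0.5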

Huber Loss :

 Huber Loss combines the advantages of MSE and MAE. It is less sensitive to outliers than MSE and differentiable everywhere, unlike MAE.

 Advantages:

 Robust to outliers, providing a balance between MSE and MAE.
 Differentiable, facilitating gradient-based optimization.

 Disadvantages:

 Requires tuning of the parameter δ.
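
A minimal NumPy sketch of Huber Loss (illustrative only; δ is the user-chosen threshold, defaulting here to 1.0):

import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    small = np.abs(error) <= delta
    # Quadratic near zero, linear beyond delta, so outliers are damped.
    squared = 0.5 * error ** 2
    linear = delta * (np.abs(error) - 0.5 * delta)
    return np.mean(np.where(small, squared, linear))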

Log-Cosh Loss :
 Log-Cosh Loss is another smooth loss function for
regression, defined as the logarithm of the hyperbolic
cosine of the prediction error.

 Advantages:

 Combines the benefits of MSE and MAE.


 Smooth and differentiable everywhere, making it
suitable for gradient-based optimization.

 Disadvantages:

 More complex to compute compared to MSE and MAE.
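
An illustrative NumPy sketch of Log-Cosh Loss (not from the original notes); log(cosh(e)) is rewritten in a form that avoids overflow for large errors:

import numpy as np

def log_cosh_loss(y_true, y_pred):
    error = y_pred - y_true
    # log(cosh(e)) = |e| + log(1 + exp(-2|e|)) - log(2), numerically stable for large |e|.
    return np.mean(np.abs(error) + np.log1p(np.exp(-2.0 * np.abs(error))) - np.log(2.0))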

 Classification Loss Functions :

Classification loss functions are essential for evaluating how well a classification model's predictions match the actual class labels. Different loss functions cater to various classification tasks, including binary, multiclass, and imbalanced datasets. Here, we will discuss several widely used classification loss functions:

 Binary Cross-Entropy Loss (Log Loss)
 Categorical Cross-Entropy Loss
 Sparse Categorical Cross-Entropy Loss
 Kullback-Leibler Divergence Loss (KL Divergence)
 Hinge Loss
 Squared Hinge Loss
 Focal Loss

Binary Cross-Entropy Loss(Log Loss) :

 Binary Cross-Entropy Loss, also known as Log Loss, is used for binary classification problems. It measures the performance of a classification model whose output is a probability value between 0 and 1.

 Advantages:

 Suitable for binary classification.


 Differentiable, making it useful for gradient-based
optimization.

 Disadvantages:

 Can be sensitive to imbalanced datasets.
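
A minimal NumPy sketch of Binary Cross-Entropy (illustrative; the eps clipping is a standard stability trick, not part of the formula itself):

import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Clip predicted probabilities so log(0) never occurs.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))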

Categorical Cross-Entropy Loss :

 Categorical Cross-Entropy Loss is used for multiclass classification problems. It measures the performance of a classification model whose output is a probability distribution over multiple classes.

 Advantages:

 Suitable for multiclass classification.


 Differentiable and widely used in neural networks.

 Disadvantages:

 Requires one-hot encoded target labels rather than integer (sparse) targets.
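
An illustrative NumPy sketch of Categorical Cross-Entropy with one-hot targets (names and eps are placeholders, not from the original notes):

import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true: one-hot matrix of shape (N, C); y_pred: predicted probabilities of shape (N, C).
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))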

Sparse Categorical Cross-Entropy Loss :

 Sparse Categorical Cross-Entropy Loss is similar to Categorical Cross-Entropy Loss but is used when the target labels are integers instead of one-hot encoded vectors.
 Advantages:

 Efficient for large datasets with many classes.


 Reduces memory usage by using integer labels
instead of one-hot encoded vectors.

 Disadvantages:

 Requires integer labels.
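
The sparse variant only changes how the true class is looked up; an illustrative NumPy sketch (assumed shapes noted in the comments):

import numpy as np

def sparse_categorical_cross_entropy(labels, y_pred, eps=1e-12):
    # labels: integer class indices of shape (N,); y_pred: predicted probabilities of shape (N, C).
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.mean(np.log(y_pred[np.arange(len(labels)), labels]))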

Kullback-Leibler Divergence Loss (KL Divergence) :

 KL Divergence measures how one probability distribution diverges from a second, expected probability distribution. It is often used in probabilistic models.

 Advantages:

 Useful for measuring divergence between distributions.
 Applicable in various probabilistic modeling tasks.

 Disadvantages:

 Sensitive to small differences in probability distributions.
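
A minimal sketch of KL Divergence between two discrete distributions (illustrative; p and q are assumed to be probability vectors that sum to 1):

import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # D_KL(P || Q) = sum_i p_i * log(p_i / q_i); clipping avoids log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q))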

Hinge Loss :

 Hinge Loss is used for training classifiers, especially for support vector machines (SVMs). It is suitable for binary classification tasks.

 Advantages:

 Effective for SVMs.


 Encourages correct classification with a margin.

 Disadvantages:
 Not differentiable at zero, posing challenges for some
optimization methods.
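
An illustrative NumPy sketch of the standard hinge loss (labels are assumed to be in {-1, +1} and scores are raw model outputs):

import numpy as np

def hinge_loss(y_true, scores):
    # Zero loss once an example is on the correct side of the margin (y * score >= 1).
    return np.mean(np.maximum(0.0, 1.0 - y_true * scores))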

Squared Hinge Loss :

 Squared Hinge Loss is a variation of Hinge Loss that squares the hinge loss term, making it more sensitive to misclassifications.

 Advantages:

 Penalizes misclassifications more heavily.


 Encourages larger margins.

 Disadvantages:

 Similar challenges as Hinge Loss regarding differentiability at zero.

Focal Loss :

 Focal Loss is designed to address class imbalance by focusing more on hard-to-classify examples. It introduces a modulating factor to the standard cross-entropy loss.

 Advantages:

 Effective for addressing class imbalance.


 Focuses on hard-to-classify examples.

 Disadvantages:

 Requires tuning of the focusing parameter γ.
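
A minimal sketch of binary Focal Loss (illustrative; alpha is the common class-balancing weight used alongside γ, and eps is a stability addition):

import numpy as np

def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-12):
    # Down-weights easy examples via the (1 - p_t)^gamma modulating factor.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    p_t = np.where(y_true == 1, y_pred, 1.0 - y_pred)
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    return np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t))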
 Ranking Loss Functions :

Ranking loss functions are used to evaluate models that predict the relative order of items. These are commonly used in tasks such as recommendation systems and information retrieval.

Contrastive Loss :

 Contrastive Loss is used to learn embeddings such that similar items are closer in the embedding space, while dissimilar items are farther apart. It is often used in Siamese networks.

 Formula :

= (1/2N) ∑_{i=1}^{N} [y_i · d_i² + (1 − y_i) · max(0, m − d_i)²]

 where d_i is the distance between a pair of embeddings, y_i is 1 for similar pairs and 0 for dissimilar pairs, and m is a margin.
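
An illustrative NumPy sketch of Contrastive Loss, taking precomputed pairwise distances (names are placeholders):

import numpy as np

def contrastive_loss(y, d, margin=1.0):
    # y = 1 for similar pairs, 0 for dissimilar; d = Euclidean distance between the two embeddings.
    return np.mean(0.5 * (y * d ** 2 + (1 - y) * np.maximum(0.0, margin - d) ** 2))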

Triplet Loss :

 Triplet Loss is used to learn embeddings by comparing the relative distances between triplets: an anchor, a positive example, and a negative example.

 Formula :

= (1/N) ∑_{i=1}^{N} [ ||f(x_i^a) − f(x_i^p)||_2² − ||f(x_i^a) − f(x_i^n)||_2² + α ]_+
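
A minimal sketch of Triplet Loss on batches of embeddings (illustrative; anchor, positive, and negative are assumed to be (N, D) matrices):

import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    # Squared Euclidean distances between anchor-positive and anchor-negative pairs.
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)
    # Hinge on the margin alpha: only violations contribute.
    return np.mean(np.maximum(0.0, pos_dist - neg_dist + alpha))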

Margin Ranking Loss :

 Margin Ranking Loss measures the relative distances between pairs of items and ensures that the correct ordering is maintained with a specified margin.
 Formula :

= (1/N) ∑_{i=1}^{N} max(0, −y_i · (s_i^+ − s_i^-) + margin)
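
An illustrative NumPy sketch of Margin Ranking Loss (s_pos and s_neg are placeholder score arrays for the two items in each pair):

import numpy as np

def margin_ranking_loss(y, s_pos, s_neg, margin=1.0):
    # y = +1 when s_pos should rank above s_neg, -1 when the order is reversed.
    return np.mean(np.maximum(0.0, -y * (s_pos - s_neg) + margin))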

 Image and Reconstruction Loss Functions :

These loss functions are used to evaluate models that generate or reconstruct images, ensuring that the output is as close as possible to the target images.

Pixel-wise Cross-Entropy Loss :

 Pixel-wise Cross-Entropy Loss is used for image segmentation tasks, where each pixel is classified independently.

 Formula :

= −(1/N) ∑_{i=1}^{N} ∑_{c=1}^{C} y_{i,c} log(ŷ_{i,c})

Dice Loss :

 Dice Loss is used for image segmentation tasks and is particularly effective for imbalanced datasets. It measures the overlap between the predicted segmentation and the ground truth.

 Formula :

= 1 − 2 ∑_{i=1}^{N} y_i ŷ_i / (∑_{i=1}^{N} y_i + ∑_{i=1}^{N} ŷ_i)
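
A minimal soft Dice Loss sketch over flattened masks (illustrative; the eps term is a common stability addition for empty masks, not part of the formula):

import numpy as np

def dice_loss(y_true, y_pred, eps=1e-7):
    # Overlap-based loss: 1 minus the (soft) Dice coefficient.
    intersection = np.sum(y_true * y_pred)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(y_true) + np.sum(y_pred) + eps)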

Jaccard Loss (Intersection over Union, IoU) :

 Jaccard Loss, also known as IoU Loss, measures the intersection over union of the predicted segmentation and the ground truth.
 Formula :

= 1 − ∑_{i=1}^{N} y_i ŷ_i / (∑_{i=1}^{N} y_i + ∑_{i=1}^{N} ŷ_i − ∑_{i=1}^{N} y_i ŷ_i)

Perceptual Loss :

 Perceptual Loss measures the difference between high-level features of images rather than pixel-wise differences. It is often used in image generation tasks.

 Formula :

= ∑_{i=1}^{N} ||ϕ_j(y_i) − ϕ_j(ŷ_i)||_2²

Total Variation Loss :

 Total Variation Loss encourages spatial smoothness in images by penalizing differences between adjacent pixels.

 Formula :

= ∑_{i,j} ((y_{i,j+1} − y_{i,j})² + (y_{i+1,j} − y_{i,j})²)
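
An illustrative NumPy sketch of Total Variation Loss for a single 2-D image (squared-difference form, matching the formula above):

import numpy as np

def total_variation_loss(img):
    # Differences between horizontally and vertically adjacent pixels.
    dh = img[:, 1:] - img[:, :-1]
    dv = img[1:, :] - img[:-1, :]
    return np.sum(dh ** 2) + np.sum(dv ** 2)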

 Adversarial Loss Functions :

Adversarial loss functions are used in generative adversarial networks (GANs) to train the generator and discriminator networks.

Least Squares GAN Loss :

Least Squares GAN Loss aims to provide more stable training by minimizing the Pearson χ² divergence.
 Formula :

min_D (1/2) E_{x∼p_data(x)}[(D(x) − 1)²] + (1/2) E_{z∼p_z(z)}[D(G(z))²]

min_G (1/2) E_{z∼p_z(z)}[(D(G(z)) − 1)²]

Adversarial Loss (GAN Loss) :

 The standard GAN loss function involves a minimax game between the generator and the discriminator.

 Formula :

min_G max_D E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]
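
An illustrative NumPy sketch of the two GAN objectives given discriminator outputs (d_real = D(x), d_fake = D(G(z)), both probabilities); note the generator loss is written in the non-saturating form commonly used in practice rather than the minimax log(1 − D(G(z))) term:

import numpy as np

def gan_discriminator_loss(d_real, d_fake, eps=1e-12):
    # Discriminator wants D(x) -> 1 on real data and D(G(z)) -> 0 on generated data.
    return -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))

def gan_generator_loss(d_fake, eps=1e-12):
    # Non-saturating generator objective: maximize log D(G(z)).
    return -np.mean(np.log(d_fake + eps))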

 Specialized Loss Functions :

Specialized loss functions cater to specific tasks such as sequence prediction, count data, and cosine similarity.

CTC Loss (Connectionist Temporal Classification) :

 CTC Loss is used for sequence prediction tasks where the alignment between input and output sequences is unknown.

 Formula :

CTC Loss = −log p(y | x)

Poisson Loss :
 Poisson Loss is used for count data, modeling the
distribution of the predicted values as a Poisson
distribution.
 Formula :
= ∑_{i=1}^{N} (ŷ_i − y_i log(ŷ_i))
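
An illustrative NumPy sketch of Poisson Loss, averaged over samples here rather than summed (y_pred are predicted positive rates, y_true are observed counts):

import numpy as np

def poisson_loss(y_true, y_pred, eps=1e-12):
    # Negative Poisson log-likelihood up to a constant term in y_true.
    return np.mean(y_pred - y_true * np.log(y_pred + eps))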

Cosine Proximity Loss :

 Cosine Proximity Loss measures the cosine similarity between the predicted and target vectors, encouraging them to point in the same direction.

 Formula :

= −(1/N) ∑_{i=1}^{N} (y_i · ŷ_i) / (||y_i|| ||ŷ_i||)
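
A minimal sketch of Cosine Proximity Loss for batches of vectors (illustrative; rows of y_true and y_pred are the target and predicted vectors):

import numpy as np

def cosine_proximity_loss(y_true, y_pred, eps=1e-12):
    # Negative mean cosine similarity: minimizing it pushes the vectors to align.
    num = np.sum(y_true * y_pred, axis=1)
    denom = np.linalg.norm(y_true, axis=1) * np.linalg.norm(y_pred, axis=1)
    return -np.mean(num / (denom + eps))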

Log Loss :

 Log Loss, or logistic loss, is used for binary classification tasks. It measures the performance of a classification model whose output is a probability value between 0 and 1.

 Formula :

= −(1/N) ∑_{i=1}^{N} [y_i log(ŷ_i) + (1 − y_i) log(1 − ŷ_i)]

Earth Mover's Distance (Wasserstein Loss) :

 Earth Mover's Distance measures the distance between two probability distributions and is often used in Wasserstein GANs.

 Formula :

= E_{x∼P_r}[D(x)] − E_{z∼P_z}[D(G(z))]
