
Module 6

Loss Functions
Intended Learning Outcomes (ILOs) for
Loss Functions in Neural Networks

At the end of the lesson, students are expected to be able to:


• Explain the purpose of a loss function in training neural networks.
• Identify and classify loss functions based on the type of problem
(regression vs. classification).
• Select an appropriate loss function for a given machine learning task.
• Implement loss functions using Python libraries such as
TensorFlow/Keras.
Introduction to Loss Functions
• In neural networks, a loss function measures how well the model's
predictions match the actual target values.
• It serves as the foundation for training the network, guiding the
optimization process to improve accuracy.
• The loss function, also referred to as the error function, is a crucial
component in machine learning that quantifies the difference
between the predicted outputs of a machine learning algorithm and
the actual target values.
Why Do We Need a Loss Function?
• It quantifies the difference between the predicted output and the
actual output.
• It provides feedback to update model parameters (weights and
biases).
• It helps optimize the neural network through gradient descent or
other optimization techniques.
Why Do We Need a Loss Function?
• The resulting value, the loss, reflects the accuracy of
the model's predictions.
• During training, a learning algorithm such as the
backpropagation algorithm uses the gradient of the
loss function with respect to the model's parameters to
adjust these parameters and minimize the loss,
effectively improving the model's performance on the
dataset.
Gradient Descent
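The idea can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a one-parameter linear model y = w·x with MSE loss and made-up data (all values here are hypothetical):

import numpy as np

# Toy data where the true relationship is y = 2x (hypothetical values)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0              # initial weight
learning_rate = 0.05

for step in range(100):
    y_pred = w * x                         # forward pass
    loss = np.mean((y - y_pred) ** 2)      # MSE loss
    grad = -2 * np.mean((y - y_pred) * x)  # dLoss/dw
    w -= learning_rate * grad              # gradient descent update

print(w)  # approaches 2.0 as the loss is minimized

Each step moves the weight a small distance in the direction that reduces the loss; the learning rate controls the size of that step.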
Types of Loss Functions
Loss functions are broadly categorized into regression loss functions
and classification loss functions based on the type of problem being
solved.
• Loss Functions for Regression
• Mean Squared Error (MSE)
• Mean Absolute Error (MAE)
• Loss Functions for Classification
• Binary Cross-Entropy (Log Loss) (for Binary Classification)
• Categorical Cross-Entropy (for Multi-Class Classification)
• Sparse Categorical Cross-Entropy
Mean Squared Error (MSE)

• Measures the average squared difference between actual and predicted values:
  MSE = (1/n) · Σ (actualᵢ − predictedᵢ)²
• Penalizes larger errors more than smaller ones.

Actual | Pred | Diff | Squared Diff
85     | 86   | -1   | 1
78     | 84   | -6   | 36
89     | 92   | -3   | 9
78     | 76   | 2    | 4
82     | 74   | 8    | 64

Sum of squared differences = 114; MSE = 114 / 5 = 22.8
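The table can be checked in a few lines. A minimal sketch with NumPy, plus the equivalent Keras loss (assumes TensorFlow is installed):

import numpy as np

actual = np.array([85.0, 78.0, 89.0, 78.0, 82.0])
pred = np.array([86.0, 84.0, 92.0, 76.0, 74.0])

# Mean of the squared differences: (1 + 36 + 9 + 4 + 64) / 5
mse = np.mean((actual - pred) ** 2)
print(mse)  # 22.8

# Equivalent Keras loss
import tensorflow as tf
print(tf.keras.losses.MeanSquaredError()(actual, pred).numpy())  # 22.8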
Mean Absolute Error (MAE)

• Measures the average absolute difference between actual and predicted values:
  MAE = (1/n) · Σ |actualᵢ − predictedᵢ|
• Less sensitive to large errors than MSE.

Actual | Pred | |Diff|
85     | 86   | 1
78     | 84   | 6
89     | 92   | 3
78     | 76   | 2
82     | 74   | 8

Sum of absolute differences = 20; MAE = 20 / 5 = 4
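The same check for MAE, again a sketch with NumPy and the matching Keras loss (assumes TensorFlow is installed):

import numpy as np

actual = np.array([85.0, 78.0, 89.0, 78.0, 82.0])
pred = np.array([86.0, 84.0, 92.0, 76.0, 74.0])

# Mean of the absolute differences: (1 + 6 + 3 + 2 + 8) / 5
mae = np.mean(np.abs(actual - pred))
print(mae)  # 4.0

# Equivalent Keras loss
import tensorflow as tf
print(tf.keras.losses.MeanAbsoluteError()(actual, pred).numpy())  # 4.0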
Binary Cross-Entropy (Log Loss) (for Binary
Classification)

• Used when predicting probabilities for two classes (0 or 1).
• Encourages correct probability estimation:
  BCE = −(1/N) · Σ [yᵢ · log(pᵢ) + (1 − yᵢ) · log(1 − pᵢ)]

User ID | Predicted Probability of 'Class 1' | Predicted Class | Actual Class | Probability of Actual Class | Log(Probability of Actual Class)
sd459   | 0.80 | 1 | 1 | 0.80 | -0.22
sd325   | 0.65 | 1 | 1 | 0.65 | -0.43
ef345   | 0.78 | 1 | 1 | 0.78 | -0.25
bw678   | 0.91 | 1 | 1 | 0.91 | -0.09
df837   | 0.65 | 1 | 0 | 0.35 | -1.05
lk948   | 0.87 | 1 | 1 | 0.87 | -0.14
os274   | 0.22 | 0 | 0 | 0.78 | -0.25
ye923   | 0.33 | 0 | 0 | 0.67 | -0.40

Average loss = −(sum of the log column) / 8 ≈ 0.35
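The log column and the overall loss can be reproduced directly. A sketch with NumPy, using natural logarithms as the table's values indicate:

import numpy as np

# Predicted probability of 'Class 1' and the actual class, from the table
p_class1 = np.array([0.80, 0.65, 0.78, 0.91, 0.65, 0.87, 0.22, 0.33])
actual = np.array([1, 1, 1, 1, 0, 1, 0, 0])

# Probability the model assigned to the class that actually occurred
p_actual = np.where(actual == 1, p_class1, 1 - p_class1)

print(np.log(p_actual).round(2))   # the last column of the table
print(-np.mean(np.log(p_actual)))  # binary cross-entropy ≈ 0.35

Note how the one misclassified user (df837) contributes by far the largest penalty, -1.05.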
Categorical Cross-Entropy (for Multi-Class
Classification)
• Categorical Cross-Entropy (CCE), also known as softmax loss or log loss, is one of the most
commonly used loss functions in machine learning, particularly for classification problems.
• It measures the difference between the predicted probability distribution and the actual (true)
distribution of classes.
Calculating Categorical Cross-Entropy
• Let's break down the categorical cross-entropy calculation with a mathematical
example using the following true labels and predicted probabilities.
• We have 3 samples, each belonging to one of 3 classes (Class A, Class B, or Class
C). The true labels are one-hot encoded.
True Labels (y_true):
Example 1: Class B → [0, 1, 0]
Example 2: Class A → [1, 0, 0]
Example 3: Class C → [0, 0, 1]

Predicted Probabilities (y_pred):
Example 1: [0.1, 0.8, 0.1]
Example 2: [0.7, 0.2, 0.1]
Example 3: [0.2, 0.3, 0.5]
Calculating Categorical Cross-Entropy
• Final Losses (per-example loss = −log of the predicted probability of the true class):
• For Example 1, the loss is: −log(0.8) = 0.22314355
• For Example 2, the loss is: −log(0.7) = 0.35667494
• For Example 3, the loss is: −log(0.5) = 0.69314718
• Average loss over the 3 examples: (0.22314355 + 0.35667494 + 0.69314718) / 3 ≈ 0.4243
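These values can be verified in a few lines. A sketch with NumPy, plus the matching Keras function (assumes TensorFlow is installed):

import numpy as np

y_true = np.array([[0.0, 1.0, 0.0],   # Example 1: Class B
                   [1.0, 0.0, 0.0],   # Example 2: Class A
                   [0.0, 0.0, 1.0]])  # Example 3: Class C
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.7, 0.2, 0.1],
                   [0.2, 0.3, 0.5]])

# Per-example loss: -log of the probability assigned to the true class
print(-np.log(np.sum(y_true * y_pred, axis=1)))
# [0.22314355 0.35667494 0.69314718]

# Same per-example losses from Keras
import tensorflow as tf
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred).numpy())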

How Categorical Cross-Entropy Works
• Prediction of Probabilities - The model outputs probabilities for each class.
These probabilities are the likelihood of a data point belonging to each class.
Typically, this is done using a softmax function, which converts raw scores into
probabilities.
• Comparison with True Class - Categorical cross-entropy compares the
predicted probabilities with the actual class labels (one-hot encoded).
• Calculation of Loss - The logarithm of the predicted probability for the correct
class is taken, and the loss function penalizes the model based on how far the
prediction was from the actual class.
Sparse Categorical Cross-Entropy
• Similar to categorical cross-entropy, but used when the target labels are
integer-encoded instead of one-hot encoded: each label is simply the integer
index of its class.
• Example:
• If the correct label is "Cat," it would be represented as the integer 1 (since
"Cat" is the second class, starting from 0).
• Suppose the model predicts probabilities like [0.2, 0.7, 0.1]. The loss is
calculated for the correct class (Cat) using the formula: -log(0.7)
• Rather than one-hot encoding the labels, sparse categorical cross-entropy
uses each integer label to look up the predicted probability of the true class
directly. This saves memory and computation, especially when dealing with
datasets containing a large number of classes.
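A sketch of the "Cat" example above, assuming TensorFlow is installed:

import numpy as np
import tensorflow as tf

y_true = np.array([1])                # integer label: "Cat" is class index 1
y_pred = np.array([[0.2, 0.7, 0.1]])  # predicted probabilities for 3 classes

# Keras computes -log(0.7) ≈ 0.3567 without one-hot encoding the label
print(tf.keras.losses.SparseCategoricalCrossentropy()(y_true, y_pred).numpy())

# By hand: index the predicted probability of the true class directly
print(-np.log(y_pred[0, y_true[0]]))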
How Does the Loss Function Affect Model
Training?
• The loss function provides a numerical value that the optimizer (e.g.,
Gradient Descent, Adam, RMSprop) minimizes.
• The network updates weights using backpropagation, which
calculates the gradient of the loss function with respect to each
parameter.
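A minimal sketch of one training step, assuming TensorFlow; the model and data here are hypothetical. It shows the scalar loss value the optimizer minimizes and the gradients that backpropagation computes:

import tensorflow as tf

# A tiny model and one hypothetical batch of regression data
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)

x = tf.constant([[1.0], [2.0], [3.0]])
y = tf.constant([[2.0], [4.0], [6.0]])

with tf.GradientTape() as tape:
    y_pred = model(x)          # forward pass
    loss = loss_fn(y, y_pred)  # scalar loss value

# Backpropagation: gradient of the loss w.r.t. each weight and bias
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))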
• MSE
• MAE
• Binary Cross-Entropy
• Categorical Cross-Entropy
• Sparse Categorical Cross-Entropy
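Each of the loss functions listed above is available in Keras as a loss class (or the matching string name) passed to model.compile(). A sketch, assuming TensorFlow is installed:

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation="softmax")])

# Pick the loss that matches the task; string names also work,
# e.g. loss="mse" or loss="sparse_categorical_crossentropy"
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
)

# Other options:
#   tf.keras.losses.MeanSquaredError()         # MSE (regression)
#   tf.keras.losses.MeanAbsoluteError()        # MAE (regression)
#   tf.keras.losses.BinaryCrossentropy()       # binary classification
#   tf.keras.losses.CategoricalCrossentropy()  # one-hot multi-class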
