
Module 6

Loss Functions
Intended Learning Outcomes (ILOs) for
Loss Functions in Neural Networks

At the end of the lesson, students are expected to be able to:


• Explain the purpose of a loss function in training neural networks.
• Identify and classify loss functions based on the type of problem
(regression vs. classification).
• Select an appropriate loss function for a given machine learning task.
• Implement loss functions using Python libraries such as
TensorFlow/Keras.
Introduction to Loss Functions
• In neural networks, a loss function measures how well the model's
predictions match the actual target values.
• It serves as the foundation for training the network, guiding the
optimization process to improve accuracy.
• The loss function, also referred to as the error function, is a crucial
component in machine learning that quantifies the difference
between the predicted outputs of a machine learning algorithm and
the actual target values.
Why Do We Need a Loss Function?
• It quantifies the difference between the predicted output and the
actual output.
• It provides feedback to update model parameters (weights and
biases).
• It helps optimize the neural network through gradient descent or
other optimization techniques.
Why Do We Need a Loss Function?
• The resulting value, the loss, reflects the accuracy of
the model's predictions.
• During training, a learning algorithm such as the
backpropagation algorithm uses the gradient of the
loss function with respect to the model's parameters to
adjust these parameters and minimize the loss,
effectively improving the model's performance on the
dataset.
Gradient Descent
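The idea can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a one-parameter linear model y = w·x with MSE loss and made-up data (all values here are hypothetical):

import numpy as np

# Toy data where the true relationship is y = 2x (hypothetical values)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0              # initial weight
learning_rate = 0.05

for step in range(100):
    y_pred = w * x                         # forward pass
    loss = np.mean((y - y_pred) ** 2)      # MSE loss
    grad = -2 * np.mean((y - y_pred) * x)  # dLoss/dw
    w -= learning_rate * grad              # gradient descent update

print(w)  # approaches 2.0 as the loss is minimized

Each step moves the weight a small distance in the direction that reduces the loss; the learning rate controls the size of that step.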
Types of Loss Functions
Loss functions are broadly categorized into regression loss functions
and classification loss functions based on the type of problem being
solved.
• Loss Functions for Regression
• Mean Squared Error (MSE)
• Mean Absolute Error (MAE)
• Loss Functions for Classification
• Binary Cross-Entropy (Log Loss) (for Binary Classification)
• Categorical Cross-Entropy (for Multi-Class Classification)
• Sparse Categorical Cross-Entropy
Mean Squared Error (MSE)

• Measures the average squared difference between actual and predicted values:
  MSE = (1/n) · Σ (actualᵢ − predictedᵢ)²
• Penalizes larger errors more than smaller ones.

Actual | Pred | Diff | Squared Diff
85     | 86   | -1   | 1
78     | 84   | -6   | 36
89     | 92   | -3   | 9
78     | 76   | 2    | 4
82     | 74   | 8    | 64

Sum of squared differences = 114; MSE = 114 / 5 = 22.8
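The table can be checked in a few lines. A minimal sketch with NumPy, plus the equivalent Keras loss (assumes TensorFlow is installed):

import numpy as np

actual = np.array([85.0, 78.0, 89.0, 78.0, 82.0])
pred = np.array([86.0, 84.0, 92.0, 76.0, 74.0])

# Mean of the squared differences: (1 + 36 + 9 + 4 + 64) / 5
mse = np.mean((actual - pred) ** 2)
print(mse)  # 22.8

# Equivalent Keras loss
import tensorflow as tf
print(tf.keras.losses.MeanSquaredError()(actual, pred).numpy())  # 22.8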
Mean Absolute Error (MAE)

• Measures the average absolute difference between actual and predicted values:
  MAE = (1/n) · Σ |actualᵢ − predictedᵢ|
• Less sensitive to large errors than MSE.

Actual | Pred | |Diff|
85     | 86   | 1
78     | 84   | 6
89     | 92   | 3
78     | 76   | 2
82     | 74   | 8

Sum of absolute differences = 20; MAE = 20 / 5 = 4
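The same check for MAE, again a sketch with NumPy and the matching Keras loss (assumes TensorFlow is installed):

import numpy as np

actual = np.array([85.0, 78.0, 89.0, 78.0, 82.0])
pred = np.array([86.0, 84.0, 92.0, 76.0, 74.0])

# Mean of the absolute differences: (1 + 6 + 3 + 2 + 8) / 5
mae = np.mean(np.abs(actual - pred))
print(mae)  # 4.0

# Equivalent Keras loss
import tensorflow as tf
print(tf.keras.losses.MeanAbsoluteError()(actual, pred).numpy())  # 4.0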
Binary Cross-Entropy (Log Loss) (for Binary
Classification)

• Used when predicting probabilities for two classes (0 or 1).
• Encourages correct probability estimation:
  BCE = −(1/N) · Σ [yᵢ · log(pᵢ) + (1 − yᵢ) · log(1 − pᵢ)]

User ID | Predicted Probability of 'Class 1' | Predicted Class | Actual Class | Probability of Actual Class | Log(Probability of Actual Class)
sd459   | 0.80 | 1 | 1 | 0.80 | -0.22
sd325   | 0.65 | 1 | 1 | 0.65 | -0.43
ef345   | 0.78 | 1 | 1 | 0.78 | -0.25
bw678   | 0.91 | 1 | 1 | 0.91 | -0.09
df837   | 0.65 | 1 | 0 | 0.35 | -1.05
lk948   | 0.87 | 1 | 1 | 0.87 | -0.14
os274   | 0.22 | 0 | 0 | 0.78 | -0.25
ye923   | 0.33 | 0 | 0 | 0.67 | -0.40

Average loss = −(sum of the log column) / 8 ≈ 0.35
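The log column and the overall loss can be reproduced directly. A sketch with NumPy, using natural logarithms as the table's values indicate:

import numpy as np

# Predicted probability of 'Class 1' and the actual class, from the table
p_class1 = np.array([0.80, 0.65, 0.78, 0.91, 0.65, 0.87, 0.22, 0.33])
actual = np.array([1, 1, 1, 1, 0, 1, 0, 0])

# Probability the model assigned to the class that actually occurred
p_actual = np.where(actual == 1, p_class1, 1 - p_class1)

print(np.log(p_actual).round(2))   # the last column of the table
print(-np.mean(np.log(p_actual)))  # binary cross-entropy ≈ 0.35

Note how the one misclassified user (df837) contributes by far the largest penalty, -1.05.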
Categorical Cross-Entropy (for Multi-Class
Classification)
• Categorical Cross-Entropy (CCE), also known as softmax loss or log loss, is one of the most
commonly used loss functions in machine learning, particularly for classification problems.
• It measures the difference between the predicted probability distribution and the actual (true)
distribution of classes.
Calculating Categorical Cross-Entropy
• Let's break down the categorical cross-entropy calculation with a mathematical
example using the following true labels and predicted probabilities.
• We have 3 samples, each belonging to one of 3 classes (Class A, Class B, or Class
C). The true labels are one-hot encoded.
True Labels (y_true):
Example 1: Class B → [0, 1, 0]
Example 2: Class A → [1, 0, 0]
Example 3: Class C → [0, 0, 1]

Predicted Probabilities (y_pred):
Example 1: [0.1, 0.8, 0.1]
Example 2: [0.7, 0.2, 0.1]
Example 3: [0.2, 0.3, 0.5]
Calculating Categorical Cross-Entropy
• Final Losses (per-example loss = −log of the predicted probability of the true class):
• For Example 1, the loss is: −log(0.8) = 0.22314355
• For Example 2, the loss is: −log(0.7) = 0.35667494
• For Example 3, the loss is: −log(0.5) = 0.69314718
• Average loss over the 3 examples: (0.22314355 + 0.35667494 + 0.69314718) / 3 ≈ 0.4243
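These values can be verified in a few lines. A sketch with NumPy, plus the matching Keras function (assumes TensorFlow is installed):

import numpy as np

y_true = np.array([[0.0, 1.0, 0.0],   # Example 1: Class B
                   [1.0, 0.0, 0.0],   # Example 2: Class A
                   [0.0, 0.0, 1.0]])  # Example 3: Class C
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.7, 0.2, 0.1],
                   [0.2, 0.3, 0.5]])

# Per-example loss: -log of the probability assigned to the true class
print(-np.log(np.sum(y_true * y_pred, axis=1)))
# [0.22314355 0.35667494 0.69314718]

# Same per-example losses from Keras
import tensorflow as tf
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred).numpy())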

How Categorical Cross-Entropy Works
• Prediction of Probabilities - The model outputs probabilities for each class.
These probabilities are the likelihood of a data point belonging to each class.
Typically, this is done using a softmax function, which converts raw scores into
probabilities.
• Comparison with True Class - Categorical cross-entropy compares the
predicted probabilities with the actual class labels (one-hot encoded).
• Calculation of Loss - The logarithm of the predicted probability for the correct
class is taken, and the loss function penalizes the model based on how far the
prediction was from the actual class.
Sparse Categorical Cross-Entropy
• Similar to categorical cross-entropy, but used when the target labels are
integer-encoded instead of one-hot encoded: each label is simply the integer
index of its class.
• Example:
• If the correct label is "Cat," it would be represented as the integer 1 (since
"Cat" is the second class, starting from 0).
• Suppose the model predicts probabilities like [0.2, 0.7, 0.1]. The loss is
calculated for the correct class (Cat) using the formula: -log(0.7)
• Rather than one-hot encoding the labels, sparse categorical cross-entropy
uses each integer label to look up the predicted probability of the true class
directly. This saves memory and computation, especially when dealing with
datasets containing a large number of classes.
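A sketch of the "Cat" example above, assuming TensorFlow is installed:

import numpy as np
import tensorflow as tf

y_true = np.array([1])                # integer label: "Cat" is class index 1
y_pred = np.array([[0.2, 0.7, 0.1]])  # predicted probabilities for 3 classes

# Keras computes -log(0.7) ≈ 0.3567 without one-hot encoding the label
print(tf.keras.losses.SparseCategoricalCrossentropy()(y_true, y_pred).numpy())

# By hand: index the predicted probability of the true class directly
print(-np.log(y_pred[0, y_true[0]]))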
How Does the Loss Function Affect Model
Training?
• The loss function provides a numerical value that the optimizer (e.g.,
Gradient Descent, Adam, RMSprop) minimizes.
• The network updates weights using backpropagation, which
calculates the gradient of the loss function with respect to each
parameter.
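A minimal sketch of one training step, assuming TensorFlow; the model and data here are hypothetical. It shows the scalar loss value the optimizer minimizes and the gradients that backpropagation computes:

import tensorflow as tf

# A tiny model and one hypothetical batch of regression data
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)

x = tf.constant([[1.0], [2.0], [3.0]])
y = tf.constant([[2.0], [4.0], [6.0]])

with tf.GradientTape() as tape:
    y_pred = model(x)          # forward pass
    loss = loss_fn(y, y_pred)  # scalar loss value

# Backpropagation: gradient of the loss w.r.t. each weight and bias
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))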
• MSE
• MAE
• Binary Cross-Entropy
• Categorical Cross-Entropy
• Sparse Categorical Cross-Entropy
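Each of the loss functions listed above is available in Keras as a loss class (or the matching string name) passed to model.compile(). A sketch, assuming TensorFlow is installed:

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation="softmax")])

# Pick the loss that matches the task; string names also work,
# e.g. loss="mse" or loss="sparse_categorical_crossentropy"
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
)

# Other options:
#   tf.keras.losses.MeanSquaredError()         # MSE (regression)
#   tf.keras.losses.MeanAbsoluteError()        # MAE (regression)
#   tf.keras.losses.BinaryCrossentropy()       # binary classification
#   tf.keras.losses.CategoricalCrossentropy()  # one-hot multi-class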
