Module 6
Loss Functions

Overview: This module covers loss functions in neural networks, explaining their purpose in training and optimization. It categorizes loss functions into regression (e.g., Mean Squared Error, Mean Absolute Error) and classification (e.g., Binary Cross-Entropy, Categorical Cross-Entropy, Sparse Categorical Cross-Entropy) types, and discusses how loss functions guide model training by providing feedback for parameter updates through optimization techniques.

Intended Learning Outcomes (ILOs) for Loss Functions in Neural Networks
At the end of the lesson, students are expected to be able to:
• Explain the purpose of a loss function in training neural networks.
• Identify and classify loss functions based on the type of problem (regression vs. classification).
• Select an appropriate loss function for a given machine learning task.
• Implement loss functions using Python libraries such as TensorFlow/Keras.

Introduction to Loss Functions
• In neural networks, a loss function measures how well the model's predictions match the actual target values.
• It serves as the foundation for training the network, guiding the optimization process to improve accuracy.
• The loss function, also referred to as the error function, quantifies the difference between the predicted outputs of a machine learning algorithm and the actual target values.

Why Do We Need a Loss Function?
• It quantifies the difference between the predicted output and the actual output.
• It provides feedback to update the model parameters (weights and biases).
• It helps optimize the neural network through gradient descent or other optimization techniques.
• The resulting value, the loss, reflects the accuracy of the model's predictions.
• During training, a learning algorithm such as backpropagation uses the gradient of the loss function with respect to the model's parameters to adjust those parameters and minimize the loss, improving the model's performance on the dataset.

Gradient Descent
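To make the parameter-update idea concrete, here is a minimal sketch (not from the original slides) that fits a one-weight model y = w * x by gradient descent on the mean squared error. The variable names w and lr and the toy data are illustrative.

import numpy as np

# Toy data: the underlying relationship is y = 2x, so the best weight is 2.0.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0    # model parameter (weight), starting far from the optimum
lr = 0.05  # learning rate

for step in range(20):
    y_pred = w * x                         # forward pass
    loss = np.mean((y - y_pred) ** 2)      # MSE loss
    grad = np.mean(-2 * x * (y - y_pred))  # dLoss/dw, derived by hand for MSE
    w = w - lr * grad                      # gradient descent update

print(w)  # converges toward 2.0 as the loss shrinks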
Types of Loss Functions
Loss functions are broadly categorized into regression loss functions and classification loss functions, based on the type of problem being solved.
• Loss Functions for Regression
  • Mean Squared Error (MSE)
  • Mean Absolute Error (MAE)
• Loss Functions for Classification
  • Binary Cross-Entropy (Log Loss) (for binary classification)
  • Categorical Cross-Entropy (for multi-class classification)
  • Sparse Categorical Cross-Entropy

Mean Squared Error (MSE)
• Measures the average squared difference between actual and predicted values.
• Penalizes larger errors more than smaller ones.
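A minimal sketch of both regression losses listed above; the toy values are illustrative, and the Keras classes noted in the comments are the library equivalents.

import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])  # actual target values (toy data)
y_pred = np.array([2.5,  0.0, 2.0, 8.0])  # model predictions

# Mean Squared Error: average squared difference; squaring penalizes
# large errors much more heavily than small ones.
mse = np.mean((y_true - y_pred) ** 2)     # 0.375

# Mean Absolute Error: average absolute difference; less sensitive to outliers.
mae = np.mean(np.abs(y_true - y_pred))    # 0.5

print(mse, mae)
# Keras equivalents: tf.keras.losses.MeanSquaredError(),
# tf.keras.losses.MeanAbsoluteError()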
Binary Cross-Entropy (Log Loss) (for Binary Classification)
• Used when predicting probabilities for two classes (0 or 1).
• Encourages correct probability estimation.
• For each sample, the loss is the negative log of the probability the model assigns to the actual class, as in the sketch and the worked table below.
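A minimal sketch of binary cross-entropy using the values from the table below; log here is the natural logarithm, matching the table's Log column.

import numpy as np

# Actual classes and predicted P(class 1) for the eight users in the table below.
y_true = np.array([1, 1, 1, 1, 0, 1, 0, 0])
p_class1 = np.array([0.8, 0.65, 0.78, 0.91, 0.65, 0.87, 0.22, 0.33])

# Probability assigned to the actual class: p if the class is 1, 1 - p if it is 0.
p_actual = np.where(y_true == 1, p_class1, 1 - p_class1)

log_probs = np.log(p_actual)  # reproduces the Log column of the table
bce = -np.mean(log_probs)     # binary cross-entropy for the batch, about 0.354

print(log_probs.round(2))
print(bce)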
User ID | Predicted Class | Predicted Probability of 'Class 1' | Actual Class | Probability of Predicted Class | Log(Probability of Predicted Class)
--------|-----------------|-------------------------------------|--------------|--------------------------------|-------------------------------------
sd459   | 1               | 0.8                                 | 1            | 0.8                            | -0.22
sd325   | 1               | 0.65                                | 1            | 0.65                           | -0.43
ef345   | 1               | 0.78                                | 1            | 0.78                           | -0.25
bw678   | 1               | 0.91                                | 1            | 0.91                           | -0.09
df837   | 0               | 0.65                                | 0            | 0.35                           | -1.05
lk948   | 1               | 0.87                                | 1            | 0.87                           | -0.14
os274   | 0               | 0.22                                | 0            | 0.78                           | -0.25
ye923   | 0               | 0.33                                | 0            | 0.67                           | -0.4

• The binary cross-entropy for the batch is the negative average of the Log column.

Categorical Cross-Entropy (for Multi-Class Classification)
• Categorical Cross-Entropy (CCE), also known as softmax loss or log loss, is one of the most commonly used loss functions in machine learning, particularly for classification problems.
• It measures the difference between the predicted probability distribution and the actual (true) distribution of classes.

Calculating Categorical Cross-Entropy
• Let's break down the categorical cross-entropy calculation with a mathematical example using the following true labels and predicted probabilities.
• We have 3 samples, each belonging to one of 3 classes (Class A, Class B, or Class C). The true labels are one-hot encoded.

True Labels (y_true):
Example 1: Class B → [0, 1, 0]
Example 2: Class A → [1, 0, 0]
Example 3: Class C → [0, 0, 1]
Predicted Probabilities (y_pred):
Example 1: [0.1, 0.8, 0.1]
Example 2: [0.7, 0.2, 0.1]
Example 3: [0.2, 0.3, 0.5]

Calculating Categorical Cross-Entropy
• With one-hot labels, the per-sample loss reduces to the negative log of the probability predicted for the true class.
• Final Losses:
  • For Example 1 (true class B): -log(0.8) = 0.22314355
  • For Example 2 (true class A): -log(0.7) = 0.35667494
  • For Example 3 (true class C): -log(0.5) = 0.69314718
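A minimal sketch that reproduces the three losses above from the one-hot labels and predicted probabilities; log is again the natural logarithm.

import numpy as np

# One-hot true labels and predicted probabilities from the worked example.
y_true = np.array([[0, 1, 0],   # Example 1: Class B
                   [1, 0, 0],   # Example 2: Class A
                   [0, 0, 1]])  # Example 3: Class C
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.7, 0.2, 0.1],
                   [0.2, 0.3, 0.5]])

# Per-sample categorical cross-entropy: -sum(y_true * log(y_pred)).
# With one-hot labels this picks out -log of the true class's probability.
losses = -np.sum(y_true * np.log(y_pred), axis=1)
print(losses)         # [0.22314355 0.35667494 0.69314718]
print(losses.mean())  # the batch loss is the average over the samples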
How Categorical Cross-Entropy Works
• Prediction of Probabilities - The model outputs probabilities for each class. These probabilities are the likelihood of a data point belonging to each class. Typically, this is done using a softmax function, which converts raw scores into probabilities.
• Comparison with True Class - Categorical cross-entropy compares the predicted probabilities with the actual class labels (one-hot encoded).
• Calculation of Loss - The logarithm of the predicted probability for the correct class is taken, and the loss function penalizes the model based on how far the prediction was from the actual class.

Sparse Categorical Cross-Entropy
• Similar to categorical cross-entropy, but used when target labels are integer-encoded instead of one-hot encoded. Each label is an integer corresponding to a class index.
• Example: If the correct label is "Cat," it is represented as the integer 1 (since "Cat" is the second class, counting from 0). If the model predicts probabilities [0.2, 0.7, 0.1], the loss is calculated for the correct class (Cat) as -log(0.7).
• Sparse categorical cross-entropy computes the loss directly from the integer class index. This is mathematically equivalent to one-hot encoding the labels first, but it saves memory and computational resources, especially for datasets containing a large number of classes.

How Does the Loss Function Affect Model Training?
• The loss function provides a numerical value that the optimizer (e.g., Gradient Descent, Adam, RMSprop) minimizes.
• The network updates its weights using backpropagation, which calculates the gradient of the loss function with respect to each parameter.

Summary of loss functions covered in this module (selected in code in the sketch below):
• MSE
• MAE
• Binary Cross-Entropy
• Categorical Cross-Entropy
• Sparse Categorical Cross-Entropy
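As a closing sketch, here is how each of these losses is selected in TensorFlow/Keras, the library named in the learning outcomes; the model architecture and shapes are placeholders, not from the original slides.

import tensorflow as tf

# Placeholder architecture: 20 input features, 10 output classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# The loss string is chosen to match the task and the label format:
#   regression:                   loss='mse' or loss='mae'
#   binary classification:        loss='binary_crossentropy'
#   multi-class, one-hot labels:  loss='categorical_crossentropy'
#   multi-class, integer labels:  loss='sparse_categorical_crossentropy'
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])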