4-Loss Function
Loss Function
• Based on the type of problem, we can have the following loss functions:
• Regression Problem:
• Mean Squared Error (MSE): MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)², where the sum runs over the n samples
• Advantages
• The MSE is a quadratic function of the error, so for a linear model the loss surface is convex, with a
single global minimum and no local minima.
• By squaring the residuals, this loss function penalizes the model more heavily for large errors.
• Disadvantages
• It is not robust to outliers, since squaring amplifies the effect of extreme values (see the sketch below).
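• A minimal NumPy sketch of the MSE (the function name and sample values are illustrative, not part of the slides):

import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: average of the squared residuals.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_true - y_pred) ** 2)

print(mse([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))  # (0.25 + 0 + 4) / 3 ≈ 1.417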
Loss Function
• Based on the type of problem, we can have the following loss functions:
• Regression Problem:
• Mean Absolute Error (MAE): MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|, where the sum runs over the n samples
• Advantages
• More robust to outliers, since the errors are not squared.
• Disadvantages
• It is not differentiable at zero, and its constant gradient can slow convergence near the minimum
(see the sketch below).
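• A minimal NumPy sketch of the MAE (function name and sample values are illustrative):

import numpy as np

def mae(y_true, y_pred):
    # Mean absolute error: average of the absolute residuals.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs(y_true - y_pred))

print(mae([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))  # (0.5 + 0 + 2) / 3 ≈ 0.833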
Loss Function
• Based on the type of problem, we can have the following loss functions:
• Regression Problem:
• Huber Loss: The Huber loss combines characteristics of the mean squared error (MSE) and mean
absolute error (MAE) losses: it is quadratic for small residuals and linear for large ones, so it keeps
MSE's smooth gradients near the minimum while remaining robust to outliers like MAE.
• For a threshold δ > 0: L_δ(y, ŷ) = ½(y − ŷ)² if |y − ŷ| ≤ δ, otherwise δ|y − ŷ| − ½δ².
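• A minimal NumPy sketch of the Huber loss (the function name and the default δ = 1.0 are illustrative choices):

import numpy as np

def huber(y_true, y_pred, delta=1.0):
    # Quadratic for residuals within ±delta, linear beyond that.
    r = np.asarray(y_true, float) - np.asarray(y_pred, float)
    quad = 0.5 * r ** 2
    lin = delta * np.abs(r) - 0.5 * delta ** 2
    return np.mean(np.where(np.abs(r) <= delta, quad, lin))

print(huber([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))  # (0.125 + 0 + 1.5) / 3 ≈ 0.542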
Loss Function
• Based on the type of problem, we can have the following loss functions:
• Classification Problems
• Hinge Loss
• Hinge loss is a function commonly used in support vector machines (SVM) and other classifiers
for binary classification tasks.
• It is designed to maximize the margin between the positive and negative classes in the data.
• For a binary classification problem with a ground truth label y (either -1 or +1) and the model's
decision function f(x), where x is the input data, the hinge loss is calculated as:
max(0, 1 - y * f(x))
• The objective during training is to minimize the hinge loss, which encourages the model to
correctly classify the samples while maximizing the margin between the classes.
• The hinge loss is convex, making it suitable for optimization in machine learning algorithms like
SVM.
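• A minimal NumPy sketch of the hinge loss (function name and the example labels/scores are illustrative):

import numpy as np

def hinge(y_true, scores):
    # y_true in {-1, +1}; scores are the raw decision values f(x).
    y_true, scores = np.asarray(y_true, float), np.asarray(scores, float)
    return np.mean(np.maximum(0.0, 1.0 - y_true * scores))

print(hinge([1, -1, 1], [0.8, -2.0, -0.3]))  # (0.2 + 0 + 1.3) / 3 = 0.5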
Loss Function
• Based on the type of problem, we can have the following loss functions:
• Classification Problems
• Binary cross entropy/Log loss function
• Binary Cross Entropy (BCE) is a loss function commonly used in binary classification tasks, where the
goal is to classify inputs into one of two classes (e.g., positive and negative, 0 and 1).
• Given a binary classification problem with a ground truth label y (either 0 or 1) and the predicted
probability of the positive class ŷ (a value between 0 and 1), the BCE loss function is calculated as
L(y, ŷ) = -[y * log(ŷ) + (1 - y) * log(1 - ŷ)]
• The objective during training is to minimize the BCE loss, which means making the predicted
probabilities closer to the true labels for both positive and negative samples.
• The BCE loss function is commonly used in binary classification problems. It is often combined with a
sigmoid activation function at the output layer of the neural network to ensure that the predicted
probabilities ŷ are within the range of [0, 1].
• Its advantage is that it is differentiable.
• Its disadvantage is that, when used with deep non-linear networks, the overall loss surface is
non-convex and can contain multiple local minima (see the sketch below).
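• A minimal NumPy sketch of the BCE loss (function name, the clipping constant eps, and sample values are illustrative; clipping avoids log(0)):

import numpy as np

def bce(y_true, y_prob, eps=1e-12):
    # y_true in {0, 1}; y_prob is the predicted probability of the positive class.
    y_true = np.asarray(y_true, float)
    y_prob = np.clip(np.asarray(y_prob, float), eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

print(bce([1, 0, 1], [0.9, 0.1, 0.8]))  # ≈ 0.145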
Loss Function
• Based on the type of problem, we can have the following loss functions:
• Classification Problems
• Categorical Cross Entropy / Softmax Cross Entropy / Cross Entropy Loss:
• It is a popular loss function used in multi-class classification tasks.
• The categorical cross entropy loss is calculated as L = − Σᵢ yᵢ log(ŷᵢ)
• Here i indexes the classes, yᵢ is the one-hot ground-truth label, and ŷᵢ is the predicted probability
for class i (typically a softmax output). See the sketch below.
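• A minimal NumPy sketch of the categorical cross entropy (function name, eps, and example values are illustrative):

import numpy as np

def categorical_cross_entropy(y_true_onehot, y_prob, eps=1e-12):
    # y_true_onehot: one-hot labels; y_prob: softmax outputs, one row per sample.
    y_prob = np.clip(np.asarray(y_prob, float), eps, 1.0)
    return -np.mean(np.sum(np.asarray(y_true_onehot) * np.log(y_prob), axis=-1))

y_true = np.array([[1, 0, 0], [0, 0, 1]])               # one-hot labels, 3 classes
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])   # softmax outputs
print(categorical_cross_entropy(y_true, y_prob))        # (−ln 0.7 − ln 0.6) / 2 ≈ 0.434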