Loss function for Linear regression in Machine Learning
Last Updated: 29 Jul, 2024
The loss function quantifies the disparity between the predicted value and the actual value. In linear regression, the aim is to fit a linear equation to the observed data, and the loss function evaluates the difference between the predicted and true values. By minimizing this difference, the model strives to find the best-fitting line that captures the relationship between the input features and the target variable.
In this article, we will discuss Mean Squared Error (MSE), Mean Absolute Error (MAE) and Huber Loss.
Mean Squared Error (MSE)
One of the most commonly used loss functions in linear regression is the Mean Squared Error (MSE). It is computed as the average of the squared differences between the predicted values and the actual values:
MSE = (1/n) * Σ(y_pred - y_true)^2
where,
- y_pred is the predicted value
- y_true is the true value
- n is the number of data points
Because of the squaring operation, the MSE penalizes large errors more severely than small ones. This makes it sensitive to outliers, which can greatly inflate the MSE. On the other hand, the MSE is differentiable everywhere, which is a desirable property for gradient-based optimization techniques in machine learning.
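Because the MSE is differentiable, gradient-based methods can minimize it directly. The following is a minimal illustrative sketch, not a library routine: the toy data, learning rate lr, and parameters w and b are all made up for this example.
Python
import numpy as np

# Toy data generated from y = 2x + 1 (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

w, b, lr = 0.0, 0.0, 0.01  # initial slope, intercept, learning rate
for _ in range(2000):
    error = (w * x + b) - y
    # Gradients of MSE = mean(error^2) with respect to w and b
    w -= lr * 2 * np.mean(error * x)
    b -= lr * 2 * np.mean(error)

print("w:", round(w, 2), "b:", round(b, 2))  # converges toward w = 2, b = 1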
Computing Mean Squared Error in Python
Python
import numpy as np
def mse(y_true, y_pred):
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    return np.mean((y_true - y_pred) ** 2)

# Example usage
y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]
print("Mean Squared Error:", mse(y_true, y_pred))
Output:
Mean Squared Error: 1.75
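As a quick check: the residuals y_pred - y_true are (1, -1, -1, -2), so the squared errors are (1, 1, 1, 4) and their mean is 7/4 = 1.75.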
Computing Mean Squared Error using Sklearn Library
Python
from sklearn.metrics import mean_squared_error
# Example usage
y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]
print("Mean Squared Error:", mean_squared_error(y_true, y_pred))
Output:
Mean Squared Error: 1.75
Mean Absolute Error (MAE)
Another commonly used loss function for linear regression is the Mean Absolute Error (MAE). It is computed as the average of the absolute differences between the predicted values and the actual values:
MAE = (1/n) * Σ|y_pred - y_true|
Since the MAE does not square the errors, it is less sensitive to outliers than the MSE: every error contributes in proportion to its magnitude, so a single large error cannot dominate the loss. However, the MAE is not differentiable at zero, which can cause difficulties for some optimization techniques.
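Optimizers typically sidestep the kink at zero by using a subgradient: with respect to each prediction, it is simply the sign of the residual, scaled by 1/n. A small sketch (the helper name mae_subgradient is our own, not a library function):
Python
import numpy as np

# Subgradient of MAE with respect to the predictions: sign(y_pred - y_true) / n.
# At a residual of exactly zero any value in [-1, 1] is a valid subgradient;
# np.sign conveniently returns 0 there.
def mae_subgradient(y_true, y_pred):
    residual = np.asarray(y_pred) - np.asarray(y_true)
    return np.sign(residual) / len(residual)

print(mae_subgradient([3, 6, 8, 12], [4, 5, 7, 10]))
# [ 0.25 -0.25 -0.25 -0.25]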
Computing Mean Absolute Error in Python
Python
import numpy as np
def mae(y_true, y_pred):
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    return np.mean(np.abs(y_true - y_pred))

# Example usage
y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]
print("Mean Absolute Error:", mae(y_true, y_pred))
Output:
Mean Absolute Error: 1.25
Computing Mean Absolute Error using Sklearn
Python
from sklearn.metrics import mean_absolute_error
# Example usage
y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]
print("Mean Absolute Error:",mean_absolute_error(y_true, y_pred))
Output:
Mean Absolute Error: 1.25
Huber Loss
The Huber Loss combines the MSE and the MAE. It is designed to be less sensitive to outliers than the MSE while remaining differentiable:
Huber Loss = (1/n) * Σ L_δ(y_pred - y_true)
where L_δ is the Huber loss function applied to each residual, defined as:
L_\delta(x)=\begin{cases} 0.5x^2 & \text{if } |x|\leq \delta\\ \delta(|x|-0.5\delta) & \text{otherwise} \end{cases}
The Huber Loss behaves like the MSE for small errors (|x| ≤ δ) and like the MAE for large errors (|x| > δ). The parameter δ determines the point of transition between the two regimes.
Computing Huber Loss in Python
Python
import numpy as np
def huber_loss(y_true, y_pred, delta):
    residual = y_true - y_pred
    loss = np.where(np.abs(residual) <= delta,
                    0.5 * residual ** 2,
                    delta * (np.abs(residual) - 0.5 * delta))
    return np.mean(loss)
# Example usage:
y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
delta = 1.0
print("Huber Loss:", huber_loss(y_true, y_pred, delta))
Output:
Huber Loss: 0.1875
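Here the residuals y_true - y_pred are (0.5, -0.5, 0, -1); all have magnitude at most δ = 1, so each contributes 0.5·x², and the mean of (0.125, 0.125, 0, 0.5) is 0.1875.
When the Huber Loss is used for fitting rather than evaluation, scikit-learn provides HuberRegressor, whose epsilon parameter plays a role similar to δ (it is applied to scaled residuals, so the two values are not directly interchangeable). A brief sketch on made-up data with one injected outlier:
Python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

# Toy data following y = 2x + 1, with one outlier appended
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1
y[-1] += 30  # corrupt the last target value

ols = LinearRegression().fit(X, y)
huber = HuberRegressor(epsilon=1.35).fit(X, y)

print("OLS slope:  ", round(ols.coef_[0], 2))
print("Huber slope:", round(huber.coef_[0], 2))
On this data the ordinary least-squares slope is pulled well above the true value of 2 by the outlier, while the Huber fit stays close to it.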
Comparison of Loss Functions for Linear Regression
In this section, we compare the loss functions discussed above: Mean Squared Error (MSE), Mean Absolute Error (MAE), and Huber Loss. The code below proceeds as follows:
- First, it calculates the MSE and MAE using the mean_squared_error and mean_absolute_error functions from the sklearn.metrics module.
- Then, it defines a custom function huber_loss to compute the Huber Loss, which is a combination of MSE and MAE, offering a balance between robustness to outliers and smoothness.
- Next, it calculates the Huber Loss with a specified delta value (delta=1.0) using the implemented huber_loss function.
- Finally, it plots the values of these loss functions for visualization using matplotlib, with labels indicating the type of loss function.
The plot provides a visual comparison of the loss values for the different functions, allowing you to observe their behavior and relative magnitudes.
Python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error
# Sample target and predicted values
y_true = np.array([3, 7, 4, 1, 8, 5])
y_pred = np.array([4, 6, 5, 3, 7, 6])
# Calculate MSE and MAE
mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
# Huber Loss implementation
def huber_loss(y_true, y_pred, delta=1.0):
    error = np.abs(y_true - y_pred)
    loss = np.where(error <= delta, 0.5 * error**2, delta * error - 0.5 * delta**2)
    return np.mean(loss)
huber_delta1 = huber_loss(y_true, y_pred, delta=1.0)
# Plot the loss functions
losses = [mse, mae, huber_delta1]
labels = ['MSE', 'MAE', 'Huber Loss (delta=1)']
# Providing x-values explicitly for plotting
x = np.arange(len(losses))
plt.figure(figsize=(10, 6))
plt.bar(x, losses, tick_label=labels)
plt.xlabel('Loss Function')
plt.ylabel('Loss Value')
plt.title('Comparison of Loss Functions')
plt.show()
Output:
[Bar chart comparing the loss values for MSE, MAE, and Huber Loss (delta=1)]
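The bar chart compares magnitudes on well-behaved data; the contrast between the losses becomes far more pronounced once an outlier appears. A quick numerical illustration, reusing the huber_loss function defined above (the corrupted prediction value of 20.0 is made up for this demonstration):
Python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

def huber_loss(y_true, y_pred, delta=1.0):
    error = np.abs(y_true - y_pred)
    loss = np.where(error <= delta, 0.5 * error**2, delta * error - 0.5 * delta**2)
    return np.mean(loss)

y_true = np.array([3.0, 7.0, 4.0, 1.0, 8.0, 5.0])
y_clean = np.array([4.0, 6.0, 5.0, 3.0, 7.0, 6.0])
y_outlier = y_clean.copy()
y_outlier[0] = 20.0  # a single wildly wrong prediction

for name, y_pred in [("clean", y_clean), ("with outlier", y_outlier)]:
    print(name,
          "MSE:", round(mean_squared_error(y_true, y_pred), 2),
          "MAE:", round(mean_absolute_error(y_true, y_pred), 2),
          "Huber:", round(huber_loss(y_true, y_pred), 2))
The MSE jumps from 1.5 to 49.5 when the single outlier is introduced, while the MAE rises only from about 1.17 to 3.83 and the Huber Loss from about 0.67 to 3.33.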
In linear regression, the choice of loss function depends on the particular problem and the properties of the data. The MSE is often used when the errors are expected to be normally distributed and outliers are not a significant concern. When robustness to outliers is crucial, the MAE is the recommended choice, while the Huber Loss offers robustness without sacrificing differentiability.