Linear Regression
Linear regression is a type of supervised machine learning algorithm that models the linear relationship
between a dependent variable and one or more independent features. When there is a single independent
feature, it is known as univariate (simple) linear regression; when there is more than one feature, it is known
as multivariate linear regression.
Linear regression makes predictions for continuous/real or numeric variables such as sales, salary, age, product
price, etc.
Linear regression is a powerful tool for understanding and predicting the behaviour
of a variable; however, a few conditions need to be met for it to produce accurate
and dependable results.
The linear regression algorithm models a linear relationship between a dependent variable (y) and one or
more independent variables (x), hence the name linear regression. Because it assumes this linear relationship,
it finds how the value of the dependent variable changes according to the value of the independent variable.
The linear regression model provides a sloped straight line representing the relationship
between the variables.
Simple Linear Regression: This is the simplest form of linear regression; it involves only one independent variable and one dependent variable. The equation for simple linear regression is:

$\hat{y} = \theta_1 + \theta_2 x$

where:

$\hat{y}$ is the predicted value of the dependent variable,
$x$ is the independent variable,
θ1 is the intercept, and
θ2 is the slope (the coefficient of x).
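As a quick illustration, here is a minimal sketch of how this equation produces a prediction; the values θ1 = 2 and θ2 = 3 are made up for the example, not taken from the text:

```python
# Minimal sketch of the simple linear regression equation y_hat = theta1 + theta2 * x.
# theta1 and theta2 are made-up example values, not fitted parameters.
theta1 = 2.0   # intercept
theta2 = 3.0   # coefficient (slope) of x

def predict(x):
    """Return the predicted value of y for a given x."""
    return theta1 + theta2 * x

print(predict(4.0))  # 2.0 + 3.0 * 4.0 = 14.0
```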
Our primary objective when using linear regression is to locate the best-fit line, which
means the error between the predicted and actual values should be kept to a
minimum. The best-fit line is the line with the least error.
The best-fit line equation provides a straight line that represents the relationship between
the dependent and independent variables. The slope of the line indicates how much the
dependent variable changes for a unit change in the independent variable(s).
Because different values for the weights (the coefficients of the line) produce different
regression lines, we use a cost function to compute the best values and obtain the
best-fit line.
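For linear regression the cost function is typically the Mean Squared Error (MSE). A minimal sketch of how it can be computed with NumPy follows; the data values and array names are illustrative, not from the text:

```python
import numpy as np

def mse_cost(theta1, theta2, x, y):
    """Mean Squared Error cost J(theta1, theta2) for the line y_hat = theta1 + theta2 * x."""
    y_pred = theta1 + theta2 * x          # predictions for every data point
    return np.mean((y_pred - y) ** 2)     # average squared error

# Illustrative data: experience (x) vs. salary (y); values are made up.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([30.0, 35.0, 41.0, 44.0, 50.0])
print(mse_cost(25.0, 5.0, x, y))
```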
As assumed earlier, our independent feature is the experience, i.e. X,
and the respective salary Y is the dependent variable. Assuming a linear
relationship between X and Y, the salary can be predicted using:

$\hat{Y} = \theta_1 + \theta_2 X$

or, equivalently, for the i-th data point:

$\hat{y}_i = \theta_1 + \theta_2 x_i$

Here,

$\hat{Y}$ (or $\hat{y}_i$) is the predicted salary,
X (or $x_i$) is the experience,
θ1: intercept
θ2: coefficient of x

The model gets the best regression fit line by finding the best θ1 and θ2 values.
Once we find the best θ1 and θ2 values, we get the best-fit line. So when we are
finally using our model for prediction, it will predict the value of y for the input
value of x.
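To make this concrete, here is a minimal sketch of finding θ1 and θ2 with the ordinary least-squares closed form for a single feature; the experience/salary numbers are made up for illustration:

```python
import numpy as np

# Illustrative data: years of experience (X) and salary (Y); values are made up.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([30.0, 35.0, 41.0, 44.0, 50.0])

# Ordinary least-squares closed form for one feature:
# theta2 = cov(X, Y) / var(X), theta1 = mean(Y) - theta2 * mean(X)
theta2 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
theta1 = Y.mean() - theta2 * X.mean()

print(theta1, theta2)          # fitted intercept and slope
print(theta1 + theta2 * 6.0)   # predicted salary for 6 years of experience
```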
To achieve the best-fit regression line, the model aims to predict the target value
such that the difference between the predicted value and the true value Y is
minimal. So it is very important to update the θ1 and θ2 values to reach the values
that minimize the error between the predicted y value ($\hat{y}$) and the true y
value (y). This error is measured by the cost function, the Mean Squared Error (MSE):

$J(\theta_1, \theta_2) = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$
Gradient Descent for Linear Regression
A linear regression model can be trained using the optimization algorithm gradient
descent, which iteratively modifies the model’s parameters to reduce the mean
squared error (MSE) of the model on a training dataset. The model uses gradient
descent to update the θ1 and θ2 values so that the cost function is reduced (the MSE
is minimized) and the best-fit line is achieved. The idea is to start with random θ1 and θ2
values and then iteratively update them, reaching the minimum cost.
A gradient is nothing but a derivative: it describes how the output of the
function changes when the inputs are varied by a small amount.
Let’s differentiate the cost function J with respect to θ1 and θ2:

$\frac{\partial J}{\partial \theta_1} = \frac{2}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)$

$\frac{\partial J}{\partial \theta_2} = \frac{2}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)\, x_i$
Finding the coefficients of a linear equation that best fits the training data is the
objective of linear regression. The coefficients can be adjusted by moving in the
direction of the negative gradient of the Mean Squared Error with respect to the
coefficients. If α is the learning rate, the respective updates for the intercept and
the coefficient of X are:

$\theta_1 := \theta_1 - \alpha \frac{\partial J}{\partial \theta_1}$

$\theta_2 := \theta_2 - \alpha \frac{\partial J}{\partial \theta_2}$
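Putting the update rules together, here is a minimal sketch of gradient descent for simple linear regression in NumPy; the data, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, n_iters=1000):
    """Fit y ~ theta1 + theta2 * x by gradient descent on the MSE cost."""
    theta1, theta2 = 0.0, 0.0            # start from (arbitrary) initial values
    n = len(x)
    for _ in range(n_iters):
        y_pred = theta1 + theta2 * x     # current predictions
        error = y_pred - y
        grad1 = (2.0 / n) * np.sum(error)        # dJ/dtheta1
        grad2 = (2.0 / n) * np.sum(error * x)    # dJ/dtheta2
        theta1 -= alpha * grad1          # step against the gradient
        theta2 -= alpha * grad2
    return theta1, theta2

# Illustrative experience/salary data (made up).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([30.0, 35.0, 41.0, 44.0, 50.0])
theta1, theta2 = gradient_descent(x, y, alpha=0.05, n_iters=5000)
print(theta1, theta2)
```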
Evaluation Metrics for Linear Regression
Total Sum of Squares (TSS): The sum of the squared deviations of the data points from the
mean of the response variable is known as the total sum of squares, or TSS:

$TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$
Root Mean Squared Error (RMSE): The square root of the residuals’ variance is
the Root Mean Squared Error. It describes how well the observed data points
match the predicted values, i.e. the model’s absolute fit to the data.
In mathematical notation, it can be expressed as:

$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}$
To obtain an unbiased estimate, the sum of the squared residuals is divided by the
number of degrees of freedom rather than by the total number of data points in the
model. This figure is referred to as the Residual Standard Error (RSE). In mathematical
notation, it can be expressed as (for simple linear regression with two estimated parameters):

$RSE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}$
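A minimal NumPy sketch of these metrics, assuming arrays y (true values) and y_pred (model predictions) from a simple two-parameter linear regression; the numbers are made up for illustration:

```python
import numpy as np

def regression_metrics(y, y_pred):
    """Compute TSS, RMSE, and RSE for a simple (two-parameter) linear regression."""
    n = len(y)
    residuals = y - y_pred
    tss = np.sum((y - y.mean()) ** 2)                # total sum of squares
    rmse = np.sqrt(np.sum(residuals ** 2) / n)       # root mean squared error
    rse = np.sqrt(np.sum(residuals ** 2) / (n - 2))  # residual standard error
    return tss, rmse, rse

# Illustrative values (made up).
y = np.array([30.0, 35.0, 41.0, 44.0, 50.0])
y_pred = np.array([29.8, 34.9, 40.0, 45.1, 50.2])
print(regression_metrics(y, y_pred))
```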
RMSE is not as informative a metric as R-squared: because its value depends on the
units of the variables (it is not a normalized measure), Root Mean Squared Error can
fluctuate when the units of the variables vary.