
Linear Regression in Machine Learning

Linear regression is a type of supervised machine learning algorithm that computes the linear relationship
between a dependent variable and one or more independent features. When there is only one independent
feature, it is known as univariate (simple) linear regression; when there is more than one feature, it is known
as multivariate (multiple) linear regression.

Linear regression makes predictions for continuous (real-valued or numeric) variables such as sales, salary, age, or product
price.
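
For a concrete illustration, below is a minimal sketch of fitting and using such a model with scikit-learn's LinearRegression; the experience/salary numbers are made up purely for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])   # years of experience (independent variable)
y = np.array([30000, 35000, 41000, 44000, 50000])   # salary (dependent variable, made-up numbers)

model = LinearRegression()
model.fit(X, y)                        # learn the intercept and slope

print(model.intercept_, model.coef_)   # fitted parameters
print(model.predict([[6.0]]))          # predicted salary for 6 years of experience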

Assumptions for the Linear Regression Model

Linear regression is a powerful tool for understanding and predicting the behaviour
of a variable; however, the data needs to meet a few conditions for the model to be accurate
and dependable. These assumptions are listed below, followed by a short code sketch for checking some of them.

1. Linearity: The independent and dependent variables have a linear relationship
with one another. This implies that changes in the dependent variable follow changes in the
independent variable(s) in a linear fashion, so a straight line can be drawn through the data
points. If the relationship is not linear, then linear regression will not be an accurate model.

2. Independence: The observations in the dataset are independent of each other.
This means that the value of the dependent variable for one observation does not depend on the
value of the dependent variable for another observation. If the observations are not
independent, then linear regression will not be an accurate model.

3. Homoscedasticity: The variance of the errors is constant across all levels of the
independent variable(s). In other words, the value of the independent variable(s) has no impact
on the variance of the errors. If the variance of the residuals is not constant, then linear
regression will not be an accurate model.

4. Normality: The residuals should be normally distributed. This means that the
residuals should follow a bell-shaped curve. If the residuals are not normally distributed,
then linear regression will not be an accurate model.

5. No multicollinearity: There is little or no correlation between the independent
variables. Multicollinearity occurs when two or more independent variables are highly
correlated with each other, which can make it difficult to determine the individual effect of
each variable on the dependent variable. If there is multicollinearity, then linear regression
will not be an accurate model.
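
As a rough illustration of how some of these assumptions can be checked in practice, the sketch below fits a model on a made-up two-feature dataset and then inspects the residuals (for homoscedasticity and normality) and the correlation between features (for multicollinearity). The dataset and all numbers are synthetic assumptions, not part of the original discussion.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                  # two independent variables (synthetic)
y = 3.0 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# Homoscedasticity / normality: the residuals should have roughly constant spread
# and be approximately bell-shaped around zero.
print("residual mean:", residuals.mean())
print("residual std :", residuals.std())

# No multicollinearity: pairwise correlation between the independent variables.
# Correlations close to +/-1 between different features would signal a problem.
print("feature correlation:\n", np.corrcoef(X, rowvar=False))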

The linear regression algorithm models a linear relationship between a dependent variable (y) and one or
more independent variables (x), hence the name linear regression. Because the relationship is linear,
the model describes how the value of the dependent variable changes as the value of the independent
variable changes.

The linear regression model provides a sloped straight line representing the relationship
between the variables.

Types of Linear Regression

There are two main types of linear regression:

Simple Linear Regression: This is the simplest form of linear regression, and it
involves only one independent variable and one dependent variable. The
equation for simple linear regression is:

Y = β0 + β1X

where:

Y is the dependent variable
X is the independent variable
β0 is the intercept
β1 is the slope
Multiple Linear Regression: This involves more than one independent variable
and one dependent variable. The equation for multiple linear regression is:

Y = β0 + β1X1 + β2X2 + … + βpXp

where:

Y is the dependent variable
X1, X2, …, Xp are the independent variables
β0 is the intercept
β1, β2, …, βp are the slopes
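
Both forms can also be fitted in closed form with ordinary least squares. The sketch below uses NumPy's least-squares solver on made-up data to recover β0 and the slopes; the normal-equation approach itself is an addition here, not something covered above.

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))                   # X1, X2, X3 (synthetic)
y = 2.0 + 1.0 * X[:, 0] - 0.5 * X[:, 1] + 0.25 * X[:, 2] + rng.normal(scale=0.1, size=50)

# Prepend a column of ones so the first coefficient plays the role of the intercept β0.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: β = (XᵀX)⁻¹ Xᵀy, computed here with a stable least-squares solver.
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print("β0 (intercept):", beta[0])
print("β1..βp (slopes):", beta[1:])

In practice a solver such as np.linalg.lstsq is preferred over explicitly inverting XᵀX, since it is more numerically stable.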

Some other regression types are:

Polynomial Regression – Polynomial regression goes beyond simple linear
regression by incorporating higher-order polynomial terms of the independent
variable(s) into the model. It is represented by the general equation:

Y = β0 + β1X + β2X² + … + βnXⁿ

Ridge Regression – Ridge regression is a regularization technique used to prevent
overfitting in linear regression models, especially when dealing with multiple
independent variables. It introduces a penalty term to the least squares objective
function, biasing the model towards solutions with smaller coefficients. The objective
for ridge regression becomes:

minimize Σ(Yi − Ŷi)² + λ Σ βj²

Lasso Regression – Lasso regression is another regularization technique that uses
an L1 penalty term to shrink the coefficients of less important independent variables
towards zero, effectively performing feature selection. The objective for lasso
regression becomes:

minimize Σ(Yi − Ŷi)² + λ Σ |βj|

Elastic Net Regression – Elastic net regression combines the penalties of ridge and
lasso regression, offering a balance between their strengths. It uses a mixed penalty
term of the form λ1 Σ |βj| + λ2 Σ βj².
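
For illustration, all three regularized variants are available in scikit-learn. A brief sketch follows; the alpha and l1_ratio values are arbitrary illustrative choices, not recommendations.

import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))                  # five independent variables (synthetic)
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

for name, reg in [
    ("ridge", Ridge(alpha=1.0)),                            # L2 penalty on the coefficients
    ("lasso", Lasso(alpha=0.1)),                            # L1 penalty, can zero out coefficients
    ("elastic net", ElasticNet(alpha=0.1, l1_ratio=0.5)),   # mix of L1 and L2 penalties
]:
    reg.fit(X, y)
    print(name, reg.coef_.round(3))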
The goal of the algorithm is to find the best-fit line equation that can predict the values based on
the independent variables.

What is the best Fit Line?

Our primary objective while using linear regression is to locate the best-fit line, which
means the error between the predicted and actual values should be kept to a
minimum. The best-fit line is the line with the least error.

The best-fit line equation provides a straight line that represents the relationship between
the dependent and independent variables. The slope of the line indicates how much the
dependent variable changes for a unit change in the independent variable(s).

Positive Linear Relationship:
If the dependent variable increases on the Y-axis as the independent variable increases on the
X-axis, then such a relationship is termed a positive linear relationship.

Negative Linear Relationship:
If the dependent variable decreases on the Y-axis as the independent variable increases on
the X-axis, then such a relationship is called a negative linear relationship.

Linear regression performs the task of predicting a dependent variable value (y) based
on a given independent variable (x); hence the name linear regression. For example, X (input)
could be the work experience and Y (output) the salary of a person. The regression line is the
best-fit line for our model.

Since different values for the weights (the coefficients of the line) result in different
regression lines, we use a cost function to compute the values that give the best-fit line.

Hypothesis function for Linear Regression

As assumed earlier, our independent feature is the experience, X, and the corresponding
salary, Y, is the dependent variable. Assuming a linear relationship between X and Y, the
salary can be predicted using:

Ŷ = θ1 + θ2·X

or, equivalently, for each training example:

ŷi = θ1 + θ2·xi

Here,

yi (i = 1, …, n) are the labels of the data (supervised learning),
xi (i = 1, …, n) are the input independent training data (univariate – one input variable/parameter), and
ŷi are the predicted values.

The model gets the best regression fit line by finding the best θ1 and θ2 values.

θ1: intercept
θ2: coefficient of x

Once we find the best θ1 and θ2 values, we get the best-fit line. So when we are
finally using our model for prediction, it will predict the value of y for the input
value of x.

Cost function for Linear Regression


The cost function, or loss function, is the error or difference between the
predicted value and the true value Y. Here it is the Mean Squared Error
(MSE) between the predicted values and the true values. The cost function (J) can be
written as:

J(θ1, θ2) = (1/n) Σi=1..n (ŷi − yi)²
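
To make this concrete, here is a small sketch of the MSE cost for the univariate hypothesis ŷ = θ1 + θ2·x, using made-up arrays for illustration.

import numpy as np

def mse_cost(theta1, theta2, x, y):
    # Mean squared error between predictions θ1 + θ2·x and the targets y.
    y_hat = theta1 + theta2 * x
    return np.mean((y_hat - y) ** 2)

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])       # exactly y = 1 + 2x
print(mse_cost(1.0, 2.0, x, y))          # 0.0: these parameters fit the data perfectly
print(mse_cost(0.0, 2.0, x, y))          # 1.0: a worse choice of parameters gives a higher cost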

How to update θ1 and θ2 values to get the best-fit line?

To achieve the best-fit regression line, the model aims to predict the target value
such that the error between the predicted value and the true value Y is
minimum. So, it is very important to update the θ1 and θ2 values in order to reach the
values that minimize the error between the predicted value ŷ and the true value y.
Gradient Descent for Linear Regression

A linear regression model can be trained with the optimization algorithm gradient
descent, which iteratively modifies the model’s parameters to reduce the mean
squared error (MSE) of the model on a training dataset. The model uses gradient descent
to update the θ1 and θ2 values, reducing the cost function and achieving the best-fit
line. The idea is to start with random θ1 and θ2 values and then iteratively update them
until the cost reaches its minimum.

A gradient is simply a derivative that describes how the output of a function changes
when its inputs are varied slightly.

Differentiating the cost function J with respect to θ1 gives:

∂J/∂θ1 = (2/n) Σi=1..n (ŷi − yi)

Differentiating the cost function J with respect to θ2 gives:

∂J/∂θ2 = (2/n) Σi=1..n (ŷi − yi) · xi

Finding the coefficients of a linear equation that best fits the training data is the
objective of linear regression. The coefficients are adjusted by moving in the direction of
the negative gradient of the Mean Squared Error with respect to the coefficients. If α is
the learning rate, the intercept and the coefficient of X are updated as:

θ1 = θ1 − α · ∂J/∂θ1
θ2 = θ2 − α · ∂J/∂θ2
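
Putting the update rules together, below is a sketch of gradient descent for the univariate model in plain NumPy; the data, learning rate, and iteration count are illustrative assumptions only.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])     # roughly y = 1 + 2x, made-up data

theta1, theta2 = 0.0, 0.0                    # start from arbitrary parameter values
alpha = 0.01                                 # learning rate (illustrative choice)
n = len(x)

for _ in range(5000):
    y_hat = theta1 + theta2 * x
    error = y_hat - y
    # Partial derivatives of the MSE cost with respect to θ1 and θ2.
    grad_theta1 = (2.0 / n) * error.sum()
    grad_theta2 = (2.0 / n) * (error * x).sum()
    # Step against the gradient to reduce the cost.
    theta1 -= alpha * grad_theta1
    theta2 -= alpha * grad_theta2

print("intercept θ1:", theta1)
print("slope     θ2:", theta2)

In this simple case the closed-form least-squares solution would give the same answer, but the same loop generalizes to models where no closed form exists.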
Evaluation Metrics for Linear Regression

A variety of evaluation measures can be used to determine the strength of any
linear regression model. These assessment metrics give an indication of how
well the model is producing the observed outputs.

The most common measurements are:

Coefficient of Determination (R-squared): R-squared is a statistic that indicates
how much of the variation the developed model can explain or capture. It is always in
the range of 0 to 1. In general, the better the model matches the data, the
greater the R-squared value. In mathematical notation, it can be expressed as:

R² = 1 − RSS / TSS

Residual Sum of Squares (RSS): The sum of the squares of the residuals for each
data point is known as the residual sum of squares, or RSS. It is a measurement
of the difference between the observed output and what was predicted:

RSS = Σi=1..n (yi − ŷi)²

Total Sum of Squares (TSS): The sum of the squared deviations of the data points
from the mean of the response variable is known as the total sum of squares, or TSS:

TSS = Σi=1..n (yi − ȳ)²

Root Mean Squared Error (RMSE): The square root of the mean of the squared residuals is
the Root Mean Squared Error. It describes how well the observed data points
match the predicted values, i.e. the model’s absolute fit to the data.
In mathematical notation, it can be expressed as:

RMSE = √(RSS / n)

To obtain an unbiased estimate, the sum of the squared residuals is divided by the number
of degrees of freedom rather than by the total number of data points in the model. The
resulting figure is referred to as the Residual Standard Error (RSE). In mathematical
notation, it can be expressed as:

RSE = √( RSS / (n − p − 1) )

where p is the number of independent variables (so n − 2 degrees of freedom for simple linear regression).

RMSE is not as good a metric as R-squared. Root Mean Squared Error can
fluctuate when the units of the variables vary, since its value depends on the
variables’ units (it is not a normalized measure).
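
As a from-scratch sketch, the snippet below computes these metrics for a pair of made-up observed/predicted arrays.

import numpy as np

y_true = np.array([3.0, 5.0, 7.5, 9.0, 11.0])   # observed values (made up)
y_pred = np.array([3.2, 4.8, 7.1, 9.4, 10.9])   # model predictions (made up)
n, p = len(y_true), 1                           # p = number of independent variables

rss = np.sum((y_true - y_pred) ** 2)            # residual sum of squares
tss = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
r_squared = 1.0 - rss / tss                     # coefficient of determination
rmse = np.sqrt(rss / n)                         # root mean squared error
rse = np.sqrt(rss / (n - p - 1))                # residual standard error

print(f"RSS={rss:.3f}  TSS={tss:.3f}  R^2={r_squared:.3f}  RMSE={rmse:.3f}  RSE={rse:.3f}")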
