5_AML Lecture 5_Linear regression
2. Consider the navigation of a mobile robot, say an autonomous car. The output is the angle by
which the steering wheel should be turned at each time step, to advance without hitting obstacles
or deviating from the route. Inputs are provided by sensors on the car, such as a video camera,
GPS, and so forth.
3. In finance, the capital asset pricing model uses regression to analyze and quantify the
systematic risk of an investment.
4. In economics, regression is the predominant empirical tool. For example, it is used to predict
consumption spending, inventory investment, purchases of a country’s exports, spending on
imports, labor demand, and labor supply.
Types of linear regression
The different regression models are defined based on the type of function used to represent the relation
between the dependent variable y and the independent variables.
• Simple Linear Regression
This is the simplest form of linear regression, and it involves only one independent variable and one
dependent variable. The equation for simple linear regression is:
y = β0 + β1X
where:
y is the dependent variable
X is the independent variable
β0 is the intercept
β1 is the slope
Example: The relationship between pollution levels and rising temperatures.
The value of the dependent variable is based on the value of the independent variable.
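As a minimal sketch of simple linear regression with scikit-learn (the pollution/temperature numbers below are made up purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: pollution level (X) vs. temperature rise (y)
X = np.array([[10], [20], [30], [40], [50]])  # one independent variable
y = np.array([0.5, 0.9, 1.4, 1.8, 2.3])       # one dependent variable

model = LinearRegression()
model.fit(X, y)

print("Intercept (beta0):", model.intercept_)
print("Slope (beta1):", model.coef_[0])
print("Prediction for X=60:", model.predict([[60]])[0])
```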
Multiple Linear Regression
• This involves more than one independent variable and one dependent variable. The equation for
multiple linear regression is:
y = β0 + β1X1 + β2X2 + … + βnXn
where:
• y is the dependent variable
• X1, X2, …, Xn are the independent variables
• β0 is the intercept
• β1, β2, …, βn are the slopes
Example: Consider the task of calculating blood pressure. In this case, height, weight, and amount of
exercise can be considered independent variables. Here, we can use multiple linear regression to
analyze the relationship between the three independent variables and one dependent variable, as all
the variables considered are quantitative.
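A minimal sketch of that blood-pressure example, with hypothetical height, weight, and exercise values; scikit-learn fits one slope per independent variable:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical rows: [height_cm, weight_kg, exercise_hours_per_week]
X = np.array([
    [170, 70, 3],
    [160, 80, 1],
    [180, 75, 5],
    [165, 90, 0],
    [175, 65, 4],
])
y = np.array([120, 135, 115, 145, 118])  # systolic blood pressure (made up)

model = LinearRegression()
model.fit(X, y)

print("Intercept (beta0):", model.intercept_)
print("Slopes (beta1..beta3):", model.coef_)
```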
Polynomial Regression
• If your data points clearly will not fit a linear regression (a straight line
through the data points), polynomial regression may be a better choice.
• Polynomial regression, like linear regression, uses the relationship
between the variables x and y to find the best way to draw a line
through the data points.
• The Polynomial Regression equation is: y = β0 + β1x + β2x^2 + … + βnx^n
• It is also called a special case of Multiple Linear Regression in ML, because we add
polynomial terms to the Multiple Linear Regression equation to convert it into
Polynomial Regression.
• It is a linear model with some modifications made to increase accuracy.
• The dataset used to train a Polynomial Regression model is non-linear in nature.
• It makes use of a linear regression model to fit complicated, non-linear functions and
datasets, as sketched below.
• Hence, "In Polynomial Regression, the original features are converted into polynomial
features of the required degree (2, 3, …, n) and then modeled using a linear model."
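A minimal sketch of that idea with scikit-learn: PolynomialFeatures expands the original feature into polynomial features, and an ordinary LinearRegression is then fit on them (the data below is hypothetical and roughly quadratic):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Hypothetical non-linear data: y roughly follows a quadratic in x
X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([2.1, 4.8, 10.2, 17.5, 26.9, 38.1])

# Convert the original feature into polynomial features of degree 2,
# then fit an ordinary linear model on those expanded features.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

print("Prediction for x=7:", model.predict([[7]])[0])
```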
Need for Polynomial Regression:
The need for Polynomial Regression in ML can be understood from the points below:
• If we apply a linear model to a linear dataset, it gives a good result, as we saw
with Simple Linear Regression. But if we apply the same model, without any
modification, to a non-linear dataset, the output will be poor: the loss function
will increase, the error rate will be high, and accuracy will decrease.
• So for such cases, where the data points are arranged in a non-linear
fashion, we need the Polynomial Regression model. We can understand
this better using the comparison diagram of a linear dataset and a non-linear
dataset below.
In the above image, we have a dataset that is arranged non-linearly. If we try to cover it with a
linear model, we can clearly see that it hardly covers any data points. A curve, on the other hand, is
suitable to cover most of the data points; that curve is the Polynomial model.
Hence, if a dataset is arranged in a non-linear fashion, we should use the Polynomial Regression
model instead of Simple Linear Regression.
When we compare the above three equations, we can clearly see that all three are polynomial
equations, differing only in the degree of the variables.
The Simple and Multiple Linear equations are polynomial equations of degree one, and the Polynomial
Regression equation is a linear equation of degree n.
So if we raise the degree of our linear equation, it becomes a Polynomial Regression equation.
Linear Regression
• Suppose we have data of college students’ CGPA and the salary (in LPA) they
received after placement.
• That means the model should predict a student’s salary if we provide his/her CGPA.
• How do we approach this problem?
• Suppose we have no CGPA data and someone asks what the package of a
student will be.
• Then we would simply answer with the average package of the students. This is a
very bad guess.
• But if we have data, we can plot it.
The data is sort of linear, not completely linear. Completely linear data looks like this:
• This is real-world data, with some errors, as seen in the graph.
• Errors that cannot be quantified are called
stochastic errors.
If the data were completely linear, we would draw a line and
write its equation:
y = mx + b, where x is CGPA and y is salary.
But our data is only sort of linear,
so we draw a line called the best-fit line.
• Our primary objective in linear regression is to locate the best-fit line: the line that passes
closest to all the data points in the dataset. This implies that the error between the predicted and
actual values should be kept to a minimum; the best-fit line has the least error.
• The best-fit line equation provides a straight line that represents the relationship between the
dependent and independent variables. The slope of the line indicates how much the dependent variable
changes for a unit change in the independent variable(s).
In the above figure,
a line is plotted that suitably fits the given data points. Hence, it is called
the ‘best-fit line.’ The goal of the linear regression algorithm is to find this best-fit line, seen in
the above figure.
It is called the best-fit line because it makes the minimum error.
• So linear regression draws a line on sort-of-linear data that passes
closest to all the data points.
• That means it finds the values of m (slope) and b (intercept) that
make the minimum error on the data points.
Human intuition
• So mathematically, we understood that linear regression draws a best-fit
line by finding the values of m and b in the straight-line equation
• y = mx + b
• Package = m * CGPA + b
• Here m can be seen as a weight: as the value of m increases,
the impact of CGPA on the package also increases.
• To understand b, let’s set b to 0:
• y = mx
• Say x is experience in years and y is the package; as per the above
equation, if experience is 0 then the package will also be 0, which is not correct.
The intercept b corrects for this baseline.
• To find the values of m and b, we have 2 methods:
1. Closed-form solution, where we use direct mathematical
formulas to find the values. This is called OLS (Ordinary Least Squares), and the
scikit-learn library also uses this method in its LinearRegression class.
2. Non-closed-form solution, where we approximate the values
iteratively using differentiation. The technique we use here is
called Gradient Descent. Both are sketched below.
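A minimal sketch of both methods on hypothetical CGPA/package numbers. The closed-form OLS formulas (m = cov(x, y)/var(x), b = mean(y) − m·mean(x)) and the hand-rolled gradient descent loop below are standard textbook versions, not taken from these slides:

```python
import numpy as np

# Hypothetical CGPA (x) vs. package in LPA (y)
x = np.array([6.5, 7.0, 7.8, 8.2, 9.0])
y = np.array([4.0, 4.8, 6.0, 6.5, 8.0])

# 1. Closed-form OLS: m = cov(x, y) / var(x), b = mean(y) - m * mean(x)
m_ols = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b_ols = y.mean() - m_ols * x.mean()
print("OLS:              m =", m_ols, " b =", b_ols)

# 2. Gradient descent: start from m = b = 0 and repeatedly step opposite
#    the gradient of the MSE, E = mean((y - (m*x + b))**2).
#    Many small steps are needed here because the raw CGPA feature is not scaled.
m, b, lr = 0.0, 0.0, 0.01
for _ in range(100_000):
    y_hat = m * x + b
    dm = -2 * np.mean(x * (y - y_hat))  # dE/dm
    db = -2 * np.mean(y - y_hat)        # dE/db
    m -= lr * dm
    b -= lr * db
print("Gradient descent: m =", m, " b =", b)
```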
• Average error: E = (1/n) Σ (yi − ŷi)²
• So we need to find the values of m and b for which this error is minimum. But
where are m and b in this formula?
• Since ŷi = m·xi + b, we can write E = (1/n) Σ (yi − (m·xi + b))².
• That means we need to find the values of m and b for which the error E is
minimum.
• So E = f(m, b).
• That means the error is impacted by both m and b. Let’s see this
impact separately.
• E = f(m), with b = 0 (the intercept is 0, so the line passes through the origin).
• So it is trying to add up the areas of all these squares (the squared errors) and minimize the total.
This average of squared errors is the Mean Squared Error (MSE).
• Advantage
• We can use it as a loss function because it is differentiable.
• Disadvantages
• The units are squared: if x is CGPA and y is the package in
LPA, then the MSE is in LPA squared, which is hard to interpret.
• It penalizes outliers a lot, so it is not robust to outliers; a comparison with MAE is sketched below.
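A minimal sketch of that outlier sensitivity, using scikit-learn’s mean_squared_error and mean_absolute_error on hypothetical package predictions:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

y_true = np.array([4.0, 5.0, 6.0, 7.0])
y_pred = np.array([4.2, 4.9, 6.1, 7.2])           # small errors everywhere
y_pred_outlier = np.array([4.2, 4.9, 6.1, 12.0])  # one large error

print("MSE, no outlier:  ", mean_squared_error(y_true, y_pred))
print("MAE, no outlier:  ", mean_absolute_error(y_true, y_pred))
print("MSE, with outlier:", mean_squared_error(y_true, y_pred_outlier))   # blows up (squared)
print("MAE, with outlier:", mean_absolute_error(y_true, y_pred_outlier))  # grows only linearly
```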
Root Mean Squared Error (RMSE)
• The Root Mean Squared Error is the square root of the mean of the squared
residuals. It describes how well the observed data points match the predicted
values, i.e., the model’s absolute fit to the data.
• Whether an MSE, MAE, or RMSE value is acceptable depends on context: a 1.5 LPA error in
salary is fine, but a 1.5 degree error in a self-driving car’s steering is not acceptable.
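A minimal sketch (hypothetical LPA values): taking the square root of the MSE brings the error back into the same units as y:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([4.0, 5.0, 6.0, 7.0])  # packages in LPA (hypothetical)
y_pred = np.array([4.5, 4.8, 6.3, 7.4])

# RMSE = sqrt(MSE); unlike MSE (LPA squared), RMSE is in LPA
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print("RMSE (in LPA, same units as y):", rmse)
```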
R2 score
• Let’s suppose we have a dataset of CGPA and package.
• Suppose we are given a CGPA and have to find the package but have no model; then we
would just use the average (mean) value.
• The R2 score tells us how good the model is compared to simply predicting the mean.
IMP: R2 = 1 − (SSres / SStot), where SSres is the model’s sum of squared errors and SStot is the
sum of squared errors when always predicting the mean.
So the numerator of the fraction is the best we have done, and the denominator is the worst that can happen.
R-squared is a statistic that indicates how much of the variation the developed model can explain or
capture. It typically lies in the range of 0 to 1; a model that does worse than always predicting the
mean can even score below 0.
In general, the better the model matches the data, the greater the R-squared number.
Interpretation of the R2 score
• For the CGPA and package data, if the R2 score is 80%,
• it means that CGPA, the input column, is able to explain 80% of the
variance in the output column.
• Suppose we have more data, with IQ and CGPA as inputs and the package
as output, and the R2 score is 90%. That means 90% of the variation in the package
column is explained by the CGPA and IQ columns, while for the remaining 10% of the
variation we have no justification.
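A minimal sketch with scikit-learn’s r2_score on hypothetical CGPA/package data; by definition, a model that always predicts the mean gets R2 = 0:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Hypothetical data: CGPA -> package (LPA)
X = np.array([[6.5], [7.0], [7.8], [8.2], [9.0]])
y = np.array([4.0, 4.8, 6.0, 6.5, 8.0])

model = LinearRegression().fit(X, y)
print("Model R2:    ", r2_score(y, model.predict(X)))

# Baseline: always predicting the mean gives R2 = 0 by definition
print("Mean-only R2:", r2_score(y, np.full_like(y, y.mean())))
```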
R2 score flaw
• As we add more and more columns to the input, the R2
score keeps increasing,
• because each added column is assumed to give more justification of the
variance in the output.
• But the problem comes when we add an irrelevant column, say
temperature, along with IQ and CGPA.
• This column is totally irrelevant, so ideally the R2 score should decrease;
instead it will either remain constant or increase.
• To solve this problem, we use the adjusted R2 score.
Adjusted R-squared score
• The adjusted R2 score accounts for the number of predictors in the model and
penalizes the model for including irrelevant predictors that don’t
contribute significantly to explaining the variance in the dependent variable.
• Mathematically, adjusted R2 is expressed as:
Adjusted R2 = 1 − [(1 − R2)(n − 1) / (n − k − 1)]
where n is the number of observations and k is the number of predictors.
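A minimal sketch of that formula as a small helper function; the R2, n, and k values below are hypothetical:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for n samples and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical: R2 = 0.90 with n = 50 students and k = 2 predictors (CGPA, IQ)
print(adjusted_r2(0.90, n=50, k=2))  # ~0.896

# Adding an irrelevant third predictor (temperature) that leaves R2 unchanged
print(adjusted_r2(0.90, n=50, k=3))  # lower: the extra column is penalized
```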