CSL0777 L12
Unit No. 2
Supervised learning Part-1
Lecture No. 12
Linear Regression
Linear Regression
•The linear regression model provides a sloped straight line
representing the relationship between the variables.
Linear Regression
• Mathematically, we can represent a linear regression as:
y = a0 + a1x + ε
Here,
y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error
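As an illustrative sketch (not from the slides), a0 and a1 for this model can be estimated with the closed-form least-squares formulas; the x and y values below are toy data made up for demonstration:

import numpy as np

# Toy data: x is the independent variable, y the dependent variable
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

# Closed-form least-squares estimates of slope a1 and intercept a0
a1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a0 = y.mean() - a1 * x.mean()
print(f"y = {a0:.3f} + {a1:.3f}x")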
Types of Linear Regression
Linear regression can be further divided into two types of algorithm:
Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable, then such a linear regression algorithm is called Simple Linear Regression.
Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent variable, then such a linear regression algorithm is called Multiple Linear Regression.
Linear Regression Line
A straight line showing the relationship between the dependent and independent variables is called a regression line. A regression line can show two types of relationship:
Positive Linear Relationship:
If the dependent variable increases on the Y-axis as the independent variable increases on the X-axis, then the relationship is termed a positive linear relationship.
Linear Regression Line
Negative Linear Relationship:
If the dependent variable decreases on the Y-axis as the independent variable increases on the X-axis, then the relationship is called a negative linear relationship.
Finding the best fit line
• When working with linear regression, our main goal is to find the best-fit line, i.e., the line for which the error between the predicted values and the actual values is minimized. The best-fit line will have the least error.
• Different values of the weights or line coefficients (a0, a1) give different regression lines, so we need to calculate the best values for a0 and a1 to find the best-fit line; to do this, we use a cost function.
Cost Function
Different values of the weights or line coefficients (a0, a1) give different regression lines, and the cost function is used to estimate the values of the coefficients for the best-fit line.
The cost function optimizes the regression coefficients or weights. It measures how well a linear regression model is performing.
We can use the cost function to find the accuracy of the mapping function, which maps the input variable to the output variable. This mapping function is also known as the Hypothesis function.
For linear regression, we use the Mean Squared Error (MSE) cost function, which is the average of the squared errors between the predicted values and the actual values.
Cost Function
MSE can be calculated as:
MSE = (1/N) Σi (yi − (a1xi + a0))²
Where,
N = total number of observations
yi = actual value
(a1xi + a0) = predicted value
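A minimal Python sketch of this formula (the data and the coefficient values passed in are hypothetical, chosen only to show the computation):

import numpy as np

def mse(y, x, a0, a1):
    # Average of squared errors between actual values y and predictions a1*x + a0
    y_pred = a1 * x + a0
    return np.mean((y - y_pred) ** 2)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
print(mse(y, x, a0=0.2, a1=1.9))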
Cost Function
Residuals
• The distance between the actual values and the predicted values is called the residual.
• If the observed points are far from the regression line, the residuals will be high, and so the cost function will be high.
• If the scatter points are close to the regression line, the residuals will be small, and hence so will the cost function.
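For example (toy numbers, with an assumed fitted line y = 1.9x + 0.2 used purely for illustration), a residual is simply actual minus predicted:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
y_pred = 1.9 * x + 0.2        # predictions from a hypothetical fitted line
residuals = y - y_pred        # one residual per observation
print(residuals)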
Gradient Descent
• Gradient descent is used to minimize the MSE by calculating the gradient of the cost function.
• A regression model uses gradient descent to update the coefficients of the line by reducing the cost function.
• This is done by randomly selecting initial values for the coefficients and then iteratively updating them to reach the minimum of the cost function, as in the sketch below.
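The sketch below shows one possible implementation of this procedure for the model y = a1x + a0 (the learning rate, iteration count, and toy data are arbitrary choices, not values from the slides):

import numpy as np

def gradient_descent(x, y, lr=0.01, n_iters=2000):
    a0, a1 = 0.0, 0.0                      # arbitrary starting coefficients
    n = len(x)
    for _ in range(n_iters):
        y_pred = a1 * x + a0
        # Gradients of MSE = (1/n) * sum((y - y_pred)^2) w.r.t. a0 and a1
        grad_a0 = (-2.0 / n) * np.sum(y - y_pred)
        grad_a1 = (-2.0 / n) * np.sum((y - y_pred) * x)
        a0 -= lr * grad_a0                 # step against the gradient
        a1 -= lr * grad_a1
    return a0, a1

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
print(gradient_descent(x, y))              # approaches the least-squares fit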
Model Performance
• Goodness of fit determines how well the regression line fits the set of observations. The process of finding the best model out of various models is called optimization.
R-squared method:
• R-squared is a statistical method that determines the goodness of fit.
• It measures the strength of the relationship between the dependent and independent variables on a scale of 0-100%.
• A high value of R-squared indicates a small difference between the predicted values and the actual values, and hence represents a good model.
• It is also called the coefficient of determination, or the coefficient of multiple determination for multiple regression.
• It can be calculated from the formula below:
R² = 1 − (SS_res / SS_tot) = 1 − Σi (yi − ŷi)² / Σi (yi − ȳ)²
where ŷi is the predicted value and ȳ is the mean of the actual values.
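A small Python sketch of this formula (the data and predictions are toy values, for illustration only):

import numpy as np

def r_squared(y, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot
    ss_res = np.sum((y - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)     # total sum of squares
    return 1.0 - ss_res / ss_tot

y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
y_pred = np.array([2.12, 4.06, 6.00, 7.94, 9.88])
print(r_squared(y, y_pred))                    # close to 1 => good fit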
Learning Outcomes