
Linear Regression

Dr. Shweta Sharma


Supervised/Unsupervised Learning

• Supervised Learning is the machine learning task of inferring a function from labeled training data.
  Example: classification and regression problems

• Unsupervised Learning refers to the problem of trying to find hidden structure in unlabeled data.
  Example: clustering problems



You only have to deal with ……..



Examples
• Regression Problems
  • Prediction of wheat production
  • Prediction of rainfall
  • Point prediction of stock exchange values
  • Prediction of price

• Classification Problems
  • Prediction of cancer
  • Election-win prediction for Sh. Narendra Modi ji
  • Diabetes prediction
  • Classification of e-mail
Understand the Data……..



Classification & Regression



Data Set: UCI Machine Learning Repository
Google "uci dataset"

Linear Regression
A statistical method that is used for predictive analysis.

Makes predictions for continuous/real or numeric variables such as sales, salary, age, product price, etc.

Shows a linear relationship between a dependent variable (y) and one or more independent variables (X), hence the name linear regression.

The dependent variable is continuous in nature.

Since linear regression models a linear relationship, it finds how the value of the dependent variable changes with the value of the independent variable.
Linear regression performs the task of predicting a dependent variable value (y) based on a given independent variable (x).

So, this regression technique finds a linear relationship between x (input) and y (output). Hence the name Linear Regression.

When the value of x increases, the value of y increases likewise.

Example: X (input) is the work experience and Y (output) is the salary of a person.

The regression line is the best-fit line for our model.
Simple Linear Regression

If there is a single input variable (x), such linear regression is called simple linear regression.

y = b*x + a
or
y = a0 + a1*x

• Y = dependent variable (target variable) (estimated output)
• X = independent variable (predictor variable) (input)
• a0 = intercept of the line (gives an additional degree of freedom) (regression coefficient)
• a1 = linear regression coefficient (scale factor applied to each input value) (slope) (regression coefficient)

The goal of the linear regression algorithm is to find the best values for a_0 and a_1.

Example: X is the amount of fertilizer and y is the size of the crop (every time we add a unit to X, the dependent variable y increases proportionally).
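A minimal sketch of fitting y = a0 + a1*x with scikit-learn's LinearRegression (the class linked in the lab section later); the fertilizer/crop numbers are invented purely for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: X = amount of fertilizer, y = size of crop
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # one column = one input variable
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

model = LinearRegression().fit(X, y)
print("a1 (slope):", model.coef_[0])        # scale factor applied to the input
print("a0 (intercept):", model.intercept_)  # additional degree of freedom
print("prediction for x = 6:", model.predict([[6.0]])[0])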
Multiple Linear Regression
If there is more than one input variable, such linear regression is called multiple linear regression.

y = a + b_1*x_1 + b_2*x_2 + b_3*x_3 + … + b_n*x_n
or
y = a_0 + a_1*x_1 + a_2*x_2 + … + a_n*x_n

Example: add the amount of sunlight and rainfall in a growing season to the fertilizer variable, with all three affecting y.

• x_1: amount of fertilizer
• x_2: amount of sunlight
• x_3: amount of rainfall
• y: size of crop
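The same idea in code: a hedged sketch of multiple linear regression on the three features above, with made-up numbers used only for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression

# Columns: x_1 = fertilizer, x_2 = sunlight, x_3 = rainfall (toy values)
X = np.array([
    [1.0, 6.0, 30.0],
    [2.0, 7.0, 25.0],
    [3.0, 5.0, 40.0],
    [4.0, 8.0, 35.0],
    [5.0, 6.5, 28.0],
])
y = np.array([12.0, 15.5, 18.0, 22.5, 24.0])  # size of crop

model = LinearRegression().fit(X, y)
print("coefficients a_1..a_3:", model.coef_)
print("intercept a_0:", model.intercept_)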
y = 0.9 + 1.2*x1 + 2*x2 + 4*x3

Which feature is more important?

Example: result (y) predicted from quiz (x1), assignments (x2), and project (x3) scores. Here the project score has the largest coefficient, so, assuming the features are on comparable scales, it contributes most to the result.

A Linear Regression model's main aim is to find the best-fit line, i.e., the optimal values of the intercept and coefficients, such that the error is minimized.

Error is the difference between the actual value and the predicted value, and the goal is to reduce this difference.
• x is our independent variable, which is plotted on the x-axis
• y is the dependent variable, which is plotted on the y-axis
• Black dots are the data points, i.e., the actual values
• a0 is the intercept, which is 10
• a1 is the slope of the x variable
• The blue line is the best-fit line predicted by the model, i.e., the predicted values lie on the blue line

The vertical distance between a data point and the regression line is known as the error or residual.

Each data point has one residual, and the sum of all these differences is known as the Sum of Residuals/Errors.
Residual/Error = Actual value − Predicted value

Sum of Residuals/Errors = Σ (Actual − Predicted)

Sum of Squared Residuals/Errors = Σ (Actual − Predicted)^2
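A quick numeric sketch of these quantities, using a handful of invented points:

import numpy as np

actual = np.array([10.0, 12.0, 15.0])
predicted = np.array([9.5, 12.5, 14.0])

residuals = actual - predicted             # one residual per data point
print("Sum of residuals:", residuals.sum())
print("Sum of squared residuals:", (residuals ** 2).sum())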



Cost Function (J)
• To figure out the best possible values for a_0 and a_1 that would provide the best-fit line for the data points
• Since we want the best values for a_0 and a_1, we convert this search into a minimization problem where we minimize the error between the predicted value and the actual value:

J = (1/n) * Σ (Predicted_i − Actual_i)^2

This gives the average squared error over all the data points; therefore, this cost function is also known as the Mean Squared Error (MSE) function.

Now, using this MSE function, we change the values of a_0 and a_1 so that the MSE value settles at the minima.
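A tiny sketch of evaluating this cost for given coefficients, on toy data invented here:

import numpy as np

def mse(a0, a1, x, y):
    # Mean squared error of the line y_hat = a0 + a1*x
    pred = a0 + a1 * x
    return ((pred - y) ** 2).mean()

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(mse(0.0, 2.0, x, y))  # a perfect fit gives 0.0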
Linear regression: A sample curve-fitting

• Y = m*x + b
• Method of least squares
• Gradient descent approach

Ordinary Least Squares (OLS) Method

y = m*x + b
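A small sketch of the textbook closed-form least-squares estimates, m = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and b = ȳ − m*x̄, on invented data:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Closed-form OLS estimates for slope and intercept
m = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b = y.mean() - m * x.mean()
print(f"slope m = {m:.3f}, intercept b = {b:.3f}")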

Linear Regression Simplified - Ordinary Least Square vs Gradient Descent: https://towardsdatascience.com/linear-regression-simplified-ordinary-least-square-vs-gradient-descent-48145de2cf76
https://shakewingo.github.io/GD-vs-OLS/
Gradient Descent

• Gradient descent is a method of updating a_0 and a_1 to reduce the cost function (MSE)
• The idea is that we start with some values for a_0 and a_1 and then change these values iteratively to reduce the cost

In the gradient descent algorithm, the size of the steps you take is set by the learning rate.

This decides how fast the algorithm converges to the minima.





Gradient Descent

To find the gradients, we take partial derivatives of the cost function with respect to a_0 and a_1:

∂J/∂a_0 = (2/n) * Σ (Predicted_i − Actual_i)
∂J/∂a_1 = (2/n) * Σ (Predicted_i − Actual_i) * x_i

The partial derivatives are the gradients, and they are used to update the values of a_0 and a_1:

a_0 := a_0 − alpha * ∂J/∂a_0
a_1 := a_1 − alpha * ∂J/∂a_1

Alpha is the learning rate, a hyper-parameter that you must specify.

A smaller learning rate could get you closer to the minima but takes more time to reach it; a larger learning rate converges sooner, but there is a chance that you could overshoot the minima.
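Putting the update rules together, a minimal NumPy sketch of gradient descent for simple linear regression; the data and the learning rate are made up for illustration:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.1, 6.9, 9.2, 11.0])

a0, a1 = 0.0, 0.0   # start with some values for the coefficients
alpha = 0.01        # learning rate (hyper-parameter)

for _ in range(5000):
    pred = a0 + a1 * x
    error = pred - y
    # Partial derivatives of the MSE with respect to a0 and a1
    grad_a0 = (2 / len(x)) * error.sum()
    grad_a1 = (2 / len(x)) * (error * x).sum()
    a0 -= alpha * grad_a0
    a1 -= alpha * grad_a1

print(f"a0 = {a0:.3f}, a1 = {a1:.3f}")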



Practical: Lab
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html

Task: Implement Simple Linear Regression

Dataset: Google classroom (Lab 6)

Code: Google classroom (Lab 6)



Thank you!

