
Linear Regression

by Aqsa Afzal
Supervised Learning
In supervised learning, we are given a labeled data set (labeled training
data) in which the desired outcome is already known: each training example
pairs an input with its known output.

Supervised learning is where you have input variables (x) and an output
variable (Y) and you use an algorithm to learn the mapping function
from the input to the output.

Y = f(X)
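
As a minimal sketch of this idea (using scikit-learn and made-up data, not anything from the slides), the snippet below learns the mapping f from labeled (x, y) pairs and applies it to a new input:

```python
# Supervised learning sketch: learn Y = f(X) from labeled pairs.
# The data values here are invented purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4]])   # input variables (x)
Y = np.array([3, 5, 7, 9])           # known outputs (labels)

model = LinearRegression()
model.fit(X, Y)                      # learn the mapping f
print(model.predict([[5]]))          # predict Y for a new input, ~11
```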
Supervised Learning

• Linear Regression
• Logistic Regression
• Decision Trees
• Random Forests
• Support Vector Machines (SVM)
• K-Nearest Neighbors (KNN)
• Naive Bayes Classifier
• Gradient Boosting Machines (GBM)
• Neural Networks (Multilayer Perceptron)
• Ensemble Methods (Bagging, Boosting)
Linear Regression
• The linear regression algorithm models a linear relationship between a
dependent variable (y) and one or more independent variables (x), hence
the name linear regression.
• Because the relationship is linear, the model describes how the value of
the dependent variable changes as the value of the independent variable
changes.
Simple Linear Regression

Simple linear regression uses the traditional slope-intercept form, where m
and b are the values our algorithm will try to "learn" to produce the
most accurate predictions. x represents our input data and y represents
our prediction.

y = mx + b
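
A one-function Python sketch of this equation; m, b, and x are the same symbols as above:

```python
def predict(x, m, b):
    """Simple linear regression: return the prediction y = m*x + b."""
    return m * x + b

print(predict(2, 0.5, 1.0))  # 0.5*2 + 1.0 = 2.0
```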
Equation
How to Find a Linear Regression Equation:
Steps
• Step 1: Make a chart of your data, filling in the columns x, y, xy, and x²
• Step 2: To find the intercept and slope, use the least-squares formulas:

m = (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²)
b = (Σy − m·Σx) / n

or, equivalently, b = ȳ − m·x̄.
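
A short Python sketch that evaluates these sums directly; the (x, y) data is invented for illustration, so its coefficients differ from the slide's example below:

```python
# Compute slope m and intercept b with the least-squares formulas above.
# The (x, y) data here is made up purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = (sum_y - m * sum_x) / n
print(f"y' = {b:.2f} + {m:.2f}x")   # y' = 2.20 + 0.60x
```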
Build Your Equation
y’ = m + bx
y’ = 65.14 + .385225x

Put in any x value and calculate the predicted value.
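
For example, using the slide's fitted coefficients and an arbitrary x (the choice x = 50 is illustrative):

```python
# Evaluate y' = 65.14 + 0.385225*x at an arbitrary x value.
m, b = 0.385225, 65.14   # slope and intercept from the slide
x = 50                   # any x value you like
y_pred = b + m * x
print(y_pred)            # 65.14 + 19.26125 = 84.40125
```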


Equation with the Error Term

y = mx + b + ε

where ε (the error, or residual) is the difference between the actual value
and the value the line predicts.
Linear Regression Line
• A straight line showing the relationship between the dependent and
independent variables is called a regression line.
Finding the best fit line:

• When working with linear regression, our main goal is to find the best-fit
line, which means the error between the predicted values and the actual
values should be minimized. The best-fit line will have the least error.

• Different values of the weights, or line coefficients (b0, b1), give
different regression lines, so we need to calculate the best values of
b0 and b1 to find the best-fit line; to calculate these we use a cost
function.

• First we compute the cost function, then apply further techniques to
reduce the residuals, and finally evaluate the model on the basis of
various metrics.
Step 1: Find the Residuals

residual = actual value − predicted value, i.e., eᵢ = yᵢ − ŷᵢ
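
A minimal sketch of the residual calculation; the data points and line coefficients are made up for illustration:

```python
# Residual = actual y minus predicted y, for each data point.
# Data and line coefficients are invented for illustration.
xs = [1, 2, 3, 4]
ys = [3.1, 4.9, 7.2, 8.8]
m, b = 2.0, 1.0

residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
print(residuals)  # ≈ [0.1, -0.1, 0.2, -0.2]
```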
Cost function
The cost function, also known as the loss function or objective function, is a
measure of how well the model is performing. In linear regression, the cost
function is used to calculate the difference between the predicted values and
the actual values, also known as the residuals or errors.

The goal of linear regression is to minimize the cost function, which is achieved
by finding the best-fitting line that minimizes the sum of the squared
differences between the predicted values and the actual values. The most
commonly used cost function in linear regression is the Mean Squared Error
(MSE), which is calculated as the average of the squared differences between
the predicted and actual values.
Cost Function / MSE (Mean Squared Error)

MSE = (1/n) · Σᵢ (yᵢ − ŷᵢ)²

If you don't have the predicted ŷᵢ directly, substitute the line equation
ŷᵢ = m·xᵢ + b:

MSE = (1/n) · Σᵢ (yᵢ − (m·xᵢ + b))²
Step 2: Gradient Descent
• Gradient descent is an optimization algorithm used to minimize the
MSE by calculating the gradient of the cost function.
• A regression model uses gradient descent to update the coefficients
of the line by reducing the cost function.
• It does this by starting from randomly selected coefficient values and
then iteratively updating them to reach the minimum of the cost function.
• The goal of gradient descent is to find the values of the intercept and
slope coefficients, b0 and b1, that minimize the cost function by
iteratively adjusting the values of these coefficients.
Gradient Descent
You can find the gradient descent updates for the slope and intercept this
way:

Derivatives Dm, Dc and the Learning Rate (LR)

For a line y = mx + c (the intercept c plays the role of b in the earlier
slides), the partial derivatives of the MSE with respect to m and c are:

Dm = ∂(MSE)/∂m = −(2/n) · Σᵢ xᵢ(yᵢ − ŷᵢ)
Dc = ∂(MSE)/∂c = −(2/n) · Σᵢ (yᵢ − ŷᵢ)

Each iteration then moves the coefficients against the gradient:

m = m − L · Dm
c = c − L · Dc

Let L be our learning rate. This controls how much the value of m changes
with each step. L could be a small value like 0.0001 for good accuracy.
Summing It Up!
1. Initialize the values of b0 and b1 to random values.
2. Calculate the predicted values for the given input data using the current
values of b0 and b1.
3. Calculate the cost function using the predicted values and the actual
values.
4. Calculate the gradient of the cost function with respect to b0 and b1.
5. Update the values of b0 and b1 by taking a step in the opposite direction
of the gradient, with a step size determined by the learning rate.
6. Repeat steps 2–5 until the cost function is minimized or a maximum
number of iterations is reached (see the sketch after this list).
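
A compact Python sketch of the whole loop; the data, learning rate, and iteration count are illustrative choices, not values from the slides:

```python
# Gradient descent for simple linear regression, following steps 1-6 above.
# Data, learning rate (L), and iteration count are illustrative choices.
import random

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]                       # exactly y = 2x + 1
n = len(xs)

b0, b1 = random.random(), random.random()   # step 1: random initial values
L = 0.01                                    # learning rate

for _ in range(10_000):                     # step 6: repeat until done
    preds = [b0 + b1 * x for x in xs]       # step 2: predicted values
    mse = sum((y - p) ** 2 for y, p in zip(ys, preds)) / n   # step 3: cost
    d_b0 = -2 / n * sum(y - p for y, p in zip(ys, preds))    # step 4: gradient
    d_b1 = -2 / n * sum(x * (y - p) for x, y, p in zip(xs, ys, preds))
    b0 -= L * d_b0                          # step 5: step against gradient
    b1 -= L * d_b1

print(b0, b1, mse)   # b0 -> 1.0, b1 -> 2.0, mse -> 0
```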
Model Performance Evaluation
• R-Squared (R²)
• Sum of Squares Regression (SSR)
• Sum of Squares Error (SSE)
• and many more
R-Squared (R²):
• R-squared is the proportion of variance explained.
• It is the proportion of variance in the observed data that is explained
by the model, or equivalently the reduction in error over the null model.
• The null model just predicts the mean of the observed response, and
thus it has an intercept and no slope.
• R-squared is computed as R² = 1 − SSE/SST, where SST is the squared
error of the null model (the total sum of squares).
• R-squared is between 0 and 1.
• Higher values are better because they mean that more variance is
explained by the model (a short worked sketch follows).
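
A short sketch computing R² as 1 − SSE/SST; the observed values and predictions are made up for illustration:

```python
# R-squared = 1 - SSE/SST: compare the model's squared error to the
# null model that always predicts the mean. Data is invented.
ys    = [3.0, 5.0, 7.0, 9.0]   # observed values
preds = [3.1, 4.9, 7.2, 8.8]   # model predictions

mean_y = sum(ys) / len(ys)
sse = sum((y - p) ** 2 for y, p in zip(ys, preds))   # model error
sst = sum((y - mean_y) ** 2 for y in ys)             # null-model error
r2 = 1 - sse / sst
print(r2)  # 1 - 0.10/20 = 0.995
```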