09. linear regression

Linear regression is a widely used supervised machine learning algorithm that models the relationship between a dependent variable and one or more independent variables. There are two main types: Simple Linear Regression, which involves one independent variable, and Multiple Linear Regression, which involves two or more. While it is easy to implement and interpret, linear regression has limitations, including sensitivity to outliers and the assumption of a linear relationship.


Read slides: 8, 9, 14, 15, 38, 39, 40, 41, 42, 43

LINEAR REGRESSION
DR. SHUBHRA GOYAL
THE INTUITION BEHIND LINEAR REGRESSION

• Linear regression is one of the oldest, simplest, and most widely used supervised machine learning algorithms for predictive analysis.
• It is a linear approach to modelling the relationship between a response variable (dependent variable) and one or more explanatory variables (independent variables).
• For example, suppose we want to predict the price of a house given its size. Charting the data and fitting a line through it will look something like this:
THE INTUITION BEHIND LINEAR REGRESSION

• To generalize, we draw a straight line such that it passes through as many of the data points as possible.
• Once we have that line, for a house of any size we simply project that data point onto the line, which gives us the house price.
• The main problem here is not finding the house price but finding the best-fit line that generalizes over the data points provided.
• A linear regression line has an equation of the form: ŷ = β0 + β1X
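The idea above — fitting a line through house sizes and prices, then projecting a new size onto that line — can be sketched as follows. The sizes and prices are made-up illustrative values, not data from the slides:

```python
import numpy as np

# Hypothetical house sizes (sq. ft.) and prices (in $1000s) -- illustrative only
sizes = np.array([750, 900, 1100, 1400, 1800, 2100])
prices = np.array([150, 175, 210, 260, 320, 370])

# Fit a straight line price = b1 * size + b0 through the points (least squares)
b1, b0 = np.polyfit(sizes, prices, deg=1)

# "Project" a new house size onto the fitted line to estimate its price
new_size = 1600
predicted_price = b1 * new_size + b0
print(round(predicted_price, 1))
```

Here the fitted slope `b1` is the price increase per extra square foot, and `b0` is the intercept of the line.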
TYPES OF LINEAR REGRESSION

• There are two main variants of linear regression:
  Simple Linear Regression
  Multiple Linear Regression
SIMPLE LINEAR REGRESSION:

In Simple Linear Regression, we try to find the relationship between a single independent variable and a corresponding target variable.
This can be expressed in the form of a straight line.
The equation of the line can be written as: ŷ = β0 + β1X
• ŷ represents the output or dependent variable.
• β0 and β1 are two unknown constants that represent the intercept and coefficient (slope) respectively.
The following is a sample graph of a Simple Linear Regression model:
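The constants β0 and β1 can be estimated in closed form by least squares; a minimal sketch on assumed toy data:

```python
import numpy as np

# Toy data (assumed for illustration): x is the single predictor, y the target
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.1, 6.2, 7.9, 10.1])

# Closed-form least-squares estimates:
#   beta1 = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
#   beta0 = y_mean - beta1 * x_mean
x_mean, y_mean = x.mean(), y.mean()
beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
beta0 = y_mean - beta1 * x_mean

# Fitted line: ŷ = β0 + β1·x
y_hat = beta0 + beta1 * x
```

With this toy data the slope comes out close to 2 and the intercept close to 0, matching how the data was constructed.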
MULTIPLE LINEAR REGRESSION

• In Multiple Linear Regression, we try to find the relationship between two or more independent variables and the corresponding target variable.
• The independent variables can be continuous or categorical.
• The equation that describes how the predicted value of y is related to the i independent variables is called the Multiple Linear Regression equation: ŷ = β0 + β1X1 + β2X2 + … + βiXi
MULTIPLE LINEAR REGRESSION


The graph shows multiple linear regression applied on the Iris data set:
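A multiple-regression fit with two predictors can be sketched as follows; the predictor names (size and age of a house) and all values are assumed for illustration:

```python
import numpy as np

# Two assumed predictors (house size in sq. ft., age in years) and a price target
X = np.array([[1.0, 750, 10],
              [1.0, 900, 5],
              [1.0, 1100, 15],
              [1.0, 1400, 2],
              [1.0, 1800, 8]])   # leading column of 1s produces the intercept term
y = np.array([150, 180, 200, 280, 330])

# Solve for all coefficients at once: ŷ = β0 + β1·x1 + β2·x2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = beta
```

This is the same least-squares idea as the simple case, just with one coefficient per predictor estimated simultaneously.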
ASSUMPTIONS OF LINEAR REGRESSION

• Linear regression makes certain assumptions about the data. These assumptions are:
• Homogeneity of variance (homoscedasticity): the size of the error in our prediction does not change significantly across the values of the independent variable.
ASSUMPTIONS OF LINEAR REGRESSION

• Independence of observations: the independent variables should not be correlated with each other. The presence of such correlation is known as multicollinearity.
• Normality: the data should follow a normal distribution.
• Linearity: the relationship between the independent and dependent variables is linear, i.e., the line of best fit through the data points is a straight line (rather than a curve).
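One simple way to check the independence-of-observations assumption is to look at pairwise correlations between predictors. A minimal sketch with synthetic data, where the second predictor is deliberately built from the first:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic predictors: x2 is deliberately constructed to depend on x1
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)

# A pairwise correlation near +/-1 between predictors signals multicollinearity
corr = np.corrcoef(x1, x2)[0, 1]
print(f"corr(x1, x2) = {corr:.2f}")
```

In practice, predictors with very high mutual correlation make the individual coefficient estimates unstable, which is why this assumption matters.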
APPLICATIONS OF LINEAR REGRESSION
LINEAR REGRESSION
CORRELATION
LINEAR REGRESSION VS CORRELATION
TYPES OF LINEAR REGRESSION
SIMPLE LINEAR REGRESSION
EXAMPLE 1
EXAMPLE 2
EQUATION OF LINE VS LINEAR REGRESSION
GRAPHICAL EXPLANATION
DETERMINATION OF FIT
ESTIMATION OF COEFFICIENTS
RESIDUALS (ERROR)
LEAST SQUARES
LEAST SQUARES IN VISUAL
MULTIPLE LINEAR REGRESSION (MLR)
LINE EQUATION OF MLR
ESTIMATION OF COEFFICIENTS
QUALITATIVE PREDICTORS
EXAMPLE 1
EXAMPLE
MODEL ESTIMATION
ROOT MEAN SQUARED ERROR (RMSE) - TABULAR VIEW
RMSE IN VISUAL
R SQUARED STATISTIC
REGRESSION ASSUMPTIONS
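Several of the slide topics above (residuals, RMSE, the R-squared statistic) are standard fit metrics; a minimal sketch of computing them on assumed toy data:

```python
import numpy as np

# Fit a line, then score it with RMSE and the R-squared statistic
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8, 12.1])

b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x

residuals = y - y_hat                     # errors of the fit at each point
rmse = np.sqrt(np.mean(residuals ** 2))  # root mean squared error
ss_res = np.sum(residuals ** 2)          # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
r2 = 1 - ss_res / ss_tot                 # R^2: fraction of variance explained
```

Lower RMSE and an R² closer to 1 both indicate a better fit; because this toy data is nearly perfectly linear, R² comes out very close to 1.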
ADVANTAGES OF LINEAR REGRESSION

• Linear regression is easy to implement, and its output coefficients are easy to interpret.
• It is less complex compared to other algorithms.
• Over-fitting can be avoided using techniques such as dimensionality reduction, regularization (L1 and L2), and cross-validation.
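The L2 regularization mentioned above can be sketched with ridge regression's closed form; the data and the penalty strength `lam` are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 50 samples, 3 predictors, known true coefficients plus noise
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

# Ridge (L2) closed form: beta = (X^T X + lam * I)^(-1) X^T y
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Ordinary least squares for comparison
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
```

The L2 penalty shrinks the coefficient vector toward zero relative to plain least squares, which is what reduces over-fitting.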
DISADVANTAGES OF LINEAR REGRESSION

• It is sensitive to outliers.
• It assumes a linear relationship between the dependent and independent variables, which might not always hold.
• Linear regression only looks at the relationship between the mean of the dependent variable and the independent variables.
• Just as the mean is not a complete description of a single variable, linear regression is not a complete description of the relationships among variables.
