
Linear Models

What is the Best-Fit Line?


Our primary objective in linear regression is to find the best-fit line, i.e., the line for which the error between the predicted and actual values is as small as possible. The best-fit line is the one with the least total error.
Least Square Method
Overview
• The least squares method is the process of finding the regression line, or best-fit line, described by an equation, for a given data set.
• The method works by minimizing the sum of the squares of the residuals, the offsets of the data points from the fitted line or curve, so that the trend of the outcomes is found quantitatively.
• This kind of curve fitting arises in regression analysis, and fitting the equation that describes the curve is done by the least squares method.
Least Square Method Definition
• The least-squares method is a statistical method used to find the line of best fit, of the form y = mx + b, for the given data.
• The line (or curve) described by this equation is called the regression line.
• The main objective is to make the sum of the squared errors as small as possible.
• This is the reason the method is called the least-squares method.
• The method is widely used in data fitting, where the best-fit result is the one that minimizes the sum of squared errors, each error being the difference between an observed value and the corresponding fitted value.
• The sum of squared errors measures the variation in the observed data. For example, given 4 data points, the method produces a single fitted line through them (illustrated by a graph on the original slide).
Least Square Method Formula
• The least-squares line (or curve) is the one that best fits a set of observations with the minimum sum of squared residuals, or errors.
• Let the given data points be (x1, y1), (x2, y2), (x3, y3), …, (xn, yn), where the x's are values of the independent variable and the y's are values of the dependent variable.
• The method finds a line of the form y = mx + b, where y and x are the variables, m is the slope, and b is the y-intercept.
• The slope m and intercept b are given by:
• m = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)
• b = (∑y − m∑x) / n
• Here, n is the number of data points.
Steps to solve
Following are the steps to apply the least-squares method using the above formulas.
• Step 1: Draw a table with 4 columns, where the first two columns hold the x and y values.
• Step 2: In the next two columns, compute xy and x² for each point.
• Step 3: Find ∑x, ∑y, ∑xy, and ∑x².
• Step 4: Find the slope m using the formula above.
• Step 5: Calculate the intercept b using the formula above.
• Step 6: Substitute the values of m and b into y = mx + b to obtain the equation of the best-fit line (a short Python sketch follows these steps).
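As a sketch, the formulas and Steps 1–6 above can be written out in Python; the helper name least_squares_fit and the sample points are illustrative assumptions, not taken from the slides.

def least_squares_fit(x, y):
    # Return the slope m and intercept b of the best-fit line y = m*x + b.
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope formula
    b = (sum_y - m * sum_x) / n                                   # intercept formula
    return m, b

# Example usage on made-up points
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
m, b = least_squares_fit(x, y)
print(f"y = {m:.2f}x + {b:.2f}")   # y = 0.60x + 2.20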
Example
Suppose the data consist of n = 5 points for which ∑x = 15, ∑y = 25, ∑xy = 88, and ∑x² = 55 (these sums are implied by the calculations below).

Find the value of m by using the formula,

m = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)

m = [(5×88) − (15×25)] / [(5×55) − (15)²]

m = (440 − 375) / (275 − 225)

m = 65/50 = 1.3

Find the value of b by using the formula,

b = (∑y − m∑x) / n

b = (25 − 1.3×15) / 5

b = (25 − 19.5) / 5

b = 5.5/5 = 1.1

So, the required least-squares equation is y = mx + b = 1.3x + 1.1.

https://www.cuemath.com/data/least-squares/
https://www.geeksforgeeks.org/least-square-method/
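The example's arithmetic can be checked with a short Python snippet, using the sums implied by the slide.

n, sum_x, sum_y, sum_xy, sum_x2 = 5, 15, 25, 88, 55

m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # (440 - 375) / (275 - 225)
b = (sum_y - m * sum_x) / n                                   # (25 - 19.5) / 5

print(m, b)  # 1.3 1.1, i.e. y = 1.3x + 1.1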
Multiple Linear Regression
Multivariate Linear Regression
• Multivariate regression is a statistical technique that establishes a relationship between multiple data variables. It estimates a linear equation that allows one or more dependent (outcome) variables to be analysed in terms of one or more predictor variables, possibly measured at different points in time (a brief scikit-learn sketch follows below).
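As a rough illustration, a regression with several predictors can be fitted with scikit-learn; the data values below are made up for illustration, and scikit-learn is assumed to be installed.

import numpy as np
from sklearn.linear_model import LinearRegression

# Each row is one observation; the two columns are two predictor variables.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = np.array([3.1, 3.9, 7.2, 8.1, 10.8])   # outcome variable

model = LinearRegression().fit(X, y)
print("coefficients:", model.coef_)          # one weight per predictor
print("intercept:", model.intercept_)
print("prediction for [6, 4]:", model.predict([[6.0, 4.0]]))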
Assumptions
• The validity and reliability of the multivariate regression findings depend upon
the following four assumptions:
• Linearity: The relationship between the predictor and outcome variables is linear.
• Independence: The observations are independent of each other, i.e., the value of one observation should not influence the value of another.
• Homoscedasticity: The variance of the errors (residuals) is even across all levels
of the explanatory variables. This ensures that the spread of residuals is the
same for all predicted values.
• Normality: The residuals (differences between observed and predicted values) should be normally distributed, ensuring that statistical inferences about the regression coefficients are valid (a short residual-check sketch follows this list).
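A minimal sketch of checking the normality and homoscedasticity assumptions by inspecting residuals; the synthetic data, sample size, and noise level are assumptions made for illustration.

import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                       # two predictors, 100 observations
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)                    # observed minus predicted values

# Normality: Shapiro-Wilk test (a large p-value means no strong evidence
# against normally distributed residuals).
print(stats.shapiro(residuals))

# Homoscedasticity: plotting residuals against fitted values should show a
# roughly constant spread, with no funnel shape.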
Advantages
• Some of the benefits of this model are discussed below:
• Better Understanding of Relationships: Unlike simple linear regression, which considers only one predictor, multivariate regression can account for interactions and interdependencies among various predictors, capturing complex relationships between these variables.
• Reliable Predictions: By including multiple predictors, the model might provide more
accurate estimations than simple regression models, leading to a better fit for the
data.
• Correlation, Strength, and Direction: Multivariate regression can help identify which
explanatory variables significantly influence the dependent variable, establishing a
correlation and quantifying the direction and strength of these correlations.
Disadvantages
The various limitations of this regression technique are as follows:
• Difficult to Interpret: Multivariate regression can be challenging to interpret, especially for
individuals unfamiliar with statistical analyses, due to multiple predictors.
• Complex Calculations: Since this model incorporates multiple variables, its computation involves
complex mathematical calculations.
• Extensive Data Requirement: Multivariate regression requires a larger sample size than simple
regression. Small sample sizes can result in unreliable parameter estimates and low statistical
power.
• Overfitting: It occurs when the model fits the training data too closely, capturing noise rather
than the underlying pattern.
https://www.wallstreetmojo.com/multivariate-regression/
Steps to achieve multivariate regression
• Step 1: Select the features
• First, select the features that drive the multivariate regression, i.e., the ones chiefly responsible for the change in the dependent variable.
• Step 2: Normalize the features
• Now that the features are selected, scale them into a fixed range (preferably 0–1) so that analysing them becomes easier.
• To rescale each feature we can use min–max scaling: x' = (x − min(x)) / (max(x) − min(x)) (a short sketch follows below).
• Step 3: Select a loss function and formulate a hypothesis
• The hypothesis is the predicted value of the response variable and is denoted by h(x).
• The loss function measures the loss incurred when the hypothesis predicts a wrong value for a single example; the cost function aggregates that loss over the whole training set.
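A minimal sketch of Step 2 (min–max scaling of each feature into the range 0–1); the feature matrix below is made up for illustration.

import numpy as np

X = np.array([[10.0, 200.0],
              [20.0, 400.0],
              [30.0, 800.0]])

X_min = X.min(axis=0)                       # per-feature minimum
X_max = X.max(axis=0)                       # per-feature maximum
X_scaled = (X - X_min) / (X_max - X_min)    # x' = (x - min) / (max - min)
print(X_scaled)                             # every column now lies in [0, 1]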
Steps to achieve multivariate regression
• Step 4: Minimize the cost and loss function
• Both cost function and loss function are dependent on each other. Hence, in order to
minimize both of them, minimization algorithms can be run over the datasets. These
algorithms then adjust the parameters of the hypothesis.
• One of the minimization algorithms that can be used is the gradient descent algorithm.
• Step 5: Test the hypothesis
• The formulated hypothesis is then tested with a test set to check its accuracy and correctness.

• See https://brilliant.org/wiki/multivariate-regression/ for the different equations used in multivariate regression; a common choice of cost function is J(θ) = (1/2m) ∑ᵢ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)², where we have m data points in the training data and y is the observed value of the dependent variable (a gradient-descent sketch follows below).
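A minimal sketch of Steps 3–5: a linear hypothesis h(x) = Xw + b, a squared-error cost, and batch gradient descent to minimize it. The synthetic data, learning rate, and number of iterations are illustrative assumptions, and the test step is only hinted at in a comment.

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))          # m = 200 data points, 3 (already scaled) features
y = X @ np.array([2.0, -1.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=200)

w = np.zeros(3)                        # parameters of the hypothesis
b = 0.0
lr = 0.1                               # learning rate (a hyper-parameter)

for _ in range(500):                   # number of iterations (a hyper-parameter)
    h = X @ w + b                      # hypothesis h(x)
    error = h - y
    cost = (error ** 2).mean() / 2     # J(θ) = (1/2m) Σ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)²
    w -= lr * (X.T @ error) / len(y)   # gradient step for the weights
    b -= lr * error.mean()             # gradient step for the intercept

print("weights:", w, "intercept:", b, "final cost:", cost)
# Step 5 would evaluate the fitted hypothesis on a held-out test set.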
What are Overfitting and Underfitting?
• Overfitting occurs when a machine learning model is tuned too closely to the training set and is not able to perform well on unseen data. That happens when the model learns the noise in the training data as well, i.e., it memorizes the training data instead of learning the patterns in it.
• Underfitting, on the other hand, is the case when the model is not able to learn even the basic patterns in the dataset. An underfitting model performs poorly even on the training data, so we cannot expect it to perform well on the validation data. In this case we should increase the complexity of the model or add more features to the feature set (a small comparison sketch follows below).
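A minimal sketch contrasting underfitting and overfitting by fitting polynomials of increasing degree to noisy data; the synthetic data and the chosen degrees are illustrative assumptions.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 3, 30)).reshape(-1, 1)
y_train = np.sin(2 * x_train).ravel() + rng.normal(scale=0.2, size=30)
x_val = np.linspace(0, 3, 50).reshape(-1, 1)        # validation grid
y_val = np.sin(2 * x_val).ravel()

for degree in (1, 4, 15):                           # too simple, reasonable, too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(x_train))
    val_err = mean_squared_error(y_val, model.predict(x_val))
    print(degree, round(train_err, 3), round(val_err, 3))

# Degree 1 tends to underfit (high error everywhere); degree 15 tends to
# overfit (very low training error, higher validation error).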
What is Regularized Regression?
• Regularized regression is a type of regression in which the coefficient estimates are constrained, shrinking them toward zero.
• The magnitude (size) of the coefficients is penalized in addition to the error term.
• Complex models are discouraged, primarily to avoid overfitting.
• "Regularization" is a way to impose a penalty on certain models (usually overly complex ones).
Types of Regularized Regression
Two commonly used types of regularized regression methods are
ridge regression and lasso regression.
• Ridge regression is a way to create a parsimonious model when the
number of predictor variables in a set exceeds the number of
observations, or when a data set has multicollinearity (correlations
between predictor variables).
• Lasso regression is a type of linear regression that uses shrinkage. Shrinkage is where data values are shrunk towards a central point, like the mean. It is very useful when you have high levels of multicollinearity or when you want to automate certain parts of model selection, like variable selection/parameter elimination (a brief scikit-learn sketch follows).
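A minimal sketch of ridge and lasso regression with scikit-learn; the synthetic data and the regularization strength alpha are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))                   # 10 predictors, only 2 truly matter
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=0.5, size=50)

ridge = Ridge(alpha=1.0).fit(X, y)              # shrinks coefficients toward zero
lasso = Lasso(alpha=0.1).fit(X, y)              # can set some coefficients exactly to zero

print("ridge coefficients:", np.round(ridge.coef_, 2))
print("lasso coefficients:", np.round(lasso.coef_, 2))   # note the exact zeros

In both models the regularization strength alpha is a hyper-parameter chosen by the user.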
A hyper-parameter is a setting chosen before training rather than learned from the data; examples include the learning rate, the number of epochs, or the regularization strength.
