13 To 16 - Regression Analysis
Module 13 to 16
RIDDHI PASARI 1
Topics to be covered
01 Regression - I
Simple linear regression, intercept & slope
02 Regression - II
SSE, SST, SSR, MSE, MSR, R square
03 Regression - III
Standard error of coefficient, T values, Standard error of the model, degree of freedom
04 Regression - IV
Multiple linear regression, ANOVA, F values
Examples
01 To study the relationship between advertising and sales.
02 To study the relationship between number of hours of practice and errors.
Some Concepts
01 Dependent Variable
The variable we are trying to predict/forecast (y)
02 Independent Variable
The variable we use to predict/forecast the dependent variable (x)
03 Simple Linear Regression Model
Model that estimates the relationship between one independent variable and one dependent variable using a straight line.
04 Multiple Linear Regression Model
Model that estimates the relationship between more than one independent variable and one dependent variable using a linear equation.
05 Deterministic Model
The output of the model is fully determined by the independent variables.
06 Probabilistic Model
It incorporates randomness and uncertainty in the predictions.
$y = \beta_0 + \beta_1 x + \varepsilon$
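The difference between a deterministic and a probabilistic model can be illustrated with a minimal sketch. The parameter values, the normal distribution for the error term, and the function names below are assumptions chosen for illustration only; they do not come from the slides.

```python
import random

def deterministic_model(x, beta0, beta1):
    """Output is fully determined by x: there is no error term."""
    return beta0 + beta1 * x

def probabilistic_model(x, beta0, beta1, sigma, rng):
    """Same straight line plus a random error term epsilon.

    Assumption: epsilon is drawn from a normal distribution with
    mean 0 and standard deviation sigma.
    """
    epsilon = rng.gauss(0.0, sigma)
    return beta0 + beta1 * x + epsilon

rng = random.Random(42)              # fixed seed so the run is reproducible
beta0, beta1, sigma = 2.0, 0.5, 1.0  # hypothetical parameter values
x = 10

print(deterministic_model(x, beta0, beta1))              # 7.0 every time
print(probabilistic_model(x, beta0, beta1, sigma, rng))  # scatters around 7.0
```

Calling the deterministic model repeatedly with the same x always returns the same y, while repeated calls to the probabilistic model return different values that average out near the line.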
Regression Lines
Estimating The Coefficients
Population simple linear regression equation:
$y = \beta_0 + \beta_1 x + \varepsilon$
Estimated simple linear regression equation:
$\hat{y} = b_0 + b_1 x$
OR
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
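As a minimal sketch of how the estimates b0 and b1 can be obtained, the code below uses the ordinary least squares formulas b1 = Sxy / Sxx and b0 = ȳ − b1·x̄. The slides do not state which estimation method is used, so least squares is an assumption here; the data and the function name are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_linear_regression(x, y)
print(f"y_hat = {b0:.2f} + {b1:.2f} x")  # y_hat = 2.20 + 0.60 x
```

With this data the slope of 0.60 means that each one-unit increase in x is associated with an estimated 0.60 increase in y.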
02
Regression - II
SSE, SST, SSR, MSE, MSR, R square
Sum of Squares:
The sum of squares is a statistical measure of variability: it measures how far a dataset's observations spread around the mean. Larger values indicate a greater degree of dispersion.
Sum of Squares
SST: Sum of Squares Total / Total Sum of Squares
It measures the overall variability of the dependent variable around its mean. Consider it the total amount of variation available for your model to explain.
SSR: Sum of Squares Regression
Total variation due to the independent variable.
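The three sums of squares named in this module (SST, SSR, SSE) can be computed directly from the observed and fitted values. This is a minimal sketch: the data are hypothetical, and the fitted line y_hat = 2.2 + 0.6x is the least squares line for that data.

```python
def sums_of_squares(y, y_hat):
    """Return (SST, SSR, SSE) for observed y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)            # explained by the model
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # left unexplained
    return sst, ssr, sse

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]  # least squares line for this data

sst, ssr, sse = sums_of_squares(y, y_hat)
print(f"SST={sst:.1f} SSR={ssr:.1f} SSE={sse:.1f}")  # SST=6.0 SSR=3.6 SSE=2.4
```

Here 3.6 of the 6.0 units of total variation are explained by the regression and 2.4 remain as error.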
R² Coefficient of Determination
The proportion of variation in Y being explained by the variation in X.
$R^2 = \frac{SSR}{SST}$
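A short sketch of the coefficient of determination, computed as SSR / SST and cross-checked against the equivalent form 1 − SSE / SST. The data and the function name are hypothetical, and the fitted values are assumed to come from a least squares line with an intercept (the two forms agree only in that case).

```python
def r_squared(y, y_hat):
    """R^2 = SSR / SST: share of the variation in y explained by the model."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)
    return ssr / sst

# Hypothetical data; y_hat comes from the least squares line 2.2 + 0.6x
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]

r2 = r_squared(y, y_hat)

# Equivalent form for a least squares fit with an intercept: 1 - SSE / SST
y_bar = sum(y) / len(y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
sst = sum((yi - y_bar) ** 2 for yi in y)
r2_alt = 1 - sse / sst

print(f"R^2 = {r2:.2f}")  # R^2 = 0.60
```

An R² of 0.60 reads as: 60% of the variation in Y is explained by the variation in X.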
MSE
Mean sum of squared errors
$MSE = \frac{SSE}{df}$
MSR
Mean sum of squared regression
$MSR = \frac{SSR}{df}$
As the data points fall closer to the regression line, the model has less error, decreasing the MSE. A model with less error produces more precise predictions.
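The two mean squares differ only in which sum of squares and which degrees of freedom they use. The sketch below assumes the usual convention for a model with k independent variables and n observations: k degrees of freedom for the regression and n − k − 1 for the error. The input values are hypothetical.

```python
def mean_squares(ssr, sse, n, k):
    """Return (MSR, MSE) for n observations and k independent variables."""
    df_regression = k          # one degree of freedom per independent variable
    df_error = n - k - 1       # observations minus estimated coefficients
    msr = ssr / df_regression
    mse = sse / df_error
    return msr, mse

# Hypothetical sums of squares from a simple regression (k = 1) on n = 5 points
ssr, sse, n, k = 3.6, 2.4, 5, 1

msr, mse = mean_squares(ssr, sse, n, k)
print(f"MSR = {msr:.2f}, MSE = {mse:.2f}")  # MSR = 3.60, MSE = 0.80
```

For simple linear regression k = 1, so the error degrees of freedom reduce to n − 2.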
03
Regression - III
Interpreting the Slope, Standard error of
coefficient, T values, p - values, Standard error of
the model, Degrees of freedom
04
Regression - IV
Multiple Linear Regression Model
Why not include all available interval variables in the model?
Reason 1
The objective is to determine whether our hypothesized model is valid and whether the independent variables in the model are linearly related to the dependent variable. That is, we should screen the independent variables and include only those that in theory affect the dependent variable.
SST= SSR + SSE
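A brief derivation sketch of why this identity holds, assuming the model is fitted by least squares and includes an intercept. Here $y_i$ are the observed values, $\hat{y}_i$ the fitted values, and $\bar{y}$ the mean of the dependent variable.

```latex
\begin{align*}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2
     = \sum_{i=1}^{n} \bigl[(\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)\bigr]^2 \\
    &= \underbrace{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}_{SSR}
     + \underbrace{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}_{SSE}
     + 2 \sum_{i=1}^{n} (\hat{y}_i - \bar{y})(y_i - \hat{y}_i)
\end{align*}
```

Under least squares with an intercept, the residuals $y_i - \hat{y}_i$ sum to zero and are uncorrelated with every independent variable, so the cross-product term is zero and only SSR + SSE remains.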
F-Test
• This test determines whether the regression model as a whole is statistically significant.
F-value > Critical F-value: reject the null hypothesis and conclude that the model is valid.
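The decision rule above can be sketched in a few lines. The F-value is taken here as MSR / MSE, which is the standard ANOVA definition; the null hypothesis is that all slope coefficients are zero. The sums of squares are hypothetical, and the critical value is assumed to be looked up in an F table (10.13 is the 5% critical value for 1 and 3 degrees of freedom).

```python
def f_statistic(ssr, sse, n, k):
    """F = MSR / MSE for n observations and k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse

def model_is_significant(f_value, f_critical):
    """Reject the null hypothesis when the F-value exceeds the critical value."""
    return f_value > f_critical

# Hypothetical sums of squares from a simple regression (k = 1) on n = 5 points
ssr, sse, n, k = 3.6, 2.4, 5, 1
f_critical = 10.13  # from an F table: alpha = 0.05, df = (1, 3)

f_value = f_statistic(ssr, sse, n, k)
print(f"F = {f_value:.2f}")  # F = 4.50
print("Reject the null" if model_is_significant(f_value, f_critical)
      else "Fail to reject the null")
```

In this hypothetical case the F-value falls below the critical value, so the null hypothesis is not rejected and the sample gives no evidence that the model is valid.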
Reference Sections
From Book
Units: (16.1, 16.2)
Units: (16.4a, 16.4b, 4c, 4d, 4f, 4g)
Units: (17.2a, b, c, d, e, f, h, k)