SSR 2 PDF
How did we do? Goodness-of-Fit (GOF) v. Precision/Inference
Assessment: How well did we do? How close are the estimated coefficients to the true
parameters, β0 and β1? We'll have several answers. None will be entirely satisfactory…
though they will be informative, nonetheless.
Goodness-of-Fit: Goodness-of-Fit metrics tell us something about the quality of the overall
model, about how well the predicteds fit the actuals. They may not tell us much about how
precisely we've estimated the true parameters. But if we have a lot of data and the Goodness-
of-Fit metrics look good, maybe we should feel pretty good about our estimated coefficients.
Precision/Inference: While goodness-of-fit metrics tell us something about how well our
estimated model fits the data, they don't directly tell us anything about how precisely we
have estimated the unknown parameters, the true β's. Later on, we will have lots to say
about precision of estimation… but that discussion awaits the development of the tools of
statistical inference, including Confidence Intervals and Hypothesis Tests.
Who knew? They are related! At first glance, Goodness-of-Fit and Precision/Inference
look to be completely unrelated: one looks at how well a SLR model fits the data, whilst
the other considers the precision of estimation of individual parameters. But quite the
contrary!
Stay tuned!
Bring on the ANOVA Table! (SSTs, SSEs and SSRs)
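The ANOVA table itself isn't reproduced here, but the decomposition it organizes (SST = SSE + SSR) can be checked directly. A minimal sketch in Python, with made-up data:

```python
# Minimal sketch of the ANOVA decomposition for OLS/SLR, using made-up data.
# For an OLS fit with an intercept, total variation (SST) splits exactly into
# explained variation (SSE) plus residual variation (SSR).

def ols_fit(x, y):
    """OLS intercept b0 and slope b1 for a simple linear regression."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
         / sum((xi - xbar) ** 2 for xi in x)
    return ybar - b1 * xbar, b1

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]            # made-up actuals
b0, b1 = ols_fit(x, y)
yhat = [b0 + b1 * xi for xi in x]        # predicteds
ybar = sum(y) / len(y)

SST = sum((yi - ybar) ** 2 for yi in y)                   # actuals
SSE = sum((yh - ybar) ** 2 for yh in yhat)                # predicteds
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))      # residuals

print(round(SST, 6), round(SSE + SSR, 6))  # the two should match
```

The decomposition holds exactly (up to rounding) whenever the model is fit by OLS and includes an intercept.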
Thinking about R-squared
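R-squared can be computed in two equivalent ways from the ANOVA quantities; a quick sketch with made-up data:

```python
# Sketch: R-squared computed as 1 - SSR/SST and as SSE/SST gives the same
# number for an OLS fit with an intercept. Data are made up.

x = [0, 1, 2, 3, 4, 5]
y = [1.2, 2.9, 5.3, 6.8, 9.4, 10.9]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

SST = sum((yi - ybar) ** 2 for yi in y)
SSE = sum((yh - ybar) ** 2 for yh in yhat)
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))

r2_from_ssr = 1 - SSR / SST
r2_from_sse = SSE / SST
print(round(r2_from_ssr, 6), round(r2_from_sse, 6))  # identical values
```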
OLS/SLR estimation … more generally
• The two goodness-of-fit metrics (R-squared and MSE/RMSE) tell you something
about how well your model captures/explains the variation in the dependent variable,
y. They alone, however, do not tell you how well you’ve estimated the unknown
parameter values β0 and β1. In some cases, R-squared will be high and MSE/RMSE
will be low, and your parameter estimates will be quite poor… and vice-versa.
• Some examples:
Suppose you have a sample of size two. With just two data points, the fitted line
passes through both exactly, so R² = 1 and SSR = 0… and in all likelihood you have
miserable estimates of the unknown parameter values.
Here are two examples with just five observations randomly generated using a
true relationship given by the solid red line… and the dashed black line shows
you the OLS-estimated SLR relationship for the given dataset. In both cases, the
R² is above 0.5, and the estimated relationship is all wrong. So n matters too!
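The two-data-point case above can be checked directly: OLS interpolates the two points, so the fit looks perfect no matter how poor the estimates are. A sketch with arbitrary made-up points:

```python
# With n = 2, the OLS line passes through both points exactly:
# SSR = 0 and R-squared = 1, regardless of the true relationship.

x = [1.0, 4.0]
y = [3.7, -1.2]                      # arbitrary made-up points

b1 = (y[1] - y[0]) / (x[1] - x[0])   # OLS slope = line through the two points
b0 = y[0] - b1 * x[0]
yhat = [b0 + b1 * xi for xi in x]

ybar = sum(y) / 2
SST = sum((yi - ybar) ** 2 for yi in y)
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
r2 = 1 - SSR / SST
print(r2)   # 1.0 up to floating-point error
```

Any other pair of points gives the same result, which is exactly why a perfect fit with tiny n says nothing about estimation quality.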
Comparing SLR models using Goodness-of-Fit (GOF) metrics
• You can use R² and MSE/RMSE to compare the performance of different SLR models…
but only to a limited extent. And you must be careful!
• If the different models all have the same LHS data (so the y's are the same in the different
models… both in terms of number and in terms of values), then the SSTs and S_yy's will be
the same across the models, and you can compare R²'s and MSE/RMSE's. Under these
conditions the R²'s and the MSE/RMSE's will move in opposite directions, since
R1² > R2² ⇔ 1 − SSR1/SST > 1 − SSR2/SST ⇔ SSR1 < SSR2
⇔ SSR1/(n−2) < SSR2/(n−2) ⇔ MSE1 < MSE2.
• So under these conditions, models with higher R²'s (and lower MSE/RMSE's) do a better job
of fitting the data, and in that sense are preferable.
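The same-y comparison can be sketched numerically: since SST depends only on the y's, the model with the higher R² necessarily has the lower MSE. Both regressors below are made up for illustration:

```python
# Two SLR models with the same dependent variable y but different regressors.
# SST depends only on y, so higher R-squared <=> lower MSE = SSR/(n-2).

def fit_and_score(x, y):
    """Return (R-squared, MSE) for an OLS simple linear regression."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
         / sum((xi - xbar) ** 2 for xi in x)
    b0 = ybar - b1 * xbar
    ssr = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ssr / sst, ssr / (n - 2)

y  = [2.0, 4.1, 5.9, 8.2, 9.9, 12.1]   # same actuals in both models
x1 = [1, 2, 3, 4, 5, 6]                # strongly related to y (made up)
x2 = [3, 1, 4, 1, 5, 9]                # weakly related to y (made up)

r2_1, mse_1 = fit_and_score(x1, y)
r2_2, mse_2 = fit_and_score(x2, y)
print((r2_1 > r2_2) == (mse_1 < mse_2))  # True: the rankings must agree
```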
• But: If the y's are not the same across the different models, then R²'s and MSE/RMSE's are
not directly comparable and accordingly, they won’t tell you much unless you make some
adjustments.
OLS/SLR Assessment I - GOFs: TakeAways
• Goodness of Fit metrics tell you something about how well your OLS/SLR model fits the data… about the
relationship between predicteds and actuals.
• SSTs, SSEs and SSRs capture the variation in the actuals, predicteds and residuals,
respectively; SST = SSE + SSR.
• Two standard GOF metrics are (Root) Mean Squared Error (MSE = SSR/(n−2)) and the Coefficient of
Determination (R-sq = 1 − SSR/SST = SSE/SST = varPredicteds/varActuals).
• MSE is essentially an average squared deviation of predicteds from actuals… RMSE is the square root thereof.
MSE (RMSE) magnitudes tell you little about how well your model has performed, as they have no uniform scale.
• R-sq is essentially the variance of the predicteds relative to the variance of the actuals… or, the percent of the
variation in the actuals explained by the model. 0 ≤ R-sq ≤ 1; the closer to 1, the more of the variation in the
dependent variable is explained by the model. A high R-sq is terrific if nObs is high as well… but maybe not so
much otherwise.
• R-sq is also equal to the square of the correlation between the LHS and RHS variables… as well as the square of
the correlation between the predicteds and actuals.
• It’s OK to compare MSE’s and R-sq’s across models with the same LHS variable… but if there are changes to the
LHS variable, such comparisons are meaningless without adjustments.
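The squared-correlation identities in the takeaways can be verified directly; a sketch with made-up data, computing the correlations by hand to stay dependency-free:

```python
# Sketch: in SLR, R-squared = corr(x, y)^2 = corr(y, yhat)^2 (made-up data).

def corr(a, b):
    """Sample correlation between two equal-length sequences."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    cov = sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))
    va = sum((ai - abar) ** 2 for ai in a)
    vb = sum((bi - bbar) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5

x = [1, 2, 3, 4, 5, 6, 7]
y = [1.8, 4.3, 5.9, 8.4, 9.7, 12.2, 13.8]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

SST = sum((yi - ybar) ** 2 for yi in y)
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
r2 = 1 - SSR / SST

# All three computations agree (up to rounding):
print(round(r2, 6), round(corr(x, y) ** 2, 6), round(corr(y, yhat) ** 2, 6))
```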
onwards… to OLS/SLR Examples