
Topic 3b

Analysis of variance (ANOVA) approach to regression analysis
Learning objectives
• Apply ANOVA as an (alternative) approach
to testing for a linear association
• Know when to use the t-test and the F-test
• Understand and interpret regression output
from software e.g. Stata
The basic idea
• Break down the variation in Y (“total sum
of squares”) into two components:
– a component that is “due to” the change in X
(“regression sum of squares”)
– a component that is just due to random error
(“error sum of squares”)
• If the regression sum of squares is a large
component of the total sum of squares, it
suggests that there is a linear association.
Y  Y Yˆ  Y  Y  Yˆ 
i i i i
The above decomposition holds for the sum of the
squared deviations, too:
2 2 2

     
n n n

 Y  Y  Y ˆ  Y   Y  Yˆ
i i i i
i 1 i 1 i 1

Total sum of squares (SST): $\sum (Y_i - \bar{Y})^2$

Regression sum of squares (SSR): $\sum (\hat{Y}_i - \bar{Y})^2$

Error sum of squares (SSE): $\sum (Y_i - \hat{Y}_i)^2$

SST = SSR + SSE
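The decomposition can be checked numerically. Below is a quick Python sketch (the data are made up purely for illustration): fit a simple least-squares line, compute the three sums of squares, and confirm SST = SSR + SSE.

```python
# Numerical check of SST = SSR + SSE for a simple least-squares fit.
# The x and y values below are made up purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# Ordinary least-squares slope and intercept for a simple regression.
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)                 # total sum of squares
ssr = sum((yh - ybar) ** 2 for yh in yhat)              # regression sum of squares
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))    # error sum of squares

# The identity holds exactly for OLS fits (up to floating-point error).
assert abs(sst - (ssr + sse)) < 1e-9
print(sst, ssr, sse)
```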


Breakdown of degrees of freedom
Degrees of freedom associated with SST: n - 1

Degrees of freedom associated with SSR: k (the number of predictors)

Degrees of freedom associated with SSE: n - k - 1
Analysis of Variance (ANOVA) Table
Example: Mortality and Latitude
The regression equation is Mort = 389 - 5.98 Lat

Predictor Coef SE Coef T P


Constant 389.19 23.81 16.34 0.000
Lat -5.9776 0.5984 -9.99 0.000

S = 19.12 R-Sq = 68.0% R-Sq(adj) = 67.3%

Analysis of Variance

Source DF SS MS F P
Regression 1 36464 36464 99.80 0.000
Residual Error 47 17173 365
Total 48 53637
How to find n?
• Recall the breakdown of degrees of freedom:

$(n-1) = (k) + (n-k-1)$

• In the table above, the total degrees of freedom are 48, so n = 49.
Definitions of Mean Squares
We already know the mean square error (MSE) is defined as:

$\mathrm{MSE} = \frac{\sum (Y_i - \hat{Y}_i)^2}{n-k-1} = \frac{\mathrm{SSE}}{n-k-1}$

For a simple regression k = 1, so that:

$\mathrm{MSE} = \frac{\sum (Y_i - \hat{Y}_i)^2}{n-2} = \frac{\mathrm{SSE}}{n-2}$

Similarly, the regression mean square (MSR) is defined as:

$\mathrm{MSR} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{k} = \frac{\mathrm{SSR}}{k}$
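As a quick Python sketch, the mean squares for the Mortality and Latitude example can be recomputed from the SS and DF columns of its ANOVA table (n = 49, k = 1):

```python
# Mean squares recomputed from the Mortality vs. Latitude ANOVA table.
ssr, sse = 36464.0, 17173.0
n, k = 49, 1

msr = ssr / k              # regression mean square: SSR / k
mse = sse / (n - k - 1)    # mean square error: SSE / (n - k - 1)

print(round(msr), round(mse))  # matches the MS column: 36464 and ~365
```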
R-squared

• $R^2 = \mathrm{SSR}/\mathrm{SST}$. Let us check it in the Mortality and Latitude example: $R^2 = 36464/53637 \approx 0.68$.
• Latitude explains 68% of the variation in mortality; the remaining 32% is unexplained. The two shares always sum to 100%.
Adjusted R-squared
• It is adjusted based on the degrees of freedom (df)
• Relevant in multiple regression
• Adjusted R2 can actually get smaller as additional
variables are added to the model.
• As N gets bigger, the difference between R2 and
Adjusted R2 gets smaller and smaller.
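Both quantities are easy to verify by hand. A short Python sketch for the Mortality and Latitude example, using the conventional adjustment formula $1 - (1-R^2)\,(n-1)/(n-k-1)$:

```python
# R-squared and adjusted R-squared for the Mortality vs. Latitude example.
ssr, sst = 36464.0, 53637.0
n, k = 49, 1

r2 = ssr / sst
# Adjusted R-squared penalizes by the degrees of freedom used.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Matches the software output: R-Sq = 68.0%, R-Sq(adj) = 67.3%
print(round(100 * r2, 1), round(100 * adj_r2, 1))
```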
The formal F-test
for slope parameter β1
Null hypothesis H 0: β1 = 0
Alternative hypothesis HA: β1 ≠ 0

Test statistic: $F^* = \dfrac{\mathrm{MSR}}{\mathrm{MSE}}$

P-value = What is the probability that we’d get an F* statistic


as large as we did, if the null hypothesis is true? (One-tailed
test!)
The P-value is determined by comparing F* to an F distribution
with 1 numerator degree of freedom and n-k-1 denominator
degrees of freedom.
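Putting the pieces together for the Mortality and Latitude example, a Python sketch of the F statistic from the sums of squares:

```python
# F statistic for H0: beta1 = 0 in the Mortality vs. Latitude example.
ssr, sse = 36464.0, 17173.0
n, k = 49, 1

msr = ssr / k              # 36464
mse = sse / (n - k - 1)    # ~365.4
f_star = msr / mse

# Matches the ANOVA table: F = 99.80 on (1, 47) degrees of freedom.
print(round(f_star, 2))
```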
Winning times (in seconds) in Men's 200 meter Olympic sprints, 1900-1996.
Are men getting faster?

Row  Year  Men200m
1    1900  22.20
2    1904  21.60
3    1908  22.60
4    1912  21.70
5    1920  22.00
6    1924  21.60
7    1928  21.80
8    1932  21.20
9    1936  20.70
10   1948  21.10
11   1952  20.70
12   1956  20.60
13   1960  20.50
14   1964  20.30
15   1968  19.83
16   1972  20.00
17   1976  20.23
18   1980  20.19
19   1984  19.80
20   1988  19.75
21   1992  20.01
22   1996  19.32
Regression Plot
Men200m = 76.1534 - 0.0283833 Year
S = 0.298134 R-Sq = 89.9 % R-Sq(adj) = 89.4 %

[Scatter plot of Men200m winning time (seconds, roughly 19.5 to 22.5) versus Year (1900 to 2000), with the fitted regression line.]
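The fitted line and R-squared can be reproduced from the data table with a short Python sketch using the standard least-squares formulas:

```python
# Least-squares fit of Men's 200 m winning times on Year (data from the table above).
years = [1900, 1904, 1908, 1912, 1920, 1924, 1928, 1932, 1936, 1948, 1952,
         1956, 1960, 1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992, 1996]
times = [22.20, 21.60, 22.60, 21.70, 22.00, 21.60, 21.80, 21.20, 20.70, 21.10, 20.70,
         20.60, 20.50, 20.30, 19.83, 20.00, 20.23, 20.19, 19.80, 19.75, 20.01, 19.32]

n = len(years)
xbar = sum(years) / n
ybar = sum(times) / n

b1 = sum((x - xbar) * (y - ybar) for x, y in zip(years, times)) \
     / sum((x - xbar) ** 2 for x in years)
b0 = ybar - b1 * xbar

sst = sum((y - ybar) ** 2 for y in times)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(years, times))
r2 = 1 - sse / sst

# Reproduces the regression plot: Men200m = 76.1534 - 0.0283833 Year, R-Sq = 89.9%
print(round(b0, 4), round(b1, 7), round(100 * r2, 1))
```

The negative slope (about 0.028 seconds shaved off per year) is the quantitative answer to "Are men getting faster?"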
Analysis of Variance Table
DFE = n-k-1 = 22-2 = 20 MSE = SSE/(n-2) = 1.8/20 = 0.09
MSR = SSR/1 = 15.8

Analysis of Variance
Source DF SS MS F P
Regression 1 15.8 15.8 177.7 0.000
Residual Error 20 1.8 0.09
Total 21 17.6

DFTO = n-1 = 22-1 = 21 F* = MSR/MSE = 15.796/0.089 = 177.7

P = Probability that an F(1,20) random variable is greater than 177.7 = 0.000…


For simple linear regression model,
the F-test and t-test are equivalent.
Predictor Coef SE Coef T P
Constant 76.153 4.152 18.34 0.000
Year -0.0284 0.00213 -13.33 0.000

Analysis of Variance
Source DF SS MS F P
Regression 1 15.796 15.796 177.7 0.000
Residual Error 20 1.778 0.089
Total 21 17.574

$(-13.33)^2 = 177.7$, i.e. $\left(t^*_{(n-k-1)}\right)^2 = F^*_{(1,\,n-k-1)}$
Equivalence of F-test to t-test
• For a given α level, the F-test of β1 = 0
versus β1 ≠ 0 is algebraically equivalent to
the two-tailed t-test.
• Will get exactly the same P-values, so…
– If one test rejects H0, then so will the other.
– If one test does not reject H0, then neither will the other.
Should I use the F-test or the t-test?
• The F-test is only appropriate for testing
that the slope differs from 0 (β1 ≠ 0).
• Use the t-test to test that the slope is
positive (β1 > 0) or negative (β1 < 0).
• F-test is more useful for multiple regression
model when we want to test that more than
one slope parameter is 0. Test if β1 and β2
are jointly significant
The F-test using critical values
• Null hypothesis
H0: β1 = β2 = 0
• Alternative hypothesis
HA: at least one of β1, β2 is not 0
• Test statistic
F* = MSR/MSE
• F-critical: read from the F table with k numerator degrees of freedom (column) and n-k-1 denominator degrees of freedom (row)
• When F* > F-critical, reject H0: the regression is statistically significant.
• When F* < F-critical, fail to reject H0: the regression is not statistically significant.
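The decision rule above can be sketched as a small Python helper. The critical value used in the example call is made up; in practice it would be looked up in an F table (or computed from the F distribution) for the appropriate degrees of freedom.

```python
# Decision rule for the overall F-test; the critical value is an input,
# to be looked up in an F table for (k, n-k-1) degrees of freedom.
def f_test_decision(f_star, f_critical):
    """Return the conclusion of the F-test for H0: beta1 = beta2 = 0."""
    if f_star > f_critical:
        return "reject H0: the regression is statistically significant"
    return "fail to reject H0: the regression is not statistically significant"

# Example with the Men200m F statistic and a hypothetical critical value.
print(f_test_decision(177.7, 4.35))
```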
P-values
