
L2C-Multiple Regression C 2022-03-03 21 - 20 - 04

This document discusses the comparison of two regression models, specifically focusing on the full model versus the reduced model and the use of the Partial F-test to evaluate the significance of subsets of predictor variables. It explains how to assess the change in the sum of squared errors (SSE) and introduces Type I and Type II sum of squares for regression analysis. Examples illustrate the application of these concepts in determining the influence of variables on apartment prices.

Uploaded by

詠芯謝
Copyright
© © All Rights Reserved

Part C: Comparing Two Regression Models

1
Outline
— Comparing Two Regression Models
— Full Model Vs. Reduced Model
— Partial F-test
— The Change in SSE
— Sequential Sum Squares Regression

2
Comparing Two Regression Models
— Two models are nested if one model contains all the
terms of the other

— Want to test a subset of the 𝑋-variables for significance as a group
— For example, suppose we have $X_1, X_2, \dots, X_L, X_{L+1}, \dots, X_K$
— Examine the contribution of $X_{L+1}, \dots, X_K$ to the relationship with the 𝑌-variable

3
Comparing Two Regression Models
– Full Model vs. Reduced Model Cont’d
— Full model
— Has 𝐾 𝑋-variables
$\hat{Y} = b_0 + b_1 X_1 + \dots + b_L X_L + b_{L+1} X_{L+1} + \dots + b_K X_K$

— Reduced model
— Has 𝐿 𝑋-variables
— The subset of 𝑋-variables being tested is not in it
$\hat{Y} = b_0 + b_1 X_1 + \dots + b_L X_L$

4
Partial F-test
— $H_0: \beta_{L+1} = \beta_{L+2} = \dots = \beta_K = 0$
— 𝑋-variables in the subset do not significantly improve the model when all the other 𝑋-variables are included

— $H_1$: at least one of $\beta_{L+1}, \beta_{L+2}, \dots, \beta_K \neq 0$
— At least one 𝑋-variable in the subset has a coefficient significantly different from zero

5
Partial F-test
— Look at how much the SSE changes before and after the inclusion of the subset of 𝑋-variables
— Recall that $SST = SSR + SSE$, and SST is the same for both models
— The reduced model has fewer 𝑋-variables, so its SSR and $r^2$ are expected to be smaller, while its SSE is expected to be larger
— The full model contains more 𝑋-variables, so its SSR and $r^2$ should be larger, while its SSE should be smaller

6
Partial F-test Cont’d
— Partial F-test statistic
$$F = \frac{(SSE_{reduced} - SSE_{full})/(K - L)}{SSE_{full}/(n - K - 1)}$$
with $(K - L)$, $(n - K - 1)$ d.f.
where $SSE_{reduced}$ = SSE for the reduced model
$SSE_{full}$ = SSE for the full model
$K$ = no. of 𝑋-variables in the full model
$L$ = no. of 𝑋-variables in the reduced model
p-value = $P(F_{(K-L),(n-K-1)} \geq F)$
Reject $H_0$ if $F > F_{\alpha,(K-L),(n-K-1)}$ or p-value < $\alpha$
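
The statistic above can be sketched as a small helper. This is only an illustration of the arithmetic: the function name `partial_f` and the numbers passed to it are hypothetical and are not taken from the apartment example.

```python
def partial_f(sse_reduced, sse_full, k, l, n):
    """Partial F statistic for testing the K - L predictors left out of the reduced model.

    k = no. of X-variables in the full model, l = no. in the reduced model,
    n = no. of observations.
    """
    df_num = k - l          # numerator d.f.
    df_den = n - k - 1      # denominator d.f. (error d.f. of the full model)
    f = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f, df_num, df_den


# Hypothetical values, chosen only to make the arithmetic easy to follow.
f_stat, df1, df2 = partial_f(sse_reduced=120.0, sse_full=90.0, k=5, l=3, n=36)
print(f_stat, df1, df2)
```

The returned d.f. pair is what is needed to look up the critical value or the p-value.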

7
Example
— As gross floor area and net floor area are both about the
size of an apartment, examine them as a group to indicate
whether “size” is related to apartment price
— Given that the variables “Age” and “Floor” are included in
the model

8
Example Cont’d
[Regression output: full model]

[Regression output: reduced model, without the 𝑋-variables being tested]

9
Example – Discussion of Results Cont’d
— The reduced model had $R^2$ = 0.3487, which almost doubled to 0.6631 after adding gross floor area and net floor area
— The p-value for GrossFA is 0.0096, while that for NetFA is 0.1554, suggesting that size significantly affects apartment price, but the model may not need both of these variables
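
Because SST is common to both models, the partial F statistic can also be written in terms of $R^2$ by dividing the numerator and denominator by SST. The sketch below applies that form to the two $R^2$ values quoted above. It assumes K = 4 (Age, Floor, GrossFA, NetFA), L = 2 and n = 80, the last being inferred from the 75 error d.f. reported for this example; the function name is hypothetical.

```python
def partial_f_from_r2(r2_reduced, r2_full, k, l, n):
    """Partial F statistic expressed through the R-squared of the two nested models."""
    numerator = (r2_full - r2_reduced) / (k - l)
    denominator = (1.0 - r2_full) / (n - k - 1)
    return numerator / denominator


# R-squared values from the example; n = 80 is an assumption (see the lead-in).
f_stat = partial_f_from_r2(r2_reduced=0.3487, r2_full=0.6631, k=4, l=2, n=80)
print(round(f_stat, 2))
```

The result agrees with the SSE-based statistic only up to the rounding of the two $R^2$ values.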

10
Example Cont’d

[Regression output, annotated: the 𝑋-variables being tested for slope coefficients equal to zero; the partial F-test statistic and the corresponding p-value]

11
Example – Partial F-test Cont’d
$H_0: \beta_{GrossFA} = \beta_{NetFA} = 0$
$H_1$: at least one of $\beta_{GrossFA}, \beta_{NetFA} \neq 0$

$$F = \frac{(SSE_{reduced} - SSE_{full})/(K - L)}{SSE_{full}/(n - K - 1)} = \frac{(60.769 - 31.438)/(4 - 2)}{31.438/(80 - 4 - 1)} = 34.987$$

At $\alpha$ = 5%, d.f. = 2, 75, C.V. = 3.07

Reject $H_0$: “size” (GrossFA and NetFA) significantly influences the apartment price.
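
When the numerator has exactly 2 d.f., as in this example, the F distribution has a closed-form tail probability, $P(F_{2,\nu} \geq f) = (1 + 2f/\nu)^{-\nu/2}$, so the p-value and the exact critical value can be checked without tables. The sketch below relies on that special case only; the function names are hypothetical, and for other numerator d.f. a statistical library would be needed.

```python
def f_tail_two_numerator_df(f, df_den):
    """P(F >= f) for an F distribution with 2 and df_den degrees of freedom."""
    return (1.0 + 2.0 * f / df_den) ** (-df_den / 2.0)


def f_critical_two_numerator_df(alpha, df_den):
    """Upper-alpha critical value for an F distribution with 2 and df_den d.f."""
    return (df_den / 2.0) * (alpha ** (-2.0 / df_den) - 1.0)


p_value = f_tail_two_numerator_df(34.987, 75)
critical = f_critical_two_numerator_df(0.05, 75)
print(p_value, critical)
```

The exact critical value comes out close to 3.12, a little above the tabulated 3.07 quoted on the slide, which was presumably read from a table row with a different denominator d.f. The conclusion is unaffected, since F = 34.987 is far larger than either value.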

12
Type I SS –
Sequential Sum Squares Regression
— Type I SS
— The SSR due to a particular 𝑋-variable after including the
preceding 𝑋-variables
Type I SS = $SSR(X_{L+1} \mid X_1, \dots, X_L)$
— Increment of SSR by having an extra 𝑋-variable (i.e. $X_{L+1}$)
— The sequence of the variables being entered into the model
would affect the Type I SS

13
Type I SS –
Sequential Sum Squares Regression Cont’d

[Venn diagram: overlapping circles for Price, GrossFA and Age; the regions are labelled SSR(GrossFA) and SSR(Age|GrossFA)]
Overlapping area: counted in SSR(GrossFA), as GrossFA is the first 𝑋-variable being entered
Hence, SSR(GrossFA) + SSR(Age|GrossFA) = SSR(Full)
14
Type I SS –
Sequential Sum Squares Regression Cont’d
— For a regression model with two 𝑋-variables ($X_1$ and $X_2$)
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$
— $SSR_{Full} = SSR(X_1) + SSR(X_2 \mid X_1)$
— Partial F-test using Type I SS
$H_0: \beta_2 = 0$
$H_1: \beta_2 \neq 0$
$$F = \frac{SSR(X_2 \mid X_1)}{SSE_{full}/(n - K - 1)}$$
with 1, $(n - K - 1)$ d.f.

15
Type I SS –
Sequential Sum Squares Regression Cont’d
— For a regression model with 𝐾 𝑋-variables
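$\hat{Y} = b_0 + b_1 X_1 + \dots + b_L X_L + b_{L+1} X_{L+1} + \dots + b_K X_K$
— Partial F-test
$H_0: \beta_{L+1} = \beta_{L+2} = \dots = \beta_K = 0$
$H_1$: at least one of $\beta_{L+1}, \beta_{L+2}, \dots, \beta_K \neq 0$
$$F = \frac{SSR(X_{L+1}, \dots, X_K \mid X_1, \dots, X_L)/(K - L)}{SSE_{full}/(n - K - 1)}$$
with $(K - L)$, $(n - K - 1)$ d.f.
where $SSR(X_{L+1}, \dots, X_K \mid X_1, \dots, X_L)$
$= SSR(X_{L+1} \mid X_1, \dots, X_L) + SSR(X_{L+2} \mid X_1, \dots, X_{L+1}) + \dots + SSR(X_K \mid X_1, \dots, X_{K-1})$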
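
A short derivation shows why this sum of Type I SS equals the change in SSE used earlier. The notation $SSE(X_1, \dots, X_j)$, meaning the SSE of the model containing $X_1, \dots, X_j$, is introduced here for illustration and does not appear on the slides. Each sequential term is the drop in SSE from adding one more variable, so the sum telescopes:

```latex
\begin{align*}
SSR(X_j \mid X_1, \dots, X_{j-1})
  &= SSE(X_1, \dots, X_{j-1}) - SSE(X_1, \dots, X_j) \\
\sum_{j=L+1}^{K} SSR(X_j \mid X_1, \dots, X_{j-1})
  &= SSE(X_1, \dots, X_L) - SSE(X_1, \dots, X_K) \\
  &= SSE_{reduced} - SSE_{full}
\end{align*}
```

Both forms of the partial F statistic therefore share the same numerator and the same denominator.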
16
Example – Type I SS Cont’d
𝑋-variables being tested are
listed towards the end

SSR(Age)
SSR(Floor|Age)
SSR(GrossFA|Age, Floor)
SSR(NetFA|Age, Floor, GrossFA)
17
Example – Type I SS Cont’d
$H_0: \beta_{GrossFA} = \beta_{NetFA} = 0$
$H_1$: at least one of $\beta_{GrossFA}, \beta_{NetFA} \neq 0$

$$F = \frac{[SSR(\text{GrossFA} \mid \text{Age}, \text{Floor}) + SSR(\text{NetFA} \mid \text{Age}, \text{Floor}, \text{GrossFA})]/(K - L)}{SSE_{full}/(n - K - 1)}$$

$$= \frac{(28.4676 + 0.8634)/(4 - 2)}{31.4380/(80 - 4 - 1)} = 34.987$$

At $\alpha$ = 5%, d.f. = 2, 75, C.V. = 3.07

Reject $H_0$: “size” significantly influences apartment price.

The same partial F-test statistic and conclusion as in the analysis using the change in
SSE.
18
Example – Type I SS Cont’d
— What happens if we put the 𝑋-variables being tested at the beginning of the model statement?

Type I SS changed!
Ordering matters!

SSR(GrossFA)
SSR(NetFA|GrossFA)
SSR(Age|GrossFA, NetFA)
SSR(Floor|GrossFA, NetFA, Age)
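
The effect of ordering can be reproduced on a toy data set. The sketch below is an illustration only: the data are hypothetical (not the apartment data), and `sse` is a hand-written least-squares helper that solves the normal equations, standing in for whatever statistical software produced the slide output.

```python
def sse(columns, y):
    """SSE of a least-squares fit of y on an intercept plus the given predictor columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    a = [[sum(r[j] * r[k] for r in rows) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(rows, y))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(a[i][c]))
        a[c], a[piv] = a[piv], a[c]
        for i in range(p):
            if i != c:
                factor = a[i][c] / a[c][c]
                a[i] = [v - factor * w for v, w in zip(a[i], a[c])]
    b = [a[j][p] / a[j][j] for j in range(p)]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
               for r, yi in zip(rows, y))


# Hypothetical toy data with two correlated predictors.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [3, 4, 8, 7, 12, 10, 15, 16]

sst = sse([], y)                 # intercept-only model: SSE equals SST
sse_full = sse([x1, x2], y)

# Order 1: x1 entered first, then x2.
ssr_x1 = sst - sse([x1], y)
ssr_x2_given_x1 = sse([x1], y) - sse_full

# Order 2: x2 entered first, then x1.
ssr_x2 = sst - sse([x2], y)
ssr_x1_given_x2 = sse([x2], y) - sse_full

print("order x1, x2:", ssr_x1, ssr_x2_given_x1)
print("order x2, x1:", ssr_x2, ssr_x1_given_x2)
print("SSR(Full):   ", sst - sse_full)
```

In either order the two Type I SS add up to SSR(Full), but the amount credited to each variable depends on which one is entered first.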
19
Type II SS –
Partial Sum Squares Regression
— Type II SS
— The SSR due to a particular 𝑋-variable after including all the
other 𝑋-variables
Type II SS = $SSR(X_i \mid \text{all } X_j \text{ where } j \neq i)$
— Additional SSR caused only by this 𝑋-variable
— The sequence in which the variables are entered into the model does not affect the Type II SS
— Can be used to test the significance of an individual 𝑋-variable
$H_0: \beta_i = 0$
$H_1: \beta_i \neq 0$
$$\text{Partial } F = \frac{SSR(X_i \mid \text{all } X_j \text{ where } j \neq i)}{SSE_{full}/(n - K - 1)}$$
d.f. = 1, $(n - K - 1)$

20
Type II SS –
Partial Sum Squares Regression Cont’d

[Venn diagram: overlapping circles for Price, GrossFA and Age; the regions are labelled SSR(GrossFA|Age) and SSR(Age|GrossFA)]

Overlapping area: not considered in the calculation of any Type II SS

Hence, SSR(GrossFA|Age) + SSR(Age|GrossFA) ≠ SSR(Full)
21


Example – Type II SS Cont’d

• The individual Type II SS do not sum to the overall SSR
• For the last variable entered, the Type II SS coincides with its Type I SS
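
Both observations can be checked on a toy data set. As before, this is a sketch under stated assumptions: the data are hypothetical, and `sse` is a hand-written least-squares helper rather than the software used for the slide output. Each Type II SS is computed as the increase in SSE when one variable is dropped from the full model.

```python
def sse(columns, y):
    """SSE of a least-squares fit of y on an intercept plus the given predictor columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    a = [[sum(r[j] * r[k] for r in rows) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(rows, y))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(a[i][c]))
        a[c], a[piv] = a[piv], a[c]
        for i in range(p):
            if i != c:
                factor = a[i][c] / a[c][c]
                a[i] = [v - factor * w for v, w in zip(a[i], a[c])]
    b = [a[j][p] / a[j][j] for j in range(p)]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
               for r, yi in zip(rows, y))


# Hypothetical toy data with two correlated predictors.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2, 1, 4, 3, 6, 5, 8, 7],
}
y = [3, 4, 8, 7, 12, 10, 15, 16]

n, k = len(y), len(predictors)
sst = sse([], y)
sse_full = sse(list(predictors.values()), y)
ssr_full = sst - sse_full
mse_full = sse_full / (n - k - 1)

type2 = {}
for name in predictors:
    # Refit without this variable; the increase in SSE is its Type II SS.
    others = [col for other, col in predictors.items() if other != name]
    type2[name] = sse(others, y) - sse_full
    print(name, "Type II SS:", type2[name], "partial F:", type2[name] / mse_full)

print("sum of Type II SS:", sum(type2.values()), "SSR(Full):", ssr_full)
```

With correlated predictors the Type II SS add up to less than SSR(Full) here, because the shared part of the explained variation is credited to neither variable.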
22
Example – Type II SS Cont’d
— Type II SS for Floor = $SSR(X_{Floor} \mid X_{GrossFA}, X_{NetFA}, X_{Age})$ = 2.6356

— Partial F-test
$H_0: \beta_{Floor} = 0$
$H_1: \beta_{Floor} \neq 0$
$$\text{Partial } F = \frac{SSR(X_{Floor} \mid X_{GrossFA}, X_{NetFA}, X_{Age})}{SSE_{full}/(n - K - 1)} = \frac{2.6356}{31.4380/(80 - 4 - 1)} = 6.2877$$
At $\alpha$ = 5%, d.f. = 1, 75, C.V. = 4.00
Reject $H_0$, as partial F > C.V., i.e. Floor is significantly related to the apartment price
Same conclusion as the t-test, as $t^2 = (2.508)^2 \approx$ partial F
23
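
The link between the partial F statistic and the t statistic for Floor can be checked with a few lines of arithmetic. The sketch takes the Type II SS of 2.6356 and the t value of 2.508 from this example, and assumes $SSE_{full}$ = 31.4380 with n = 80 and K = 4; the variable names are illustrative.

```python
# Values from the Floor example; SSE_full, n and K are assumptions stated in the lead-in.
type2_ss_floor = 2.6356
sse_full = 31.4380
n, k = 80, 4

mse_full = sse_full / (n - k - 1)       # error mean square of the full model
partial_f = type2_ss_floor / mse_full   # partial F with 1 and n - K - 1 d.f.
t_squared = 2.508 ** 2                  # square of the reported t statistic

print(round(partial_f, 4), round(t_squared, 4))
```

The two numbers differ only in the third decimal place, which is what rounding the t statistic to three decimals would produce.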
Summary – Type I & Type II SS
— Type I SS is a decomposition of SSR, measuring the
contributions of predictors in a specific order
— Used for the partial F-test

— Type II SS is about the partial contribution of the predictor, after accounting for other predictors in the model
— Related to the t-test

24
