
Review: Multiple Regression: Holding The Other Explanatory Variables Constant or Fixed

The document discusses multiple regression analysis. Multiple regression allows modeling of a dependent variable as a function of more than one independent variable. It defines the general multiple regression model and explains how to interpret the slope coefficients in multiple regression compared to single regression. It also discusses omitted variable bias, which can occur when an important predictor variable is omitted from the regression model.


Review: multiple regression

• Definition: a regression with more than one independent variable


• The general multiple regression model with K independent variables is:

Yi = β0 + β1X1i + β2X2i + ... + βKXKi + εi

(Lecture 7. The lecture is based on teaching material from So Yoon Ahn.)

How to interpret β?

• A big difference between the multiple and single regression models is in the interpretation of the slope coefficients
• A slope coefficient now indicates the change in the average of the dependent variable associated with a one-unit increase in that explanatory variable, holding the other explanatory variables constant or fixed
• Example: LifeExpectancy = b0 + b1GDP + b2Population + e
• b1: conditional on population, a one-unit increase in GDP is associated with a b1-unit change in life expectancy
• b2: conditional on GDP, a one-unit increase in population is associated with a b2-unit change in life expectancy

Review: omitted variable bias

• The error e arises because of factors that influence Y but are not included in the regression function; so there are always omitted variables
• Sometimes, the omission of those variables can lead to bias in the OLS estimator
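To make the "holding the others fixed" interpretation concrete, here is a minimal simulation sketch. The numbers (true coefficients 40, 2, 0.1 and the variable distributions) are made up for illustration; the point is that OLS with both regressors recovers each slope as a conditional effect.

```python
# Hypothetical data: LifeExpectancy = 40 + 2*GDP + 0.1*Population + noise.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
gdp = rng.normal(10, 2, n)
pop = rng.normal(50, 5, n)
life = 40 + 2.0 * gdp + 0.1 * pop + rng.normal(0, 1, n)

# OLS via least squares: columns are constant, GDP, Population.
X = np.column_stack([np.ones(n), gdp, pop])
b = np.linalg.lstsq(X, life, rcond=None)[0]
print(b)  # b[1] ≈ 2.0: effect of GDP conditional on Population
```

Each estimated slope is close to its true value because the regression controls for the other variable rather than absorbing its influence.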
Review: omitted variable bias

• The bias in the OLS estimator that occurs as a result of an omitted factor is called omitted variable bias. For omitted variable bias to occur, the omitted factor "Z" must be:
1. A determinant of Y (i.e. Z is part of e); and
2. Correlated with the regressor X (i.e. corr(Z, X) ≠ 0)
• Both conditions must hold for the omission of Z to result in omitted variable bias.

Omitted variable bias

• Suppose our "correct" model is: Yi = β0 + β1X1i + β2X2i + εi
• But the researcher mistakenly ran the regression: Yi = α0 + α1X1i + εi
• We hope that E(α̂1) = β1
• But E(α̂1) = β1 + β2 · Cov(X1, X2) / Var(X1)
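The bias formula can be checked numerically. This sketch uses made-up coefficients (β0 = 1, β1 = 2, β2 = 3) and a made-up dependence of X2 on X1; the slope from the "short" regression of Y on X1 alone lands on β1 + β2·Cov(X1, X2)/Var(X1), not on β1.

```python
# Simulated check of E(alpha1_hat) = beta1 + beta2 * Cov(X1,X2)/Var(X1).
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x1 = rng.normal(0, 1, n)
x2 = 0.5 * x1 + rng.normal(0, 1, n)          # X2 correlated with X1
beta0, beta1, beta2 = 1.0, 2.0, 3.0
y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(0, 1, n)

# "Short" regression that mistakenly omits X2: slope of Y on X1 alone.
alpha1 = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)
predicted = beta1 + beta2 * np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)
print(alpha1, predicted)   # both ≈ 2 + 3 * 0.5 = 3.5, not 2
```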

Omitted variable bias: direction

• Suppose you regress Y on X, but omit variable Z. How would the coefficient estimate on X be biased under the following scenarios?

• 𝑍 is negatively correlated with both 𝑋 and 𝑌

• 𝑍 is positively correlated with 𝑋, but negatively correlated with 𝑌

• 𝑍 is not correlated with 𝑋, but positively correlated with 𝑌
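One way to reason about these scenarios is through the bias formula: the bias is βZ · Cov(X, Z)/Var(X). The sketch below simulates each case, assuming (as the scenarios suggest) that the sign of Z's coefficient in the true model matches the sign of its correlation with Y; the true coefficient on X is set to 1, so any gap from 1 is bias.

```python
# Sign of omitted variable bias in the three scenarios (simulated).
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

def short_slope(corr_xz, beta_z):
    """Slope from regressing Y on X alone when Z (coefficient beta_z) is omitted."""
    x = rng.normal(0, 1, n)
    z = corr_xz * x + rng.normal(0, 1, n)
    y = 1.0 * x + beta_z * z + rng.normal(0, 1, n)   # true beta_x = 1
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(short_slope(-0.5, -1.0))  # Z neg-corr with X and Y: biased upward (≈ 1.5)
print(short_slope(+0.5, -1.0))  # Z pos-corr with X, neg with Y: biased downward (≈ 0.5)
print(short_slope(0.0, +1.0))   # Z uncorrelated with X: no bias (≈ 1.0)
```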


Dummy variable

• A dummy variable is a variable that takes on the value 1 or 0
• Examples: male (= 1 if male, 0 otherwise), south (= 1 if in the south, 0 otherwise), marital status (= 1 if married, 0 otherwise), etc.
• Dummy variables are also called binary variables

A dummy independent variable

• Consider a simple model with one continuous variable (x) and one dummy (d)
• y = β0 + β1d + β2x + u
• This can be interpreted as an intercept shift
• If d = 0, then y = β0 + β2x + u
• If d = 1, then y = (β0 + β1) + β2x + u
• The case of d = 0 is the base group
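The intercept-shift reading can be seen in a quick sketch (the true values 2, 3, 0.5 are made up): the fitted line for d = 1 sits β1 above the line for d = 0, with the same slope.

```python
# Fit y = b0 + b1*d + b2*x on simulated data with true (b0, b1, b2) = (2, 3, 0.5).
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
x = rng.uniform(0, 10, n)
d = rng.integers(0, 2, n)            # dummy: 0 or 1
y = 2.0 + 3.0 * d + 0.5 * x + rng.normal(0, 0.5, n)

X = np.column_stack([np.ones(n), d, x])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]
print(b0, b1, b2)    # intercept for d=0 is b0; for d=1 it is b0 + b1
```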

Example of β1 > 0

[Figure: two parallel lines with common slope β2. The d = 1 line, y = β0 + β1 + β2x, lies β1 above the d = 0 line, y = β0 + β2x; the intercepts are β0 + β1 and β0.]

Dummies for Multiple Categories

• We can use dummy variables to control for something with multiple categories
• Suppose everyone in your data is either a HS dropout, HS grad only, or college grad
Multiple Categories (cont)

• Any categorical variable can be turned into a set of dummy variables
• Education: 0 if HS dropout, 1 if HS grad, and 2 if college grad. Recode it into three dummy variables:
• less_HS: 1 if education = 0, 0 otherwise
• HS_grad: 1 if education = 1, 0 otherwise
• college_or_above: 1 if education = 2, 0 otherwise
• If there are a lot of categories, it may make sense to group some together
• Example: top 10 ranking, 11-25, etc.

Dummy variable trap

• If we run Health = β0 + β1Less_HS + β2HS_grad + β3College_or_above + e
• The regression will not work
• This is called perfect multicollinearity

Constant  less_HS  HS_grad  college  health status
1         1        0        0        3
1         0        1        0        4
1         0        1        0        4
1         0        0        1        3
1         0        0        1        2
1         1        0        0        1
1         0        1        0        4

• Less_HS + HS_grad + college_or_above = constant = 1 in every row, so we cannot estimate the regression!
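A sketch of why the regression "will not work": with all three education dummies plus a constant, the design matrix (the seven rows from the table above) is rank-deficient, so OLS has no unique solution.

```python
# Perfect multicollinearity: the dummy columns sum to the constant column.
import numpy as np

X = np.array([
    # const, less_HS, HS_grad, college
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
])
# less_HS + HS_grad + college equals the constant column in every row:
print((X[:, 1] + X[:, 2] + X[:, 3] == X[:, 0]).all())  # True
print(np.linalg.matrix_rank(X))  # 3, not 4: the four columns are not independent
```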

How to get out of the dummy variable trap (1)?

• There are two ways of getting out of the dummy variable trap
• Way #1: omit one category of the dummy variables
• e.g. if we run the regression:
• Health = β0 + β1HS_grad + β2College_or_above + e
• We omit the HS-dropout dummy and treat it as the baseline. The categories we include are compared to the category we exclude
• How do we interpret β1?
• The average health status of HS grads relative to HS dropouts.
• It is the difference in average health between HS grads and HS dropouts
• Exercise: interpret β0, β2

How to get out of the dummy variable trap (2)?

• Way #2: omit the constant term
• Health = β1Less_HS + β2HS_grad + β3College_or_above + e
• How does this regression differ from the previous one?
• How do we interpret β1?
• The average health status of HS dropouts.
• Exercise: interpret β2 and β3
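Way #1 can be verified directly on the seven observations from the earlier table: with the less_HS dummy omitted and a constant kept, the intercept equals the dropout group's mean health and each slope equals that group's mean minus the dropout mean.

```python
# Coefficients from Way #1 equal group-mean differences (data from the table).
import numpy as np

less_hs = np.array([1, 0, 0, 0, 0, 1, 0])
hs_grad = np.array([0, 1, 1, 0, 0, 0, 1])
college = np.array([0, 0, 0, 1, 1, 0, 0])
health  = np.array([3, 4, 4, 3, 2, 1, 4], dtype=float)

# Omit less_HS, keep the constant.
X = np.column_stack([np.ones(7), hs_grad, college])
b0, b1, b2 = np.linalg.lstsq(X, health, rcond=None)[0]

base_mean = health[less_hs == 1].mean()   # dropout mean = (3 + 1)/2 = 2
print(b0, b1, b2)  # b0 = dropout mean; b1 = HS-grad mean minus dropout mean
```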
Exercise

• Now we are interested in knowing how GDP is related to different quarters. We have the following data:

Constant  Quarter1  Quarter2  Quarter3  Quarter4  GDP
1         1         0         0         0         1342
1         0         1         0         0         1654
1         0         0         1         0         1565
1         0         0         0         1         1807

• Design a regression to accomplish the task and interpret the βs. You can use either one of the two methods to get out of the dummy variable trap
• GDP = β0 + β1Quarter2 + β2Quarter3 + β3Quarter4 + e
• GDP = β1Quarter1 + β2Quarter2 + β3Quarter3 + β4Quarter4 + e

Review

• (1) When there are only dummy independent variables
• Differences in means of the dependent variable across the different groups
• (2) When there are dummy and continuous independent variables
• Allowing different intercepts for different groups
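A sketch of the second specification (omit the constant) on the four rows above: each coefficient is then simply the average GDP of its quarter, which with one observation per quarter is that quarter's GDP value.

```python
# GDP on all four quarter dummies, no constant: coefficients = quarter means.
import numpy as np

X = np.eye(4)                                   # Quarter1..Quarter4 dummies
gdp = np.array([1342, 1654, 1565, 1807], dtype=float)

b = np.linalg.lstsq(X, gdp, rcond=None)[0]
print(b)  # [1342. 1654. 1565. 1807.]
```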

F-test

• With a multiple regression
Yi = β0 + β1X1i + ... + βkXki + ei,
we can still perform statistical inference using a p-value or confidence interval to determine whether a specific β is statistically different from 0 (or from another number)
• Now, if we want to test whether a group of variables jointly has any effect on Y, we use the F-test.
• H0: β1 = β2 = ... = βk = 0
• H1: at least one of the βs is not zero

F-test

• Let STR = student-teacher ratio, Expn = expenditures per pupil, and PctEL = percent of English learners
• Consider the population regression model:
TestScorei = β0 + β1STRi + β2Expni + β3PctELi + ui
• The null hypothesis that "school resources don't matter," and the alternative that they do, correspond to:
• H0: β1 = 0 and β2 = 0
• H1: either β1 ≠ 0 or β2 ≠ 0, or both

F-statistic

• The F-statistic tests all parts of a joint hypothesis at once: it is a test of a joint hypothesis
• A joint hypothesis specifies a value for two or more coefficients; that is, it imposes a restriction on two or more coefficients.
• In general, a joint hypothesis will involve q restrictions. In the example above, q = 2, and the two restrictions are β1 = 0 and β2 = 0.

The "restricted" and "unrestricted" regressions

• Example: are the coefficients on STR and Expn zero?
• Unrestricted population regression (under H1):
TestScorei = β0 + β1STRi + β2Expni + β3PctELi + ui
• Restricted population regression (that is, under H0):
TestScorei = β0 + β3PctELi + ui (why?)
• The number of restrictions under H0 is q = 2 (why?).
• The fit will be better (R2 will be higher) in the unrestricted regression (why?)
• By how much must the R2 increase for the coefficients on Expn and PctEL to be judged statistically significant?

Simple formula for the F-statistic:

F = [(R2unrestricted − R2restricted) / q] / [(1 − R2unrestricted) / (n − kunrestricted − 1)]

where:
• R2restricted = the R2 for the restricted regression
• R2unrestricted = the R2 for the unrestricted regression
• q = the number of restrictions under the null
• kunrestricted = the number of regressors in the unrestricted regression
• The bigger the difference between the restricted and unrestricted R2s (that is, the greater the improvement in fit from adding the variables in question), the larger is the F.
Example: F-statistic

Restricted regression:
Test score = 644.7 − 0.671 PctEL, R2restricted = 0.4149
            (1.0)    (0.032)

Unrestricted regression:
Test score = 649.6 − 0.29 STR + 3.87 Expn − 0.656 PctEL
            (15.5)   (0.48)     (1.59)      (0.032)
R2unrestricted = 0.4366, kunrestricted = 3, q = 2

so F = [(0.4366 − 0.4149) / 2] / [(1 − 0.4366) / (420 − 3 − 1)] = 8.01

F-statistic: summary

• The F-statistic rejects when adding the two variables increases the R2 by "enough", that is, when adding the two variables improves the fit of the regression by "enough"
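The arithmetic in the example above can be reproduced in a few lines:

```python
# F-statistic from the restricted and unrestricted R-squared values.
r2_u, r2_r = 0.4366, 0.4149   # unrestricted and restricted R2
q, n, k = 2, 420, 3           # restrictions, sample size, unrestricted regressors

F = ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k - 1))
print(round(F, 2))  # 8.01
```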

Example of F-test

• Healthi = β0 + β1Educi + β2Agei + β3malei + ei
• I want to know whether education, age and gender jointly affect health
• H0: β1 = β2 = β3 = 0
• What is H1?
• Where is the F-test in STATA output?
• Top right panel
• Like the t-stat, the F-stat follows a specific distribution; a bigger F-stat means stronger evidence against H0
• "Prob > F" is like the p-value in a t-test. We can compare "Prob > F" with α to decide whether to reject H0
