Lecture 5

This document discusses multiple regression analysis and estimation. It introduces multiple linear regression, where a variable y is explained by variables x1, x2, ..., xk. The multiple regression model can incorporate more explanatory factors, explicitly control for other factors, and allow for more flexible functional forms compared to simple regression. The coefficients are estimated using the ordinary least squares (OLS) method by minimizing the sum of squared residuals. Each coefficient has a ceteris paribus interpretation, holding other variables fixed.


Introductory Econometrics: A Modern Approach (7e)

Chapter 5
Multiple Regression Analysis:
Estimation

© 2020 Cengage. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.

Parallels with Simple Regression


• b0 is still the intercept
• b1 through bk are all called slope parameters
• u is still the error term (or disturbance)
• We still need to make a zero conditional mean assumption, which now becomes
  E(u|x1, x2, …, xk) = 0
• We still minimize the sum of squared residuals, so there are k + 1 first-order
  conditions
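
Since the equation images from the original slide are not reproduced in this extract, here is a sketch of those k + 1 first-order conditions in the usual notation:

$$\sum_{i=1}^{n} \left(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}\right) = 0$$

$$\sum_{i=1}^{n} x_{ij}\left(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}\right) = 0, \qquad j = 1, \dots, k.$$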


Multiple Regression Analysis: Estimation (1 of 37)


• Definition of the multiple linear regression model
• “Explains variable y in terms of variables x1, x2,…, xk”
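
The model equation itself appears only as an image in the original slide; in standard form the multiple linear regression model is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u,$$

where $\beta_0$ is the intercept, $\beta_1, \dots, \beta_k$ are the slope parameters, and $u$ is the error term.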


Multiple Regression Analysis: Estimation (2 of 37)


• Motivation for multiple regression
• Incorporate more explanatory factors into the model
• Explicitly hold fixed other factors that would otherwise end up in the error term u
• Allow for more flexible functional forms

• Example: Wage equation
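
The wage equation on the original slide is not reproduced here; a typical specification of this kind, along the lines of the textbook's motivating example (take the exact variable set as an assumption), is

$$wage = \beta_0 + \beta_1\, educ + \beta_2\, exper + u,$$

where including experience explicitly means $\beta_1$ measures the effect of education on wages holding experience fixed.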


Multiple Regression Analysis: Estimation (3 of 37)


• Example: Average test scores and per student spending

• Per-student spending is likely to be correlated with average family income
  at a given high school because of school financing.
• Omitting average family income from the regression would lead to a biased
  estimate of the effect of spending on average test scores.
• In a simple regression model, the effect of per-student spending would
  partly include the effect of family income on test scores.

Multiple Regression Analysis: Estimation (4 of 37)


• Example: Family income and family consumption

• The model has two explanatory variables: income and income squared
• Consumption is explained as a quadratic function of income
• One has to be very careful when interpreting the coefficients:
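
The equations from the original slide are not shown in this extract; under a quadratic specification the model and the implied marginal effect of income are (sketch)

$$cons = \beta_0 + \beta_1\, inc + \beta_2\, inc^2 + u, \qquad \frac{\Delta cons}{\Delta inc} \approx \beta_1 + 2\beta_2\, inc,$$

so the effect of an additional unit of income depends on the level of income, and $\beta_1$ by itself cannot be read as "the" effect of income.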


Multiple Regression Analysis: Estimation (5 of 37)


• Example: CEO salary, sales and CEO tenure

• Model assumes a constant elasticity relationship between CEO salary and the
sales of his or her firm.
• Model assumes a quadratic relationship between CEO salary and his or her
tenure with the firm.
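
The equation is omitted from this text version; a specification with exactly these two features (constant elasticity in sales, quadratic in tenure), in the spirit of the textbook example, would look like

$$\log(salary) = \beta_0 + \beta_1 \log(sales) + \beta_2\, ceoten + \beta_3\, ceoten^2 + u,$$

where the variable names are illustrative.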

• Meaning of “linear” regression


• The model has to be linear in the parameters (not in the variables)


Multiple Regression Analysis: Estimation (6 of 37)


• OLS Estimation of the multiple regression model
• Random sample

• Regression residuals

• Minimize sum of squared residuals
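
Purely as an illustration (not part of the original slides), here is a minimal sketch of OLS estimation of a multiple regression on simulated data; all variable names and parameter values below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated regressors and error term (hypothetical data)
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)   # deliberately correlated with x1
u = rng.normal(size=n)

# "True" population model: y = 1 + 2*x1 - 3*x2 + u
y = 1 + 2 * x1 - 3 * x2 + u

# OLS chooses (b0, b1, b2) to minimize the sum of squared residuals
X = np.column_stack([np.ones(n), x1, x2])      # design matrix with intercept
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

residuals = y - X @ beta_hat
print("OLS estimates (b0, b1, b2):", beta_hat)
print("Sum of squared residuals:", np.sum(residuals**2))
```

With a sample this size, the estimates land close to the values used to generate the data, which is just the OLS machinery described on the slide at work.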


Interpreting Multiple Regression

$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2 + \dots + \hat\beta_k x_k$, so

$\Delta\hat{y} = \hat\beta_1 \Delta x_1 + \hat\beta_2 \Delta x_2 + \dots + \hat\beta_k \Delta x_k$,

so holding $x_2, \dots, x_k$ fixed implies that

$\Delta\hat{y} = \hat\beta_1 \Delta x_1$, that is, each $\hat\beta_j$ has a ceteris paribus interpretation


Multiple Regression Analysis: Estimation (7 of 37)


• Interpretation of the multiple regression model

• The multiple linear regression model manages to hold the values of the other
  explanatory variables fixed even if, in reality, they are correlated with the
  explanatory variable under consideration.
• “Ceteris paribus” interpretation
• It still has to be assumed that unobserved factors do not change when the
  explanatory variables are changed.

A “Partialling Out” Interpretation

Consider the case where $k = 2$, i.e.

$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$. Then

$\hat\beta_1 = \dfrac{\sum_{i} \hat{r}_{i1}\, y_i}{\sum_{i} \hat{r}_{i1}^{\,2}}$, where $\hat{r}_{i1}$ are

the residuals from the estimated regression $\hat{x}_1 = \hat\gamma_0 + \hat\gamma_2 x_2$


“Partialling Out” continued


• The previous equation implies that regressing y on x1 and x2 gives the same
  estimated effect of x1 as regressing y on the residuals from a regression of
  x1 on x2
• This means that only the part of xi1 that is uncorrelated with xi2 is being
  related to yi, so we are estimating the effect of x1 on y after x2 has been
  “partialled out”
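
A small numerical check of this equivalence, added as an illustration only (simulated data, hypothetical names):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Simulated data with correlated regressors (made-up values)
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

def ols(X, y):
    """OLS coefficients of y on X (X should already contain a constant column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# (1) Coefficient on x1 in the multiple regression of y on (1, x1, x2)
b_multiple = ols(np.column_stack([const, x1, x2]), y)[1]

# (2) Partialling out: regress x1 on (1, x2), keep residuals r1,
#     then compute sum(r1*y) / sum(r1^2) as on the slide
g = ols(np.column_stack([const, x2]), x1)
r1 = x1 - (g[0] + g[1] * x2)
b_partial = np.sum(r1 * y) / np.sum(r1**2)

print(b_multiple, b_partial)   # identical up to floating-point rounding
```

The two numbers coincide because r1 is, by construction, the part of x1 that is uncorrelated (in the sample) with x2.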


Simple vs Multiple Reg Estimate


Compare the simple regression $\tilde{y} = \tilde\beta_0 + \tilde\beta_1 x_1$

with the multiple regression $\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$.

Generally, $\tilde\beta_1 \neq \hat\beta_1$ unless:

$\hat\beta_2 = 0$ (i.e. no partial effect of $x_2$), OR

$x_1$ and $x_2$ are uncorrelated in the sample
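
For reference, the standard relationship between the two estimates (a textbook result, stated here as a sketch) is

$$\tilde\beta_1 = \hat\beta_1 + \hat\beta_2\, \tilde\delta_1,$$

where $\tilde\delta_1$ is the slope from the simple regression of $x_2$ on $x_1$; the two conditions above are exactly the cases in which the second term is zero.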


Multiple Regression Analysis: Estimation (8 of 37)


• Example: Determinants of college GPA

• Interpretation
• Holding ACT fixed, another point on the high school grade point average is
  associated with about .453 more points of college grade point average
• Or: if we compare two students with the same ACT, but the hsGPA of student A
  is one point higher, we predict student A to have a colGPA that is .453 higher
  than that of student B
• Holding high school grade point average fixed, another 10 points on the ACT
  are associated with less than one point on the college GPA

Multiple Regression Analysis: Estimation (9 of 37)


• Properties of OLS on any sample of data
• Fitted values and residuals

• Algebraic properties of OLS regression
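
The formulas are shown only as images in the original deck; the fitted values, residuals, and algebraic properties referred to here are, in standard notation,

$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \dots + \hat\beta_k x_{ik}, \qquad \hat u_i = y_i - \hat y_i,$$

$$\sum_{i=1}^{n} \hat u_i = 0, \qquad \sum_{i=1}^{n} x_{ij}\,\hat u_i = 0 \;\; (j = 1, \dots, k),$$

and the point $(\bar x_1, \dots, \bar x_k, \bar y)$ always lies on the estimated regression line.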


Multiple Regression Analysis: Estimation (10 of 37)


• “Partialling out” interpretation of multiple regression

• One can show that the estimated coefficient of an explanatory variable in a
  multiple regression can be obtained in two steps:
• 1) Regress the explanatory variable on all other explanatory variables
• 2) Regress the dependent variable on the residuals from this regression

• Why does this procedure work?
• The residuals from the first regression are the part of the explanatory variable
  that is uncorrelated with the other explanatory variables.
• The slope coefficient of the second regression therefore represents the isolated
  effect of the explanatory variable on the dependent variable.


Goodness-of-Fit
We can think of each observation as being made up of an explained part and an
unexplained part, $y_i = \hat{y}_i + \hat{u}_i$. We then define the following:

$\sum (y_i - \bar{y})^2$ is the total sum of squares (SST)

$\sum (\hat{y}_i - \bar{y})^2$ is the explained sum of squares (SSE)

$\sum \hat{u}_i^{\,2}$ is the residual sum of squares (SSR)

Then SST = SSE + SSR


Multiple Regression Analysis: Estimation (11 of 37)


• Goodness-of-Fit
• Decomposition of total variation

• R squared

• Alternative expression for R squared
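
The formulas themselves are images in the original slide; the standard definitions they refer to are

$$SST = SSE + SSR, \qquad R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST},$$

and the alternative expression: $R^2$ equals the squared sample correlation coefficient between the actual and the fitted values of the dependent variable,

$$R^2 = \left[\operatorname{corr}(y_i, \hat y_i)\right]^2.$$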


More about R-squared


• R² can never decrease when another independent variable is added to a
  regression, and usually it will increase

• Because R² will usually increase with the number of independent variables,
  it is not a good way to compare models


Multiple Regression Analysis: Estimation (12 of 37)


• Example: Explaining arrest records

• Interpretation:
• If the proportion of prior arrests increases by 0.5, the predicted fall in arrests
  is 7.5 arrests per 100 men.
• If the months in prison increase from 0 to 12, the predicted fall in arrests is
  about 0.41 arrests (12 × 0.034) for a particular man.
• If the quarters employed increase by 1, the predicted fall in arrests is 10.4
  arrests per 100 men.

Multiple Regression Analysis: Estimation (13 of 37)


• Example: Explaining arrest records (cont.)
• An additional explanatory variable is added.

• Interpretation:
• Average prior sentence increases number of arrests (?)
• Limited additional explanatory power as R-squared increases by little

• General remark on R-squared


• Even if R-squared is small (as in the given example), regression may still provide
good estimates of ceteris paribus effects.

Multiple Regression Analysis: Estimation (14 of 37)


• Standard assumptions for the multiple regression model
• Assumption MLR.1 (Linear in parameters)

• Assumption MLR.2 (Random sampling)
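
The assumption statements appear only as images in the original deck; in standard form they are:

MLR.1 (Linear in parameters): in the population, $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$.

MLR.2 (Random sampling): the data $\{(x_{i1}, \dots, x_{ik}, y_i) : i = 1, \dots, n\}$ are a random sample from the population model.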


Multiple Regression Analysis: Estimation (15 of 37)


• Standard assumptions for the multiple regression model (cont.)
• Assumption MLR.3 (No perfect collinearity)
• In the sample (and therefore in the population), none of the independent
variables is constant and there are no exact linear relationships among the
independent variables.

• Remarks on MLR.3
• The assumption only rules out perfect collinearity/correlation between
  explanatory variables; imperfect correlation is allowed.
• If an explanatory variable is a perfect linear combination of other explanatory
  variables, it is superfluous and may be eliminated.
• Constant variables are also ruled out (they are collinear with the intercept).


Multiple Regression Analysis: Estimation (16 of 37)


• Example for perfect collinearity: small sample

• Example for perfect collinearity: relationships between regressors
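
The slide's own example equations are not reproduced in this extract; as hypothetical illustrations of the two cases (assumptions, not necessarily the examples on the slide): with a very small sample, $n < k + 1$, the regressors are automatically linearly dependent in the sample, so OLS cannot be computed; and in a model such as $voteA = \beta_0 + \beta_1 expendA + \beta_2 expendB + \beta_3 totexpend + u$ with $totexpend = expendA + expendB$, the last regressor is an exact linear combination of the first two, so MLR.3 fails.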


Multiple Regression Analysis: Estimation (17 of 37)


• Standard assumptions for the multiple regression model (cont.)
• Assumption MLR.4 (Zero conditional mean)

• In a multiple regression model, the zero conditional mean assumption is much
  more likely to hold because fewer things end up in the error.

• Example: Average test scores
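
The equation for this example is shown only as an image; in line with the earlier test-score discussion, the idea is that a specification that includes average family income, for instance

$$avgscore = \beta_0 + \beta_1\, expend + \beta_2\, avginc + u,$$

takes family income out of the error term, making $E(u \mid expend, avginc) = 0$ more plausible than in the simple regression that omits it (variable names here are illustrative).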
