Multiple Regression Analysis: Estimation
Regression Analysis with Cross-Sectional Data
Meeting 3
The Model with Two Independent Variables

Some Examples

Some simple examples show how multiple regression analysis can solve problems that cannot be solved by simple regression. The first example is a simple variation of the wage equation introduced in Chapter 2 for obtaining the effect of education on hourly wage:

wage = β0 + β1educ + β2exper + u, [3.1]

where exper is years of labor market experience. Thus, wage is determined by the two independent variables, education and experience, and by other unobserved factors contained in u. We are still primarily interested in the effect of educ on wage, holding fixed all other factors affecting wage; that is, we are interested in the parameter β1. Equation (3.1) effectively takes exper out of the error term and puts it explicitly in the equation. Because exper appears in the equation, its coefficient, β2, measures the ceteris paribus effect of exper on wage, which is also of some interest.

A second example relates average student test scores to per student spending and average family income:

avgscore = β0 + β1expend + β2avginc + u. [3.2]

The coefficient of interest for policy purposes is β1, the ceteris paribus effect of expend on avgscore. By including avginc explicitly in the model, we are able to control for its effect on avgscore. This is likely to be important because average family income tends to be correlated with per student spending: spending levels are often determined by both property and local income taxes. In simple regression analysis, avginc would be included in the error term, which would likely be correlated with expend, causing the OLS estimator of β1 in the two-variable model to be biased.

In the two previous similar examples, we have shown how observable factors other than the variable of primary interest [educ in equation (3.1) and expend in equation (3.2)] can be included in a regression model. Generally, we can write a model with two independent variables as

y = β0 + β1x1 + β2x2 + u, [3.3]

where β0 is the intercept, β1 measures the change in y with respect to x1, holding other factors fixed, and β2 measures the change in y with respect to x2, holding other factors fixed.

Multiple regression analysis is also useful for generalizing functional relationships. As an example, suppose family consumption (cons) is a quadratic function of family income (inc):

cons = β0 + β1inc + β2inc² + u, [3.4]

where u contains other factors affecting consumption. In this model, consumption depends on only one observed factor, income; so it might seem that it can be handled in a simple regression framework. But the model falls outside simple regression because it contains two functions of income, inc and inc² (and therefore three parameters, β0, β1, and β2). Nevertheless, the consumption function is easily written as a regression model with two independent variables by letting x1 = inc and x2 = inc².

Mechanically, there will be no difference in using the method of ordinary least squares (introduced in Section 3.2) to estimate equations as different as (3.1) and (3.4). Each equation can be written as (3.3), which is all that matters for computation. There is, however, an important difference in how one interprets the parameters. In equation (3.1), β1 is the ceteris paribus effect of educ on wage. The parameter β1 has no such interpretation in (3.4). In other words, it makes no sense to measure the effect of inc on cons while holding inc² fixed, because if inc changes, then so must inc²! Instead, the change in consumption with respect to the change in income (the marginal propensity to consume) is approximated by

Δcons/Δinc ≈ β1 + 2β2inc.

See Appendix A for the calculus needed to derive this equation. In other words, the marginal effect of income on consumption depends on β2 as well as on β1 and the level of income. This example shows that, in any particular application, the definitions of the independent variables are crucial. But for the theoretical development of multiple regression, we can be vague about such details. We will study examples like this more completely in Chapter 6.
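To make the marginal-propensity-to-consume calculation concrete, here is a minimal sketch in Python using only numpy. The data are simulated and the true parameter values are invented for illustration; the sketch fits the quadratic model (3.4) by OLS and evaluates β̂1 + 2β̂2·inc at a chosen income level.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
inc = rng.uniform(10, 100, n)                    # simulated income
# simulated consumption from a quadratic model (true betas are made up)
cons = 5 + 0.8 * inc - 0.002 * inc**2 + rng.normal(0, 2, n)

# Write (3.4) as a two-regressor model: x1 = inc, x2 = inc**2
X = np.column_stack([np.ones(n), inc, inc**2])
b0, b1, b2 = np.linalg.lstsq(X, cons, rcond=None)[0]

# Marginal propensity to consume at a given income level: b1 + 2*b2*inc
inc0 = 50.0
print(f"MPC at inc = {inc0}: {b1 + 2 * b2 * inc0:.3f}")  # near 0.8 - 0.004*50 = 0.6
```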
In the model with two independent variables, the key assumption about how u is related to x1 and x2 is

E(u|x1, x2) = 0. [3.5]

The interpretation of condition (3.5) is similar to the interpretation of Assumption SLR.4 for simple regression analysis. It means that, for any values of x1 and x2 in the population, the average of the unobserved factors is equal to zero. As with simple regression, the important part of the assumption is that the expected value of u is the same for all combinations of x1 and x2; that this common value is zero is no assumption at all as long as the intercept β0 is included in the model (see Section 2.1).
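Assumption (3.5) is what makes the "control for x2" logic work. A small simulation (a sketch with made-up numbers, not taken from the text) shows the bias discussed for the avgscore example: when x2 is correlated with x1 but left in the error term, simple regression misestimates β1, while the multiple regression recovers it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
x2 = rng.normal(0, 1, n)                 # think avginc
x1 = 0.7 * x2 + rng.normal(0, 1, n)      # think expend, correlated with x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(0, 1, n)   # true beta1 = 2

# Simple regression of y on x1 alone leaves x2 in the error term
b_simple = np.linalg.lstsq(np.column_stack([np.ones(n), x1]), y, rcond=None)[0]
# Multiple regression controls for x2
b_mult = np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y, rcond=None)[0]

print(b_simple[1])   # about 3.4: biased, since x2 was omitted
print(b_mult[1])     # about 2.0: unbiased under E(u|x1, x2) = 0
```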
As another example, consider a model for CEO salaries,

log(salary) = β0 + β1log(sales) + β2ceoten + β3ceoten² + u,

where ceoten is tenure as CEO. This fits into the multiple regression model (with k = 3) by defining y = log(salary), x1 = log(sales), x2 = ceoten, and x3 = ceoten². As we know from Chapter 2, the parameter β1 is the elasticity of salary with respect to sales.
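In code, "fitting into the multiple regression framework" just means building the transformed columns before running OLS. A brief sketch (all numbers are hypothetical, not from any CEO data set):

```python
import numpy as np

# Hypothetical observations: salary ($1000s), sales ($ millions), ceoten (years)
salary = np.array([1200.0, 800.0, 2100.0, 950.0, 1500.0, 700.0])
sales = np.array([5500.0, 980.0, 12000.0, 2300.0, 6100.0, 450.0])
ceoten = np.array([6.0, 2.0, 11.0, 4.0, 8.0, 1.0])

# y = log(salary); x1 = log(sales), x2 = ceoten, x3 = ceoten**2 (so k = 3)
y = np.log(salary)
X = np.column_stack([np.ones(len(y)), np.log(sales), ceoten, ceoten**2])
# X is the design matrix; OLS on (y, X) estimates beta0 through beta3
```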
More on Functional Form

In this section, we cover some variations and extensions on functional forms that often arise in applied work. One example is the cubic cost function

cost = β0 + β1quantity + β2quantity² + β3quantity³ + u.

Estimating such a model causes no complications. Interpreting the parameters is more involved (though straightforward using calculus); we do not study these models further.

More on Using Logarithmic Functional Forms
We begin by reviewing how to interpret the parameters in the model

log(price) = β0 + β1log(nox) + β2rooms + u, [6.6]

where these variables are taken from Example 4.5. Recall that throughout the text log(x) is the natural log of x. The coefficient β1 is the elasticity of price with respect to nox (pollution). The coefficient β2 is the change in log(price) when Δrooms = 1; as we have seen many times, when multiplied by 100, this is the approximate percentage change in price. Recall that 100·β2 is sometimes called the semi-elasticity of price with respect to rooms.

A regression model must be based on a relationship grounded in theory or common sense!

When equation (6.6) is estimated using the data in HPRICE2.RAW, we obtain

log(price) = 9.23 - .718 log(nox) + .306 rooms
            (0.19)  (.066)          (.019)       [6.7]

Models with Interaction Terms

Sometimes, it is natural for the partial effect, elasticity, or semi-elasticity of the dependent variable with respect to an explanatory variable to depend on the magnitude of yet another explanatory variable. For example, in the model

price = β0 + β1sqrft + β2bdrms + β3sqrft·bdrms + β4bthrms + u,

the partial effect of bdrms on price (holding all other variables fixed) is

Δprice/Δbdrms = β2 + β3sqrft. [6.17]
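As a quick numerical illustration of (6.17), the sketch below simulates data from an interaction model (the true coefficients are invented, not the HPRICE2.RAW estimates) and evaluates the partial effect of bdrms at several square-footage levels.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000
sqrft = rng.uniform(1_000, 4_000, n)
bdrms = rng.integers(1, 6, n).astype(float)
bthrms = rng.integers(1, 4, n).astype(float)
# True coefficients are made up; note the sqrft*bdrms interaction term
price = (20 + 0.12 * sqrft + 10 * bdrms + 0.005 * sqrft * bdrms
         + 15 * bthrms + rng.normal(0, 25, n))

X = np.column_stack([np.ones(n), sqrft, bdrms, sqrft * bdrms, bthrms])
b = np.linalg.lstsq(X, price, rcond=None)[0]   # b = (b0, b1, b2, b3, b4)

# Partial effect of bdrms depends on sqrft: b2 + b3 * sqrft, as in (6.17)
for s in (1_500, 2_500, 3_500):
    print(f"one more bedroom at sqrft = {s}: {b[2] + b[3] * s:.1f}")
```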
Interpreting the OLS Regression Equation

We begin with the case of two independent variables:

ŷ = β̂0 + β̂1x1 + β̂2x2. [3.14]

The intercept β̂0 in equation (3.14) is the predicted value of y when x1 = 0 and x2 = 0. Sometimes, setting x1 and x2 both equal to zero is an interesting scenario; in other cases, it will not make sense. Nevertheless, the intercept is always needed to obtain a prediction of y from the OLS regression line, as (3.14) makes clear.

More important than the details underlying the computation of the β̂j is the interpretation of the estimated equation. The estimates β̂1 and β̂2 have partial effect, or ceteris paribus, interpretations. From equation (3.14), we have

Δŷ = β̂1Δx1 + β̂2Δx2,

so we can obtain the predicted change in y given the changes in x1 and x2. (Note how the intercept has nothing to do with the changes in y.) In particular, when x2 is held fixed, so that Δx2 = 0, then

Δŷ = β̂1Δx1,

holding x2 fixed. The key point is that, by including x2 in our model, we obtain a coefficient on x1 with a ceteris paribus interpretation. This is why multiple regression analysis is so useful. Similarly,

Δŷ = β̂2Δx2,

holding x1 fixed.
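A two-line computation makes the partial-effect reading of (3.14) concrete. The slope values here are borrowed from equation (3.15) purely for illustration, with Δx1 and Δx2 chosen arbitrarily:

```python
# Fitted line yhat = b0 + b1*x1 + b2*x2; slopes borrowed from (3.15) for illustration
b0, b1, b2 = 1.29, 0.453, 0.0094

dx1, dx2 = 0.5, 10.0               # arbitrary changes in x1 and x2
print(b1 * dx1 + b2 * dx2)         # predicted change in y; the intercept drops out
print(b1 * dx1)                    # ceteris paribus effect of x1 (dx2 = 0)
```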
An Interpretation Example

EXAMPLE 3.1: DETERMINANTS OF COLLEGE GPA

The variables in GPA1.RAW include the college grade point average (colGPA), high school GPA (hsGPA), and achievement test score (ACT) for a sample of 141 students from a large university; both college and high school GPAs are on a four-point scale. We obtain the following OLS regression line:

colGPA = 1.29 + .453 hsGPA + .0094 ACT. [3.15]
Since exper and tenure each increase by one year, we just add the coefficients on exper and tenure and multiply by 100 to turn the effect into a percentage, or about 2.6%.

OLS Fitted Values and Residuals
After obtaining the OLS regression line (3.11), we can obtain a fitted or predicted value for each observation. For observation i, the fitted value is simply

ŷi = β̂0 + β̂1xi1 + β̂2xi2 + … + β̂kxik, [3.20]

which is just the predicted value obtained by plugging the values of the independent variables for observation i into equation (3.11). We should not forget about the intercept in obtaining the fitted values; otherwise, the answer can be very misleading. As an example, if in (3.15), hsGPAi = 3.5 and ACTi = 24, then colGPAi = 1.29 + .453(3.5) + .0094(24) = 3.101 (rounded to three places after the decimal).

Normally, the actual value yi for any observation i will not equal the predicted value, ŷi: OLS minimizes the average squared prediction error, which says nothing about the prediction error for any particular observation. The residual for observation i is defined just as in the simple regression case,

ûi = yi - ŷi. [3.21]
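The sketch below reproduces the 3.101 calculation from (3.15) and then forms residuals as in (3.21). The actual colGPA outcomes are made up, since only the fitted line comes from the text.

```python
import numpy as np

# Coefficients from the estimated line (3.15)
b = np.array([1.29, 0.453, 0.0094])

hsGPA = np.array([3.5, 2.8, 3.9])          # hypothetical students
ACT = np.array([24.0, 20.0, 31.0])
colGPA = np.array([3.0, 2.7, 3.6])         # hypothetical actual outcomes

X = np.column_stack([np.ones(len(ACT)), hsGPA, ACT])
fitted = X @ b                             # yhat_i as in (3.20); intercept included
resid = colGPA - fitted                    # uhat_i = y_i - yhat_i, as in (3.21)

print(round(fitted[0], 3))                 # 3.101, the worked example above
print(resid)                               # uhat_i > 0 means y_i is underpredicted
```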
There is a residual for each observation. If ûi > 0, then ŷi is below yi, which means that, for this observation, yi is underpredicted. If ûi < 0, then yi < ŷi, and yi is overpredicted.

The OLS fitted values and residuals have some important properties that are immediate extensions from the single variable case:

1. The sample average of the residuals is zero, and so the sample average of the fitted values ŷi equals ȳ.
2. The sample covariance between each independent variable and the OLS residuals is zero. Consequently, the sample covariance between the OLS fitted values and the OLS residuals is zero.
3. The point (x̄1, x̄2, …, x̄k, ȳ) is always on the OLS regression line: ȳ = β̂0 + β̂1x̄1 + β̂2x̄2 + … + β̂kx̄k.
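These three properties are algebraic facts about OLS, so they can be checked to machine precision on any simulated data set; a minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + 2 * x1 - x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ b
uhat = y - yhat

print(np.isclose(uhat.mean(), 0))                           # property 1
print(np.isclose(np.cov(x1, uhat)[0, 1], 0))                # property 2 (each regressor)
print(np.isclose(np.cov(yhat, uhat)[0, 1], 0))              # ... and the fitted values
print(np.isclose(y.mean(), b @ [1, x1.mean(), x2.mean()]))  # property 3: point of means
```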
Goodness-of-Fit

The fact that R² never decreases when any variable is added to a regression makes it a poor tool for deciding whether one variable or several variables should be added to a model. The factor that should determine whether an explanatory variable belongs in a model is whether the explanatory variable has a nonzero partial effect on y in the population. We will show how to test this hypothesis in Chapter 4 when we cover statistical inference. We will also see that, when used properly, R² allows us to test a group of variables to see if it is important for explaining y. For now, we use it as a goodness-of-fit measure for a given model.

Example 3.5 deserves a final word of caution.
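The non-decreasing property of R² is easy to see numerically: add a pure-noise column to a regression and R² cannot fall. A sketch on simulated data:

```python
import numpy as np

def r_squared(y, X):
    """R^2 = 1 - SSR/SST for an OLS fit of y on the columns of X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    ssr = np.sum((y - X @ b) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - ssr / sst

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
y = 1 + 2 * x1 + rng.normal(size=n)
junk = rng.normal(size=n)                                     # noise, unrelated to y

print(r_squared(y, np.column_stack([np.ones(n), x1])))        # e.g. about 0.80
print(r_squared(y, np.column_stack([np.ones(n), x1, junk])))  # never smaller
```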
THE GAUSS-MARKOV ASSUMPTIONS

The following is a summary of the five Gauss-Markov assumptions that we used in this chapter: MLR.1 (linear in parameters), MLR.2 (random sampling), MLR.3 (no perfect collinearity), MLR.4 (zero conditional mean), and MLR.5 (homoskedasticity). Remember, the first four were used to establish unbiasedness of OLS, whereas the fifth was added to derive the usual variance formulas and to conclude that OLS is best linear unbiased.
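As a sketch of what the first four assumptions buy (simulated data with made-up population parameters, not an example from the text): drawing many random samples and averaging the OLS estimates recovers the true coefficients, which is unbiasedness in action.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 100, 2_000
beta = np.array([1.0, 2.0, -0.5])          # made-up population parameters

estimates = np.empty((reps, 3))
for r in range(reps):
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    u = rng.normal(size=n)                 # E(u|x1, x2) = 0 holds by construction
    X = np.column_stack([np.ones(n), x1, x2])
    y = X @ beta + u
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]

print(estimates.mean(axis=0))              # close to (1.0, 2.0, -0.5)
```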
Key Terms

Best Linear Unbiased Estimator (BLUE)
Biased Toward Zero
Ceteris Paribus
Explained Sum of Squares (SSE)
First Order Conditions
Gauss-Markov Assumptions
OLS Intercept Estimate
OLS Regression Line
OLS Slope Estimate
Omitted Variable Bias