Multiple Regression Analysis: Estimation
Regression Analysis with Cross-Sectional Data
Meeting 3
The Model with Two Independent Variables

Some Examples

Some simple examples show how multiple regression analysis can solve problems that cannot be solved by simple regression. The first example is a simple variation of the wage equation introduced in Chapter 2 for obtaining the effect of education on hourly wage:

wage = β0 + β1educ + β2exper + u, [3.1]

where exper is years of labor market experience. Thus, wage is determined by the two independent variables, education and experience, and by other unobserved factors contained in u. We are still primarily interested in the effect of educ on wage, holding fixed all other factors affecting wage; that is, we are interested in the parameter β1. Equation (3.1) effectively takes exper out of the error term and puts it explicitly in the equation. Because exper appears in the equation, its coefficient, β2, measures the ceteris paribus effect of exper on wage, which is also of some interest.

A second example relates average student test scores to per student spending and average family income:

avgscore = β0 + β1expend + β2avginc + u. [3.2]

The coefficient of interest for policy purposes is β1, the ceteris paribus effect of expend on avgscore. By including avginc explicitly in the model, we are able to control for its effect on avgscore. This is likely to be important because average family income tends to be correlated with per student spending: spending levels are often determined by both property and local income taxes. In simple regression analysis, avginc would be included in the error term, which would likely be correlated with expend, causing the OLS estimator of β1 in the two-variable model to be biased.

In the two previous similar examples, we have shown how observable factors other than the variable of primary interest [educ in equation (3.1) and expend in equation (3.2)] can be included in a regression model. Generally, we can write a model with two independent variables as

y = β0 + β1x1 + β2x2 + u, [3.3]

where β0 is the intercept, β1 measures the change in y with respect to x1, holding other factors fixed, and β2 measures the change in y with respect to x2, holding other factors fixed.

Multiple regression analysis is also useful for generalizing functional relationships. As an example, suppose family consumption (cons) is a quadratic function of family income (inc):

cons = β0 + β1inc + β2inc² + u, [3.4]

where u contains other factors affecting consumption. In this model, consumption depends on only one observed factor, income; so it might seem that it can be handled in a simple regression framework. But the model falls outside simple regression because it contains two functions of income, inc and inc² (and therefore three parameters, β0, β1, and β2). Nevertheless, the consumption function is easily written as a regression model with two independent variables by letting x1 = inc and x2 = inc².

Mechanically, there will be no difference in using the method of ordinary least squares (introduced in Section 3.2) to estimate equations as different as (3.1) and (3.4). Each equation can be written as (3.3), which is all that matters for computation. There is, however, an important difference in how one interprets the parameters. In equation (3.1), β1 is the ceteris paribus effect of educ on wage. The parameter β1 has no such interpretation in (3.4). In other words, it makes no sense to measure the effect of inc on cons while holding inc² fixed, because if inc changes, then so must inc²! Instead, the change in consumption with respect to the change in income (the marginal propensity to consume) is approximated by

Δcons/Δinc ≈ β1 + 2β2inc.

See Appendix A for the calculus needed to derive this equation. In other words, the marginal effect of income on consumption depends on β2 as well as on β1 and the level of income. This example shows that, in any particular application, the definitions of the independent variables are crucial. But for the theoretical development of multiple regression, we can be vague about such details. We will study examples like this more completely in Chapter 6.
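To make the marginal-propensity-to-consume calculation concrete, here is a minimal sketch in Python using only numpy. The data are simulated and the true parameter values are invented for illustration; the sketch fits the quadratic model (3.4) by OLS and evaluates β̂1 + 2β̂2·inc at a chosen income level.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
inc = rng.uniform(10, 100, n)                    # simulated income
# simulated consumption from a quadratic model (true betas are made up)
cons = 5 + 0.8 * inc - 0.002 * inc**2 + rng.normal(0, 2, n)

# Write (3.4) as a two-regressor model: x1 = inc, x2 = inc**2
X = np.column_stack([np.ones(n), inc, inc**2])
b0, b1, b2 = np.linalg.lstsq(X, cons, rcond=None)[0]

# Marginal propensity to consume at a given income level: b1 + 2*b2*inc
inc0 = 50.0
print(f"MPC at inc = {inc0}: {b1 + 2 * b2 * inc0:.3f}")  # near 0.8 - 0.004*50 = 0.6
```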
In the model with two independent variables, the key assumption about how u is related to x1 and x2 is

E(u|x1, x2) = 0. [3.5]

The interpretation of condition (3.5) is similar to the interpretation of Assumption SLR.4 for simple regression analysis. It means that, for any values of x1 and x2 in the population, the average of the unobserved factors is equal to zero. As with simple regression, the important part of the assumption is that the expected value of u is the same for all combinations of x1 and x2; that this common value is zero is no assumption at all as long as the intercept β0 is included in the model (see Section 2.1).
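Assumption (3.5) is what makes the "control for x2" logic work. A small simulation (a sketch with made-up numbers, not taken from the text) shows the bias discussed for the avgscore example: when x2 is correlated with x1 but left in the error term, simple regression misestimates β1, while the multiple regression recovers it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
x2 = rng.normal(0, 1, n)                 # think avginc
x1 = 0.7 * x2 + rng.normal(0, 1, n)      # think expend, correlated with x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(0, 1, n)   # true beta1 = 2

# Simple regression of y on x1 alone leaves x2 in the error term
b_simple = np.linalg.lstsq(np.column_stack([np.ones(n), x1]), y, rcond=None)[0]
# Multiple regression controls for x2
b_mult = np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y, rcond=None)[0]

print(b_simple[1])   # about 3.4: biased, since x2 was omitted
print(b_mult[1])     # about 2.0: unbiased under E(u|x1, x2) = 0
```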
As another example, consider a model for CEO salaries,

log(salary) = β0 + β1log(sales) + β2ceoten + β3ceoten² + u,

where ceoten is tenure as CEO. This fits into the multiple regression model (with k = 3) by defining y = log(salary), x1 = log(sales), x2 = ceoten, and x3 = ceoten². As we know from Chapter 2, the parameter β1 is the elasticity of salary with respect to sales.
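In code, "fitting into the multiple regression framework" just means building the transformed columns before running OLS. A brief sketch (all numbers are hypothetical, not from any CEO data set):

```python
import numpy as np

# Hypothetical observations: salary ($1000s), sales ($ millions), ceoten (years)
salary = np.array([1200.0, 800.0, 2100.0, 950.0, 1500.0, 700.0])
sales = np.array([5500.0, 980.0, 12000.0, 2300.0, 6100.0, 450.0])
ceoten = np.array([6.0, 2.0, 11.0, 4.0, 8.0, 1.0])

# y = log(salary); x1 = log(sales), x2 = ceoten, x3 = ceoten**2 (so k = 3)
y = np.log(salary)
X = np.column_stack([np.ones(len(y)), np.log(sales), ceoten, ceoten**2])
# X is the design matrix; OLS on (y, X) estimates beta0 through beta3
```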
More on Functional Form

In this section, we cover some variations and extensions on functional forms that often arise in applied work. One example is the cubic cost function

cost = β0 + β1quantity + β2quantity² + β3quantity³ + u.

Estimating such a model causes no complications. Interpreting the parameters is more involved (though straightforward using calculus); we do not study these models further.

More on Using Logarithmic Functional Forms
We begin by reviewing how to interpret the parameters in the model

log(price) = β0 + β1log(nox) + β2rooms + u, [6.6]

where these variables are taken from Example 4.5. Recall that throughout the text log(x) is the natural log of x. The coefficient β1 is the elasticity of price with respect to nox (pollution). The coefficient β2 is the change in log(price) when Δrooms = 1; as we have seen many times, when multiplied by 100, this is the approximate percentage change in price. Recall that 100·β2 is sometimes called the semi-elasticity of price with respect to rooms.

A regression model must be based on a relationship grounded in theory or common sense!

When equation (6.6) is estimated using the data in HPRICE2.RAW, we obtain

log(price) = 9.23 - .718 log(nox) + .306 rooms
            (0.19)  (.066)          (.019)       [6.7]

Models with Interaction Terms

Sometimes, it is natural for the partial effect, elasticity, or semi-elasticity of the dependent variable with respect to an explanatory variable to depend on the magnitude of yet another explanatory variable. For example, in the model

price = β0 + β1sqrft + β2bdrms + β3sqrft·bdrms + β4bthrms + u,

the partial effect of bdrms on price (holding all other variables fixed) is

Δprice/Δbdrms = β2 + β3sqrft. [6.17]
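As a quick numerical illustration of (6.17), the sketch below simulates data from an interaction model (the true coefficients are invented, not the HPRICE2.RAW estimates) and evaluates the partial effect of bdrms at several square-footage levels.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000
sqrft = rng.uniform(1_000, 4_000, n)
bdrms = rng.integers(1, 6, n).astype(float)
bthrms = rng.integers(1, 4, n).astype(float)
# True coefficients are made up; note the sqrft*bdrms interaction term
price = (20 + 0.12 * sqrft + 10 * bdrms + 0.005 * sqrft * bdrms
         + 15 * bthrms + rng.normal(0, 25, n))

X = np.column_stack([np.ones(n), sqrft, bdrms, sqrft * bdrms, bthrms])
b = np.linalg.lstsq(X, price, rcond=None)[0]   # b = (b0, b1, b2, b3, b4)

# Partial effect of bdrms depends on sqrft: b2 + b3 * sqrft, as in (6.17)
for s in (1_500, 2_500, 3_500):
    print(f"one more bedroom at sqrft = {s}: {b[2] + b[3] * s:.1f}")
```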
Interpreting the OLS Regression Equation

We begin with the case of two independent variables:

ŷ = β̂0 + β̂1x1 + β̂2x2. [3.14]

The intercept β̂0 in equation (3.14) is the predicted value of y when x1 = 0 and x2 = 0. Sometimes, setting x1 and x2 both equal to zero is an interesting scenario; in other cases, it will not make sense. Nevertheless, the intercept is always needed to obtain a prediction of y from the OLS regression line, as (3.14) makes clear.

More important than the details underlying the computation of the β̂j is the interpretation of the estimated equation. The estimates β̂1 and β̂2 have partial effect, or ceteris paribus, interpretations. From equation (3.14), we have

Δŷ = β̂1Δx1 + β̂2Δx2,

so we can obtain the predicted change in y given the changes in x1 and x2. (Note how the intercept has nothing to do with the changes in y.) In particular, when x2 is held fixed, so that Δx2 = 0, then

Δŷ = β̂1Δx1,

holding x2 fixed. The key point is that, by including x2 in our model, we obtain a coefficient on x1 with a ceteris paribus interpretation. This is why multiple regression analysis is so useful. Similarly,

Δŷ = β̂2Δx2,

holding x1 fixed.
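A two-line computation makes the partial-effect reading of (3.14) concrete. The slope values here are borrowed from equation (3.15) purely for illustration, with Δx1 and Δx2 chosen arbitrarily:

```python
# Fitted line yhat = b0 + b1*x1 + b2*x2; slopes borrowed from (3.15) for illustration
b0, b1, b2 = 1.29, 0.453, 0.0094

dx1, dx2 = 0.5, 10.0               # arbitrary changes in x1 and x2
print(b1 * dx1 + b2 * dx2)         # predicted change in y; the intercept drops out
print(b1 * dx1)                    # ceteris paribus effect of x1 (dx2 = 0)
```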
An Interpretation Example

EXAMPLE 3.1: DETERMINANTS OF COLLEGE GPA

The variables in GPA1.RAW include the college grade point average (colGPA), high school GPA (hsGPA), and achievement test score (ACT) for a sample of 141 students from a large university; both college and high school GPAs are on a four-point scale. We obtain the following OLS regression line:

colGPA = 1.29 + .453 hsGPA + .0094 ACT. [3.15]
Since exper and tenure each increase by one year, we just add the coefficients on exper and tenure and multiply by 100 to turn the effect into a percentage, or about 2.6%.

OLS Fitted Values and Residuals
After obtaining the OLS regression line (3.11), we can obtain a fitted or predicted value for each observation. For observation i, the fitted value is simply

ŷi = β̂0 + β̂1xi1 + β̂2xi2 + … + β̂kxik, [3.20]

which is just the predicted value obtained by plugging the values of the independent variables for observation i into equation (3.11). We should not forget about the intercept in obtaining the fitted values; otherwise, the answer can be very misleading. As an example, if in (3.15), hsGPAi = 3.5 and ACTi = 24, then colGPAi = 1.29 + .453(3.5) + .0094(24) = 3.101 (rounded to three places after the decimal).

Normally, the actual value yi for any observation i will not equal the predicted value, ŷi: OLS minimizes the average squared prediction error, which says nothing about the prediction error for any particular observation. The residual for observation i is defined just as in the simple regression case,

ûi = yi - ŷi. [3.21]
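The sketch below reproduces the 3.101 calculation from (3.15) and then forms residuals as in (3.21). The actual colGPA outcomes are made up, since only the fitted line comes from the text.

```python
import numpy as np

# Coefficients from the estimated line (3.15)
b = np.array([1.29, 0.453, 0.0094])

hsGPA = np.array([3.5, 2.8, 3.9])          # hypothetical students
ACT = np.array([24.0, 20.0, 31.0])
colGPA = np.array([3.0, 2.7, 3.6])         # hypothetical actual outcomes

X = np.column_stack([np.ones(len(ACT)), hsGPA, ACT])
fitted = X @ b                             # yhat_i as in (3.20); intercept included
resid = colGPA - fitted                    # uhat_i = y_i - yhat_i, as in (3.21)

print(round(fitted[0], 3))                 # 3.101, the worked example above
print(resid)                               # uhat_i > 0 means y_i is underpredicted
```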
There is a residual for each observation. If ûi > 0, then ŷi is below yi, which means that, for this observation, yi is underpredicted. If ûi < 0, then yi < ŷi, and yi is overpredicted.

The OLS fitted values and residuals have some important properties that are immediate extensions from the single variable case:

1. The sample average of the residuals is zero, and so the sample average of the fitted values ŷi equals ȳ.
2. The sample covariance between each independent variable and the OLS residuals is zero. Consequently, the sample covariance between the OLS fitted values and the OLS residuals is zero.
3. The point (x̄1, x̄2, …, x̄k, ȳ) is always on the OLS regression line: ȳ = β̂0 + β̂1x̄1 + β̂2x̄2 + … + β̂kx̄k.
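These three properties are algebraic facts about OLS, so they can be checked to machine precision on any simulated data set; a minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + 2 * x1 - x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ b
uhat = y - yhat

print(np.isclose(uhat.mean(), 0))                           # property 1
print(np.isclose(np.cov(x1, uhat)[0, 1], 0))                # property 2 (each regressor)
print(np.isclose(np.cov(yhat, uhat)[0, 1], 0))              # ... and the fitted values
print(np.isclose(y.mean(), b @ [1, x1.mean(), x2.mean()]))  # property 3: point of means
```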
Goodness-of-Fit

The fact that R² never decreases when any variable is added to a regression makes it a poor tool for deciding whether one variable or several variables should be added to a model. The factor that should determine whether an explanatory variable belongs in a model is whether the explanatory variable has a nonzero partial effect on y in the population. We will show how to test this hypothesis in Chapter 4 when we cover statistical inference. We will also see that, when used properly, R² allows us to test a group of variables to see if it is important for explaining y. For now, we use it as a goodness-of-fit measure for a given model.

Example 3.5 deserves a final word of caution.
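The non-decreasing property of R² is easy to see numerically: add a pure-noise column to a regression and R² cannot fall. A sketch on simulated data:

```python
import numpy as np

def r_squared(y, X):
    """R^2 = 1 - SSR/SST for an OLS fit of y on the columns of X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    ssr = np.sum((y - X @ b) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - ssr / sst

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
y = 1 + 2 * x1 + rng.normal(size=n)
junk = rng.normal(size=n)                                     # noise, unrelated to y

print(r_squared(y, np.column_stack([np.ones(n), x1])))        # e.g. about 0.80
print(r_squared(y, np.column_stack([np.ones(n), x1, junk])))  # never smaller
```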
THE GAUSS-MARKOV ASSUMPTIONS

The following is a summary of the five Gauss-Markov assumptions that we used in this chapter: MLR.1 (linear in parameters), MLR.2 (random sampling), MLR.3 (no perfect collinearity), MLR.4 (zero conditional mean), and MLR.5 (homoskedasticity). Remember, the first four were used to establish unbiasedness of OLS, whereas the fifth was added to derive the usual variance formulas and to conclude that OLS is best linear unbiased.
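As a sketch of what the first four assumptions buy (simulated data with made-up population parameters, not an example from the text): drawing many random samples and averaging the OLS estimates recovers the true coefficients, which is unbiasedness in action.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 100, 2_000
beta = np.array([1.0, 2.0, -0.5])          # made-up population parameters

estimates = np.empty((reps, 3))
for r in range(reps):
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    u = rng.normal(size=n)                 # E(u|x1, x2) = 0 holds by construction
    X = np.column_stack([np.ones(n), x1, x2])
    y = X @ beta + u
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]

print(estimates.mean(axis=0))              # close to (1.0, 2.0, -0.5)
```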
Key Terms

Best Linear Unbiased Estimator (BLUE)
Biased Toward Zero
Ceteris Paribus
Explained Sum of Squares (SSE)
First Order Conditions
Gauss-Markov Assumptions
OLS Intercept Estimate
OLS Regression Line
OLS Slope Estimate
Omitted Variable Bias