Chapter 3: Multiple Regression Analysis: Estimation
In Chapter 2 we learned how to use simple regression analysis to explain a DV y as a function of a single IV. The main issue is that it is often unrealistic to draw a ceteris paribus conclusion, especially with the zero conditional mean (ZCM) assumption.
Naturally, adding more factors that are useful for explaining y would explain more of its variation. Section 3.1 formally discusses the multiple regression model and its advantages over the simple regression model.
Thus wage is determined by educ, exper, and the other unobserved factors contained in u. We are still interested in the effect of educ on wage. By including exper in the equation, we explicitly take exper out of u and put it in the model. We can also see the ceteris paribus effect of exper on wage, which may be of some interest.
We will also have to make assumptions about how u is related to both IVs. However, we can be confident that exper is held fixed, unlike in simple regression, where we would have to assume that exper is uncorrelated with educ.
MRA is also useful for generalising functional relationships between variables. For example, suppose family consumption is a quadratic function of family income:
$cons = \beta_0 + \beta_1 inc + \beta_2 inc^2 + u$   (3.4)
There's one important difference in how one interprets the parameters. In (3.1), β₁ is the ceteris paribus effect of educ on wage. We can't interpret β₁ that way in (3.4): as inc changes, so does inc².
In the model with two IVs, the key assumption about how x₁ and x₂ are related to u is
$E(u \mid x_1, x_2) = 0$   (3.5)
This assumption is similar to the ZCM assumption for simple regression analysis. When applied to a quadratic function such as (3.4), writing $E(u \mid inc, inc^2) = 0$ and $E(u \mid inc) = 0$ are the same, since inc² is known once inc is known.
The terminology is also the same as in simple regression: u is the error term. No matter how many IVs we include, there will always be factors we can't include.
We must also know how to interpret the parameters, which we will practice a lot in subsequent chapters. Here is a reminder of what we already know: in a model such as $\log(salary) = \beta_0 + \beta_1 \log(sales) + \beta_2\, ceoten + \beta_3\, ceoten^2 + u$, when β₃ = 0, 100·β₂ is approximately the percentage increase in salary when ceoten increases by one year. When β₃ ≠ 0, the effect is more complicated.
We will return to the detailed treatment of general models with quadratics in Ch. 6.
(3.7) provides an important reminder: the model is linear in the parameters, but it allows a nonlinear relationship between the IVs and the DV. Many applications of the MRM involve such relationships.
Section 3.3 will discuss how (3.8) implies that OLS is unbiased, and will derive the bias when a key variable has been omitted. Ch. 15 and 16 will study what can be done when (3.8) fails.
The OLS regression line with two IVs is written as
$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$   (3.9)
The estimates are chosen to make the sum of squared residuals
$\sum_{i=1}^n (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2})^2$   (3.10)
as small as possible.
To understand what OLS is doing, we must master the subscripts first. The x's have two subscripts: i, followed by 1 or 2. i indexes the observation, while 1 and 2 distinguish the different IVs.
The OLS estimates, k + 1 of them, are chosen to minimise the sum of squared residuals:
$\sum_{i=1}^n (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \cdots - \hat\beta_k x_{ik})^2$   (3.12)
The first-order conditions are
$\sum_{i=1}^n (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \cdots - \hat\beta_k x_{ik}) = 0$
together with k more equations of the same form, each weighted by one of the $x_{ij}$ (3.13).
Even with moderate n and k, the calculation would be tedious by hand, so we just use computers.
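A minimal sketch of what the computer does, assuming numpy is available (simulated data for illustration): the first-order conditions (3.13) are equivalent to the normal equations X'X β̂ = X'y.

```python
import numpy as np

# Illustrative data: n = 100 observations, k = 2 regressors.
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones(n), x1, x2])

# Solving (3.13) amounts to solving the normal equations X'X b = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # [b0_hat, b1_hat, b2_hat]
```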
As with simple regression, (3.11) is called the OLS regression line, or the sample regression function (SRF). We call β̂₀ the OLS intercept estimate and β̂₁, ..., β̂ₖ the OLS slope estimates.
$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$   (3.14)
The intercept β̂₀ in (3.14) is the predicted value of y when x₁ = 0 and x₂ = 0. Setting both to zero sometimes yields interesting results; sometimes it doesn't make sense.
The estimates β̂₁ and β̂₂ have partial effect, or ceteris paribus, interpretations. From (3.14) we have
$\Delta\hat{y} = \hat\beta_1 \Delta x_1 + \hat\beta_2 \Delta x_2$
so holding x₂ fixed (Δx₂ = 0),
$\Delta\hat{y} = \hat\beta_1 \Delta x_1$
$\widehat{colGPA} = 1.29 + .453\, hsGPA + .0094\, ACT$   (3.15)
n = 141
The interpretation is as follows. The intercept 1.29 is the predicted colGPA when hsGPA and ACT are zero. Since no student with a 0 ACT score and a 0 hsGPA goes to university, this value is meaningless in itself.
There is a positive partial relationship between colGPA and hsGPA: holding ACT fixed, another point on hsGPA is associated with .453 of a point on colGPA.
The coefficient on ACT implies that, holding hsGPA fixed, an increase of 10 points raises predicted colGPA by less than 0.1. This is a small effect and suggests that the ACT score is not a strong predictor of colGPA. We will discuss statistical significance later; we will find this coefficient is also statistically insignificant.
The OLS regression line for more than two IVs is similar. In the wage equation, the coefficient 0.092 on educ means that, holding the others fixed, another year of education is predicted to increase log(wage) by 0.092 (about 9.2%). This holds the two other important productivity factors, exper and tenure, fixed. Whether this is a good estimate requires us to study the statistical properties of OLS (Section 3.3).
In Example 3.1, the coefficient on ACT measures the predicted change in colGPA, holding hsGPA fixed. The power of MRA is that it provides a ceteris paribus interpretation even though the data may not have been collected in a ceteris paribus fashion. In giving the ACT coefficient a partial effect interpretation, it may seem like we went out and sampled people with the same hsGPA but different ACT scores, even though that was not how the sampling was done.
Rarely could we hold certain variables fixed in obtaining our sample. If we could obtain samples of individuals with the same hsGPA, we would just run a simple regression of colGPA on ACT. MRA effectively allows us to mimic this without restricting the values of any IVs.
3.2d Changing More Than One Independent Variable Simultaneously
We can easily do this using (3.17). For example, using (3.19), we can obtain the predicted change in wage when an individual stays at the firm for another year, so that exper and tenure both increase by one year; a worked version follows below.
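For instance, if the coefficients on exper and tenure in the fitted equation (3.19) are .0041 and .022 (the textbook's numbers; treat them as illustrative here), then for Δexper = Δtenure = 1:

$\Delta\widehat{\log(wage)} = .0041\,\Delta exper + .022\,\Delta tenure = .0041 + .022 = .0261$

i.e. staying one more year predicts roughly a 2.6% higher wage, holding educ fixed.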
Normally yᵢ ≠ ŷᵢ: OLS minimises the average squared prediction error, which says nothing about the prediction error for any particular observation. The residual is
$\hat{u}_i = y_i - \hat{y}_i$   (3.21)
for each observation. If ûᵢ > 0, then ŷᵢ is below yᵢ: yᵢ is underpredicted. If ûᵢ < 0, then yᵢ is overpredicted.
The OLS fitted values and residuals have some important properties that are immediate extensions of the single-variable case (checked numerically in the sketch below):
1. The sample average of the residuals is zero, so ȳ equals the average of the ŷᵢ.
2. The sample covariance between each IV and the OLS residuals is zero.
3. The point (x̄₁, x̄₂, ..., x̄ₖ, ȳ) is always on the OLS regression line.
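A quick numerical check of the three properties (a sketch with simulated data; any OLS fit would do):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.7 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ b

print(u_hat.mean())                                       # property 1: ~0
print(np.cov(x1, u_hat)[0, 1], np.cov(x2, u_hat)[0, 1])   # property 2: ~0
# property 3: plugging the means into the line reproduces y-bar
print(b[0] + b[1] * x1.mean() + b[2] * x2.mean(), y.mean())
```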
3.2e A Partialling Out Interpretation of Multiple Regression
When applying OLS we don't need to know explicit formulas for the β̂ⱼ that solve the system in (3.13). Nevertheless, for certain derivations, we need explicit formulas. In the two-IV case,
$\hat\beta_1 = \left(\sum_{i=1}^n \hat{r}_{i1}\, y_i\right) \Big/ \left(\sum_{i=1}^n \hat{r}_{i1}^2\right)$   (3.22)
where the r̂ᵢ₁ are the OLS residuals from a simple regression of x₁ on x₂, using the sample at hand.
The representation in (3.22) gives another demonstration of β̂₁'s partial effect interpretation. The residuals r̂ᵢ₁ are the part of xᵢ₁ that is uncorrelated with xᵢ₂. In other words, r̂ᵢ₁ is xᵢ₁ after the effects of xᵢ₂ have been partialled out, or netted out.
We don't have this in SLR, as there are no other variables to partial out.
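A sketch of the partialling-out recipe behind (3.22) on simulated data: applying (3.22) to the residuals from x₁-on-x₂ reproduces the multiple regression slope.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)        # x1 correlated with x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

ones = np.ones(n)
# Multiple regression of y on x1 and x2.
b_multi = ols(np.column_stack([ones, x1, x2]), y)

# Step 1: regress x1 on x2, keep the residuals r1.
Z = np.column_stack([ones, x2])
r1 = x1 - Z @ ols(Z, x1)

# Step 2: beta1_hat = sum(r1 * y) / sum(r1**2), as in (3.22).
b1_partial = (r1 @ y) / (r1 @ r1)
print(b_multi[1], b1_partial)  # identical up to rounding
```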
We know that the simple regression coefficient β̃₁ doesn't usually equal the multiple regression coefficient β̂₁. There's a simple relationship between them:
$\tilde\beta_1 = \hat\beta_1 + \hat\beta_2 \tilde\delta_1$   (3.23)
where δ̃₁ is the slope coefficient from the simple regression of xᵢ₂ on xᵢ₁. This equation shows how β̃₁ differs from the partial effect of x₁ on y. The confounding term is the partial effect of x₂ on y times the slope from the simple regression of x₂ on x₁.
The relationship between β̃₁ and β̂₁ also shows there are two distinct cases where they are equal:
1. The partial effect of x₂ on ŷ is zero in the sample: β̂₂ = 0.
2. x₁ and x₂ are uncorrelated in the sample: δ̃₁ = 0.
Even though simple and multiple regression estimates are almost never identical, we can use the above formula to characterise why they might be very different or quite similar. For example, when β̂₂ is small, we might expect the multiple and simple regression estimates of β₁ to be similar.
In Example 3.1, the sample correlation between hsGPA and ACT is about .346, a nontrivial correlation. But the coefficient on ACT is fairly small. So it's not surprising that a simple regression of colGPA on hsGPA produces a slope estimate of .482, not much different from .453.
The averages are 87.36 for prate, .732 for mrate, and 13.2 for age. Thus both mrate and age have the expected (positive) effect. If age had a large effect on prate and were strongly correlated with mrate, we might expect a large change in the estimated effect of mrate when age is dropped from the regression. However, regressing prate on mrate alone gives us
$\widehat{prate} = 83.08 + 5.86\, mrate$
The difference isn't big: the simple regression coefficient is only about 6.2% larger. This may be explained by the fact that the sample correlation between mrate and age is only .12.
In the case with k IVs, the simple regression of y on x₁ and the multiple regression of y on x₁, x₂, ..., xₖ produce an identical estimate of the coefficient on x₁ only when (1) the OLS coefficients on x₂ through xₖ are all zero, or (2) x₁ is uncorrelated in the sample with each of x₂, ..., xₖ.
3.2h Goodness of Fit
This is very similar to SLR. We can define the total sum of squares (SST), the explained sum of squares (SSE), and the residual sum of squares (SSR) as
$SST = \sum_{i=1}^n (y_i - \bar{y})^2$, $SSE = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2$, $SSR = \sum_{i=1}^n \hat{u}_i^2$
with $R^2 = SSE/SST = 1 - SSR/SST$.
An important fact about R² is that it never decreases, and usually increases, when another IV is added, provided the same set of observations is used for both regressions.
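A small sketch of this fact: adding a pure-noise regressor to a fit never lowers R².

```python
import numpy as np

rng = np.random.default_rng(9)
n = 100
x1 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

def r2(X, y):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ b
    return 1 - (u @ u) / np.sum((y - y.mean()) ** 2)

ones = np.ones(n)
noise = rng.normal(size=n)                        # irrelevant regressor
print(r2(np.column_stack([ones, x1]), y))
print(r2(np.column_stack([ones, x1, noise]), y))  # never smaller
```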
An important caveat: this assumes we don't have missing data on the IVs. If two regressions use different sets of observations, even if one uses a subset of the regressors, we can't say how the R²s will compare. If we have full data on y, x₁, x₂ but missing data on x₃, the R² from regressing y on x₁ and x₂ need not be less than the R² from regressing on x₁, x₂, and x₃: it could go either way. This is an important issue we will return to in Ch. 9.
Because R² never decreases when a variable is added, it is a poor tool for deciding whether to add more IVs. That decision should be made by judging whether the IV has a nonzero partial effect on y in the population. We will show how in Ch. 4, when we cover statistical inference.
Now consider regression through the origin:
$\tilde{y} = \tilde\beta_1 x_1 + \tilde\beta_2 x_2 + \cdots + \tilde\beta_k x_k$   (3.30)
where the tildes distinguish these from the usual OLS estimates. When x₁ = 0, x₂ = 0, ..., xₖ = 0, the predicted value is zero.
(3.30) still minimises the sum of squared residuals, although the properties of the estimates differ. In particular, the residuals no longer have a zero sample average. Further, if R² is defined as 1 − SSR/SST, it can actually be negative: this means that ȳ explains more of the variation in the yᵢ than the IVs do. We should either include a β₀ or accept that the IVs poorly explain y.
One serious drawback of regression through the origin is that if β₀ ≠ 0, then the OLS estimators of the slope parameters will be biased, which can be severe in some cases.
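A small sketch of how the through-the-origin R² = 1 − SSR/SST can go negative when the true intercept is far from zero:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
x = rng.normal(size=n)
y = 10.0 + 0.1 * x + rng.normal(size=n)   # large intercept, weak slope

# Regression through the origin: slope = sum(xy) / sum(x^2).
b = (x @ y) / (x @ x)
ssr = np.sum((y - b * x) ** 2)
sst = np.sum((y - y.mean()) ** 2)
print(1 - ssr / sst)                      # typically very negative here
```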
3.3 The Expected Values of the OLS Estimators
We turn to the statistical properties of OLS for estimating the parameters. In this section we derive the expected values of the OLS estimators. We discuss four assumptions and obtain the bias in OLS when an important variable has been omitted from the regression.
MLR.1 formally states the population model, or the true model in (3.31), to allow for the possibility that we might estimate a model that differs from (3.31).
In Section 3.2 we saw that OLS chooses the slope and intercept estimates so that the residuals average to zero and each IV is uncorrelated with the residuals in the sample. We still don't have conditions under which the OLS estimates are well defined for a given sample; MLR.3 (no perfect collinearity) provides them.
MLR.3 does allow the IVs to be correlated; they just can't be perfectly correlated. If we didn't allow any correlation, then MRA would be of very limited use. For example:
we fully expect expend and avginc to be correlated: school districts with high average family incomes tend to spend more per student. In fact, we put avginc in the model to hold it constant, as we suspect it to be correlated with expend. MLR.3 only rules out perfect correlation.
The simplest way MLR.3 fails is when one variable is a constant multiple of another, for example when the same variable measured in different units, such as income and income in thousands, appears twice in the same model.
The model cons = β₀ + β₁inc + β₂inc² + u doesn't violate MLR.3, as inc² is not an exact linear function of inc. Including inc² is a useful way to generalise functional form.
A subtler way MLR.3 fails is with the model log(cons) = β₀ + β₁log(inc) + β₂log(inc²) + u. Since log(inc²) = 2 log(inc), we have x₂ = 2x₁, which violates MLR.3. We should have used [log(inc)]² instead.
The solution is to just drop one of the variables, preferably totexpend, as then we can measure the effect of increasing expendA on voteA, holding expendB fixed.
These examples show that MLR.3 can fail if we are not careful in specifying the model. MLR.3 also fails if n < k + 1. If the model is well specified and n ≥ k + 1, MLR.3 can fail only in unlucky samples, e.g. where one IV's values happen to be an exact multiple of another's; this may only happen when n is small.
Omitting important variables can also cause MLR.4 to fail. This is less of a problem in MLR than in SLR, since we're able to include many variables in the model.
We are now ready to show unbiasedness under the first four MLR assumptions. Consider
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$   (3.40)
and assume MLR.1-4.
Suppose that our primary interest is in β₁, the partial effect of x₁ on y, where y is wage, x₁ is educ, and x₂ is a measure of innate ability. To get an unbiased estimator of β₁ we should run a regression of wage on educ and abil. Since abil is unobserved, we instead estimate
$wage = \beta_0 + \beta_1\, educ + v$
where v = β₂abil + u. Obtaining β̃₁ is not difficult: it's just the usual OLS formula. The only difference is that we must analyse its properties when the SLR is misspecified due to an omitted variable. We have done almost all the work on this: from (3.23) we have
$\tilde\beta_1 = \hat\beta_1 + \hat\beta_2 \tilde\delta_1$
where β̂₁ and β̂₂ are the slope estimators from the multiple regression of
yᵢ on xᵢ₁, xᵢ₂, i = 1, 2, ..., n   (3.43)
and δ̃₁ is the slope from the simple regression of
xᵢ₂ on xᵢ₁, i = 1, 2, ..., n   (3.44)
$E(\tilde\beta_1) = E(\hat\beta_1 + \hat\beta_2\tilde\delta_1) = E(\hat\beta_1) + E(\hat\beta_2)\tilde\delta_1 = \beta_1 + \beta_2\tilde\delta_1$   (3.45)
which implies that the bias in β̃₁ is
$\mathrm{Bias}(\tilde\beta_1) = E(\tilde\beta_1) - \beta_1 = \beta_2\tilde\delta_1$   (3.46)
Because the bias arises from omitting the IV x₂, the term on the right-hand side of (3.46) is often called the omitted variable bias.
There are two cases where β̃₁ is unbiased. The first is when β₂ = 0. The second is when δ̃₁ = 0, even if β₂ ≠ 0: since δ̃₁ depends only on the sample covariance of x₁ and x₂, if x₁ and x₂ are uncorrelated in the sample, β̃₁ is unbiased.
When x₁ and x₂ are correlated, δ̃₁ has the same sign as their correlation: δ̃₁ > 0 iff x₁ and x₂ are positively correlated, and vice versa. Table 3.2 warrants careful study, as it shows the sign of the bias in β̃₁ for each combination of the signs of β₂ and δ̃₁. The size of the bias also matters: a small bias of either sign isn't a cause for concern.
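A simulation sketch of (3.46): with β₂ > 0 and δ₁ > 0 (x₁, x₂ positively correlated), the short-regression slope is centred well above the true β₁ = 1, i.e. an upward bias.

```python
import numpy as np

rng = np.random.default_rng(3)

def short_slope():
    n = 200
    x2 = rng.normal(size=n)
    x1 = 0.5 * x2 + rng.normal(size=n)                   # delta1 > 0
    y = 1.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)   # beta2 > 0
    X = np.column_stack([np.ones(n), x1])                # x2 omitted
    return np.linalg.solve(X.T @ X, X.T @ y)[1]

draws = [short_slope() for _ in range(2000)]
print(np.mean(draws))  # well above the true beta1 = 1: upward bias
```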
Because β₂ is unknown, we don't know whether the bias is positive or negative. We can make an educated guess about the correlation between x₁ and x₂: for example, there are good reasons to believe that educ and abil are positively correlated.
If E(β̃₁) > β₁, the average of the estimates across all random samples would be too large; we say that β̃₁ has an upward bias. When E(β̃₁) < β₁, it has a downward bias. These terms apply whether β₁ is positive or negative. The phrase "biased towards zero" refers to cases where E(β̃₁) is closer to zero than β₁. If β₁ is positive, then β̃₁ is biased towards zero iff it has a downward bias, and vice versa.
Omitting a variable that is correlated with even a single included IV generally results in all the OLS estimators being biased. Suppose the population model is
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$
but x₃ is omitted, so we estimate
$\tilde{y} = \tilde\beta_0 + \tilde\beta_1 x_1 + \tilde\beta_2 x_2$   (3.50)
Suppose that x₂ and x₃ are uncorrelated but x₁ and x₃ are correlated: x₁ is correlated with the omitted variable but x₂ is not. In this case, not just β̃₁ but both β̃₁ and β̃₂ will generally be biased. The only exception is when x₁ and x₂ are also uncorrelated.
It's also difficult to obtain the direction of the bias in β̃₁ and β̃₂, as x₁, x₂, and x₃ can all be pairwise correlated. We can still approximate: if we assume x₁ and x₂ are uncorrelated, we can study the bias in β̃₁ as if x₂ were absent from both the population and the estimated model. If x₁ and x₂ are uncorrelated, it can be shown that
$E(\tilde\beta_1) = \beta_1 + \beta_3 \frac{\sum_{i=1}^n (x_{i1} - \bar{x}_1)\, x_{i3}}{\sum_{i=1}^n (x_{i1} - \bar{x}_1)^2}$
As an example, suppose abil is omitted from the wage equation and abil is uncorrelated with exper. If we're interested in educ and would like to know the direction of the bias of β̃₁, this is impossible without more information. As an approximation, if educ and exper are also uncorrelated, β₃ > 0, and educ and abil are positively correlated, then β̃₁ has a positive (upward) bias.
This example serves as a rough guide for more complicated models. Usually the focus is on a particular relationship, such as between x₁ and the omitted variable. Strictly speaking, ignoring all the other IVs is a valid practice only if each one is uncorrelated with x₁.
MLR.5 (homoskedasticity) means that the variance of u, conditional on the IVs, is the same for all combinations of IV values. In the discussion that follows, x denotes the full set of IVs; e.g. in the wage equation, x = (educ, exper, tenure). We can write MLR.1 and MLR.4 as
$E(y \mid x) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
and MLR.5 as
$Var(y \mid x) = Var(u \mid x) = \sigma^2$
This illustrates the main difference between MLR.5 and MLR.4: the conditional variance doesn't depend on the values of the IVs, while the conditional mean does.
For the proof, see the chapter appendix.
Theorem 3.2 (Sampling Variances of the OLS Estimators). Under MLR.1-5, conditional on the sample values of the IVs,
$Var(\hat\beta_j) = \frac{\sigma^2}{SST_j\,(1 - R_j^2)}$   (3.51)
where SSTⱼ is the total sample variation in xⱼ and Rⱼ² is the R-squared from regressing xⱼ on all the other IVs (with an intercept).
Before studying (3.51) in detail, note that all five Gauss-Markov assumptions are used to obtain it. The size of Var(β̂ⱼ) matters: a larger variance means a less precise estimator, which translates into wider confidence intervals and less accurate hypothesis tests.
The error variance σ². A larger σ² means larger sampling variances: more noise in the equation makes it harder to estimate the partial effect of any IV on y. Because σ² is a feature of the population, it is the one component of (3.51) that is unknown. We will see later how to obtain an unbiased estimator of σ².
The total sample variation in xⱼ, SSTⱼ. The larger SSTⱼ, the smaller Var(β̂ⱼ). Thus, ceteris paribus, we want as much sample variation in xⱼ as possible. Although we can rarely choose the sample values of the IVs, we can increase n: this is the component of the variance that depends on the sample size. A small SSTⱼ doesn't violate MLR.3 unless it's exactly zero, in which case we can't even compute the OLS estimates. As SSTⱼ → 0, Var(β̂ⱼ) → ∞.
The linear relationships among the IVs, Rⱼ². This is the most difficult component to understand. It doesn't appear in SLR, as there's only one IV. It's also different from the R² obtained from regressing y on x₁, ..., xₖ: Rⱼ² is obtained by regressing xⱼ on the other IVs in the original model, where xⱼ plays the role of a DV.
In the k = 2 case,
$Var(\hat\beta_1) = \frac{\sigma^2}{SST_1\,(1 - R_1^2)}$
where R₁² is the R-squared from the simple regression of x₁ on x₂. A value of R₁² close to one indicates that x₂ explains much of the variation in x₁: x₁ and x₂ are highly correlated.
A more realistic case is Rⱼ² close to, but not equal to, 1: high (but not perfect) correlation between two or more IVs is called multicollinearity. A small SSTⱼ can also cause a large Var(β̂ⱼ); note that dropping observations is never a remedy for multicollinearity, since SSTⱼ falls as the sample shrinks.
One thing is clear: ceteris paribus, for estimating βⱼ it's better to have a small correlation between xⱼ and the other IVs. This is tricky to achieve in social science, as we usually only observe the data we're given. We can try dropping an IV, but this may lead to omitted variable bias.
Here's an example illustrating multicollinearity. Suppose we're interested in estimating the effects of various categories of school expenditure on student performance. It's likely that expenditures on teachers, materials, athletics, etc. are highly correlated: wealthier schools tend to spend more on everything. It's difficult to estimate the effect of any particular expenditure category when there's little variation in one category that can't be largely explained by variations in the others; this leads to a high Rⱼ² for each IV. Such ceteris paribus questions may be too subtle for the data to answer. It might be better to change the scope of the analysis and lump all expenditures together.
Consider
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$
where x₂ and x₃ are highly correlated. Then Var(β̂₂) and Var(β̂₃) may be large, but this tells us nothing about Var(β̂₁). In fact, if x₁ is uncorrelated with x₂ and x₃, then R₁² = 0 regardless of the correlation between x₂ and x₃. If β₁ is the parameter of interest, we don't need to care about the correlation between x₂ and x₃.
The previous example is important: economists often include many control variables in order to isolate the causal effect of a particular variable. High correlation among the controls doesn't make it harder to determine the effect of the variable of interest.
Statistics meant to "detect" multicollinearity are easy to misuse, as we can't specify how much correlation among IVs is "too much". They might "reveal" a problem even when it concerns only two control variables whose coefficients we don't care about.
Somewhat more useful, but still often misused, are statistics for individual coefficients. The most common is the variance inflation factor (VIF), obtained directly from (3.51). The VIF for slope coefficient j is
$VIF_j = \frac{1}{1 - R_j^2}$
precisely the term in Var(β̂ⱼ) determined by the correlation between xⱼ and the other IVs. We can write Var(β̂ⱼ) as
$Var(\hat\beta_j) = \frac{\sigma^2}{SST_j} \cdot VIF_j$
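A sketch of computing VIFs by brute force, regressing each IV on the others; the helper name vif below is ours, not a library function.

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2) for each column of X (no intercept column),
    where R_j^2 comes from regressing x_j on the other columns plus a constant."""
    n, k = X.shape
    out = []
    for j in range(k):
        xj = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        g = np.linalg.lstsq(others, xj, rcond=None)[0]
        resid = xj - others @ g
        r2 = 1 - resid @ resid / np.sum((xj - xj.mean()) ** 2)
        out.append(1 / (1 - r2))
    return out

rng = np.random.default_rng(5)
x2 = rng.normal(size=300)
x3 = x2 + 0.1 * rng.normal(size=300)       # nearly collinear with x2
x1 = rng.normal(size=300)                  # uncorrelated with the rest
print(vif(np.column_stack([x1, x2, x3])))  # VIF_1 ~ 1; VIF_2, VIF_3 large
```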
If we had the choice, we would want to minimise VIFⱼ, but we rarely have the choice. If we think certain IVs need to be included, then we are reluctant to drop them, and a high VIF can't really affect that decision. If we're interested in the effect of x₁ on y, we should ignore the VIFs of the other coefficients.
Consider again the true model y = β₀ + β₁x₁ + β₂x₂ + u, estimated once by the full regression (giving β̂₁) and once omitting x₂ (giving β̃₁). In terms of bias alone, β̃₁ is fine whenever β₂ = 0, but this doesn't hold once we bring variance into the analysis. Conditioning on the sample values of x₁ and x₂,
$Var(\tilde\beta_1) = \frac{\sigma^2}{SST_1}$   (3.55)
where SST₁ is the total variation in x₁, while Var(β̂₁) = σ²/[SST₁(1 − R₁²)] with R₁² the R-squared from regressing x₁ on x₂. We can see that Var(β̃₁) is always smaller than Var(β̂₁), unless x₁ and x₂ are uncorrelated in the sample, in which case the two estimators β̃₁ and β̂₁ are the same. Assuming x₁ and x₂ are correlated, we have:
1. When β₂ ≠ 0, β̃₁ is biased, β̂₁ is unbiased, and Var(β̃₁) < Var(β̂₁).
2. When β₂ = 0, β̃₁ and β̂₁ are both unbiased, and Var(β̃₁) < Var(β̂₁).
From the second conclusion it's clear that β̃₁ is preferred when β₂ = 0. Including x₂ then only exacerbates the multicollinearity problem: since x₂ has no partial effect on y, including it yields a less efficient estimator of β₁. A higher variance for the estimator of β₁ is the cost of including an irrelevant IV.
The case β₂ ≠ 0 is more difficult, as leaving out x₂ biases the estimator of β₁.
When β₂ ≠ 0, there are two favourable reasons for including x₂ in the model. The most important is that the bias in β̃₁ doesn't shrink as the sample size grows; it doesn't follow any pattern, so the bias will be roughly the same for any sample size. On the other hand, both Var(β̃₁) and Var(β̂₁) go to 0 as n gets large, so the multicollinearity induced by adding x₂ becomes less important. In a large sample, we would prefer β̂₁.
The other reason to prefer β̂₁ is more subtle. Essentially, (3.55) is too generous in measuring the precision of β̃₁. Leaving out x₂ when β₂ ≠ 0 increases the error variance, and (3.55) ignores this increase because it treats both regressors as nonrandom.
We can write the errors as
$u_i = y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \cdots - \beta_k x_{ik}$
which means we don't observe the errors, as we don't know the βⱼ. When we replace each βⱼ with its OLS estimator, we get the residuals ûᵢ.
It seems natural to estimate σ² by replacing uᵢ with ûᵢ. In the SLR case we saw that this leads to bias. The unbiased estimator of σ² in the general multiple regression case is
$\hat\sigma^2 = \frac{\sum_{i=1}^n \hat{u}_i^2}{n - k - 1} = \frac{SSR}{n - k - 1}$   (3.56)
The term n − k − 1 is the degrees of freedom (df) for the general OLS problem:
df = n − (k + 1) = (number of observations) − (number of estimated parameters).
The positive square root of σ̂², denoted σ̂, is called the standard error of the regression (SER). It's an estimator of the standard deviation of the error term. It's usually reported by regression packages, although different packages call it different things (e.g. root MSE in Stata).
For constructing confidence intervals and conducting tests in Ch. 4, we will need to estimate the standard deviation of β̂ⱼ, which is the square root of its variance:
$sd(\hat\beta_j) = \frac{\sigma}{\sqrt{SST_j\,(1 - R_j^2)}}$
$se(\hat\beta_j) = \frac{\hat\sigma}{\sqrt{SST_j\,(1 - R_j^2)}}$   (3.58)
As se(β̂ⱼ) depends on σ̂, it has a sampling distribution, which will play a role in Ch. 4.
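A sketch computing (3.56) and (3.58) directly, so the pieces σ̂², SSTⱼ, and Rⱼ² are all visible:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 400
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.8 * x1 - 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
k = X.shape[1] - 1
b = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ b

sigma2_hat = u_hat @ u_hat / (n - k - 1)   # (3.56)
# se(b_j) via (3.58): SST_j and R_j^2 come from regressing x_j on the rest.
for j, name in [(1, "x1"), (2, "x2")]:
    xj = X[:, j]
    others = np.delete(X, j, axis=1)       # keeps the constant column
    g = np.linalg.lstsq(others, xj, rcond=None)[0]
    sst = np.sum((xj - xj.mean()) ** 2)
    r2 = 1 - np.sum((xj - others @ g) ** 2) / sst
    se = np.sqrt(sigma2_hat / (sst * (1 - r2)))
    print(name, b[j], se)
```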
One thing should be emphasised about these standard errors. Because (3.58) is obtained from (3.51), and (3.51) relies on homoskedasticity, (3.58) is not a valid estimator of sd(β̂ⱼ) if the errors exhibit heteroskedasticity. Thus, while the presence of heteroskedasticity doesn't cause bias in the β̂ⱼ, it does lead to bias in the usual formula for Var(β̂ⱼ), which then invalidates the standard errors. Chapter 8 will discuss methods for dealing with heteroskedasticity.
We can also write
$se(\hat\beta_j) = \frac{\hat\sigma}{\sqrt{n}\; sd(x_j)\, \sqrt{1 - R_j^2}}$   (3.59)
in which
$sd(x_j) = \sqrt{n^{-1} \sum_{i=1}^n (x_{ij} - \bar{x}_j)^2}$
is the sample standard deviation where the total variation is divided by n rather than n − 1. The importance of (3.59) is how n directly affects the standard error: the other three terms (σ̂, sd(xⱼ), and Rⱼ²) settle down to constants as n grows, so the precision of β̂ⱼ increases as we get more data. In contrast, unbiasedness holds for any value of n. We will discuss this further in Ch. 5.
We can show that OLS has the smallest variance within a certain class of estimators: it's the best linear unbiased estimator (BLUE). To state the theorem, we must discuss each component of the acronym. First, an estimator is a rule that can be applied to any sample of data to produce an estimate.
An estimator is linear if it can be written as
$\tilde\beta_j = \sum_{i=1}^n w_{ij}\, y_i$   (3.60)
where each wᵢⱼ can be a function of the sample values of all the IVs. OLS is linear, as seen in (3.22).
Finally, "best" in the current theorem means having the smallest variance, which is always preferable when comparing two unbiased estimators.
The Gauss-Markov theorem states that, for any estimator β̃ⱼ that is linear and unbiased, Var(β̂ⱼ) ≤ Var(β̃ⱼ): in the class of linear unbiased estimators, OLS has the smallest variance. It says even more: if we want to estimate any linear function of the βⱼ, the corresponding linear combination of the OLS estimators achieves the smallest variance. We conclude with the theorem:
Theorem 3.4 (Gauss-Markov Theorem). Under Assumptions MLR.1 through MLR.5, the OLS estimators β̂₀, β̂₁, ..., β̂ₖ are the best linear unbiased estimators (BLUEs) of β₀, β₁, ..., βₖ, respectively. The theorem justifies the use of OLS to estimate the MRM; if any of the assumptions fails, then so does the theorem.
If ZCM (MLR.4) fails, OLS will be biased. The presence of heteroskedasticity (failure of MLR.5) doesn't bias OLS, but OLS will no longer have the smallest variance. In Ch. 8 we will see what can be done in that case.
The first thing to note is that OLS is an estimation method, not a model. A model describes an underlying population and depends on unknown parameters. The linear model we've been studying can be written in the population as
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u$   (3.61)
where the parameters are the βⱼ. Importantly, we can talk about the meaning of the βⱼ without any data. We may not be able to learn much without data, but the interpretation of the βⱼ comes from (3.61). Once a sample has been acquired, we can estimate the parameters. There are various estimation methods; OLS is just one of them. Under MLR.1-5, OLS is preferred, and different assumptions call for different estimators: a few examples include weighted least squares in Ch. 8 and least absolute deviations in Ch. 9.
The ideal approach: write an equation like (3.61) with easy-to-read variable names, e.g.
$math4 = \beta_0 + \beta_1\, classize4 + \beta_2\, math3 + \beta_3 \log(income) + \beta_4\, motheduc + \beta_5\, fatheduc + u$   (3.62)
if we're trying to explain outcomes on a fourth-grade math test. Then one includes a discussion of whether it's reasonable to maintain MLR.4, focusing on other factors that might affect math4. Next, one describes the sampling method (ideally random sampling) as well as the OLS estimates.
We will see that we can add more to this in Ch. 4 and 5.
MLR.1, where u is additive, is always subject to criticism, although it's not too restrictive, as we can transform both the IVs and the DV. It's always a good starting point, and the functional form issue isn't critical.
3.7a Prediction
Sometimes we're interested in a pure prediction exercise. Suppose a college admissions officer wants to predict colGPA using the information available at the time of application (the IVs). The best predictor of y, as measured by mean squared error, is the conditional expectation
$E(y \mid x_1, x_2, \ldots, x_k) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
which is the same as writing
$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u$ with $E(u \mid x_1, \ldots, x_k) = 0$
MLR.4 is true by construction once we assume linearity of the conditional expectation. If we have random sampling and can rule out perfect collinearity, we can obtain unbiased estimators by OLS. We can also see which xⱼ are the most important predictors. The next chapter will discuss how to decide which IVs to include.
3.7b Efficient Markets
In economics, efficient markets theories imply that a single variable acts as a sufficient statistic for predicting the outcome variable y. For emphasis, call this special predictor w. Then, given other factors x₁, ..., xₖ,
$E(y \mid w, x_1, \ldots, x_k) = E(y \mid w)$
where the slight notation change denotes the special status of w. Ch. 4 will discuss how to test this restriction.
As a specific example, take the sports betting market. It produces a point spread, w = spread, which is determined before a game. It varies a bit in the days preceding the game but essentially settles on some value. The actual score differential in the game is y = scorediff. Efficiency in the gambling market implies that
$E(scorediff \mid spread, x_1, \ldots, x_k) = E(scorediff \mid spread) = spread$
where x includes any observable variables. The idea is that, because so much money is involved, the spread will move until it incorporates all relevant information.
MLR can be used to test this, because MLR.4 holds once we assume a linear model:
$y = \beta_0 + \beta_1 w + \gamma_1 x_1 + \cdots + \gamma_k x_k + u$   (3.66)
$E(u \mid w, x_1, \ldots, x_k) = 0$   (3.67)
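A sketch of the test regression (3.66) on simulated data where efficiency holds by construction, so the fitted coefficients should be near (0, 1, 0):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 1000
x = rng.normal(size=n)                       # an observable factor
spread = 3.0 + 1.5 * x + rng.normal(size=n)  # spread incorporates x
# Under efficiency, E(scorediff | spread, x) = spread:
scorediff = spread + rng.normal(scale=10, size=n)

X = np.column_stack([np.ones(n), spread, x])
b = np.linalg.solve(X.T @ X, X.T @ scorediff)
print(b)  # roughly [0, 1, 0]: x has no predictive power beyond spread
```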
When we're measuring the tradeoff between two variables, it doesn't matter which one plays the role of y and which plays w. However, functional form can come into play, as seen in Example 4.10.
If we have accounted for all factors that should affect productivity, then remaining wage differences may be attributed to discrimination. In the simplest case we would use a linear model. For a general y and w, we have (3.66) and (3.67), so MLR.4 holds, which means OLS gives unbiased estimators. The main issue is still the omitted variables problem.
We imagine the potential outcomes y(0) and y(1). If we assume a constant treatment effect,
$y(1) = \tau + y(0)$
When the treatment effect can vary by i, the average treatment effect is
$\tau_{ate} = E[y(1) - y(0)]$
where the expectation is taken over the entire population. For a random draw i, the outcome we observe can be written
$y_i = (1 - w_i)\, y_i(0) + w_i\, y_i(1)$
Random assignment is still rare in the social sciences, as true experiments are rare. But if we can control for the variables that predict the outcome and determine assignment to the control and treatment groups, assignment can be as good as random conditional on those variables. This assumption is called conditional independence; it's important to note which variables are included in x. In the treatment effects literature, (3.71) is also called unconfounded assignment.
The elements of x include educ, age, exper, etc. Suppose workers are more likely to participate in the program if they have lower educ, age, and exper. Then, because these variables are very likely to predict the outcome (earnings), random assignment doesn't hold. Nevertheless, once we group people by education, age, and prior work history, it's possible that assignment is random within groups.
As a concrete example, consider the group of people with 12 years of schooling, aged 35, with the same average earnings over the past two years. What (3.71) requires is that assignment to treatment and control within this group is random.
The more variables we observe prior to the implementation of the program, the more likely (3.71) holds. If we observe no information on x, we're back to assuming pure random assignment.
To use (3.71) in multiple regression (here we only consider the case of a constant treatment effect τ; Section 7.6 considers the more general case), note that in the population
$y = y(0) + \tau w$
and
$E(y \mid w, x) = E(y(0) \mid w, x) + \tau w = E(y(0) \mid x) + \tau w$   (3.72)
where the second equality follows from conditional independence. Now assume that E(y(0) | x) is linear:
$E(y(0) \mid x) = \alpha + x\gamma$
Plugging in gives
$E(y \mid w, x) = \alpha + \tau w + x\gamma$
so we can estimate τ from the regression of y on w and x:
$y = \alpha + \tau w + x\gamma + u$
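A sketch of this regression on simulated data where conditional independence holds by construction (true τ = 2): the short regression omitting x is badly biased, while the regression with the control is not.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000
x = rng.normal(size=n)                      # pre-treatment control
# Assignment depends on x (not random), but is independent of y(0) given x.
w = (x + rng.normal(size=n) > 0).astype(float)
y0 = 1.0 + 1.5 * x + rng.normal(size=n)     # potential outcome w/o treatment
tau = 2.0
y = y0 + tau * w                            # constant treatment effect

ols = lambda X, y: np.linalg.solve(X.T @ X, X.T @ y)
ones = np.ones(n)
print(ols(np.column_stack([ones, w]), y)[1])     # biased: omits x
print(ols(np.column_stack([ones, w, x]), y)[1])  # ~2: controls for x
```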
Without random assignment, it's highly likely that the negative coefficient on train in the simple regression, and its large magnitude, are products of nonrandom assignment: men with poor earnings histories may be more likely to be selected, or more likely to be eligible. Once we factor in the controls:
$\widehat{earn98} = 4.67 + 2.41\, train + .373\, earn96 + .363\, educ - .181\, age + 2.48\, married$   (3.75)
n = 1,130, R² = 0.405
The change in the coefficient on train is remarkable: we now predict an increase of 2.41 (about $2,410 in annual earnings) once the control variables have been taken into account.
The signs of the other coefficients aren't surprising either: past and current earnings are positively correlated, workers with more educ earn more, and married men tend to earn more. The R² of .405 leaves plenty of variation unexplained, but the equation does a reasonable job overall.
Summary
1. The MLR model allows us to effectively hold other factors fixed while examining the effect of an IV on the DV. It explicitly allows the IVs to be correlated.
3. OLS is easily applied to estimate the MRM. Each slope estimate measures the partial effect of the corresponding IV on the DV, holding all other IVs fixed.
4. R² is the proportion of the sample variation in the DV explained by the IVs. Don't put too much emphasis on it when evaluating econometric models.
5. Under MLR.1-4, the OLS estimators are unbiased. Including an irrelevant variable has no effect on unbiasedness; omitting a relevant variable, however, generally causes bias.
6. Under MLR.1-5, the variance of an OLS slope estimator is given by
$Var(\hat\beta_j) = \frac{\sigma^2}{SST_j\,(1 - R_j^2)}$
As σ² increases, so does Var(β̂ⱼ); as SSTⱼ increases, Var(β̂ⱼ) decreases. Rⱼ² measures the extent of the linear relationships between xⱼ and the other IVs.
10. We will begin to use the standard errors of the OLS coefficients to compute confidence intervals for the population parameters and to obtain test statistics for testing hypotheses about them.