
CHAPTER 3: MULTIPLE REGRESSION ANALYSIS: ESTIMATION

In Chapter 2 we learned how to use simple regression analysis to explain a dependent variable (DV) y as a function of a single independent variable (IV). The main issue is that it is often unrealistic to draw ceteris paribus conclusions, especially given the zero conditional mean (ZCM) assumption.

Multiple regression analysis allows us to explicitly control for the extraneous variables that affect the DV.

Naturally, adding more factors that are useful for explaining y lets us explain more of the variation in y, allowing us to build better models.

It also allows more flexibility, since more functional forms can be incorporated into the model.

Section 3.1 formally discusses the multiple regression model and its advantages over the simple regression model.

Section 3.2 demonstrates how to estimate the parameters using the ordinary least squares (OLS) method.

Sections 3.3-3.5 describe the various statistical properties of the OLS estimators, including unbiasedness and efficiency.

The multiple regression model is still the most widely used model in empirical analysis, and OLS is the most popular method for estimating its parameters.
3.1 Motivation for Multiple Regression

3.1a The Model with Two Independent Variables
We begin with some simple examples to show how multiple regression analysis can be used to solve problems that cannot be solved by simple regression.

The first is a variation of the wage-educ example in Ch. 2:

  wage = β0 + β1 educ + β2 exper + u    (3.1)

Thus wage is determined by educ, exper, and the other unobserved factors contained in u. We are still interested in the effect of educ on wage, but by including β2 exper we explicitly take exper out of u and put it in the equation. We can also see the ceteris paribus effect of exper on wage, which may be of some interest.

We still have to make assumptions about how u is related to both IVs. However, we can now be confident that exper is held fixed, unlike in simple regression, where we assume that exper is uncorrelated with educ, a tenuous assumption.

Generally, we can write the model with two IVs as

  y = β0 + β1 x1 + β2 x2 + u    (3.3)

where:
  β0 is the intercept;
  β1 measures the change in y with respect to x1, ceteris paribus;
  β2 measures the change in y with respect to x2, ceteris paribus.

MRA is also useful for generalising functional relationships between variables. For example, suppose family consumption is a quadratic function of family income:

  cons = β0 + β1 inc + β2 inc² + u    (3.4)

This model cannot be handled by simple regression, as it contains two different IVs. Mechanically, though, there is no difference in using OLS: each equation can be written in the form (3.3), which is all that matters for the computation.

There is one important difference in how one interprets the parameters. In (3.1), β1 is the ceteris paribus effect of educ on wage. We cannot interpret (3.4) that way: as inc changes, so does inc². Instead, the change in consumption with respect to a change in income, the marginal propensity to consume (MPC), is approximated by

  Δcons/Δinc ≈ β1 + 2 β2 inc

In other words, the MPC depends on β1, β2, and the level of income. This shows that the definitions of the IVs are crucial, but for the theoretical development of multiple regression we can be vague about such details; we will study more examples in Ch. 6.
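To make the MPC calculation concrete, here is a minimal Python sketch; the coefficient values are made up purely for illustration, not estimates from any dataset.

```python
# MPC implied by the quadratic model (3.4): cons = b0 + b1*inc + b2*inc^2 + u,
# so dcons/dinc = b1 + 2*b2*inc. Coefficients below are invented for illustration.
b1, b2 = 0.80, -0.002

def mpc(inc):
    """Approximate change in cons for a one-unit change in inc."""
    return b1 + 2 * b2 * inc

for inc in (10, 30, 50):
    print(f"inc = {inc}: MPC = {mpc(inc):.3f}")  # MPC falls as income rises here
```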

In the model with two IVs, the key assumption about how x1 and x2 are related to u is

  E(u | x1, x2) = 0    (3.5)

This assumption is similar to the ZCM assumption for simple regression. When applied to a quadratic function such as (3.4), writing E(u | inc, inc²) = 0 and E(u | inc) = 0 are the same, although the latter is more concise.

3.1b The Model with k Independent Variables
We can add many observed factors. The general multiple linear regression (MLR) model, also called the multiple regression model, can be written in the population as

  y = β0 + β1 x1 + β2 x2 + β3 x3 + ... + βk xk + u    (3.6)

(3.6) contains k + 1 unknown population parameters. For shorthand, we will sometimes refer to the parameters other than the intercept as slope parameters.

The terminology is the same as for simple regression, with u called the error term. No matter how many IVs we include, there will always be factors we cannot include. We must also know how to interpret the parameters, which we will practise a lot in the subsequent chapters. Here is a reminder of what we already know.

Suppose that CEO salary is related to firm sales and CEO tenure by

  log(salary) = β0 + β1 log(sales) + β2 ceoten + β3 ceoten² + u    (3.7)


From Ch. 2, the parameter β1 is the ceteris paribus elasticity of salary with respect to sales. If β3 = 0, then 100β2 is approximately the ceteris paribus percentage increase in salary when ceoten increases by one year. When β3 ≠ 0, the effect is more complicated. We will return to the detailed treatment of models with quadratics in Ch. 6.

(3.7) provides an important reminder: the model is linear in the parameters, but it allows a nonlinear relationship between the IVs and the DV. Many applications of the MLR model involve such relationships.

The key ZCM assumption is the same:

  E(u | x1, x2, ..., xk) = 0    (3.8)

Section 3.3 will discuss how (3.8) implies that OLS is unbiased and will derive the bias when a key variable has been omitted. In Ch. 15 and 16 we will study how the assumption may fail and what can be done about it.

3.2 MECHANICS AND INTERPRETATION OF ORDINARY LEAST SQUARES

We summarise the computational and algebraic features of the OLS method and how to interpret the results.

3.2a Obtaining the OLS Estimates
We first consider the case with two IVs. The estimated OLS equation is written as

  ŷ = β̂0 + β̂1 x1 + β̂2 x2    (3.9)

We obtain β̂0, β̂1, and β̂2 by ordinary least squares, which chooses the estimates to minimise the sum of squared residuals. That is, given n observations {(xi1, xi2, yi) : i = 1, 2, ..., n}, the estimates β̂0, β̂1, and β̂2 are chosen simultaneously to make

  Σᵢ (yi - β̂0 - β̂1 xi1 - β̂2 xi2)²    (3.10)

as small as possible.

To understand what OLS is doing, we must first master the subscripts. Each x has two subscripts: i followed by 1 or 2. The first, i, indexes the observation; 1 and 2 distinguish the different IVs.

In the general case with k IVs, we seek estimates β̂0, β̂1, ..., β̂k in the equation

  ŷ = β̂0 + β̂1 x1 + β̂2 x2 + ... + β̂k xk    (3.11)

The OLS estimates, k + 1 of them, are chosen to minimise the sum of squared residuals

  Σᵢ (yi - β̂0 - β̂1 xi1 - ... - β̂k xik)²    (3.12)

This minimisation problem can be solved using multivariable calculus (Appendix 3A). It leads to k + 1 linear equations in the k + 1 unknowns β̂0, β̂1, ..., β̂k:

  Σᵢ (yi - β̂0 - β̂1 xi1 - ... - β̂k xik) = 0
  Σᵢ xi1 (yi - β̂0 - β̂1 xi1 - ... - β̂k xik) = 0
  ...
  Σᵢ xik (yi - β̂0 - β̂1 xi1 - ... - β̂k xik) = 0    (3.13)

These are often called the OLS first order conditions. They can also be obtained by the method of moments.

Even with moderate n and k, the calculations would be hideous by hand, so in practice we use computers.
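For concreteness, here is a minimal numpy sketch, on simulated data with made-up coefficients, showing that the first order conditions (3.13) are just the normal equations X'Xb = X'y.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + two IVs
y = 1.0 + 0.5 * X[:, 1] - 0.3 * X[:, 2] + rng.normal(size=n)

# The k+1 first order conditions (3.13) say X'(y - Xb) = 0,
# i.e. the normal equations X'X b = X'y; solving them gives the OLS estimates.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to the true values [1.0, 0.5, -0.3]
```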

There is one caveat: we must assume that (3.13) can be solved uniquely for the β̂j. For now we assume it can, as is usual for well specified models; Section 3.3 states the assumption needed for unique OLS estimates.

As with simple regression, (3.11) is called the OLS regression line or the sample regression function (SRF). We call β̂0 the OLS intercept estimate and β̂1, ..., β̂k the OLS slope estimates.

3.2b Interpreting the OLS Regression Equation

Interpretation is more important than calculation. We begin with two IVs:

  ŷ = β̂0 + β̂1 x1 + β̂2 x2    (3.14)

The intercept β̂0 in (3.14) is the predicted value of y when x1 = 0 and x2 = 0. Setting both to zero sometimes yields interesting results; sometimes it makes no sense.

The estimates β̂1 and β̂2 have partial effect, or ceteris paribus, interpretations. From (3.14) we have

  Δŷ = β̂1 Δx1 + β̂2 Δx2

so we can obtain the predicted change in y given changes in x1 and x2. (The intercept plays no role in changes in y.) In particular, when x2 is held fixed, so that Δx2 = 0,

  Δŷ = β̂1 Δx1

The key point is that, by including x2 in the model, we obtain a coefficient on x1 with a ceteris paribus interpretation.

Example 3.1: Determinants of College GPA
The dataset GPA1 includes colGPA, hsGPA, and ACT for a sample of 141 students from a large university. We obtain the following OLS regression line:

  colGPA^ = 1.29 + 0.453 hsGPA + 0.0094 ACT    (3.15)
  n = 141

The interpretation is as follows. The intercept 1.29 is the predicted colGPA when hsGPA and ACT are both zero; since no student who attends university has a zero ACT score and a zero hsGPA, this value is not meaningful in itself.

There is a positive partial relationship between colGPA and hsGPA: holding ACT fixed, another point on hsGPA is associated with 0.453 of a point on colGPA.

The coefficient on ACT implies that, holding hsGPA fixed, a 10-point increase in ACT raises predicted colGPA by less than 0.1. This is a small effect, suggesting that the ACT score is not a strong predictor of colGPA. (We will discuss statistical significance later; this coefficient also turns out to be statistically insignificant.)

If we regress colGPA on ACT alone:

  colGPA^ = 2.40 + 0.0271 ACT
  n = 141

The coefficient is almost three times as large as in (3.15), but this regression does not allow us to compare two people with different hsGPA values. This difference will be discussed later (see the replication sketch below).
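The regressions above can be reproduced in Python with statsmodels. This is a hedged sketch: it assumes the third-party `wooldridge` package (which bundles the textbook datasets under names like 'gpa1'); any DataFrame with columns colGPA, hsGPA, and ACT would work the same way.

```python
import statsmodels.formula.api as smf
import wooldridge  # third-party package that bundles the textbook datasets

gpa1 = wooldridge.data('gpa1')  # columns include colGPA, hsGPA, ACT

multiple = smf.ols('colGPA ~ hsGPA + ACT', data=gpa1).fit()
simple = smf.ols('colGPA ~ ACT', data=gpa1).fit()

print(multiple.params)  # intercept ~1.29, hsGPA ~0.453, ACT ~0.0094, as in (3.15)
print(simple.params)    # the ACT coefficient roughly triples when hsGPA is dropped
```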

The OLS regression line with more than two IVs is similar:

  ŷ = β̂0 + β̂1 x1 + β̂2 x2 + ... + β̂k xk    (3.16)

Written in terms of changes:

  Δŷ = β̂1 Δx1 + β̂2 Δx2 + ... + β̂k Δxk    (3.17)

Holding all IVs other than x1 fixed,

  Δŷ = β̂1 Δx1    (3.18)

that is, we have controlled for x2, x3, ..., xk when estimating the effect of x1 on y.

The following is an example with three IVs.

Example 3.2: Hourly Wage Equation

From n = 526 observations we obtain

  log(wage)^ = 0.284 + 0.092 educ + 0.0041 exper + 0.022 tenure    (3.19)
  n = 526

As in simple regression, the coefficients have a percentage interpretation; the only difference is that they now hold ceteris paribus. The coefficient 0.092 means that, holding exper and tenure fixed, another year of education is predicted to increase log(wage) by 0.092, i.e. to raise wage by about 9.2%. This keeps two other important productivity factors fixed. Whether this is a good estimate requires us to study the statistical properties of OLS (Section 3.3).

3.2c On the Meaning of 'Holding Other Factors Fixed' in Multiple Regression

The partial effect interpretation of slope coefficients in multiple regression analysis can cause some confusion, so we provide further discussion.

In Example 3.1 we observed that the coefficient on ACT measures the predicted change in colGPA, holding hsGPA fixed. The power of MRA is that it lets us interpret the data in a ceteris paribus fashion even though the data may not have been collected in such a fashion.

In giving the ACT coefficient a partial effect interpretation, it may seem as though we went out and sampled people with the same hsGPA but different ACT scores, even though this was not how the sampling was done.

Rarely can we hold certain variables fixed when obtaining our sample. If we could obtain samples of individuals with the same hsGPA, we would just run a simple regression of colGPA on ACT. MRA effectively allows us to mimic this without restricting the values of any IVs.
3.2d Changing More Than One Independent Variable Simultaneously
We can easily do this using (3.17). For example, from (3.19) we can obtain the predicted change in wage when an individual stays at the firm for another year: exper and tenure both increase by one year. Holding educ fixed, the total effect is

  Δlog(wage)^ = 0.0041 Δexper + 0.022 Δtenure
              = 0.0041 + 0.022 = 0.0261

or about 2.61%.

3.2e OLS Fitted Values and Residuals

After obtaining the OLS line (3.11), we can obtain a fitted or predicted value for each observation. For observation i, the fitted value is simply

  ŷi = β̂0 + β̂1 xi1 + β̂2 xi2 + ... + β̂k xik    (3.20)

which is just the predicted value obtained by plugging the values of the IVs for observation i into (3.11). We should not forget the intercept, or the results can be very misleading.

Normally ŷi ≠ yi: OLS minimises the average squared prediction error, which says nothing about the prediction error for any particular observation.

The residual is defined as in simple regression:

  ûi = yi - ŷi    (3.21)

for each observation. If ûi > 0, then ŷi is below yi, so yi is underpredicted; if ûi < 0, then yi is overpredicted.

The OLS fitted values and residuals have some important properties that are immediate extensions of the single-variable case (a numerical verification follows this list):
1. The sample average of the residuals is zero, so ȳ equals the sample average of the ŷi.
2. The sample covariance between each IV and the OLS residuals is zero.
3. The point (x̄1, x̄2, ..., x̄k, ȳ) is always on the OLS regression line.
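The following numpy/statsmodels sketch, on simulated data with invented coefficients, checks the three properties numerically.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = sm.add_constant(rng.normal(size=(200, 3)))           # intercept + three IVs
y = X @ np.array([1.0, 0.5, -0.2, 0.8]) + rng.normal(size=200)
res = sm.OLS(y, X).fit()

print(np.isclose(res.resid.mean(), 0.0))                 # property 1: residuals average to zero
print(np.allclose(X[:, 1:].T @ res.resid, 0.0))          # property 2: each IV orthogonal to residuals
print(np.isclose(res.predict(X.mean(axis=0))[0], y.mean()))  # property 3: the means lie on the line
```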
3.2f A 'Partialling Out' Interpretation of Multiple Regression
When applying OLS we do not need explicit formulas for the β̂j that solve the system in (3.13). Nevertheless, for certain derivations we do need explicit formulas, and they shed further light on the mechanics of OLS.

Consider the case k = 2, and suppose we want to focus on β̂1. Then

  β̂1 = (Σᵢ r̂i1 yi) / (Σᵢ r̂i1²)    (3.22)

where the r̂i1 are the OLS residuals from a simple regression of x1 on x2, using the sample at hand. That is, we regress x1 on x2 (y plays no role here). (3.22) then shows that we can run a simple regression of y on r̂1 to obtain β̂1.

The representation in (3.22) gives another demonstration of β̂1's partial effect interpretation. The residuals r̂i1 are the part of xi1 that is uncorrelated with xi2; in other words, r̂i1 is xi1 after the effects of xi2 have been 'partialled out' or 'netted out'.

This has no counterpart in simple regression, since there are no other variables in the analysis.

In the general model with k IVs, we can still write β̂1 as in (3.22), but now the residuals r̂i1 come from the regression of x1 on x2, ..., xk. In econometrics this result is known as the Frisch-Waugh theorem, and it has many uses, such as in Ch. 10.
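A minimal simulated demonstration of the partialling out result (data and coefficients are invented for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x2 = rng.normal(size=500)
x1 = 0.6 * x2 + rng.normal(size=500)        # x1 correlated with x2
y = 1 + 2 * x1 - 1 * x2 + rng.normal(size=500)

# Step 1: partial x2 out of x1, keeping the residuals r1.
r1 = sm.OLS(x1, sm.add_constant(x2)).fit().resid
# Step 2: the simple regression of y on r1 reproduces the multiple-regression slope, as in (3.22).
b1_fwl = (r1 @ y) / (r1 @ r1)
b1_mult = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().params[1]
print(b1_fwl, b1_mult)  # identical up to rounding error
```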

3.2g Comparison of Simple and Multiple Regression Estimates

There are two special cases in which the simple regression of y on x1 produces the same OLS estimate on x1 as the regression of y on x1 and x2.

To be precise, write the simple regression of y on x1 as ỹ = β̃0 + β̃1 x1, and write the multiple regression as ŷ = β̂0 + β̂1 x1 + β̂2 x2.

We know that the simple regression coefficient β̃1 does not usually equal the multiple regression coefficient β̂1. There is a simple relationship between them:

  β̃1 = β̂1 + β̂2 δ̃1    (3.23)

where δ̃1 is the slope coefficient from the simple regression of xi2 on xi1. This equation shows how β̃1 differs from the partial effect of x1 on ŷ: the confounding term is the partial effect of x2 on ŷ times the slope from the simple regression of x2 on x1.

The relationship also shows the two distinct cases in which they are equal:
1. The estimated partial effect of x2 on ŷ is zero in the sample: β̂2 = 0.
2. x1 and x2 are uncorrelated in the sample: δ̃1 = 0.

Even though simple and multiple regression estimates are almost never identical, we can use the formula to characterise why they might be very different or quite similar. For example, when β̂2 is small, we might expect the multiple and simple regression estimates of β1 to be similar.

In Example 3.1, the sample correlation between hsGPA and ACT is about 0.346, a nontrivial correlation. But the coefficient on ACT is quite small, so it is not surprising that a simple regression of colGPA on hsGPA produces a slope estimate of 0.482, not much different from the estimate 0.453 in (3.15).
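Relationship (3.23) is easy to verify numerically; the following sketch uses simulated data with invented coefficients.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x2 = rng.normal(size=300)
x1 = 0.5 * x2 + rng.normal(size=300)
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=300)

b = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().params  # multiple regression
b1_tilde = sm.OLS(y, sm.add_constant(x1)).fit().params[1]               # simple regression of y on x1
delta1 = sm.OLS(x2, sm.add_constant(x1)).fit().params[1]                # regression of x2 on x1
print(b1_tilde, b[1] + b[2] * delta1)  # equal, verifying (3.23)
```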

Example 3.3: Participation in 401(k) Pension Plans

Regressing prate on mrate and age using the dataset 401K gives

  prate^ = 80.12 + 5.52 mrate + 0.243 age
  n = 1,534

The sample averages are 87.36 for prate, 0.732 for mrate, and 13.2 for age. Both mrate and age have the expected positive effects.

If age were strongly correlated with mrate, we might see a large change in the estimated effect of mrate when age is dropped from the regression. However, regressing prate on mrate alone gives

  prate^ = 83.08 + 5.86 mrate

The difference is not big: the simple regression estimate is only about 6.2% larger. This can be explained by the fact that the sample correlation between mrate and age is only 0.12.

In the general case with k IVs, the simple regression of y on x1 and the multiple regression of y on x1, x2, ..., xk produce an identical estimate on x1 only if:
1. the OLS coefficients on x2 through xk are all zero; or
2. x1 is uncorrelated in the sample with each of x2, ..., xk.

Neither condition is likely in practice, but if both are nearly true, the simple and multiple estimates of β1 can be similar.

3.2h Goodness-of-Fit
As with simple regression, we can define the total sum of squares (SST), the explained sum of squares (SSE), and the residual sum of squares (SSR) as

  SST = Σᵢ (yi - ȳ)²    (3.24)
  SSE = Σᵢ (ŷi - ȳ)²    (3.25)
  SSR = Σᵢ ûi²    (3.26)

Using the same argument as in simple regression,

  SST = SSE + SSR    (3.27)

In other words, the total variation in yi is the sum of the total variations in ŷi and ûi. Assuming the total variation in yi is nonzero (i.e. yi is not constant), we can divide through by SST to get

  SSR/SST + SSE/SST = 1


The R-squared is defined as

  R² = SSE/SST = 1 - SSR/SST    (3.28)

and is interpreted as the proportion of the sample variation in yi explained by the OLS regression line. R² can also be shown to equal the squared sample correlation between yi and ŷi.

An important fact about R² is that it never decreases, and usually increases, when another IV is added and the same set of observations is used for both regressions.

An important caveat concerns missing data on the IVs: if two regressions use different sets of observations, even if one uses a subset of the regressors, we cannot compare their R² values. If we have full data on y, x1, and x2 but not on x3, then we cannot say whether the R² from regressing y on x1 and x2 is less than the R² from regressing y on x1, x2, and x3; it could go either way. This is an important issue we return to in Ch. 9.

Because R² never decreases when a variable is added, it is a poor tool for deciding whether to add more IVs. That decision should be made by judging whether the IV has a nonzero partial effect on y in the population, as we will show in Ch. 4 when we cover statistical inference.
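A quick simulated illustration of the never-decreasing property (the added regressor is pure noise, so the rise in R² is mechanical, not meaningful):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
noise = rng.normal(size=100)               # pure noise, unrelated to y
y = 1 + 2 * x1 + rng.normal(size=100)

r2_small = sm.OLS(y, sm.add_constant(x1)).fit().rsquared
r2_big = sm.OLS(y, sm.add_constant(np.column_stack([x1, noise]))).fit().rsquared
print(r2_small <= r2_big)  # True: R-squared cannot fall when a regressor is added
```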

3.2i Regression Through the Origin

Sometimes β0 is restricted to zero, so we seek an equation of the form

  ỹ = β̃1 x1 + β̃2 x2 + ... + β̃k xk    (3.30)

where the tildes distinguish these from the OLS estimates with an intercept. When x1 = 0, x2 = 0, ..., xk = 0, the predicted value ỹ is zero.

The estimates in (3.30) still minimise the sum of squared residuals, but their properties differ. In particular, the residuals no longer have a zero sample average. Further, if R² is defined as 1 - SSR/SST, it can actually be negative; this means that ȳ explains more of the variation in the yi than the IVs do. We should then either include an intercept or conclude that the IVs poorly explain y.

One serious drawback of regression through the origin is that if β0 ≠ 0, then the OLS estimators of the slope parameters will be biased, and the bias can be severe in some cases.
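Here is a small simulated sketch of the negative-R² possibility, computing 1 - SSR/SST by hand for a through-origin fit when the true intercept is far from zero:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=100)
y = 10 + 0.1 * x + rng.normal(size=100)    # true intercept far from zero

b = (x @ y) / (x @ x)                      # slope from regression through the origin
ssr = np.sum((y - b * x) ** 2)
sst = np.sum((y - y.mean()) ** 2)
print(1 - ssr / sst)                       # strongly negative: y-bar beats the fitted line
```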
3.3 The Expected Values of the OLS Estimators
We turn to the statistical properties of OLS. In this section we derive the expected values of the OLS estimators. We discuss four assumptions and obtain the bias in OLS when an important variable has been omitted from the regression.

The first assumption simply defines the MLR model.

Assumption MLR.1 (Linear in Parameters)

The model in the population can be written as

  y = β0 + β1 x1 + β2 x2 + ... + βk xk + u    (3.31)

(3.31) formally states the population model, or true model, allowing for the possibility that we might estimate a model that differs from (3.31).

Assumption MLR.2 (Random Sampling)

We have a random sample of n observations following the model in (3.31).

Sometimes we need to write (3.31) for a specific observation i:

  yi = β0 + β1 xi1 + β2 xi2 + ... + βk xik + ui    (3.32)

Here i refers to the observation number, while the second subscript is the variable number. An example of an equation for a particular CEO is

  log(salaryi) = β0 + β1 log(salesi) + β2 ceoteni + β3 ceoteni² + ui    (3.33)

(3.31) contains less clutter and indicates that we are interested in estimating a population relationship.

In Section 3.2 we saw that OLS chooses the slope and intercept estimates so that the residuals average to zero and there is no sample correlation between each IV and the residuals. We still have not stated conditions under which the OLS estimates are well defined for a given sample.

Assumption MLR.3 (No Perfect Collinearity)

In the sample (and therefore in the population), none of the IVs is constant, and there are no exact linear relationships among the IVs.


This differs from the SLR case, as we now must consider the relationships among all the IVs. If an IV is an exact linear combination of the other IVs, we say that the model suffers from perfect collinearity, and it cannot be estimated by OLS.

MLR.3 does allow correlation among the IVs; they simply cannot be perfectly correlated. If we did not allow some correlation, MLR would be of very limited use. For example, in

  avgscore = β0 + β1 expend + β2 avginc + u

we fully expect expend and avginc to be correlated: school districts with high average family incomes tend to spend more per student. In fact, we include avginc in the model precisely to hold it fixed, because we suspect it is correlated with expend. MLR.3 only rules out perfect correlation between the two IVs.

The simplest way MLR.3 fails is when one variable is a constant multiple of another, for example when the same variable appears measured in different units, such as income and income in thousands in the same model.

The model cons = β0 + β1 inc + β2 inc² + u does not violate MLR.3, as inc² is not an exact linear function of inc; including inc² is a useful way to generalise the functional form.

A subtler way MLR.3 fails is with the model log(cons) = β0 + β1 log(inc) + β2 log(inc²) + u. Since log(inc²) = 2 log(inc), we have x2 = 2 x1, which violates MLR.3; we should have used [log(inc)]² instead.

Another way MLR.3 can fail is when one IV can be expressed as an exact linear function of two or more other IVs. For example, consider the model

  voteA = β0 + β1 expendA + β2 expendB + β3 totexpend + u

This violates MLR.3 because x3 = x1 + x2. Interpreting the coefficients ceteris paribus reveals the problem: we would be asking for the effect of expendA while holding expendB and totexpend fixed, which is nonsense, because if totexpend and expendB are held constant we cannot change expendA.

The solution is simply to drop one of the variables, preferably totexpend; then β1 measures the effect of increasing expendA on voteA, holding expendB fixed.

These examples show that MLR.3 can fail if we are careless in specifying our models. MLR.3 also fails if n < k + 1, and in rare cases we could draw a sample in which, say, one IV's values are exactly twice another's, although this tends to happen only when n is small.

The final, and most important, assumption is:

Assumption MLR.4 (Zero Conditional Mean)

The error u has an expected value of zero given any values of the IVs:

  E(u | x1, x2, ..., xk) = 0

One way MLR.4 can fail is if we misspecify the model, such as forgetting to include inc² in cons = β0 + β1 inc + β2 inc² + u. It can also fail when we use a level where a log appears in the population model, or vice versa. This is discussed further in Ch. 9.

Omitting important variables that are correlated with the included IVs also causes MLR.4 to fail; MLR helps here relative to SLR precisely because we are able to include many variables.

In Ch. 9 and 15 we will discuss the problem of measurement error in an IV. In Ch. 16 we will cover a harder issue, in which one or more IVs is determined jointly with y.

When MLR.4 holds, we say we have exogenous explanatory variables; if it fails, we have endogenous explanatory variables.

We are now ready to show unbiasedness under the first four MLR assumptions.

Theorem 3.1 (Unbiasedness of OLS)

Under MLR.1-4,

  E(β̂j) = βj,  j = 0, 1, ..., k    (3.37)

It is useful to reiterate the meaning of unbiasedness. An individual estimate cannot be 'unbiased': it is a number computed from a particular sample. Unbiasedness refers to the procedure by which the OLS estimates are obtained. We hope to obtain an estimate close to the population value, but this cannot be assured; what we can say is that the procedure gives us no systematic reason to believe the estimate is too large or too small.

3.3a Including Irrelevant Variables in a Regression Model

Inclusion of an irrelevant variable, or overspecifying the model, is an issue we can dispense with fairly quickly.

Suppose that x3 has a zero population coefficient in

  y = β0 + β1 x1 + β2 x2 + β3 x3 + u    (3.38)

that is, after x1 and x2 are controlled for, x3 has no effect on y: β3 = 0. In terms of conditional expectations, E(y | x1, x2, x3) = E(y | x1, x2) = β0 + β1 x1 + β2 x2. If we estimate β3 anyway,

  ŷ = β̂0 + β̂1 x1 + β̂2 x2 + β̂3 x3    (3.39)

this does not affect the unbiasedness of the OLS estimators. Including an irrelevant variable is not harmless, however: as we will see in Section 3.4, it can have an undesirable effect on the variances of the OLS estimators.

3.3b Omitted Variable Bias: The Simple Case

Now suppose that we omit a variable that actually belongs in the population model. This problem is called excluding a relevant variable or underspecifying the model. It causes the OLS estimators to be biased, and we will show explicitly how.

Deriving the bias caused by omitting an important variable is an example of misspecification analysis.

We begin with the case where the true population model has two IVs and an error term,

  y = β0 + β1 x1 + β2 x2 + u    (3.40)

and we assume it satisfies MLR.1-4.

Suppose that our primary interest is in β1, the partial effect of x1 on y, where y is wage, x1 is educ, and x2 is a measure of innate ability. To obtain an unbiased estimator of β1, we should run a regression of y on x1 and x2. Suppose, however, that we are unable to do this, and instead estimate

  ỹ = β̃0 + β̃1 x1    (3.41)

We use tildes to emphasise that (3.41) is an underspecified model. For example, if wage is determined by

  wage = β0 + β1 educ + β2 abil + u    (3.42)

but abil is not observed, we instead estimate

  wage = β0 + β1 educ + v

where v = β2 abil + u. Obtaining β̃1 is not difficult: it is just the usual OLS slope estimator from the simple regression. The only difference is that we must analyse its properties when the simple regression is misspecified due to an omitted variable. We have done almost all of the work already: using (3.23), we have

  β̃1 = β̂1 + β̂2 δ̃1

where β̂1 and β̂2 are the slope estimators from the multiple regression of

  yi on xi1, xi2,  i = 1, ..., n    (3.43)

and δ̃1 is the slope from the simple regression of

  xi2 on xi1,  i = 1, ..., n    (3.44)

Because δ̃1 depends only on the IVs, we treat it as fixed when computing E(β̃1). Further, because MLR.1-4 are satisfied, we know that β̂1 and β̂2 are unbiased. Therefore,

  E(β̃1) = E(β̂1 + β̂2 δ̃1)
         = E(β̂1) + E(β̂2) δ̃1
         = β1 + β2 δ̃1    (3.45)

which implies that the bias in β̃1 is

  Bias(β̃1) = E(β̃1) - β1 = β2 δ̃1    (3.46)

Because the bias arises from omitting the IV x2, the term on the right hand side of (3.46) is often called the omitted variable bias.

There are two cases in which β̃1 is unbiased: when β2 = 0, and when δ̃1 = 0, even if β2 ≠ 0. Because δ̃1 is proportional to the sample covariance between x1 and x2, β̃1 is unbiased if x1 and x2 are uncorrelated in the sample.

When x1 and x2 are correlated, δ̃1 has the same sign as the correlation: δ̃1 > 0 if x1 and x2 are positively correlated, and vice versa. Table 3.2 warrants careful study, as it gives the sign of the bias in β̃1 depending on the signs of β2 and the correlation.

Table 3.2: Summary of the bias in β̃1 when x2 is omitted in estimating (3.40)

           Corr(x1, x2) > 0   Corr(x1, x2) < 0
  β2 > 0   positive bias      negative bias
  β2 < 0   negative bias      positive bias

The size of the bias also matters: a small bias of either sign need not be a cause for concern.

Because β2 is unknown, we do not know whether the bias is positive or negative, but we can often make an educated guess about the sign of β2 and of the correlation between x1 and x2. For example, there are good reasons to believe that educ and abil are positively correlated.
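A small Monte Carlo sketch (invented parameter values) makes formula (3.46) and Table 3.2 concrete: with β2 > 0 and Corr(x1, x2) > 0, the simple regression slope is centred well above β1.

```python
import numpy as np

rng = np.random.default_rng(6)
beta1, beta2 = 1.0, 2.0
n, reps = 200, 2000
slopes = []
for _ in range(reps):
    x2 = rng.normal(size=n)
    x1 = 0.5 * x2 + rng.normal(size=n)      # Corr(x1, x2) > 0 by construction
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    x1c = x1 - x1.mean()
    slopes.append((x1c @ y) / (x1c @ x1c))  # simple regression of y on x1, omitting x2
print(np.mean(slopes))  # about 1.8, well above beta1 = 1: the positive bias of Table 3.2
```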

Example 3.6: Hourly Wage Equation

Suppose that

  log(wage) = β0 + β1 educ + β2 abil + u

satisfies MLR.1-4. We do not have data on abil, so we estimate β1 from the simple regression

  log(wage)^ = 0.584 + 0.083 educ
  n = 526, R² = 0.186    (3.47)

As this comes from a single sample, we cannot say that 0.083 is greater than β1; nevertheless, we know that the average of the estimates across all random samples would be too large.

As a second example, consider

  avgscore = β0 + β1 expend + β2 povrate + u    (3.48)

Here β2 is probably negative, and Corr(expend, povrate) is also probably negative. By Table 3.2, β̃1 probably has a positive bias.

Some notes on terminology for reading and performing empirical work. If E(β̃1) > β1, we say that β̃1 has an upward bias; when E(β̃1) < β1, it has a downward bias. These terms apply whether β1 is positive or negative. The phrase 'biased towards zero' refers to cases where E(β̃1) is closer to zero than β1 is. If β1 is positive, then β̃1 is biased towards zero when it has a downward bias, and vice versa.

3.3c Omitted Variable Bias: More General Cases

Deriving the sign of the bias is more difficult with more regressors. We must remember that correlation between a single IV and the error generally results in all OLS estimators being biased. Suppose that

  y = β0 + β1 x1 + β2 x2 + β3 x3 + u    (3.49)

satisfies MLR.1-4, and we omit x3:

  ỹ = β̃0 + β̃1 x1 + β̃2 x2    (3.50)

Suppose that x2 and x3 are uncorrelated but x1 and x3 are correlated: x1 is correlated with the omitted variable, but x2 is not. In this case, not only β̃1 but both slope estimators will generally be biased. The only exception is when x1 and x2 are also uncorrelated.

It is also difficult to obtain the direction of the bias in β̃1 and β̃2, because x1, x2, and x3 can be pairwise correlated. We can still approximate: if we assume x1 and x2 are uncorrelated, then we can study the bias in β̃1 as if x2 were absent from both the population and the estimated models. When x1 and x2 are uncorrelated, it can be shown that

  E(β̃1) = β1 + β3 · [Σᵢ (xi1 - x̄1) xi3] / [Σᵢ (xi1 - x̄1)²]

This is just like (3.45), with β3 replacing β2 and x3 replacing x2 in (3.44). Therefore the bias is obtained by replacing β2 with β3 and x2 with x3: if β3 > 0 and Corr(x1, x3) > 0, then β̃1 has a positive bias, and so on.

As an example, consider

  wage = β0 + β1 educ + β2 exper + β3 abil + u

If abil is omitted from the model, β̃1 and β̃2 are both biased, even if we assume abil is uncorrelated with exper. If we are interested in educ and would like to know the direction of the bias in β̃1, this is impossible to determine without more information. As an approximation, if educ and exper are also uncorrelated, β3 > 0, and educ and abil are positively correlated, then β̃1 has a positive bias.

This example serves as a rough guide for more complicated models. Usually the focus is on the relationship between a particular IV, say x1, and the omitted variable. Strictly speaking, ignoring all the other IVs is valid only if each of them is uncorrelated with x1.

3.4 The Variance of the OLS Estimators

In addition to bias, we study how spread out an estimator is in its sampling distribution. We first add a homoskedasticity assumption, for two reasons: it simplifies the variance formulas, and, as we will see in Section 3.5, OLS has an important efficiency property when homoskedasticity is present.

Assumption MLR.5 (Homoskedasticity)

The error u has the same variance given any values of the IVs:

  Var(u | x1, ..., xk) = σ²

MLR.5 means that the variance of u, conditional on the IVs, is the same for all combinations of outcomes of the IVs. If it fails, the model exhibits heteroskedasticity.

Assumptions MLR.1-5 are collectively known as the Gauss-Markov assumptions. As stated, they apply only to cross-sectional analysis with random sampling. For time series and other settings, the corresponding assumptions are more difficult to state, although there are similarities.

In the discussion that follows, x denotes the full set of IVs; in the wage equation, x = (educ, exper, tenure). We can then write MLR.1 and MLR.4 as

  E(y | x) = β0 + β1 x1 + β2 x2 + ... + βk xk

and MLR.5 as

  Var(y | x) = σ²

This illustrates the main difference between MLR.5 and MLR.4: the conditional variance does not depend on the values of the IVs, whereas the conditional mean in MLR.4 does.

For the proof of the next theorem, see the chapter appendix.
Theorem 3.2 (Sampling Variances of the OLS Estimators)
Under MLR.1-5, conditional on the sample values of the IVs,

  Var(β̂j) = σ² / [SSTj (1 - Rj²)]    (3.51)

where SSTj is the total sample variation in xj and Rj² is the R-squared from regressing xj on all the other IVs (including an intercept).

Before studying (3.51) in detail, note that all five Gauss-Markov assumptions are used to derive it. The size of Var(β̂j) is important in practice: a larger variance means a less precise estimator, which translates into larger confidence intervals and less accurate hypothesis tests.

3.4a The Components of the OLS Variances: Multicollinearity

(3.51) shows that Var(β̂j) depends on three factors: σ², SSTj, and Rj². We consider each in turn.

The error variance, σ². A larger σ² means larger sampling variances: more noise in the equation makes it harder to estimate the partial effect of any IV on y. Because σ² is a feature of the population, it does not shrink with the sample size, and it is the one component of (3.51) that is unknown. We will see later how to obtain an unbiased estimator of σ².

For a given y, there is only one way to reduce σ²: add more IVs, taking some factors out of the error term. It is not always possible to find additional legitimate factors that affect y.

The total sample variation in xj, SSTj. The larger SSTj, the smaller Var(β̂j). Thus, ceteris paribus, we want as much sample variation in xj as possible. Although we can rarely choose the sample values of the IVs, we can increase the sample size; SSTj is the component that grows systematically with n.

A small SSTj does not violate MLR.3 unless it is zero, in which case there is no OLS estimate at all. As SSTj → 0, Var(β̂j) → ∞.

The linear relationships among the IVs, Rj². This factor is the most difficult to understand. It has no counterpart in SLR, where there is only one IV. It is also different from the R² obtained by regressing y on x1, ..., xk: Rj² comes from a regression involving only the IVs in the original model, with xj playing the role of the DV.

Consider first the k = 2 case: y = β0 + β1 x1 + β2 x2 + u. Then

  Var(β̂1) = σ² / [SST1 (1 - R1²)]

where R1² is the R-squared from the simple regression of x1 on x2. A value of R1² close to one indicates that x2 explains much of the variation in x1: x1 and x2 are highly correlated.

As R1² → 1, Var(β̂1) gets larger and larger. In the general case, Rj² is the proportion of the total variation in xj that is explained by the other IVs. Var(β̂j) is smallest when Rj² = 0, which happens only if xj has zero sample correlation with every other IV. This is the best case for estimating βj, but it rarely occurs.

Rj² = 1 is ruled out by MLR.3, because it means xj is a perfect linear combination of some of the other IVs.

The more realistic worry is Rj² 'close' to 1. High (but not perfect) correlation between two or more IVs is called multicollinearity.

Because multicollinearity violates none of our assumptions, it is hard to quantify as a 'problem'. There is no absolute number for what 'close to 1' means. A high Rj² does indicate that xj has a strong linear relationship with the other IVs, but whether this translates into a large Var(β̂j) also depends on SSTj and σ².

A small SSTj can produce a large Var(β̂j) too; note that dropping observations to reduce collinearity is counterproductive, as SSTj falls along with the sample size.

One thing is clear: ceteris paribus, for estimating βj it is better to have less correlation between xj and the other IVs. This makes the problem tricky in the social sciences, where we usually can only observe the data we are given. We can try dropping an IV, but that may lead to omitted variable bias.
Here is an example illustrating multicollinearity. Suppose we are interested in estimating the effects of various categories of school expenditure on student performance. It is likely that expenditures on teachers, materials, athletics, and so on are highly correlated: wealthier schools tend to spend more on everything. It is then difficult to estimate the effect of any particular expenditure category, because there is little variation in one category that cannot largely be explained by variation in the others; this leads to a high Rj² for each IV.

Such problems can be mitigated by collecting more data, although the question asked may simply be too subtle for the available data to answer. It might be more useful to change the scope of the analysis and lump all expenditures together.

Another important point is that high correlation among certain IVs says nothing about how well we can estimate the other parameters in the model. For example, consider

  y = β0 + β1 x1 + β2 x2 + β3 x3 + u

where x2 and x3 are highly correlated. Then Var(β̂2) and Var(β̂3) may be large, but this tells us nothing about Var(β̂1). In fact, if x1 is uncorrelated with x2 and x3, then R1² = 0 regardless of the correlation between x2 and x3. If β1 is the parameter of interest, we need not care about the correlation between x2 and x3.

This observation is important because economists often include many control variables in order to isolate the causal effect of a particular variable. High correlation among the controls does not make it harder to estimate the effect of the variable of interest.

Some researchers compute statistics intended to measure the severity of multicollinearity. Such statistics are easy to misuse, because we cannot specify how much correlation among the IVs is 'too much'; they may flag a 'problem' even when it involves only control variables whose coefficients we do not care about.

A somewhat useful, though still often misused, class of statistics applies to individual coefficients. The most common is the variance inflation factor (VIF), obtained directly from (3.51). The VIF for slope coefficient j is

  VIFj = 1 / (1 - Rj²)

which is precisely the term in Var(β̂j) determined by the correlation between xj and the other IVs. We can write Var(β̂j) as

  Var(β̂j) = (σ² / SSTj) · VIFj

which shows that VIFj is the factor by which Var(β̂j) is higher because xj is not uncorrelated with the other IVs. Because VIFj is a function of Rj², the preceding discussion can be cast entirely in terms of VIFj.
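statsmodels ships a helper for this; the sketch below builds a deliberately collinear design (invented data) and reports the VIF for each slope.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(7)
x1 = rng.normal(size=300)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=300)   # strongly collinear with x1
x3 = rng.normal(size=300)                    # unrelated to the others
X = sm.add_constant(np.column_stack([x1, x2, x3]))

# VIF_j = 1 / (1 - R_j^2); column 0 is the intercept, so we loop over the slopes.
for j in (1, 2, 3):
    print(j, variance_inflation_factor(X, j))  # large for x1 and x2, near 1 for x3
```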

If we had the choice, we would want to minimise VIFj, but we rarely have the choice. If we believe certain IVs must be included, we should be resistant to dropping them, and a high VIF cannot really affect that decision. And if we are interested in the effect of x1 on y, we should ignore the VIFs of the other coefficients.

Setting a cutoff value for the VIF is arbitrary and unhelpful, because the standard deviation of β̂j also depends on SSTj and σ. Looking at VIFj alone, like looking at Rj² alone, is of limited use, although one might do it out of curiosity.
3.4b Variances in Misspecified Models
The choice of whether to include a particular IV can be framed as a tradeoff between bias and variance. Having derived the bias from omitting a variable, we continue the analysis by comparing the variances of the OLS estimators.

Write the true population model, which satisfies the Gauss-Markov assumptions, as

  y = β0 + β1 x1 + β2 x2 + u

We consider two estimators of β1. The estimator β̂1 comes from the multiple regression

  ŷ = β̂0 + β̂1 x1 + β̂2 x2    (3.52)

while β̃1 comes from omitting x2:

  ỹ = β̃0 + β̃1 x1    (3.53)

When β2 ≠ 0, (3.53) excludes a relevant variable, and β̃1 is biased unless x1 and x2 are uncorrelated. If bias were the only criterion, β̂1 would always be preferred to β̃1, as β̂1 is unbiased for any value of β2.

This no longer holds when we bring variance into the analysis. Conditioning on the values of x1 and x2 in the sample, from (3.51) we have

  Var(β̂1) = σ² / [SST1 (1 - R1²)]    (3.54)

where SST1 is the total variation in x1 and R1² is the R-squared from regressing x1 on x2. Further, it can be shown that

  Var(β̃1) = σ² / SST1    (3.55)

Comparing the two, Var(β̃1) is always smaller than Var(β̂1) unless x1 and x2 are uncorrelated in the sample,

in which case the two estimators β̃1 and β̂1 are identical. Assuming that x1 and x2 are correlated, we have:
1. When β2 ≠ 0, β̃1 is biased, β̂1 is unbiased, and Var(β̃1) < Var(β̂1).
2. When β2 = 0, both β̃1 and β̂1 are unbiased, and Var(β̃1) < Var(β̂1).

From the second conclusion it is clear that β̃1 is preferred if β2 = 0: including x2 when it has no partial effect on y only exacerbates multicollinearity, leading to a less efficient estimator of β1. A higher variance for the estimator of β1 is the cost of including an irrelevant variable.

The case β2 ≠ 0 is more difficult, because leaving out x2 biases the estimator of β1. Traditionally, econometricians have suggested comparing the likely size of the bias with the reduction in variance when deciding whether to include x2.

When β2 ≠ 0, however, there are two reasons favouring the inclusion of x2. The most important is that the bias in β̃1 does not shrink as the sample size grows; it follows no particular pattern and is roughly the same for any sample size. On the other hand, both Var(β̂1) and Var(β̃1) shrink to zero as n gets large, so the multicollinearity induced by adding x2 becomes less important as the sample grows. In large samples, we would prefer β̂1.
The other reason for favouring β̂1 is more subtle. Essentially, (3.55) is too generous in measuring the precision of β̃1. Leaving out x2 when β2 ≠ 0 inflates the error variance, since the error becomes v = β2 x2 + u, but (3.55) ignores this increase because it treats both regressors as nonrandom.

3.4c Estimating σ²: Standard Errors of the OLS Estimators

We now show how to obtain an unbiased estimator of σ², which in turn gives an unbiased estimator of Var(β̂j).

Because σ² = E(u²), a natural 'estimator' is the sample average of the squared errors,

  n⁻¹ Σᵢ ui²
This is not a true estimator, because we do not observe the ui. Nevertheless, we can write the errors as

  ui = yi - β0 - β1 xi1 - β2 xi2 - ... - βk xik

which shows that we do not know the errors because we do not know the βj. When we replace each βj with its OLS estimator, we get the OLS residuals:

  ûi = yi - β̂0 - β̂1 xi1 - β̂2 xi2 - ... - β̂k xik

It seems natural to estimate σ² by replacing ui with ûi. In the SLR case we saw that this leads to a bias. The unbiased estimator of σ² in the general multiple regression case is

  σ̂² = (Σᵢ ûi²) / (n - k - 1) = SSR / (n - k - 1)    (3.56)
The term n - k - 1 is the degrees of freedom (df) for the general OLS problem with n observations and k IVs. Because there are k + 1 estimated parameters, we can write

  df = n - (k + 1)
     = (number of observations) - (number of estimated parameters)

Technically, the division by the df comes from the fact that

  E(SSR) = (n - k - 1) σ²
Intuitively, we can see why the df adjustment is needed by going back to the first order conditions for the OLS estimators:

  Σᵢ ûi = 0  and  Σᵢ xij ûi = 0,  j = 1, ..., k

Thus k + 1 restrictions are imposed on the OLS residuals: given n - (k + 1) of the residuals, the remaining k + 1 are determined. This is summarised in:

Theorem 3.3 (Unbiased Estimation of σ²)

Under the Gauss-Markov assumptions MLR.1-5, E(σ̂²) = σ².

The positive square root of σ̂², denoted σ̂, is called the standard error of the regression (SER). It is an estimator of the standard deviation of the error term. Regression packages usually report it, under different names (e.g. root MSE in Stata).

For constructing confidence intervals and conducting tests in Ch. 4, we will need to estimate the standard deviation of β̂j, which is just the square root of the variance:

  sd(β̂j) = σ / [SSTj (1 - Rj²)]^(1/2)

Because σ is unknown, we replace it with its estimator σ̂. This gives us the standard error of β̂j:

  se(β̂j) = σ̂ / [SSTj (1 - Rj²)]^(1/2)    (3.58)

Because se(β̂j) depends on σ̂, it has a sampling distribution, which will play a role in Ch. 4.
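The pieces of (3.56) and (3.58) can be computed by hand and checked against a regression package; a simulated sketch with invented coefficients:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n, k = 150, 2
x = rng.normal(size=(n, k))
y = 1 + 0.5 * x[:, 0] - 0.7 * x[:, 1] + rng.normal(size=n)
X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

sigma2_hat = np.sum(res.resid ** 2) / (n - k - 1)         # (3.56)
j = 1                                                     # standard error for the first slope
xj, others = X[:, j], np.delete(X, j, axis=1)
aux = sm.OLS(xj, others).fit()                            # regress x_j on the other IVs
sst_j = np.sum((xj - xj.mean()) ** 2)
se_j = np.sqrt(sigma2_hat / (sst_j * (1 - aux.rsquared))) # (3.58)
print(se_j, res.bse[j])                                   # matches the packaged standard error
```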

One thing should be emphasised about these standard errors. Because (3.58) is obtained from (3.51), and because (3.51) relies on homoskedasticity, (3.58) is not a valid estimator of sd(β̂j) if the errors exhibit heteroskedasticity.

Thus, while the presence of heteroskedasticity does not cause bias in the β̂j, it does lead to bias in the usual formula for Var(β̂j), which then invalidates the standard errors. Chapter 8 discusses methods for dealing with heteroskedasticity.

For some purposes it is helpful to write

  se(β̂j) = σ̂ / [√n · sd(xj) · (1 - Rj²)^(1/2)]    (3.59)

where sd(xj) = [n⁻¹ Σᵢ (xij - x̄j)²]^(1/2) is the sample standard deviation in which SSTj is divided by n rather than n - 1. The importance of (3.59) is that it shows how n directly affects the standard error: the other three terms, Rj², σ̂, and sd(xj), settle down to constants as n grows, so the precision of β̂j increases with the sample size. In contrast, unbiasedness holds for any value of n. We discuss this further in Ch. 5.

3.5 Efficiency of OLS: The Gauss-Markov Theorem

We now discuss the important Gauss-Markov Theorem, which justifies using OLS rather than any of a variety of competing estimators. We already know that OLS is unbiased, but many other estimators share that property.

The Gauss-Markov Theorem shows that OLS also has the smallest variance within a certain class: it is the best linear unbiased estimator (BLUE). To state the theorem, we discuss each component of the acronym.

First, an estimator is a rule that can be applied to any sample of data to produce an estimate. We also know what an unbiased estimator is.

'Linear', in the context of an estimator, means that an estimator β̃j of βj is linear if and only if it can be expressed as a linear function of the data on the DV:

  β̃j = Σᵢ wij yi    (3.60)

where each wij can be a function of the sample values of all the IVs. The OLS estimators are linear, as can be seen from (3.22).

Finally, 'best' under the current theorem means smallest variance, which is always preferable when comparing two unbiased estimators.

The Gauss-Markov Theorem states that, for any estimator β̃j that is linear and unbiased, Var(β̂j) ≤ Var(β̃j): in the class of linear unbiased estimators, OLS has the smallest variance. It says even more: if we want to estimate any linear function of the βj, the corresponding linear combination of the OLS estimators achieves the smallest variance. We conclude with the theorem:
Theorem 3.4 (Gauss-Markov Theorem)

Under assumptions MLR.1-5, the OLS estimators β̂0, β̂1, ..., β̂k are the best linear unbiased estimators of β0, β1, ..., βk.

The theorem justifies the use of OLS to estimate the MLR model. If any of the assumptions fails, then so does the theorem.

If the ZCM assumption fails, OLS is biased. The presence of heteroskedasticity (failure of MLR.5) does not bias OLS, but OLS no longer has the smallest variance. In Ch. 8 we analyse estimators that improve on OLS when heteroskedasticity is present.

3.6 Some Comments on the Language of Multiple Regression Analysis

It is common to see reports that researchers 'estimated an OLS model'. Although understandable, this phrasing is wrong, and not only aesthetically.

The first point is that OLS is an estimation method, not a model. A model describes an underlying population and depends on unknown parameters. The linear model we have been studying can be written in the population as

  y = β0 + β1 x1 + β2 x2 + ... + βk xk + u    (3.61)

where the parameters are the βj. Importantly, we can talk about the meaning of the βj without any data. We may not be able to learn much without data, but the interpretation of the βj comes from (3.61).

Once a sample has been acquired, we can estimate the parameters. OLS is just one of various estimation methods. Under MLR.1-5, OLS is preferred, but different assumptions call for different estimators: a few examples are weighted least squares (Ch. 8), least absolute deviations (Ch. 9), and instrumental variables (Ch. 15).

We therefore need to state which assumptions we are making when presenting estimates; simply saying we 'estimated an OLS model' is not enough, because different assumptions lead to different estimators.

The ideal approach is this. Write an equation like (3.61) with easy-to-read variable names, e.g.

  math4 = β0 + β1 classize4 + β2 math3 + β3 log(income) + β4 motheduc + β5 fatheduc + u    (3.62)

if we are trying to explain outcomes on a fourth-grade math test. Then one discusses whether it is reasonable to maintain MLR.4, focusing on the other factors that might affect math4. Next, one describes the sampling method, ideally random sampling, as well as the OLS estimates.

A proper way to introduce the estimates is:

'I estimated (3.62) by OLS. Under the assumption that no important variables have been omitted from the equation, and assuming random sampling, the OLS estimator of the class size effect, β1, is unbiased.'

We will see in Ch. 4 and 5 that there is more to add. We might also caveat that we may not have controlled for enough variables, which could bias the OLS estimates.

3.7 Several Scenarios for Applying Multiple Regression

We now discuss scenarios in which the unbiasedness of OLS can be established. In particular, we try to verify MLR.1 and MLR.4, as these are the key population assumptions; MLR.2 and MLR.3 are rarely a concern.

MLR.1, where u enters additively, is always subject to criticism, although it is not too restrictive, because we can transform both the IVs and the DV. It is always a good starting point, and the functional form issue is not critical here.

3.7a Prediction
Sometimes we are interested in a pure prediction exercise. Suppose a college admissions officer wants to predict colGPA using the information available at the time of application, the IVs x1, ..., xk.

The best predictor of y, as measured by mean squared error, is the conditional expectation E(y | x1, ..., xk). If we assume the conditional expectation is linear, then

  E(y | x1, x2, ..., xk) = β0 + β1 x1 + ... + βk xk

which is the same as writing

  y = β0 + β1 x1 + ... + βk xk + u,  E(u | x1, ..., xk) = 0

MLR.4 is true by construction once we assume linearity. If we have random sampling and can rule out perfect collinearity, we can obtain unbiased estimators by OLS.

We can also ask which xj is the most important predictor. The next chapter discusses which IVs to include.

3.7b Efficient Markets
In economics, efficient markets theories imply that a single variable acts as a sufficient statistic for predicting the outcome variable y. For emphasis, call this special predictor w. Then, given other factors x1, ..., xk, we want to test the assumption

  E(y | w, x1, ..., xk) = E(y | w)    (3.63)

We can test this using a linear model for E(y | w, x):

  E(y | w, x1, ..., xk) = β0 + β1 w + γ1 x1 + ... + γk xk    (3.64)

where the change in notation marks the special status of w. Ch. 4 discusses how to test whether all the γj are zero.

Many efficient markets theories imply more than (3.63). In addition, typically

  E(y | w) = w

which means that in (3.64), β0 = 0 and β1 = 1. We will learn how to test such restrictions in Ch. 4.

As a specific example, take the sports betting market. It produces a point spread, w = spread, which is determined before a game; it varies a bit in the days preceding the game but essentially settles on some value. The actual score differential in the game is y = scorediff. Efficiency in the gambling market implies that

  E(scorediff | spread, x1, ..., xk) = E(scorediff | spread) = spread

where x contains any observable variables. The idea is that, because so much money is involved, the spread moves until it incorporates all relevant information.

MLR can be used to test this, because MLR.4 holds once we assume a linear model:

  y = β0 + β1 w + γ1 x1 + ... + γk xk + u    (3.66)
  E(u | w, x1, ..., xk) = 0    (3.67)

3.7c Measuring the Tradeoff between Two Variables

Sometimes regression models are used not to predict but to measure how an economic agent trades off one variable for another; call these variables y and w. For example, in the population of teachers in the US, let y be annual salary and w be a measure of pension compensation. If teachers are indifferent, a one-dollar increase in pension should be associated with a one-dollar fall in salary: only total compensation matters. This is a ceteris paribus question about how a teacher trades one for the other, not simply a question of the correlation between the two variables.

Because we are measuring a tradeoff, it does not matter which variable plays the role of y and which plays w; functional form, however, can come into play, as seen in Example 4.10.

After w has been chosen, we are interested in E(y | w, x). Assuming a linear model, we again have (3.66) and (3.67). A key difference from the efficient markets setting is that, if x properly controls for differences across individuals, the theory of a one-for-one tradeoff is β1 = -1, with no restriction on β0.

We include the xj to control for differences among individuals; we do not expect all the γj to be zero, and we are not interested in testing a hypothesis like (3.63).

If x does not include enough controls, the estimator will be biased, tantamount to the omitted variables problem. For example, we may not have a suitable measure of risk aversion.
3.7d Testing for Ceteris Paribus Group Differences
Another use of regression is to test for differences among groups once we account for other factors. In Section 2.7 we discussed estimating the relationship between wage and race; to this end, define a binary variable white. We noted there that finding a difference does not by itself indicate discrimination, as there are other contributing factors. Let x1, x2, ..., xk denote other observable factors that can affect wage; then we are interested in

  E(wage | white, x1, x2, ..., xk)

If we have accounted for all the factors that should affect productivity, then remaining wage differences may be attributed to discrimination. In the simplest case we use the linear model

  E(wage | white, x1, ..., xk) = β0 + β1 white + γ1 x1 + ... + γk xk    (3.68)

where we are primarily interested in the coefficient β1, which measures the difference between whites and nonwhites at the same levels of the control variables.

For a general y and w we again have (3.66) and (3.67), so MLR.4 holds and OLS is unbiased. The main threat is still the omitted variables problem.

3.7e Potential Outcomes, Treatment Effects, and Policy Analysis

We introduced the simple regression model in 2.7a in the context of a binary policy intervention. We change the notation slightly here, using w to denote the binary intervention or policy indicator.

We imagine the potential outcomes y(0) and y(1). If we assume a constant treatment effect, say τ, then for any unit i we can write

  yi(1) = τ + yi(0)

When the treatment effect can vary by i, the average treatment effect is

  τate = E[y(1) - y(0)]

where the expectation is taken over the entire population.
where the expectation is taken over the entire population
For a random draw i, the outcome we observe, yi, can be written as

  yi = (1 - wi) yi(0) + wi yi(1)    (3.69)

One of the important conclusions from 2.7a is that the simple regression of y on w is an unbiased estimator of τate only if we have random assignment of w:

  w is independent of (y(0), y(1))    (3.70)

Random assignment is still rare in the social sciences, as true experiments are rare. But if we can control for variables that predict the outcome and determine assignment to the control and treatment groups, we can use multiple regression.

Consider the following assumption:

  w is independent of (y(0), y(1)), conditional on x    (3.71)

This assumption is called conditional independence, and it matters greatly which variables are included in x. In the treatment effects literature, (3.71) is also called unconfounded assignment, or unconfoundedness conditional on x; 'ignorable assignment' and 'ignorability' are also used.

Assumption (3.71) has a simple interpretation: think of partitioning the population based on the observed variables in x. Consider the job training programme from 2.7a: w indicates whether a worker participates in a job training programme, and y is an outcome such as labour income.

The elements of x include educ, age, exper, and so on. Suppose that workers are more likely to participate if they have lower educ, age, and exper. Because these variables are very likely to predict y(0) and y(1), random assignment does not hold. Nevertheless, once we group people by education, age, and prior work history, it is possible that assignment is random within each group.

As a concrete example, consider the group of people with 12 years of schooling, aged 35, with a given average level of earnings over the past two years. What (3.71) requires is that assignment to treatment and control within this group is random.

The more variables we observe prior to the implementation of the programme, the more plausible (3.71) becomes. If we observe no information in x, we are back to assuming pure random assignment.

To use (3.71) in multiple regression, we consider here only the case of a constant treatment effect τ; Section 7.6 considers the more general case. Then, in the population,

  y = y(0) + τ w

and

  E(y | w, x) = E(y(0) | w, x) + τ w = E(y(0) | x) + τ w    (3.72)

where the second equality follows from conditional independence. Now assume that E(y(0) | x) is linear,

  E(y(0) | x) = α + xγ

Plugging in gives

  E(y | w, x) = α + τ w + xγ
             = α + τ w + γ1 x1 + ... + γk xk    (3.73)

We are primarily interested in the coefficient on w, which we have called τ. The γj are of interest for logical consistency checks (we expect more educ to lead to higher earnings on average, for example), but the main role of the xj is as controls.
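The logic of (3.72)-(3.73), where regressing y on w and x recovers τ when assignment depends only on x, can be seen in a simulation; all numbers below are invented.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 5000
x = rng.normal(size=n)                       # observed control
p = 1 / (1 + np.exp(2 * x))                  # lower x -> higher chance of treatment
w = rng.binomial(1, p)                       # assignment depends on x only, as in (3.71)
y0 = 2 + 3 * x + rng.normal(size=n)          # untreated potential outcome
y = y0 + 1.5 * w                             # constant treatment effect tau = 1.5

naive = y[w == 1].mean() - y[w == 0].mean()  # simple comparison of means: badly biased
adj = sm.OLS(y, sm.add_constant(np.column_stack([w, x]))).fit().params[1]
print(naive, adj)  # naive is far from 1.5; the regression-adjusted estimate recovers it
```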

Example 3.7: Evaluating a Job Training Programme

Using the dataset JTRAIN98, we would like to explain earn98, labour earnings in 1998, using train, a binary participation indicator. Participation in the job training programme was partly based on past labour market outcomes and was partly voluntary, so random assignment is not a good assumption. For controls we use earn96, educ, age, and married.

The simple regression estimates are

  earn98^ = 10.61 - 2.05 train    (3.74)
  n = 1,130, R² = 0.016

The coefficient says that those who participated in the programme earned $2,050 less than those who did not; the average for those who did not participate is the intercept, $10,610 (earnings are measured in thousands of dollars).

Without random assignment, it is highly likely that the negative coefficient, and its large magnitude, is a product of nonrandom assignment: men with poor earnings histories are more likely to be selected, or more likely to be eligible. Once we add the controls,

  earn98^ = 4.67 + 2.41 train + 0.373 earn96 + 0.363 educ - 0.181 age + 2.48 married    (3.75)
  n = 1,130, R² = 0.405

the change in the coefficient on train is remarkable: we now estimate that participation increases earnings by $2,410 once the control variables are taken into account.

The signs of the other coefficients are not surprising: past and current earnings are positively related, workers with more educ earn more, and married men tend to earn more. The R² still leaves much of the variation unexplained, but the controls clearly matter.

Summary
1. The MLR model allows us to effectively hold other factors fixed while examining the effect of one IV on the DV. It explicitly allows the IVs to be correlated.

2. Although the model is linear in parameters, it can capture nonlinear relationships by appropriate choice of the IVs and the DV.

3. OLS is easily applied to estimate the MLR model. Each slope estimate measures the partial effect of the corresponding IV on the DV, holding all other IVs fixed.

4. R² is the proportion of the sample variation in the DV explained by the IVs. Do not put too much emphasis on it when evaluating econometric models.

5. Under MLR.1-4, the OLS estimators are unbiased. Including an irrelevant variable has no effect on unbiasedness; omitting a relevant variable, however, generally causes bias.

6. Under MLR.1-5, the variance of an OLS slope estimator is

  Var(β̂j) = σ² / [SSTj (1 - Rj²)]

As σ² increases, so does Var(β̂j); as SSTj increases, Var(β̂j) decreases. Rj² measures the collinearity between xj and the other IVs; as Rj² → 1, Var(β̂j) is unbounded.

7. Adding an irrelevant variable generally increases the variances of the remaining OLS estimators because of multicollinearity.

8. Under MLR.1-5, the OLS estimators are the best linear unbiased estimators (BLUEs).

9. Section 3.7 discusses various applications of MLR; we will see more examples later.

10. We will begin to use the standard errors of the OLS coefficients to compute confidence intervals for the population parameters and to obtain test statistics for testing hypotheses about them.
