03 - Causality PDF

The document discusses a lecture on causality in empirical finance methods. It reviews potential biases like omitted variable bias, measurement error bias, and simultaneity bias that can invalidate causal inferences. It also covers ways to address omitted variable bias, such as adding observable omitted variables as controls to the regression.

FNCE 926

Empirical Methods in CF
Lecture 3 – Causality

Professor Todd Gormley


Announcement
n  You should have uploaded Exercise #1
and your DO files to Canvas already
n  Exercise #2 due two weeks from today

2
Background readings for today
n  Roberts-Whited
q  Section 2
n  Angrist and Pischke
q  Section 3.2
n  Wooldridge
q  Sections 4.3 & 4.4
n  Greene
q  Sections 5.8-5.9

3
Outline for Today
n  Quick review
n  Motivate why we care about causality
n  Describe three possible biases & some
potential solutions
q  Omitted variable bias
q  Measurement error bias
q  Simultaneity bias

n  Student presentations of "Classics #2"

4
Quick Review [Part 1]
n  Why is adding irrelevant regressors a
potential problem?
q  Answer = It can inflate standard errors if the
irrelevant regressors are highly collinear with
variable of interest

n  Why is a larger sample helpful?


q  Answer = It gives us more variation in x,
which helps lower our standard errors

5
Quick Review [Part 2]
n  Suppose, β1 < 0 and β3 > 0 … what is the
sign of the effect of an increase in x1 for the
average firm in the below estimation?
y = β 0 + β1 x1 + β 2 x2 + β3 x1 x2 + u
q  Answer: It is the sign of
dy x2 = x2
| = β1 + β3 x2
dx1

6
Quick Review [Part 3]
n  How could we make the coefficients easier
to interpret in the prior example?
q  Shift all the variables by subtracting out their
sample mean before doing the estimation
q  It will allow the non-interacted coefficients to be
interpreted as effect for average firm

7
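The demeaning trick above can be verified with a short simulation (a sketch using numpy; the coefficient values and variable names are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(2.0, 1.0, n)
x2 = rng.normal(5.0, 2.0, n)
# True model: y = 1 - 0.5*x1 + 0.3*x2 + 0.2*x1*x2 + u
y = 1.0 - 0.5 * x1 + 0.3 * x2 + 0.2 * x1 * x2 + rng.normal(0, 0.1, n)

def ols(cols, y):
    # OLS with an intercept; returns [const, slopes...]
    X = np.column_stack([np.ones(len(y))] + cols)
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Raw regression: coefficient on x1 is the effect only when x2 = 0
b_raw = ols([x1, x2, x1 * x2], y)

# Demeaned regression: coefficient on x1 is the effect for the average firm
x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
b_dem = ols([x1c, x2c, x1c * x2c], y)

# Same quantity computed by hand from the raw coefficients
effect_at_mean = b_raw[1] + b_raw[3] * x2.mean()
```

Because demeaning is just a reparametrization of the same regression, the non-interacted slope in the demeaned fit equals β1 + β3·x̄2 exactly; here that is roughly −0.5 + 0.2·5 = 0.5.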
Quick Review [Part 4]
n  Consider the following estimate:
ln(wage) = 0.32 − 0.11·female + 0.21·married
− 0.30·(female × married) + 0.08·education

q  Question: How much lower are wages of married


and unmarried females after controlling for
education, and who is this relative to?
n  Answer = unmarried females make 11% less than single
males; married females make –11%+21%–30%=20% less

8
Outline for Today
n  Quick review
n  Motivate why we care about causality
n  Describe three possible biases & some
potential solutions
q  Omitted variable bias
q  Measurement error bias
q  Simultaneity bias

n  Student presentations of "Classics #2"

9
Motivation
n  As researchers, we are interested in
making causal statements
q  Ex. #1 – what is the effect of a change in
corporate taxes on firms' leverage choice?
q  Ex. #2 – what is the effect of giving a CEO
more stock ownership in the firm on the
CEO's desire to take on risky investments?

n  I.e. we don't like to just say variables are


'associated' or 'correlated' with each other

10
What do we mean by causality?
n  Recall from earlier lecture, that if our linear
model is the following…

y = β 0 + β1 x1 + ... + β k xk + u

n  And, if we want to infer β1 as the causal
effect of x1 on y, holding all else equal, then
we need to make the following assumptions…

11
The basic assumptions
n  Assumption #1: E(u) = 0
n  Assumption #2: E(u|x1,…,xk) = E(u)
q  In words, average of u (i.e. unexplained portion
of y) does not depend on value of x
q  This is "conditional mean independence" (CMI)
n  Generally speaking, you need the estimation
error to be uncorrelated with all the x's

12
Tangent – CMI versus correlation
n  CMI (which implies x and u are
uncorrelated) is needed for unbiasedness
[which is again a finite sample property]
n  But, we only need to assume a zero
correlation between x and u for consistency
[which is a large sample property]
q  This is why I'll typically just refer to whether
u and x are correlated in my test of whether
we can make causal inferences

13
Three main ways this will be violated
n  Omitted variable bias
n  Measurement error bias
n  Simultaneity bias

n  Now, let's go through each in turn…

14
Omitted variable bias (OVB)
n  Probably the most common concern you
will hear researchers worry about
n  Basic idea = the estimation error, u,
contains other variable, e.g. z, that affects
y and is correlated with an x
q  Please note! The omitted variable is only
problematic if correlated with an x

15
OVB more formally, with one variable
n  You estimate: y = β 0 + β1 x + u
n  But, true model is: y = β 0 + β1 x + β 2 z + v

n  Then, βˆ1 = β1 + δ xz β2 , where δ xz is the


coefficient you'd get from regressing the
omitted variable, z, on x; and

cov( x, z )
δ xz =
var( x)

16
Interpreting the OVB formula
β̂1 = β1 + β2·cov(x, z)/var(x)

[Effect of x on y: β1; effect of z on y: β2;
regression of z on x: cov(x, z)/var(x);
the second term is the bias]

n  Easy to see, estimated coefficient is only unbiased
if cov(x, z) = 0 [i.e. x and z are uncorrelated] or z
has no effect on y [i.e. β2 = 0]

17
Direction and magnitude of the bias
β̂1 = β1 + β2·cov(x, z)/var(x)

n  Direction of bias given by signs of β2, cov(x, z)


q  E.g. If know z has positive effect on y [i.e. β2 > 0]
and x and z are positively correlated [cov(x, z) > 0],
then the bias will be positive

n  Magnitude of the bias will be given by


magnitudes of β2, cov(x, z)/var(x)

18
Example – One variable case
n  Suppose we estimate: ln( wage) = β 0 + β1educ + w
n  But, true model is:
ln( wage) = β 0 + β1educ + β 2 ability + u

n  What is likely bias on β̂1 ? Recall,

β̂1 = β1 + β2·cov(educ, ability)/var(educ)

19
Example – Answer
q  Ability & wages likely positively correlated, so β2 > 0
q  Ability & education likely positive correlated, so
cov(educ, ability) > 0
q  Thus, the bias is likely to positive! βˆ is too big!
1

20
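The OVB formula on the prior slides can be confirmed by simulation; a sketch where ability raises both wages and schooling (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta1, beta2 = 0.08, 0.05      # returns to educ and ability (illustrative)
ability = rng.normal(0, 1, n)
educ = 12 + 2 * ability + rng.normal(0, 2, n)  # educ rises with ability
log_wage = 1.5 + beta1 * educ + beta2 * ability + rng.normal(0, 0.3, n)

def slope(x, y):
    # Simple-regression slope: cov(x, y) / var(x)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

b1_hat = slope(educ, log_wage)       # short regression, omitting ability
delta = slope(educ, ability)         # regression of omitted z on x
predicted = beta1 + delta * beta2    # the OVB formula
```

The short-regression slope matches β1 + δxz·β2, and since β2 > 0 and cov(educ, ability) > 0, the estimate is biased upward, exactly as the slide argues.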
OVB – General Form
n  Once move away from simple case of just one
omitted variable, determining sign (and
magnitude) of bias will be a lot harder
q  Let β be vector of coefficients on k included variables
q  Let γ be vector of coefficient on l excluded variables
q  Let X be matrix of observations of included variables
q  Let Z be matrix of observations of excluded variables

β̂ = β + (E[X'X])⁻¹(E[X'Z])γ

21
OVB – General Form, Intuition
β̂ = β + (E[X'X])⁻¹(E[X'Z])γ

[β̂ is the vector of regression coefficients;
γ is the vector of partial effects of the
excluded variables]

n  Same idea as before, but more complicated


n  Frankly, this can be a real mess!
[See Gormley and Matsa (2014) for example with
just two included and two excluded variables]

22
Eliminating Omitted Variable Bias

n  How we try to get rid of this bias will


depend on the type of omitted variable
q  Observable omitted variable
q  Unobservable omitted variable

How can we deal with an


observable omitted variable?

23
Observable omitted variables
n  This is easy! Just add them as controls
q  E.g. if the omitted variable, z, in my simple case
was 'leverage', then add leverage to regression
n  A functional form misspecification is a special
case of an observable omitted variable
Let's now talk about this…

24
Functional form misspecification
n  Assume true model is…
y = β0 + β1x1 + β2x2 + β3x2² + u

n  But, we omit the squared term, x2²
q  Just like any OVB, bias on (β0, β1, β2) will
depend on β3 and correlations among (x1, x2, x2²)
q  You get same type of problem if have incorrect
functional form for y [e.g. it should be ln(y) not y]
n  In some sense, this is a minor problem… Why?

25
Tests for correct functional form
n  You could add additional squared and
cubed terms and look to see whether
they make a difference and/or have
non-zero coefficients
n  This isn't as easy when the possible
models are not nested…

26
Non-nested functional form issues…
n  Two non-nested examples are:
y = β 0 + β1 x1 + β 2 x2 + u Let's use this
versus example and
y = β 0 + β1 ln( x1 ) + β 2 ln( x2 ) + u see how we can
try to figure out
which is right
y = β 0 + β1 x1 + β 2 x2 + u
versus
y = β 0 + β1 x1 + β 2 z + u

27
Davidson-MacKinnon Test [Part 1]
n  To test which is correct, you can try this…
q  Take fitted values, ŷ , from 1st model and add them
as a control in 2nd model
y = β0 + β1 ln( x1 ) + β2 ln( x2 ) + θ1 yˆ + u
q  Look at t-stat on θ1; if significant rejects 2nd model!
q  Then, do reverse, and look at t-stat on θ1 in
y = β 0 + β1 x1 + β 2 x2 + θ1 yˆˆ + u
where ŷˆ is predicted value from 2nd model… if
significant then 1st model is also rejected L

28
Davidson-MacKinnon Test [Part 2]
n  Number of weaknesses to this test…
q  A clear winner may not emerge
n  Both might be rejected
n  Both might be accepted [If this happens, you can
use the R2 to choose which model is a better fit]
q  And, rejecting one model does NOT imply
that the other model is correct L

29
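The Davidson-MacKinnon procedure can be sketched in numpy as follows (the data-generating process and all names are illustrative; here the log specification is assumed to be the true model):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
x1 = rng.uniform(1.0, 10.0, n)
# Assumed true model for this sketch: y is linear in log(x1)
y = 1.0 + 2.0 * np.log(x1) + rng.normal(0, 0.5, n)

def ols(cols, y):
    # Returns coefficients and t-statistics (homoskedastic SEs)
    X = np.column_stack([np.ones(len(y))] + cols)
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    return b, b / se

# Model A (levels) and Model B (logs), each fit on its own
bA, _ = ols([x1], y)
yhat_A = bA[0] + bA[1] * x1
bB, _ = ols([np.log(x1)], y)
yhat_B = bB[0] + bB[1] * np.log(x1)

# Add fitted values from B into A: a significant theta rejects model A
_, tA = ols([x1, yhat_B], y)
t_reject_A = tA[2]
# Add fitted values from A into B: theta should add little here
_, tB = ols([np.log(x1), yhat_A], y)
t_reject_B = tB[2]
```

With the log model true, the fitted values from the log model are strongly significant inside the levels model (rejecting it), while the reverse addition contributes much less.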
Bottom line advice on functional form
n  Practically speaking, you hope that changes
in functional form won't effect coefficients
on key variables very much…
q  But, if it does… You need to think hard about
why this is and what the correct form should be
q  The prior test might help with that…

30
Eliminating Omitted Variable Bias

n  How we try to get rid of this bias will


depend on the type of omitted variable
q  Observable omitted variable
q  Unobservable omitted variable

Unobservable are much harder to deal with,


but one possibility is to find a proxy variable

31
Unobserved omitted variables
n  Again, consider earlier estimation

ln( wage) = β 0 + β1educ + β 2 ability + u

q  Problem: we don't observe & can't measure ability


q  What can we do? Ans. = Find a proxy variable that
is correlated with the unobserved variable, E.g. IQ

32
Proxy variables [Part 1]
n  Consider the following model…
y = β 0 + β1 x1 + β 2 x2 + β3 x3* + u
where x3* is unobserved, but we have proxy x3
n  Then, suppose x3* = δ0 + δ1x3 + v
q  v is error associated with proxy's imperfect
representation of unobservable x3
q  Intercept just accounts for different scales
[e.g. ability has different average value than IQ]

33
Proxy variables [Part 2]
n  If we are only interested in β1 or β2, we can just
replace x3* with x3 and estimate
y = β 0 + β1 x1 + β 2 x2 + β3 x3 + u

n  But, for this to give us consistent estimates of β1


and β2 , we need to make some assumptions
#1 – We've got the right model, and
#2 – Other variables don't explain unobserved
variable after we've accounted for our proxy

34
Proxy variables – Assumptions
#1 – E(u | x1, x2, x3*) = 0; i.e. we have the right
model and x3 would be irrelevant if we could
control for x1, x2, x3*, such that E(u | x3) = 0
q  This is a common assumption; not controversial

#2 – E(v | x1, x2, x3) = 0; i.e. x3 is a good proxy
for x3* such that, after controlling for x3, x3*
doesn't depend on x1 or x2
q  I.e. E(x3* | x1, x2, x3) = E(x3* | x3)

35
Why the proxy works…
n  Recall true model: y = β 0 + β x
1 1 + β x
2 2 + β 3 3 +u
x *

n  Now plug-in for x3*, using x3* = δ 0 + δ1 x3 + v

y = ( β0 + β3δ 0 ) + β1x1 + β 2 x2 + ( β3δ 1 ) x3 + ( u + β3v )


!#"#$ % ! #"# $
α0 α1 e

q  Prior assumptions ensure that E (e | x1 , x2 , x3 ) = 0


such that the estimates of (α 0 , β1 , β 2 , α1 ) are consistent
q  Note: β0 and β3 are not identified

36
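A simulation can illustrate the result above: regressing on the proxy x3 gives consistent estimates of β1 and β2, while the slope on the proxy recovers β3δ1 rather than β3 (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
b0, b1, b2, b3 = 1.0, 0.5, -0.3, 0.8
d0, d1 = 2.0, 1.5

x1 = rng.normal(0, 1, n)
x2 = rng.normal(0, 1, n)
x3 = rng.normal(0, 1, n)                        # the proxy (e.g. IQ)
# Unobservable (e.g. ability); error v independent of x1, x2, x3,
# which is exactly proxy assumption #2
x3_star = d0 + d1 * x3 + rng.normal(0, 1, n)
y = b0 + b1 * x1 + b2 * x2 + b3 * x3_star + rng.normal(0, 1, n)

# Estimate with the proxy in place of the unobservable
X = np.column_stack([np.ones(n), x1, x2, x3])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
# coef[1], coef[2] recover b1, b2; coef[3] recovers b3*d1, not b3
```

Note the slope on the proxy converges to β3δ1 = 0.8·1.5 = 1.2, consistent with the slide's point that β3 itself is not identified.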
Proxy assumptions are key [Part 1]
n  Suppose assumption #2 is wrong such that
x = δ 0 + δ 1x3 + γ 1x1 + γ 2 x2 + w
*
3
!##"## $
v
where E ( w | x1 , x2 , x3 ) = 0

q  If above is true, E (v | x1 , x2 , x3 ) ≠ 0, and if you


substitute into model of y, you'd get…

37
Proxy assumptions are key [Part 2]
n  Plugging in for x3*, you'd get
y = α0 + α1x1 + α2x2 + α3x3 + e
where α0 = β0 + β3δ0
      α1 = β1 + β3γ1
      α2 = β2 + β3γ2
      α3 = β3δ1

q  E.g. α1 captures effect of x1 on y, β1, but also
its correlation with the unobserved variable

n  We'd get consistent estimates of (α0, α1, α2, α3)…
But that isn't what we want!

38
Proxy variables – Example #1
n  Consider earlier wage estimation
ln( wage) = β 0 + β1educ + β 2 ability + u

q  If use IQ as proxy for unobserved ability, what


assumption must we make? Is it plausible?
n  Answer: We assume E (ability | educ, IQ) = E (ability | IQ) ,
i.e. average ability does not change with education after
accounting for IQ… Could be questionable assumption!

39
Proxy variables – Example #2
n  Consider Q-theory of investment
investment = β 0 + β1Q + u

q  Can we estimate β1 using a firm's market-to-book


ratio (MTB) as proxy for Q? Why or why not?
n  Answer: Even if we believe this is the correct model
(Assumption #1) or that Q only depends on MTB
(Assumption #2), e.g. Q=δ0+δ1MTB, we are still not
getting estimate of β1… see next slide for the math

40
Proxy variables – Example #2 [Part 2]
n  Even if assumptions held, we'd only be getting
consistent estimates of
investment = α 0 + α1Q + e
where α 0 = β0 + β1δ 0
α1 = β1δ1

q  While we can't get β1, is there something we can


get if we make assumptions about sign of δ1?
q  Answer: Yes, the sign of β1

41
Proxy variables – Summary

n  If the coefficient on the unobserved variable


isn't what we are interested in, then a proxy
for it can be used to identify and remove
OVB from the other parameters
q  Proxy can also be used to determine sign of
coefficient on unobserved variable

42
Random Coefficient Model
n  So far, we've assumed that the effect of x on y
(i.e. β) was the same for all observations
q  In reality, this is unlikely true; model might look
more like yi = αi + βi xi + ui , where
α i = α + ci I.e. each observation's
βi = β + di relationship between x
and y is slightly different
E (ci ) = E (d i ) = 0

q  α is the average intercept and β is what we call the


"average partial effect" (APE)

43
Random Coefficient Model [Part 2]
n  Regression would seem to be incorrectly
specified, but if willing to make assumptions,
we can identify the APE
q  Plug in for α and β:
yi = α + βxi + (ci + dixi + ui)
[If you like, can think of the unobserved
differential intercepts and slopes as
omitted variables]
q  Identification requires
E(ci + dixi + ui | x) = 0
What does this imply?

44
Random Coefficient Model [Part 3]
n  This amounts to requiring
E ( ci | x ) = E ( ci ) = 0 ⇒ E (α i | x ) = E (α i )
E ( di | x ) = E ( di ) = 0 ⇒ E ( βi | x ) = E ( βi )

q  We must assume that the individual slopes and


intercepts are mean independent (i.e. uncorrelated
with the value of x) in order to estimate the APE
n  I.e. knowing x, doesn't help us predict the
individual's partial effect

45
Random Coefficient Model [Part 4]
n  Implications of APE
q  Be careful interpreting coefficients when
you are implicitly arguing elsewhere in paper
that effect of x varies across observations
n  Keep in mind the assumption this requires
n  And, describe results using something like…
"we find that, on average, an increase in x
causes a β change in y"

46
Three main ways this will be violated
n  Omitted variable bias
n  Measurement error bias
n  Simultaneity bias

47
Measurement error (ME) bias
n  Estimation will have measurement error whenever
we measure the variable of interest imprecisely
q  Ex. #1: Altman-z-score is noisy measure of default risk
q  Ex. #2: Avg. tax rate is noisy measure of marg. tax rate

n  Such measurement error can cause bias, and


the bias can be quite complicated

48
Measurement error vs. proxies
n  Measurement error is similar to proxy variable,
but very different conceptually
q  Proxy is used for something that is entirely
unobservable or measureable (e.g. ability)
q  With measurement error, the variable we don't
observe is well-defined and can be quantified… it's
just that our measure of it contains error

49
ME of Dep. Variable [Part 1]
n  Usually not a problem (in terms of bias); just
causes our standard errors to be larger. E.g. …
q  Let y* = β0 + β1x1 + ... + βkxk + u

q  But, we measure y* with error, e = y − y*

q  Because we only observe y, we estimate
y = β0 + β1x1 + ... + βkxk + (u + e)

Note: we always assume E(e) = 0; this
is innocuous because, if untrue, it
only affects the bias on the constant

50
ME of Dep. Variable [Part 2]
n  As long as E(e|x)=0, the OLS estimates
are consistent and unbiased
q  I.e. as long as the measurement error of y is
uncorrelated with the x's, we're okay
q  Only issue is that we get larger standard errors
when e and u are uncorrelated [which is what
we typically assume] because Var(u+e)>Var(u)

What are some common examples of ME?

51
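The claim above (measurement error in the dependent variable leaves coefficients unbiased but inflates standard errors) can be checked with a quick simulation; all parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
x = rng.normal(0, 1, n)
y_star = 2.0 + 0.7 * x + rng.normal(0, 1.0, n)  # true dependent variable
e = rng.normal(0, 1.0, n)                       # ME with E(e|x) = 0
y = y_star + e                                  # what we actually observe

def slope_and_se(x, y):
    # OLS slope and its (homoskedastic) standard error
    X = np.column_stack([np.ones(len(y)), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    return b[1], se[1]

b_star, se_star = slope_and_se(x, y_star)
b_obs, se_obs = slope_and_se(x, y)
# Slope is still ~0.7 either way, but with Var(u + e) = 1 + 1 = 2
# the standard error grows by roughly sqrt(2)
```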
ME of Dep. Variable [Part 3]
n  Some common examples
q  Market leverage – typically use book value
of debt because market value hard to observe
q  Firm value – again, hard to observe market
value of debt, so we use book value
q  CEO compensation – value of options are
approximated using Black-Scholes

Is assuming e and x are uncorrelated plausible?

52
ME of Dep. Variable [Part 4]
n  Answer = Maybe… maybe not
q  Ex. – Firm leverage is measured with error; hard to
observe market value of debt, so we use book value
n  But, the measurement error is likely to be larger when firm's
are in distress… Market value of debt falls; book value doesn't
n  This error could be correlated with x's if it includes things like
profitability (i.e. ME larger for low profit firms)
n  This type of ME will cause inconsistent estimates

53
ME of Independent Variable [Part 1]
n  Let's assume the model is y = β 0 + β1 x *
+u
n  But, we observe x* with error, e = x − x*
q  We assume that E(y|x*, x) = E(y|x*) [i.e. x
doesn't affect y after controlling for x*; this is
standard and uncontroversial because it is just
stating that we've written the correct model]

n  What are some examples in CF?

54
ME of Independent Variable [Part 2]
n  There are lots of examples!
q  Average Q measures marginal Q with error
q  Altman-z score measures default prob. with error
q  GIM, takeover provisions, etc. are all just noisy
measures of the nebulous "governance" of firm

Will this measurement error cause bias?

55
ME of Independent Variable [Part 3]
n  Answer depends crucially on what we assume
about the measurement error, e
n  Literature focuses on two extreme assumptions
#1 – Measurement error, e, is uncorrelated
with the observed measure, x
#2 – Measurement error, e, is uncorrelated
with the unobserved measure, x*

56
Assumption #1: e uncorrelated with x
n  Substituting x* with what we actually
observe, x* = x – e, into true model, we have
y = β 0 + β1 x + u − β1e
q  Is there a bias?
n  Answer = No. x is uncorrelated with e by assumption,
and x is uncorrelated with u by earlier assumptions

q  What happens to our standard errors?


n  Answer = They get larger; error variance is now σ u2 + β12σ e2

57
Assumption #2: e uncorrelated with x*
n  We are still estimating y = β 0 + β1 x + u − β1e ,
but now, x is correlated with e
q  e uncorrelated with x* guarantees e is correlated
with x; cov( x, e) = E ( xe) = E ( x*e) + E (e2 ) = σ e2
q  I.e. an independent variable will be correlated with
the error… we will get biased estimates!

n  This is what people call the Classical Error-


in-Variables (CEV) assumption

58
CEV with 1 variable = attenuation bias
n  If work out math, one can show that the
estimate of β1, βˆ1 , in prior example (which
had just one independent variable) is…
plim(β̂1) = β1·[σ²x* / (σ²x* + σe²)]

[This scaling factor is always
between 0 and 1]

q  The estimate is always biased towards zero; i.e. it
is an attenuation bias
n  And, if variance of error, σe², is small, then attenuation
bias won't be that bad

59
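The attenuation formula can be verified numerically; a sketch with assumed variances σ²x* = 4 and σ²e = 1, so the scaling factor is 0.8:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
beta1 = 1.0
var_xstar, var_e = 4.0, 1.0   # assumed variances of x* and the CEV error
x_star = rng.normal(0, np.sqrt(var_xstar), n)  # true regressor
x = x_star + rng.normal(0, np.sqrt(var_e), n)  # observed, with CEV error
y = 0.5 + beta1 * x_star + rng.normal(0, 1, n)

# OLS slope of y on the mismeasured x
b1_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
attenuation = var_xstar / (var_xstar + var_e)  # scaling factor = 0.8 here
```

The OLS slope lands near β1 times the scaling factor, i.e. biased toward zero but with the correct sign.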
Measurement error… not so bad?
n  Under current setup, measurement error
doesn't seem so bad…
q  If error uncorrelated with observed x, no bias
q  If error uncorrelated with unobserved x*, we
get an attenuation bias… so at least the sign
on our coefficient of interest is still correct

n  Why is this misleading?

60
Nope, measurement error is bad news
n  Truth is, measurement error is
probably correlated a bit with both the
observed x and unobserved x*
q  I.e… some attenuation bias is likely

n  Moreover, even in CEV case, if there


is more than one independent variable,
the bias gets horribly complicated…

61
ME with more than one variable
n  If estimating y = β 0 + β1 x1 + ... + β k xk + u , and
just one of the x's is mismeasured, then…
q  ALL the β's will be biased if the mismeasured
variable is correlated with any other x
[which presumably is true since it was included!]
q  Sign and magnitude of biases will depend on all
the correlations between x's; i.e. big mess!
n  See Gormley and Matsa (2014) math for AvgE
estimator to see how bad this can be

62
ME example
n  Fazzari, Hubbard, and Petersen (1988) is
classic example of a paper with ME problem
q  Regresses investment on Tobin's Q (it's measure
of investment opportunities) and cash
q  Finds positive coefficient on cash; argues there
must be financial constraints present
q  But Q is noisy measure; all coefficients are biased!

n  Erickson and Whited (2000) argues the pos.


coeff. disappears if you correct the ME

63
Three main ways this will be violated
n  Omitted variable bias
n  Measurement error bias
n  Simultaneity bias

64
Simultaneity bias
n  This will occur whenever any of the supposedly
independent variables (i.e. the x's) can be
affected by changes in the y variable; E.g.
y = β 0 + β1 x + u
x = δ 0 + δ1 y + v
q  I.e. changes in x affect y, and changes in y affect x;
this is the simplest case of reverse causality
q  An estimate of y = β0 + β1 x + u will be biased…

65
Simultaneity bias continued…
n  To see why estimating y = β 0 + β1 x + u won't
reveal the true β1, solve for x
x = δ0 + δ1y + v
x = δ0 + δ1(β0 + β1x + u) + v
x = (δ0 + δ1β0)/(1 − δ1β1) + v/(1 − δ1β1) + [δ1/(1 − δ1β1)]·u

q  Easy to see that x is correlated with u! I.e. bias!

66
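A quick simulation of this two-equation system (illustrative parameters) confirms the OLS slope converges to β1 + cov(x, u)/var(x), not β1:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
b0, b1 = 1.0, 0.5   # structural effect of x on y
d0, d1 = 2.0, 0.4   # reverse effect of y on x
u = rng.normal(0, 1, n)
v = rng.normal(0, 1, n)

# Solve the two equations simultaneously (the reduced form above)
x = (d0 + d1 * b0 + v + d1 * u) / (1 - d1 * b1)
y = b0 + b1 * x + u

b1_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Analytic probability limit of the OLS slope: b1 + cov(x, u)/var(x)
cov_xu = d1 * 1.0 / (1 - d1 * b1)              # d1*var(u)/(1 - d1*b1)
var_x = (1.0 + d1 ** 2) / (1 - d1 * b1) ** 2   # (var(v) + d1^2*var(u))/(1 - d1*b1)^2
plim_b1 = b1 + cov_xu / var_x
```

Because x is mechanically correlated with u, OLS overshoots the structural β1 = 0.5 here (δ1 > 0), matching the analytic plim.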
Simultaneity bias in other regressors
n  Prior example is case of reverse causality; the
variable of interest is also affected by y
n  But, if y affects any x, their will be a bias; E.g.
y = β 0 + β1 x1 + β 2 x2 + u
x2 = γ 0 + γ 1 y + w
q  Easy to show that x2 is correlated with u; and there
will be a bias on all coefficients
q  This is why people use lagged x's

67
"Endogeneity" problem – Tangent
n  In my opinion, the prior example is
what it means to have an "endogeneity"
problem or and "endogenous" variable
q  But, as I mentioned earlier, there is a lot of
misusage of the word "endogeneity" in
finance… So, it might be better just saying
"simultaneity bias"

68
Simultaneity Bias – Summary
n  If your x might also be affected by the y
(i.e. reverse causality), you won't be able to
make causal inferences using OLS
q  Instrumental variables or natural experiments
will be helpful with this problem

n  Also can't get causal estimates with OLS if


controls are affected by the y

69
"Bad controls"
n  Similar to simultaneity bias… this is when
one x is affected by another x; e.g.
y = β 0 + β1 x1 + β 2 x2 + u
x2 = γ 0 + γ 1 x1 + v
q  Angrist-Pischke call this a "bad control", and it
can introduce a subtle selection bias when
working with natural experiments
[we will come back to this in later lecture]

70
"Bad Controls" – TG's Pet Peeve
n  But just to preview it… If you have an x
that is truly exogenous (i.e. random) [as you
might have in a natural experiment], do not put
in controls that are also affected by x!
q  Only add controls unaffected by x, or just
regress your various y's on x, and x alone!

We'll revisit this in later lecture…

71
What is Selection Bias?
n  Easiest to think of it just as an omitted
variable problem, where the omitted
variable is the unobserved counterfactual
q  Specifically, error, u, contains some unobserved
counterfactual that is correlated with whether
we observe certain values of x
q  I.e. it is a violation of the CMI assumption

72
Selection Bias – Example
n  Mean health of hospital visitors = 3.21
n  Mean health of non-visitors = 3.93
q  Can we conclude that going to the hospital
(i.e. the x) makes you less healthy?
n  Answer = No. People going to the hospital are
inherently less healthy [this is the selection bias]
n  Another way to say this: we fail to control for what
health outcomes would be absent the visit, and this
unobserved counterfactual is correlated with going
to hospital or not [i.e. omitted variable]

73
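The hospital example can be mimicked with simulated potential outcomes (all numbers are illustrative; here visits are assumed to genuinely help, yet the naive comparison still comes out negative):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
health_0 = rng.normal(3.5, 0.8, n)      # health if no hospital visit
treatment_effect = 0.2                  # assumed true (positive) effect
health_1 = health_0 + treatment_effect  # health if visiting

# Sicker people select into visiting the hospital
visit = health_0 < 3.0

# We only ever observe one potential outcome per person
observed = np.where(visit, health_1, health_0)
naive_diff = observed[visit].mean() - observed[~visit].mean()
# naive_diff is negative even though the true effect is +0.2,
# because visitors' unobserved counterfactual health is lower
```

This is exactly the omitted-variable reading of selection bias: the comparison omits the counterfactual health of visitors, which is correlated with visiting.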
Selection Bias – More later
n  We'll treat it more formally later when
we get to natural experiments

74
Summary of Today [Part 1]
n  We need conditional mean independence
(CMI), to make causal statements
n  CMI is violated whenever an independent
variable, x, is correlated with the error, u
n  Three main ways this can be violated
q  Omitted variable bias
q  Measurement error bias
q  Simultaneity bias

75
Summary of Today [Part 2]
n  The biases can be very complex
q  If more than one omitted variable, or omitted
variable is correlated with more than one
regressor, sign of bias hard to determine
q  Measurement error of an independent
variable can (and likely does) bias all
coefficients in ways that are hard to determine
q  Simultaneity bias can also be complicated

76
Summary of Today [Part 3]
n  To deal with these problems, there are
some tools we can use
q  E.g. Proxy variables [discussed today]
q  We will talk about other tools later, e.g.
n  Instrumental variables
n  Natural experiments
n  Regression discontinuity

77
In First Half of Next Class
n  Before getting to these other tools, will first
discuss panel data & unobserved heterogeneity
q  Using fixed effects to deal with unobserved variables
n  What are the benefits? [There are many!]
n  What are the costs? [There are some…]

q  Fixed effects versus first differences


q  When can FE be used?

n  Related readings: see syllabus

78
Assign papers for next week…
n  Rajan and Zingales (AER 1998)
q  Financial development & growth

n  Matsa (JF 2010)


q  Capital structure & union bargaining

n  Ashwini and Matsa (JFE 2013)


q  Labor unemployment risk & corporate policy

79
Break Time
n  Let's take our 10 minute break
n  We'll do presentations when we get back

80
