Econometrics Slides

Uploaded by

jeena tarakai

The Nature of Econometrics and Economic Data

Chapter 1

Wooldridge: Introductory Econometrics: A Modern Approach, 5e

Econometrics = use of statistical methods to analyze economic data

Econometricians typically analyze nonexperimental data

Typical goals of econometric analysis

Estimating relationships between economic variables

Testing economic theories and hypotheses

Forecasting economic variables

Evaluating and implementing government and business policy

1) Economic model (this step is often skipped)

2) Econometric model

Economic models

May be micro- or macro models

Often use optimizing behaviour, equilibrium modeling, …

Establish relationships between economic variables

Examples: demand equations, pricing equations, …

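Economic model of crime:

$$y = f(x_1, x_2, x_3, x_4, x_5, x_6, x_7)$$

$y$ = hours spent in criminal activities
$x_1$ = "wage" of criminal activities
$x_2$ = wage for legal employment
$x_3$ = other income
$x_4$ = probability of getting caught
$x_5$ = probability of conviction if caught
$x_6$ = expected sentence
$x_7$ = age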

Functional form of relationship not specified

Equation could have been postulated without economic modeling

© 2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Model of job training and worker productivity
What is effect of additional training on worker productivity?
Formal economic theory not really needed to derive equation:

$$\text{wage} = f(\text{educ}, \text{exper}, \text{training})$$

wage = hourly wage; educ = years of formal education; exper = years of workforce experience; training = weeks spent in job training

Other factors may be relevant, but these are the most important (?)

Econometric model of criminal activity
The functional form has to be specified
Variables may have to be approximated by other quantities

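$$\text{crime} = \beta_0 + \beta_1 \text{wage} + \beta_2 \text{othinc} + \beta_3 \text{freqarr} + \beta_4 \text{freqconv} + \beta_5 \text{avgsen} + \beta_6 \text{age} + u$$

crime = measure of criminal activity; wage = wage for legal employment; othinc = other income; freqarr = frequency of prior arrests; freqconv = frequency of conviction; avgsen = average sentence length after conviction; age = age

$u$ = unobserved determinants of criminal activity, e.g. moral character, wage in criminal activity, family background, …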

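Econometric model of job training and worker productivity

$$\text{wage} = \beta_0 + \beta_1 \text{educ} + \beta_2 \text{exper} + \beta_3 \text{training} + u$$

wage = hourly wage; educ = years of formal education; exper = years of workforce experience; training = weeks spent in job training

$u$ = unobserved determinants of the wage, e.g. innate ability, quality of education, family background, …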

Most of econometrics deals with the specification of the error

Econometric models may be used for hypothesis testing

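For example, in a wage equation $\text{wage} = \beta_0 + \beta_1 \text{educ} + \beta_2 \text{exper} + \beta_3 \text{training} + u$, the parameter $\beta_3$ represents the effect of training on the wage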

How large is this effect? Is it different from zero?

Different kinds of economic data sets

Cross-sectional data

Time series data

Pooled cross sections

Panel/Longitudinal data

Econometric methods depend on the nature of the data used

Use of inappropriate methods may lead to misleading results

Sample of individuals, households, firms, cities, states, countries,

or other units of interest at a given point of time/in a given period

Cross-sectional observations are more or less independent

For example, pure random sampling from a population

Sometimes pure random sampling is violated, e.g. units refuse to

respond in surveys, or if sampling is characterized by clustering

Cross-sectional data typically encountered in applied microeconomics

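[Table: cross-sectional data set; columns include observation number, hourly wage, and indicator variables (1 = yes, 0 = no)]

[Table: cross-sectional data set; columns include growth rate of real per capita GDP, government consumption as a percentage of GDP, and adult secondary education rates]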

For example, stock prices, money supply, consumer price index,

gross domestic product, annual homicide rates, automobile sales, …

Time series observations are typically serially correlated

Ordering of observations conveys important information

Data frequency: daily, weekly, monthly, quarterly, annually, …

Typical features of time series: trends and seasonality

Typical applications: applied macroeconomics and finance

[Table: time series data set; columns include average minimum wage for given year, average coverage rate, unemployment rate, and gross national product]

Cross sections are drawn independently of each other

Pooled cross sections often used to evaluate policy changes

Example:

• Evaluate effect of change in property taxes on house prices

• Random sample of house prices for the year 1993

• A new random sample of house prices for the year 1995

• Compare before/after (1993: before reform, 1995: after reform)

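[Table: pooled cross sections on house prices; columns include the number of bathrooms; one block of rows is from before the reform, the other from after the reform]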

Panel data have a cross-sectional and a time series dimension

Panel data can be used to account for time-invariant unobservables

Panel data can be used to model lagged responses

Example:

• City crime statistics; each city is observed in two years

• Time-invariant unobserved city characteristics may be modeled

• Effect of police on crime rates may exhibit time lag

[Table: panel data on city crime statistics; each city has two time series observations; columns include the number of police in 1986 and the number of police in 1990]

Definition of causal effect of $x$ on $y$:

„How does variable $y$ change if variable $x$ is changed

but all other relevant factors are held constant“

Most economic questions are ceteris paribus questions

It is important to define which causal effect one is interested in

It is useful to describe how an experiment would have to be

designed to infer the causal effect in question

Causal effect of fertilizer on crop yield
„By how much will the production of soybeans increase if one
increases the amount of fertilizer applied to the ground“
Implicit assumption: all other factors that influence crop yield such
as quality of land, rainfall, presence of parasites etc. are held fixed
Experiment:
Choose several one-acre plots of land; randomly assign different
amounts of fertilizer to the different plots; compare yields
Experiment works because amount of fertilizer applied is unrelated
to other factors influencing crop yields

Measuring the return to education
„If a person is chosen from the population and given another
year of education, by how much will his or her wage increase? “
Implicit assumption: all other factors that influence wages such as
experience, family background, intelligence etc. are held fixed
Experiment:
Choose a group of people; randomly assign different amounts of
education to them (infeasible!); compare wage outcomes
Problem without random assignment: amount of education is related
to other factors that influence wages (e.g. intelligence)

Effect of law enforcement on city crime level
„If a city is randomly chosen and given ten additional police officers,
by how much would its crime rate fall? “
Alternatively: „If two cities are the same in all respects, except that
city A has ten more police officers, by how much would the two cities'
crime rates differ?“
Experiment:
Randomly assign number of police officers to a large number of cities
In reality, number of police officers will be determined by crime rate
(simultaneous determination of crime and number of police)

Effect of the minimum wage on unemployment
„By how much (if at all) will unemployment increase if the minimum
wage is increased by a certain amount (holding other things fixed)? “
Experiment:
Government randomly chooses minimum wage each year and
observes unemployment outcomes
Experiment will work because level of minimum wage is unrelated
to other factors determining unemployment
In reality, the level of the minimum wage will depend on political
and economic factors that also influence unemployment

Economic theories are not always stated in terms of causal effects

For example, the expectations hypothesis states that long term

interest rates equal compounded expected short term interest rates

An implication is that the interest rate of a three-month T-bill should

be equal to the expected interest rate for the first three months of a
six-month T-bill; this can be tested using econometric methods

The Simple Regression Model

Chapter 2

Wooldridge: Introductory Econometrics: A Modern Approach, 5e

„Explains variable $y$ in terms of variable $x$“

$$y = \beta_0 + \beta_1 x + u$$

$\beta_0$: intercept
$\beta_1$: slope parameter
$y$: dependent variable, explained variable, response variable, …
$x$: independent variable, explanatory variable, regressor, …
$u$: error term, disturbance, unobservables, …

„Studies how $y$ varies with changes in $x$:“

$$\Delta y = \beta_1 \Delta x \quad \text{as long as} \quad \Delta u = 0$$

$\beta_1$: by how much does the dependent variable change if the independent variable is increased by one unit?

Interpretation only correct if all other things remain equal when the independent variable is increased by one unit

The simple linear regression model is rarely applicable in practice,

but its discussion is useful for pedagogical reasons

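Example: Soybean yield and fertilizer

$$\text{yield} = \beta_0 + \beta_1 \text{fertilizer} + u$$

$u$: rainfall, land quality, presence of parasites, …
$\beta_1$: measures the effect of fertilizer on yield, holding all other factors fixed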

Example: A simple wage equation

$$\text{wage} = \beta_0 + \beta_1 \text{educ} + u$$

$u$: labor force experience, tenure with current employer, work ethic, intelligence, …
$\beta_1$: measures the change in hourly wage given another year of education, holding all other factors fixed

Conditional mean independence assumption

$$E(u \mid x) = 0$$

The explanatory variable must not contain information about the mean of the unobserved factors

Example: wage equation

$$\text{wage} = \beta_0 + \beta_1 \text{educ} + u$$

$u$ contains, e.g., intelligence …

The conditional mean independence assumption is unlikely to hold because

individuals with more education will also be more intelligent on average.

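Under $E(u \mid x) = 0$:

$$E(y \mid x) = E(\beta_0 + \beta_1 x + u \mid x) = \beta_0 + \beta_1 x + E(u \mid x) = \beta_0 + \beta_1 x$$

This means that the average value of the dependent variable

can be expressed as a linear function of the explanatory variable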


Population regression function

[Figure: population regression function. For individuals with a given value of $x$, the average value of $y$ is $\beta_0 + \beta_1 x$]

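A random sample of $n$ observations

$$\{(x_i, y_i) : i = 1, \dots, n\}$$

$(x_1, y_1)$: first observation
$(x_2, y_2)$: second observation
$(x_3, y_3)$: third observation
…
$(x_n, y_n)$: $n$-th observation

$x_i$: value of the explanatory variable of the $i$-th observation
$y_i$: value of the dependent variable of the $i$-th observation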

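Fitted regression line

$$\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$$

[Figure: fitted regression line with, for example, the $i$-th data point $(x_i, y_i)$ and its residual $\hat u_i = y_i - \hat y_i$]

Minimize sum of squared regression residuals

$$\min_{\hat\beta_0, \hat\beta_1} \sum_{i=1}^{n} \hat u_i^2 = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$$

Ordinary Least Squares (OLS) estimates

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x$$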
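As a minimal sketch of how the OLS estimates are computed, the slope is the sample covariation of x and y divided by the sample variation in x, and the intercept follows from the sample means. The function name `ols_simple` and the data are made up for illustration and do not come from the slides.

```python
# Minimal sketch of simple-regression OLS from the closed-form expressions.
# The data and the helper name are hypothetical, for illustration only.

def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: covariation of x and y divided by the variation in x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of sample means
    b0 = y_bar - b1 * x_bar
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_simple(x, y)
print(b0, b1)
```

Note that the slope is only defined if the x values are not all identical, since otherwise the denominator is zero.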

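Example: CEO salary and return on equity

$$\text{salary} = \beta_0 + \beta_1 \text{roe} + u$$

salary = salary in thousands of dollars; roe = return on equity of the CEO's firm (in percent)

Fitted regression

$$\widehat{\text{salary}} = 963.191 + 18.501\, \text{roe}$$

963.191: intercept
If the return on equity increases by 1 percentage point,
then salary is predicted to change by 18,501 dollars
Causal interpretation?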


[Figure: fitted regression line (depends on sample) versus the unknown population regression line]

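Example: Wage and education

$$\text{wage} = \beta_0 + \beta_1 \text{educ} + u$$

wage = hourly wage in dollars; educ = years of education

Fitted regression

$$\widehat{\text{wage}} = -0.90 + 0.54\, \text{educ}$$

−0.90: intercept
In the sample, one more year of education was
associated with an increase in hourly wage by 0.54 dollars
Causal interpretation?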

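Example: Voting outcomes and campaign expenditures (two parties)

$$\text{voteA} = \beta_0 + \beta_1 \text{shareA} + u$$

voteA = percentage of vote for candidate A; shareA = percentage of campaign expenditures accounted for by candidate A

Fitted regression

$$\widehat{\text{voteA}} = 26.81 + 0.464\, \text{shareA}$$

26.81: intercept
If candidate A's share of spending increases by one
percentage point, he or she receives 0.464 percentage points more of the total vote
Causal interpretation?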

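Fitted or predicted values: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$

Deviations from regression line (= residuals): $\hat u_i = y_i - \hat y_i$

Algebraic properties of OLS regression

$\sum_{i=1}^{n} \hat u_i = 0$: deviations from regression line sum up to zero
$\sum_{i=1}^{n} x_i \hat u_i = 0$: correlation between deviations and regressors is zero
$\bar y = \hat\beta_0 + \hat\beta_1 \bar x$: sample averages of y and x lie on regression line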
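The algebraic properties above hold in every sample by construction of OLS, and can be checked numerically. The following sketch uses made-up data; the variable names are illustrative assumptions.

```python
# Numerical check of the algebraic properties of OLS on hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# OLS slope and intercept from the closed-form expressions
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

sum_resid = sum(resid)                                  # residuals sum to zero
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))  # zero covariation with x
line_at_mean = b0 + b1 * x_bar                          # equals the mean of y

print(sum_resid, sum_x_resid, line_at_mean, y_bar)
```

Up to floating-point rounding, the first two quantities are zero and the last two coincide.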


For example, CEO number 12's salary was

526,023 dollars lower than predicted using
the information on his firm's return on equity

Goodness-of-Fit

„How well does the explanatory variable explain the dependent variable?“

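Measures of Variation

$$SST = \sum_{i=1}^{n} (y_i - \bar y)^2, \qquad SSE = \sum_{i=1}^{n} (\hat y_i - \bar y)^2, \qquad SSR = \sum_{i=1}^{n} \hat u_i^2$$

Total sum of squares (SST): represents total variation in dependent variable
Explained sum of squares (SSE): represents variation explained by regression
Residual sum of squares (SSR): represents variation not explained by regression

Decomposition of total variation

$$SST = SSE + SSR$$

Total variation = explained part + unexplained part

Goodness-of-fit measure (R-squared)

$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$

R-squared measures the fraction of the total variation that is explained by the regression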
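The decomposition of total variation and the resulting R-squared can be sketched in a few lines. The data are hypothetical and the variable names (`sst`, `sse`, `ssr`, `r2`) are chosen here only to mirror the three sums of squares.

```python
# Sketch: total, explained and residual sums of squares, and R-squared.
# Hypothetical data; a simple regression of y on x with an intercept.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
sse = sum((fi - y_bar) ** 2 for fi in fitted)           # explained variation
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation

r2 = sse / sst
print(sst, sse + ssr, r2)
```

The identity SST = SSE + SSR relies on the regression containing an intercept, which is the case here.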

CEO salary and return on equity: $R^2 = 0.0132$

The regression explains only 1.3 %

of the total variation in salaries

Voting outcomes and campaign expenditures: $R^2 = 0.856$

The regression explains 85.6 % of the

total variation in election outcomes

Caution: A high R-squared does not necessarily mean that the

regression has a causal interpretation!

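Regression of log wages on years of education

$$\log(\text{wage}) = \beta_0 + \beta_1 \text{educ} + u$$

log(wage) = natural logarithm of wage

This changes the interpretation of the regression coefficient:

$$\beta_1 = \frac{\partial \log(\text{wage})}{\partial\, \text{educ}} = \frac{\partial\, \text{wage} / \text{wage}}{\partial\, \text{educ}}$$

Numerator: percentage change of wage
Denominator: … if years of education are increased by one year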

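Fitted regression

$$\widehat{\log(\text{wage})} = 0.584 + 0.083\, \text{educ}$$

The wage increases by 8.3 % for

every additional year of education
(= return to education)

For example: $\frac{\Delta \text{wage} / \text{wage}}{\Delta \text{educ}} = 0.083$, i.e. the

growth rate of wage is 8.3 %

per year of education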

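CEO salary and firm sales

$$\log(\text{salary}) = \beta_0 + \beta_1 \log(\text{sales}) + u$$

log(salary) = natural logarithm of CEO salary; log(sales) = natural logarithm of his/her firm's sales

This changes the interpretation of the regression coefficient:

$$\beta_1 = \frac{\partial \log(\text{salary})}{\partial \log(\text{sales})} = \frac{\partial\, \text{salary} / \text{salary}}{\partial\, \text{sales} / \text{sales}}$$

Numerator: percentage change of salary
Denominator: … if sales increase by 1 %

Logarithmic changes are

always percentage changes

Fitted regression

$$\widehat{\log(\text{salary})} = 4.822 + 0.257 \log(\text{sales})$$

For example: + 1 % sales → + 0.257 % salary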

The log-log form postulates a constant elasticity model,

whereas the semi-log form assumes a semi-elasticity model
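Reading a log coefficient as a percentage change is an approximation that works well for small changes. The sketch below compares the approximate and exact percentage effects for the two slope values quoted in the examples (0.083 and 0.257); the variable names are illustrative assumptions.

```python
import math

# Semi-log (log-level) model: slope on years of education from the wage example
b_educ = 0.083
approx_pct = 100 * b_educ                  # usual approximation: 8.3 %
exact_pct = 100 * (math.exp(b_educ) - 1)   # exact change implied by the log model

# Log-log model: elasticity of salary with respect to sales from the CEO example
b_sales = 0.257
# Exact percentage change in salary when sales rise by exactly 1 %
exact_elasticity_pct = 100 * (1.01 ** b_sales - 1)

print(approx_pct, exact_pct, exact_elasticity_pct)
```

The gap between the approximate and the exact figure grows with the size of the coefficient, so for large coefficients the exact transformation is the safer one to report.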

Expected values and variances of the OLS estimators
The estimated regression coefficients are random variables
because they are calculated from a random sample

Data is random and depends on particular sample that has been drawn

The question is what the estimators will estimate on average

and how large their variability in repeated samples is

Assumption SLR.1 (Linear in parameters)

$$y = \beta_0 + \beta_1 x + u$$

In the population, the relationship between y and x is linear

Assumption SLR.2 (Random sampling)

$$\{(x_i, y_i) : i = 1, \dots, n\}$$

The data is a random sample drawn from the population

$$y_i = \beta_0 + \beta_1 x_i + u_i$$

Each data point therefore follows the population equation

Discussion of random sampling: Wage and education
The population consists, for example, of all workers of country A
In the population, a linear relationship between wages (or log wages)
and years of education holds
Draw a worker completely at random from the population
The wage and the years of education of the worker drawn are random
because one does not know beforehand which worker is drawn

Throw the worker back into the population and repeat the random draw $n$ times

The wages and years of education of the sampled workers are used to
estimate the linear relationship between wages and education


$(x_i, y_i)$: the values drawn for the $i$-th worker

$u_i = y_i - \beta_0 - \beta_1 x_i$: the implied deviation from the population relationship for the $i$-th worker

Assumption SLR.3 (Sample variation in explanatory variable)

$$\sum_{i=1}^{n} (x_i - \bar x)^2 > 0$$

The values of the explanatory variables are not all the same (otherwise it would be impossible to study how different values of the explanatory variable lead to different values of the dependent variable)

Assumption SLR.4 (Zero conditional mean)

$$E(u_i \mid x_i) = 0$$

The value of the explanatory variable must contain no information about the mean of the unobserved factors

Interpretation of unbiasedness
The estimated coefficients may be smaller or larger, depending on
the sample that is the result of a random draw
However, on average, they will be equal to the values that characterize
the true relationship between y and x in the population
„On average“ means if sampling was repeated, i.e. if drawing the
random sample and doing the estimation was repeated many times
In a given sample, estimates may differ considerably from true values
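The repeated-sampling idea can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for illustration: the "true" parameters, the sample size, the number of replications and the distributions of x and u. The error is drawn independently of x, so the zero conditional mean assumption holds by construction.

```python
import random

random.seed(0)
BETA0, BETA1 = 1.0, 0.5   # hypothetical true population parameters
N, REPS = 50, 2000        # sample size and number of repeated samples

def ols_slope(x, y):
    """OLS slope of a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

slopes = []
for _ in range(REPS):
    x = [random.uniform(0, 10) for _ in range(N)]
    u = [random.gauss(0, 1) for _ in range(N)]   # error drawn independently of x
    y = [BETA0 + BETA1 * xi + ui for xi, ui in zip(x, u)]
    slopes.append(ols_slope(x, y))

mean_slope = sum(slopes) / REPS
print(mean_slope, min(slopes), max(slopes))
```

Individual estimates scatter on both sides of 0.5, while their average across the repeated samples is close to it, which is what unbiasedness describes.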

Variances of the OLS estimators
Depending on the sample, the estimates will be nearer or farther
away from the true population values
How far can we expect our estimates to be away from the true
population values on average (= sampling variability)?
Sampling variability is measured by the estimators' variances

Assumption SLR.5 (Homoskedasticity)

$$Var(u_i \mid x_i) = \sigma^2$$

The value of the explanatory variable must contain no information about the variability of the unobserved factors

[Figure: homoskedasticity] The variability of the unobserved influences does not depend on the value of the explanatory variable

[Figure: heteroskedasticity in a wage equation] The variance of the unobserved determinants of wages increases with the level of education

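Theorem 2.2 (Variances of the OLS estimators)

Under assumptions SLR.1 – SLR.5:

$$Var(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sigma^2}{SST_x}, \qquad Var(\hat\beta_0) = \frac{\sigma^2\, n^{-1} \sum_{i=1}^{n} x_i^2}{SST_x}$$

Conclusion:
The sampling variability of the estimated regression coefficients will be
the higher the larger the variability of the unobserved factors, and the
lower, the higher the variation in the explanatory variable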

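Estimating the error variance

$$Var(u_i \mid x_i) = \sigma^2 = Var(u_i)$$

The variance of u does not depend on x, i.e. it is equal to the unconditional variance

$$\tilde\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (\hat u_i - \bar{\hat u})^2 = \frac{1}{n} \sum_{i=1}^{n} \hat u_i^2$$

One could estimate the variance of the errors by calculating the variance of the residuals in the sample; unfortunately this estimate would be biased

$$\hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat u_i^2$$

An unbiased estimate of the error variance can be obtained by subtracting the number of estimated regression coefficients from the number of observations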

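Calculation of standard errors for regression coefficients

$$se(\hat\beta_1) = \sqrt{\widehat{Var}(\hat\beta_1)} = \sqrt{\hat\sigma^2 / SST_x}, \qquad se(\hat\beta_0) = \sqrt{\widehat{Var}(\hat\beta_0)} = \sqrt{\hat\sigma^2\, n^{-1} \sum_{i=1}^{n} x_i^2 / SST_x}$$

Plug in $\hat\sigma^2$ for the unknown $\sigma^2$

The estimated standard deviations of the regression coefficients are called „standard
errors“. They measure how precisely the regression coefficients are estimated.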
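A minimal sketch of the standard error calculation for a simple regression, assuming homoskedasticity. The data are made up and the variable names are illustrative; the error variance uses n − 2 degrees of freedom because two coefficients are estimated.

```python
import math

# Hypothetical data for a simple regression of y on x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sst_x = sum((xi - x_bar) ** 2 for xi in x)   # total sample variation in x
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
b0 = y_bar - b1 * x_bar

resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
ssr = sum(ui ** 2 for ui in resid)

# Degrees-of-freedom adjusted estimate of the error variance
sigma2_hat = ssr / (n - 2)

# Standard errors: plug the estimated error variance into the variance formulas
se_b1 = math.sqrt(sigma2_hat / sst_x)
se_b0 = math.sqrt(sigma2_hat * (sum(xi ** 2 for xi in x) / n) / sst_x)
print(sigma2_hat, se_b1, se_b0)
```

These standard errors are only meaningful under the homoskedasticity assumption; with heteroskedastic errors a different variance estimator would be needed.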

Multiple Regression Analysis: Estimation

Chapter 3

Wooldridge: Introductory Econometrics: A Modern Approach, 5e

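„Explains variable $y$ in terms of variables $x_1, x_2, \dots, x_k$“

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$

$\beta_0$: intercept
$\beta_1, \dots, \beta_k$: slope parameters
$y$: dependent variable, explained variable, response variable, …
$x_1, \dots, x_k$: independent variables, explanatory variables, regressors, …
$u$: error term, disturbance, unobservables, …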

Motivation for multiple regression
Incorporate more explanatory factors into the model
Explicitly hold fixed other factors that otherwise would be in $u$
Allow for more flexible functional forms

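Example: Wage equation

$$\text{wage} = \beta_0 + \beta_1 \text{educ} + \beta_2 \text{exper} + u$$

wage = hourly wage; educ = years of education; exper = labor market experience; $u$ = all other factors

$\beta_1$ now measures effect of education explicitly holding experience fixed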

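Example: Average test scores and per student spending

$$\text{avgscore} = \beta_0 + \beta_1 \text{expend} + \beta_2 \text{avginc} + u$$

avgscore = average standardized test score of school; expend = per student spending at this school; avginc = average family income of students at this school; $u$ = other factors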

Per student spending is likely to be correlated with average family

income at a given high school because of school financing
Omitting average family income in regression would lead to biased
estimate of the effect of spending on average test scores
In a simple regression model, effect of per student spending would
partly include the effect of family income on test scores

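Example: Family income and family consumption

$$\text{cons} = \beta_0 + \beta_1 \text{inc} + \beta_2 \text{inc}^2 + u$$

cons = family consumption; inc = family income; inc² = family income squared; $u$ = other factors

Model has two explanatory variables: income and income squared

Consumption is explained as a quadratic function of income
One has to be very careful when interpreting the coefficients:

$$\frac{\partial\, \text{cons}}{\partial\, \text{inc}} = \beta_1 + 2 \beta_2\, \text{inc}$$

Left-hand side: by how much does consumption increase if income is increased by one unit?
Right-hand side: depends on how much income is already there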
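In a quadratic specification the effect of one more unit of income is not a single number but a function of the income level. The sketch below evaluates the derivative of the quadratic at two income levels; the coefficient values are hypothetical and chosen only to show the pattern.

```python
# Hypothetical coefficients of cons = b0 + b1*inc + b2*inc^2 (illustration only)
b0, b1, b2 = 5.0, 0.8, -0.002

def predicted_cons(inc):
    """Predicted consumption at a given income level."""
    return b0 + b1 * inc + b2 * inc ** 2

def marginal_effect(inc):
    """Derivative of predicted consumption with respect to income."""
    return b1 + 2 * b2 * inc

def exact_change(inc):
    """Change in predicted consumption when income rises by exactly one unit."""
    return predicted_cons(inc + 1) - predicted_cons(inc)

for inc in (10, 100):
    print(inc, marginal_effect(inc), exact_change(inc))
```

With a negative coefficient on the squared term, the marginal effect declines as income rises; the derivative and the exact one-unit change differ only by the size of the squared-term coefficient.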

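Example: CEO salary, sales and CEO tenure

$$\log(\text{salary}) = \beta_0 + \beta_1 \log(\text{sales}) + \beta_2 \text{ceoten} + \beta_3 \text{ceoten}^2 + u$$

log(salary) = log of CEO salary; log(sales) = log sales; $\beta_2 \text{ceoten} + \beta_3 \text{ceoten}^2$ = quadratic function of CEO tenure with firm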

Model assumes a constant elasticity relationship between CEO salary

and the sales of his or her firm
Model assumes a quadratic relationship between CEO salary and his
or her tenure with the firm
Meaning of „linear“ regression
The model has to be linear in the parameters (not in the variables)

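Random sample

$$\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i) : i = 1, \dots, n\}$$

Regression residuals

$$\hat u_i = y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \dots - \hat\beta_k x_{ik}$$

Minimize sum of squared residuals

$$\min_{\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k} \sum_{i=1}^{n} \hat u_i^2$$

Minimization will be carried out by computer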

$$\beta_j = \frac{\partial y}{\partial x_j}$$

By how much does the dependent variable change if the j-th

independent variable is increased by one unit, holding all
other independent variables and the error term constant

The multiple linear regression model manages to hold the values

of other explanatory variables fixed even if, in reality, they are
correlated with the explanatory variable under consideration
„Ceteris paribus“-interpretation
It has still to be assumed that unobserved factors do not change if
the explanatory variables are changed

Example: Determinants of college GPA

$$\widehat{\text{colGPA}} = 1.29 + 0.453\, \text{hsGPA} + 0.0094\, \text{ACT}$$

colGPA = grade point average at college; hsGPA = high school grade point average; ACT = achievement test score

Interpretation
Holding ACT fixed, another point on high school grade point average
is associated with another .453 points college grade point average
Or: If we compare two students with the same ACT, but the hsGPA of
student A is one point higher, we predict student A to have a colGPA
that is .453 higher than that of student B
Holding high school grade point average fixed, another 10 points on
ACT are associated with about 0.09 points (10 × .0094) on college GPA
„Partialling out“ interpretation of multiple regression
One can show that the estimated coefficient of an explanatory
variable in a multiple regression can be obtained in two steps:
1) Regress the explanatory variable on all other explanatory variables
2) Regress $y$ on the residuals from this regression
Why does this procedure work?
The residuals from the first regression are the part of the explanatory
variable that is uncorrelated with the other explanatory variables
The slope coefficient of the second regression therefore represents
the isolated effect of the explanatory variable on the dep. variable
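The two-step procedure can be checked numerically. The sketch below simulates data with two correlated regressors (all numbers are assumptions for illustration), runs the full multiple regression, and then reproduces the coefficient on x1 by partialling out. It uses NumPy's least-squares solver rather than any particular econometrics package.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)                  # x1 is correlated with x2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)  # hypothetical population model

def ols(X, y):
    """Least-squares coefficients for a design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Full multiple regression of y on a constant, x1 and x2
beta_full = ols(np.column_stack([const, x1, x2]), y)

# Step 1: regress x1 on the other regressors (constant and x2), keep residuals
Z = np.column_stack([const, x2])
r1 = x1 - Z @ ols(Z, x1)

# Step 2: regress y on those residuals (they have mean zero, so no constant is needed)
beta1_partial = (r1 @ y) / (r1 @ r1)

print(beta_full[1], beta1_partial)
```

The two numbers agree up to floating-point error, which is the content of the partialling-out result.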

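Fitted or predicted values: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \hat\beta_2 x_{i2} + \dots + \hat\beta_k x_{ik}$

Residuals: $\hat u_i = y_i - \hat y_i$

Algebraic properties of OLS regression

$\sum_{i=1}^{n} \hat u_i = 0$: deviations from regression line sum up to zero
$\sum_{i=1}^{n} x_{ij} \hat u_i = 0$: correlations between deviations and regressors are zero
$\bar y = \hat\beta_0 + \hat\beta_1 \bar x_1 + \dots + \hat\beta_k \bar x_k$: sample averages of y and of the regressors lie on regression line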

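Decomposition of total variation

$$SST = SSE + SSR$$

R-squared

$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$

Notice that R-squared can only increase (it never decreases) if another explanatory variable is added to the regression

Alternative expression for R-squared

$$R^2 = \frac{\left( \sum_{i=1}^{n} (y_i - \bar y)(\hat y_i - \bar{\hat y}) \right)^2}{\left( \sum_{i=1}^{n} (y_i - \bar y)^2 \right) \left( \sum_{i=1}^{n} (\hat y_i - \bar{\hat y})^2 \right)}$$

R-squared is equal to the squared correlation coefficient between the actual and the predicted value of the dependent variable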
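Both statements about R-squared can be verified on simulated data. In this sketch the added regressor is pure noise by construction (an assumption made for illustration), yet R-squared still does not fall when it is included.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
noise_var = rng.normal(size=n)            # regressor unrelated to y by construction
y = 1.0 + 0.5 * x1 + rng.normal(size=n)   # hypothetical population model

def fit(X, y):
    """Return fitted values from a least-squares regression of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return X @ beta

def r_squared(X, y):
    """R-squared computed as 1 - SSR/SST."""
    resid = y - fit(X, y)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - np.sum(resid ** 2) / sst

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, noise_var])

r2_small = r_squared(X_small, y)
r2_big = r_squared(X_big, y)

# Squared correlation between actual and fitted values
corr_sq = np.corrcoef(y, fit(X_small, y))[0, 1] ** 2

print(r2_small, r2_big, corr_sq)
```

Because R-squared cannot fall when regressors are added, a higher R-squared alone says little about whether an added variable belongs in the model.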

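Example: Explaining arrest records

$$\widehat{\text{narr86}} = 0.712 - 0.150\, \text{pcnv} - 0.034\, \text{ptime86} - 0.104\, \text{qemp86}$$

narr86 = number of times arrested 1986; pcnv = proportion prior arrests that led to conviction; ptime86 = months in prison 1986; qemp86 = quarters employed 1986

Interpretation:
Proportion prior arrests +0.5 → −.150(0.5) = −.075, i.e. −7.5 arrests per 100 men
Months in prison +12 → −.034(12) = −0.408 arrests for given man
Quarters employed +1 → −.104, i.e. −10.4 arrests per 100 men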

[Fitted regression with an additional explanatory variable: average sentence in prior convictions]

R-squared increases only slightly

Interpretation:
Average prior sentence increases number of arrests (?)
Limited additional explanatory power as R-squared increases by little
General remark on R-squared
Even if R-squared is small (as in the given example), regression may
still provide good estimates of ceteris paribus effects
Standard assumptions for the multiple regression model

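Assumption MLR.1 (Linear in parameters)

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$

In the population, the relationship between y and the explanatory variables is linear

Assumption MLR.2 (Random sampling)

$$\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i) : i = 1, \dots, n\}$$

The data is a random sample drawn from the population

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i$$

Each data point therefore follows the population equation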

Assumption MLR.3 (No perfect collinearity)

„In the sample (and therefore in the population), none
of the independent variables is constant and there are
no exact relationships among the independent variables“

Remarks on MLR.3
The assumption only rules out perfect collinearity/correlation between
explanatory variables; imperfect correlation is allowed
If an explanatory variable is a perfect linear combination of other
explanatory variables it is superfluous and may be eliminated
Constant variables are also ruled out (collinear with intercept)

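Example for perfect collinearity: small sample

$$\text{avgscore} = \beta_0 + \beta_1 \text{expend} + \beta_2 \text{avginc} + u$$

In a small sample, avginc may accidentally be an exact multiple of expend; it will not
be possible to disentangle their separate effects because there is exact covariation

Example for perfect collinearity: relationships between regressors

$$\text{voteA} = \beta_0 + \beta_1 \text{shareA} + \beta_2 \text{shareB} + u$$

Either shareA or shareB will have to be dropped from the regression because there
is an exact linear relationship between them: shareA + shareB = 1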
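Perfect collinearity shows up as a rank-deficient design matrix. The following sketch builds the regressors for the spending-share example with made-up values and checks the matrix rank with NumPy; the array names are illustrative assumptions.

```python
import numpy as np

# Hypothetical spending shares for five races; shareA + shareB = 1 in every row
share_a = np.array([0.30, 0.55, 0.70, 0.45, 0.62])
share_b = 1.0 - share_a

# Design matrix with a constant, shareA and shareB: three columns
X = np.column_stack([np.ones(5), share_a, share_b])
rank_full = np.linalg.matrix_rank(X)

# Dropping shareB removes the exact linear relationship
X_dropped = X[:, :2]
rank_dropped = np.linalg.matrix_rank(X_dropped)

print(X.shape[1], rank_full, rank_dropped)
```

With three columns but rank two, the OLS coefficients are not uniquely determined; after dropping one share the remaining two columns have full rank.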

Assumption MLR.4 (Zero conditional mean)

$$E(u \mid x_1, x_2, \dots, x_k) = 0$$

The value of the explanatory variables must contain no information about the mean of the unobserved factors

In a multiple regression model, the zero conditional mean assumption

is much more likely to hold because fewer things end up in the error
Example: Average test scores

$$\text{avgscore} = \beta_0 + \beta_1 \text{expend} + \beta_2 \text{avginc} + u$$

If avginc was not included in the regression, it would end up in the error term;
it would then be hard to defend that expend is uncorrelated with the error

Discussion of the zero conditional mean assumption
Explanatory variables that are correlated with the error term are
called endogenous; endogeneity is a violation of assumption MLR.4
Explanatory variables that are uncorrelated with the error term are
called exogenous; MLR.4 holds if all explanat. var. are exogenous
Exogeneity is the key assumption for a causal interpretation of the
regression, and for unbiasedness of the OLS estimators

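Theorem 3.1 (Unbiasedness of OLS)

Under assumptions MLR.1 – MLR.4:

$$E(\hat\beta_j) = \beta_j, \qquad j = 0, 1, \dots, k$$

Unbiasedness is an average property in repeated samples; in a given

sample, the estimates may still be far away from the true values

Including irrelevant variables in a regression model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u, \qquad \beta_3 = 0 \text{ in the population}$$

No problem because $E(\hat\beta_1) = \beta_1$, $E(\hat\beta_2) = \beta_2$, $E(\hat\beta_3) = 0$

However, including irrelevant variables may increase sampling variance.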

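Omitting relevant variables: the simple case

True model (contains x1 and x2)

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$

Estimated model (x2 is omitted)

$$y = \alpha_0 + \alpha_1 x_1 + w$$

If x1 and x2 are correlated, assume a linear regression relationship between them:

$$x_2 = \delta_0 + \delta_1 x_1 + v$$

$$\Rightarrow\; y = \beta_0 + \beta_1 x_1 + \beta_2 (\delta_0 + \delta_1 x_1 + v) + u = (\beta_0 + \beta_2 \delta_0) + (\beta_1 + \beta_2 \delta_1) x_1 + (\beta_2 v + u)$$

$\beta_0 + \beta_2 \delta_0$: if y is only regressed on x1, this will be the estimated intercept
$\beta_1 + \beta_2 \delta_1$: if y is only regressed on x1, this will be the estimated slope on x1
$\beta_2 v + u$: error term

Conclusion: All estimated coefficients will be biased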

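Example: Omitting ability in a wage equation

$$\text{wage} = \beta_0 + \beta_1 \text{educ} + \beta_2 \text{abil} + u, \qquad \text{abil} = \delta_0 + \delta_1 \text{educ} + v$$

$\beta_2$ and $\delta_1$ will both be positive

The return to education $\beta_1$ will be overestimated because $\beta_2 \delta_1 > 0$. It will look

as if people with many years of education earn very high wages, but this is partly
due to the fact that people with more education are also more able on average.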
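The omitted variable result has an exact in-sample counterpart: the slope from the short regression equals the long-regression slope plus the coefficient on the omitted variable times the slope from regressing the omitted variable on the included one. The sketch below checks this on simulated data; the population values are assumptions chosen so that ability raises wages and is positively related to education.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
abil = rng.normal(size=n)
educ = 12 + 2 * abil + rng.normal(size=n)                  # education rises with ability
wage = 1.0 + 0.5 * educ + 1.0 * abil + rng.normal(size=n)  # hypothetical true model

def ols(X, y):
    """Least-squares coefficients for a design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)
b_long = ols(np.column_stack([const, educ, abil]), wage)   # ability included
b_short = ols(np.column_stack([const, educ]), wage)        # ability omitted
delta = ols(np.column_stack([const, educ]), abil)          # abil regressed on educ

# Short-regression slope implied by the omitted-variable formula
implied_short_slope = b_long[1] + b_long[2] * delta[1]

print(b_long[1], b_short[1], implied_short_slope)
```

In this setup the short regression overstates the return to education, because both the effect of ability on wages and its relationship with education are positive.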

When is there no omitted variable bias?

If the omitted variable is irrelevant ($\beta_2 = 0$) or uncorrelated ($\delta_1 = 0$)

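True model (contains x1, x2 and x3)

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$$

Estimated model (x3 is omitted)

$$y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + w$$

No general statements possible about direction of bias

Analysis as in simple case if one regressor uncorrelated with others
Example: Omitting ability in a wage equation

$$\text{wage} = \beta_0 + \beta_1 \text{educ} + \beta_2 \text{exper} + \beta_3 \text{abil} + u$$

If exper is approximately uncorrelated with educ and abil, then the direction
of the omitted variable bias can be analyzed as in the simple two-variable case.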

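Assumption MLR.5 (Homoskedasticity)

$$Var(u \mid x_1, x_2, \dots, x_k) = \sigma^2$$

The value of the explanatory variables must contain no information about the variance of the unobserved factors

Example: Wage equation

$$Var(u \mid \text{educ}, \text{exper}, \text{tenure}) = \sigma^2$$

This assumption may also be hard
to justify in many cases

Short hand notation

$$Var(u \mid \mathbf{x}) = \sigma^2 \quad \text{with} \quad \mathbf{x} = (x_1, x_2, \dots, x_k)$$

All explanatory variables are collected in a random vector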

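Theorem 3.2 (Sampling variances of the OLS slope estimators)

Under assumptions MLR.1 – MLR.5:

$$Var(\hat\beta_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)}, \qquad j = 1, \dots, k$$

$\sigma^2$: variance of the error term
$SST_j = \sum_{i=1}^{n} (x_{ij} - \bar x_j)^2$: total sample variation in explanatory variable xj
$R_j^2$: R-squared from a regression of explanatory variable xj on all other independent variables (including a constant)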

Components of OLS Variances:
1) The error variance
A high error variance increases the sampling variance because there is
more „noise“ in the equation
A large error variance necessarily makes estimates imprecise
The error variance does not decrease with sample size
2) The total sample variation in the explanatory variable
More sample variation leads to more precise estimates
Total sample variation automatically increases with the sample size
Increasing the sample size is thus a way to get more precise estimates

3) Linear relationships among the independent variables

Regress $x_j$ on all other independent variables (including a constant)

The R-squared of this regression will be the higher

the better xj can be linearly explained by the other
independent variables

Sampling variance of $\hat\beta_j$ will be the higher the better explanatory

variable $x_j$ can be linearly explained by other independent variables
The problem of almost linearly dependent explanatory variables is
called multicollinearity (i.e. $R_j^2 \to 1$ for some $j$)

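An example for multicollinearity

$$\text{avgscore} = \beta_0 + \beta_1 \text{teacherexp} + \beta_2 \text{matexp} + \beta_3 \text{othexp} + \dots$$

avgscore = average standardized test score of school; teacherexp = expenditures for teachers; matexp = expenditures for instructional materials; othexp = other expenditures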

The different expenditure categories will be strongly correlated because if a school has a lot
of resources it will spend a lot on everything.

It will be hard to estimate the differential effects of different expenditure categories because
all expenditures are either high or low. For precise estimates of the differential effects, one
would need information about situations where expenditure categories change differentially.

As a consequence, sampling variance of the estimated effects will be large.

Discussion of the multicollinearity problem
In the above example, it would probably be better to lump all expenditure
categories together because effects cannot be disentangled
In other cases, dropping some independent variables may reduce
multicollinearity (but this may lead to omitted variable bias)
Only the sampling variance of the variables involved in multicollinearity
will be inflated; the estimates of other effects may be very precise
Note that multicollinearity is not a violation of MLR.3 in the strict sense
Multicollinearity may be detected through „variance inflation factors“

$$VIF_j = \frac{1}{1 - R_j^2}$$

As an (arbitrary) rule of thumb, the variance

inflation factor should not be larger than 10
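Variance inflation factors can be computed directly from their definition by regressing each explanatory variable on all the others. The sketch below uses simulated regressors, two of which are nearly collinear by construction; the helper name `vif` and all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1 by construction
x3 = rng.normal(size=n)              # unrelated to x1 and x2
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all other columns plus a constant."""
    target = X[:, j]
    others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
    ssr = np.sum((target - fitted) ** 2)
    sst = np.sum((target - target.mean()) ** 2)
    r2_j = 1.0 - ssr / sst
    return 1.0 / (1.0 - r2_j)

vifs = [vif(X, j) for j in range(X.shape[1])]
print(vifs)
```

Only the two nearly collinear regressors show large factors; the unrelated regressor has a factor close to one, in line with the remark that only the variables involved in multicollinearity are affected.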

Variances in misspecified models
The choice of whether to include a particular variable in a regression
can be made by analyzing the tradeoff between bias and variance

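True population model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$

Estimated model 1: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$

Estimated model 2: $\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1$

It might be the case that the likely omitted variable bias in the
misspecified model 2 is more than offset by a smaller variance

$Var(\hat{\beta}_1) = \sigma^2 / [SST_1 (1 - R_1^2)]$ and $Var(\tilde{\beta}_1) = \sigma^2 / SST_1$

Conditional on x1 and x2, the variance in model 2 is never larger than
that in model 1 (it is strictly smaller unless x1 and x2 are uncorrelated in the sample)

Case 1: $\beta_2 = 0$. Both estimators of $\beta_1$ are unbiased, and model 2 has the smaller variance. Conclusion: Do not include irrelevant regressors

Case 2: $\beta_2 \neq 0$. The estimator from model 2 is generally biased but has the smaller variance. Trade off bias and variance; Caution: bias will not vanish even in large samples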

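An unbiased estimate of the error variance is obtained by dividing the sum of squared residuals
by the number of observations minus the number of estimated regression coefficients:

$\hat{\sigma}^2 = \left( \sum_{i=1}^{n} \hat{u}_i^2 \right) / (n - k - 1)$

The number of observations minus the number of estimated parameters is also called the degrees of freedom.
The n estimated squared residuals in the sum are not completely independent but related
through the k+1 equations that define the first order conditions of the minimization problem.

Theorem 3.3 (Unbiased estimator of the error variance)

Under assumptions MLR.1 - MLR.5: $E(\hat{\sigma}^2) = \sigma^2$

The true sampling variation of the estimated $\beta_j$:
$sd(\hat{\beta}_j) = \sqrt{Var(\hat{\beta}_j)} = \sqrt{\sigma^2 / [SST_j (1 - R_j^2)]}$

Plug in $\hat{\sigma}^2$ for the unknown $\sigma^2$

The estimated sampling variation of the estimated $\beta_j$ (the standard error):
$se(\hat{\beta}_j) = \sqrt{\hat{\sigma}^2 / [SST_j (1 - R_j^2)]}$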

Note that these formulas are only valid under assumptions

MLR.1-MLR.5 (in particular, there has to be homoscedasticity)
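As an illustration of the degrees-of-freedom correction and the standard error formula, here is a small sketch on simulated homoscedastic data, assuming NumPy. All names and parameter values are hypothetical. It checks that the usual matrix expression for the standard error agrees with the form based on $SST_j$ and $R_j^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 2                                  # k slope coefficients plus a constant
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=2.0, size=n)

b_hat = np.linalg.solve(X.T @ X, X.T @ y)      # OLS estimates
resid = y - X @ b_hat
df = n - k - 1                                 # degrees of freedom
sigma2_hat = resid @ resid / df                # unbiased estimate of the error variance
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))

# Same standard error for the first slope via sigma2_hat / (SST_1 * (1 - R_1^2))
x1 = X[:, 1]
others = np.delete(X, 1, axis=1)               # constant and the other regressor
g, *_ = np.linalg.lstsq(others, x1, rcond=None)
r1 = x1 - others @ g
sst1 = np.sum((x1 - x1.mean()) ** 2)
r2_1 = 1.0 - r1 @ r1 / sst1
se_formula = np.sqrt(sigma2_hat / (sst1 * (1.0 - r2_1)))
print(se[1], se_formula)
```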

Efficiency of OLS: The Gauss-Markov Theorem
Under assumptions MLR.1 - MLR.5, OLS is unbiased
However, under these assumptions there may be many other
estimators that are unbiased
Which one is the unbiased estimator with the smallest variance?
In order to answer this question one usually limits oneself to linear
estimators, i.e. estimators linear in the dependent variable

$\tilde{\beta}_j = \sum_{i=1}^{n} w_{ij} y_i$

The weights $w_{ij}$ may be an arbitrary function of the sample values
of all the explanatory variables; the OLS estimator
can be shown to be of this form

Theorem 3.4 (Gauss-Markov Theorem)
Under assumptions MLR.1 - MLR.5, the OLS estimators are the best
linear unbiased estimators (BLUEs) of the regression coefficients, i.e.

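$Var(\hat{\beta}_j) \le Var(\tilde{\beta}_j)$, $j = 0, 1, \dots, k$, for all linear estimators $\tilde{\beta}_j = \sum_{i=1}^{n} w_{ij} y_i$ for which $E(\tilde{\beta}_j) = \beta_j$.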

OLS is only the best estimator if MLR.1 – MLR.5 hold; if there is

heteroscedasticity for example, there are better estimators.

Multiple Regression
Analysis: Inference

Chapter 4

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

Statistical inference in the regression model
Hypothesis tests about population parameters
Construction of confidence intervals

Sampling distributions of the OLS estimators

The OLS estimators are random variables

We already know their expected values and their variances

However, for hypothesis tests we need to know their distribution

In order to derive their distribution we need additional assumptions

Assumption about distribution of errors (MLR.6): normal distribution

$u \sim Normal(0, \sigma^2)$ independently of $x_1, x_2, \dots, x_k$

It is assumed that the unobserved
factors are normally distributed around
the population regression function.

The form and the variance of the
distribution do not depend on
any of the explanatory variables.

It follows that: $y \mid x \sim Normal(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k, \sigma^2)$

Discussion of the normality assumption
The error term is the sum of „many“ different unobserved factors
Sums of many independent factors are approximately normally distributed (CLT)
Problems:
• How many different factors? Number large enough?
• Possibly very heterogeneous distributions of individual factors
• How independent are the different factors?
The normality of the error term is an empirical question
At least the error distribution should be „close“ to normal
In many cases, normality is questionable or impossible by definition

Discussion of the normality assumption (cont.)
Examples where normality cannot hold:
• Wages (nonnegative; also: minimum wage)
• Number of arrests (takes on a small number of integer values)
• Unemployment (indicator variable, takes on only 1 or 0)
In some cases, normality can be achieved through transformations
of the dependent variable (e.g. use log(wage) instead of wage)
Under normality, OLS is the best unbiased estimator, even among nonlinear estimators
Important: For the purposes of statistical inference, the assumption
of normality can be replaced by a large sample size

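Terminology: MLR.1 – MLR.5 are the „Gauss-Markov assumptions“; MLR.1 – MLR.6 are the „Classical linear model (CLM) assumptions“

Theorem 4.1 (Normal sampling distributions)

Under assumptions MLR.1 – MLR.6:

$\hat{\beta}_j \sim Normal(\beta_j, Var(\hat{\beta}_j))$
The estimators are normally distributed around the true parameters with the
variance that was derived earlier

$(\hat{\beta}_j - \beta_j) / sd(\hat{\beta}_j) \sim Normal(0, 1)$
The standardized estimators follow a standard normal distribution

Theorem 4.2 (t-distribution for standardized estimators)

Under assumptions MLR.1 – MLR.6:

$(\hat{\beta}_j - \beta_j) / se(\hat{\beta}_j) \sim t_{n-k-1}$

If the standardization is done using the estimated
standard deviation (= standard error), the normal
distribution is replaced by a t-distribution

Note: The t-distribution is close to the standard normal distribution if n-k-1 is large.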

Null hypothesis (for more general hypotheses, see below): $H_0: \beta_j = 0$

The population parameter is equal to zero, i.e. after
controlling for the other independent variables, there
is no effect of xj on y

t-statistic (or t-ratio)
The t-statistic will be used to test the above null hypothesis.
The farther the estimated coefficient is away from zero, the
less likely it is that the null hypothesis holds true. But what
does „far“ away from zero mean?

This depends on the variability of the estimated coefficient, i.e. its
standard deviation. The t-statistic measures how many estimated
standard deviations the estimated coefficient is away from zero:

$t_{\hat{\beta}_j} = \hat{\beta}_j / se(\hat{\beta}_j)$

Distribution of the t-statistic if the null hypothesis is true: $t_{\hat{\beta}_j} \sim t_{n-k-1}$

Goal: Define a rejection rule so that, if it is true, H0 is rejected

only with a small probability (= significance level, e.g. 5%)

Test $H_0: \beta_j = 0$ against $H_1: \beta_j > 0$.

Reject the null hypothesis in favour of the
alternative hypothesis if the estimated coefficient
is „too large“ (i.e. larger than a critical value).

Construct the critical value so that, if the
null hypothesis is true, it is rejected in,
for example, 5% of the cases.

In the given example, this is the point of the
t-distribution with 28 degrees of freedom that is
exceeded in 5% of the cases.

→ Reject if the t-statistic is greater than 1.701
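The one-sided rejection rule can be sketched as follows, assuming NumPy and simulated data with n - k - 1 = 28 so that the critical value 1.701 quoted above applies. The data-generating values are made up for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 31, 2                                   # n - k - 1 = 28 degrees of freedom
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([0.0, 2.0, 0.0]) + rng.normal(size=n)   # the second slope is truly zero

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
u = y - X @ b
df = n - k - 1
se = np.sqrt(u @ u / df * np.diag(XtX_inv))
t_stats = b / se                               # t-statistics for H0: beta_j = 0

CRIT_5PCT_DF28 = 1.701                         # one-sided 5% critical value from the slide
reject = t_stats > CRIT_5PCT_DF28              # True where H0 is rejected in favour of beta_j > 0
print(t_stats, reject)
```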

Example: Wage equation
Test whether, after controlling for education and tenure, higher work
experience leads to higher hourly wages

Standard errors

Test H0: the coefficient on experience is zero, against H1: it is positive.

One would either expect a positive effect of experience on hourly wage or no effect at all.

Degrees of freedom;
here the standard normal
approximation applies

Critical values for the 5% and the 1% significance level (these

are conventional significance levels).

The null hypothesis is rejected because the t-statistic exceeds

the critical value.

„The effect of experience on hourly wage is statistically greater

than zero at the 5% (and even at the 1%) significance level.“

Test $H_0: \beta_j = 0$ against $H_1: \beta_j < 0$.

Reject the null hypothesis in favour of the
alternative hypothesis if the estimated coefficient
is „too small“ (i.e. smaller than a critical value).

Construct the critical value so that, if the
null hypothesis is true, it is rejected in,
for example, 5% of the cases.

In the given example, this is the point of the
t-distribution with 18 degrees of freedom so that
5% of the cases are below the point.

→ Reject if the t-statistic is less than -1.734

Variables: percentage of students passing maths test; average annual teacher compensation; staff per one thousand students; school enrollment (= school size)

Test H0: the coefficient on school enrollment is zero, against H1: it is negative.

Do larger schools hamper student performance or is there no such effect?

Degrees of freedom;
here the standard normal
approximation applies

Critical values for the 5% and the 15% significance level.

The null hypothesis is not rejected because the t-statistic is

not smaller than the critical value.

One cannot reject the hypothesis that there is no effect of school size on
student performance (not even for a lax significance level of 15%).

R-squared slightly higher

Test H0: the coefficient on the enrollment variable is zero, against H1: it is negative.

t-statistic

Critical value for the 5% significance level → reject null hypothesis

The hypothesis that there is no effect of school size on student performance
can be rejected in favor of the hypothesis that the effect is negative.

How large is the effect? A 10% increase in enrollment → 0.129 percentage points
fewer students pass the test (small effect)

Test $H_0: \beta_j = 0$ against $H_1: \beta_j \neq 0$.

Reject the null hypothesis in favour of the
alternative hypothesis if the absolute value
of the estimated coefficient is too large.

Construct the critical value so that, if the
null hypothesis is true, it is rejected in,
for example, 5% of the cases.

In the given example, these are the points
of the t-distribution so that 5% of the cases
lie in the two tails.

→ Reject if the t-statistic is less than -2.06 or greater than 2.06,
i.e. if its absolute value is greater than 2.06

For critical values, use standard normal distribution

The effects of hsGPA and skipped are

significantly different from zero at the
1% significance level. The effect of ACT
is not significantly different from zero,
not even at the 10% significance level.

„Statistically significant“ variables in a regression
If a regression coefficient is significantly different from zero in a two-sided test, the
corresponding variable is said to be „statistically significant“
If the number of degrees of freedom is large enough so that the normal
approximation applies, the following rules of thumb apply:

$|t| > 1.645$ → „statistically significant at 10 % level“

$|t| > 1.96$ → „statistically significant at 5 % level“

$|t| > 2.576$ → „statistically significant at 1 % level“

Guidelines for discussing economic and statistical significance
If a variable is statistically significant, discuss the magnitude of the
coefficient to get an idea of its economic or practical importance
The fact that a coefficient is statistically significant does not necessarily
mean it is economically or practically significant!
If a variable is statistically and economically important but has the
„wrong“ sign, the regression model might be misspecified
If a variable is statistically insignificant at the usual levels (10%, 5%,
1%), one may think of dropping it from the regression
If the sample size is small, effects might be imprecisely estimated so
that the case for dropping insignificant variables is less strong

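Testing more general hypotheses about a regression coefficient: $H_0: \beta_j = a_j$

t-statistic: $t = (\hat{\beta}_j - a_j) / se(\hat{\beta}_j)$

The test works exactly as before, except that the hypothesized
value is subtracted from the estimate when forming the statistic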

Example: Campus crime and enrollment
An interesting hypothesis is whether crime increases by one percent
if enrollment is increased by one percent

Estimate is different from

one but is this difference
statistically significant?

The hypothesis is
rejected at the 5%
level

Computing p-values for t-tests
If the significance level is made smaller and smaller, there will be a
point where the null hypothesis cannot be rejected anymore
The reason is that lowering the significance level makes one
increasingly reluctant to commit the error of rejecting a correct H0
The smallest significance level at which the null hypothesis is still
rejected is called the p-value of the hypothesis test
A small p-value is evidence against the null hypothesis because one
would reject the null hypothesis even at small significance levels
A large p-value means that the data provide little evidence against the null hypothesis
P-values are more informative than tests at fixed significance levels

The p-value is the significance level at which
one is indifferent between rejecting and not
rejecting the null hypothesis.

In the two-sided case, the p-value is thus the
probability that the t-distributed variable takes
on a larger absolute value than the realized
value of the test statistic, i.e. p-value = $P(|T| > |t|)$.

From this, it is clear that a null hypothesis is
rejected if and only if the corresponding p-value
is smaller than the significance level.

For example, for a significance level of 5% the
t-statistic would not lie in the rejection region.

[Figure: density of the t-distribution, marking the critical values for a 5% significance level and the value of the test statistic]
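When the degrees of freedom are large, the two-sided p-value can be approximated with the standard normal distribution. The sketch below uses only the Python standard library and the identity $P(|Z| > t) = \mathrm{erfc}(t / \sqrt{2})$; the function name is illustrative. For small samples one would use the t-distribution with n-k-1 degrees of freedom instead, which this sketch does not do.

```python
import math

def two_sided_p_normal(t):
    """Two-sided p-value P(|Z| > |t|) under the standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))

# The conventional critical values map back to (approximately) 10%, 5% and 1%
for t in (1.645, 1.96, 2.576):
    print(t, two_sided_p_normal(t))
```

A test statistic with a larger absolute value gives a smaller p-value, so the null hypothesis is rejected at the 5% level exactly when the p-value is below 0.05.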

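Simple manipulation of the result in Theorem 4.2 implies that

$P(\hat{\beta}_j - c_{0.05} \cdot se(\hat{\beta}_j) \le \beta_j \le \hat{\beta}_j + c_{0.05} \cdot se(\hat{\beta}_j)) = 0.95$

Lower bound of the confidence interval: $\hat{\beta}_j - c_{0.05} \cdot se(\hat{\beta}_j)$
Upper bound of the confidence interval: $\hat{\beta}_j + c_{0.05} \cdot se(\hat{\beta}_j)$
Confidence level: 0.95 (where $c_{0.05}$ is the critical value of the two-sided test at the 5% level)

Interpretation of the confidence interval

The bounds of the interval are random
In repeated samples, the interval that is constructed in the above way
will cover the population regression coefficient in 95% of the cases

Use rules of thumb for the critical values if the degrees of freedom are large: $c_{0.01} = 2.576$, $c_{0.05} = 1.96$, $c_{0.10} = 1.645$

Relationship between confidence intervals and hypotheses tests

$a_j \notin$ interval → reject $H_0: \beta_j = a_j$ in favor of $H_1: \beta_j \neq a_j$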
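The repeated-sampling interpretation of a confidence interval can be checked with a small Monte Carlo experiment. This is only a sketch under assumed values (simple regression, normal errors, true slope 0.5, rule-of-thumb critical value 1.96), using NumPy; none of the numbers come from the slides.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps, beta1 = 200, 2000, 0.5
covered = 0
for _ in range(reps):
    x = rng.normal(size=n)
    y = 1.0 + beta1 * x + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.solve(X.T @ X, X.T @ y)
    u = y - X @ b
    # Standard error of the slope in a simple regression: sqrt(sigma2_hat / SST_x)
    se1 = np.sqrt(u @ u / (n - 2) / np.sum((x - x.mean()) ** 2))
    lower, upper = b[1] - 1.96 * se1, b[1] + 1.96 * se1
    covered += int(lower <= beta1 <= upper)    # does the interval cover the true slope?
coverage = covered / reps
print(coverage)
```

The share of intervals that cover the true slope should be close to 0.95, while each individual interval either covers it or does not.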

Variables: spending on R&D; annual sales; profits as percentage of sales

The effect of sales on R&D is relatively precisely estimated
as the interval is narrow. Moreover, the effect is significantly
different from zero because zero is outside the interval.

The effect of profits as a percentage of sales is imprecisely estimated as the
interval is very wide. It is not even statistically
significant because zero lies in the interval.

Testing hypotheses about a linear combination of parameters
Example: Return to education at 2 year vs. at 4 year colleges
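Variables: years of education at 2 year colleges; years of education at 4 year colleges

Test $H_0: \beta_1 - \beta_2 = 0$ against $H_1: \beta_1 - \beta_2 < 0$, where $\beta_1$ and $\beta_2$ are the coefficients on years at 2 year and at 4 year colleges.

A possible test statistic would be: $t = (\hat{\beta}_1 - \hat{\beta}_2) / se(\hat{\beta}_1 - \hat{\beta}_2)$

The difference between the estimates is normalized by the estimated
standard deviation of the difference. The null hypothesis would have
to be rejected if the statistic is „too negative“ to believe that the true
difference between the parameters is equal to zero.

The standard error $se(\hat{\beta}_1 - \hat{\beta}_2)$ is usually not available in regression output

Alternative method

Define $\theta_1 = \beta_1 - \beta_2$ and test $H_0: \theta_1 = 0$ against $H_1: \theta_1 < 0$.

Insert $\beta_1 = \theta_1 + \beta_2$ into the original regression; this creates a new regressor (= total years of college)

Hypothesis is rejected at 10%
level but not at 5% level

This method always works for single linear hypotheses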
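The reparametrization trick can be verified numerically. In the sketch below (NumPy, simulated data), `x1` and `x2` are hypothetical stand-ins for the two education variables. Regressing y on `x1` and the sum `x1 + x2` makes the coefficient on `x1` equal to the difference of the original slope estimates, and its standard error is then reported directly.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and their standard errors (X includes the constant)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    u = y - X @ b
    dof = X.shape[0] - X.shape[1]
    se = np.sqrt(u @ u / dof * np.diag(XtX_inv))
    return b, se

rng = np.random.default_rng(4)
n = 400
x1 = rng.normal(size=n)                  # stand-in for years at 2 year colleges
x2 = rng.normal(size=n)                  # stand-in for years at 4 year colleges
y = 1.0 + 0.06 * x1 + 0.08 * x2 + rng.normal(scale=0.4, size=n)
const = np.ones(n)

b, _ = ols(np.column_stack([const, x1, x2]), y)               # original model
b_rep, se_rep = ols(np.column_stack([const, x1, x1 + x2]), y)  # reparametrized model

theta_hat, se_theta = b_rep[1], se_rep[1]   # estimate and standard error of beta1 - beta2
t_theta = theta_hat / se_theta              # t-statistic for H0: theta1 = 0
print(theta_hat, b[1] - b[2], t_theta)
```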

Testing multiple linear restrictions: The F-test
Testing exclusion restrictions
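Variables: salary of major league baseball player; years in the league; average number of games per year; batting average; home runs per year; runs batted in per year

$H_0: \beta_3 = 0, \beta_4 = 0, \beta_5 = 0$ against $H_1$: $H_0$ is not true
(where $\beta_3$, $\beta_4$, $\beta_5$ are the coefficients on batting average, home runs per year and runs batted in per year)

Test whether performance measures have no effect/can be excluded from regression.

None of these variables is statistically significant when tested individually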

Idea: How would the model fit be if these variables were dropped from the regression?

The sum of squared residuals necessarily increases, but is the increase statistically significant?

Test statistic: $F = \frac{(SSR_r - SSR_{ur}) / q}{SSR_{ur} / (n - k - 1)} \sim F_{q, n-k-1}$

Here q is the number of restrictions and n-k-1 is the degrees of freedom in the unrestricted model

The relative increase of the sum of
squared residuals when going from
H1 to H0 follows an F-distribution (if
the null hypothesis H0 is correct)

An F-distributed variable only takes on positive
values. This corresponds to the fact that the
sum of squared residuals can only increase if
one moves from H1 to H0.

Choose the critical value so that the null hypothesis
is rejected in, for example, 5% of the
cases, although it is true.

The null hypothesis is overwhelmingly
rejected (even at very
small significance levels).

Discussion
The three variables are „jointly significant“
They were not significant when tested individually
The likely reason is multicollinearity between them
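The F-statistic for exclusion restrictions only needs the sums of squared residuals of the restricted and the unrestricted regression. A minimal sketch with NumPy on simulated data follows; the two highly correlated regressors `x2` and `x3` are hypothetical and merely imitate the situation of jointly relevant but collinear variables.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    return u @ u

rng = np.random.default_rng(5)
n = 300
x1 = rng.normal(size=n)
z = rng.normal(size=n)
x2 = z + 0.05 * rng.normal(size=n)       # x2 and x3 are almost collinear
x3 = z + 0.05 * rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 0.3 * x2 + 0.3 * x3 + rng.normal(size=n)
const = np.ones(n)

k, q = 3, 2                              # k regressors in total, q exclusion restrictions
ssr_ur = ssr(np.column_stack([const, x1, x2, x3]), y)   # unrestricted model
ssr_r = ssr(np.column_stack([const, x1]), y)            # restricted model (x2, x3 dropped)
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
print(F)
```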
Test of overall significance of a regression

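$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$

The null hypothesis states that the explanatory
variables are not useful at all in explaining the
dependent variable

Restricted model (regression on constant): $y = \beta_0 + u$

$F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)} \sim F_{k, n-k-1}$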

The test of overall significance is reported in most regression

packages; the null hypothesis is usually overwhelmingly rejected
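For the test of overall significance, the restricted model contains only a constant, so its sum of squared residuals equals the total sum of squares. The sketch below (NumPy, simulated data with assumed coefficients) checks that the F-statistic based on sums of squared residuals coincides with the version written in terms of the R-squared.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 150, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.4, 0.0, -0.2]) + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
ssr_ur = np.sum((y - X @ b) ** 2)        # unrestricted sum of squared residuals
ssr_r = np.sum((y - y.mean()) ** 2)      # regression on a constant only: SSR_r = SST
r2 = 1.0 - ssr_ur / ssr_r

F_ssr = ((ssr_r - ssr_ur) / k) / (ssr_ur / (n - k - 1))
F_r2 = (r2 / k) / ((1.0 - r2) / (n - k - 1))
print(F_ssr, F_r2)
```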

Testing general linear restrictions with the F-test
Example: Test whether house price assessments are rational
Variables: actual house price; the assessed housing value (before the house was sold); size of lot (in feet); square footage; number of bedrooms

If house price assessments are rational, a 1% change in the
assessment should be associated with a 1% change in price: $\beta_1 = 1$

In addition, other known factors should
not influence the price once the assessed
value has been controlled for: $\beta_2 = 0, \beta_3 = 0, \beta_4 = 0$

(here $\beta_1$ is the coefficient on the assessed value and $\beta_2$, $\beta_3$, $\beta_4$ are the coefficients on the other factors)

Restricted regression: the restricted model is actually a
regression of [y-x1] on a constant

Test statistic

cannot be rejected

When tested individually,

there is also no evidence
against the rationality of
house price assessments

The F-test works for general multiple linear hypotheses

For all tests and confidence intervals, validity of assumptions
MLR.1 – MLR.6 has been assumed. Tests may be invalid otherwise.

Multiple Regression
Analysis: OLS Asymptotics

Chapter 5

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

So far we focused on properties of OLS that hold for any sample
Properties of OLS that hold for any sample/sample size
Expected values/unbiasedness under MLR.1 – MLR.4
Variance formulas under MLR.1 – MLR.5
Gauss-Markov Theorem under MLR.1 – MLR.5
Exact sampling distributions/tests under MLR.1 – MLR.6

Properties of OLS that hold in large samples (without assuming normality of the error term!)

Consistency under MLR.1 – MLR.4

Asymptotic normality/tests under MLR.1 – MLR.5

An estimator $\hat{\theta}_n$ is consistent for a population parameter $\theta$ if

$P(|\hat{\theta}_n - \theta| < \epsilon) \to 1$ for arbitrary $\epsilon > 0$ and $n \to \infty$.

Alternative notation: $plim \hat{\theta}_n = \theta$
(the estimate converges in probability to the true population value)
Interpretation:
Consistency means that the probability that the estimate is arbitrarily
close to the true population value can be made arbitrarily high by
increasing the sample size
Consistency is a minimum requirement for sensible estimators
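The definition of consistency can be illustrated by simulation. The sketch below (NumPy) estimates, for several sample sizes, the share of samples in which the OLS slope lies within a fixed distance `eps` of the true slope. The true slope, `eps`, and the sample sizes are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
beta1, eps, reps = 0.5, 0.1, 500

def share_close(n):
    """Share of simulated samples of size n with |b1 - beta1| < eps."""
    hits = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        y = 1.0 + beta1 * x + rng.normal(size=n)
        b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # OLS slope in simple regression
        hits += int(abs(b1 - beta1) < eps)
    return hits / reps

shares = {n: share_close(n) for n in (25, 100, 400, 1600)}
print(shares)
```

The share moves towards one as n grows, which is what convergence in probability means for this particular `eps`.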

Special case of simple regression model

$plim \hat{\beta}_1 = \beta_1 + Cov(x_1, u) / Var(x_1)$

One can see that the slope estimate is consistent
if the explanatory variable is exogenous, i.e. uncorrelated
with the error term.

Assumption MLR.4‘: $E(u) = 0$ and $Cov(x_j, u) = 0$ for $j = 1, \dots, k$

All explanatory variables must be uncorrelated with the
error term. This assumption is weaker than the zero
conditional mean assumption MLR.4.

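True model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + v$

Misspecified model: $y = \beta_0 + \beta_1 x_1 + u$, with $u = \beta_2 x_2 + v$

$plim \tilde{\beta}_1 = \beta_1 + Cov(x_1, u) / Var(x_1) = \beta_1 + \beta_2 Cov(x_1, x_2) / Var(x_1)$

The second term is the (asymptotic) bias

There is no omitted variable bias if the omitted variable is
irrelevant or uncorrelated with the included variable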

Asymptotic normality and large sample inference
In practice, the normality assumption MLR.6 is often questionable
If MLR.6 does not hold, the results of t- or F-tests may be wrong
Fortunately, F- and t-tests still work if the sample size is large enough
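Also, OLS estimates are approximately normally distributed in large samples even without MLR.6

Theorem 5.2 (Asymptotic normality of OLS)

Under assumptions MLR.1 – MLR.5:

$(\hat{\beta}_j - \beta_j) / se(\hat{\beta}_j) \overset{a}{\sim} Normal(0, 1)$

In large samples, the standardized estimates are
approximately normally distributed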

Practical consequences
In large samples, the t-distribution is close to the N(0,1) distribution
As a consequence, t-tests are valid in large samples without MLR.6
The same is true for confidence intervals and F-tests
Important: MLR.1 – MLR.5 are still necessary, esp. homoscedasticity

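Asymptotic analysis of the OLS sampling errors

$\widehat{Var}(\hat{\beta}_j) = \hat{\sigma}^2 / [SST_j (1 - R_j^2)]$

$\hat{\sigma}^2$ converges to $\sigma^2$

$SST_j / n$ converges to $Var(x_j)$; $R_j^2$ converges to a fixed number

$\widehat{Var}(\hat{\beta}_j)$ shrinks with the rate $1/n$; $se(\hat{\beta}_j)$ shrinks with the rate $\sqrt{1/n}$

This is why large samples are better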

Example: Standard errors in a birth weight equation

Use only the first half of observations
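The effect of halving the sample on the standard error can be imitated with simulated data. This is a sketch with NumPy and made-up parameters, not the birth weight data from the example. If the standard error shrinks with rate $\sqrt{1/n}$, using only the first half of the observations should inflate it by a factor of roughly $\sqrt{2} \approx 1.41$.

```python
import numpy as np

def slope_se(x, y):
    """Standard error of the OLS slope in a simple regression of y on x."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.solve(X.T @ X, X.T @ y)
    u = y - X @ b
    return np.sqrt(u @ u / (n - 2) / np.sum((x - x.mean()) ** 2))

rng = np.random.default_rng(8)
n = 2000
x = rng.normal(size=n)
y = 2.0 + 0.3 * x + rng.normal(size=n)

se_full = slope_se(x, y)
se_half = slope_se(x[: n // 2], y[: n // 2])   # first half of the observations only
ratio = se_half / se_full
print(ratio)
```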

Multiple Regression
Analysis: Further Issues

Chapter 6

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

More on Functional Form
More on using logarithmic functional forms
Convenient percentage/elasticity interpretation
Slope coefficients of logged variables are invariant to rescalings
Taking logs often eliminates/mitigates problems with outliers
Taking logs often helps to secure normality and homoscedasticity
Variables measured in units such as years should not be logged
Variables measured in percentage points should also not be logged
Logs must not be used if variables take on zero or negative values
It is hard to reverse the log-operation when constructing predictions

Marginal effect of experience: .298 - 2(.0061) exper

The first year of experience increases
the wage by some 30 cents, the second
year by .298-2(.0061)(1) = .29, i.e. about 29 cents, etc.

Does this mean the return to experience
becomes negative after 24.4 years?

Not necessarily. It depends on how many
observations in the sample lie right of the
turnaround point.

In the given example, these are about 28%
of the observations. There may be a specification
problem (e.g. omitted variables).
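The arithmetic behind the marginal effect and the turnaround point can be reproduced directly from the two coefficients quoted above (.298 on experience and -.0061 on its square). The function and variable names below are illustrative.

```python
# Coefficients quoted on the slide for experience and experience squared
b_exper = 0.298
b_exper_sq = -0.0061

def marginal_effect(exper):
    """Approximate effect of one more year of experience at a given level of experience."""
    return b_exper + 2.0 * b_exper_sq * exper

# The quadratic reaches its maximum where the marginal effect is zero
turnaround = -b_exper / (2.0 * b_exper_sq)

print(marginal_effect(0), marginal_effect(1), turnaround)
```

The marginal effect declines linearly in experience and changes sign at about 24.4 years, which is the turnaround point mentioned in the text.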

Example: Effects of pollution on housing prices

Does this mean that, at a low number of rooms,

more rooms are associated with lower prices?

Turnaround point:

This area can be ignored as

it concerns only 1% of the
observations.

Increase rooms from 5 to 6:

Increase rooms from 6 to 7:

Higher polynomials

Interaction term

The effect of the number

of bedrooms depends on
the level of square footage

Interaction effects complicate interpretation of parameters

Effect of number of bedrooms, but for a square footage of zero

Effect of x2 if all variables take on their mean values

Advantages of reparametrization
Easy interpretation of all parameters
Standard errors for partial effects at the mean values available
If necessary, interaction may be centered at other interesting values
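The reparametrization can be checked numerically. In this sketch (NumPy, simulated data, hypothetical variables `x1` and `x2`), the interaction is centered at the sample means. The coefficient on `x2` in the centered regression then equals the partial effect of `x2` evaluated at the mean of `x1`, which in the original parametrization would have to be assembled from two coefficients.

```python
import numpy as np

def ols(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

rng = np.random.default_rng(9)
n = 500
x1 = rng.normal(loc=5.0, scale=1.0, size=n)
x2 = rng.normal(loc=3.0, scale=1.0, size=n)
y = 1.0 + 0.2 * x1 + 0.1 * x2 + 0.05 * x1 * x2 + rng.normal(size=n)
const = np.ones(n)

# Original parametrization: y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + u
b = ols(np.column_stack([const, x1, x2, x1 * x2]), y)

# Reparametrization with the interaction centered at the sample means
m1, m2 = x1.mean(), x2.mean()
d = ols(np.column_stack([const, x1, x2, (x1 - m1) * (x2 - m2)]), y)

effect_x2_at_mean = b[2] + b[3] * m1     # partial effect of x2 at the mean of x1
print(d[2], effect_x2_at_mean)
```

Because the coefficient appears directly in the regression output of the centered model, its standard error is available without further calculation.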

More on goodness-of-fit and selection of regressors
General remarks on R-squared
A high R-squared does not imply that there is a causal interpretation
A low R-squared does not preclude precise estimation of partial effects
Adjusted R-squared
What is the ordinary R-squared supposed to measure?

$R^2 = 1 - (SSR/n) / (SST/n)$ is an estimate for $1 - \sigma_u^2 / \sigma_y^2$ (the population R-squared)

A better estimate taking into account degrees of freedom would be

$\bar{R}^2 = 1 - [SSR/(n-k-1)] / [SST/(n-1)]$

The adjusted R-squared imposes a penalty for adding new regressors

The adjusted R-squared increases if, and only if, the t-statistic of a
newly added regressor is greater than one in absolute value
Relationship between R-squared and adjusted R-squared

$\bar{R}^2 = 1 - (1 - R^2)(n-1)/(n-k-1)$

The adjusted R-squared
may even become negative
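The statement about the t-statistic of a newly added regressor can be checked on simulated data. The sketch below (NumPy; all names and values are illustrative) fits a model with and without an additional, irrelevant regressor and compares the change in the adjusted R-squared with the absolute t-statistic of the added variable.

```python
import numpy as np

def fit(X, y):
    """Return R-squared, adjusted R-squared and t-statistics (X includes the constant)."""
    n, p = X.shape                           # p = k + 1 estimated coefficients
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    u = y - X @ b
    ssr = u @ u
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ssr / sst
    adj_r2 = 1.0 - (ssr / (n - p)) / (sst / (n - 1))
    t = b / np.sqrt(ssr / (n - p) * np.diag(XtX_inv))
    return r2, adj_r2, t

rng = np.random.default_rng(10)
n = 120
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)
X_small = np.column_stack([np.ones(n), x])
X_big = np.column_stack([X_small, rng.normal(size=n)])   # adds an irrelevant regressor

r2_s, adj_s, _ = fit(X_small, y)
r2_b, adj_b, t_b = fit(X_big, y)
t_new = t_b[-1]                              # t-statistic of the newly added regressor
print(r2_s, r2_b, adj_s, adj_b, t_new)
```

The ordinary R-squared never falls when a regressor is added, whereas the adjusted R-squared rises only if the absolute t-statistic of the new regressor exceeds one.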

Using adjusted R-squared to choose between nonnested models
Models are nonnested if neither model is a special case of the other

A comparison between the R-squared of both models would be unfair

to the first model because the first model contains fewer parameters
In the given example, even after adjusting for the difference in
degrees of freedom, the quadratic model is preferred

Comparing models with different dependent variables
R-squared or adjusted R-squared must not be used to compare models
which differ in their definition of the dependent variable
Example: CEO compensation and firm performance

There is much less variation in log(salary) that needs to be explained than in salary

Controlling for too many factors in regression analysis
In some cases, certain variables should not be held fixed
In a regression of traffic fatalities on state beer taxes (and other
factors) one should not directly control for beer consumption
In a regression of family health expenditures on pesticide usage
among farmers one should not control for doctor visits
Different regressions may serve different purposes
In a regression of house prices on house characteristics, one would
only include price assessments if the purpose of the regression is to
study their validity; otherwise one would not include them

Adding regressors may exacerbate multicollinearity problems

On the other hand, adding regressors reduces the error variance

Variables that are uncorrelated with other regressors should be added

because they reduce error variance without increasing multicollinearity

However, such uncorrelated variables may be hard to find

Example: Individual beer consumption and beer prices

Including individual characteristics in a regression of beer consumption

on beer prices leads to more precise estimates of the price elasticity

Under the additional assumption that the error term $u$ is independent of the explanatory variables $x_1, \dots, x_k$:

Prediction for y

These are the R-squareds for the predictions of the unlogged

salary variable (although the second regression is originally for
logged salaries). Both R-squareds can now be directly compared.

Chapter 7

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

© 2012 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Multiple Regression Analysis:
Qualitative Information
Qualitative Information
Examples: gender, race, industry, region, rating grade, …
A way to incorporate qualitative information is to use dummy variables
They may appear as the dependent or as independent variables

A single dummy independent variable

$wage = \beta_0 + \delta_0 female + \beta_1 educ + u$

$\delta_0$ = the wage gain/loss if the person
is a woman rather than a man
(holding other things fixed)

Dummy variable female:
=1 if the person is a woman
=0 if the person is a man

Alternative interpretation of coefficient:

$\delta_0 = E(wage \mid female = 1, educ) - E(wage \mid female = 0, educ)$

i.e. the difference in mean wage between
men and women with the same level of
education.

Intercept shift

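When using dummy variables, one category always has to be omitted:

$wage = \beta_0 + \delta_0 female + \beta_1 educ + u$: the base category is men

$wage = \alpha_0 + \gamma_0 male + \beta_1 educ + u$: the base category is women

Alternatively, one could omit the intercept: $wage = \beta_0 male + \alpha_0 female + \beta_1 educ + u$

Disadvantages:
1) More difficult to test for differences
between the parameters
2) R-squared formula only valid
if regression contains intercept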
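Why one category has to be omitted can be seen from the rank of the regressor matrix. In this sketch (NumPy, simulated dummy), including a constant together with dummies for both categories makes the columns linearly dependent, because the two dummies add up to the constant. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 100
female = rng.integers(0, 2, size=n).astype(float)   # 1 = woman, 0 = man
male = 1.0 - female
const = np.ones(n)

X_both = np.column_stack([const, female, male])     # constant plus both dummies
X_base = np.column_stack([const, female])           # constant plus one dummy (base: men)
X_noint = np.column_stack([male, female])           # both dummies, no intercept

rank_both = np.linalg.matrix_rank(X_both)
rank_base = np.linalg.matrix_rank(X_base)
rank_noint = np.linalg.matrix_rank(X_noint)
print(rank_both, rank_base, rank_noint)
```

The first matrix has three columns but only rank two, so its coefficients are not separately identified; the other two specifications have full column rank.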

Holding education, experience,
and tenure fixed, women earn
$1.81 less per hour than men

Does that mean that women are discriminated against?

Not necessarily. Being female may be correlated with other productivity
characteristics that have not been controlled for.

Not holding other factors constant, women
earn $2.51 per hour less than men, i.e. the
difference between the mean wage of men
and that of women is $2.51.

Discussion
It can easily be tested whether difference in means is significant
The wage difference between men and women is larger if no other
things are controlled for; i.e. part of the difference is due to differences
in education, experience and tenure between men and women
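That a regression on a constant and a single dummy reproduces the difference in group means can be confirmed with a few lines of NumPy. The wage data below are simulated with arbitrary values and are not the data used in the example.

```python
import numpy as np

rng = np.random.default_rng(13)
n = 300
female = rng.integers(0, 2, size=n)
wage = 7.0 - 2.5 * female + rng.normal(scale=3.0, size=n)   # made-up wages

X = np.column_stack([np.ones(n), female])
b, *_ = np.linalg.lstsq(X, wage, rcond=None)

mean_men = wage[female == 0].mean()
mean_women = wage[female == 1].mean()
diff_means = mean_women - mean_men
print(b, mean_men, diff_means)
```

The intercept equals the mean wage of the base category and the dummy coefficient equals the difference in means, so the usual t-statistic on the dummy tests whether that difference is significant.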

Hours training per employee Dummy indicating whether firm received training grant

This is an example of program evaluation

Treatment group (= grant receivers) vs. control group (= no grant)
Is the effect of treatment on the outcome of interest causal?

Dummy indicating whether house is of colonial style

As the dummy for colonial
style changes from 0 to 1,
the house price increases
by approximately 5.4 percent

Using dummy variables for multiple categories
1) Define membership in each category by a dummy variable
2) Leave out one category (which becomes the base category)

Holding other things fixed, married

women earn 19.8% less than single
men (= the base category)

Incorporating ordinal information using dummy variables
Example: City credit ratings and municipal bond interest rates

Municipal bond rate Credit rating from 0-4 (0=worst, 4=best)

This specification would probably not be appropriate as the credit rating only contains
ordinal information. A better way to incorporate this information is to define dummies:

Dummies indicating whether the particular rating applies, e.g. CR1=1 if CR=1 and CR1=0
otherwise. All effects are measured in comparison to the worst rating (= base category).

$\beta_0$ = intercept men, $\beta_1$ = slope men

$\beta_0 + \delta_0$ = intercept women, $\beta_1 + \delta_1$ = slope women

Interesting hypotheses

The return to education is the same for men and women

The whole wage equation is the same for men and women

Interacting both the intercept and the slope with the female dummy enables one to model completely independent wage equations for men and women

No evidence against hypothesis that the return to education is the same for men and women

Does this mean that there is no significant evidence of lower pay for women at the same levels of educ, exper, and tenure? No: this is only the effect for educ = 0. To answer the question one has to recenter the interaction term, e.g. around educ = 12.5 (= average education).

Testing for differences in regression functions across groups
Unrestricted model (contains full set of interactions)

College grade point average
Standardized aptitude test score
High school rank percentile
Total hours spent in college courses

Restricted model (same regression for both groups)

Null hypothesis: All interaction effects are zero, i.e. the same regression coefficients apply to men and women

Estimation of the unrestricted model

Tested individually,
the hypothesis that
the interaction effects
are zero cannot be
rejected

Alternative way to compute F-statistic in the given case

Run separate regressions for men and for women; the unrestricted
SSR is given by the sum of the SSR of these two regressions
Run regression for the restricted model and store SSR
If the test is computed in this way it is called the Chow-Test
Important: Test assumes a constant error variance across groups
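
The steps above can be sketched in a few lines. This is only an illustration on simulated data, not the GPA example from the slides; the helper names `ssr` and `chow_f` and all numbers are assumptions made for the sketch.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared residuals from OLS of y on X (X already contains a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def chow_f(y, X, group):
    """Chow F-statistic for H0: both groups share all regression coefficients."""
    n, p = X.shape                                   # p = k + 1 coefficients per group
    g = np.asarray(group, dtype=bool)
    ssr_r = ssr(y, X)                                # restricted: one pooled regression
    ssr_ur = ssr(y[g], X[g]) + ssr(y[~g], X[~g])     # unrestricted: separate regressions
    return ((ssr_r - ssr_ur) / p) / (ssr_ur / (n - 2 * p))

# Simulated data, purely illustrative: intercept and slope differ between the groups.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
female = rng.integers(0, 2, size=n)
y = 1.0 + 0.5 * x - 0.8 * female + 0.4 * female * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
f_stat = chow_f(y, X, female)
print(f"Chow F-statistic: {f_stat:.2f}")
```

The statistic is compared with an F distribution with k+1 and n-2(k+1) degrees of freedom; as noted above, this is only valid if the error variance is the same in both groups.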

A binary dependent variable: the linear probability model
Linear regression when the dependent variable is binary

If the dependent variable only takes on the values 1 and 0

Linear probability model (LPM)

In the linear probability model, the coefficients describe the effect of the explanatory variables on the probability that y=1

=1 if in labor force, =0 otherwise
Non-wife income (in thousand dollars per year)

If the number of kids under six years increases by one, the probability that the woman works falls by 26.2 percentage points

Does not look significant (but see below)

Graph for nwifeinc=50, exper=5, age=30, kidslt6=1, kidsge6=0

The maximum level of education in the sample is educ=17. For the given case, this leads to a predicted probability to be in the labor force of about 50%.

Negative predicted probability but no problem because no woman in the sample has educ < 5.

Disadvantages of the linear probability model
Predicted probabilities may be larger than one or smaller than zero
Marginal probability effects sometimes logically impossible
The linear probability model is necessarily heteroskedastic

Variance of Bernoulli variable

Heteroscedasticity-consistent standard errors need to be computed

Advantages of the linear probability model

Easy estimation and interpretation
Estimated effects and predictions often reasonably good in practice

Percentage of defective items
=1 if firm received training grant, =0 otherwise

No apparent effect of grant on productivity

Treatment group: grant receivers; control group: firms that received no grant

Grants were given on a first-come, first-served basis. This is not the same as giving them out
randomly. It might be the case that firms with less productive workers saw an opportunity to
improve productivity and applied first.

Self-selection into treatment as a source for endogeneity
In the given and in related examples, the treatment status is probably
related to other characteristics that also influence the outcome
The reason is that subjects self-select into treatment
depending on their individual characteristics and prospects
Experimental evaluation
In experiments, assignment to treatment is random
In this case, causal effects can be inferred using a simple regression

The dummy indicating whether or not there was treatment is unrelated to other factors affecting the outcome.

Dummy indicating whether loan was approved
Race dummy
Credit rating

It is important to control for other characteristics that may be important for loan approval (e.g. profession, unemployment)
Omitting important characteristics that are correlated with the nonwhite dummy will produce spurious evidence for discrimination

Heteroscedasticity

Chapter 8

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

OLS still unbiased and consistent under heteroscedasticity!

Also, interpretation of R-squared is not changed

Unconditional error variance is unaffected by heteroscedasticity (which refers to the conditional error variance)

Heteroscedasticity invalidates variance formulas for OLS estimators

The usual F-tests and t-tests are not valid under heteroscedasticity

Under heteroscedasticity, OLS is no longer the best linear unbiased estimator (BLUE); there may be more efficient linear estimators

Multiple Regression Analysis:
Heteroscedasticity
Heteroscedasticity-robust inference after OLS
Formulas for OLS standard errors and related statistics have been
developed that are robust to heteroscedasticity of unknown form
All formulas are only valid in large samples
Formula for heteroscedasticity-robust OLS standard error
Also called White/Eicker standard errors. They involve
the squared residuals from the regression and from a
regression of xj on all other explanatory variables.

Using these formulas, the usual t-test is valid asymptotically

The usual F-statistic does not work under heteroscedasticity, but
heteroscedasticity robust versions are available in most software

Heteroscedasticity-robust standard errors may be larger or smaller than their nonrobust counterparts.
The differences are often small in practice.

F-statistics are also often not too different.

If there is strong heteroscedasticity, differences may be larger.

To be on the safe side, it is advisable to always compute robust
standard errors.
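
A minimal sketch of how the usual and the robust standard errors can be computed side by side. It uses the basic White/Eicker sandwich form without any small-sample correction (an assumption; software packages often apply one), and the data are simulated so that the error spread grows with x.

```python
import numpy as np

def ols_with_se(y, X):
    """OLS coefficients with the usual and the White/Eicker robust standard errors."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    u = y - X @ beta
    # Usual formula: sigma^2 * (X'X)^{-1}, valid only under homoscedasticity.
    se_usual = np.sqrt(np.diag(xtx_inv) * (u @ u) / (n - p))
    # Robust sandwich: (X'X)^{-1} X' diag(u^2) X (X'X)^{-1}.
    meat = (X * u[:, None] ** 2).T @ X
    se_robust = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))
    return beta, se_usual, se_robust

# Simulated data, purely illustrative: the error standard deviation is proportional to x.
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(1, 5, size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n) * x
X = np.column_stack([np.ones(n), x])
beta, se_usual, se_robust = ols_with_se(y, X)
print("coefficients:", beta)
print("usual se:    ", se_usual)
print("robust se:   ", se_robust)
```

Dividing a coefficient by its robust standard error gives the t-statistic that is valid asymptotically.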

Testing for heteroscedasticity
It may still be interesting whether there is heteroscedasticity because
then OLS may not be the most efficient linear estimator anymore

Breusch-Pagan test for heteroscedasticity

Under MLR.4

The mean of u^2 must not vary with x1, x2, …, xk

Regress squared residuals on all explanatory variables and test whether this regression has explanatory power.

A large test statistic (= a high R-squared) is evidence against the null hypothesis.

Alternative test statistic (= Lagrange multiplier statistic, LM).

Again, high values of the test statistic (= high R-squared) lead
to rejection of the null hypothesis that the expected value of u^2
is unrelated to the explanatory variables.
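
Both versions of the Breusch-Pagan statistic can be sketched as follows. The function names and the simulated data are assumptions for illustration; the F-version is the overall F-test of the residual regression and the LM version is n times its R-squared.

```python
import numpy as np

def r_squared(y, X):
    """R-squared from OLS of y on X (X contains a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def breusch_pagan(y, X):
    """Return (F, LM) from regressing squared OLS residuals on all explanatory variables."""
    n, p = X.shape
    k = p - 1                                  # number of explanatory variables
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ beta) ** 2                   # squared OLS residuals
    r2 = r_squared(u2, X)
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    lm_stat = n * r2                           # approx. chi-squared with k df under H0
    return f_stat, lm_stat

# Simulated data, purely illustrative: the error variance increases with x.
rng = np.random.default_rng(2)
n = 1000
x = rng.uniform(1, 5, size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n) * x
X = np.column_stack([np.ones(n), x])
f_stat, lm_stat = breusch_pagan(y, X)
print(f"F = {f_stat:.1f}, LM = {lm_stat:.1f}")
```

With one explanatory variable, the LM statistic is compared with the chi-squared(1) critical value (3.84 at the 5% level).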

Heteroscedasticity

In the logarithmic specification, homoscedasticity cannot be rejected

White test for heteroscedasticity

Regress squared residuals on all explanatory variables, their squares, and interactions (here: example for k=3)

The White test detects more general deviations from homoscedasticity than the Breusch-Pagan test

Disadvantage of this form of the White test

Including all squares and interactions leads to a large number of estimated parameters (e.g. k=6 leads to 27 parameters to be estimated)

This regression indirectly tests the dependence of the squared residuals on the explanatory variables, their squares, and interactions, because the predicted value of y and its square implicitly contain all of these terms.

Example: Heteroscedasticity in (log) housing price equations

The functional form of the heteroscedasticity is known

Transformed model

Note that this regression model has no intercept

The transformed model is homoscedastic

If the other Gauss-Markov assumptions hold as well, OLS applied to the transformed model is the best linear unbiased estimator!

Observations with a large variance get a smaller weight in the optimization problem

Why is WLS more efficient than OLS in the original model?

Observations with a large variance are less informative than observations
with small variance and therefore should get less weight

WLS is a special case of generalized least squares (GLS)
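
A small simulation sketch of the efficiency argument, assuming the variance function is known to be h(x) = x^2 (an assumption chosen for the illustration, as are the helper names). WLS is computed as OLS on the data divided by the square root of h, which is exactly the transformed model without an intercept.

```python
import numpy as np

def ols(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def wls(y, X, h):
    """WLS for Var(u|x) = sigma^2 * h(x): OLS after dividing every variable by sqrt(h)."""
    w = 1.0 / np.sqrt(h)
    # The constant column becomes 1/sqrt(h), so the transformed model has no intercept.
    return ols(y * w, X * w[:, None])

# Repeated simulated samples, purely illustrative.
rng = np.random.default_rng(3)
n, reps = 200, 500
slopes_ols, slopes_wls = [], []
for _ in range(reps):
    x = rng.uniform(1, 5, size=n)
    y = 2.0 + 1.5 * x + rng.normal(size=n) * x      # Var(u|x) = x^2, so h(x) = x^2
    X = np.column_stack([np.ones(n), x])
    slopes_ols.append(ols(y, X)[1])
    slopes_wls.append(wls(y, X, x ** 2)[1])
sd_ols, sd_wls = np.std(slopes_ols), np.std(slopes_wls)
print(f"sd of slope estimates: OLS {sd_ols:.3f}, WLS {sd_wls:.3f}")
```

Both estimators are centered on the true slope, but the WLS estimates vary less across samples.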

Assumed form of heteroscedasticity:

WLS estimates have considerably smaller standard errors (which is in line with the expectation that they are more efficient).

Participation in 401K pension plan

Important special case of heteroscedasticity
If the observations are reported as averages at the city/county/state/country/firm level, they should be weighted by the size of the unit

Average contribution to pension plan in firm i
Average earnings and age in firm i
Percentage firm contributes to plan
Heteroscedastic error term

Error variance if errors are homoscedastic at the employee level

If errors are homoscedastic at the employee level, WLS with weights equal to firm size mi should
be used. If the assumption of homoscedasticity at the employee level is not exactly right, one can
calculate robust standard errors after WLS (i.e. for the transformed model).

Unknown heteroscedasticity function (feasible GLS)
Assumed general form of heteroscedasticity; exp-function is used to ensure positivity

Multiplicative error (assumption: independent of the explanatory variables)

Use inverse values of the estimated heteroscedasticity function as weights in WLS

Feasible GLS is consistent and asymptotically more efficient than OLS.
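
The feasible GLS procedure can be sketched step by step. The sketch assumes the exponential variance function from above with the regressors themselves in the exponent; the function names and the simulated data are illustrative assumptions.

```python
import numpy as np

def ols(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fgls(y, X):
    """Feasible GLS, assuming Var(u|x) = sigma^2 * exp(d0 + d1*x1 + ... + dk*xk)."""
    u_hat = y - X @ ols(y, X)                 # step 1: OLS residuals
    g_hat = X @ ols(np.log(u_hat ** 2), X)    # step 2: regress log(u^2) on the regressors
    h_hat = np.exp(g_hat)                     # step 3: fitted variance function, always positive
    w = 1.0 / np.sqrt(h_hat)                  # step 4: WLS, i.e. weight each observation by 1/h_hat
    return ols(y * w, X * w[:, None])

# Simulated data, purely illustrative: Var(u|x) = exp(x).
rng = np.random.default_rng(4)
n = 2000
x = rng.uniform(0, 3, size=n)
u = rng.normal(size=n) * np.exp(0.5 * x)
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])
beta_fgls = fgls(y, X)
print("FGLS coefficients:", beta_fgls)
```

Because the weights are estimated, the procedure is no longer unbiased in finite samples; its justification is asymptotic.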

Example: Demand for cigarettes
Estimation by OLS

Cigarettes smoked per day
Logged income and cigarette price
Smoking restrictions in restaurants

Reject homoscedasticity

Discussion
The income elasticity is now statistically significant; other coefficients
are also more precisely estimated (without changing the qualitative results)

What if the assumed heteroscedasticity function is wrong?
If the heteroscedasticity function is misspecified, WLS is still consistent
under MLR.1 – MLR.4, but robust standard errors should be computed
WLS is consistent under MLR.4 but not necessarily under MLR.4‘

If OLS and WLS produce very different estimates, this typically indicates
that some other assumptions (e.g. MLR.4) are wrong
If there is strong heteroscedasticity, it is still often better to use a
wrong form of heteroscedasticity in order to increase efficiency

In the LPM, the exact form of heteroscedasticity is known

Use inverse values as weights in WLS
Discussion
Infeasible if LPM predictions are below zero or greater than one
If such cases are rare, they may be adjusted to values such as .01/.99
Otherwise, it is probably better to use OLS with robust standard errors

Chapter 9

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

Multiple Regression Analysis:
Specification and Data Issues
Tests for functional form misspecification
One can always test whether explanatory variables should appear as squares or
higher order terms by testing whether such terms can be excluded
Otherwise, one can use general specification tests such as RESET

Regression specification error test (RESET)

The idea of RESET is to include squares and possibly higher order
fitted values in the regression (similarly to the reduced White test)

Test for the exclusion of these terms. If they cannot be excluded, this is evidence for
omitted higher order terms and interactions, i.e. for misspecification of functional form.
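
A sketch of RESET with squared and cubed fitted values, on simulated data whose true relationship is quadratic. The helper names and the data are assumptions for illustration; the F-statistic tests the exclusion of the two added terms.

```python
import numpy as np

def fit(y, X):
    """Return fitted values and the sum of squared residuals from OLS of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return fitted, float(np.sum((y - fitted) ** 2))

def reset_f(y, X):
    """RESET F-statistic: add squared and cubed fitted values and test their exclusion."""
    n, p = X.shape
    fitted, ssr_r = fit(y, X)                                  # restricted: original model
    X_aug = np.column_stack([X, fitted ** 2, fitted ** 3])
    _, ssr_ur = fit(y, X_aug)                                  # unrestricted: augmented model
    q = 2                                                      # number of added terms
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - p - q))

# Simulated data, purely illustrative: the true model contains x squared.
rng = np.random.default_rng(5)
n = 500
x = rng.normal(size=n)
y = 1.0 + x + 0.5 * x ** 2 + rng.normal(size=n)
X_linear = np.column_stack([np.ones(n), x])         # misspecified: omits x^2
X_quad = np.column_stack([np.ones(n), x, x ** 2])   # correctly specified
f_linear = reset_f(y, X_linear)
f_quad = reset_f(y, X_quad)
print(f"RESET F: linear model {f_linear:.1f}, quadratic model {f_quad:.1f}")
```

A large F-statistic for the linear specification signals misspecification, but it does not say which term is missing.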

Evidence for misspecification

Less evidence for misspecification
Discussion
One may also include higher order terms, which implies complicated
interactions and higher order terms of all explanatory variables
RESET provides little guidance as to where misspecification comes from
Testing against nonnested alternatives
Which specification
is more appropriate?
Model 1:

Model 2:

Define a general model that contains both models as subcases and test:

Discussion
Can always be done; however, a clear winner need not emerge
Cannot be used if the models differ in their definition of the dep. var.

Using proxy variables for unobserved explanatory variables
Example: Omitted ability in a wage equation
Replace by proxy

In general, the estimates for the returns to education and experience will be biased because
one has to omit the unobservable ability variable. Idea: find a proxy variable for ability which is
able to control for ability differences between individuals so that the coefficients of the other
variables will not be biased. A possible proxy for ability is the IQ score or similar test scores.

General approach to using proxy variables

Omitted variable, e.g. ability

Regression of the omitted variable on its proxy

Assumptions necessary for the proxy variable method to work
The proxy is „just a proxy“ for the omitted variable; it does not belong
in the population regression, i.e. it is uncorrelated with its error
If the error and the proxy were correlated, the proxy
would actually have to be included in the population
regression function

The proxy variable is a „good“ proxy for the omitted variable, i.e. using
other variables in addition will not help to predict the omitted variable

Otherwise x1 and x2 would have to be included in the regression for the omitted variable

In this regression model, the error term is uncorrelated with all explanatory variables. As a
consequence, all coefficients will be correctly estimated using OLS. The coefficients for the
explanatory variables x1 and x2 will be correctly identified. The coefficient for the proxy variable
may also be of interest (it is a multiple of the coefficient of the omitted variable).

Discussion of the proxy assumptions in the wage example

Assumption 1: Should be fulfilled as IQ score is not a direct wage
determinant; what matters is how able the person proves at work
Assumption 2: Most of the variation in ability should be explainable by
variation in IQ score, leaving only a small rest to educ and exper

As expected, the measured return to education decreases if IQ is included as a proxy for unobserved ability.

The coefficient for the proxy suggests that ability differences between individuals are important (e.g. + 15 points IQ score are associated with a wage increase of about 5.4 percent).

Even if IQ score imperfectly soaks up the variation caused by ability, including it will at least reduce the bias in the measured return to education.

No significant interaction effect between ability and education.

Using lagged dependent variables as proxy variables
In many cases, omitted unobserved factors may be proxied by the
value of the dependent variable from an earlier time period

Example: City crime rates

Including the past crime rate will at least partly control for the many
omitted factors that also determine the crime rate in a given year
Another way to interpret this equation is that one compares cities
which had the same crime rate last year; this avoids comparing cities
that differ very much in unobserved crime factors

The model has a random intercept and a random slope

Average intercept
Random component
Average slope
Random component
Error term

Assumptions: The individual random components are independent of the explanatory variable

WLS or OLS with robust standard errors will consistently estimate the average intercept and average slope in the population

Mismeasured value = True value + Measurement error

Population regression

Estimated regression

Consequences of measurement error in the dependent variable

Estimates will be less precise because the error variance is higher
Otherwise, OLS will be unbiased and consistent (as long as the measurement
error is unrelated to the values of the explanatory variables)

Population regression

Estimated regression

Classical errors-in-variables assumption: Error unrelated to true value

The mismeasured variable x1 is correlated with the error term!

Consequences of measurement error in an explanatory variable
Under the classical errors-in-variables assumption, OLS is biased and
inconsistent because the mismeasured variable is endogenous
One can show that the inconsistency is of the following form:

This factor (which involves the error variance of a regression of the true value of x1 on the other explanatory variables) will always be between zero and one

The effect of the mismeasured variable suffers from attenuation bias,
i.e. the magnitude of the effect will be attenuated towards zero
In addition, the effects of the other explanatory variables will be biased
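
Attenuation bias is easy to see in a simulation. The sketch below uses a simple regression with a single regressor, in which case the attenuation factor reduces to Var(true x) / (Var(true x) + Var(measurement error)); all variable names and numbers are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
beta1 = 2.0
x_true = rng.normal(0.0, 1.0, size=n)        # true regressor, variance 1
e = rng.normal(0.0, 1.0, size=n)             # classical measurement error, variance 1
x_obs = x_true + e                           # mismeasured regressor
y = 1.0 + beta1 * x_true + rng.normal(size=n)

def slope(y, x):
    """OLS slope in a simple regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

slope_true = slope(y, x_true)                # uses the correctly measured regressor
slope_obs = slope(y, x_obs)                  # uses the mismeasured regressor
factor = 1.0 / (1.0 + 1.0)                   # Var(x_true) / (Var(x_true) + Var(e))
print(f"slope with true x: {slope_true:.3f}")
print(f"slope with mismeasured x: {slope_obs:.3f} (theory: {beta1 * factor:.3f})")
```

With equal variances of the true value and the measurement error, the estimated effect is roughly half the true effect.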

Missing data and nonrandom samples
Missing data as sample selection
Missing data is a special case of sample selection (= nonrandom sampling)
as the observations with missing information cannot be used
If the sample selection is based on independent variables there is no
problem as a regression conditions on the independent variables
In general, sample selection is no problem if it is uncorrelated with the
error term of a regression (= exogenous sample selection)
Sample selection is a problem, if it is based on the dependent variable
or on the error term (= endogenous sample selection)

If the sample was nonrandom in the way that certain age groups, income groups, or household sizes
were over- or undersampled, this is not a problem for the regression because it examines the savings
for subgroups defined by income, age, and hh-size. The distribution of subgroups does not matter.

Example for endogenous sample selection

If the sample is nonrandom in the way individuals refuse to take part in the sample survey if their
wealth is particularly high or low, this will bias the regression results because these individuals may
be systematically different from those who do not refuse to take part in the sample survey.

Outliers and influential observations
Extreme values and outliers may be a particular problem for OLS
because the method is based on squaring deviations
If outliers are the result of mistakes that occurred when keying in the
data, one should just discard the affected observations
If outliers are the result of the data generating process, the decision
whether to discard the outliers is not so easy

Example: R&D intensity and firm size

The outlier is not the result of a mistake: One of the sampled firms is much larger than the others.

The regression without the outlier makes more sense.

Least absolute deviations estimation (LAD)
The least absolute deviations estimator minimizes the sum of absolute
deviations (instead of the sum of squared deviations, i.e. OLS)

It may be more robust to outliers as deviations are not squared

The least absolute deviations estimator estimates the parameters of
the conditional median (instead of the conditional mean with OLS)
The least absolute deviations estimator is a special case of quantile
regression, which estimates parameters of conditional quantiles
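
The difference between LAD and OLS is easiest to see in the simplest case of a model with only a constant: LAD then yields the sample median and OLS the sample mean. The sketch below checks this by a grid search on hypothetical data with one outlier; it is not a general LAD estimator.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])   # hypothetical data with one outlier

def sum_abs_dev(c):
    """LAD objective for a constant-only model."""
    return float(np.sum(np.abs(y - c)))

def sum_sq_dev(c):
    """OLS objective for a constant-only model."""
    return float(np.sum((y - c) ** 2))

grid = np.linspace(0.0, 100.0, 10001)        # candidate constants, step 0.01
lad_fit = grid[np.argmin([sum_abs_dev(c) for c in grid])]
ols_fit = grid[np.argmin([sum_sq_dev(c) for c in grid])]
print("LAD fit:", lad_fit, "median:", np.median(y))
print("OLS fit:", ols_fit, "mean:", np.mean(y))
```

The outlier pulls the OLS fit far away from the bulk of the data, while the LAD fit is unaffected by how extreme the outlier is.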

Chapter 10

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

Analyzing Time Series:
Basic Regression Analysis
The nature of time series data
Temporal ordering of observations; may not be arbitrarily reordered
Typical features: serial correlation/nonindependence of observations
How should we think about the randomness in time series data?
• The outcome of economic variables (e.g. GNP, Dow Jones) is
uncertain; they should therefore be modeled as random variables
• Time series are sequences of r.v. (= stochastic processes)
• Randomness does not come from sampling from a population
• „Sample“ = the one realized path of the time series out of the
many possible paths the stochastic process could have taken

Here, there are only two time series. There may be many more variables whose paths over time are observed simultaneously.

Time series analysis focuses on modeling the dependency of a variable on its own past, and on the present and past values of other variables.

Examples of time series regression models
Static models
In static time series models, the current value of one variable is
modeled as the result of the current values of explanatory variables

Examples for static models

There is a contemporaneous relationship between
unemployment and inflation (= Phillips-Curve).

The current murder rate is determined by the current conviction rate, unemployment rate,
and fraction of young males in the population.

Finite distributed lag models
In finite distributed lag models, the explanatory variables are allowed
to influence the dependent variable with a time lag

Example for a finite distributed lag model

The fertility rate may depend on the tax value of a child, but for
biological and behavioral reasons, the effect may have a lag

Children born per 1,000 women in year t
Tax exemption in year t
Tax exemption in year t-1
Tax exemption in year t-2

Effect of a past shock on the current value of the dep. variable

Effect of a transitory shock:
If there is a one time shock in a past period, the dep. variable will change temporarily by the amount indicated by the coefficient of the corresponding lag.

Effect of permanent shock:
If there is a permanent shock in a past period, i.e. the explanatory variable permanently increases by one unit, the effect on the dep. variable will be the cumulated effect of all relevant lags. This is a long-run effect on the dependent variable.

For example, the effect is biggest after a lag of one period. After that, the effect vanishes (if the initial shock was transitory).

The long run effect of a permanent shock is the cumulated effect of all relevant lagged effects. It does not vanish (if the initial shock is a permanent one).
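
The two kinds of shocks can be traced through a finite distributed lag model with two lags. The lag coefficients below are hypothetical and chosen so that the effect is biggest after one period; the function name `response` is likewise an assumption of the sketch.

```python
import numpy as np

delta = np.array([0.2, 0.5, 0.1])   # hypothetical lag coefficients for lags 0, 1, 2

def response(z, delta):
    """Change in y_t implied by the path z_t: sum over j of delta_j * z_{t-j}."""
    return np.convolve(z, delta)[: len(z)]

T = 6
transitory = np.zeros(T)
transitory[1] = 1.0                 # one-time unit increase in period 1
permanent = np.zeros(T)
permanent[1:] = 1.0                 # unit increase from period 1 onward

path_transitory = response(transitory, delta)
path_permanent = response(permanent, delta)
long_run = delta.sum()              # cumulated effect of all lags

print("transitory shock:", path_transitory)
print("permanent shock: ", path_permanent)
print("long-run effect: ", long_run)
```

After a transitory shock the response follows the lag coefficients and then returns to zero; after a permanent shock it accumulates to the sum of the coefficients and stays there.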

Assumption TS.1 (Linear in parameters)

The time series involved obey a linear relationship. The stochastic processes yt, xt1,…,
xtk are observed, the error process ut is unobserved. The definition of the explanatory
variables is general, e.g. they may be lags or functions of other explanatory variables.

Assumption TS.2 (No perfect collinearity)

„In the sample (and therefore in the underlying time series
process), no independent variable is constant nor a perfect
linear combination of the others.“

The values of all explanatory

variables in period number t

Assumption TS.3 (Zero conditional mean)

The mean value of the unobserved factors is unrelated to the values of the explanatory variables in all periods

Exogeneity: The mean of the error term is unrelated to the explanatory variables of the same period

Strict exogeneity: The mean of the error term is unrelated to the values of the explanatory variables of all periods

Strict exogeneity is stronger than contemporaneous exogeneity

TS.3 rules out feedback from the dep. variable on future values of the
explanatory variables; this is often questionable esp. if explanatory
variables „adjust“ to past changes in the dependent variable
If the error term is related to past values of the explanatory variables,
one should include these values as contemporaneous regressors

Assumption TS.4 (Homoscedasticity)

The volatility of the errors must not be related to the explanatory variables in any of the periods

A sufficient condition is that the volatility of the error is independent of the explanatory variables and that it is constant over time
In the time series context, homoscedasticity may also be easily violated,
e.g. if the volatility of the dep. variable depends on regime changes

Assumption TS.5 (No serial correlation)
Conditional on the explanatory variables, the unobserved
factors must not be correlated over time

Discussion of assumption TS.5

Why was such an assumption not made in the cross-sectional case?
The assumption may easily be violated if, conditional on knowing the
values of the indep. variables, omitted factors are correlated over time
The assumption may also serve as substitute for the random sampling
assumption if sampling a cross-section is not done completely randomly
In this case, given the values of the explanatory variables, errors have
to be uncorrelated across cross-sectional units (e.g. states)
Theorem 10.2 (OLS sampling variances)

Under assumptions TS.1 – TS.5:

The same formula as in the cross-sectional case

The conditioning on the values of the explanatory variables is not easy to understand. It effectively
means that, in a finite sample, one ignores the sampling variability coming from the randomness of
the regressors. This kind of sampling variability will normally not be large (because of the sums).

Theorem 10.3 (Unbiased estimation of the error variance)

Theorem 10.4 (Gauss-Markov Theorem)
Under assumptions TS.1 – TS.5, the OLS estimators have the minimal
variance of all linear unbiased estimators of the regression coefficients
This holds conditional as well as unconditional on the regressors

Assumption TS.6 (Normality)

The errors u_t are normally distributed with mean zero and constant variance, independently of X (the explanatory variables of all periods). This assumption implies TS.3 – TS.5

Theorem 10.5 (Normal sampling distributions)

Under assumptions TS.1 – TS.6, the OLS estimators have the usual normal
distribution (conditional on X). The usual F- and t-tests are valid.

Example: Static Phillips curve
Contrary to theory, the estimated Phillips Curve does not suggest a tradeoff between inflation and unemployment

The error term contains factors such as monetary shocks, income/demand shocks, oil price shocks, supply shocks, or exchange rate shocks

Discussion of CLM assumptions

TS.1: A linear relationship might be restrictive, but it should be a good approximation.

TS.2: Perfect collinearity is not a problem as long as unemployment varies over time.

TS.3: Easily violated
For example, past unemployment shocks may lead to future demand shocks which may dampen inflation
For example, an oil price shock means more inflation and may lead to future increases in unemployment

TS.4: Assumption is violated if monetary policy is more „nervous“ in times of high unemployment

TS.5: Assumption is violated if exchange rate influences persist over time (they cannot be explained by unemployment)

TS.6: Questionable

Interest rate on 3-month T-bill
Government deficit as percentage of GDP

The error term represents other factors that determine interest rates in general, e.g. business cycle effects

Discussion of CLM assumptions

TS.1: A linear relationship might be restrictive, but it should be a good approximation.

TS.2: Perfect collinearity will seldom be a problem in practice.

Discussion of CLM assumptions (cont.)

TS.3: Easily violated

For example, past deficit spending may boost economic activity, which in turn may lead to general interest rate rises
For example, unobserved demand shocks may increase interest rates and lead to higher inflation in future periods

TS.4: Assumption is violated if higher deficits lead to more uncertainty about state finances and possibly more abrupt rate changes

TS.5: Assumption is violated if business cycle effects persist across years (and they cannot be completely accounted for by inflation and the evolution of deficits)

TS.6: Questionable

Variable labels: children born per 1,000 women in year t (dependent variable); tax exemption in year t; dummy for World War II years (1941-45); dummy for availability of contraceptive pill (1963-present)

Interpretation
During World War II, the fertility rate was temporarily lower
It has been permanently lower since the introduction of the pill in 1963

Example for a time series with a linear upward trend

Modelling a linear time trend: $y_t = \alpha_0 + \alpha_1 t + e_t$

Abstracting from random deviations, the dependent variable increases by a constant amount per time unit

Alternatively, the expected value of the dependent variable is a linear function of time

Modelling an exponential time trend: $\log(y_t) = \beta_0 + \beta_1 t + e_t$

Abstracting from random deviations, the dependent variable increases by a constant percentage per time unit, i.e. the time series has a constant growth rate
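
The two trend specifications can be illustrated with a small simulation. This is only a sketch on made-up data: the parameter values, the seed, and the names `y_lin`, `y_exp`, `a_hat`, `b_hat` are assumptions for illustration, not taken from the slides. The linear trend is fitted in levels, the exponential trend in logs.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1, 201)

# Linear trend: y_t = a0 + a1*t + e_t (constant absolute increase per period)
y_lin = 5.0 + 0.3 * t + rng.normal(0, 1, t.size)

# Exponential trend: log(y_t) = b0 + b1*t + e_t (constant growth rate b1)
y_exp = np.exp(1.0 + 0.02 * t + rng.normal(0, 0.05, t.size))

# Regress on a constant and the time index
X = np.column_stack([np.ones(t.size), t])
a_hat = np.linalg.lstsq(X, y_lin, rcond=None)[0]
b_hat = np.linalg.lstsq(X, np.log(y_exp), rcond=None)[0]

print("estimated increase per period:", a_hat[1])
print("estimated growth rate per period:", b_hat[1])
```

The slope of the log regression is read as an approximate growth rate per period, which is why the exponential trend is estimated with the logarithm of the series as dependent variable.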

Using trending variables in regression analysis
If trending variables are regressed on each other, a spurious relationship
may arise if the variables are driven by a common trend
In this case, it is important to include a trend in the regression

Example: Housing investment and prices

Variable labels: per capita housing investment; housing price index

Without a time trend, it looks as if investment and prices are positively related

With a time trend included, there is no significant relationship between price and investment anymore

When should a trend be included?

If the dependent variable displays an obvious trending behaviour
If both the dependent and some independent variables have trends
If only some of the independent variables have trends; their effect on
the dep. var. may only be visible after a trend has been subtracted
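
The spurious regression problem described above can be reproduced on simulated data. The following is a minimal sketch, assuming two series that share a linear trend but are otherwise unrelated; all numbers and the helper `ols` are hypothetical and only serve the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
t = np.arange(n, dtype=float)

# Two series that share a time trend but are otherwise unrelated
y = 1.0 + 0.05 * t + rng.normal(0, 1, n)
x = 2.0 + 0.03 * t + rng.normal(0, 1, n)

def ols(X, y):
    """Return OLS coefficients and the usual t-statistics."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return b, b / se

const = np.ones(n)
b_no, t_no = ols(np.column_stack([const, x]), y)      # trend omitted
b_tr, t_tr = ols(np.column_stack([const, x, t]), y)   # trend included

print("t-statistic on x without trend:", t_no[1])
print("t-statistic on x with trend:   ", t_tr[1])
```

Without the trend, the coefficient on `x` looks highly significant although `x` has no effect on `y` by construction; once the trend is included, the apparent relationship disappears.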

A detrending interpretation of regressions with a time trend
It turns out that the OLS coefficients in a regression including a trend
are the same as the coefficients in a regression without a trend but
where all the variables have been detrended before the regression
This follows from the general interpretation of multiple regressions
Computing R-squared when the dependent variable is trending
Due to the trend, the variance of the dep. var. will be overstated
It is better to first detrend the dep. var. and then run the regression
on all the indep. variables (plus a trend if they are trending as well)
The R-squared of this regression is a more adequate measure of fit
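
The detrending interpretation can be checked numerically. The sketch below uses simulated data (all values and names such as `detrend`, `b_with_trend`, `b_detrended` are assumptions for illustration): it compares the slope from a regression that includes a trend with the slope from a regression of detrended y on detrended x, and computes the R-squared based on the detrended dependent variable.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 120
t = np.arange(n, dtype=float)
x = 0.1 * t + rng.normal(0, 1, n)
y = 2.0 + 0.5 * x + 0.2 * t + rng.normal(0, 1, n)

const = np.ones(n)
T = np.column_stack([const, t])

def detrend(v):
    """Residuals from regressing v on a constant and a linear trend."""
    return v - T @ np.linalg.lstsq(T, v, rcond=None)[0]

# (a) Regression of y on x including a constant and a trend
b_with_trend = np.linalg.lstsq(np.column_stack([const, x, t]), y, rcond=None)[0][1]

# (b) Regression of detrended y on detrended x (both have mean zero, so no constant)
y_dt, x_dt = detrend(y), detrend(x)
b_detrended = (x_dt @ y_dt) / (x_dt @ x_dt)

# R-squared measured against the variation of the detrended dependent variable
resid = y_dt - b_detrended * x_dt
r2_detrended = 1 - (resid @ resid) / (y_dt @ y_dt)

print(b_with_trend, b_detrended, r2_detrended)
```

Both slopes agree up to floating point error, which is the general partialling-out property of multiple regression applied to the trend.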

Seasonal dummy variable, e.g. for December: = 1 if the observation is from December, = 0 otherwise

Similar remarks apply as in the case of deterministic time trends

The regression coefficients on the explanatory variables can be seen as
the result of first deseasonalizing the dep. and the explanat. variables
An R-squared that is based on first deseasonalizing the dep. var. may
better reflect the explanatory power of the explanatory variables

Chapter 11

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

Analyzing Time Series:
Further Issues Using OLS
The assumptions used so far seem to be too restrictive
Strict exogeneity, homoscedasticity, and no serial correlation are very
demanding requirements, especially in the time series context
Statistical inference rests on the validity of the normality assumption
Much weaker assumptions are needed if the sample size is large
A key requirement for large sample analysis of time series is that
the time series in question are stationary and weakly dependent
Stationary time series
Loosely speaking, a time series is stationary if its stochastic properties
and its temporal dependence structure do not change over time

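Stationary stochastic processes
A stochastic process $\{x_t : t = 1, 2, \ldots\}$ is stationary if, for every
collection of indices $1 \le t_1 < t_2 < \ldots < t_m$, the joint distribution of
$(x_{t_1}, x_{t_2}, \ldots, x_{t_m})$ is the same as that of
$(x_{t_1+h}, x_{t_2+h}, \ldots, x_{t_m+h})$ for all integers $h \ge 1$.

Covariance stationary processes

A stochastic process $\{x_t : t = 1, 2, \ldots\}$ is covariance stationary if its
expected value, its variance, and its covariances are constant over time:
1) $E(x_t) = \mu$, 2) $Var(x_t) = \sigma^2$, and 3) $Cov(x_t, x_{t+h}) = f(h)$.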

Weakly dependent time series
A stochastic process $\{x_t : t = 1, 2, \ldots\}$ is weakly dependent if $x_t$
is „almost independent“ of $x_{t+h}$ as $h$ grows to infinity (for all $t$).

Discussion of the weak dependence property

An implication of weak dependence is that the correlation between
$x_t$ and $x_{t+h}$ must converge to zero as $h$ grows to infinity
For the LLN and the CLT to hold, the individual observations must not
be too strongly related to each other; in particular their relation must
become weaker (and this fast enough) the farther they are apart
Note that a series may be nonstationary but weakly dependent

Moving average process of order one (MA(1)): $x_t = e_t + \alpha_1 e_{t-1}$

The process is a short moving average of an i.i.d. series $e_t$

The process is weakly dependent because observations that are more than one
time period apart have nothing in common and are therefore uncorrelated.

Autoregressive process of order one (AR(1)): $y_t = \rho_1 y_{t-1} + e_t$

The process carries over to a certain extent the value of the
previous period (plus random shocks from an i.i.d. series $e_t$)

If the stability condition $|\rho_1| < 1$ holds, the process is weakly dependent because serial
correlation converges to zero as the distance between observations grows to infinity.
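
How quickly the serial correlation of a stable AR(1) process dies out can be seen in a simulation. This is a sketch under assumed values (coefficient 0.7, standard normal shocks, a hypothetical helper `sample_autocorr`); for a stable AR(1) process the theoretical autocorrelation at distance h is the coefficient raised to the power h.

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho = 20000, 0.7
e = rng.normal(0, 1, n)

# Simulate y_t = rho * y_{t-1} + e_t with y_0 = 0
y = np.zeros(n)
for i in range(1, n):
    y[i] = rho * y[i - 1] + e[i]

def sample_autocorr(series, h):
    """Sample autocorrelation of a series at lag h >= 1."""
    s = series - series.mean()
    return (s[h:] @ s[:-h]) / (s @ s)

for h in (1, 2, 5, 10):
    print(h, sample_autocorr(y, h), rho ** h)
```

The sample autocorrelations track the geometric decay closely, so observations far apart are almost uncorrelated.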

Assumption TS.1' (Linear in parameters)

Same as assumption TS.1 but now the dependent and independent
variables are assumed to be stationary and weakly dependent
Assumption TS.2' (No perfect collinearity)
Same as assumption TS.2
Assumption TS.3' (Zero conditional mean)
Now the explanatory variables are assumed to be only contemporaneously
exogenous rather than strictly exogenous, i.e. $E(u_t | \mathbf{x}_t) = 0$
The explanatory variables of the same period are
uninformative about the mean of the error term

Important note: For consistency it would even suffice to assume that the explanatory
variables are merely contemporaneously uncorrelated with the error term.

Why is it important to relax the strict exogeneity assumption?

Strict exogeneity is a serious restriction because it rules out all kinds of
dynamic relationships between explanatory variables and the error term
In particular, it rules out feedback from the dep. var. on future values of
the explanat. variables (which is very common in economic contexts)
Strict exogeneity precludes the use of lagged dep. var. as regressors

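$y_t = \beta_0 + \beta_1 y_{t-1} + u_t$

This is the simplest possible regression model with a lagged dependent variable

Contemporaneous exogeneity: $E(u_t | y_{t-1}) = 0$

Strict exogeneity: $E(u_t | y_0, y_1, \ldots, y_{n-1}) = 0$

Strict exogeneity would imply that the error term is uncorrelated with all $y_t$, $t = 1, \ldots, n-1$

This leads to a contradiction because: $Cov(y_t, u_t) = \beta_1 Cov(y_{t-1}, u_t) + Var(u_t) > 0$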

OLS estimation in the presence of lagged dependent variables

Under contemporaneous exogeneity, OLS is consistent but biased

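Assumption TS.4' (Homoscedasticity)

The errors are contemporaneously homoscedastic: $Var(u_t | \mathbf{x}_t) = \sigma^2$

Assumption TS.5' (No serial correlation)

Conditional on the explanatory variables in periods t and s, the errors are uncorrelated: $E(u_t u_s | \mathbf{x}_t, \mathbf{x}_s) = 0$ for $t \ne s$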

Theorem 11.2 (Asymptotic normality of OLS)

Under assumptions TS.1' – TS.5', the OLS estimators are asymptotically
normally distributed. Further, the usual OLS standard errors, t-statistics
and F-statistics are asymptotically valid.

The EMH in a strict form states that information observable to the market prior to week t should
not help to predict the return during week t. A simplification assumes in addition that only past
returns are considered as relevant information to predict the return in week t. This implies that
$E(return_t | return_{t-1}, return_{t-2}, \ldots) = E(return_t)$

A simple way to test the EMH is to specify an AR(1) model. Under the EMH assumption, TS.3' holds
so that an OLS regression can be used to test whether this week's returns depend on last week's.

There is no evidence against the EMH. Including more lagged returns yields similar results.

Using trend-stationary series in regression analysis
Time series with deterministic time trends are nonstationary
If they are stationary around the trend and in addition weakly
dependent, they are called trend-stationary processes
Trend-stationary processes also satisfy assumption TS.1'
Using highly persistent time series in regression analysis
Unfortunately many economic time series violate weak dependence
because they are highly persistent (= strongly dependent)
In this case OLS methods are generally invalid (unless the CLM assumptions hold)
In some cases transformations to weak dependence are possible

Random walks
$y_t = y_{t-1} + e_t$
The process is called a random walk because it wanders
from the previous position $y_{t-1}$ by an i.i.d. random amount $e_t$

$y_t = e_t + e_{t-1} + \ldots + e_1 + y_0$

The value today is the accumulation of all past shocks plus an initial value. This is the reason why
the random walk is highly persistent: The effect of a shock will be contained in the series forever.

For a nonrandom initial value $y_0$: $E(y_t) = y_0$, $Var(y_t) = \sigma_e^2 t$, $Corr(y_t, y_{t+h}) = \sqrt{t/(t+h)}$

The random walk is not covariance stationary because its variance and its covariance depend on time.

It is also not weakly dependent because the correlation between observations vanishes very slowly and this depends on how large t is.

The random walks wander around with no clear direction

A random walk is a special case of a unit root process.

Unit root processes are defined as the random walk but $e_t$ may be an arbitrary weakly dependent process.

From an economic point of view it is important to know whether a time series is highly persistent.
In highly persistent time series, shocks or policy changes have lasting/permanent effects, in
weakly dependent processes their effects are transitory.
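
The time dependence of the variance and of the correlations of a random walk can be made visible by simulating many paths. This is a sketch under assumptions that are not in the slides: standard normal shocks, initial value zero, and 5,000 simulated paths. Under these assumptions the variance in period t equals t, and the correlation between periods t and t+h equals the square root of t/(t+h).

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, n_periods = 5000, 100

# Each row is one random walk: y_t = e_1 + ... + e_t with y_0 = 0
e = rng.normal(0, 1, (n_paths, n_periods))
y = np.cumsum(e, axis=1)

# Variance across paths for every period (column index 49 is period t = 50)
var_t = y.var(axis=0)

# Correlation between y_50 and y_100 across paths; theory: sqrt(50 / 100)
corr_50_100 = np.corrcoef(y[:, 49], y[:, 99])[0, 1]

print("Var(y_50):", var_t[49], " Var(y_100):", var_t[99])
print("Corr(y_50, y_100):", corr_50_100, " theory:", np.sqrt(0.5))
```

The variance keeps growing with t and observations 50 periods apart are still strongly correlated, in contrast to the stable AR(1) case.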

Random walks with drift
$y_t = \alpha_0 + y_{t-1} + e_t$
In addition to the usual random walk mechanism, there is
a deterministic increase/decrease (= drift) in each period

$y_t = \alpha_0 t + e_t + e_{t-1} + \ldots + e_1 + y_0$

This leads to a linear time trend around which the series follows its random walk behaviour. As there
is no clear direction in which the random walk develops, it may also wander away from the trend.

Otherwise, the random walk with drift has similar properties as the random walk without drift.

Random walks with drift are not covariance stationary and not weakly dependent.

Note that the series does not regularly return to the trend line.

Random walks with drift may be good models for time series that have an obvious trend but are not weakly dependent.

Transformations on highly persistent time series
Order of integration
Weakly dependent time series are integrated of order zero (= I(0))
If a time series has to be differenced one time in order to obtain a
weakly dependent series, it is called integrated of order one (= I(1))
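Examples for I(1) processes
$y_t = y_{t-1} + e_t \Rightarrow \Delta y_t = y_t - y_{t-1} = e_t$
$\log(y_t) = \log(y_{t-1}) + e_t \Rightarrow \Delta \log(y_t) = e_t$
After differencing, the resulting series are weakly dependent (because $e_t$ is weakly dependent).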

Differencing is often a way to achieve weak dependence

Deciding whether a time series is I(1)
There are statistical tests for testing whether a time series is I(1)
(= unit root tests); these will be covered in later chapters
Alternatively, look at the sample first order autocorrelation: $\hat{\rho}_1 = \widehat{Corr}(y_t, y_{t-1})$

Measures how strongly adjacent time series observations are related to each other.

If the sample first order autocorrelation is close to one, this suggests
that the time series may be highly persistent (= contains a unit root)
Alternatively, the series may have a deterministic trend
Both unit root and trend may be eliminated by differencing
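
The informal check based on the sample first order autocorrelation can be sketched as follows. The data are a simulated random walk (an assumption for illustration, as is the helper name `rho1`); the statistic is computed once for the levels and once for the first differences.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500

y = np.cumsum(rng.normal(0, 1, n))  # simulated random walk in levels
dy = np.diff(y)                     # first differences

def rho1(series):
    """Sample first order autocorrelation of a series."""
    s = series - series.mean()
    return (s[1:] @ s[:-1]) / (s @ s)

print("levels:     ", rho1(y))
print("differences:", rho1(dy))
```

In the levels the statistic is close to one, while in the differenced series it is close to zero. This is only a rule-of-thumb diagnostic and does not replace a formal unit root test.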

This equation could be estimated by OLS if the CLM assumptions hold. These may be questionable,
so that one would have to resort to large sample analysis. For large sample analysis, the fertility
series and the series of the personal tax exemption have to be stationary and weakly dependent.
This is questionable because the two series are highly persistent:

It is therefore better to estimate the equation in first differences. This makes sense because if the
equation holds in levels, it also has to hold in first differences:

Estimate of

The elasticity of hourly wage with respect to output per hour (= productivity) seems implausibly large.

It turns out that even after detrending, both series display sample autocorrelations
close to one so that estimating the equation in first differences seems more adequate:

This estimate of the elasticity of hourly wage with respect to productivity makes much more sense.

Dynamically complete models
A model is said to be dynamically complete if enough lagged variables
have been included as explanatory variables so that further lags
do not help to explain the dependent variable:
$E(y_t | \mathbf{x}_t, y_{t-1}, \mathbf{x}_{t-1}, y_{t-2}, \ldots) = E(y_t | \mathbf{x}_t)$

Dynamic completeness implies absence of serial correlation

If further lags actually belong in the regression, their omission will
cause serial correlation (if the variables are serially correlated)
One can easily test for dynamic completeness
If lags cannot be excluded, this suggests there is serial correlation

Sequential exogeneity
A set of explanatory variables is said to be sequentially exogenous if
„enough“ lagged explanatory variables have been included:
$E(u_t | \mathbf{x}_t, \mathbf{x}_{t-1}, \ldots) = E(u_t) = 0$, $t = 1, 2, \ldots$

Sequential exogeneity is weaker than strict exogeneity

Sequential exogeneity is equivalent to dynamic completeness if the
explanatory variables contain a lagged dependent variable
Should all regression models be dynamically complete?
Not necessarily: If sequential exogeneity holds, causal effects will be
correctly estimated; absence of serial correlation is not crucial

Serial Correlation and
Heteroscedasticity in
Time Series Regressions

Chapter 12

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

Analyzing Time Series:
Serial Correl. and Heterosced.
Properties of OLS with serially correlated errors
OLS still unbiased and consistent if errors are serially correlated
Correctness of R-squared also does not depend on serial correlation
OLS standard errors and tests will be invalid if there is serial correlation
OLS will not be efficient anymore if there is serial correlation
Serial correlation and the presence of lagged dependent variables
Is OLS inconsistent if there are ser. corr. and lagged dep. variables?
No: Including enough lags so that TS.3' holds guarantees consistency
Including too few lags will cause an omitted variable problem and serial
correlation because some lagged dep. var. end up in the error term

AR(1) model for serial correlation (with an i.i.d. series $e_t$): $u_t = \rho u_{t-1} + e_t$

Replace true unobserved errors by estimated residuals: $\hat{u}_t = \rho \hat{u}_{t-1} + error$

Test $H_0: \rho = 0$ in this regression

Example: Static Phillips curve (see above)

Reject null hypothesis of no serial correlation
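
The t-test for AR(1) serial correlation can be sketched on simulated data. Everything below is an assumption for illustration (a static model with one strictly exogenous regressor, AR(1) errors with coefficient 0.5, seed and variable names); it is not the Phillips curve example. The OLS residuals are regressed on their own first lag without an intercept, which is harmless here because the residuals have mean zero.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
x = rng.normal(0, 1, n)

# Errors follow u_t = 0.5 * u_{t-1} + e_t
e = rng.normal(0, 1, n)
u = np.zeros(n)
for i in range(1, n):
    u[i] = 0.5 * u[i - 1] + e[i]
y = 1.0 + 2.0 * x + u

# Step 1: OLS of y on a constant and x, keep the residuals
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
uhat = y - X @ beta

# Step 2: regress uhat_t on uhat_{t-1} and compute the usual t-statistic
u_t, u_lag = uhat[1:], uhat[:-1]
rho_hat = (u_lag @ u_t) / (u_lag @ u_lag)
resid = u_t - rho_hat * u_lag
se_rho = np.sqrt((resid @ resid) / (len(u_t) - 1) / (u_lag @ u_lag))
t_rho = rho_hat / se_rho

print("rho_hat:", rho_hat, " t-statistic:", t_rho)
```

A large t-statistic leads to rejecting the null hypothesis of no serial correlation; the test is justified only asymptotically and requires strictly exogenous regressors.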

Durbin-Watson test under classical assumptions
Under assumptions TS.1 – TS.6, the Durbin-Watson test is an exact
test (whereas the previous t-test is only valid asymptotically).

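$DW = \sum_{t=2}^{n} (\hat{u}_t - \hat{u}_{t-1})^2 / \sum_{t=1}^{n} \hat{u}_t^2 \approx 2(1 - \hat{\rho})$

$H_0: \rho = 0$ vs. $H_1: \rho > 0$

Unfortunately, the Durbin-Watson test works with a lower and an upper bound for the critical value.
In the area between the bounds the test result is inconclusive.

Reject if $DW < d_L$, „Accept“ if $DW > d_U$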

Example: Static Phillips curve (see above)

Reject null hypothesis of no serial correlation
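
The Durbin-Watson statistic is the sum of squared first differences of the OLS residuals divided by the sum of squared residuals, and it is approximately two times one minus the first order autocorrelation of the residuals. The sketch below computes both on simulated data; the data generating process, the seed and the function name `durbin_watson` are assumptions for illustration. The bounds for the critical values have to be taken from tables and are not computed here.

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic of a residual series."""
    diff = np.diff(resid)
    return (diff @ diff) / (resid @ resid)

rng = np.random.default_rng(7)
n, rho = 500, 0.6

# Static regression with AR(1) errors
e = rng.normal(0, 1, n)
u = np.zeros(n)
for i in range(1, n):
    u[i] = rho * u[i - 1] + e[i]
x = rng.normal(0, 1, n)
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(n), x])
uhat = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

dw = durbin_watson(uhat)
rho_hat = (uhat[1:] @ uhat[:-1]) / (uhat @ uhat)

print("DW:", dw, " 2 * (1 - rho_hat):", 2 * (1 - rho_hat))
```

Values of the statistic well below 2 point towards positive serial correlation; values near 2 are consistent with no serial correlation.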

Testing for AR(1) serial correlation with general regressors
The t-test for autocorrelation can be easily generalized to allow for the
possibility that the explanatory variables are not strictly exogenous:

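$\hat{u}_t = \alpha_0 + \alpha_1 x_{t1} + \ldots + \alpha_k x_{tk} + \rho \hat{u}_{t-1} + error$

Test for $H_0: \rho = 0$

The test now allows for the possibility that the strict exogeneity assumption is violated.

The test may be carried out in a heteroscedasticity robust way

General Breusch-Godfrey test for AR(q) serial correlation

$\hat{u}_t = \alpha_0 + \alpha_1 x_{t1} + \ldots + \alpha_k x_{tk} + \rho_1 \hat{u}_{t-1} + \ldots + \rho_q \hat{u}_{t-q} + error$

Test $H_0: \rho_1 = \ldots = \rho_q = 0$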
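
A Breusch-Godfrey type test can be sketched with NumPy alone. The version below uses the LM form: the OLS residuals are regressed on all regressors and q lagged residuals, and the statistic is (n - q) times the R-squared of that auxiliary regression, to be compared with a chi-squared distribution with q degrees of freedom. The function name `breusch_godfrey_lm` and the simulated data are assumptions for illustration; the slides do not specify whether the F or the LM version is used.

```python
import numpy as np

def breusch_godfrey_lm(y, X, q):
    """LM statistic (n - q) * R^2 from regressing OLS residuals on X and q lagged residuals."""
    n = len(y)
    uhat = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    # Column j holds uhat_{t-j} for t = q+1, ..., n
    lags = np.column_stack([uhat[q - j : n - j] for j in range(1, q + 1)])
    Z = np.column_stack([X[q:], lags])
    dep = uhat[q:]
    fitted = Z @ np.linalg.lstsq(Z, dep, rcond=None)[0]
    ssr = np.sum((dep - fitted) ** 2)
    sst = np.sum((dep - dep.mean()) ** 2)
    return (n - q) * (1 - ssr / sst)

rng = np.random.default_rng(8)
n = 400
x = rng.normal(0, 1, n)
X = np.column_stack([np.ones(n), x])

# Case 1: AR(1) errors with coefficient 0.5
e = rng.normal(0, 1, n)
u = np.zeros(n)
for i in range(1, n):
    u[i] = 0.5 * u[i - 1] + e[i]
lm_ar = breusch_godfrey_lm(1.0 + 2.0 * x + u, X, q=2)

# Case 2: i.i.d. errors
lm_iid = breusch_godfrey_lm(1.0 + 2.0 * x + rng.normal(0, 1, n), X, q=2)

# The 5% critical value of the chi-squared distribution with 2 degrees of freedom is about 5.99
print("LM with AR(1) errors:", lm_ar)
print("LM with i.i.d. errors:", lm_iid)
```

Because the regressors are included in the auxiliary regression, this test does not rely on strict exogeneity.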

Correcting for serial correlation with strictly exog. regressors
Under the assumption of AR(1) errors, one can transform the model
so that it satisfies all GM-assumptions. For this model, OLS is BLUE.

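$y_t = \beta_0 + \beta_1 x_t + u_t$, with $u_t = \rho u_{t-1} + e_t$

Simple case of regression with only one explanatory variable. The general case works analogously.

Lag and multiply by $\rho$: $\rho y_{t-1} = \rho \beta_0 + \rho \beta_1 x_{t-1} + \rho u_{t-1}$

Subtract: $y_t - \rho y_{t-1} = \beta_0 (1 - \rho) + \beta_1 (x_t - \rho x_{t-1}) + u_t - \rho u_{t-1}$

The transformed error $e_t = u_t - \rho u_{t-1}$ satisfies the GM-assumptions.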

Problem: The AR(1)-coefficient is not known and has to be estimated

Correcting for serial correlation (cont.)
Replacing the unknown $\rho$ by $\hat{\rho}$ leads to an FGLS-estimator
There are two variants:
• Cochrane-Orcutt estimation omits the first observation
• Prais-Winsten estimation adds a transformed first observation
In smaller samples, Prais-Winsten estimation should be more efficient
Comparing OLS and FGLS with autocorrelation
For consistency of FGLS more than TS.3' is needed (e.g. TS.3) because
the transformed regressors include variables from different periods
If OLS and FGLS differ dramatically this might indicate violation of TS.3
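
The Cochrane-Orcutt variant of FGLS can be sketched as an iteration between estimating the AR(1) coefficient from the residuals and re-estimating the quasi-differenced equation. The code is a minimal illustration for one regressor; the function name `cochrane_orcutt`, the fixed number of iterations and the simulated data are assumptions, and no standard errors are computed.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def cochrane_orcutt(y, x, n_iter=10):
    """Iterated Cochrane-Orcutt FGLS for y_t = b0 + b1 * x_t + u_t with AR(1) errors."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta = ols(X, y)
    rho = 0.0
    for _ in range(n_iter):
        # Estimate rho from the residuals of the original (untransformed) equation
        u = y - X @ beta
        rho = (u[:-1] @ u[1:]) / (u[:-1] @ u[:-1])
        # Quasi-difference the data; the first observation is dropped
        y_star = y[1:] - rho * y[:-1]
        X_star = np.column_stack([np.full(n - 1, 1 - rho), x[1:] - rho * x[:-1]])
        beta = ols(X_star, y_star)
    return beta, rho

rng = np.random.default_rng(13)
n = 1000
x = rng.normal(0, 1, n)
e = rng.normal(0, 1, n)
u = np.zeros(n)
for i in range(1, n):
    u[i] = 0.7 * u[i - 1] + e[i]
y = 1.0 + 2.0 * x + u

beta_fgls, rho_hat = cochrane_orcutt(y, x)
print("FGLS coefficients:", beta_fgls, " rho_hat:", rho_hat)
```

The transformed intercept column is set to one minus the estimated AR(1) coefficient so that the first reported coefficient is directly the intercept of the original equation. Prais-Winsten estimation would additionally keep a rescaled first observation.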

Serial correlation-robust inference after OLS
In the presence of serial correlation, OLS standard errors overstate
statistical significance because there is less independent variation
One can compute serial correlation-robust std. errors after OLS
This is useful because FGLS requires strict exogeneity and assumes a
very specific form of serial correlation (AR(1) or, generally, AR(q))
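Serial correlation-robust standard errors:
$se(\hat{\beta}_j) = [se_{OLS}(\hat{\beta}_j) / \hat{\sigma}]^2 \sqrt{\hat{v}}$, where $se_{OLS}(\hat{\beta}_j)$ is the usual OLS standard error
The usual OLS standard errors are normalized and then „inflated“ by a correction factor.

Serial correlation-robust F- and t-tests are also available

$\hat{v} = \sum_{t=1}^{n} \hat{a}_t^2 + 2 \sum_{h=1}^{g} [1 - h/(g+1)] \left( \sum_{t=h+1}^{n} \hat{a}_t \hat{a}_{t-h} \right)$, with $\hat{a}_t = \hat{r}_t \hat{u}_t$

The term $\hat{a}_t$ is the product of the residuals $\hat{u}_t$ and the residuals $\hat{r}_t$ of a regression of $x_{tj}$ on all other explanatory variables

The integer g controls how much serial correlation is allowed:

g=2: $\hat{v} = \sum_{t=1}^{n} \hat{a}_t^2 + (4/3) \sum_{t=2}^{n} \hat{a}_t \hat{a}_{t-1} + (2/3) \sum_{t=3}^{n} \hat{a}_t \hat{a}_{t-2}$

The weight of higher order autocorrelations is declining

g=3: $\hat{v} = \sum_{t=1}^{n} \hat{a}_t^2 + (3/2) \sum_{t=2}^{n} \hat{a}_t \hat{a}_{t-1} + \sum_{t=3}^{n} \hat{a}_t \hat{a}_{t-2} + (1/2) \sum_{t=4}^{n} \hat{a}_t \hat{a}_{t-3}$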

Discussion of serial correlation-robust standard errors
The formulas are also robust to heteroscedasticity; they are therefore
called „heteroscedasticity and autocorrelation consistent“ (=HAC)
For the integer g, values such as g=2 or g=3 are normally sufficient
(there are more involved rules of thumb for how to choose g)
Serial correlation-robust standard errors are only valid asymptotically;
they may be severely biased if the sample size is not large enough
The bias is the higher the more autocorrelation there is; if the series
are highly correlated, it might be a good idea to difference them first
Serial correlation-robust errors should be used if there is serial corr.
and strict exogeneity fails (e.g. in the presence of lagged dep. var.)
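
A serial correlation-robust standard error for a single coefficient can be sketched as follows. The code assumes the textbook form of the correction: the usual OLS standard error is divided by the estimated error standard deviation, squared, and multiplied by the square root of a weighted sum of products of the terms "residual of x_j on the other regressors times OLS residual", with linearly declining weights 1 - h/(g+1). The function name `sc_robust_se` and the simulated data are hypothetical.

```python
import numpy as np

def sc_robust_se(y, X, j, g):
    """Serial correlation-robust and usual OLS standard error of coefficient j."""
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    uhat = y - X @ beta
    sigma = np.sqrt(uhat @ uhat / (n - k))
    se_ols = sigma * np.sqrt(np.linalg.inv(X.T @ X)[j, j])

    # Residuals from regressing x_j on all other explanatory variables
    others = np.delete(X, j, axis=1)
    rhat = X[:, j] - others @ np.linalg.lstsq(others, X[:, j], rcond=None)[0]

    # Weighted sum of autocovariance-type terms with declining weights
    a = rhat * uhat
    v = a @ a
    for h in range(1, g + 1):
        v += 2 * (1 - h / (g + 1)) * (a[h:] @ a[:-h])

    return (se_ols / sigma) ** 2 * np.sqrt(v), se_ols

rng = np.random.default_rng(12)
n = 1000
ex, eu = rng.normal(0, 1, n), rng.normal(0, 1, n)
x, u = np.zeros(n), np.zeros(n)
for i in range(1, n):
    x[i] = 0.8 * x[i - 1] + ex[i]   # persistent regressor
    u[i] = 0.8 * u[i - 1] + eu[i]   # persistent error
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(n), x])
se_robust, se_ols = sc_robust_se(y, X, j=1, g=4)
print("usual OLS se:", se_ols, " serial correlation-robust se:", se_robust)
```

With a positively autocorrelated regressor and error, the robust standard error is larger than the usual one, which illustrates how the usual OLS standard errors overstate significance.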
Heteroscedasticity in time series regressions
Heteroscedasticity usually receives less attention than serial correlation
Heteroscedasticity-robust standard errors also work for time series
Heteroscedasticity is automatically corrected for if one uses the serial
correlation-robust formulas for standard errors and test statistics
Testing for heteroscedasticity
The usual heteroscedasticity tests assume absence of serial correlation
Before testing for heteroscedasticity one should therefore test for serial
correlation first, using a heteroscedasticity-robust test if necessary
After ser. corr. has been corrected for, test for heteroscedasticity

Test for serial correlation: No evidence for serial correlation

Test for heteroscedasticity: Strong evidence for heteroscedasticity

Note: Volatility is higher if returns are low

Even if there is no heteroscedasticity in the usual sense (the error variance depends
on the explanatory variables), there may be heteroscedasticity in the sense that the
variance depends on how volatile the time series was in previous periods:

ARCH(1) model: $E(u_t^2 | u_{t-1}, u_{t-2}, \ldots) = \alpha_0 + \alpha_1 u_{t-1}^2$

Consequences of ARCH in static and distributed lag models

If there are no lagged dependent variables among the regressors, i.e.
in static or distributed lag models, OLS remains BLUE under TS.1-TS.5
Also, OLS is consistent etc. for this case under assumptions TS.1'-TS.5'
As explained, in this case, assumption TS.4 still holds under ARCH

Consequences of ARCH in dynamic models
In dynamic models, i.e. models including lagged dependent variables,
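the homoscedasticity assumption TS.4 will necessarily be violated:

$E(u_t^2 | u_{t-1}) = \alpha_0 + \alpha_1 u_{t-1}^2$ depends on the regressor $y_{t-1}$ because $u_{t-1} = y_{t-1} - \beta_0 - \beta_1 x_{t-1,1} - \ldots - \beta_k x_{t-1,k}$

This means the error variance indirectly depends on explanat. variables

In this case, heteroscedasticity-robust standard errors and test statistics
should be computed, or a FGLS/WLS-procedure should be applied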
Using a FGLS/WLS-procedure will also increase efficiency

Are there ARCH-effects in these errors?

Estimating equation for ARCH(1) model: $\hat{u}_t^2 = \alpha_0 + \alpha_1 \hat{u}_{t-1}^2 + error$

There are statistically significant ARCH-effects: If returns were particularly high or low (squared
returns were high) they tend to be particularly high or low again, i.e. high volatility is followed
by high volatility.
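
A test for ARCH(1) effects amounts to regressing squared errors on their own first lag. The sketch below simulates errors with a conditional variance of the form constant plus coefficient times the squared previous error and runs that regression. All values (intercept 1.0, coefficient 0.4, seed) are assumptions; in practice the squared OLS residuals of the model of interest would be used in place of the simulated errors.

```python
import numpy as np

rng = np.random.default_rng(9)
n, a0, a1 = 2000, 1.0, 0.4

# Simulate ARCH(1) errors: conditional variance a0 + a1 * u_{t-1}^2
z = rng.normal(0, 1, n)
u = np.zeros(n)
for i in range(1, n):
    u[i] = np.sqrt(a0 + a1 * u[i - 1] ** 2) * z[i]

# Regress u_t^2 on a constant and u_{t-1}^2
u2 = u ** 2
Z = np.column_stack([np.ones(n - 1), u2[:-1]])
coef = np.linalg.lstsq(Z, u2[1:], rcond=None)[0]
resid = u2[1:] - Z @ coef
sigma2 = resid @ resid / (n - 1 - 2)
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Z.T @ Z)))
t_a1 = coef[1] / se[1]

print("estimated ARCH coefficient:", coef[1], " t-statistic:", t_a1)
```

A significantly positive coefficient on the lagged squared error indicates that high volatility tends to be followed by high volatility.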

Given or estimated model for heteroscedasticity

Model for serial correlation

Estimate transformed model by Cochrane-Orcutt or Prais-Winsten techniques (because of serial correlation in transformed error term)

Pooling Cross Sections across
Time: Simple Panel Data Methods

Chapter 13

Wooldridge: Introductory Econometrics:

A Modern Approach, 5e

Pooled Cross Sections and
Simple Panel Data Methods
Policy analysis with pooled cross sections
Two or more independently sampled cross sections can be used to
evaluate the impact of a certain event or policy change
Example: Effect of new garbage incinerator on housing prices
Examine the effect of the location of a house on its price before and
after the garbage incinerator was built:
After incinerator was built:

Before incinerator was built:

Example: Garbage incinerator and housing prices (cont.)
It would be wrong to conclude from the regression after the incinerator
is there that being near the incinerator depresses prices so strongly
One has to compare with the situation before the incinerator was built:

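Incinerator depresses prices but location was one with lower prices anyway

In the given case, this is equivalent to the change, from before to after the incinerator was built, in the average price gap between houses near to and far from the incinerator

This is the so-called difference-in-differences estimator (DiD)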

Differential effect of being in the location and after the incinerator was built

In this way standard errors for the DiD-effect can be obtained

If houses sold before and after the incinerator was built were sys-
tematically different, further explanatory variables should be included
This will also reduce the error variance and thus standard errors

Before/After comparisons in „natural experiments“

DiD can be used to evaluate policy changes or other exogenous events

Compare outcomes of the two groups

before and after the policy change

Compare the difference in outcomes of the units that are affected by the policy change (= treatment
group) and those who are not affected (= control group) before and after the policy was enacted.

For example, the level of unemployment benefits is cut but only for group A (= treatment group).
Group A normally has longer unemployment durations than group B (= control group). If the difference
in unemployment durations between group A and group B becomes smaller after the reform,
reducing unemployment benefits reduces unemployment duration for those affected.

Caution: Difference-in-differences only works if the difference in outcomes between the two groups
is not changed by other factors than the policy change (e.g. there must be no differential trends).
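
The difference-in-differences logic can be sketched with simulated pooled cross sections. The regression form used here (constant, after dummy, treatment dummy, and their interaction) is the usual way to obtain the DiD effect with a standard error; the data, the true effect of -3.0, and all names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 4000
treat = rng.integers(0, 2, n)   # 1 = treatment group, 0 = control group
after = rng.integers(0, 2, n)   # 1 = observed after the policy change
true_effect = -3.0

# Groups differ in levels (2.0) and there is a common time effect (1.0)
y = 10 + 2.0 * treat + 1.0 * after + true_effect * treat * after + rng.normal(0, 1, n)

# Regression version: the coefficient on the interaction is the DiD estimate
X = np.column_stack([np.ones(n), after, treat, after * treat])
did_regression = np.linalg.lstsq(X, y, rcond=None)[0][3]

def cell_mean(g, a):
    """Average outcome for group g in period a."""
    return y[(treat == g) & (after == a)].mean()

# Means version: (treated - control) after minus (treated - control) before
did_means = (cell_mean(1, 1) - cell_mean(0, 1)) - (cell_mean(1, 0) - cell_mean(0, 0))

print("DiD from regression:", did_regression)
print("DiD from group means:", did_means)
```

Both versions give the same number; the regression form makes it easy to add further explanatory variables. The estimate is only meaningful if the two groups would have followed the same trend without the policy change, which the simulation imposes by construction.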

Two-period panel data analysis
Example: Effect of unemployment on city crime rate
Assume that no other explanatory variables are available. Will it be
possible to estimate the causal effect of unemployment on crime?
Yes, if cities are observed for at least two periods and other factors
affecting crime stay approximately constant over those periods:

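$y_{it} = \beta_0 + \delta_0 d2_t + \beta_1 x_{it} + a_i + u_{it}$, $t = 1, 2$, with $y_{it}$ the crime rate and $x_{it}$ the unemployment rate of city i in period t

$d2_t$: time dummy for the second period; $a_i$: unobserved time-constant factors (= fixed effect); $u_{it}$: other unobserved factors (= idiosyncratic error)

Subtract: $\Delta y_i = \delta_0 + \beta_1 \Delta x_i + \Delta u_i$

Estimate differenced equation by OLS: Fixed effect drops out!

+ 1 percentage point unemployment rate leads to 2.22 more crimes per 1,000 people

Secular increase in crime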
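
Why differencing removes the fixed effect can be illustrated with a simulated two-period panel. This is a sketch on made-up data, not the city crime data: the fixed effect is constructed to be correlated with the explanatory variable, the true slope is 2.0, and the time effect is 0.5; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 2000
beta1, delta0 = 2.0, 0.5

a = rng.normal(0, 2, n)              # unobserved fixed effect
x1 = 0.8 * a + rng.normal(0, 1, n)   # regressor in period 1, correlated with a
x2 = 0.8 * a + rng.normal(0, 1, n)   # regressor in period 2, correlated with a
y1 = 1.0 + beta1 * x1 + a + rng.normal(0, 1, n)
y2 = 1.0 + delta0 + beta1 * x2 + a + rng.normal(0, 1, n)

# Pooled OLS in levels ignores the fixed effect
y = np.concatenate([y1, y2])
x = np.concatenate([x1, x2])
d2 = np.concatenate([np.zeros(n), np.ones(n)])
b_pooled = np.linalg.lstsq(np.column_stack([np.ones(2 * n), d2, x]), y, rcond=None)[0]

# First differences: the fixed effect drops out
dy, dx = y2 - y1, x2 - x1
b_fd = np.linalg.lstsq(np.column_stack([np.ones(n), dx]), dy, rcond=None)[0]

print("pooled OLS slope:", b_pooled[2])
print("first-difference intercept and slope:", b_fd)
```

The pooled slope is biased away from 2.0 because the fixed effect is left in the error term, while the first-differenced regression recovers the slope and the intercept estimates the time effect.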

Discussion of first-differenced panel estimator
Further explanatory variables may be included in original equation
Note that there may be arbitrary correlation between the unobserved
time-invariant characteristics and the included explanatory variables
OLS in the original equation would therefore be inconsistent
The first-differenced panel estimator is thus a way to consistently
estimate causal effects in the presence of time-invariant endogeneity
For consistency, strict exogeneity has to hold in the original equation
First-differenced estimates will be imprecise if explanatory variables
vary only little over time (no estimate possible if time-invariant)

No ratings yet
Preferences and Utility: Powerpoint Slides Prepared By: V. Andreea Chiritescu Eastern Illinois University
31 pages
CH - 14 - Advanced Panel Data Methods
No ratings yet
CH - 14 - Advanced Panel Data Methods
12 pages
CH-15 - IInd Sem 23-24
No ratings yet
CH-15 - IInd Sem 23-24
99 pages
Nature of Regression Analysis
No ratings yet
Nature of Regression Analysis
22 pages
Econometrics Main Slides
No ratings yet
Econometrics Main Slides
175 pages
CH04 Consumption and Saving
No ratings yet
CH04 Consumption and Saving
64 pages
ARCH Model
No ratings yet
ARCH Model
26 pages
Economics: Thinking Like An Economist
No ratings yet
Economics: Thinking Like An Economist
29 pages
CH - 13 - Pooling Cross Sections Across Time Simple Panel Data Methods
No ratings yet
CH - 13 - Pooling Cross Sections Across Time Simple Panel Data Methods
8 pages
Studenmund Ch02 v2
No ratings yet
Studenmund Ch02 v2
30 pages
Econometrics Till Midsem
No ratings yet
Econometrics Till Midsem
236 pages
Week 11
No ratings yet
Week 11
38 pages
The Simple Linear Regression Model: Specification and Estimation
No ratings yet
The Simple Linear Regression Model: Specification and Estimation
66 pages
Perloff Chapter 4
No ratings yet
Perloff Chapter 4
25 pages
File Download PDF
No ratings yet
File Download PDF
136 pages
Econometrics Multiple Regression Analysis: Heteroskedasticity
No ratings yet
Econometrics Multiple Regression Analysis: Heteroskedasticity
19 pages
Heteroskedasticity
No ratings yet
Heteroskedasticity
30 pages
Chapter8 Econometrics Heteroskedasticity
No ratings yet
Chapter8 Econometrics Heteroskedasticity
15 pages
Wooldridge Session 4
No ratings yet
Wooldridge Session 4
64 pages
NIA Practice Numericals
No ratings yet
NIA Practice Numericals
3 pages
Wooldridge 7e Ch01 SM-1
No ratings yet
Wooldridge 7e Ch01 SM-1
5 pages
CH 07 Specification and Data Issues TQT
No ratings yet
CH 07 Specification and Data Issues TQT
45 pages
Introduction To Regression Models For Panel Data Analysis Indiana University Workshop in Methods October 7, 2011 Professor Patricia A. Mcmanus
No ratings yet
Introduction To Regression Models For Panel Data Analysis Indiana University Workshop in Methods October 7, 2011 Professor Patricia A. Mcmanus
42 pages
The Nature of Econometrics and Economic Data: Wooldridge: Introductory Econometrics: A Modern Approach, 5e
100% (1)
The Nature of Econometrics and Economic Data: Wooldridge: Introductory Econometrics: A Modern Approach, 5e
23 pages