FUNDAMENTALS OF
APPLIED
ECONOMETRICS
by
RICHARD A. ASHLEY
Economics Department
Virginia Tech
This book was set in 10/12 Times Roman by Thomson Digital and printed and bound by RR Donnelley. The cover was
printed by RR Donnelley.
Founded in 1807, John Wiley & Sons, Inc. has been a valued source of knowledge and understanding for more than
200 years, helping people around the world meet their needs and fulfill their aspirations. Our company is built on a
foundation of principles that include responsibility to the communities we serve and where we live and work. In 2008,
we launched a Corporate Citizenship Initiative, a global effort to address the environmental, social, economic, and ethical
challenges we face in our business. Among the issues we are addressing are carbon impact, paper specifications and
procurement, ethical conduct within our business and among our vendors, and community and charitable support. For more
information, please visit our Web site: www.wiley.com/go/citizenship.
Copyright © 2012 John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a
retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or
otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright
Clearance Center, Inc. 222 Rosewood Drive, Danvers, MA 01923, Web site www.copyright.com. Requests to the Publisher
for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ
07030-5774, (201)748-6011, fax (201)748-6008, Web site: www.wiley.com/go/permissions.
Evaluation copies are provided to qualified academics and professionals for review purposes only, for use in their courses
during the next academic year. These copies are licensed and may not be sold or transferred to a third party. Upon
completion of the review period, please return the evaluation copy to Wiley. Return instructions and a free of charge return
mailing label are available at www.wiley.com/go/returnlabel. If you have chosen to adopt this textbook for use in your
course, please accept this book as your complimentary desk copy. Outside of the United States, please contact your local
sales representative.
10 9 8 7 6 5 4 3 2 1
BRIEF CONTENTS
What’s Different about This Book xiii
Working with Data in the “Active Learning Exercises” xxii
Acknowledgments xxiii
Notation xxiv
Part I INTRODUCTION AND STATISTICS REVIEW 1
Chapter 1 INTRODUCTION 3
Chapter 2 A REVIEW OF PROBABILITY THEORY 11
Chapter 3 ESTIMATING THE MEAN OF A NORMALLY DISTRIBUTED RANDOM VARIABLE 46
Chapter 4 STATISTICAL INFERENCE ON THE MEAN OF A NORMALLY
DISTRIBUTED RANDOM VARIABLE 68
Part II REGRESSION ANALYSIS 97
Chapter 5 THE BIVARIATE REGRESSION MODEL: INTRODUCTION, ASSUMPTIONS,
AND PARAMETER ESTIMATES 99
Chapter 6 THE BIVARIATE LINEAR REGRESSION MODEL: SAMPLING DISTRIBUTIONS
AND ESTIMATOR PROPERTIES 131
Chapter 7 THE BIVARIATE LINEAR REGRESSION MODEL: INFERENCE ON β 150
Chapter 8 THE BIVARIATE REGRESSION MODEL: R² AND PREDICTION 178
Chapter 9 THE MULTIPLE REGRESSION MODEL 191
Chapter 10 DIAGNOSTICALLY CHECKING AND RESPECIFYING THE MULTIPLE
REGRESSION MODEL: DEALING WITH POTENTIAL OUTLIERS AND
HETEROSCEDASTICITY IN THE CROSS-SECTIONAL DATA CASE 224
Chapter 11 STOCHASTIC REGRESSORS AND ENDOGENEITY 259
Chapter 12 INSTRUMENTAL VARIABLES ESTIMATION 303
Chapter 13 DIAGNOSTICALLY CHECKING AND RESPECIFYING THE MULTIPLE
REGRESSION MODEL: THE TIME-SERIES DATA CASE (PART A) 342
Chapter 14 DIAGNOSTICALLY CHECKING AND RESPECIFYING THE MULTIPLE
REGRESSION MODEL: THE TIME-SERIES DATA CASE (PART B) 389
Part III ADDITIONAL TOPICS IN REGRESSION ANALYSIS 455
Chapter 15 REGRESSION MODELING WITH PANEL DATA (PART A) 459
Chapter 16 REGRESSION MODELING WITH PANEL DATA (PART B) 507
Chapter 17 A CONCISE INTRODUCTION TO TIME-SERIES ANALYSIS AND
FORECASTING (PART A) 536
Chapter 18 A CONCISE INTRODUCTION TO TIME-SERIES ANALYSIS AND
FORECASTING (PART B) 595
Chapter 19 PARAMETER ESTIMATION BEYOND CURVE-FITTING:
MLE (WITH AN APPLICATION TO BINARY-CHOICE MODELS)
AND GMM (WITH AN APPLICATION TO IV REGRESSION) 647
Chapter 20 CONCLUDING COMMENTS 681
Mathematics Review 693
TABLE OF CONTENTS
2.13 Conclusion 36
Exercises 37
ALE 2a: The Normal Distribution 42
ALE 2b: Central Limit Theorem Simulators on the Web (Online)
Appendix 2.1: The Conditional Mean of a Random Variable 44
Appendix 2.2: Proof of the Linearity Property for the Expectation of a Weighted
Sum of Two Discretely Distributed Random Variables 45
10.3 Reasons for Checking the Normality of the Model Errors, U1, ..., UN 228
10.4 Heteroscedasticity and Its Consequences 237
10.5 Testing for Heteroscedasticity 239
10.6 Correcting for Heteroscedasticity of Known Form 243
10.7 Correcting for Heteroscedasticity of Unknown Form 248
10.8 Application: Is Growth Good for the Poor? Diagnostically Checking the
Dollar/Kraay (2002) Model.1 252
Exercises 256
ALE 10a: The Fitting Errors as Approximations for the Model Errors 257
ALE 10b: Does Output Per Person Depend on Human Capital? (A Test of the
Augmented Solow Model of Growth)2 (Online)
ALE 10c: Is Trade Good or Bad for the Environment? (First Pass)3 (Online)
1. Uses data from Dollar, D., and A. Kraay (2002), “Growth Is Good for the Poor,” Journal of Economic Growth 7, 195–225.
2. Uses data from Mankiw, G. N., D. Romer, and D. N. Weil (1992), “A Contribution to the Empirics of Economic Growth,”
The Quarterly Journal of Economics 107(2), 407–37. Mankiw et al. estimate and test a Solow growth model, augmenting it
with a measure of human capital, quantified by the percentage of the population in secondary school.
3. Uses data from Frankel, J. A., and A. K. Rose (2005), “Is Trade Good or Bad for the Environment? Sorting Out the
Causality,” The Review of Economics and Statistics 87(1), 85–91. Frankel and Rose quantify and test the effect of trade openness
{(X + M)/Y} on three measures of environmental damage (SO2, NO2, and total suspended particulates). Since trade openness may
well be endogenous, Frankel and Rose also obtain 2SLS estimates; these are examined in Active Learning Exercise 12b.
4. Uses data from Acemoglu, D., S. Johnson, and J. A. Robinson (2001), “The Colonial Origins of Comparative Development,”
The American Economic Review 91(5), 1369–1401. These authors argue that the European mortality rate in colonial times is a
valid instrument for current institutional quality because Europeans settled (and imported their cultural institutions) only in
colonies with climates they found healthy.
5. See footnote for Active Learning Exercise 10c.
6. Uses data from Bedard, K., and O. Deschênes (2006), “The Long-Term Impact of Military Service on Health: Evidence from
World War II and Korean War Veterans.” The American Economic Review 96(1), 176–194. These authors quantify the impact
of the provision of free and/or low-cost tobacco products to servicemen on smoking and (later) on mortality rates, using
instrumental variable methods to control for the nonrandom selection into military service.
Second, the treatment here frames the linear regression model as an explicit parameterization of
the conditional mean of the dependent variable – plus, of course, a model error term. From this point
of view it is natural to initially focus (in Chapters 3 and 4) on what one might call the “univariate
regression model”:
    Y_i \;=\; \alpha \;+\; U_i \qquad\qquad U_i \sim \mathrm{NIID}\!\left(0,\ \sigma^2\right)
The estimation of the parameters α and σ² in this model is essentially identical to the typical
introductory-statistics-course topic of estimating the mean and variance of a normally distributed
random variable. Consequently, using this “univariate regression model” to begin the coverage of
the essential topics in regression analysis – the least squares estimator, its sampling distribution, its
desirable properties, and the inference machinery based on it – provides a thorough and integrated
review of the key topics which the students need to have understood (and retained) from their
introductory statistics class. It also provides an extension, in the simplest possible setting, to key
concepts – e.g., estimator properties – which are usually not covered in an introductory statistics
course.
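To make that parallel explicit, here is a brief sketch (standard algebra, not an excerpt from Chapter 3) of how least squares estimation of α in this model reduces to the familiar sample mean:

    \hat{\alpha} \;=\; \arg\min_{a}\,\sum_{i=1}^{N}\left(Y_i - a\right)^{2}
    \;=\; \frac{1}{N}\sum_{i=1}^{N} Y_i \;=\; \bar{Y},
    \qquad
    \hat{\alpha} \;\sim\; N\!\left(\alpha,\ \sigma^{2}/N\right)

under the NIID assumption on the Ui.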
Bivariate and multiple regression analysis are then introduced in the middle part of the book
(Chapters 5 through 10) as a relatively straightforward extension to this framework – directly
exploiting the vocabulary, concepts, and techniques just covered in this initial analysis. The always-
necessary statistics “review” is in this way gracefully integrated with the orderly development of the
book’s central topic.
The treatment of stochastic regressors requires the deeper understanding of asymptotic theory
provided in Chapter 11; this material provides a springboard for the more advanced material which
makes up the rest of the book. This portion of the book is ideal for the second term of an
undergraduate econometrics sequence, a Master’s degree level course, or as a companion (auxiliary)
text in a first-term Ph.D. level course.3
A CHAPTER-BY-CHAPTER ROADMAP
After an introductory chapter, the concepts of basic probability theory needed for Chapters 3
through 10 are briefly reviewed in Chapter 2. As noted above, classroom coverage of much of this
material can be skipped for relatively well prepared groups; it is essential, however, for students
with weak (or half-forgotten) statistics backgrounds. The most fundamentally necessary tools are a
clear understanding of what is meant by the probability distribution, expected value, and variance of
a random variable. These concepts are developed in a highly accessible fashion in Chapter 2 by
initially focusing on a discretely distributed random variable.
As noted above, Chapter 3 introduces the notion of a parameter estimator and its sampling
distribution in the simple setting of the estimation of the mean of a normally distributed variate using
a random sample. Both least squares estimation and estimator properties are introduced in this
chapter. Chapter 4 then explains how one can obtain interval estimates and hypothesis tests
regarding the population mean, again in this fundamental context.
Chapters 3 and 4 are the first point at which it becomes crucial to distinguish between an estimator
as a random variable (characterized by its sampling distribution) and its sample realization – an
ordinary number. One of the features of this book is that this distinction is explicitly incorporated in
the notation used. This distinction is consistently maintained throughout – not just for estimators,
but for all of the various kinds of random variables that come up in the development: dependent
3. Thus, in using this book as the text for a one-term undergraduate course, an instructor might want to order copies of the book
containing only Chapters 1 through 12 and Chapter 20. This can be easily done using the Wiley “Custom Select” facility at the
customselect.wiley.com Web site.
variables, model error terms, and even model fitting errors. A summary of the notational
conventions used for these various kinds of random variables (and their sample realizations) is
given in the “Notation” section, immediately prior to Part I of the book. In helping beginners to keep
track of which variables are random and which are not, this consistent notation is well worth the
additional effort involved.
While Chapters 3 and 4 can be viewed as a carefully integrated “statistics review,” most of the
crucial concepts and techniques underlying the regression analysis covered in the subsequent
chapters are first thoroughly developed here:
What constitutes a “good” parameter estimator?
How do the properties (unbiasedness, BLUness, etc.) embodying this “goodness” rest on the
assumptions made?
How can we obtain confidence intervals and hypothesis tests for the underlying parameters?
How does the validity of this inference machinery rest on the assumptions made?
After this preparation, Part II of the book covers the basics of regression analysis. The analysis in
Chapter 5 coherently segues – using an explicit empirical example – from the estimation of the mean
of a random variable into the particular set of assumptions which is here called “The Bivariate
Regression Model,” where the (conditional) mean of a random variable is parameterized as a linear
function of observed realizations of an explanatory variable. In particular, what starts out as a model
for the mean of per capita real GDP (from the Penn World Table) becomes a regression model
relating a country’s output to its aggregate stock of capital. A microeconometric bivariate regression
application later in Chapter 5 relates household weekly earnings (from the Census Bureau’s Current
Population Survey) to a college-graduation dummy variable. This early introduction to dummy
variable regressors is useful on several grounds: it both echoes the close relationship between
regression analysis and the estimation of the mean (in this case, the estimation of two means) and it
also introduces the student early on to an exceedingly useful empirical tool.4
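The link between the dummy-variable coefficient and the two sample means can be sketched in one line (standard least squares algebra, not quoted from Chapter 5 or Appendix 5.1): with a single 0/1 regressor Di,

    Y_i \;=\; \alpha + \beta D_i + U_i
    \quad\Longrightarrow\quad
    \hat{\alpha} \;=\; \bar{Y}_{D=0},
    \qquad
    \hat{\beta} \;=\; \bar{Y}_{D=1} - \bar{Y}_{D=0}

so the estimated coefficient on the college-graduation dummy is simply the difference between the two groups’ mean weekly earnings.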
The detailed coverage of the Bivariate Regression Model then continues with the exposition (in
Chapter 6) of how the model assumptions lead to least-squares parameter estimators with desirable
properties and (in Chapter 7) to a careful derivation of how these assumptions yield confidence
intervals and hypothesis tests. These results are all fairly straightforward extensions of the material just
covered in Chapters 3 and 4. Indeed, that is the raison d’être for the coverage of this material in
Chapters 3 and 4: it makes these two chapters on bivariate regression the second pass at this material.
Topics related to goodness of fit (R²) and simple prediction are covered in Chapter 8.
Chapter 9 develops these same results for what is here called “The Multiple Regression Model,” as
an extension of the analogous results obtained in detail for the Bivariate Regression Model. While the
mathematical analysis of the Multiple Regression Model is necessarily limited here by the restriction
to scalar algebra, the strategy is to leverage the thorough understanding of the Bivariate Regression
Model gained in the previous chapters as much as is possible toward understanding the corresponding
aspects of the Multiple Regression Model. A careful – albeit necessarily, at times, intuitive –
discussion of several topics which could not be addressed in the exposition of the Bivariate Regression
Model completes the exposition in Chapter 9. These topics include the issues arising from over-
elaborate model specifications, underelaborate model specifications, and multicollinearity. This
chapter closes with several worked applications and several directed applications (“Active Learning
Exercises,” discussed below) for the reader to pursue.
4. Chapter 5 also makes the link – both numerically (in Active Learning Exercise 5d) and analytically (in Appendix 5.1) –
between the estimated coefficient on a dummy variable regressor and sample mean estimates. This linkage is useful later on
(in Chapter 15) when the fixed-effects model for panel data is discussed.
By this point in the book it is abundantly clear how the quality of the model parameter estimates
and the validity of the statistical inference machinery both hinge on the model assumptions.
Chapter 10 (and, later, Chapters 13 through 15) provide a coherent summary of how one can, with a
reasonably large data set, in practice use the sample data to check these assumptions. Many of the
usual methods aimed at testing and/or correcting for failures in these assumptions are in essence
described in these chapters, but the emphasis is not on an encyclopedia-like coverage of all the
specific tests and procedures in the literature. Rather, these chapters focus on a set of graphical
methods (histograms and plots) and on a set of simple auxiliary regressions which together suggest
revisions to the model specification that are likely to lead to a model which at least approximately
satisfies the regression model assumptions.
In particular, Chapter 10 deals with the issues – gaussianity, homoscedasticity, and parameter
stability – necessary in order to diagnostically check (and perhaps respecify) a regression model
based on cross-sectional data. Robust (White) standard error estimates are obtained in a particularly
transparent way, but the emphasis is on taking observed heteroscedasticity as a signal that the form
of the dependent variable needs respecification, rather than on FGLS corrections or on simply
replacing the usual standard error estimates by robust estimates. The material in this chapter suffices
to allow the student to get started on a range of practical applications.5
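For concreteness, the bivariate-model version of such a robust (White) sampling variance estimate – a standard HC0-style sketch, not the chapter’s own derivation – replaces the usual formula with

    \widehat{\mathrm{Var}}\!\left(\hat{\beta}_{\mathrm{OLS}}\right)
    \;=\;
    \frac{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^{2}\left(u_i^{\mathrm{fit}}\right)^{2}}
         {\left[\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^{2}\right]^{2}}

which remains (asymptotically) valid even when the variance of the model errors differs across observations.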
The remaining portion of Part II – comprising Chapters 11 through 14 – abandons the rather
artificial assumption that the explanatory variables are fixed in repeated samples. Stochastic
regressors are, of course, necessary in order to deal with the essential real-world complications
of endogeneity and dynamics, but the analysis of models with stochastic regressors requires a primer
on asymptotic theory. Chapter 11 provides this primer and focuses on endogeneity; Chapter 12
focuses on instrumental variables estimation; and Chapters 13 and 14 focus on diagnostically
checking the nonautocorrelation assumption and on modeling dynamics.
Each of these chapters is described in more detail below, but they all share a common approach in
terms of the technical level of the exposition: The (scalar) algebra of probability limits is laid out –
without proof – in Appendix 11.1; these results are then used in each of the chapters to rather easily
examine the consistency (or otherwise) of the OLS slope estimator in the relevant bivariate
regression models. Technical details are carefully considered, but relegated to footnotes. And the
asymptotic sampling distributions of these slope estimators are fairly carefully derived, but these
derivations are provided in chapter appendices. This approach facilitates the coverage of the basic
econometric issues regarding endogeneity and dynamics in a straightforward way, while also
allowing an instructor to easily fold in a more rigorous treatment, where the time available (and the
students’ preparation level) allows.
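The flavor of that probability-limit algebra can be conveyed in a single standard expression (a sketch under the usual regularity conditions, not the book’s own derivation): for the bivariate model,

    \mathrm{plim}\ \hat{\beta}_{\mathrm{OLS}}
    \;=\; \beta \;+\; \frac{\mathrm{cov}\!\left(X_i,\,U_i\right)}{\mathrm{var}\!\left(X_i\right)}

so the OLS slope estimator is consistent exactly when the explanatory variable is uncorrelated with the model error – and inconsistent whenever endogeneity induces such a correlation.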
Chapter 11 examines how each of the three major sources of endogeneity – omitted variables,
measurement error, and joint determination – induces a correlation between an explanatory variable
and the model error. In particular, simultaneous equations are introduced at this point using the
simplest possible economic example: a just-identified pair of supply and demand equations.6
The chapter ends with a brief introduction to simulation methods (with special attention to the
bootstrap and its implementation in Stata), in the context of answering the perennial question about
asymptotic methods, “How large a sample is really necessary?”
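As an illustration of what such a simulation exercise looks like – a minimal Python/NumPy sketch with purely illustrative parameter values, not the Stata implementation discussed in the chapter – one can bootstrap the sampling distribution of the OLS slope estimate in a modest sample:

import numpy as np

rng = np.random.default_rng(0)

def ols_slope(y, x):
    # bivariate OLS slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
    x_dev = x - x.mean()
    return np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)

# simulated data with a true slope of 1.0 (illustrative values only)
n = 50
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)

# resample (y, x) pairs with replacement and re-estimate the slope each time
boot_slopes = np.empty(2000)
for b in range(2000):
    idx = rng.integers(0, n, size=n)
    boot_slopes[b] = ols_slope(y[idx], x[idx])

print("bootstrap standard error of the slope estimate:", boot_slopes.std(ddof=1))

Comparing such bootstrapped standard errors to their asymptotic counterparts, for various sample lengths, is one simple way to address the “How large a sample?” question.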
Chapter 12 continues the discussion of endogeneity initiated in Chapter 11 – with particular
emphasis on the “reverse causality” source of endogeneity and on the non-equivalence of
5. In particular, see Active Learning Exercises 10b and 10c in the Table of Contents. Also, even though their primary focus is
on 2SLS, students can begin working on the OLS-related portions of Active Learning Exercises 12a, 12b, and 12c at this
point.
6. Subsequently – in Chapter 12, where instrumental variables estimation is covered – 2SLS is heuristically derived and
applied to either a just-identified or an over-identified equation from a system of simultaneous equations. The development
here does not dwell on the order and rank conditions for model identification, however.
correlation and causality. Instrumental variables estimation is then developed as the solution to the
problem of using a single (valid) instrument to obtain a consistent estimator of the slope coefficient
in the Bivariate Regression Model with an endogenous regressor. The approach of restricting
attention to this simple model minimizes the algebra needed and leverages the work done in Chapter
11. A derivation of the asymptotic distribution of the instrumental variables estimator is provided in
Appendix 12.1, giving the instructor a graceful option to either cover this material or not. The two-
stage least squares estimator is then heuristically introduced and applied to the classic Angrist-
Krueger (1991) study of the impact of education on log-wages. Several other economic applications,
whose sample sizes are more feasible for student-version software, are given as Active Learning
Exercises at the end of the chapter.
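The estimator at the center of that discussion can be written down compactly (a standard sketch, not Appendix 12.1 itself): with a single instrument Zi for the endogenous regressor Xi,

    \hat{\beta}_{\mathrm{IV}}
    \;=\;
    \frac{\sum_{i=1}^{N}\left(z_i - \bar{z}\right)\left(y_i - \bar{y}\right)}
         {\sum_{i=1}^{N}\left(z_i - \bar{z}\right)\left(x_i - \bar{x}\right)},
    \qquad
    \mathrm{plim}\ \hat{\beta}_{\mathrm{IV}}
    \;=\; \beta \;+\; \frac{\mathrm{cov}\!\left(Z_i,\,U_i\right)}{\mathrm{cov}\!\left(Z_i,\,X_i\right)}

which is consistent so long as the instrument is uncorrelated with the model error (validity) and correlated with the endogenous regressor (relevance).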
Attention then shifts, in a pair of chapters – Chapters 13 and 14 – to time-series issues. Because
Chapters 17 and 18 cover forecasting in some detail, Chapters 13 and 14 concentrate on the estimation
and inference issues raised by time-series data.7 The focus in Chapter 13 is on how to check the non-
autocorrelation assumption on the regression model errors and deal with any violations. The emphasis
here is not on named tests (in this case, for serially correlated errors) or on assorted versions of FGLS,
but rather on how to sensibly respecify a model’s dynamics so as to reduce or eliminate observed
autocorrelation in the errors. Chapter 14 then deals with the implementation issues posed by integrated
(and cointegrated) time-series, including the practical decision as to whether it is preferable to model
the data in levels versus in differences. The “levels” versus “changes” issue is first addressed at this
point, in part using insights gained from simulation work reported in Ashley and Verbrugge (2009).
These results indicate that it is usually best to model in levels, but to generate inferential conclusions
using a straightforward variation on the Lag-Augmented VAR approach of Toda and Yamamoto
(1995).8 On the other hand, the differenced data is easier to work with (because it is far less serially
dependent) and it provides the opportunity (via the error-correction formulation) to disentangle the
long-run and short-run dynamics. Thus, in the end, it is probably best to model the data both ways.9 This
synthesis of the material is carefully developed in the context of a detailed analysis of an illustrative
empirical application: modeling monthly U.S. consumption expenditures data. This example also
provides a capstone illustration of the diagnostic checking techniques described here.
The last portion of the book (Part III) consists of five chapters on advanced topics and a concluding
chapter. These five “topics” chapters will be particularly useful for instructors who are able to move
through Chapters 2 through 4 quickly because their students are well prepared; the “Concluding
Comments” chapter – Chapter 20 – will be useful to all. Chapters 15 and 16 together provide a brief
introduction to the analysis of panel data, and Chapters 17 and 18 together provide a concise
introduction to the broad field of time-series analysis and forecasting. Chapter 19 introduces the
two main alternatives to OLS for estimating parametric regression models: maximum likelihood
estimation (MLE) and the generalized method of moments (GMM). Each of these chapters is described
in a bit more detail below.
A great deal of micro-econometric analysis is nowadays based on panel data sets. Chapters 15 and
16 provide a straightforward, but comprehensive, treatment of panel data methods. The issues, and
requisite panel-specific methods, for the basic situation – with strictly exogenous explanatory variables
– are first carefully explained in Chapter 15, all in the context of an empirical example. This material
7. Most of the usual (and most crucial) issues in using regression models for prediction are, in any case, covered much earlier –
in Section 8.3.
8. See Ashley, R., and R. Verbrugge (2009), “To Difference or Not to Difference: A Monte Carlo Investigation of Inference in
Vector Autoregression Models,” International Journal of Data Analysis Techniques and Strategies 1(3), 242–274 (ashley-
mac.econ.vt.edu/working_papers/varsim.pdf) and Toda, H. Y., and T. Yamamoto (1995), “Statistical Inference in Vector
Autoregressions with Possibly Integrated Processes,” J. Econometrics 66, 225–250.
9. The “difference” versus “detrend” issue comes up again in Section 18.1, where it is approached (and resolved) a bit
differently, from a “time-series analysis” rather than a “time-series econometrics” perspective.
concentrates on the Fixed Effects and then on the Random Effects estimators. Then dynamics, in the
form of lagged dependent variables, are added to the model in Chapter 16. (Many readers will be a bit
surprised to find that the Random Effects estimator is still consistent in this context, so long as the
model errors are homoscedastic and any failures in the strict exogeneity assumption are not empirically
consequential.) Finally, the First-Differences model is introduced for dealing with endogeneity (as
well as dynamics) via instrumental variables estimation. This IV treatment leads to an unsatisfactory
2SLS estimator, which motivates a detailed description of how to apply the Arellano-Bond estimator in
working with such models. The description of the Arellano-Bond estimator does not go as deep
(because GMM estimation is not covered until Chapter 19), but sufficient material is provided that the
student can immediately begin working productively with panel data.
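A compact sketch of the basic setup treated in these two chapters (standard panel notation, not the chapters’ own) may be helpful here: with individual effects vi,

    Y_{it} \;=\; \beta X_{it} + v_i + U_{it}
    \quad\Longrightarrow\quad
    Y_{it} - \bar{Y}_{i\cdot}
    \;=\; \beta\left(X_{it} - \bar{X}_{i\cdot}\right)
    + \left(U_{it} - \bar{U}_{i\cdot}\right)

where the “within” (fixed-effects) transformation sweeps the vi out of the model prior to least squares estimation.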
The primary focus of much applied economic work is on inferential issues – i.e., on the statistical
significance of the estimated parameter on a particular explanatory variable whose inclusion in the
model is prescribed by theory, or on a 95% confidence interval for a parameter whose value is
policy-relevant. In other applied settings, however, forecasting is paramount. Chapters 17 and 18,
which provide an introduction to the broad field of time-series analysis and forecasting, are
particularly useful in the latter context. Chapter 17 begins with a careful treatment of forecasting
theory, dealing with the fundamental issue of when (and to what extent) it is desirable to forecast
with the conditional mean. The chapter then develops the basic tools – an understanding of the
sample correlogram and the ability to invert a lag structure – needed in order to use Box-Jenkins
(ARMA) methods to identify, estimate, and diagnostically check a univariate linear model for a
time-series and to then obtain useful short-term conditional mean forecasts from it. These ideas and
techniques are then extended – in Chapter 18 – to a variety of extensions of this framework into
multivariate and nonlinear time-series modeling.
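The sample correlogram referred to above is just the sequence of sample autocorrelations (a standard definition, stated here for reference):

    r_k \;=\;
    \frac{\sum_{t=k+1}^{T}\left(y_t - \bar{y}\right)\left(y_{t-k} - \bar{y}\right)}
         {\sum_{t=1}^{T}\left(y_t - \bar{y}\right)^{2}},
    \qquad k = 1, 2, 3, \ldots

whose pattern across the lags k is what suggests a tentative ARMA specification in the Box-Jenkins approach.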
Up to this point in the book, regression analysis is basically framed in terms of least-squares
estimation of parameterized models for the conditional mean of the variable whose sample
fluctuations are to be “explained.” As explicitly drawn out for the Bivariate Regression Model in
Chapter 5, this is equivalent to fitting a straight line to a scatter diagram of the sample data.10
Chapter 19 succinctly introduces the two most important parametric alternatives to this “curve-
fitting” approach: maximum likelihood estimation and the generalized method of moments.
In the first part of Chapter 19 the maximum likelihood estimation framework is initially explained –
as was least squares estimation in Part I of the book – in terms of the simple problem of estimating the
mean and variance of a normally distributed variable. The primary advantage of the MLE approach is
its ability to handle latent variable models, so a second application is then given to a very simple
binary-choice regression model. In this way, the first sections of Chapter 19 provide a practical
introduction to the entire field of “limited dependent variables” modeling.
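In that first application the idea can be summarized in two lines (standard results, not the chapter’s own exposition): for a random sample y1, ..., yN from a normal distribution,

    \ln L\!\left(\mu, \sigma^{2}\right)
    \;=\; -\frac{N}{2}\ln\!\left(2\pi\sigma^{2}\right)
      - \frac{1}{2\sigma^{2}}\sum_{i=1}^{N}\left(y_i - \mu\right)^{2}
    \quad\Longrightarrow\quad
    \hat{\mu}_{\mathrm{MLE}} \;=\; \bar{y},
    \qquad
    \hat{\sigma}^{2}_{\mathrm{MLE}} \;=\; \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^{2}

so that, in this simple setting, maximizing the likelihood reproduces the familiar estimators (up to the divisor in the variance estimate).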
The remainder of Chapter 19 provides an introduction to the Generalized Method of Moments
(GMM) modeling framework. In the GMM approach, parameter identification and estimation are
achieved through matching posited population moment conditions to analogous sample moments,
where these sample moments depend on the coefficient estimates. The GMM framework thus directly
involves neither least-squares curve-fitting nor estimation of the conditional mean. GMM is really the
only graceful approach for estimating a rational expectations model via its implied Euler equation.
Of more frequent relevance, it is currently the state-of-the-art approach for estimating IV regression
models, especially where heteroscedastic model errors are an issue. Chapter 19 introduces GMM via a
detailed description of the simplest non-trivial application to such an IV regression model: the one-
parameter, two-instrument case. The practical application of GMM estimation is then illustrated using a
10. The analogous point, using a horizontal straight line “fit” to a plot of the sample data versus observation number, is made in
Chapter 3. And the (necessarily more abstract) extension to the fitting of a hyperplane to the sample data is described in
Chapter 9. The corresponding relationship between the estimation of a parameterization of the conditional median of the
dependent variable and estimation via least absolute deviations fitting is briefly explained in each of these cases also.
familiar full-scale empirical model, the well-known Angrist-Krueger (1991) model already introduced
in Chapter 12: in this model there are 11 parameters to be estimated, using 40 moment conditions.
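The character of the moment-matching involved can be sketched for the one-parameter, two-instrument IV case (illustrative notation only, not the book’s): the population conditions

    E\!\left[Z_{1i}\left(Y_i - \beta X_i\right)\right] = 0,
    \qquad
    E\!\left[Z_{2i}\left(Y_i - \beta X_i\right)\right] = 0

are replaced by their sample analogues, and the GMM estimate of β is chosen to make these two sample moments as close to zero as possible in a weighted quadratic-form sense – the weighting being what delivers efficiency when the model errors are heteroscedastic.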
Even the simple one-parameter GMM estimation example, however, requires a linear-algebraic
formulation of the estimator. This linear algebra (its only appearance in the book) is relegated to
Appendix 19.1, where it is unpacked for this example. But this exigency marks a natural stopping-
point for the exposition given here. Chapter 20 concludes the book with some sage – if, perhaps,
opinionated – advice.
A great deal of important and useful econometrics was necessarily left out of the present treatment.
Additional topics (such as nonparametric regression, quantile regression, Bayesian methods, and
additional limited dependent variables models) could perhaps be covered in a subsequent edition.
11. And, of course, it is well known that Excel’s implementation of multiple regression is not numerically well-behaved.
In general, however, tables of tail areas and critical points for the normal, χ², Student’s t, and F
distributions are functionally obsolete – as is the skill of reading values off of them. Ninety-nine
times out of a hundred, the econometric software in use computes the necessary p-values for us: the
valuable skill is in understanding the assumptions underlying their calculation and how to
diagnostically check these assumptions. And, in the one-hundredth case, it is a matter of moments
to load up a spreadsheet – e.g., Excel – and calculate the relevant tail area or critical point using a
worksheet function.12
Consequently, this book does not include printed statistical tables.
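For instance – an illustrative Python/SciPy counterpart to such worksheet-function calculations, not the book’s own software – the relevant tail areas and critical points can each be obtained in a single line:

from scipy import stats

# upper tail area (p-value) for a t statistic of 2.10 with 25 degrees of freedom
p_value = stats.t.sf(2.10, df=25)

# 5% critical point of the F(3, 40) distribution
f_crit = stats.f.ppf(0.95, dfn=3, dfd=40)

# two-tailed 5% critical point of the standard (unit) normal distribution
z_crit = stats.norm.ppf(0.975)

print(p_value, f_crit, z_crit)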
SUPPLEMENTARY MATERIALS
A number of supplementary materials are posted on the companion Web site for this book,
www.wiley.com/college/ashley. These include:
Active Learning Exercises listed in the Table of Contents, including their accompanying data
sets and any computer programs needed. Answer keys for these Exercises are posted also.
Answer keys for all of the end-of-chapter exercises.
Windows programs which compute tail areas for the normal, χ², t, and F distributions.
PowerPoint slides for each chapter.
Image Gallery – equations, tables, and figures – in JPEG format for each chapter. Sample
presentation files based on these, in Adobe Acrobat PDF format, are also provided for
each chapter.
12. The syntax for the relevant Excel spreadsheet function is quoted in the text where these arise, as is a citation to a
standard work quoting the computing approximations used in these worksheet functions. Stand-alone Windows programs
implementing these approximations are posted at Web site www.wiley.com/college/ashley.
WORKING WITH DATA IN THE “ACTIVE LEARNING EXERCISES”
Most chapters of this textbook contain at least one “Active Learning Exercise” or “ALE.” The titles
of these Active Learning Exercises are given in the Table of Contents and listed on the inside covers
of the book. Whereas the purpose of the end-of-chapter exercises is to help the student go deeper into
the chapter material – and worked examples using economic data are integrated into the text – these
Active Learning Exercises are designed to engage the student in structured, active exercises.
A typical Active Learning Exercise involves specific activities in which the student is either
directed to download actual economic data from an academic/government Web site or is provided
with data (real or simulated) from the companion Web site for this book, www.wiley.com/college/
ashley. (This Web site will also provide access to the latest version of each Active Learning
Exercise, as some of these exercises will need to be revised occasionally as Web addresses and
content change.) These exercises will in some cases reproduce and/or expand on empirical results
used as examples in the text; in other cases, the Active Learning Exercise will set the student
working on new data. A number of the Active Learning Exercises involve replication of a portion of
the empirical results of published articles from the economics literature.
The Active Learning Exercises are a more relaxed environment than the text itself, in that one of
these exercises might, for example, involve a student in “doing” multiple regression in an informal
way long before this topic is reached in the course of the careful development provided in the text.
One could think of these exercises as highly structured “mini-projects.” In this context, the Active
Learning Exercises are also a great way to help students initiate their own term projects.
ACKNOWLEDGMENTS
My thanks to all of my students for their comments on various versions of the manuscript for this
book; in particular, I would like to express my appreciation to Bradley Shapiro and to
James Boohaker for their invaluable help with the end-of-chapter exercises. Thanks are also due to
Alfonso Flores-Lagunes, Chris Parmeter, Aris Spanos, and Byron Tsang for helpful discussions and/
or access to data sets. Andrew Rose was particularly forthcoming in helping me to replicate his very
interesting 2005 paper with Frankel in The Review of Economics and Statistics quantifying the
impact of international trade on environmental air quality variables; this help was crucial to the
construction of Active Learning Exercises 10c and 12b. I have benefited from the comments
and suggestions from the following reviewers: Alfonso Flores-Lagunes, University of Florida,
Gainesville; Scott Gilbert, Southern Illinois University, Carbondale; Denise Hare, Reed College;
Alfred A. Haug, University of Otago, New Zealand; Paul A. Jargowsky, Rutgers-Camden; David
Kimball, University of Missouri, St. Louis; Heather Tierney, College of Charleston; Margie Tieslau,
University of North Texas; and several others who wish to remain anonymous. Thanks are also due
to Lacey Vitteta, Jennifer Manias, Emily McGee, and Yee Lyn Song at Wiley for their editorial
assistance. Finally, I would also like to thank Rosalind Ashley, Elizabeth Paule, Bill Beville, and
George Lobell for their encouragement with regard to this project.
NOTATION
Logical and consistent notation is extremely helpful in keeping track of econometric concepts,
particularly the distinction between random variables and realizations of random variables. This
section summarizes the principles underlying the notation used below. This material can be
skimmed on your first pass: this notational material is included here primarily for reference later
on, after the relevant concepts to which the notational conventions apply are explained in the
chapters to come.
Uppercase letters from the usual Latin-based alphabet – X, Y, Z, etc. – are used below to denote
observable data. These will generally be treated as random variables, which will be discussed in
Chapter 2. What is most important here is to note that an uppercase letter will be used to denote such
a random variable; the corresponding lowercase letter will be used to denote a particular (fixed)
realization of it – i.e., the numeric value actually observed. Thus, “X” is a random variable, whereas
“x” is a realization of this random variable. Lowercase letters will not be used below to denote the
deviation of a variable from its sample mean.
The fixed (but unknown) parameters in the econometric models considered below will usually
be denoted by lowercase Greek letters – α, β, γ, δ, and so forth. As we shall see below, these
parameters will be estimated using functions of the observable data – “estimators” – which are
random variables. Because uppercase Greek letters are easily confused with letters from the Latin-
based alphabet, however, such an estimator of a parameter – a random variable because it depends
on the observable data, which are random variables – will typically be denoted by placing a hat (“^”)
over the corresponding lowercase Greek letter. Sample realizations of these parameter estimators
will then be denoted by appending an asterisk. Thus, α̂ will typically be used to denote an estimator
of the fixed parameter α and α̂* will be used to denote the (fixed) realization of this random variable,
based on the particular values of the observable data which were actually observed. Where a second
estimator of α needs to be considered, it will be denoted by α̃ or the like. The only exceptions to
these notational conventions which you will encounter later are that – so as to be consistent with
the standard nomenclature – the usual convention of using Ȳ and S² to denote the sample mean and
variance will be used; sample realizations of these estimators will be denoted ȳ and s², respectively.
The random error terms in the econometric models developed below will be denoted by uppercase
letters from the Latin-based alphabet (typically, U, V, N, etc.) and fixed realizations of these error
terms (which will come up very infrequently because model error terms are not, in practice,
observable) will be denoted by the corresponding lowercase letter, just as with observable data.
When an econometric model is fit to sample data, however, one obtains observable “fitting errors.”
These can be usefully thought of as estimators of the model errors. These estimators – which will be
random variables because they depend on the observable (random) observations – will be
distinguished from the model errors themselves via a superscript “fit” on the corresponding letter
for the model error. As with the model errors, the sample realizations of these fitting errors, based on
particular realizations of the observable data, will be denoted by the corresponding lowercase letter.
The following table summarizes these notational rules and gives some examples: