Econometrics Final Exam Study Guide

This document provides an overview of nonlinear regression functions and of assessing studies based on multiple regressors. It covers three key topics: (1) nonlinear regression functions, including polynomial, logarithmic, and exponential forms, as well as interactions between independent variables; (2) threats to the internal and external validity of multiple regression studies, such as omitted variable bias, functional form misspecification, errors-in-variables bias, missing-data bias, and simultaneous causality bias; and (3) panel data, which contain observations on multiple entities over time and can be used to control for unobserved factors that do not vary over time but would cause omitted variable bias if excluded from the regression.


Chapter 8 - Nonlinear Regression Functions

I. Nonlinear Regression Functions - General Comments

A. The general nonlinear population regression function

1. Assumptions

a) E(ui | X1i, X2i, ..., Xki) = 0

(1) For any given values of the X's, the mean of ui is zero

b) (X1i, ..., Xki, Yi) are i.i.d.

c) Big outliers are rare

d) No perfect multicollinearity

II. Nonlinear functions of one variable

A. Two complementary approaches

1. Polynomials in X; Ex: Yi = B0 + B1Xi + B2Xi^2 + ... + BrXi^r + ui

a) Joint hypothesis testing can be used to determine non-linearity

Ex: TestScorei = B0 + B1*Incomei + B2*(Incomei)^2 + B3*(Incomei)^3 + ui

H0 : the population coefficients on Income^2 and Income^3 are both zero

H1 : at least one of these coefficients is nonzero

*Run a joint test on income2 and income3 (e.g., Stata’s “test income2 income3”); if p < 0.05, reject the null of linearity. A sketch of the same test in Python follows.
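
A minimal sketch of this joint F-test in Python, using statsmodels and simulated data (no dataset accompanies these notes, so the variable names and true coefficients below are made up):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    df = pd.DataFrame({"income": rng.uniform(5, 55, 420)})
    df["testscore"] = (640 + 2.0 * df["income"] - 0.02 * df["income"] ** 2
                       + rng.normal(0, 10, 420))
    df["income2"] = df["income"] ** 2   # create the polynomial regressors
    df["income3"] = df["income"] ** 3

    res = smf.ols("testscore ~ income + income2 + income3", data=df).fit()
    # H0: coefficients on income2 and income3 are both zero (linearity)
    print(res.f_test("income2 = 0, income3 = 0"))  # small p-value -> reject H0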

2. Logarithmic transformations of Y and/or X - permit modeling relations in percentage terms

a) Three log regression specifications (a sketch of all three follows):

(1) Linear-log : Yi = B0 + B1*ln(Xi) + ui

(a) A 1% change in X (multiplying X by 1.01) is associated with a 0.01*B1 unit change in Y

(2) Log-linear : ln(Yi) = B0 + B1*Xi + ui

(a) A 1 unit change in X is associated with a 100*B1 % change in Y

(3) Log-log : ln(Yi) = B0 + B1*ln(Xi) + ui

(a) A 1% change in X is associated with a B1 % change in Y
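
A sketch of all three specifications on simulated data (variable names are illustrative; patsy formulas can call np.log directly):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    df = pd.DataFrame({"x": rng.uniform(10, 60, 300)})
    df["y"] = 550 + 35 * np.log(df["x"]) + rng.normal(0, 8, 300)

    linear_log = smf.ols("y ~ np.log(x)", data=df).fit()          # 1% change in X -> 0.01*B1 unit change in Y
    log_linear = smf.ols("np.log(y) ~ x", data=df).fit()          # 1 unit change in X -> 100*B1 % change in Y
    log_log    = smf.ols("np.log(y) ~ np.log(x)", data=df).fit()  # 1% change in X -> B1 % change in Y
    for name, res in [("linear-log", linear_log), ("log-linear", log_linear), ("log-log", log_log)]:
        print(name, res.params.to_dict())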

B. Negative Exponential Growth Regression

1. Yi = B0 - a*e^(-B1*Xi) + ui
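
This specification is nonlinear in the parameters, so plain OLS cannot estimate it; a sketch using nonlinear least squares (scipy's curve_fit, with made-up true values and starting guesses):

    import numpy as np
    from scipy.optimize import curve_fit

    def neg_exp(x, b0, a, b1):
        # Y = B0 - a*exp(-B1*X): rises quickly, then flattens toward the asymptote B0
        return b0 - a * np.exp(-b1 * x)

    rng = np.random.default_rng(2)
    x = rng.uniform(0, 50, 200)
    y = neg_exp(x, 700.0, 80.0, 0.1) + rng.normal(0, 5, 200)

    params, _ = curve_fit(neg_exp, x, y, p0=[600.0, 50.0, 0.05])  # p0 = starting guesses
    print(params)  # estimates of (B0, a, B1), close to (700, 80, 0.1)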

III. Nonlinear functions of two variables: interactions between independent variables

A. Interactions between two binary variables: Yi = B0 + B1D1i + B2D2i + B3(D1i*D2i) + ui

1. D1 and D2 are binary

2. Including the interaction term D1*D2 allows the effect of changing D1 to depend on the level of D2

a) To find the effect of a variable, take the partial derivative (see the sketch below):

(1) Yi = B0 + B1D1i + B2D2i + B3(D1i*D2i) + ui

(a) dY/dD1 = B1 + B3D2 = effect of a change in D1

(b) dY/dD2 = B2 + B3D1 = effect of a change in D2
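
A sketch of the binary-binary interaction on simulated data; the coefficient names ("d1", "d1:d2") follow statsmodels/patsy formula conventions, and the true values are made up:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    n = 500
    df = pd.DataFrame({"d1": rng.integers(0, 2, n), "d2": rng.integers(0, 2, n)})
    df["y"] = 10 + 2 * df["d1"] + 3 * df["d2"] + 1.5 * df["d1"] * df["d2"] + rng.normal(0, 1, n)

    res = smf.ols("y ~ d1 + d2 + d1:d2", data=df).fit()
    b = res.params
    print("effect of D1 when D2=0:", b["d1"])                # B1
    print("effect of D1 when D2=1:", b["d1"] + b["d1:d2"])   # B1 + B3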

B. Interactions between continuous and binary variables:

1. Yi = B0 + B1Di + B2Xi + B3(Di*Xi) + ui

a) Di is binary, X is continuous; the interaction term allows the effect of X to depend on D

b) Two regression lines are formed, one for D=0 and one for D=1

(1) Find the effect of X by taking the partial derivative (a sketch follows)
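
A sketch recovering the two regression lines from one fitted interaction model (simulated data, illustrative names): the D=0 line has intercept B0 and slope B2, and the D=1 line has intercept B0+B1 and slope B2+B3.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    n = 400
    df = pd.DataFrame({"d": rng.integers(0, 2, n), "x": rng.uniform(0, 10, n)})
    df["y"] = 5 + 2 * df["d"] + 1.0 * df["x"] + 0.5 * df["d"] * df["x"] + rng.normal(0, 1, n)

    res = smf.ols("y ~ d + x + d:x", data=df).fit()
    b = res.params
    print("D=0 line: intercept", b["Intercept"], "slope", b["x"])
    print("D=1 line: intercept", b["Intercept"] + b["d"], "slope", b["x"] + b["d:x"])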

C. Interactions between two continuous variables:

1. Yi = B0 + B1X1i + B2X2i + B3(X1i*X2i) + ui

a) Both X1 and X2 are continuous

b) The interaction term again lets the effect of one regressor depend on the level of the other

c) Perform a joint hypothesis test on each variable together with the interaction (see the sketch below)
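
A sketch of the marginal effect and the joint test for a continuous-continuous interaction (simulated data; the marginal effect is reported at the sample mean of X2, a common but arbitrary choice):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(5)
    n = 400
    df = pd.DataFrame({"x1": rng.uniform(0, 10, n), "x2": rng.uniform(0, 10, n)})
    df["y"] = 3 + 1.0 * df["x1"] + 2.0 * df["x2"] + 0.3 * df["x1"] * df["x2"] + rng.normal(0, 1, n)

    res = smf.ols("y ~ x1 + x2 + x1:x2", data=df).fit()
    # dY/dX1 = B1 + B3*X2, evaluated here at the mean of X2
    print("effect of x1 at mean x2:", res.params["x1"] + res.params["x1:x2"] * df["x2"].mean())
    # Joint test that X1 is irrelevant: its own coefficient and the interaction both zero
    print(res.f_test("x1 = 0, x1:x2 = 0"))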

Chapter 9 - Assessing Studies Based on Multiple Regressors

I. Internal and External Validity

A. Internal Validity - The statistical inferences about causal effects are valid for the

population being studied

B. External Validity - The statistical inferences can be generalized from the

population and setting studied to other populations and settings (where the

“setting” refers to the legal, policy, and physical environment and related salient

features)

C. Threats to External Validity of Multiple Regression Studies

1. Assessing threats to external validity requires detailed substantive

knowledge and judgment on a case-by-case basis

II. Threats to Internal Validity of Multiple Regression Analysis


A. Omitted variable bias

1. Arises if an omitted variable is both:

a) A determinant of Y

b) Correlated with at least one included regressor

2. Solutions:

a) If the omitted causal variable can be measured, include it

b) If you have data on one or more adequate control variables (so that conditional mean independence plausibly holds), include them

c) Possibly, use panel data in which each entity (individual) is

observed more than once (thus providing a control)

d) If the omitted variable(s) cannot be measured, use instrumental

variables regression

e) Run a randomized controlled experiment

(1) If X is randomly assigned, then X is necessarily distributed independently of ui; thus E(ui|X = x) = 0

B. Functional form misspecification

1. Arises if the functional form is incorrect (linear vs. non-linear); ex: an

interaction term is incorrectly omitted

2. Solutions

a) Continuous dependent variable: use the “appropriate” nonlinear specification in X (logarithms, interactions, etc.)


b) Discrete (ex: binary) dependent variable: need an extension of

multiple regression methods (“probit” or “logit” analysis for binary

dependent variables)

C. Errors-in-variables bias

1. Arises from data entry errors in administrative data, recollection errors in surveys, ambiguous questions, intentionally false responses, etc.

2. Solutions:

a) Get better data

b) Develop a specific model of the measurement error process (only

possible if a lot is known about the nature of the measurement

error)

c) Instrumental variables regression
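
A small simulation, assuming classical measurement error (noise uncorrelated with the true X and with u), illustrating how errors in variables bias the OLS slope toward zero; all values below are made up:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 5000
    x_true = rng.normal(0, 1, n)
    y = 1.0 + 2.0 * x_true + rng.normal(0, 1, n)   # true slope is 2
    x_meas = x_true + rng.normal(0, 1, n)          # classical measurement error

    clean = sm.OLS(y, sm.add_constant(x_true)).fit()
    noisy = sm.OLS(y, sm.add_constant(x_meas)).fit()
    print(clean.params[1])  # ~2.0
    print(noisy.params[1])  # ~1.0: attenuated by the factor var(x)/(var(x) + var(error))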

D. Missing data and sample selection bias

1. Three cases: Data are missing at random, data are missing based on the

value of one or more X’s, and data are missing based in part on the value

of Y or u

a) Cases 1 and 2 don’t introduce bias; case 3 introduces “sample

selection” bias

2. Sample selection bias arises when a selection process:

a) Influences the availability of data and

b) Is related to the dependent variable


3. Solutions:

a) Collect the sample in a way that avoids sample selection

(1) Don’t collect data on the height of GW’s population by

standing outside the basketball locker-room

b) Randomized control experiment

c) Construct a model of the sample selection problem and estimate that model (Heckman’s two-step method)

E. Simultaneous causality bias

1. What if, not only does X cause Y, but Y causes X as well?

a) Ex: a low student-teacher ratio (STR) results in better test scores, but suppose districts with low test scores are given extra resources and, as a result of a political process, also end up with a low STR; then STR and u are correlated

2. Solutions:

a) Run a randomized controlled experiment

b) Develop and estimate a complete model of the two way causal

interaction

c) Use instrumental variables to estimate the causal effect of interest

(effect of X on Y, ignoring effect of Y on X)

*All of these threats imply that E(u|X) ≠ 0 (conditional mean independence fails), in which case OLS is biased and inconsistent

Chapter 10 - Regression with Panel Data


I. Panel Data: What and Why

A. Contains observations on multiple entities at two or more points in time; also

referred to as longitudinal data

1. Balanced Panel: no missing observations; all variables are observed for all

entities and all time periods

B. With panel data we can control for factors that:

1. Vary across entities but do not vary over time

2. Could cause omitted variable bias if they are omitted

3. Are unobserved or unmeasured and therefore cannot be included in the

regression

II. Panel Data with Two Time Periods: Yit = B0 + B1Xit + B2Zi + uit

A. Zi is a factor that does not change over time (at least during the years for which we have data)

1. Suppose Zi is not observed, so its omission could result in omitted variable bias

a) With T=2 years of data, the effect of Zi can be eliminated by analyzing changes between the two years

(1) Any change in Y between t=1 and t=2 cannot be caused by Zi, because Zi does not change

III. Fixed Effects Regression (if T > 2): Yit = B0 + B1Xit + B2Zi + uit, i = 1,...,n, t = 1,...,T

A. We can rewrite this in two useful ways:

1. “n-1 binary regressor” form:

a) Yit = B0 + B1Xit + γ2D2i + … + γnDni + uit

(1) Where D2i = 1 if i = 2 (entity 2) and 0 otherwise

2. “Fixed Effects” form:

a) Yit = B1Xit + ai + uit

(1) ai is called a “state fixed effect” or “state effect”; it is the constant (fixed) effect of being in state i

B. Fixed Effects Regression: Estimation

1. Three estimation methods:

a) “n-1 binary regressors” OLS regression

(1) First create the binary variables D2i,...,Dni

(2) Then estimate Yit = B0 + B1Xit + γ2D2i + … + γnDni + uit by OLS

(3) Inference (hypothesis tests, confidence intervals, etc.) is as usual

(4) This is only practical when n is small

b) “Entity-demeaned” OLS regression (see the sketch below)

c) “Changes” specification, without an intercept (only works for T=2)
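
A sketch of the “entity-demeaned” estimator on a simulated balanced panel (entity count, true coefficient, and variable names are made up; the fixed effect ai is built to be correlated with x so that pooled OLS is visibly biased):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n, T = 50, 5
    df = pd.DataFrame({"i": np.repeat(np.arange(n), T)})
    alpha = rng.normal(0, 2, n)[df["i"]]            # entity fixed effect a_i
    df["x"] = alpha + rng.normal(0, 1, n * T)       # x correlated with a_i
    df["y"] = 2.0 * df["x"] + alpha + rng.normal(0, 1, n * T)

    pooled = sm.OLS(df["y"], sm.add_constant(df["x"])).fit()
    print("pooled OLS slope (biased):", pooled.params["x"])

    # Entity-demeaning: subtract each entity's time average, which wipes out a_i
    df["x_dm"] = df["x"] - df.groupby("i")["x"].transform("mean")
    df["y_dm"] = df["y"] - df.groupby("i")["y"].transform("mean")

    fe = sm.OLS(df["y_dm"], df["x_dm"]).fit()       # no intercept: demeaned data has mean zero
    print("fixed effects slope:", fe.params["x_dm"])  # close to the true B1 = 2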

IV. Regression with Time Fixed Effects

A. An omitted variable may vary over time but not across states

1. Let St denote the combined effect of variables that change over time but not across states: Yit = B0 + B1Xit + B2Zi + B3St + uit

B. Two formulations of regression with time fixed effects

1. “T-1” binary regressor formulation:

a) Yit = B0 + B1Xit + δ2B2t + … + δTBTt + uit

b) Where B2t = 1 when t = 2 (year 2) and 0 otherwise

2. “Time effects” formulation: Yit = B1Xit + λt + uit

C. Time fixed effects: estimation methods

1. “T-1 binary regressor” OLS regression:

a) Yit = B0 + B1Xit + δ2B2t + … + δTBTt + uit

b) Create the binary variables B2,...,BT

c) B2 = 1 if t = year 2, and 0 otherwise

d) Regress using OLS (sketched below with dummy variables)

2. “Year-demeaned” OLS regression

a) Deviate Yit, Xit from year (not entity) averages

b) Estimate by OLS using the “year-demeaned” data
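
A sketch of the T-1 binary regressor approach on a simulated panel (true values made up); patsy's C(t) expands the year variable into T-1 dummies plus an intercept, absorbing the common shock St:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(8)
    n, T = 50, 4
    df = pd.DataFrame({"i": np.repeat(np.arange(n), T), "t": np.tile(np.arange(T), n)})
    s_t = np.array([0.0, 1.0, 2.0, 3.0])[df["t"]]   # shock that varies over time, not entities
    df["x"] = rng.normal(0, 1, n * T)
    df["y"] = 2.0 * df["x"] + s_t + rng.normal(0, 1, n * T)

    res = smf.ols("y ~ x + C(t)", data=df).fit()    # C(t): T-1 year dummies
    print(res.params["x"])  # close to the true B1 = 2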

V. Standard Errors for Fixed Effects Regression

A. LS Assumptions for Panel Data: Yit = B1Xit + ai + uit, i = 1,...,n, t = 1,...,T

1. E(uit | Xi1,...,XiT, ai) = 0

a) uit has mean zero, given the entity fixed effect and the entire history of the X’s for that entity

2. (Xi1,..., XiT, ui1,...,uiT), i = 1,...,n, are i.i.d. draws from their joint distribution

a) Does not require observations to be i.i.d. over time for the same entity

3. (Xit, uit) have finite fourth moments


4. No perfect multicollinearity

Chapter 11 - Regression with a Binary Dependent Variable

I. The Linear Probability Model

A. When Y is binary, the linear regression model: Yi = B0 + B1Xi + ui

1. Called the linear probability model because:

a) Pr(Y=1|X) = B0 + B1Xi

b) The predicted value is a probability:

(1) E(Y|X=x) = Pr(Y=1|X=x) = the probability that Y=1 given x

(2) Y-hat = the predicted probability that Yi=1, given X

c) B1 = the change in the probability that Y=1 for a unit change in x:

(1) B1 = [Pr(Y=1|X = x + Δx) − Pr(Y=1|X = x)]/Δx

B. Advantages

1. Simple to estimate and to interpret

2. Inference is the same as for multiple regression

C. Disadvantages

1. An LPM says that the change in the predicted probability for a given change in X is the same for all values of X, which often doesn’t make sense; the predicted probabilities can also be <0 or >1

a) These disadvantages can be addressed by the following nonlinear probability models

II. Probit and Logit Regression

A. We don’t want the probability that Y=1 to be linear in X; instead we want:


1. Pr(Y=1|X) to be increasing in X for B1 > 0, and

2. 0<= Pr(Y=1|X) <= 1 for all X

B. Probit regression - models the probability that Y=1 using the cumulative standard normal distribution function, Φ(z), evaluated at z = B0 + B1X

1. Probit Model: Pr(Y=1|X) = Φ(B0 + B1X)

a) Where Φ is the cumulative standard normal distribution function and z = B0 + B1X is the “z-value” of the probit model

C. Logit regression - models the probability that Y=1, given X, as the cumulative standard logistic distribution function, evaluated at z = B0 + B1X

1. Logit Model: Pr(Y=1|X) = F(B0 + B1X)

a) Where F is the cumulative logistic distribution function:

(1) F(B0 + B1X) = 1/(1 + e^(-(B0 + B1X)))
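
A sketch fitting both models to data simulated from a probit specification (true coefficients are made up); statsmodels provides Probit and Logit directly:

    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import norm

    rng = np.random.default_rng(9)
    n = 2000
    x = rng.normal(0, 1, n)
    y = (rng.uniform(size=n) < norm.cdf(-0.5 + 1.0 * x)).astype(int)  # probit data

    X = sm.add_constant(x)
    probit = sm.Probit(y, X).fit(disp=0)
    logit = sm.Logit(y, X).fit(disp=0)
    print(probit.params)          # close to (-0.5, 1.0)
    print(logit.params)           # logit coefficients differ in scale from probit ones
    print(probit.predict(X)[:5])  # predicted probabilities, always between 0 and 1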

III. Estimation and Inference in Probit and Logit

A. The R^2 and R-bar^2 don’t make sense here, so two other specialized measures are used (both computed in the sketch below):

1. The fraction correctly predicted = the fraction of observations for which the predicted probability is >50% when Yi=1, or is <50% when Yi=0

2. The pseudo-R^2 measures the improvement in the value of the log likelihood relative to having no X’s; it simplifies to the R^2 in the linear model with normally distributed errors
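
A sketch computing both measures for a probit fit on simulated data (same made-up data-generating process as above):

    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import norm

    rng = np.random.default_rng(10)
    n = 2000
    x = rng.normal(0, 1, n)
    y = (rng.uniform(size=n) < norm.cdf(-0.5 + x)).astype(int)
    X = sm.add_constant(x)
    probit = sm.Probit(y, X).fit(disp=0)

    p_hat = probit.predict(X)
    print("fraction correctly predicted:", np.mean((p_hat > 0.5) == (y == 1)))
    print("pseudo-R2:", probit.prsquared)  # McFadden: 1 - llf/llnull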

Chapter 12 - Instrumental Variables Regression

I. IV Regression: Why and What; Two Stage Least Squares


A. Can address three important threats to internal validity: omitted variable bias, simultaneous causality bias, and errors-in-variables bias; all three can result in E(u|X) ≠ 0, which can be fixed using an instrumental variable, Z

B. IV regression breaks X into two parts: a part that might be correlated with u, and a part that is not; by isolating the latter, it is possible to estimate B1

1. This is done using an instrumental variable, Zi, which is correlated with Xi but not with ui

2. Endogenous variable - one that is correlated with u

3. Exogenous variable - one that is uncorrelated with u

C. Two conditions for a valid instrument:

1. Instrument relevance: corr(Zi, Xi) ≠ 0

2. Instrument exogeneity: corr(Zi, ui) = 0

D. IV estimator with one X and one Z

1. Two Stage Least Squares (TSLS)

a) Stage 1 - isolate the part of X that is uncorrelated with u by regressing X on Z using OLS: Xi = π0 + π1Zi + vi

(1) Because Z is uncorrelated with ui, π0 + π1Zi is uncorrelated with ui; we do not know π0 and π1, but we can estimate them

(2) Compute the predicted values of Xi: X-hati = π-hat0 + π-hat1Zi, i = 1,...,n

b) Stage 2 - replace Xi with X-hati in the regression and regress Y on X-hati (both stages are sketched below)
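
A sketch of both stages on simulated data where X is endogenous by construction (instrument strength and true coefficients are made up); note that running the two stages by hand gives the right point estimate but incorrect second-stage standard errors, so dedicated IV routines are used in practice:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(11)
    n = 5000
    z = rng.normal(0, 1, n)                        # instrument: relevant and exogenous
    u = rng.normal(0, 1, n)
    x = 0.8 * z + 0.5 * u + rng.normal(0, 1, n)    # x is endogenous: correlated with u
    y = 1.0 + 2.0 * x + u                          # true B1 = 2

    ols = sm.OLS(y, sm.add_constant(x)).fit()
    print("OLS slope (biased):", ols.params[1])

    # Stage 1: regress X on Z, keep fitted values (the part of X uncorrelated with u)
    stage1 = sm.OLS(x, sm.add_constant(z)).fit()
    x_hat = stage1.fittedvalues

    # Stage 2: regress Y on X-hat; the slope is the TSLS estimate of B1
    stage2 = sm.OLS(y, sm.add_constant(x_hat)).fit()
    print("TSLS slope:", stage2.params[1])         # close to 2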


II. The General IV Regression Model

A. Terminology:

1. Identification - in IV regression, whether the coefficients are identified

depends on the relation between the number of instruments (m) and the

number of endogenous regressors (k)

a) If there are fewer instruments than endogenous regressors, we can’t estimate B1,...,Bk

b) The coefficients B1,...,Bk are said to be:

(1) Exactly identified if m = k

(2) Overidentified if m > k

(3) Underidentified if m < k

B. Summary of Jargon:

1. Yi = B0 + B1Xi1 + … + BkXik + Bk+1Wi1 + … + Bk+rWir + ui

2. Xi1,...,Xik are the endogenous regressors (potentially correlated with ui)

3. Wi1,...,Wir are the included exogenous regressors (uncorrelated with ui) or control variables (included so that Zi is uncorrelated with ui once the W’s are included)

4. B0, B1,...,Bk+r are the unknown regression coefficients

5. Zi1,...,Zim are the m instrumental variables (the excluded exogenous variables)

C. IV Regression Assumptions

1. E(ui | Wi1,...,Wir) = 0

a) I.e., the exogenous regressors are exogenous

b) If the W’s are used as control variables:

(1) E(ui | Wi, Zi) = E(ui | Wi)

2. All variables are i.i.d.

3. The X’s, W’s, Z’s, and Y have nonzero, finite fourth moments

4. The instruments (Zi1,...,Zim) are valid
