
AEphd 2023 Week 1 Small

This document provides an overview of an advanced econometrics course for PhD students. The course has two objectives: to provide a theoretical foundation for further econometrics study and to gain practical experience analyzing economic data with econometric methods using Stata software. Prerequisites include matrix algebra, probability and statistics. The course will be graded based on class participation, homework, data exercises, reading reports, and a final exam. Background materials will cover matrix algebra, probability and distribution theory, estimation and inference, and large sample distribution theory.



Advanced Econometrics Overview

 Instructor: Zhihong Chen ([email protected])
 Office: Keyan Building 725
 Office Hour: Wednesday 2:00PM-4:30PM
 Email: [email protected]
 Course Material: [email protected]/ wechat
 Teaching Philosophy:
1. Treat a person as he is, and he will remain as he is. Treat him as he could be, and he will become what he should be.
2. Education is what people do to you; learning is what you do to yourself.

 This is an advanced econometrics course for Ph.D. students.
 The course has two objectives:
----to provide a theoretical foundation useful for further study of econometrics
----to gain practical experience in analyzing economic data with econometric methods. Using the popular software Stata to do econometric analysis will be taught as well.
 Prerequisite: matrix algebra, probability and statistics

Textbook
 William Greene: Econometric Analysis
 Joshua D. Angrist, Jörn-Steffen Pischke: Mostly Harmless Econometrics: An Empiricist's Companion
Reference Book
 Introductory:
1. Peter Kennedy: A Guide to Econometrics (Baby book)
2. Stock and Watson: Introduction to Econometrics
3. Michael P. Murray: Econometrics: A Modern Introduction
4. Philip H. Franses: Enjoyable Econometrics
5. Joshua D. Angrist, Jörn-Steffen Pischke: Mastering 'Metrics: The Path from Cause to Effect
 Intermediate:
1. Jeffrey Wooldridge: Introductory Econometrics: A Modern Approach
 Advanced:
1. Jeffrey Wooldridge: Econometric Analysis of Cross Section and Panel Data
2. Bruce Hansen: Econometrics

Syllabus--Grading
 Class Participation and Homework (15%), Data Visualization Exercise (15%), Two Reading Reports (20%), Final Exam (50%).
 All assignments and exams should be submitted on time. No points for late work. Of course, if there is a verifiable medical reason and arrangements are made before the exam, an adjustment might be made. Please take academic integrity seriously.

What is Econometrics: Examples
1. What is the effect of reducing class size on student achievement?
2. Is there racial discrimination in the market for home loans?
3. How much do cigarette taxes reduce smoking?
4. What will inflation be next year?

Quantitative Features of Modern Economics
 Features:
-mathematical modeling for economic theory
-empirical analysis for economic phenomena
 General methodology of modern economic research:
1. Data collection and summary of empirical stylized facts.
2. Development of economic theories/models.
3. Empirical verification of economic models.
4. Applications: to test economic theory or hypotheses, to forecast the future evolution of the economy, and to make policy recommendations.
What is Econometrics: Definition
Frisch (1933):
Econometrics is by no means the same as economic statistics. Nor is it identical with what we call general economic theory, although a considerable portion of this theory has a definitely quantitative character. Nor should econometrics be taken as synonymous with the application of mathematics to economics. Experience has shown that each of these three viewpoints, that of statistics, economic theory, and mathematics, is a necessary, but not by itself a sufficient, condition for a real understanding of the quantitative relations in modern economic life. It is the unification of all three that is powerful. And it is this unification that constitutes econometrics.

Limitation of Econometrics
 Econometrics is the analysis of the "average behavior" of a large number of realizations. However, economic data are not produced by a large number of repeated random experiments, due to the fact that an economy is not a controlled experiment:
1. An economic theory or model can only capture the main or most important factors.
2. An economy is an irreversible or non-repeatable system.
3. Economic relationships are often changing over time for an economy.
4. Data quality.

Background for Learning this Course
Greene: Appendix A-D

APPENDIX A: Matrix Algebra
 Algebraic Manipulation of Matrices
 Geometry of Matrices
 Solution of a System of Linear Equations
 Partitioned Matrices
 Characteristic Roots and Vectors
 Quadratic Forms and Definite Matrices
 Calculus and Matrix Algebra

APPENDIX B: Probability and Distribution Theory
 Random Variables
 Expectations of a Random Variable
 Some Specific Probability Distributions
 The Distribution of a Function of a Random Variable
 Representations of a Probability Distribution
 Joint Distributions
 Conditioning in a Bivariate Distribution
 The Bivariate Normal Distribution
 Multivariate Distributions
 Moments
 The Multivariate Normal Distribution
Background for Learning this Course
APPENDIX C: Estimation and Inference
 Statistics as Estimators—Sampling Distributions
 Point Estimation of Parameters; Interval Estimation
 Hypothesis Testing
APPENDIX D: Large Sample Distribution Theory (Introduce)

Causal Effects and Idealized Experiments
 Causality means that a specific action leads to a specific, measurable consequence.
 Randomized controlled experiment: control group (receives no treatment), treatment group (receives treatment).
 A causal effect is defined to be the effect on an outcome of a given action or treatment.
 You don't need to know a causal relationship to make a good forecast.

Correlation or Causation?
(By Vali Chandrasekaran)
http://www.businessweek.com/magazine/correlation-or-causation-12012011-gfx.html

Extended Reading

It ushers in three big shifts: more, messy, and correlations (the book's chapters 2, 3, and 4).
Instead of trying to uncover causality, the reasons behind things, it is often sufficient to simply uncover practical answers. So if some combination of aspirin and orange juice puts a deadly disease into remission, it is less important to know what the biological mechanism is than to just drink the potion. For many things, with big data it is faster, cheaper, and good enough to learn "what," not "why."

Data: Source and Types
 Experimental data come from experiments designed to evaluate a treatment or policy or to investigate a causal effect.
 Non-experimental data are obtained by observing actual behavior outside an experimental setting. Observational data pose major challenges for econometrics.
Data: Source and Types
 There are several different kinds of economic data sets:
 Cross-sectional data
 Time series data
 Pooled cross sections
 Panel/Longitudinal data
 Econometric methods depend on the nature of the data used. Use of inappropriate methods may lead to misleading results.

Types of Data – Cross-sectional Data
 Cross-sectional data is a random sample.
 Each observation is a new individual, firm, etc. with information at a point in time.
 If the data is not a random sample, we have a sample-selection problem.
 The analysis of cross-sectional data is closely aligned with the applied microeconomics fields, such as labor economics, industrial organization, and health economics.
 The fact that the ordering of the data does not matter for econometric analysis is a key feature of cross-sectional data sets obtained from random sampling.

Types of Data – Time Series
 This includes observations of a variable or several variables over time.
 Typical applications include applied macroeconomics and finance. Examples include stock prices, money supply, consumer price index, gross domestic product, annual homicide rates, automobile sales, and so on.
 Time series observations are typically serially correlated.
 Ordering of observations conveys important information.
 Data frequency may include daily, weekly, monthly, quarterly, annually, and so on.
 Typical features of time series include trends and seasonality.
Types of Data – Pooled Cross Sections
 Two or more cross sections are combined in one data set.
 Cross sections are drawn independently of each other.
 Pooled cross sections are often used to evaluate policy changes.
 Example: Evaluating the effect of a change in property taxes on house prices:
--Random sample of house prices for the year 1993.
--A new random sample of house prices for the year 1995.
--Compare before/after (1993: before reform, 1995: after reform).

This is NOT a panel data set!

Types of Data – Panel or Longitudinal Data
 The same cross-sectional units are followed over time.
 Panel data have a cross-sectional and a time series dimension.
 Panel data can be used to account for time-invariant unobservables.
 Panel data can be used to model lagged responses.
 Example: City crime statistics; each city is observed in two years.
--Time-invariant unobserved city characteristics may be modeled.
--The effect of police on crime rates may exhibit a time lag.
This IS a panel data set!

Classical Linear Regression Model
Ch 2 Assumptions of the classical linear regression model (CLRM)

Linear Regression Model
y = β0 + β1x + ε
y = β0 + β1x1 + β2x2 + … + βkxk + ε
A Simple Example: Reed Auto Sales
Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.

Simple Linear Regression
Number of TV Ads    Number of Cars Sold
1                   14
3                   24
2                   18
1                   17
3                   27
Sample? Observation? Dependent Variable? Independent Variable? Model?

Example: Reed Auto Sales
[Scatter diagram: Cars Sold versus TV Ads]
Question: Do districts with smaller classes (lower STR) have higher test scores?
[Scatter plot: Test score versus STR]

Estimation Process
Regression Model: y = β0 + β1x + ε
Regression Equation: E(y) = β0 + β1x
Unknown Parameters: β0, β1
Sample Data: (x1, y1), …, (xn, yn)
Sample Statistics: b0, b1
Estimated Regression Equation: ŷ = b0 + b1x
b0 and b1 provide estimates of β0 and β1.
Matrix Form: Y = Xβ + ε

y1 = β0 + β1x11 + β2x12 + ⋯ + βkx1k + ε1
y2 = β0 + β1x21 + β2x22 + ⋯ + βkx2k + ε2
⋮
yi = β0 + β1xi1 + β2xi2 + ⋯ + βkxik + εi
⋮
yn = β0 + β1xn1 + β2xn2 + ⋯ + βkxnk + εn

Define:
Y = (y1, y2, …, yn)′
X = the n-row matrix whose i-th row is (1, xi1, …, xik)
β = (β0, β1, …, βk)′
ε = (ε1, ε2, …, εn)′

Assumptions of the Classical Linear Regression Model
 A1. Linearity (in parameters)
 A2. Full rank: There is no exact linear relationship among any of the independent variables in the model.
 A3. Exogeneity of the independent variables
 A4. Homoscedasticity and nonautocorrelation
 A5. Data generation
 A6. Normal distribution: The disturbances are normally distributed

Classical Linear Regression Model
Ch 3 Estimation and explanation of the CLRM

3.2 Least Squares Regression
 Given the intuitive idea of fitting a line, we can set up a formal minimization problem.
 That is, we want to choose our parameters such that we minimize the following:

∑_{i=1}^{n} ûi² = ∑_{i=1}^{n} (yi − β̂0 − β̂1xi)²

Deriving OLS Estimates: Least Squares Estimation of the Coefficients

 Objective: Min ∑_{i=1}^{n} (Yi − Ŷi)² = ∑_{i=1}^{n} (Yi − b0 − b1xi)²
 The OLS estimator minimizes the average squared difference between the actual values of Yi and the predictions (fitted values) based on the estimated line.
 Take derivatives (first-order conditions):
n⁻¹ ∑_{i=1}^{n} (yi − β̂0 − β̂1xi) = 0
n⁻¹ ∑_{i=1}^{n} xi(yi − β̂0 − β̂1xi) = 0
 Therefore:
β̂0 = ȳ − β̂1x̄
β̂1 = ∑(xi − x̄)(yi − ȳ) / ∑(xi − x̄)²
Example: Reed Auto Sales
[Scatter diagram with fitted regression line ŷ = 10 + 5x: Cars Sold versus TV Ads]

Matrix Form
Def b = argmin over b0 of (Y − Xb0)′(Y − Xb0)
      = argmin over b0 of (Y′Y − b0′X′Y − Y′Xb0 + b0′X′Xb0)
      = argmin over b0 of (Y′Y − 2Y′Xb0 + b0′X′Xb0)    (b0′X′Y and Y′Xb0 are equal scalars)
FOC: ∂Q/∂b0 = −2X′Y + 2X′Xb0 = 0
⇒ X′Xb0 = X′Y    (Normal equations)
☆ LS estimator: β̂ = (X′X)⁻¹X′Y
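The course does its computing in Stata, but the algebra above is easy to verify in a few lines of NumPy. This sketch fits the Reed Auto data with both the scalar formulas and the matrix normal equations; the data and the fitted line ŷ = 10 + 5x come from the slides.

```python
import numpy as np

# Reed Auto data from the slides: TV ads (x) and cars sold (y)
x = np.array([1.0, 3.0, 2.0, 1.0, 3.0])
y = np.array([14.0, 24.0, 18.0, 17.0, 27.0])

# Scalar formulas: b1 = sum((xi - xbar)(yi - ybar)) / sum((xi - xbar)^2), b0 = ybar - b1*xbar
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Matrix form: solve the normal equations X'X b = X'y
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)

print(b0, b1)   # 10.0 5.0, matching the slide's fitted line y-hat = 10 + 5x
print(b)        # ≈ [10. 5.], same answer from the normal equations
```

Both routes give the same coefficients, which is the point of the normal-equation derivation: the scalar FOCs are just the two rows of X′Xb = X′y.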

Projection
Ŷ = Xβ̂ (fitted value) = X(X′X)⁻¹X′Y
Define P = X(X′X)⁻¹X′    (Projection matrix)
M = I − X(X′X)⁻¹X′ = I − P    (Residual maker)
P, M are symmetric and idempotent.
e = Y − Ŷ = Y − Xβ̂    (Residual. Note: ε = Y − Xβ is the error.)
  = Y − X(X′X)⁻¹X′Y = (I − X(X′X)⁻¹X′)Y = MY
Note: if X = ι, an n×1 column of ones, then M⁰ = I − (1/n)ιι′, where ιι′ is the n×n matrix of ones.
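A quick numerical check of the projection algebra. The X and y below are made-up random data, used only to illustrate that P and M are symmetric and idempotent and that Py and My reproduce the fitted values and residuals:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant + 2 regressors
y = rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection matrix
M = np.eye(n) - P                      # residual maker

# Symmetric and idempotent
assert np.allclose(P, P.T) and np.allclose(P @ P, P)
assert np.allclose(M, M.T) and np.allclose(M @ M, M)

# Py = fitted values X*b, My = residuals y - X*b
b = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(P @ y, X @ b)
assert np.allclose(M @ y, y - X @ b)
```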
Projection
① PM = MP = 0
② PX = X, MX = 0, X′e = X′MY = 0
③ Y = PY + MY = Xb + e
④ Y′Y = Y′P′PY + Y′M′MY = Ŷ′Ŷ + e′e

3.3 Partitioned Regression and Partial Regression
Suppose that the regression involves two sets of variables X1 and X2. Then y = Xβ + ε = X1β1 + X2β2 + ε.
The normal equations are X′Xb = X′y, or:
[X1′X1  X1′X2] [b1]   [X1′y]
[X2′X1  X2′X2] [b2] = [X2′y]
Then:
X1′X1b1 + X1′X2b2 = X1′y    (1)
X2′X1b1 + X2′X2b2 = X2′y    (2)

3.3 Partitioned Regression and Partial Regression
From (2): b2 = (X2′X2)⁻¹X2′(y − X1b1)
Similarly, b1 = (X1′X1)⁻¹X1′(y − X2b2)    (3-18)
What is this? The regression of (y − X2b2) on X1. If we knew b2, this would be the solution for b1.
Theorem 3.1: If X2′X1 = 0, then b1 = (X1′X1)⁻¹X1′y and b2 = (X2′X2)⁻¹X2′y.

3.3 Partitioned Regression and Partial Regression
 What if X2′X1 ≠ 0?
Substitute b1 = (X1′X1)⁻¹X1′(y − X2b2) into (2): X2′X1b1 + X2′X2b2 = X2′y,
then:
X2′X1(X1′X1)⁻¹X1′y − X2′X1(X1′X1)⁻¹X1′X2b2 + X2′X2b2 = X2′y
Collect the similar terms:
X2′(I − X1(X1′X1)⁻¹X1′)X2 b2 = X2′(I − X1(X1′X1)⁻¹X1′)y
Finally:
b2 = [X2′(I − X1(X1′X1)⁻¹X1′)X2]⁻¹ X2′(I − X1(X1′X1)⁻¹X1′)y = (X2′M1X2)⁻¹X2′M1y
 Application: Corollary 3.2.1, detrending, fixed effects (FE)
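The partitioned-regression result b2 = (X2′M1X2)⁻¹X2′M1y (the Frisch-Waugh-Lovell logic) can be checked numerically. This sketch, on made-up random data, confirms it reproduces the X2 block of the full-regression coefficient vector:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant + one regressor
X2 = rng.normal(size=(n, 2))                            # two more regressors
y = rng.normal(size=n)

# Full regression of y on [X1, X2]
X = np.hstack([X1, X2])
b = np.linalg.solve(X.T @ X, X.T @ y)

# Partitioned formula: b2 = (X2' M1 X2)^{-1} X2' M1 y, with M1 = I - X1 (X1'X1)^{-1} X1'
M1 = np.eye(n) - X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
b2 = np.linalg.solve(X2.T @ M1 @ X2, X2.T @ M1 @ y)

print(np.allclose(b2, b[2:]))   # True: same coefficients on X2
```

This is the algebra behind detrending and fixed-effects (within) estimation: partialling X1 out of both y and X2 first gives the same X2 coefficients as the full regression.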
Two Theorems

3.4 Partial Regression and Partial Correlation Coefficients
 The interpretation of partial regression as "net of the effect of …"
 Partial Correlation Coefficients: correlation between sets of residuals.
 Partial correlations and coefficients can have signs and magnitudes that differ greatly from gross correlations and simple regression coefficients.

3.4 Partial Regression and Partial Correlation Coefficients

3.5 Goodness of Fit and the Analysis of Variance
Theorem 3.5: Change in the Sum of Squares When a Variable Is Added to a Regression.
Let u = the residual in the regression of y on [X, z],
e = the residual in the regression of y on X alone,
then u′u = e′e − c²(z*′z*) ≤ e′e,
where z* = MXz and c is the coefficient on z in the regression of y on [X, z].

SST = ∑_{i=1}^{n} (Yi − Ȳ)²,  SSR = ∑_{i=1}^{n} (Ŷi − Ȳ)²,  SSE = ∑_{i=1}^{n} (Yi − Ŷi)²
SST = SSR + SSE
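Theorem 3.5 is an exact identity, so it can be verified on any data set. This sketch, on made-up random data, checks u′u = e′e − c²(z*′z*) directly:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
z = rng.normal(size=n)
y = rng.normal(size=n)

def residuals(A, v):
    """Least-squares residuals from regressing v on the columns of A."""
    return v - A @ np.linalg.solve(A.T @ A, A.T @ v)

e = residuals(X, y)                            # y on X alone
Xz = np.column_stack([X, z])
u = residuals(Xz, y)                           # y on [X, z]
c = np.linalg.solve(Xz.T @ Xz, Xz.T @ y)[-1]   # coefficient on z
z_star = residuals(X, z)                       # z* = M_X z

# u'u = e'e - c^2 (z*'z*) <= e'e: adding z cannot raise the sum of squared residuals
assert np.isclose(u @ u, e @ e - c ** 2 * (z_star @ z_star))
assert u @ u <= e @ e
```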
Analysis of Variance
SST = ∑_{i=1}^{n} (Yi − Ȳ)² = Y′M⁰Y
Y′M⁰Y = (Xb + e)′M⁰(Xb + e)
      = b′X′M⁰Xb + e′M⁰Xb + b′X′M⁰e + e′M⁰e
      = b′X′M⁰Xb + e′M⁰e    (the cross terms vanish since X′e = 0)
      = ∑_{i=1}^{n} (Ŷi − Ȳ)² + ∑_{i=1}^{n} (Yi − Ŷi)²
      = SSR + SSE

Goodness of Fit
R² = SSR/SST = 1 − SSE/SST
R² is bounded by zero and one only if:
(a) There is a constant term in X, and
(b) The line is computed by linear least squares.
There is no absolute basis for comparing R².
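The ANOVA decomposition and the two equivalent R² formulas can be checked on simulated data. In this sketch a constant is included in X, so SST = SSR + SSE holds and R² lies between zero and one, as the slide's conditions (a) and (b) require:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
x = rng.normal(size=n)
y = 2 + 0.5 * x + rng.normal(size=n)   # simulated data with a true linear signal

X = np.column_stack([np.ones(n), x])   # constant term included
b = np.linalg.solve(X.T @ X, X.T @ y)
yhat = X @ b
e = y - yhat

SST = np.sum((y - y.mean()) ** 2)
SSR = np.sum((yhat - y.mean()) ** 2)
SSE = e @ e

assert np.isclose(SST, SSR + SSE)                # ANOVA decomposition holds
assert np.isclose(SSR / SST, 1 - SSE / SST)      # the two R^2 formulas agree
R2 = SSR / SST
print(0 <= R2 <= 1)   # True
```

Dropping the constant from X breaks the decomposition: the cross term e′M⁰Xb no longer vanishes, and 1 − SSE/SST can go negative.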

Goodness of Fit
R̄² = 1 − (n − 1)(1 − R²)/(n − k)
In a multiple regression, R̄² will fall (rise) when the variable x is deleted from the regression if the square of the t ratio associated with this variable is greater (less) than 1.

Adding Variables
 R² never falls when a variable z is added to the regression.
 Theorem 3.6: Change in R² when adding a variable. R²Xz, with both X and the variable z, equals R²X, with only X, plus the increase in fit due to z after X is accounted for:
R²Xz = R²X + (1 − R²X) r*²yz|X
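Both claims above are exact and can be verified numerically. This sketch, on made-up random data, computes R̄² with and without the last regressor and checks that R̄² falls on deletion exactly when the deleted variable's squared t ratio exceeds 1, while plain R² never rises on deletion:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant + 2 regressors
y = rng.normal(size=n)

def fit(A, y):
    """Return coefficients, residuals, R^2 and adjusted R^2 (k = number of columns of A)."""
    b = np.linalg.solve(A.T @ A, A.T @ y)
    e = y - A @ b
    R2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)
    R2bar = 1 - (n - 1) * (1 - R2) / (n - A.shape[1])
    return b, e, R2, R2bar

b, e, R2_full, R2bar_full = fit(X, y)
_, _, R2_small, R2bar_small = fit(X[:, :2], y)   # delete the last regressor

# t ratio of the deleted variable in the full regression
s2 = (e @ e) / (n - X.shape[1])
se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[2, 2])
t = b[2] / se

assert (R2bar_small < R2bar_full) == (t ** 2 > 1)  # adjusted R^2 rule from the slide
assert R2_small <= R2_full                          # plain R^2 never rises on deletion
```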
3.6 Linearly Transformed Regression
 Def: Z = XP for a K×K nonsingular P, as a linear transformation of the regressors. How does the transformation affect the results of least squares?
 The transformation does affect the estimates:
Based on X, b = (X′X)⁻¹X′y.
Based on Z, c = (Z′Z)⁻¹Z′y = (P′X′XP)⁻¹P′X′y = P⁻¹(X′X)⁻¹P′⁻¹P′X′y = P⁻¹b.
 The transformation does not affect the fit of the model to the data:
 The "fitted value" is Zc = (XP)(P⁻¹b) = Xb. The same!!
 The residuals from using Z are y − Zc = y − Xb (we just proved this). The same!!
 The sum of squared residuals must be identical, as y − Xb = e = y − Zc.
 R² must also be identical, as R² = 1 − e′e/y′M⁰y.
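A numerical check of the invariance result. The transformation matrix here is a made-up random K×K matrix (nonsingular with probability one), used only to illustrate that c = P⁻¹b while the fitted values, residuals, and fit are unchanged:

```python
import numpy as np

rng = np.random.default_rng(5)
n, K = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = rng.normal(size=n)
Pmat = rng.normal(size=(K, K))   # an arbitrary nonsingular K x K transformation
Z = X @ Pmat                      # linearly transformed regressors

b = np.linalg.solve(X.T @ X, X.T @ y)   # based on X
c = np.linalg.solve(Z.T @ Z, Z.T @ y)   # based on Z = XP

assert np.allclose(c, np.linalg.solve(Pmat, b))  # c = P^{-1} b: the estimates change
assert np.allclose(Z @ c, X @ b)                  # fitted values are identical
assert np.allclose(y - Z @ c, y - X @ b)          # residuals (hence SSE and R^2) identical
```

This is why, for example, rescaling a regressor's units rescales its coefficient but leaves fitted values, residuals, and R² untouched.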
