Econ 3049: Econometrics
Semester 1 - 2009
Department of Economics
The University of the West Indies, Mona
Contents
3 Model Estimation
3.1 Method 1: Method of Moments
3.2 Method 2: Ordinary Least Squares (OLS)
3.3 Properties of the OLS regression line (SRF)
5 R-Squared (R²)
5.1 Properties of R²
5.2 Sample Correlation (r) and R²
5.3 Estimating the error variance σ²
6.5 Prediction
1 Introduction
1.1 Definition of Econometrics
The analysis of economic phenomena by applying Mathematics and Statistical Inference to
economic theory with the ultimate aim of empirically verifying the theory.
7. Forecast or Predict.
8. Use the empirical results of the econometric model for control or policy prescription.
Deterministic (Functional) - involves variables that are non-random or non-stochastic. Examples of deterministic relations are Newton's laws of gravity and motion; such relations are found in classical Physics.
In this course we abstract from deterministic relations and deal only with statistical
relations.
Note:
• Regression Analysis - the dependent variable is stochastic but the explanatory variable
is fixed or non-stochastic.
Note: Ceteris paribus is crucial to causal analysis because we cannot establish causality
without holding other factors constant. For example:
Time Series Data - A collection of observations on the values that a variable takes at
different points in time. Intervals can be daily, monthly, yearly etc.
Pooled Cross Section - Combining sets of cross-sectional data to increase the sample size. Example: cross-sectional household surveys in two different years (two different random samples).
Panel or Longitudinal Data - A time series data set for each cross-sectional member in the data set. Example: wage data on a set of individuals over a 25-year period.
Note: The distinction between the latter two data structures is that in panel data, the same cross-sectional units are followed over the given period.
In this course, we restrict our focus to cross-sectional data.
2 Simple Regression Analysis
2.1 Some basic Concepts
Recall the aim of regression analysis. Now let Y be the dependent variable, X be the
explanatory variable and (Y , X) be drawn from the same population of interest. We want
a functional form that will allow us to express Y in terms of X. In the context of a Simple
Linear Regression Model, we write
Y = β0 + β1 X + U (2.1)
Equation (2.1) is also called a “two variable linear regression model” or a “bivariate linear
regression model”.
Y Variable   |  X Variable
Dependent    |  Independent
Explained    |  Explanatory
Response     |  Control
Predicted    |  Predictor
Regressand   |  Regressor
             |  Covariate
In equation (2.1), U is known as the error term or disturbance term. That is, U captures
all elements (factors) other than X that affect Y . Note that U is unobserved.
Y = β0 + β1 X + U
⇒ ∆Y = β1 ∆X + ∆U (2.2)
⇒ ∆Y = β1 ∆X if ∆U = 0
• β1 is known as the slope parameter (coefficient of X).
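For example (illustrative numbers): if β1 = 0.5, then holding the unobserved factors fixed (∆U = 0), a one-unit increase in X changes Y by 0.5 units, and a ten-unit increase in X changes Y by 5 units.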
Now assume: (a) E(U ) = 0 and (b) E(U | X) = E(U ). Then (b) implies that (i) X and U are uncorrelated and (ii) X and U are not linearly related. Taking the conditional expectation of (2.1) and applying (a) and (b) gives

E(Y | X) = β0 + β1 X (2.3)

Combining (2.1) and (2.3) we have Y = E(Y | X) + U . Equation (2.3) is known as the “Population Regression Function” (PRF). Note that β0 and β1 are unknown but fixed parameters in the PRF.
In regression analysis we seek to estimate the parameters of the PRF.
Note: We will use “linear” in simple linear regression to mean linear in the parameters!
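For instance (illustrative models): Y = β0 + β1 log(X) + U and Y = β0 + β1 X² + U are both linear in the parameters, and so count as linear in this sense, whereas Y = β0 + X^β1 + U and Y = 1/(β0 + β1 X) + U are not.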
Figure 1: Graph of Fitted values and Residuals
3 Model Estimation
Let us begin with Equation (2.1):
Yi = β0 + β1 Xi + Ui , for i = 1, . . . , n
Given that the population regression function is not directly observable, we estimate the sample regression function (SRF):
Yi = β̂0 + β̂1 Xi + Ûi ,
where:
1. β̂0 is the estimator of β0 ;
2. β̂1 is the estimator of β1 ;
3. Ûi is the residual, that is, the difference between the actual and the estimated values of Yi (Ûi = Yi − Ŷi , with Ŷi = β̂0 + β̂1 Xi ).
3.1 Method 1: Method of Moments
This method requires only the two assumptions in section (2.1) that were used to derive the
PRF, namely (a) E(U ) = 0 and E(U |X) = E(U ). Recall that we can combine (a) and (b)
to obtain E(U |X) = 0 which implies that U and X are uncorrelated. That is,
1. E(U ) = 0
2. E(XU ) = 0

Replacing U by Y − β0 − β1 X and imposing the sample analogues of these two moment conditions gives

(1/n) Σ (Yi − β̂0 − β̂1 Xi ) = 0 (3.1)
(1/n) Σ Xi (Yi − β̂0 − β̂1 Xi ) = 0 (3.2)

where all sums run over i = 1, . . . , n.
Using (3.2) we have
(1/n) Σ Xi (Yi − β̂0 − β̂1 Xi ) = 0
(1/n) Σ (Xi Yi − β̂0 Xi − β̂1 Xi²) = 0
(1/n) Σ Xi Yi − β̂0 (1/n) Σ Xi − β̂1 (1/n) Σ Xi² = 0
(1/n) Σ Xi Yi − β̂0 X̄ − β̂1 (1/n) Σ Xi² = 0

Substituting β̂0 = Ȳ − β̂1 X̄ (from (3.1)) gives

(1/n) Σ Xi Yi − (Ȳ − β̂1 X̄) X̄ − β̂1 (1/n) Σ Xi² = 0

⇒ β̂1 = [ (1/n) Σ Xi Yi − X̄ Ȳ ] / [ (1/n) Σ Xi² − X̄² ]

β̂1 = [ (1/n) Σ (Xi − X̄)(Yi − Ȳ) ] / [ (1/n) Σ (Xi − X̄)² ]

β̂1 = Σ (Xi − X̄)(Yi − Ȳ) / Σ (Xi − X̄)²

Thus, given Y = β0 + β1 X + U , the MOM estimators of β0 and β1 , denoted β̂0 and β̂1 , are as follows:

β̂0 = Ȳ − β̂1 X̄

β̂1 = [ (1/n) Σ Xi Yi − X̄ Ȳ ] / [ (1/n) Σ Xi² − X̄² ]
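A short Python sketch of these formulas, using made-up data for illustration (numpy assumed available):

import numpy as np

# Illustrative (made-up) data
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

Xbar, Ybar = X.mean(), Y.mean()

# beta1_hat = [ (1/n) sum(Xi Yi) - Xbar*Ybar ] / [ (1/n) sum(Xi^2) - Xbar^2 ]
beta1_hat = ((X * Y).mean() - Xbar * Ybar) / ((X ** 2).mean() - Xbar ** 2)

# beta0_hat = Ybar - beta1_hat * Xbar
beta0_hat = Ybar - beta1_hat * Xbar

# Equivalent deviation form: sum((Xi-Xbar)(Yi-Ybar)) / sum((Xi-Xbar)^2)
beta1_dev = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)

print(beta0_hat, beta1_hat)
print(np.isclose(beta1_hat, beta1_dev))  # the two expressions for beta1_hat agree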
Example 3.1. Consider the following data for the two-variable regression model
Yi = β0 + β1 Xi + Ui , for i = 1, . . . , n,
which satisfies all the standard assumptions of the Classical Linear Regression Model:
n = 10, ΣX = 30, ΣY = 20, ΣX² = 92, ΣY² = 50, ΣXY = 64.
Find the estimates β̂0 and β̂1 .
Answer:

β̂1 = [ (1/n) Σ Xi Yi − X̄ Ȳ ] / [ (1/n) Σ Xi² − X̄² ]
   = [ (1/10)(64) − (3)(2) ] / [ (1/10)(92) − (30/10)² ]
   = (6.4 − 6) / (9.2 − 9) = 2

Similarly,

β̂0 = Ȳ − β̂1 X̄ = 2 − (2)(3) = −4
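The same arithmetic can be checked with a few lines of Python working only from the summary statistics given in the example (a sketch for illustration):

# Summary statistics from Example 3.1
n = 10
sum_X, sum_Y = 30, 20
sum_X2, sum_XY = 92, 64

Xbar, Ybar = sum_X / n, sum_Y / n                  # 3.0 and 2.0
beta1_hat = (sum_XY / n - Xbar * Ybar) / (sum_X2 / n - Xbar ** 2)
beta0_hat = Ybar - beta1_hat * Xbar

print(beta1_hat, beta0_hat)                        # approximately 2 and -4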
Formulae (all sums run over i = 1, . . . , n):

1. Σ (Xi − X̄) = 0

2. Σ (Xi − X̄)² = Σ (Xi − X̄) Xi

3. Σ (Xi − X̄)(Yi − Ȳ) = Σ (Xi − X̄) Yi

4. Σ (Xi − X̄)² = Σ Xi² − n X̄²

5. Σ (Xi − X̄)(Yi − Ȳ) = Σ Xi Yi − n X̄ Ȳ

Proofs:

1.
Σ (Xi − X̄) = Σ Xi − Σ X̄ = Σ Xi − n X̄ = n X̄ − n X̄ = 0

2.
Σ (Xi − X̄)² = Σ (Xi − X̄)(Xi − X̄)
 = Σ [ (Xi − X̄) Xi + (Xi − X̄)(−X̄) ]
 = Σ (Xi − X̄) Xi − Σ (Xi − X̄) X̄
 = Σ (Xi − X̄) Xi − X̄ Σ (Xi − X̄)
 = Σ (Xi − X̄) Xi − X̄ · 0
 = Σ (Xi − X̄) Xi

3. Similar to (2).

4.
Σ (Xi − X̄)² = Σ (Xi² − 2 Xi X̄ + X̄²)
 = Σ Xi² − 2 X̄ Σ Xi + n X̄²
 = Σ Xi² − 2 n X̄² + n X̄²
 = Σ Xi² − n X̄²
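A quick numerical check of formulae 1-5, using arbitrary made-up numbers (Python sketch, numpy assumed):

import numpy as np

X = np.array([2.0, 4.0, 5.0, 9.0])
Y = np.array([1.0, 3.0, 2.0, 6.0])
n = len(X)
Xbar, Ybar = X.mean(), Y.mean()
dX, dY = X - Xbar, Y - Ybar                                        # deviation forms

print(np.isclose(dX.sum(), 0))                                     # formula 1
print(np.isclose((dX ** 2).sum(), (dX * X).sum()))                 # formula 2
print(np.isclose((dX * dY).sum(), (dX * Y).sum()))                 # formula 3
print(np.isclose((dX ** 2).sum(), (X ** 2).sum() - n * Xbar ** 2)) # formula 4
print(np.isclose((dX * dY).sum(), (X * Y).sum() - n * Xbar * Ybar))# formula 5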
Example 3.2. Suppose Ȳ = 2, X̄ = 3, n = 10, Σ (Xi − X̄)² = 2, and Σ (Xi − X̄)(Yi − Ȳ) = 4
for the model
Yi = α0 + α1 Xi + Ui , i = 1, . . . , n.
Find the estimates α̂0 and α̂1 .
Answer:

α̂1 = Σ (Xi − X̄)(Yi − Ȳ) / Σ (Xi − X̄)² (deviation form) ⇒ α̂1 = 4 / 2 = 2.

Also,

α̂0 = Ȳ − α̂1 X̄ = 2 − 2(3) = −4
3.2 Method 2: Ordinary Least Squares (OLS)
Write the sample regression function as Yi = Ŷi + Ûi , where Ûi is the residual and Ŷi is the estimated (conditional mean) value of Yi . That is, Ŷi = β̂0 + β̂1 Xi , so that Ûi = Yi − Ŷi .
The least-squares criterion states that β̂0 and β̂1 must be selected so that the sum of squared residuals, Σ Ûi² , is as small as possible. By virtue of the least-squares criterion we therefore seek β̂0 and β̂1 that solve

min_{β̂0 , β̂1} Σ Ûi²
⇒ min_{β̂0 , β̂1} Σ (Yi − β̂0 − β̂1 Xi )²
Differentiating with respect to β̂0 and β̂1 and setting each derivative to zero gives the first-order conditions

Σ (Yi − β̂0 − β̂1 Xi ) = 0 (3.3)

Σ (Yi − β̂0 − β̂1 Xi ) Xi = 0 (3.4)
Equations (3.3) and (3.4) are known as the normal equations. We use equation (3.3) to solve
for β̂0 :
Σ Yi − Σ β̂0 − β̂1 Σ Xi = 0
⇒ Σ Yi − n β̂0 − β̂1 Σ Xi = 0
⇒ β̂0 = (Σ Yi )/n − β̂1 (Σ Xi )/n
or β̂0 = Ȳ − β̂1 X̄ (3.5)
Notation: In this class we will write X̃i ≡ Xi − X̄, that is, X̃i is the deviation of Xi from its mean value (and similarly Ỹi ≡ Yi − Ȳ). Then

β̂1 = Σ X̃i Ỹi / Σ X̃i² (deviation form)
Aside (Method of Moments): (1/n) Σ Ûi = 0 ; (1/n) Σ Ûi Xi = 0
Remark 3.3. The method-of-moments conditions for the sample are identical to the first-order conditions from the OLS approach. Thus, for our classical linear regression models, the estimators from these two estimation approaches are identical.
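A short Python sketch illustrating Remark 3.3 on made-up data: the closed-form (MOM) estimates coincide with the least-squares solution of the normal equations (numpy assumed available):

import numpy as np

# Illustrative (made-up) data
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([1.2, 2.1, 2.8, 4.1, 5.2, 5.9])

# Method-of-moments / closed-form estimates
Xbar, Ybar = X.mean(), Y.mean()
b1 = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
b0 = Ybar - b1 * Xbar

# Least-squares solution of the normal equations (3.3)-(3.4)
A = np.column_stack([np.ones_like(X), X])          # design matrix with an intercept column
(b0_ls, b1_ls), *_ = np.linalg.lstsq(A, Y, rcond=None)

print(np.allclose([b0, b1], [b0_ls, b1_ls]))       # True: the two approaches agree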
3.3 Properties of the OLS regression line (SRF)
1. The SRF passes through the sample means of X and Y ; that is, the point (X̄, Ȳ) lies on the estimated line (this follows directly from β̂0 = Ȳ − β̂1 X̄).
2. The mean value of the estimated Yi , Ŷi , is equal to the mean of the actual Yi . That is, the average of the Ŷi equals Ȳ .
3. The residuals Ûi have mean equal to zero. One implication of this property is that the SRF can be written in deviation form:

Yi − Ȳ = β̂1 (Xi − X̄) + Ûi
⇒ Ỹi = β̂1 X̃i + Ûi (deviation form)

with fitted values satisfying Ŷi − Ȳ = β̂1 X̃i .
4. There is zero correlation between the residuals Ûi and the fitted values Ŷi .
5. There is zero correlation between the residuals Ûi and the explanatory variable Xi .
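These properties are easy to check numerically; a brief Python sketch on made-up data (numpy assumed available):

import numpy as np

X = np.array([1.0, 3.0, 4.0, 6.0, 8.0])
Y = np.array([2.0, 3.5, 5.1, 6.0, 9.3])

Xbar, Ybar = X.mean(), Y.mean()
b1 = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
b0 = Ybar - b1 * Xbar

Y_hat = b0 + b1 * X          # fitted values
U_hat = Y - Y_hat            # residuals

print(np.isclose(Y_hat.mean(), Ybar))        # property 2: mean of fitted values equals Ybar
print(np.isclose(U_hat.mean(), 0))           # property 3: residuals have mean zero
print(np.isclose(np.sum(U_hat * Y_hat), 0))  # property 4: sum(U_hat * Y_hat) = 0, so zero correlation
print(np.isclose(np.sum(U_hat * X), 0))      # property 5: sum(U_hat * X) = 0, so zero correlation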
Questions:
(a) Verify all of the above properties of the SRF by providing a proof for each property.
(b) Do all of the properties hold if the simple linear regression model is of the form Yi = β1 Xi + Ui , i = 1, . . . , n?