Econometrics Notes, Sem II 2020/2021 (October 26, 2021)
Notes
Compiled By
February 2019
Contents
Part I: The Basics of Regression Analysis
1 Introduction to Econometrics
1.1 Definition
1.2 Some History
1.3 The Econometric approach
1.4 Types of models
1.5 Structure of economic data
1.6 Other Important Concepts
1.7 Uses of Econometrics
1.8 Data Generating Mechanisms
1.9 An example
1.10 Assignment: An Econometric Model Example
1.11 Key terms
2 Basic Mathematical Concepts
2.1 Vectors and Matrices
2.2 Matrix Operations
2.3 Matrix Inversion
2.4 Quadratic Forms
2.5 Calculus of Optimization
3 Elementary Statistics: A Review
3.1 Random Variables
3.2 Estimation
3.3 Properties of the Estimators
3.4 Probability Distributions
3.5 Hypothesis Testing, P-Values and Confidence Intervals
3.6 Descriptive Statistics
4 Introduction to the Regression Model
4.1 Curve Fitting
4.2 Derivation of Least Squares
5 The Two-Variable Regression Model (Simple Linear Regression)
5.1 The model
5.2 The Best Linear Unbiased Estimation
5.3 Hypothesis Testing and Confidence Intervals
5.4 Analysis of Variance: Goodness of Fit
5.5 Examples of Commands in Stata
6 Multiple Linear Regression
6.1 The Model
6.2 In Matrix Notation
6.3 Properties of OLS Estimators / Gauss-Markov Theorem
6.4 OLS Estimation in Matrix Notation
6.5 Unbiased Estimation
6.6 Variance of the Least Squares
6.7 The Distribution of OLS Estimators / Regression Statistics
6.8 F-tests, R-Square and Corrected R-Square
7 Using the Multiple Regression Model
7.1 The general linear model
7.2 Use of Dummy Variables
7.3 Other cases
7.4 Interpretation of some examples of transformations
Part II: Single Equation Regression Models
8 Multicollinearity
8.1 Definition
8.2 Perfect Collinearity as an Identification Problem
8.3 High But Imperfect Collinearity
8.4 Consequences of Multicollinearity
8.5 Detection of Multicollinearity
8.6 Remedial Measures
8.7 Concluding Remarks
9 Heteroskedasticity
9.1 Definition of Heteroskedasticity
9.2 Forms of Heteroskedasticity
9.3 Causes of Heteroskedasticity
9.4 Consequences of Heteroskedasticity
9.5 Detection of Heteroskedasticity
9.6 Tests for Heteroskedasticity
9.7 What to Do If You Find Evidence of Heteroskedasticity
9.8 Conclusions about Heteroskedasticity
9.9 Some Stata Commands
10 Serial Correlation (Autocorrelation)
10.1 Definition of Autocorrelation
10.2 Autocorrelation vs. Heteroskedasticity
10.3 Causes of Autocorrelation
10.4 Types of Autocorrelation
10.5 Detection of Autocorrelation
10.6 Correcting for Autocorrelation
10.7 Cochrane-Orcutt Iterative Procedure
10.8 Concluding Remarks on Autocorrelation
11 Forecasting with a Single Equation Regression Model
11.1 Definition
11.2 Point and Interval Forecasts
11.3 Ex-post and Ex-ante forecasts
11.4 Conditional vs. Unconditional forecasts
11.5 The forecast error
11.6 Forecast Evaluation
12 Elements of Model Specification (Gujarati Chapter 13)
12.1 Basic Facts about Model Specification Analysis
12.2 Types of Specification Errors and Their Consequences
12.3 Specification Testing
12.4 Box-Cox Transformation and the Box-Cox Test (graduate)
12.5 Non-nested hypothesis tests
12.6 Alternative Philosophies
12.7 Conclusions
12.8 Model Specification and Evaluation Analysis (To revisit during TS analysis)
12.9 Take Note of Spurious Regression (To revisit during TS analysis--graduate)
13 Maximum Likelihood Estimation Method
13.1 Maximum Likelihood Estimation
14 Models of Qualitative Choice
14.1 Discrete Dependent Variable Models
14.2 The Basic Model
14.3 A Generic Model
14.4 Linear Probability Model
14.5 Logit Model
14.6 Probit Model
14.7 Estimating the Logit and Probit Models
14.8 Using STATA
14.9 Logit Versus Probit
14.10 Interpretation of Results
14.11 Tobit Models
Part I: The Basics of Regression Analysis
1 Introduction to Econometrics
1.1 Definition
Econo = Economics
Metrics = Measurement
Econometrics: the quantitative measurement of economic concepts and relationships
Examples
Analysing the changes in quantities produced and consumed due to changes in prices and
income (price and income elasticities of demand and supply) for agricultural commodities.
Econometrics has spread into other areas such as social sciences, health, engineering,
psychology, geography, etc.
1.2 Some History
Early work was based on a "classical" approach which grew out of experimental statistics.
1.2.1 Highlights of Classical Approach
1. State theory or hypothesis
2. Specify mathematical model of the theory
3. Specify econometric model
4. Obtain data via experiment
5. Estimate parameters
6. Test hypotheses
7. Forecast or predict
8. Draw conclusions (control or policy purposes)
Works well where experimental data can be obtained (e.g., in studying the effects of
fertilizer application on maize yield).
Does not work well in some cases (e.g., the effects of changes in money supply on
inflation), where controlled experiments are not possible.
1.4 Types of models
1. Time-Series Models
2. Single-Equation Models
3. Multi-Equation Models
1.7 Uses of Econometrics
1. Structural analysis: What is the price elasticity of export demand for Uganda grains?
2. Forecasting: What will be the value of coffee exports from Uganda in 2020?
3. Policy Analysis (Conditional Forecasting): What will be the value of coffee exports
in 2020 from Uganda following the introduction of a tax on agricultural inputs in the
2015/2016 budget?
4. Other examples
Is the generic advertising of mosquito nets effective in increasing usage and controlling
malaria? (What is "effective"? How do you test effectiveness?)
1.8 Data Generating Mechanisms
There must be some mechanism that generates the economic data that we observe. We may
not know much about the process, or what makes it work, but there will be a process.
When we specify an econometric model, we are hypothesizing what the DGM looks like
(but we will never know for sure).
When we engage in specification testing, we are testing whether our hypothesized DGM fits
the data or not.
Two Phases
1. We begin by assuming we have the correct DGM (model specification), and all we
have to do is use the data to estimate the parameters and test hypotheses
2. We then test whether the hypothesized DGM is consistent with the data
(specification testing)
1.9 An example
a. Research Question:
To what extent does maize price, sex of farmer, land under maize production, and transport
costs influence maize sales in Uganda?
b. Model specification:
$\ln Q_i = \beta_0 + \beta_1 \ln P_i + \beta_2 S_i + \beta_3 \ln L_i + \beta_4 T_i + \varepsilon_i$
Where
$\ln$ = natural logarithm.
$Q_i$ = quantity of maize sold in kilograms by the $i$th household.
$P_i$ = price of maize in shillings per kilogram for the $i$th household.
$S_i$ = sex of the household head of the $i$th household (Female = 1, Male = 0).
$L_i$ = land under maize production by the $i$th household.
$T_i$ = transport cost per 100-kilogram bag of maize from home to the market of the
$i$th household.
c. Data:
d. Estimation Results
Standard errors are reported under the coefficient estimates. The regression F-statistic, the
model $R^2$, the error sum of squares (ESS) and the sample size are also reported.
e. Model Evaluation
“Goodness of fit”
Other specification tests to determine the "best" model if we had more than one model
f. Interpretation of Results and answer the research question
a) Interpret the intercept and the slope parameter estimate on the sex variable.
b) Interpret the slope parameters on price, land and transport, and state whether you
think the estimated signs of the parameter estimates for the price, land, and transport
variables conform to what we would expect from the economics underlying the model.
g. Inference
c) Calculate and interpret a 95% confidence interval on the slope parameter associated
with the price variable.
d) Using a 1% significance level, formally test the null hypothesis that the slope
parameter on transport is 0.
e) Predict the quantity of maize in kilograms that a female farmer would sell if she plants
2 acres of maize, sells at 600 shillings per kilogram, and incurs a transport cost of 10,000
shillings per 100 kg bag.
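A minimal Stata sketch of how steps (d) through (g) might be carried out, assuming
hypothetical variable names lnQ, lnP, sex, lnL and transport (the notes do not show the
dataset):
* OLS estimates with standard errors, F-statistic and R-squared
regress lnQ lnP sex lnL transport
* (d) Wald test of H0: slope on transport = 0 (compare the p-value with 0.01)
test transport = 0
* (e) point prediction for a female farmer: 2 acres, 600 sh/kg, 10,000 sh per bag
* (naive retransformation from logs; ignores the log-normal correction)
display exp(_b[_cons] + _b[lnP]*ln(600) + _b[sex]*1 + _b[lnL]*ln(2) + _b[transport]*10000)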
2 Basic Mathematical Concepts
2.1 Vectors and Matrices
If $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$, then the submatrix $A(1,2)$ of $A$ is $\begin{pmatrix} 4 & 6 \\ 7 & 9 \end{pmatrix}$,
which was obtained by deleting the first row and the second column.
Types of Matrices
Triangular Matrix is a square matrix that has all zeros either above or below the main
diagonal.
Upper triangular = $\begin{pmatrix} 1 & 4 & 6 \\ 0 & 2 & 8 \\ 0 & 0 & 3 \end{pmatrix}$ or Lower triangular = $\begin{pmatrix} 1 & 0 & 0 \\ 4 & 2 & 0 \\ 6 & 8 & 3 \end{pmatrix}$.
Null Matrix is one with all elements as zero and is denoted by O .
Equal Matrices must be of the same order and corresponding elements must be equal. We
write A = B.
2.2 Matrix Operations
2.2.1 Addition and Subtraction:
$C = A + B = [a_{ik} + b_{ik}]$
$A - B = [a_{ik} - b_{ik}]$
A row vector of order $n$ multiplied by a column vector of order $n$ yields
$u'u = \sum_i u_i^2$, a scalar.
c. A column vector of order $n$ multiplied by a row vector of order $n$ will yield an $n \times n$ matrix.
$uu' = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n-1} \\ u_n \end{pmatrix} \begin{pmatrix} u_1 & u_2 & \cdots & u_{n-1} & u_n \end{pmatrix} = \begin{pmatrix} u_1^2 & u_1 u_2 & u_1 u_3 & \cdots & u_1 u_n \\ u_2 u_1 & u_2^2 & u_2 u_3 & \cdots & u_2 u_n \\ \vdots & \vdots & \vdots & & \vdots \\ u_n u_1 & u_n u_2 & u_n u_3 & \cdots & u_n^2 \end{pmatrix}$, which is symmetric.
2.3 Matrix Inversion
An inverse of a square matrix $A$, denoted by $A^{-1}$, if it exists, is a unique square matrix such
that
$AA^{-1} = A^{-1}A = I$.
Properties
a. $(AB)^{-1} = B^{-1}A^{-1}$.
b. $(A^{-1})^{-1} = A$.
Determinants
To every square matrix A, there corresponds a scalar known as a determinant denoted by
det(A) or A .
For a $2 \times 2$ matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ the determinant is
$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$.
Properties of Determinants
a. A singular matrix is one whose determinant is zero; a nonsingular matrix is one whose
determinant is nonzero. So the inverse of a singular matrix does not exist.
b. If all elements of any row of a matrix are zeros, its determinant is zero.
c. $|A'| = |A|$.
d. If any row (or column) is a multiple (linear combination) of other rows (or columns),
the determinant of the matrix is zero. This is helpful in demonstrating multicollinearity of
variables.
e. $|AB| = |A|\,|B|$.
Rank of a Matrix is the order of the largest square submatrix whose determinant is not zero.
As noted earlier, the inverse of a singular matrix does not exist. Therefore, for an $n \times n$ matrix
$A$, its rank must be $n$ for its inverse to exist; if the rank is less than $n$, $A$ is singular. This is an
important condition for valid estimation.
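As an illustration (not from the notes), Stata's matrix commands can verify property (d)
and the rank condition; the matrices below are made up:
* the second row is twice the first, so A is singular (property d)
matrix A = (1, 2 \ 2, 4)
display det(A)        // 0: rank < 2, no inverse exists
matrix B = (1, 2 \ 3, 4)
display det(B)        // -2: nonsingular, so an inverse exists
matrix Binv = inv(B)
matrix list Binv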
Minor
If the $i$th row and $j$th column of an $n \times n$ matrix $A$ are deleted, the determinant of the
resulting submatrix is called the minor of the element $a_{ij}$ and is denoted by $M_{ij}$.
Cofactor
The cofactor of element $a_{ij}$ of an $n \times n$ matrix $A$, denoted by $c_{ij}$, is defined as
$c_{ij} = (-1)^{i+j} M_{ij}$.
Cofactor Matrix
Replacing each element $a_{ij}$ of a matrix $A$ by its cofactor $c_{ij}$, we obtain the cofactor matrix,
$\mathrm{cof}A$.
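A small worked example (added here for illustration; it uses the standard result, presumably
developed later in the notes, that $A^{-1} = (\mathrm{cof}A)'/|A|$):
$B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$: the minors are $M_{11} = 4$, $M_{12} = 3$, $M_{21} = 2$, $M_{22} = 1$, so
$\mathrm{cof}B = \begin{pmatrix} 4 & -3 \\ -2 & 1 \end{pmatrix}$, $|B| = (1)(4) - (2)(3) = -2$, and
$B^{-1} = \frac{1}{|B|}(\mathrm{cof}B)' = -\frac{1}{2}\begin{pmatrix} 4 & -2 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} -2 & 1 \\ 1.5 & -0.5 \end{pmatrix}$.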
2.4 Quadratic Forms
$Q(x) = x'Ax = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \sum_i \sum_j a_{ij} x_i x_j$
2.5 Calculus of Optimization
3 Elementary Statistics: A Review
3.1 Random Variables
Definition: A probability density function (PDF) assigns to every possible outcome of a
continuous RV a value such that the probability of any interval is the area under the PDF
over that interval.
Expectation is a Linear Operator
Result 1: $E(a + bY) = a + bE(Y)$
Result 2: $E(a + bY_1 + cY_2) = a + bE(Y_1) + cE(Y_2)$
But expectation does not pass through nonlinear functions:
$E(1/Y) \neq 1/E(Y)$
$E(\ln Y) \neq \ln E(Y)$
Variance
The variance of a RV, Y, is a measure of “dispersion” around the mean and is defined as:
$\mathrm{Var}(Y) = \sigma_Y^2 = E[(Y - \mu_Y)^2]$
Two RVs, Y1 and Y2, are said to have a joint distribution if knowledge of the outcome from
one of them influences the probabilities that the other one will take particular values.
Example: Maize and Beans yields in Busoga next season are related. If you know what the
maize yield will be, that helps you predict the bean yield. Hence, these RVs are jointly
distributed.
Independence
If knowing one RV value does not provide ANY information for predicting the other:
$f(Y_2 \mid Y_1 = y) = f(Y_2)$ for all $y$.
Covariance
Covariance is a measure of the tendency for two jointly distributed RVs to “move together”
$\mathrm{Cov}(Y_1, Y_2) = \sigma_{12} = E[(Y_1 - \mu_1)(Y_2 - \mu_2)] = E(Y_1 Y_2) - \mu_1 \mu_2$
Note: Independence implies covariance is zero but zero covariance does not imply
independence.
$\mathrm{Var}(a + bY) = b^2\,\mathrm{Var}(Y)$
Assignment: Derive
$\mathrm{Var}(Y_1 + Y_2)$
$\mathrm{Var}(Y_1 - Y_2)$
Correlation
Correlation is related to covariance and both are measures of the tendency of two RVs to
move together.
But: Covariance depends on the units of measurement of the RVs while correlation is
independent of the units of measurement.
3.2 Estimation
Suppose we make one draw from Y (e.g., we measure the yield for one maize plot in one
season). This is one data point on Y.
Now run the same experiment N times, providing a sample of N data points (e.g., we
measure yield from 12 different plots, all of the same plot type).
We want to do statistical inference (learn about the properties of the underlying random
variable Y from observing the data sample $\{y_1, y_2, \ldots, y_N\}$).
$S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{Y})^2$ because $E(S^2) = \sigma^2$ (the $N-1$ divisor makes it unbiased).
The estimate of the variance is:
$s^2 = v(y) = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$
The estimate of the covariance of y1 and y 2 is the expectation of the product of y1 and y 2
when both are measured as deviations from their means.
$\mathrm{Cov}(y_1, y_2) = E[(y_1 - \bar{Y}_1)(y_2 - \bar{Y}_2)] = \frac{1}{N} \sum_{i=1}^{N} (y_{1i} - \bar{Y}_1)(y_{2i} - \bar{Y}_2)$
The estimate of the covariance is
$\widehat{\mathrm{Cov}}(y_1, y_2) = \frac{1}{n-1} \sum_{i=1}^{n} (y_{1i} - \bar{y}_1)(y_{2i} - \bar{y}_2)$
The correlation coefficient between variables $y_1$ and $y_2$ is given as
$\rho(y_1, y_2) = \frac{\mathrm{Cov}(y_1, y_2)}{\sqrt{V(y_1)\,V(y_2)}}$
The sample correlation coefficient is:
$r(y_1, y_2) = \frac{\widehat{\mathrm{Cov}}(y_1, y_2)}{\sqrt{v(y_1)\,v(y_2)}} = \frac{\sum_{i=1}^{n} (y_{1i} - \bar{y}_1)(y_{2i} - \bar{y}_2)}{\sqrt{\sum_{i=1}^{n} (y_{1i} - \bar{y}_1)^2 \sum_{i=1}^{n} (y_{2i} - \bar{y}_2)^2}}$
$r = 0$ indicates no relationship (zero covariance)
$r = -1$ indicates a complete negative relationship
$r = +1$ indicates a complete positive relationship
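A quick Stata sketch of these sample moments, using simulated data with assumed
parameters:
* illustrative simulated data for the sample moments above
clear
set obs 100
set seed 101
gen y1 = rnormal(0, 1)
gen y2 = 0.5*y1 + rnormal(0, 1)
summarize y1 y2                 // sample means and variances (n-1 divisor)
correlate y1 y2, covariance     // sample covariance matrix
correlate y1 y2                 // sample correlation coefficient r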
3.3 Properties of the Estimators
3.3.2 Efficiency
An estimator is efficient if no other linear unbiased estimator has a smaller variance:
$V(\bar{Y}) < V(Y^*)$ for any other such estimator $Y^*$.
3.3.3 Consistency
$\mathrm{plim}(\bar{Y}) = \mu_Y$, that is,
$\lim_{N \to \infty} \mathrm{prob}\left(|\bar{Y} - \mu_Y| < \varepsilon\right) = 1$
3.4 Probability Distributions
The standard normal PDF is
$f(z) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{1}{2} z^2\right)$
It is denoted $Z \sim N(0,1)$, where $Z = \frac{Y - \mu_Y}{\sigma_Y}$.
A general normal RV, $Y$, is continuous over $(-\infty, \infty)$ and can be written as:
$Y = \mu_Y + \sigma_Y Z$
where $\mu_Y = E(Y)$, $\sigma_Y^2 = \mathrm{var}(Y)$, and $Z \sim N(0,1)$.
It is denoted $Y \sim N(\mu_Y, \sigma_Y^2)$.
3.5 Hypothesis Testing, P-Values and Confidence Intervals
3.5.2 P-Values
A p-value is the minimum significance level at which a null hypothesis can be rejected.
P-values are sometimes called the exact level of significance of a test statistic.
E.g., if the p-value = 0.04, the null can be rejected at the 5% level but not at the 1%
level.
For a sample mean, the test statistic is
$t = \frac{\bar{Y} - \mu_0}{s/\sqrt{N}} \sim t(N-1)$
Therefore, for 20 df:
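(The tabled values for 20 df are not reproduced here; as an illustration, Stata's
t-distribution functions return critical values and p-values directly, with the observed
statistic below assumed:)
* two-tailed 5% critical value for 20 df (about 2.086)
display invttail(20, 0.025)
* p-value for an assumed observed t statistic of 2.5 with 20 df
display 2*ttail(20, 2.5)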
3.5.4 The power of a test
Power is the probability of rejecting the null hypothesis when it is in fact false.
$\mathrm{power} = P(\mathrm{reject}\ H_0 \mid H_1\ \mathrm{is\ true})$
It is 1 minus the probability there will be a Type II error. I.e., it is 1 minus the probability one
will accept the Null Hypothesis as true when it is in fact false.
4 Introduction to the Regression Model
Pooled data: combines time-series and cross-section data, e.g., a study of firms or
households over time.
4.1 Curve Fitting
[Figure: scatter of data points with the fitted least-squares line; negative deviations lie
below the line, positive deviations above it, and fitted values lie on the line.]
Least-squares line: $\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i$
Deviation: $\hat{\varepsilon}_i = y_i - \hat{y}_i$
The ordinary least squares (OLS) criterion for picking estimators in the simple linear
regression model is to choose the $\hat{\alpha}, \hat{\beta}$ that minimise the sum of the squared residuals:
$\min_{\hat{\alpha}, \hat{\beta}} \sum_{i=1}^{N} \hat{\varepsilon}_i^2 = \sum_{i=1}^{N} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2$
The first-order conditions for this quadratic minimisation problem are:
$-2 \sum_{i=1}^{N} (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0$
$-2 \sum_{i=1}^{N} x_i (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0$
These are the normal equations.
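Solving the two normal equations simultaneously gives the familiar closed-form OLS
estimators:
$\hat{\beta} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$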
5 The Two-Variable Regression Model (Simple Linear Regression)
Then the OLS estimators $\hat{\alpha}, \hat{\beta}$ are the Best Linear Unbiased Estimators (BLUE).
1. "Linear" means that the OLS estimators are linear functions of the observations on
the dependent variable.
2. "Unbiased" means that $E(\hat{\alpha}) = \alpha$ and $E(\hat{\beta}) = \beta$.
3. "Best" means that of all the estimators that are linear and unbiased, the OLS estimators
have the smallest variance (are most efficient).
5.2.3 Conditional Mean Interpretation of Regression
Notice that if we assume $E(\varepsilon_i) = 0$ for all $i$ (a very unrestrictive assumption; why?) then:
$E(y_i \mid x_i) = \alpha + \beta x_i$ is the conditional mean of $y$ at $i$ (conditional on observing $x_i$).
Definition: An estimator $\hat{\alpha}, \hat{\beta}$ of $(\alpha, \beta)$ is any pair of values that depend on the data and
satisfy $y_i = \hat{\alpha} + \hat{\beta} x_i + \hat{\varepsilon}_i$ for some set of residuals $\hat{\varepsilon}_i$.
Note that $\varepsilon_i$ is a population quantity that you don't see (the disturbance), while $\hat{\varepsilon}_i$ is a
sample quantity that is observable (the residual).
Note: Clearly there are many, many such potential estimators $\hat{\alpha}, \hat{\beta}$. How do we make sure
we are picking "good" ones?
$Y_i = \alpha + \beta X_i + \varepsilon_i$
where $\varepsilon_i \sim i.i.d.\,(0, \sigma^2)$
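A small simulation sketch of this data generating process in Stata, with assumed parameter
values ($\alpha = 1$, $\beta = 0.5$, $\sigma = 2$):
* simulate Y_i = 1 + 0.5*X_i + e_i with e_i ~ iid N(0, 4)
clear
set obs 200
set seed 12345
gen X = runiform()*10
gen e = rnormal(0, 2)
gen Y = 1 + 0.5*X + e
regress Y X    // estimates should lie close to the assumed 1 and 0.5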
Under the classical assumptions, the OLS estimators are normally distributed:
$\hat{\beta} \sim N\!\left(\beta,\ \frac{\sigma^2}{\sum_{i=1}^{N} (X_i - \bar{X})^2}\right)$
$\hat{\alpha} \sim N\!\left(\alpha,\ \frac{\sigma^2 \sum X_i^2}{N \sum_{i=1}^{N} (X_i - \bar{X})^2}\right)$ (check)
$\mathrm{Cov}(\hat{\alpha}, \hat{\beta}) = \frac{-\sigma^2 \bar{X}}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$ (cross check)
$s^2 = \hat{\sigma}^2 = \frac{\sum \hat{\varepsilon}_i^2}{N - 2}$
$s$ = standard error of the regression; $\sum \hat{\varepsilon}_i^2$ = residual sum of squares.
Proof:
Let $k_i = \frac{X_i - \bar{X}}{\sum_{i=1}^{N} (X_i - \bar{X})^2} = \frac{x_i}{\sum_{i=1}^{N} x_i^2}$, where $x_i = X_i - \bar{X}$.
$Z = \frac{\hat{\beta} - \beta}{\sigma / \sqrt{\sum_i (X_i - \bar{X})^2}} \sim N(0,1)$ and $\hat{t} = \frac{\hat{\beta} - \beta}{s / \sqrt{\sum_i (X_i - \bar{X})^2}} \sim t(N-2)$
6) Reject if the p-value is less than the chosen significance level, otherwise fail to reject
We know that:
$\hat{t} = \frac{\hat{\beta} - \beta_0}{se(\hat{\beta})} \sim t(N-2)$, where $se(\hat{\beta}) = \frac{s}{\sqrt{\sum_i (X_i - \bar{X})^2}}$
2. Decompose the total sum of squares (TSS) in y into two components: the “Regression
sum of squares” (RSS) explained by the regression line; and the “Error sum of Squares”
(ESS) that is not explained by the regression line:
$\sum_{i=1}^{N} (y_i - \bar{y})^2 = \sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, where $\sum_{i=1}^{N} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{N} \hat{\varepsilon}_i^2$
TSS = RSS + ESS
$R^2 = \frac{\mathrm{RSS}}{\mathrm{TSS}} = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} = 1 - \frac{\sum_{i=1}^{N} \hat{\varepsilon}_i^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$
Note:
$R^2$ lies between 0 and 1.
Values "close to" 1 indicate a good fit, but:
A low (high) $R^2$ does not necessarily indicate the model is "bad" ("good"). Why not?
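Continuing the simulated example sketched earlier (assumed data), the decomposition can
be verified from Stata's stored results after regress:
* e(mss) = regression (model) SS, e(rss) = error SS, stored after -regress-
quietly regress Y X
display "TSS = " (e(mss) + e(rss))
display "R-squared = " (e(mss)/(e(mss) + e(rss))) "  (same as e(r2) = " e(r2) ")"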