
Econometric Methods

Notes

Compiled By

Andrew Muganga Kizito

February 2019

Contents

Part I: The Basics of Regression Analysis
1 Introduction to Econometrics
1.1 Definition
1.2 Some History
1.3 The Econometric Approach
1.4 Types of Models
1.5 Structure of Economic Data
1.6 Other Important Concepts
1.7 Uses of Econometrics
1.8 Data Generating Mechanisms
1.9 An Example
1.10 Assignment: An Econometric Model Example
1.11 Key Terms
2 Basic Mathematical Concepts
2.1 Vectors and Matrices
2.2 Matrix Operations
2.3 Matrix Inversion
2.4 Quadratic Forms
2.5 Calculus of Optimization
3 Elementary Statistics: A Review
3.1 Random Variables
3.2 Estimation
3.3 Properties of the Estimators
3.4 Probability Distributions
3.5 Hypothesis Testing, P-Values and Confidence Intervals
3.6 Descriptive Statistics
4 Introduction to the Regression Model
4.1 Curve Fitting
4.2 Derivation of Least Squares
5 The Two-Variable Regression Model (Simple Linear Regression)
5.1 The Model
5.2 The Best Linear Unbiased Estimation
5.3 Hypothesis Testing and Confidence Intervals
5.4 Analysis of Variance: Goodness of Fit
5.5 Examples of Commands in Stata
6 Multiple Linear Regression
6.1 The Model
6.2 In Matrix Notation
6.3 Properties of OLS Estimators / Gauss-Markov Theorem
6.4 OLS Estimation in Matrix Notation
6.5 Unbiased Estimation
6.6 Variance of the Least Squares
6.7 The Distribution of OLS Estimators / Regression Statistics
6.8 F-tests, R-Square and Corrected R-Square
7 Using the Multiple Regression Model
7.1 The General Linear Model
7.2 Use of Dummy Variables
7.3 Other Cases
7.4 Interpretation of Some Examples of Transformations
Part 2: Single Equation Regression Models
8 Multicollinearity
8.1 Definition
8.2 Perfect Collinearity as an Identification Problem
8.3 High but Imperfect Collinearity
8.4 Consequences of Multicollinearity
8.5 Detection of Multicollinearity
8.6 Remedial Measures
8.7 Concluding Remarks
9 Heteroskedasticity
9.1 Definition of Heteroskedasticity
9.2 Forms of Heteroskedasticity
9.3 Causes of Heteroskedasticity
9.4 Consequences of Heteroskedasticity
9.5 Detection of Heteroskedasticity
9.6 Tests for Heteroskedasticity
9.7 What to Do If You Find Evidence of Heteroskedasticity
9.8 Conclusions about Heteroskedasticity
9.9 Some Stata Commands
10 Serial Correlation (Autocorrelation)
10.1 Definition of Autocorrelation
10.2 Autocorrelation vs. Heteroskedasticity
10.3 Causes of Autocorrelation
10.4 Types of Autocorrelation
10.5 Detection of Autocorrelation
10.6 Correcting for Autocorrelation
10.7 Cochrane-Orcutt Iterative Procedure
10.8 Concluding Remarks on Autocorrelation
11 Forecasting with a Single Equation Regression Model
11.1 Definition
11.2 Point and Interval Forecasts
11.3 Ex-post and Ex-ante Forecasts
11.4 Conditional vs. Unconditional Forecasts
11.5 The Forecast Error
11.6 Forecast Evaluation
12 Elements of Model Specification (Gujarati, Chapter 13)
12.1 Basic Facts about Model Specification Analysis
12.2 Types of Specification Errors and Their Consequences
12.3 Specification Testing
12.4 Box-Cox Transformation and the Box-Cox Test (graduate)
12.5 Non-nested Hypothesis Tests
12.6 Alternative Philosophies
12.7 Conclusions
12.8 Model Specification and Evaluation Analysis (to revisit during TS analysis)
12.9 Take Note of Spurious Regression (to revisit during TS analysis; graduate)
13 Maximum Likelihood Estimation Method
13.1 Maximum Likelihood Estimation
14 Models of Qualitative Choice
14.1 Discrete Dependent Variable Models
14.2 The Basic Model
14.3 A Generic Model
14.4 Linear Probability Model
14.5 Logit Model
14.6 Probit Model
14.7 Estimating the Logit and Probit Models
14.8 Using STATA
14.9 Logit versus Probit
14.10 Interpretation of Results
14.11 Tobit Models
Part I: The Basics of Regression Analysis

1 Introduction to Econometrics

1.1 Definition

Econo = Economics
Metrics = Measurement
Econometrics: the quantitative measurement of economic concepts and relationships

Econometrics is the application of statistical and mathematical methods to analyse economic data.

Examples

Analysing the changes in quantities produced and consumed due to changes in prices and
income (price and income elasticities of demand and supply) for agricultural commodities.

Impact of policy interventions


- Health: Distribution of mosquito nets on malaria incidence
- Education: Training teachers on student completion rates or grades
- Infrastructure: Opening or tarmacking new roads on agricultural production
- Monetary and Fiscal Policy: Changes in interest rates on inflation

Econometrics has spread into other areas such as the social sciences, health, engineering,
psychology, geography, etc.

1.2 Some History

- Developed as a result of the application of statistics, mathematics, and economics to economic data that was becoming readily available in the early 1900s
- Early work was based on a "classical" approach which grew out of experimental statistics

1.2.1 Highlights of Classical Approach
1. State theory or hypothesis
2. Specify mathematical model of the theory
3. Specify econometric model
4. Obtain data via experiment
5. Estimate parameters
6. Test hypothesis
7. Forecast or predict
8. Draw conclusions (control or policy purposes)

Works well where experimental data can be obtained (e.g., in studying effects of fertilizer
application on maize yield)

Does not work well in other cases (e.g., studying the effects of changes in the money supply on inflation)

1.2.2 Highlights of the modern Approach


1. Recognize that economics is not an experimental science
2. Instead of controlling for background factors, include them in the model and measure
them
3. Treat all variables as random
4. Models are often incomplete and data is “noisy” and “messy”
5. Draw conclusions or make predictions based on incomplete and evolving theories

1.3 The Econometric approach


1. Identify the question to be answered
2. Specify a tentative mathematical / statistical model (use economic theory or any other
available information)
3. Collect data
4. Choose an appropriate estimator and estimate the parameters of the model
5. Evaluate the estimated model and return to the specification stage if necessary
6. Evaluate the final model specification (hypothesis and specification tests)
7. Answer the questions you started out with. This may involve a forecast, a hypothesis
test, or an economic interpretation of the results.

1.4 Types of models
1. Time-Series Models
2. Single-Equation Models
3. Multi-Equation Models

1.5 Structure of economic data


1. Time series data
a. Observe variable(s) over a long period
2. Cross-sectional data
a. A sample of units (individuals, households, holdings) at a given point in time
3. Pooled Cross sections
a. Combines both cross-sectional and time series features
b. Units may be different
4. Panel or longitudinal data
a. Time series for each cross sectional member

1.6 Other Important Concepts


1. Causality
a. Statistical relations do not establish causal connection.
b. Causation comes from some theory
c. One variable has causal effect on another (rain and yield)
d. Association does not necessarily imply causality
2. The notion of Ceteris Paribus
a. Other relevant factors remaining constant
b. E.g., change in prices on quantity demanded holding other factors constant
3. If other factors are not kept constant, causal effects cannot be established

1.7 Uses of Econometrics
1. Structural analysis: What is the price elasticity of export demand for Uganda grains?
2. Forecasting: What will be the value of coffee exports from Uganda in 2020?
3. Policy Analysis (Conditional Forecasting): What will be the value of coffee exports
in 2020 from Uganda following the introduction of a tax on agricultural inputs in
2015/2016 budget?
4. Other examples
Is the generic advertising of mosquito nets effective in increasing usage and controlling malaria?
(What is "effective"? How do you test for effectiveness?)
1.8 Data Generating Mechanisms

The DGM is an important concept in econometric model building and inference.

The DGM is an abstract concept, not a tangible, known process.

There must be some mechanism that generates the economic data that we observe. We may
not know much about the process, or what makes it work, but there will be a process

When we specify an econometric model, we are hypothesizing what the DGM looks like
(but we will never know for sure)

When we engage in specification testing, we are testing whether our hypothesized DGM fits
the data or not.

Two Phases

We approach this course in two phases

1. We begin by assuming we have the correct DGM (model specification), and all we have to do is use the data to estimate the parameters and test hypotheses.

2. Then we move on to the more difficult question of how we determine whether our model specification is consistent with the true DGM, and what effect it might have on our estimation and hypothesis testing if our model specification is incorrect.

1.9 An example
a. Research Question:

To what extent does maize price, sex of farmer, land under maize production, and transport
costs influence maize sales in Uganda?

b. Model specification:
\[ \ln Q_i = \beta_0 + \beta_1 \ln P_i + \beta_2 S_i + \beta_3 \ln L_i + \beta_4 \ln T_i + \varepsilon_i \]

Where
$\ln$ = natural logarithm,
$Q_i$ = quantity of maize sold in kilograms by the $i$-th household,
$P_i$ = price of maize in shillings per kilogram for the $i$-th household,
$S_i$ = sex of the household head of the $i$-th household (Female = 1, Male = 0),
$L_i$ = land under maize production by the $i$-th household,
$T_i$ = transport cost per 100-kilogram bag of maize from home to the market for the

c. Data:

Collected from a random sample of households in Uganda in 2013

d. Estimation Results

\[ \widehat{\ln Q_i} = 6.965 + 0.407\,\ln P_i + 0.609\,S_i + 0.201\,\ln L_i + 0.430\,\ln T_i \]
\[ \phantom{\widehat{\ln Q_i} = {}} (1.387)\quad (0.1876)\quad\;\; (0.128)\quad\;\; (0.0698)\quad\;\; (0.0813) \]
\[ F = 16.21 \qquad R^2 = 0.1618 \qquad ESS = 456.62 \qquad T = 341 \]

Standard errors are reported in parentheses under the coefficient estimates. The regression F-statistic, the
model $R^2$, the error sum of squares (ESS), and the sample size ($T$) are also reported.
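For reference, a minimal Stata sketch that would produce output of this kind, assuming hypothetical variable names q, p, sex, land, and tcost (not from the original survey):

* a sketch only; variable names are assumed
generate lnq = ln(q)          // log of quantity sold (kg)
generate lnp = ln(p)          // log of price (shillings/kg)
generate lnland = ln(land)    // log of land under maize
generate lntcost = ln(tcost)  // log of transport cost
regress lnq lnp sex lnland lntcost   // OLS: reports coefficients, SEs, F, R-squared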

e. Model Evaluation

“Goodness of fit”

Other specification tests to determine the "best" model if we had more than one candidate model

f. Interpretation of Results and answer the research question

a) Interpret the intercept and the slope parameter estimate on the sex variable.

b) Interpret the slope parameters on price, land and transport, and state whether you
think the estimated signs for the parameters estimates for price, land, and transport variables
conform to what we would expect from the economics underlying the model.

g. Inference

c) Calculate and interpret a 95% confidence interval on the slope parameter associated
with price variable.

d) Using a 1% significance level, formally test the null hypothesis that the slope
parameter on the transport is 0.

e) Predict the quantity of maize in kilograms that a female farmer would sell if she plants
2 acres of maize, sells at 600 shillings per kilogram, and incurs a transport cost of 10,000
shillings per 100 kg bag.

1.10 Assignment: An Econometric Model Example


Use font size 12, Times New Roman, with margins of 1 inch on the top, bottom, left, and right.
On two pages, give an example of the following.
a. Research question
b. Specify the model
c. State the data you will use
d. What results will you estimate?
e. How will you evaluate the model?
f. How will you interpret the results and answer your research question in (a)?
g. What inference will you make?

Hand in on February 25, 2019.

1.11 Key terms


Causal Effect
Ceteris Paribus
Cross-Sectional Data Set
Econometric Model
Economic Model
Empirical Analysis
Experimental Data
Non-experimental Data
Observational Data
Panel Data
Pooled Cross Section
Time Series Data
2 Basic Mathematical Concepts

2.1 Vectors and Matrices


A matrix $A$ of order (or dimension) $M \times N$ ($M$ rows and $N$ columns) is given by
\[ A = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{M1} & a_{M2} & \cdots & a_{MN} \end{bmatrix} \]
Scalar: a single (real) number, or a matrix of order $1 \times 1$.
Column Vector: a matrix of $M$ rows and 1 column, e.g.,
\[ x_{5 \times 1} = \begin{bmatrix} 3 \\ 1 \\ 5 \\ 0 \\ 2 \end{bmatrix} \]

Row Vector: a matrix of 1 row and $N$ columns, e.g.,
\[ x_{1 \times 4} = \begin{bmatrix} 1 & 1 & 4 & 9 \end{bmatrix}. \]
Transposition: the transpose of a matrix $A$, denoted by $A'$, is obtained by interchanging the
rows and the columns. For instance:
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \;\Rightarrow\; A' = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} \]
Similarly,
\[ x = \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} \quad\text{and}\quad x' = \begin{bmatrix} 1 & 2 & 5 \end{bmatrix}. \]
Commonly the convention is to indicate row vectors by primes, as in $x' = [1\;\; 2\;\; 5]$.
Sub-matrix
A matrix resulting from the deletion of the $i$-th row and $j$-th column. If
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \]
then the submatrix $A(1,2)$ of $A$ is
\[ \begin{bmatrix} 4 & 6 \\ 7 & 9 \end{bmatrix}, \]
which was obtained by deleting the first row and the second column.

Types of Matrices

Square Matrix: has the same number of rows as columns, e.g.,
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \quad\text{or}\quad B = \begin{bmatrix} 1 & 3 \\ 0 & 11 \end{bmatrix}. \]
Diagonal Matrix: a square matrix with at least one nonzero element on the main diagonal and zeros elsewhere, e.g.,
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix} \]
Scalar Matrix: a diagonal matrix whose diagonal elements are all equal (e.g., the variance-covariance matrix of homoskedastic, uncorrelated errors):
\[ \operatorname{var\text{-}cov}(u) = \Sigma = \begin{bmatrix} \sigma^2 & 0 & 0 & 0 & 0 \\ 0 & \sigma^2 & 0 & 0 & 0 \\ 0 & 0 & \sigma^2 & 0 & 0 \\ 0 & 0 & 0 & \sigma^2 & 0 \\ 0 & 0 & 0 & 0 & \sigma^2 \end{bmatrix} \]
Identity or Unit Matrix: a diagonal matrix with the elements of the leading diagonal equal to 1 and the rest zeros, e.g.,
\[ I_5 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]
Symmetric Matrix: a square matrix whose elements above the main diagonal are mirror
images of the elements below the main diagonal.

Triangular Matrix: a square matrix that has all zeros either above or below the main (nonzero) diagonal:
\[ \text{Upper triangular} = \begin{bmatrix} 1 & 4 & 6 \\ 0 & 2 & 8 \\ 0 & 0 & 3 \end{bmatrix} \quad\text{or}\quad \text{Lower triangular} = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 2 & 0 \\ 6 & 8 & 3 \end{bmatrix}. \]
Null Matrix: a matrix with all elements zero, denoted by $O$.
Equal Matrices: must be of the same order with corresponding elements equal; we write $A = B$.

2.2 Matrix Operations
2.2.1 Addition and Subtraction:
\[ C = A + B = [a_{ik} + b_{ik}] \]
\[ A - B = [a_{ik} - b_{ik}] \]

2.2.2 Scalar Multiplication


Multiply each element by the scalar, i.e., $\lambda A = [\lambda a_{ij}]$. For example,
\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}, \quad\text{then}\quad 2A = \begin{bmatrix} 2 \cdot 1 & 2 \cdot 1 \\ 2 \cdot 0 & 2 \cdot 2 \end{bmatrix} = \begin{bmatrix} 2 & 2 \\ 0 & 4 \end{bmatrix} \]
2.2.3 Matrix Multiplication
To multiply matrix $A$ by $B$ as $AB$, the matrices $A$ and $B$ must be multiplicatively
conformable; that is, the number of columns of $A$ must equal the number of rows of
$B$. For instance,
\[ A_{2\times 2}\, B_{2\times 3} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 3 & 1 & 0 \end{bmatrix} = \begin{bmatrix} (1\cdot 1)+(2\cdot 3) & (1\cdot 0)+(2\cdot 1) & (1\cdot 2)+(2\cdot 0) \\ (3\cdot 1)+(4\cdot 3) & (3\cdot 0)+(4\cdot 1) & (3\cdot 2)+(4\cdot 0) \end{bmatrix} \]
Properties of Matrix Multiplication
a. Matrix multiplication is not always commutative, i.e., $AB \neq BA$.
b. A row vector times a column vector is a scalar. Consider the ordinary least squares residuals:
\[ u'u = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = u_1^2 + u_2^2 + \cdots + u_{n-1}^2 + u_n^2 = \sum_i u_i^2, \text{ a scalar.} \]
i

c. A column vector of order $n$ times a row vector of order $n$ yields an $n \times n$ matrix:
\[ uu' = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} = \begin{bmatrix} u_1^2 & u_1 u_2 & \cdots & u_1 u_n \\ u_2 u_1 & u_2^2 & \cdots & u_2 u_n \\ \vdots & \vdots & \ddots & \vdots \\ u_n u_1 & u_n u_2 & \cdots & u_n^2 \end{bmatrix}, \]
which is symmetric.
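These operations can be checked in Stata's matrix language; a small sketch using the 2 x 2 and 2 x 3 example above:

matrix A = (1, 2 \ 3, 4)        // 2 x 2 matrix; \ separates rows
matrix B = (1, 0, 2 \ 3, 1, 0)  // 2 x 3 matrix
matrix C = A*B                  // conformable: columns of A = rows of B
matrix list C                   // displays the 2 x 3 product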

2.3 Matrix Inversion
An inverse of a square matrix $A$, denoted by $A^{-1}$, if it exists, is the unique square matrix such
that
\[ AA^{-1} = A^{-1}A = I. \]
Properties
a. $(AB)^{-1} = B^{-1} A^{-1}$.
b. $(A')^{-1} = (A^{-1})'$.

Determinants
To every square matrix $A$ there corresponds a scalar known as its determinant, denoted by
$\det(A)$ or $|A|$. For a $2 \times 2$ matrix
\[ A_{2\times 2} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]
the determinant is
\[ |A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21}. \]

Properties of Determinants
a. A singular matrix is one whose determinant is zero; a nonsingular matrix is one whose
determinant is nonzero. The inverse of a singular matrix is therefore indeterminate.
b. If all elements of any row of a matrix are zeros, its determinant is zero.
c. $|A'| = |A|$.
d. If any row (or column) is a multiple (linear combination) of other rows (or columns),
the determinant of the matrix is zero. This is helpful in demonstrating multicollinearity of
variables.
e. $|AB| = |A|\,|B|$.

Rank of a Matrix: the order of the largest square submatrix whose determinant is not zero.

As noted earlier, the inverse of a singular matrix does not exist. Therefore, for an $n \times n$ matrix
$A$, its rank must be $n$ for its inverse to exist; if the rank is less than $n$, $A$ is singular. This is an
important condition for valid estimation.

Minor
If the $i$-th row and $j$-th column of an $n \times n$ matrix $A$ are deleted, the determinant of the resulting
submatrix is called the minor of the element $a_{ij}$ and is denoted by $M_{ij}$.

Cofactor
The cofactor of element $a_{ij}$ of an $n \times n$ matrix $A$, denoted by $c_{ij}$, is defined as
\[ c_{ij} = (-1)^{i+j} M_{ij}. \]

Cofactor Matrix
Replacing each element $a_{ij}$ of a matrix $A$ by its cofactor $c_{ij}$, we obtain the cofactor matrix,
$\operatorname{cof} A$.

Adjoint Matrix: the transpose of the cofactor matrix, denoted by $\operatorname{adj} A$.

Finding the Inverse of a Square Matrix

If $A$ is a nonsingular square matrix (i.e., $|A| \neq 0$), its inverse $A^{-1}$ is given by
\[ A^{-1} = \frac{1}{|A|} (\operatorname{adj} A). \]
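In practice the determinant and inverse are computed numerically; a minimal Stata check using the 2 x 2 example from the previous section:

matrix A = (1, 3 \ 0, 11)   // a nonsingular square matrix
display det(A)              // determinant: (1)(11) - (3)(0) = 11
matrix Ainv = inv(A)        // inverse exists because det(A) != 0
matrix I2 = A*Ainv          // should reproduce the 2 x 2 identity matrix
matrix list I2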

2.4 Quadratic Forms


Let $A$ denote an $n \times n$ symmetric matrix with real entries and let $x$ denote an $n \times 1$ column
vector. Then
\[ Q(x) = x'Ax \]
is said to be a quadratic form:
\[ Q(x) = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \sum_i \sum_j a_{ij}\, x_i x_j \]
Note: Weighted sum of squares and cross products of x .

Example: the one-variable quadratic $y = ax^2$.

If $a > 0$, then $ax^2$ is always positive and zero only when $x = 0$; therefore $ax^2$ is positive
definite and $x = 0$ is a global minimizer (optimum).
If $a < 0$, then $ax^2$ is always negative and zero only when $x = 0$; therefore $ax^2$ is negative
definite and $x = 0$ is a global maximizer (optimum).

Classification of Quadratic Forms

a. Positive definite if $Q(x) = x'Ax > 0$ for all $x \neq 0$ in $R^n$;
b. Positive semidefinite if $Q(x) = x'Ax \geq 0$ for all $x \neq 0$ in $R^n$;
c. Negative definite if $Q(x) = x'Ax < 0$ for all $x \neq 0$ in $R^n$;
d. Negative semidefinite if $Q(x) = x'Ax \leq 0$ for all $x \neq 0$ in $R^n$;
e. Indefinite if $Q(x) = x'Ax > 0$ for some $x$ in $R^n$ and $Q(x) = x'Ax < 0$ for
some other $x$ in $R^n$.
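A quadratic form can be evaluated directly in Stata's matrix language; a small sketch with an assumed symmetric matrix A (not from the notes):

matrix A = (2, 1 \ 1, 2)    // symmetric and positive definite
matrix x = (1 \ -1)         // a 2 x 1 column vector
matrix Q = x'*A*x           // the scalar x'Ax, held as a 1 x 1 matrix
matrix list Q               // Q[1,1] = 2 > 0; positive for every nonzero x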

2.5 Calculus of Optimization

The first-order (necessary) condition for an optimum (maximum or minimum) is
\[ \frac{dy}{dx} = 0 \]
A sufficient (second-order) condition for an optimum is:
\[ \text{for a maximum,}\quad \frac{d^2 y}{dx^2} < 0; \qquad \text{for a minimum,}\quad \frac{d^2 y}{dx^2} > 0. \]
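For instance, a quick check with $y = (x - 2)^2$: the first-order condition $dy/dx = 2(x - 2) = 0$ gives $x = 2$, and $d^2 y / dx^2 = 2 > 0$, so $x = 2$ is a minimum.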

3 Elementary Statistics: A Review

3.1 Random Variables


- A random variable is a variable that can take on different numerical values, each with some probability lying in [0, 1].
- A random variable (RV) may be discrete or continuous.
- A random variable Y is said to be discrete if it can only assume a finite or countably infinite number of distinct values.
- A random variable that can take on any value in an interval is called continuous.

3.1.1 Probability Functions and Densities


Definition: A probability function maps every possible outcome from a discrete RV into a
probability value in [0, 1].

These probability values add to one.

These are point probabilities of random variables.

Definition: A probability density function maps every possible outcome from a continuous
RV into a value that defines the probability of any interval as the area under the PDF.

PDFs integrate to one.

3.1.2 Expected Values


Every random variable can be written as: $Y = \mu + \varepsilon$.

The expected value (or mean) $\mu$ is calculated as:
\[ \mu = E(Y) = \sum_y y\, p(y) \quad \text{for discrete RVs} \]
\[ \mu = E(Y) = \int y\, f(y)\, dy \quad \text{for continuous RVs} \]
$\varepsilon$ is another RV with $E(\varepsilon) = 0$.

Expectation is a Linear Operator

Result 1: $E(a + bY) = a + bE(Y)$

Result 2: $E(a + bY_1 + cY_2) = a + bE(Y_1) + cE(Y_2)$

But this only holds when the relationship is linear. Let's look at some examples where this does NOT work:

$E(Y_1 Y_2) \neq E(Y_1)\,E(Y_2)$ unless $Y_1$ and $Y_2$ are independent;
$E(1/Y) \neq 1 / E(Y)$;
$E(\ln Y) \neq \ln E(Y)$.
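A quick example of the failure of linearity: if $Y$ takes the values $-1$ and $1$ with probability $1/2$ each, then $E(Y) = 0$, but $E(Y^2) = 1 \neq [E(Y)]^2 = 0$.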

Variance

The variance of an RV, $Y$, is a measure of "dispersion" around the mean and is defined as:
\[ \operatorname{Var}(Y) = \sigma^2 = E[(Y - \mu)^2] \]
The standard deviation of $Y$ is then the square root of the variance:
\[ \sigma = \sqrt{\operatorname{Var}(Y)} \]
and is a measure of the "typical" distance between a $Y$ outcome and its mean $\mu$.

3.1.3 Joint Distribution, Independence, Covariance, and Correlation


Joint Distribution

Two RVs, Y1 and Y2, are said to have a joint distribution if knowledge of the outcome of
one of them influences the probabilities that the other takes particular values.

Example: Maize and Beans yields in Busoga next season are related. If you know what the
maize yield will be, that helps you predict the bean yield. Hence, these RVs are jointly
distributed.

Independence
If knowing the value of one RV does not provide ANY information for predicting the other, i.e.,
\[ P(Y_1 \mid Y_2 = y) = P(Y_1) \quad \text{for all } y, \]
then the RVs are said to be independent.

Covariance
Covariance is a measure of the tendency of two jointly distributed RVs to "move together":
\[ \operatorname{Cov}(Y_1, Y_2) = \sigma_{12} = E[(Y_1 - \mu_1)(Y_2 - \mu_2)] = E(Y_1 Y_2) - \mu_1 \mu_2 \]

Note: Independence implies covariance is zero but zero covariance does not imply
independence.

Variance Is Not a Linear Operator
\[ \operatorname{Var}(a + bY) = b^2 \operatorname{Var}(Y) \]
Assignment: find
\[ \operatorname{Var}(Y_1 + Y_2) \quad\text{and}\quad \operatorname{Var}(aY_1 + bY_2). \]

Correlation
Correlation is related to covariance and both are measures of the tendency of two RVs to
move together.

But: Covariance depends on the units of measurement of the RVs while correlation is
independent of the units of measurement.

This characteristic makes correlation more useful as a measure of co-movement.

The correlation coefficient of the two RVs $Y_1$ and $Y_2$ is defined as:
\[ \rho(Y_1, Y_2) = \rho_{12} = \frac{\sigma_{12}}{\sigma_1 \sigma_2} \]
Correlation coefficients always satisfy $-1 \leq \rho \leq 1$, and:
$\rho = 0$ indicates no relationship (zero covariance);
$\rho = -1$ indicates a complete negative relationship;
$\rho = 1$ indicates a complete positive relationship.

3.2 Estimation

3.2.1 Data Samples and Statistical Inference


Consider the random variable, Y (e.g., the yield on a particular kind of maize plot in
Kapchorwa).

Suppose we make one draw from Y (e.g., We measure the yield for one maize plot in one
season). This is one data point on Y.

Now run the same experiment N times providing a sample of N data points (e.g., We
measure yield from 12 different plots but all of the same plot type).

We want to do statistical inference (learn about the properties of the underlying random
variable Y from observing the data sample {y1, y2, ..., yN}).

3.2.2 Statistics and Estimation


Definition: A statistic is any function of the data sample.
Definition: An estimator is any statistic that provides information about the underlying
population probability distribution.
Note: The data are random draws from the underlying probability distribution. Therefore,
statistics are themselves RVs with their own probability distributions.

3.2.3 Estimators of the Mean, Variance and Co-Variance

The population mean is given as:
\[ \mu_Y = \frac{1}{N} \sum_{i=1}^{N} y_i \]
The estimate of the mean is:
\[ \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \]
The population variance is given as:
\[ \sigma_Y^2 = E[(Y - \mu_Y)^2] = \frac{1}{N} \sum_{i=1}^{N} (y_i - \mu_Y)^2 \]
The analogous sample estimator with divisor $n$ is biased; we normally use the divisor $n - 1$:
\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2 \quad\text{because}\quad E(s^2) = \sigma_Y^2. \]
The estimate of the variance is therefore:
\[ s^2 = v(y) = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2 \]
The covariance of $y_1$ and $y_2$ is the expectation of the product of $y_1$ and $y_2$ when both are measured as deviations from their means:
\[ \operatorname{Cov}(y_1, y_2) = E[(y_{1i} - \mu_1)(y_{2i} - \mu_2)] = \frac{1}{N} \sum_{i=1}^{N} (y_{1i} - \mu_1)(y_{2i} - \mu_2) \]
The estimate of the covariance is:
\[ \widehat{\operatorname{Cov}}(y_1, y_2) = \frac{1}{n-1} \sum_{i=1}^{n} (y_{1i} - \bar{y}_1)(y_{2i} - \bar{y}_2) \]
The correlation coefficient between variables $y_1$ and $y_2$ is given as:
\[ \rho(y_1, y_2) = \frac{\operatorname{Cov}(y_1, y_2)}{\sqrt{V(y_1)\, V(y_2)}} \]
The sample correlation coefficient is:
\[ r(y_1, y_2) = \frac{\widehat{\operatorname{Cov}}(y_1, y_2)}{\sqrt{v(y_1)\, v(y_2)}} = \frac{\sum_{i=1}^{n} (y_{1i} - \bar{y}_1)(y_{2i} - \bar{y}_2)}{\sqrt{\sum_{i=1}^{n} (y_{1i} - \bar{y}_1)^2 \sum_{i=1}^{n} (y_{2i} - \bar{y}_2)^2}} \]
$\rho = 0$ indicates no relationship (zero covariance);
$\rho = -1$ indicates a complete negative relationship;
$\rho = 1$ indicates a complete positive relationship.
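These estimators are what Stata's summary commands report; a minimal sketch, assuming two variables y1 and y2 in memory (hypothetical names):

summarize y1                    // sample mean and s (the n-1 divisor is used)
correlate y1 y2                 // sample correlation coefficient r
correlate y1 y2, covariance     // sample variances and covariance instead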

3.2.4 The Central Limit Theorem


If the random variable $Y$ has mean $\mu$ and variance $\sigma^2$, then the sampling distribution of $\bar{Y}$
becomes approximately normal with mean $\mu$ and variance $\sigma^2 / N$ as $N$ increases.
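The theorem is easy to see by simulation; a sketch (an assumed setup, not from the notes) drawing repeated samples from a skewed chi-square(1) population, so any normality of the sample mean comes from the CLT rather than from the data:

program define onemean, rclass
    drop _all
    set obs 100                // sample size N
    generate y = rchi2(1)      // draws from a skewed population
    summarize y
    return scalar ybar = r(mean)
end
simulate ybar = r(ybar), reps(2000) nodots: onemean
histogram ybar, normal         // sampling distribution of the mean looks near-normal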

3.3 Properties of the Estimators


3.3.1 Unbiasedness
\[ E(\bar{Y}) = \mu \]

3.3.2 Efficiency
\[ \operatorname{Var}(\bar{Y}) < \operatorname{Var}(Y^*) \quad \text{for some other estimator } Y^* \]

3.3.3 Consistency
\[ \operatorname{plim}(\bar{Y}) = \mu, \quad \text{i.e.,} \quad \lim_{N \to \infty} \operatorname{prob}\big( |\bar{Y} - \mu| < \delta \big) = 1 \]

3.4 Probability Distributions


You need to be familiar with the following four probability distributions
1. Normal
2. Chi-Square
3. t distribution
4. F distribution

3.4.1 The Standard Normal (Z-distribution)


A standard normal RV, $Z$, is continuous over $(-\infty, +\infty)$, has zero mean and unit variance,
and has a pdf of the form:
\[ f(z) = \frac{1}{\sqrt{2\pi}} \exp\!\left( -\tfrac{1}{2} z^2 \right) \]
It is denoted $Z \sim N(0, 1)$, with $Z = \dfrac{Y - \mu}{\sigma_Y}$.
A general normal RV, $Y$, is continuous over $(-\infty, +\infty)$ and can be written as:
\[ Y = \mu + \sigma_Y Z \]
where $\mu = E(Y)$, $\sigma_Y^2 = \operatorname{var}(Y)$, and $Z \sim N(0, 1)$.
It is denoted $Y \sim N(\mu, \sigma_Y^2)$.

3.4.2 Chi-Square distribution


An RV defined as the sum of squares of $N$ independent standard normal RVs is distributed as
chi-square with $N$ degrees of freedom:
\[ W = \sum_{i=1}^{N} Z_i^2 \sim \chi^2(N), \]
where the $Z_i$ are independent and distributed $Z_i \sim N(0, 1)$ for all $i$.
Note: $W$ is defined over $[0, +\infty)$.

3.4.3 The t-distribution


An RV defined as the ratio of a standard normal RV to the square root of an independent
chi-square RV divided by its $N$ degrees of freedom is distributed as $t$ with $N$ degrees of
freedom:
\[ V = \frac{Z}{\sqrt{W/N}} \sim t(N), \]
where $Z \sim N(0, 1)$ and $W \sim \chi^2(N)$.
Note: $V$ is defined over $(-\infty, +\infty)$, and looks a lot like a normal, but with "fatter tails."

3.4.4 The F-distribution


If $W_1$ and $W_2$ are two independent chi-square RVs with degrees of freedom $N_1$ and $N_2$,
then the RV defined by the ratio
\[ U = \frac{W_1 / N_1}{W_2 / N_2} \sim F(N_1, N_2) \]
is said to follow an F-distribution.

Note: $U$ is defined over $[0, +\infty)$.

3.5 Hypothesis Testing, P-Values and Confidence Intervals

3.5.1 Hypothesis Testing


Hypothesis testing involves the following steps:
1) State the null and alternative hypotheses, e.g., $H_0: \mu = 10$, $H_1: \mu \neq 10$.
2) Define an appropriate test statistic (e.g., the sample mean, sample variance, or
estimated coefficients from a regression model).
3) Compute the distribution of the statistic under the null:
\[ \bar{Y} \sim N(10, \sigma^2 / N) \]
4) Standardize the statistic so as to use published tables:
\[ Z = \frac{\bar{Y} - 10}{\sigma / \sqrt{N}} \sim N(0, 1) \quad\text{and}\quad \hat{t} = \frac{\bar{Y} - 10}{s / \sqrt{N}} \sim t(N - 1) \]
5) Compute the p-value of the computed test statistic.
6) Reject if the p-value is less than the chosen significance level; otherwise fail to reject.
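In Stata these steps are bundled into a single command; e.g., for the $H_0: \mu = 10$ example above, assuming a variable y in memory:

ttest y == 10    // t test of H0: mean(y) = 10; reports t, df, and p-values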

3.5.2 P-Values
A p-value is the minimum significance level at which a null hypothesis can be rejected.

P-values are sometimes called the exact level of significance of a test statistic.

E.g., If a p-value=0.04 this means the null can be rejected at the 5% level but not the 1%
level.

3.5.3 Confidence Intervals


If you can do hypothesis tests you can do confidence intervals. We know that:
\[ t = \frac{\bar{Y} - \mu}{s / \sqrt{N}} \sim t(N - 1) \]
Therefore, for 20 df:
\[ P(-1.725 \leq t \leq 1.725) = 0.9 \]
i.e.,
\[ P\!\left( \bar{Y} + 1.725\, \frac{s}{\sqrt{N}} \;\geq\; \mu \;\geq\; \bar{Y} - 1.725\, \frac{s}{\sqrt{N}} \right) = 0.9 \]
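The 1.725 critical value can be recovered in Stata rather than from published tables:

display invttail(20, 0.05)    // upper 5% point of t(20), approximately 1.725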

3.5.4 The power of a test
Power is the probability of rejecting the null hypothesis when it is in fact false:
\[ \text{power} = P(\text{reject } H_0 \mid H_1 \text{ is true}) \]
It is 1 minus the probability of a Type II error, i.e., 1 minus the probability that one
will accept the null hypothesis as true when it is in fact false.

Table on Power and Type I and Type II Errors

Decision           | H0 True                 | H0 False
Fail to reject H0  | Correct decision        | Type II error (1 - power)
Reject H0          | Type I error (p-value)  | Correct decision

3.6 Descriptive Statistics


- Histogram: tabulates the frequency distribution of the data.
- The Median:
  o For an odd number of observations, the middle observation when the data are ranked from low to high (or high to low).
  o For an even number of observations, the average of the two middle observations.
- Skewness: measures the symmetry of the probability distribution:
\[ S = \frac{1}{N} \sum_{i=1}^{N} \frac{(x_i - \bar{x})^3}{s^3} \]
  o $s$ is the standard deviation.
  o $S = 0$ for all symmetric distributions.
  o $S > 0$ when the upper tail is thicker.
  o $S < 0$ when the negative (lower) tail is thicker.

[Figures: a negatively skewed distribution and a positively skewed distribution.]

- Kurtosis: measures the "thickness" of the tails of the distribution:
\[ K = \frac{1}{N} \sum_{i=1}^{N} \frac{(x_i - \bar{x})^4}{s^4} \]
  o $K = 3$ for a normal distribution.
  o $K > 3$ for tails thicker than the normal.
  o $K < 3$ for tails thinner than the normal.
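Stata reports both moments directly; a one-line check, assuming a variable x in memory:

summarize x, detail    // reports skewness S and kurtosis K (K = 3 for a normal)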

4 Introduction to the Regression Model

4.1 Curve Fitting


Time series data: Describes movement over time (e.g., Daily, weekly, monthly, quarterly,
annual)

Cross-section data: Describes activities of individuals, firms, or households at a given point
in time.

Pooled data: Combines time series and cross-section data, e.g., a study of firms or households
over time.

Sample: a set of observations.

Grade Point Average and Family Income


Grade Point Average | Income of parents ($1,000)
4                   | 21
3                   | 15
3.5                 | 15
2                   | 9
3                   | 12
3.5                 | 18
2.5                 | 6
2.5                 | 12
Source: Pindyck and Rubinfeld

Assume a linear relation between X and Y


Objective: Obtain “Best” straight line relating X and Y
Method 1: Connect a line between the lowest and highest points.
Method 2: Draw a line which seems to pass through most points.
Method 3: Draw a line so that the sum of vertical distances from the line (deviations) is zero.
Method 4: Draw a line that minimizes the sum of absolute deviations from the fitted line.
Method 5: Least squares: the "line of best fit" minimises the sum of squared vertical deviations
of the points from the straight line.

(In the scatter diagram, negative deviations lie below the fitted line and positive deviations lie above it.)

4.2 Derivation of Least Squares

Fitted values: the least-squares line is
\[ \hat{y}_i = \hat{\alpha} + \hat{\beta} x_i \]
and the deviation (residual) is
\[ \hat{\varepsilon}_i = y_i - \hat{y}_i \]

The ordinary least squares (OLS) criterion for picking estimators in the simple linear
regression model is to choose the $(\hat{\alpha}, \hat{\beta})$ that minimise the sum of the squared residuals:
\[ \min_{\hat{\alpha}, \hat{\beta}} \sum_i \hat{\varepsilon}_i^2 = \sum_i (y_i - \hat{\alpha} - \hat{\beta} x_i)^2 \]
The first-order conditions for this quadratic programming problem are:
\[ -2 \sum_i (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \]
\[ -2 \sum_i x_i (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \]
These are the normal equations. Solving for $(\hat{\alpha}, \hat{\beta})$ gives the OLS formulas:
\[ \hat{\beta} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \quad\text{and}\quad \hat{\alpha} = \bar{y} - \hat{\beta} \bar{x} \]
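The formulas can be verified on the GPA and family income data from Section 4.1; a minimal Stata sketch:

clear
input gpa income
4   21
3   15
3.5 15
2   9
3   12
3.5 18
2.5 6
2.5 12
end
regress gpa income               // OLS intercept and slope
correlate gpa income, covariance // slope = cov(gpa, income) / var(income)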

5 The Two-Variable Regression Model (Simple Linear Regression)

5.1 The model


The model is
\[ y_i = \alpha + \beta x_i + \varepsilon_i \]
where
$y_i$ = the $i$-th observation on the dependent variable $y$;
$x_i$ = the $i$-th observation on the explanatory variable $x$;
$\varepsilon_i$ = the $i$-th error term;
$\alpha$ and $\beta$ are the parameters (intercept and slope) to be estimated.

5.2 The Best Linear Unbiased Estimation

5.2.1 Properties of OLS estimators / The Gauss-Markov Theorem


(1) If the DGM is truly linear, of the form $y_i = \alpha + \beta x_i + \varepsilon_i$;
(2) the $x_i$ are not random variables (they are fixed/controlled, as in experiments);
(3) $E(\varepsilon_i) = 0$ for all $i$ (the error term has zero expected value);
(4) $\operatorname{Var}(\varepsilon_i) = E(\varepsilon_i^2) = \sigma^2$ for all $i$ (the error term has constant variance for all
observations: homoskedasticity);
(5) $\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = E(\varepsilon_i \varepsilon_j) = 0$ for $i \neq j$ (the error terms are statistically independent: no
autocorrelation); and
(6) the error term is normally distributed;

then the OLS estimators $(\hat{\alpha}, \hat{\beta})$ are the Best Linear Unbiased Estimators (BLUE):
1. "Linear" means that the OLS estimators are linear functions of the observations on the
dependent variable.
2. "Unbiased" means that $E(\hat{\alpha}) = \alpha$ and $E(\hat{\beta}) = \beta$.
3. "Best" means that of all estimators that are linear and unbiased, the OLS estimators
have the smallest variance (they are the most efficient).

5.2.2 Sources of Errors


1. Errors in optimization (behavioral errors)
2. Omitted variables (no data, immeasurable variable, incomplete theory).
3. Errors of measurement (poor proxy variables)
4. Intrinsic randomness in human behavior
5. Principle of parsimony (use only as many variables as necessary)
6. Wrong functional form. Functional form is only an approximation to the true
relationship between the variables (linear approximation)

5.2.3 Conditional Mean Interpretation of Regression
Notice that if we assume $E(\varepsilon_i) = 0$ for all $i$ (a very unrestrictive assumption; why?) then:
\[ E(y_i \mid x_i) = \alpha + \beta x_i \]
is the conditional mean of $y_i$ (conditional on observing $x_i$).

That implies that:

1. The intercept $\alpha$ is the expected value of $y_i$ given $x_i = 0$:
\[ \alpha = E(y_i \mid x_i = 0) \]
2. The slope $\beta$ is the expected change in $y_i$ given a marginal (unit) change in $x_i$:
\[ \beta = \frac{\partial E(y_i \mid x_i)}{\partial x_i} \]
5.2.4 Estimation
We don't know $\alpha$ and $\beta$ but want to estimate them from the observed data on $(x_i, y_i)$ for
$i = 1, 2, 3, \ldots, N$.

Definition: An estimator $(\hat{\alpha}, \hat{\beta})$ of $(\alpha, \beta)$ is any pair of values that depend on the data and
satisfy $y_i = \hat{\alpha} + \hat{\beta} x_i + \hat{\varepsilon}_i$ for some set of residuals $\hat{\varepsilon}_i$.

Note that $\varepsilon_i$ is a population quantity that you don't see (the disturbance), while $\hat{\varepsilon}_i$ is a sample
quantity that is observable.

Note: Clearly there are many, many such potential estimators $(\hat{\alpha}, \hat{\beta})$. How do we make sure
we are picking "good" ones?

Some Potential Estimators


1. Graph the data and draw in a “line of best fit.”
2. Set = ∑ ⁄ and = 0
3. Set = 0 and = 1
4. OLS
Which of these are “good” and which are “bad?”

5.2.5 The Distribution of OLS Estimators


To test hypotheses about $\hat{\alpha}$ and $\hat{\beta}$ we need to know their probability distributions.
Suppose that the DGM is:
\[ Y_i = \alpha + \beta X_i + \varepsilon_i, \qquad \varepsilon_i \sim \text{i.i.d. } N(0, \sigma^2) \]
It can be shown that:
\[ \hat{\alpha} \sim N\!\left( \alpha,\; \frac{\sigma^2 \sum_i X_i^2}{N \sum_{i=1}^{N} (X_i - \bar{X})^2} \right) \text{ (check)} \]
\[ \hat{\beta} \sim N\!\left( \beta,\; \frac{\sigma^2}{\sum_{i=1}^{N} (X_i - \bar{X})^2} \right) \]
\[ \operatorname{Cov}(\hat{\alpha}, \hat{\beta}) = -\frac{\sigma^2 \bar{X}}{\sum_{i=1}^{N} (X_i - \bar{X})^2} \text{ (cross-check)} \]
\[ s^2 = \hat{\sigma}^2 = \frac{\sum_i \hat{\varepsilon}_i^2}{N - 2} \]
where $s = \hat{\sigma}$ is the standard error of the regression and $\sum_i \hat{\varepsilon}_i^2$ is the residual sum of squares.

Proof (sketch): let
\[ k_i = \frac{X_i - \bar{X}}{\sum_{i=1}^{N} (X_i - \bar{X})^2} = \frac{x_i}{\sum_i x_i^2}, \]
where lowercase $x_i = X_i - \bar{X}$ denotes deviations from the mean, and write $\hat{\beta} = \sum_i k_i Y_i$.

5.3 Hypothesis Testing and Confidence Intervals

5.3.1 Hypothesis Testing

Hypothesis testing involves the following steps:

1) State the null and alternative hypotheses, e.g., $H_0: \beta = 0$, $H_1: \beta \neq 0$.
2) Define an appropriate test statistic: $\hat{\beta}$.
3) Compute the distribution of the statistic under the null:
\[ \hat{\beta} \sim N\!\left( 0,\; \frac{\sigma^2}{\sum_i (x_i - \bar{x})^2} \right) \]
4) Standardize the statistic so as to use published tables:
\[ Z = \frac{\hat{\beta}}{\sqrt{\sigma^2 / \sum_i (x_i - \bar{x})^2}} \sim N(0, 1) \quad\text{and}\quad \hat{t} = \frac{\hat{\beta}}{\sqrt{s^2 / \sum_i (x_i - \bar{x})^2}} \sim t(N - 2) \]
5) Compute the p-value of the computed test statistic $\hat{t}$.
6) Reject if the p-value is less than the chosen significance level; otherwise fail to reject.

5.3.2 Confidence Intervals


If you can do hypothesis tests you can do confidence intervals:
\[ \operatorname{Prob}\!\left( \hat{\beta} - t_{(\alpha/2,\, n-k)}\, s_{\hat{\beta}} \;\leq\; \beta \;\leq\; \hat{\beta} + t_{(\alpha/2,\, n-k)}\, s_{\hat{\beta}} \right) = 1 - \alpha \]
where $1 - \alpha$ is the desired level of confidence and $n - k$ the degrees of freedom.

We know that:
\[ \hat{t} = \frac{\hat{\beta} - \beta}{s_{\hat{\beta}}} \sim t(N - 2), \quad\text{where}\quad s_{\hat{\beta}}^2 = \frac{s^2}{\sum_i (x_i - \bar{x})^2} \]
Therefore, for 20 degrees of freedom:
\[ P(-1.725 \leq \hat{t} \leq 1.725) = 0.9 \]
i.e.,
\[ P\!\left( \hat{\beta} + 1.725\, s_{\hat{\beta}} \;\geq\; \beta \;\geq\; \hat{\beta} - 1.725\, s_{\hat{\beta}} \right) = 0.9 \]

5.4 Analysis of Variance: Goodness of Fit


How do we determine whether the model is a "good" fit for the data?
1. Determine the $y$ values predicted by the model:
\[ \hat{y}_i = \hat{\alpha} + \hat{\beta} x_i \]

2. Decompose the total sum of squares (TSS) in $y$ into two components: the "regression
sum of squares" (RSS) explained by the regression line, and the "error sum of squares"
(ESS) that is not explained by the regression line:

TSS = RSS + ESS

\[ \sum_{i=1}^{N} (y_i - \bar{y})^2 = \sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \]
3. Define the "coefficient of determination":
\[ R^2 = \frac{\text{RSS}}{\text{TSS}} = 1 - \frac{\text{ESS}}{\text{TSS}} \]
\[ R^2 = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} = 1 - \frac{\sum_{i=1}^{N} \hat{\varepsilon}_i^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \]

Note:
$R^2$ lies between 0 and 1.
Values "close to" 1 indicate a good fit, but:
A low (high) $R^2$ does not necessarily indicate the model is "bad" ("good"). Why not?

Note: Be careful about notation. In some books the Error Sum of Squares (ESS) is called
the RSS (Residual Sum of Squares), and the Regression Sum of Squares (RSS) is called the
ESS (Explained Sum of Squares).

5.5 Examples of Commands in Stata

summarize gpa income      // descriptive statistics
correlate gpa income      // correlation matrix
regress gpa income        // OLS regression of gpa on income
test income               // Wald test that the coefficient on income is zero
ttest gpa == 2            // t test of H0: mean gpa = 2

