(2021) EC6041 Lecture 2 CLRM
Ratjomose P. Machema
[email protected]
Department of Economics
National University of Lesotho (NUL)
EC6041: Econometric Theory and Applications
Outline: Introduction · Matrix Representation · CLRM Assumptions
Recall that in a simple regression the dependent variable y is related to only one
explanatory variable x.
Therefore we would like to “study how y varies with changes in x.”
For example:
x is amount of fertilizer and y is soybean yield; or
x is years of schooling and y is hourly wage.
The simple linear regression model is a specification of the process that we
believe describes the relationship between the two variables.
For each level of xᵢ, we assume that yᵢ is generated by the following
simple linear regression model (or two-variable regression model):

yᵢ = β₁ + β₂xᵢ + uᵢ    (1)
The variable u, called the error term or disturbance, represents factors other
than x that affect y .
The simple regression model treats all factors affecting y other than x as unobserved.
We call β₁ the intercept parameter and β₂ the slope parameter.
These describe a population, and our ultimate goal is to estimate them.
y is assumed to be linearly related to x.
If the other factors in u are held fixed, meaning ∆u = 0, then

∆y = β₂∆x    (2)

so β₂ measures the ceteris paribus effect of x on y. For example, if x is years
of schooling and β₂ = 0.5, an extra year of schooling is associated with a 0.50
increase in the hourly wage, other factors fixed. The x values are assumed to be
independent of the error terms, the βs are fixed, and each uᵢ is distributed
independently and identically with mean 0 and variance σ² (i.e. there is no
heteroscedasticity and no autocorrelation).
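A small simulation sketch of this data-generating process (the parameter values and variable names are illustrative assumptions, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

beta1, beta2 = 1.0, 0.5   # illustrative intercept and slope
sigma = 2.0               # illustrative error standard deviation

n = 100
x = rng.uniform(0, 16, size=n)        # e.g. years of schooling
u = rng.normal(0, sigma, size=n)      # iid errors: mean 0, variance sigma^2
y = beta1 + beta2 * x + u             # the simple linear regression model (1)
```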
The observed value of yᵢ is the sum of two parts: the regression function
β₁ + β₂xᵢ and the disturbance uᵢ.
Our objective is to estimate the unknown parameters of the model,
use the data to study the validity of the theoretical propositions, and
perhaps use the model to predict the variable y .
If in addition we make the assumption that the errors are normally distributed,
then the model is known as the classical normal linear regression model.
More generally, with k explanatory variables the model is

y = x₁β₁ + x₂β₂ + · · · + xₖβₖ + ε

or, in matrix form,

y = Xβ + ε
and the disturbances are uncorrelated across observations:

Cov[εᵢ, εⱼ | X] = 0 for all i ≠ j
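A minimal sketch of the matrix representation (simulated data, illustrative names): X stacks a column of ones and the regressors, and the disturbances are drawn iid, so they are uncorrelated across observations by construction.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])   # n x k design matrix, intercept column first
beta = np.array([1.0, 2.0, -0.5])           # illustrative coefficients

eps = rng.normal(0, 1.0, size=n)            # iid disturbances, uncorrelated across i
y = X @ beta + eps                          # the model y = X beta + eps
```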
If the error term is heteroskedastic, the dispersion changes over the range
of observations. Heteroskedasticity occurs when the variance of the error
term changes in response to a change in the value(s) of the independent
variable(s):
Var[εᵢ | X] = σᵢ² for all i = 1, . . . , n
Figure: Homoskedasticity
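A sketch contrasting the two cases (the variance function is an illustrative assumption): under heteroskedasticity the error standard deviation is made to grow with x, so the dispersion widens over the range of observations, whereas homoskedastic errors keep a constant spread.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 1000
x = rng.uniform(0, 10, size=n)

# Homoskedastic errors: constant variance for every observation
u_homo = rng.normal(0, 2.0, size=n)

# Heteroskedastic errors: Var[eps_i | x_i] depends on x_i
sigma_i = 0.5 + 0.5 * x            # illustrative variance function
u_hetero = rng.normal(0, sigma_i)  # scale varies observation by observation

# Compare the error spread for small vs large x:
# roughly equal in the first case, clearly different in the second
for u in (u_homo, u_hetero):
    print(u[x < 2].std(), u[x > 8].std())
```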
We assume that either X is fixed (non-stochastic) or X is independent of ε, and that

ε | X ∼ N(0, σ²I)
y = Xβ + ε

Given an estimate β̂, the fitted values and residuals are

ŷ = Xβ̂
e = y − ŷ = y − Xβ̂
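To see where the first-order condition below comes from, expand the residual sum of squares (standard OLS algebra, filled in here as a sketch):

RSS(β̂) = (y − Xβ̂)′(y − Xβ̂) = y′y − 2β̂′X′y + β̂′X′Xβ̂

Differentiating with respect to β̂ gives

∂RSS/∂β̂ = −2X′y + 2X′Xβ̂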
Minimizing the residual sum of squares with respect to β̂ yields the first-order condition

−2X′y + 2X′Xβ̂ = 0

X′Xβ̂ = X′y    (12)
These are the normal equations. Provided that X′X has rank k (which it does,
by the identification condition A2 that we imposed), we can solve for β̂:

β̂ = (X′X)⁻¹X′y
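A minimal numerical sketch of the estimator (the simulated data and names are illustrative, not from the lecture): it forms the normal equations and solves them, then cross-checks the result against numpy's least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a small dataset: n observations, k = 3 columns (intercept + 2 regressors)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)   # iid N(0, 1) disturbances

# Solve the normal equations X'X beta_hat = X'y
# (np.linalg.solve is preferred over forming the explicit inverse)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against numpy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
print(np.allclose(beta_hat, beta_lstsq))  # True
```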
For β̂ to define the unique minimum (and not some other stationary point),
the Hessian

∂²RSS / ∂β̂∂β̂′ = 2X′X

must be a positive definite matrix, which it is because X has full rank.
Therefore, the least squares solution is unique and minimizes the sum
of squared residuals.
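To make the second-order condition concrete, a quick numerical check (the simulated X is illustrative; a Cholesky factorization succeeds only for positive definite matrices):

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.column_stack([np.ones(200), rng.normal(size=200), rng.normal(size=200)])

# Second-order condition: the Hessian 2 X'X must be positive definite
H = 2 * X.T @ X

# All eigenvalues of a symmetric positive definite matrix are strictly positive
print(np.linalg.eigvalsh(H).min() > 0)  # True when X has full column rank

# Equivalent check: the Cholesky factorization exists iff H is positive definite
np.linalg.cholesky(H)  # would raise LinAlgError otherwise
```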