Lecture 03
Obid Khakimov
[Figure: population regression function E(Y |Xi ) with observed values Yi and fitted values Ŷi ; vertical axis: weekly income, $]
Features: simple, intuitive, and attractive statistical properties.
A first thought is to choose the SRF so that the sum of the residuals, ∑ ûi , is as small as possible; but this criterion gives every residual equal weight, no matter how far it lies from the SRF.
To see this, let û1 , û2 , û3 , and û4 take the values of +10, -2, +2,
and -10, respectively. The algebraic sum of these residuals is zero
although û1 and û4 are scattered more widely around the SRF than
û2 and û3 .
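In numbers: ∑ ûi = 10 + (−2) + 2 + (−10) = 0, yet ∑ ûi² = 100 + 4 + 4 + 100 = 208, so squaring keeps the widely scattered residuals from cancelling and gives them more weight.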
We can avoid this problem if we adopt the least-squares criterion,
which states that the SRF can be fixed in such a way that
∑ ûi² = ∑ (Yi − Ŷi )² = ∑ (Yi − β̂1 − β̂2 Xi )²

is as small as possible, where ûi² are the squared residuals.
Now which set of β̂ values should we choose? Obviously the β̂’s of the first
experiment are the “best” values. But we could run endless experiments and
then choose the set of β̂ values that gives the least possible value of ∑ ûi².
Fortunately, the method of least squares provides us with unique estimates of β1
and β2 without a trial-and-error process.
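To see what the trial-and-error route would look like, here is a minimal Python sketch (the data and the grid of candidate values are purely illustrative, not from the lecture) that searches over candidate pairs (β̂1 , β̂2 ) for the smallest ∑ ûi²:

```python
import numpy as np

# Hypothetical sample of X (e.g. weekly income) and Y -- illustrative values only
X = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
Y = np.array([70, 65, 90, 95, 110, 115, 120, 140, 155, 150], dtype=float)

best = (np.inf, None, None)
# Trial and error: scan a coarse grid of candidate (b1, b2) pairs
for b1 in np.linspace(0, 50, 201):
    for b2 in np.linspace(0, 1, 201):
        rss = np.sum((Y - b1 - b2 * X) ** 2)   # sum of squared residuals for this pair
        if rss < best[0]:
            best = (rss, b1, b2)

print("grid search: RSS=%.2f, b1=%.2f, b2=%.3f" % best)
```

The answer is only as precise as the grid; least squares gives the exact minimizing values directly, as the formulas below show.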
The least-squares estimators, in deviation form, are

β̂2 = ∑ xi yi / ∑ xi²
β̂1 = Ȳ − β̂2 X̄

where
yi = Yi − Ȳ
xi = Xi − X̄
are deviations from the sample means, and the fitted SRF in deviation form is ŷi = β̂2 xi .
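A minimal sketch of the closed-form computation, using the same illustrative (made-up) data as above; the variable names are mine, not the lecture’s:

```python
import numpy as np

# Same hypothetical sample as in the grid-search sketch
X = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
Y = np.array([70, 65, 90, 95, 110, 115, 120, 140, 155, 150], dtype=float)

# Deviations from the sample means: x_i = X_i - X_bar, y_i = Y_i - Y_bar
x = X - X.mean()
y = Y - Y.mean()

beta2_hat = np.sum(x * y) / np.sum(x ** 2)   # slope
beta1_hat = Y.mean() - beta2_hat * X.mean()  # intercept

print("beta1_hat =", round(beta1_hat, 4), "beta2_hat =", round(beta2_hat, 4))
# np.polyfit(X, Y, 1) minimizes the same sum of squares and returns the same values.
```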
Properties of the regression line
5. The residuals ûi are uncorrelated with Xi : ∑ ûi Xi = 0
6. The residuals ûi are uncorrelated with the fitted values: ∑ ŷi ûi = 0, since

∑ ŷi ûi = β̂2 ∑ xi ûi
        = β̂2 ∑ xi (yi − β̂2 xi )
        = β̂2 ∑ xi yi − β̂2² ∑ xi²
        = β̂2² ∑ xi² − β̂2² ∑ xi²
        = 0

where the second-to-last step substitutes ∑ xi yi = β̂2 ∑ xi² , i.e. the formula for β̂2 .
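These properties are easy to check numerically. A short sketch, again with made-up data (not from the lecture):

```python
import numpy as np

# Hypothetical sample (illustrative only)
X = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
Y = np.array([70, 65, 90, 95, 110, 115, 120, 140, 155, 150], dtype=float)

x, y = X - X.mean(), Y - Y.mean()
b2 = np.sum(x * y) / np.sum(x ** 2)      # OLS slope
b1 = Y.mean() - b2 * X.mean()            # OLS intercept

u_hat = Y - (b1 + b2 * X)                # residuals
y_hat_dev = b2 * x                       # fitted values in deviation form

print(np.sum(u_hat))                     # ~0: residuals sum to zero
print(np.sum(u_hat * X))                 # ~0: residuals uncorrelated with X
print(np.sum(y_hat_dev * u_hat))         # ~0: residuals uncorrelated with fitted values
```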
In regression analysis our objective is not only to obtain β̂1 and β̂2
but also to draw inferences about the true β1 and β2 .
Look at the PRF (Yi = β1 + β2 Xi + ui ). It shows that Yi depends on
both Xi and ui . The assumptions made about the Xi variable(s) and
the error term are extremely critical to the valid interpretation of the
regression estimates.
The Classical Linear Regression Model has 10 assumptions.
Yi = β1 + β2 Xi + ui

Given the value of Xi , the mean, or expected, value of the random disturbance term ui is zero. Technically, the conditional mean value of ui is zero:

E(ui |Xi ) = 0

[Figure: conditional distribution of the disturbances around the PRF at X1 , X2 , X3 ; the positive (+ui ) and negative (−ui ) deviations cancel on average]
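A small simulation can make the assumption concrete: drawing many Y values at a few fixed X’s from a population whose disturbances have zero mean, the average deviation of Y from the PRF at each X is close to zero. The parameter values below are assumptions of the sketch, not figures from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2 = 20.0, 0.6                       # assumed "true" population parameters
X_levels = [80.0, 140.0, 200.0, 260.0]         # a few fixed values of X

for X in X_levels:
    u = rng.normal(loc=0.0, scale=10.0, size=100_000)  # disturbances with E(u|X) = 0
    Y = beta1 + beta2 * X + u
    print(f"X={X:5.0f}  mean(Y) - (beta1 + beta2*X) = {Y.mean() - (beta1 + beta2 * X):+.4f}")
```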
Homoscedasticity: given the value of Xi , the variance of ui is the same for all observations,

var(ui |Xi ) = E(ui² |Xi ) = σ²

Heteroscedasticity: the conditional variance of ui varies with Xi ,

var(ui |Xi ) = E(ui² |Xi ) = σi² ≠ σ²

[Figure: densities f (u) of the disturbances around the line β1 + β2 Xi at x1 , x2 , . . . , xi ; under heteroscedasticity the spread grows with X]
The likelihood is that the Y observations coming from the population with
X = X1 would be closer to the PRF than those coming from populations
corresponding to X = X2 , X = X3 , and so on. In short, not all Y values
corresponding to the various X’s will be equally reliable, reliability being
judged by how closely or distantly the Y values are distributed around
their means.
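A sketch of the same point by simulation, with an assumed error spread that grows with X (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
beta1, beta2 = 20.0, 0.6                 # assumed "true" parameters
X_levels = [80.0, 140.0, 200.0, 260.0]

for X in X_levels:
    sigma_i = 0.05 * X                   # heteroscedasticity: var(u|X) depends on X
    u = rng.normal(0.0, sigma_i, size=100_000)
    Y = beta1 + beta2 * X + u
    # Y values at small X cluster tightly around the PRF; at large X they scatter widely
    print(f"X={X:5.0f}  sd of Y around the PRF = {Y.std():6.2f}")
```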
Zero covariance between ui and Xi : cov(ui , Xi ) = E(ui Xi ) = 0 (using E(ui ) = 0).
The PRF assumes that X and u (which may represent the influence of all
the omitted variables) have separate (and additive) influence on Y . But if
X and u are correlated, it is not possible to assess their individual effects
on Y . In other words, it is difficult to isolate the influence of X and u on
Y.
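A simulation sketch of why correlation between X and u is a problem: when u moves with X, the OLS slope absorbs part of u’s influence, so the separate effect of X can no longer be recovered. The numbers below are assumptions made for the illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
beta1, beta2 = 20.0, 0.6                 # assumed "true" parameters

def ols_slope(X, Y):
    x, y = X - X.mean(), Y - Y.mean()
    return np.sum(x * y) / np.sum(x ** 2)

# Case 1: X and u uncorrelated -> the slope estimate recovers beta2
X = rng.uniform(80, 260, size=n)
u = rng.normal(0, 10, size=n)
print("cov(X,u)=0:", round(ols_slope(X, beta1 + beta2 * X + u), 3))

# Case 2: u correlated with X -> the slope picks up part of u's effect on Y
u = rng.normal(0, 10, size=n) + 0.2 * (X - X.mean())
print("cov(X,u)>0:", round(ols_slope(X, beta1 + beta2 * X + u), 3))
```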