Chapter 3 discusses the Ordinary Least Squares (OLS) method for estimating linear regression models, outlining its assumptions and the least squares criterion. Key assumptions include the independence of error terms, homoscedasticity, and the normal distribution of errors. The chapter also covers the derivation of estimators, interpretation of results, and the precision of estimates through standard errors.

CHAPTER 3

The Ordinary Least Squares (OLS) Method of Estimating the Linear Regression Model (LRM)
3.1. Outline
 Assumptions of the OLS method
 The least squares criterion
 Interpretation of estimates
 Properties of estimators obtained by OLS
 Precision or standard errors of least-squares estimates
 Properties of least-squares estimators: the Gauss–Markov theorem
3.2. OLS METHOD
To estimate the coefficients β1 and β2 we need observations on X, Y and u.
 Yet u, unlike X and Y, is never observed.
 To estimate the function Yi = β1 + β2Xi + ui,
• we must therefore make reasonable assumptions about
 the shape of the distribution of each ui
 its mean, variance and covariance with the other u’s
 These assumptions are guesses about the true, but unobservable, values of ui.
3.3. The Assumptions of the OLS Method

 The linear regression model is based on certain assumptions about:
1) the distribution of the random variable ui;
2) the relationship between ui and the explanatory variables (Xi);
3) the relationship between the explanatory variables themselves.
THE ASSUMPTIONS OF THE OLS METHOD…….

1. ui is a random real variable and has zero mean value:
 E(ui) = 0, or conditionally, E(ui | Xi) = 0
 This implies that for each value of X, u may assume various values,
 some positive and some negative,
 but on average zero.
THE ASSUMPTIONS OF THE OLS METHOD…….

• Further, E(Yi | Xi) = β1 + β2Xi gives
 the relationship between X and Y on average, i.e.
 when X takes on the value Xi,
 Y will on average take on E(Yi | Xi).
THE ASSUMPTIONS OF THE OLS METHOD…….

2. The variance of ui is constant for all i:
 var(ui) = E(ui²) = σ² for all i
 This is called the assumption of common variance or homoscedasticity.
 The implication is that for all values of X, the values of u show the same dispersion around their mean.
THE ASSUMPTIONS OF OLS METHOD…….

 The consequence of the failure of this assumption is that
 var(Yi | Xi) = σi²,
 i.e. the variance of the Y population varies as X changes;
 this situation of non-constant variance of Y is called heteroscedasticity.
THE ASSUMPTIONS OF OLS METHOD…….

3. ui has a normal distribution, i.e.
 ui ∼ N(0, σ²)
 which also implies
 Yi ∼ N(β1 + β2Xi, σ²)
• This is called the assumption of normality.
THE ASSUMPTIONS OF OLS METHOD…….

4. The random terms of different observations are independent:
 cov(ui, uj) = E(ui uj) = 0 for i ≠ j, where i and j run from 1 to n.
 This is called the assumption of no autocorrelation, or no serial correlation, among the error terms.
• The consequence of this assumption is that cov(Yi, Yj) = 0 for i ≠ j,
 i.e. there is no autocorrelation among the Y’s.
THE ASSUMPTIONS OF OLS METHOD…

5. Xi’s are a set of fixed values in the process of repeated


sampling which underlies the linear regression model, i.e.
 they are non-stochastic

6. ui is independent of the explanatory variables, i.e.,

 cov(ui,Xi)=E(uiXi)= 0
THE ASSUMPTIONS OF OLS METHOD…….

7. Variability in X values
 the X values in a given sample must not all be the same
 var(X) must be a finite positive number.

8. The regression model is correctly specified


3.4. THE LEAST SQUARES CRITERION

• So far we have completed the work involved in the first stage of any econometric application,
 namely, we have specified the model and stated explicitly its assumptions.
 The next step is the estimation of the model, that is,
 the computation of the numerical values of its parameters.
3.4. THE LEAST SQUARES CRITERION...

• The linear relationship Yi = β1 + β2Xi + ui holds for the population of the values of X and Y,
 so we could obtain the numerical values of β1 and β2 only if
 we could have all the possible values of X, Y and u which form the population of these variables.
THE LEAST SQUARES CRITERION…

 Since this is impossible in practice, we take a sample of observed values of Y and X.
 E.g. X = family disposable income in Birr and Y = family food expenditure in Birr.
 Then we specify the distribution of the u’s and try to get satisfactory estimates of the true parameters of the relationship.
 This is done by fitting a regression line through the observations of the sample,
 which we consider as an approximation to the true line.
THE LEAST SQUARES CRITERION…

• The method of ordinary least squares is
 one of the econometric methods which enable us to find estimates of the true parameters;
 it is attributed to Carl Friedrich Gauss,
 a German mathematician.
THE LEAST SQUARES CRITERION…
• To understand the OLS method, we first explain the least squares principle.
 Recall the two-variable PRF: X = disposable income level and Y = household food expenditure
 Yi = β1 + β2Xi + ui -------------------- (2.2.4)
• The PRF is not directly observable, and so we estimate it from the SRF:
• Yi = β̂1 + β̂2Xi + ûi
• Ŷi = β̂1 + β̂2Xi
• where Ŷi = the estimated (conditional mean) value of Yi

How is the SRF itself determined?

• But how is the SRF itself determined? To see this, let us proceed as follows. First, express ûi as
 ûi = Yi − Ŷi
 = Yi − β̂1 − β̂2Xi ---------------------------------------------------------------- (3.2.1)
• The ûi are the residuals:
 the differences between the actual and estimated Y values.
Derivation of estimators using the OLS method

 To this end, we adopt the least-squares criterion, which states that the SRF can be fixed in such a way that
• Σûi² = Σ(Yi − Ŷi)²
• = Σ(Yi − β̂1 − β̂2Xi)² -------------------- (3.2.2)
• is as small as possible, where ûi² are the squared residuals.
• Σûi² = f(β̂1, β̂2), that is,
 the sum of the squared residuals is some function of the estimators β̂1 and β̂2.
• The principle or method of least squares chooses β̂1 and β̂2 in such a manner that, for a given sample or set of data, Σûi² is as small as possible.
• In other words, for a given sample, the method of least squares provides us with
 unique estimates of β1 and β2 that give the smallest possible value of Σûi².
• The process of differentiation yields the following equations for estimating β1 and β2.
• Differentiating Eq. (3.2.2) partially with respect to β̂1 and β̂2 and setting the derivatives to zero, we obtain
• ∂Σûi²/∂β̂1 = −2Σ(Yi − β̂1 − β̂2Xi) = 0
• ∂Σûi²/∂β̂2 = −2Σ(Yi − β̂1 − β̂2Xi)Xi = 0
• Simplifying gives the normal equations below:

 ΣYi = nβ̂1 + β̂2ΣXi ----------------------------------------------------- (3.2.3)
 ΣXiYi = β̂1ΣXi + β̂2ΣXi² --------------------------------------------- (3.2.4)

• where n is the sample size.
• These simultaneous equations are known as the normal equations.
• Solving the normal equations simultaneously, we obtain
• β̂2 = Σxiyi / Σxi² ---------------- (3.2.5)
• β̂1 = Ȳ − β̂2X̄ ---------------- (3.2.6)
• where X̄ and Ȳ are the sample means of X and Y, and
• where we define xi = (Xi − X̄) and yi = (Yi − Ȳ).
• The lowercase letters in the formulas denote deviations from mean values.
• The estimator β̂2 can alternatively be expressed as
• β̂2 = Σxiyi / Σxi² = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² = (nΣXiYi − ΣXiΣYi) / (nΣXi² − (ΣXi)²) --------------- (3.2.7)
• The estimators obtained previously are known as the least-squares estimators.
• The regression line equation is Ŷi = β̂1 + β̂2Xi.

Interpretation of estimates

1. Estimated intercept, β̂1:
 the estimated average value of the dependent variable when the independent variable takes on the value zero.
2. Estimated slope, β̂2:
 the estimated change in the average value of the dependent variable when the independent variable increases by one unit.
3. Ŷi gives the average relationship between Y and X, i.e.
 Ŷi is the average value of Y given Xi.
Example 1

• A random sample of ten families had the following income and food expenditure (in $ per week):

Families:                A   B   C   D   E   F   G   H   I   J
Family income (X):       20  30  33  40  15  13  26  38  35  43
Family expenditure (Y):  7   9   8   11  5   4   8   10  9   10

• Estimate the regression line of food expenditure on income and interpret your results (a worked sketch follows).
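
A worked sketch of Example 1 using the hypothetical ols_fit function above; the figures in the comments are our own back-of-the-envelope calculation and should be verified:

```python
# Example 1: regress food expenditure (Y) on income (X) with ols_fit from above.
X = [20, 30, 33, 40, 15, 13, 26, 38, 35, 43]
Y = [7, 9, 8, 11, 5, 4, 8, 10, 9, 10]

b1, b2 = ols_fit(X, Y)
print(f"Y-hat = {b1:.4f} + {b2:.4f} X")
# Roughly Y-hat = 2.1727 + 0.2023 X: each extra $1 of weekly income is associated
# with about 20 cents more estimated average weekly food expenditure.
```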
Numerical properties of estimators obtained by the method of
OLS.

1. The OLS estimators are expressed solely in terms of the


observable (i.e., sample) quantities (i.e., X and Y). Therefore, they
can be easily computed.
2. They are point estimators.
3. Once the OLS estimates are obtained from the sample data, the
sample regression line can be easily obtained.
Regression line properties

The regression line thus obtained has the following properties:

1. It passes through the sample means of Y and X.
2. The mean value of the estimated Y (= Ŷi) is equal to the mean value of the actual Y, i.e. mean(Ŷi) = Ȳ.
3. The mean value of the residuals ûi is zero.
4. The residuals ûi are uncorrelated with the predicted Yi.
5. The residuals ûi are uncorrelated with Xi.
6. The sample regression Yi = β̂1 + β̂2Xi + ûi can be expressed in an alternative form where both Y and X are expressed as deviations from their mean values,
 i.e. yi = β̂2xi + ûi.
 The SRF can also be written as ŷi = β̂2xi, whereas in the original units of measurement it was Ŷi = β̂1 + β̂2Xi.
 These equations are called the deviation form.
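
Properties 1–5 can be verified numerically on any sample. A short self-contained sketch, reusing the Example 1 data; the checks rely on standard NumPy tolerances:

```python
import numpy as np

# Check the regression line properties on the Example 1 sample.
X = np.array([20, 30, 33, 40, 15, 13, 26, 38, 35, 43], dtype=float)
Y = np.array([7, 9, 8, 11, 5, 4, 8, 10, 9, 10], dtype=float)

x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X
u_hat = Y - Y_hat                                   # residuals

assert np.isclose(b1 + b2 * X.mean(), Y.mean())     # 1: line passes through (X-bar, Y-bar)
assert np.isclose(Y_hat.mean(), Y.mean())           # 2: mean(Y-hat) = mean(Y)
assert np.isclose(u_hat.mean(), 0.0)                # 3: residuals average to zero
assert np.isclose((u_hat * Y_hat).sum(), 0.0)       # 4: residuals orthogonal to Y-hat
assert np.isclose((u_hat * X).sum(), 0.0)           # 5: residuals orthogonal to X
```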
3.3 PRECISION OR STANDARD ERRORS OF LEAST-SQUARES ESTIMATES

• It is evident that least-squares estimates are a function of the sample data.
• But since the data are likely to change from sample to sample, the estimates will change.
• Therefore, what is needed is some measure of the “reliability” or precision of the estimators β̂1 and β̂2.
• In statistics the precision of an estimate is measured by its standard error (SE). The SEs of the OLS estimates can be obtained as follows:

 var(β̂2) = σ² / Σxi² ------------------------------------------------------------------------------- (3.3.1)
 SE(β̂2) = σ / √(Σxi²) --------------------------------------------------------------------- (3.3.2)
 var(β̂1) = (ΣXi² / (nΣxi²)) σ² ---------------------------------------------------------------------------- (3.3.3)
 SE(β̂1) = √(ΣXi² / (nΣxi²)) σ ----------------------------------------------------------------- (3.3.4)
• Where:
 var = variance,
 SE = standard error, and
 σ² is the constant or homoscedastic variance of ui.
• All the quantities entering into the preceding equations except σ² can be estimated from the data.
• σ² itself is estimated by the following formula:
 σ̂² = Σûi² / (n − 2)
• where σ̂² is the OLS estimator of the true but unknown σ², and
• the expression n − 2 is known as the number of degrees of freedom (df),
• Σûi² being the sum of the residuals squared, or the residual sum of squares (RSS).
• Once Σûi² is known, σ̂² can be easily computed.
• Σûi² itself can be computed either from (3.2.2), i.e.
 Σûi² = Σ(Yi − Ŷi)²
 = Σ(Yi − β̂1 − β̂2Xi)²
• or from the following expression:
 Σûi² = Σyi² − β̂2²Σxi² --------------------------- (3.3.5)
• This is easy to use,
 for it does not require computing ûi for each observation.
• Since β̂2 = Σxiyi / Σxi², (3.3.5) can also be written as
 Σûi² = Σyi² − (Σxiyi)² / Σxi² ------------------ (3.3.6)
• Note that the positive square root of σ̂²,
 σ̂ = √(Σûi² / (n − 2)) --------------(3.3.7)
 is known as the standard error of estimate or the standard error of the regression (SE).
• It is simply the standard deviation of the Y values about the estimated regression line and is
 often used as a summary measure of the “goodness of fit” of the estimated regression line.
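
A sketch of formulas (3.3.1)–(3.3.7) in code, with σ² replaced by its estimate σ̂² as is done in practice; the function and variable names are our own:

```python
import numpy as np

def ols_precision(X, Y):
    """Sketch of Eqs. (3.3.1)-(3.3.7): se(b1), se(b2) and the SE of the regression."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    n = X.size
    x = X - X.mean()
    b2 = (x * (Y - Y.mean())).sum() / (x ** 2).sum()
    b1 = Y.mean() - b2 * X.mean()
    rss = ((Y - b1 - b2 * X) ** 2).sum()          # residual sum of squares, sum(u-hat^2)
    sigma2_hat = rss / (n - 2)                    # sigma2-hat = RSS / df, with df = n - 2
    se_b2 = np.sqrt(sigma2_hat / (x ** 2).sum())                         # (3.3.1)-(3.3.2)
    se_b1 = np.sqrt((X ** 2).sum() * sigma2_hat / (n * (x ** 2).sum()))  # (3.3.3)-(3.3.4)
    se_reg = np.sqrt(sigma2_hat)                  # standard error of the regression (3.3.7)
    return se_b1, se_b2, se_reg
```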
Features of the variances (standard errors) of β̂1 and β̂2

1. The variance of β̂2 is
 directly proportional to σ²
 but inversely proportional to Σxi².

2. The variance of β̂1 is
 directly proportional to σ² and ΣXi²,
 but inversely proportional to Σxi² and the sample size n.
NUMBER OF DEGREES OF FREEDOM

• Number of degrees of freedom means


 the total number of observations in the sample (= n) less the
number of independent (linear) constraints or restrictions put
on them.
 In other words, it is the number of independent observations
out of a total of n observations.
 The general rule is this: df = (n − number of parameters
estimated).
A NUMERICAL EXAMPLE

• We illustrate the econometric theory developed so far by considering the Keynesian consumption function discussed in the Introduction.
• As a test of the Keynesian consumption function, we use the sample data of Table 2.2a, which for convenience is reproduced as Table 3.2.
Table 3.2: hypothetical data on weekly family consumption
expenditure Y and weekly family income X

Y($) X($)
70 80
65 100
90 120
95 140
110 160
115 180
120 200
140 220
155 240
150 260
Table 3.3: Raw data based on Table 3.2 (where xi = Xi − X̄ and yi = Yi − Ȳ)

Yi     Xi     YiXi     Xi²      xi    yi    xi²    xiyi    Ŷi          ûi
70     80     5600     6400     −90   −41   8100   3690    65.1818     4.8181
65     100    6500     10000    −70   −46   4900   3220    75.3636     −10.3636
90     120    10800    14400    −50   −21   2500   1050    85.5454     4.4545
95     140    13300    19600    −30   −16   900    480     95.7272     −0.7272
110    160    17600    25600    −10   −1    100    10      105.9090    4.0909
115    180    20700    32400    10    4     100    40      116.0909    −1.0909
120    200    24000    40000    30    9     900    270     125.2727    −6.2727
140    220    30800    48400    50    29    2500   1450    136.4545    3.5454
155    240    37200    57600    70    44    4900   3080    145.6363    8.3636
150    260    39000    67600    90    39    8100   3510    156.8181    −6.8181
Sum    1110   1700   205500   322000  0     0      33000  16800   1109.9995 (≈ 1110.0)  0

• β̂1 = 24.4545, var(β̂1) = 41.1370 and se(β̂1) = 6.4138
• β̂2 = 0.5091, var(β̂2) = 0.0013 and se(β̂2) = 0.0357
• cov(β̂1, β̂2) = −0.2172, σ̂² = 42.1591
• r² = 0.9621, r = 0.9809, df = 8

• The estimated regression line therefore is
• Ŷi = 24.4545 + 0.5091Xi
• The estimated regression line is interpreted as follows:
• Each point on the regression line gives an estimate of the expected or mean value of Y corresponding to the chosen X value; that is, Ŷi is an estimate of E(Y | Xi).
• The value of β̂2 = 0.5091, which measures the slope of the line, shows that, within the sample range of X between $80 and $260 per week, as X increases, say, by $1, the estimated increase in mean or average weekly consumption expenditure amounts to about 51 cents.
• The value of β̂1 = 24.4545, which is the intercept of the line, indicates the average level of weekly consumption expenditure when weekly income is zero.
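
The figures above are easy to reproduce; a short sketch using the Table 3.2 data, with the inline comments restating the sums from Table 3.3:

```python
import numpy as np

# Reproduce the Table 3.2 regression of consumption (Y) on income (X).
Y = np.array([70, 65, 90, 95, 110, 115, 120, 140, 155, 150], dtype=float)
X = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)

x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x ** 2).sum()   # 16800 / 33000 = 0.5091
b1 = Y.mean() - b2 * X.mean()                      # 111 - 0.5091 * 170 = 24.4545
print(f"Y-hat = {b1:.4f} + {b2:.4f} X")            # Y-hat = 24.4545 + 0.5091 X
```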
PROPERTIES OF LEAST-SQUARES ESTIMATORS: THE GAUSS–MARKOV THEOREM

• An estimator, say the OLS estimator β̂2, is said to be a best linear unbiased estimator (BLUE) of β2 if the following hold:
• It is linear, that is, a linear function of a random variable, such as the dependent variable Y in the regression model.
• It is unbiased, that is, its average or expected value, E(β̂2), is equal to the true value, β2.
• It has minimum variance in the class of all such linear unbiased estimators; an unbiased estimator with the least variance is known as an efficient estimator.
• In the regression context it can be proved that the OLS estimators (β̂1, β̂2) are BLUE.
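
The unbiasedness claim can be illustrated (though not proved) by simulation: draw many samples with the X values held fixed and fresh errors each time, and average the resulting β̂2 values. A sketch with hypothetical true parameters:

```python
import numpy as np

# Monte Carlo illustration of unbiasedness: the average of b2-hat across
# repeated samples should be close to the true beta2.
rng = np.random.default_rng(0)
beta1, beta2, sigma = 24.0, 0.5, 6.0      # hypothetical true parameters
X = np.arange(80.0, 280.0, 20.0)          # fixed X values in repeated sampling
x = X - X.mean()

estimates = []
for _ in range(10_000):
    Y = beta1 + beta2 * X + rng.normal(0.0, sigma, size=X.size)
    estimates.append((x * (Y - Y.mean())).sum() / (x ** 2).sum())

print(np.mean(estimates))   # close to 0.5, as unbiasedness predicts
```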
3.5 REGRESSION THROUGH THE ORIGIN

• There are occasions when the two-variable PRF assumes the following form:
• Yi = β2Xi + ui -------------------------------------------------- (3.5.1)
• In this model the intercept term is absent or zero, hence the name regression through the origin.
• How do we estimate models like (3.5.1)? To answer this question, let us first write the SRF of (3.5.1), namely,
• Yi = β̂2Xi + ûi ---------- (3.5.2)
• Now applying the OLS method to (3.5.2), we obtain the following formulas for β̂2 and its variance:
• β̂2 = ΣXiYi / ΣXi²
• var(β̂2) = σ² / ΣXi²
• where σ² is estimated by σ̂² = Σûi² / (n − 1)
• The differences between the two sets of formulas should be obvious: the df for computing σ̂² is (n − 1) in the model without intercept and (n − 2) in the model with intercept.
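
A minimal sketch of the through-the-origin estimator, for comparison with the ols_fit sketch above; the function name is our own:

```python
import numpy as np

def ols_through_origin(X, Y):
    """Sketch of OLS for the no-intercept model Yi = beta2*Xi + ui."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    b2 = (X * Y).sum() / (X ** 2).sum()   # beta2-hat = sum(Xi Yi) / sum(Xi^2), raw sums
    rss = ((Y - b2 * X) ** 2).sum()
    sigma2_hat = rss / (X.size - 1)       # df = n - 1: only one parameter is estimated
    var_b2 = sigma2_hat / (X ** 2).sum()
    return b2, var_b2
```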
