Chapter 2
Introduction
• Regression analysis is concerned with describing and evaluating the
relationship between a given variable (often called the dependent
variable) and one or more variables which are assumed to influence it
(often called independent or explanatory variables).
• The simplest economic relationship is represented through a two-variable
model (also called the simple linear regression model), which is given
by:
Y = a + bX
where a and b are unknown parameters (also called regression coefficients)
that we estimate using sample data. Here Y is the dependent variable and X is
the independent variable.
Why do we need to include the stochastic (random) component, for example in the consumption function?
• Incompleteness of theory: the theory determining the behavior of Y may be, and
often is, incomplete. We might know for certain that weekly income X influences
consumption Y, but we might be ignorant or unsure about the
other variables affecting Y. Therefore, ui may be used as a substitute for all the
omitted variables.
• Intrinsic randomness in human behavior: even if we succeed
in introducing all the relevant variables into the model, there is bound to be
some randomness in Y that cannot be explained no
matter how hard we try. The disturbances, the u's, may very well reflect this
intrinsic randomness.
• Error of aggregation: the sum of the parts may be different from the
whole.
Cont….
• Sampling errors: consider, for example, the relationship between consumption (Y)
and income (X) of HHs. The sample we randomly choose to examine the relationship may
not be fully representative, and the estimates of α
and β from this sample may not be as good as those from a balanced sample
group.
Cont….
• Assumption 1: This assumption states that the relationship between Yi and Xi is linear,
and that the deterministic component (α + βXi) and the stochastic component
(εi) are additive.
Cont…
Assumption 2: The mean of u in any particular period is zero: E(εi) = 0.
This means that for each value of X, ε may assume various values,
some greater than zero and some smaller than zero, but if we consider
all the possible values of ε for any given value of X, they would have an
average value equal to zero.
Assumption 3: The variance of εi is constant in each period
(homoscedasticity): Var(εi) = σ² for all i.
The variance of εi about its mean is constant at all values of X. In
other words, for all values of X, the ε's will show the same dispersion
around their mean.
Assumption 4: The variable εi has a normal distribution with mean
zero and variance σ² for all i (often written as: εi ~ N(0, σ²)).
Cont…
• Assumption 5: The random terms of different observations (εi, εj) are
independent (no error autocorrelation):
Cov(εi, εj) = E(εi εj) = 0 for i ≠ j.
• This means that the covariances of any εi with any other εj are
equal to zero. The value which the random term assumed in one
period does not depend on the value which it assumed in any other
period.
Assumption 6: u is independent of the explanatory variable(s).
• The disturbance term is not correlated with the explanatory
variable(s). The u's and the X's do not tend to vary together; their
covariance is zero:
Cov(X, u) = E{[Xi − E(Xi)][ui − E(ui)]} = E{[Xi − E(Xi)] ui} = 0.
Cont…
• Other assumptions
Assumption 8: the explanatory variables are not perfectly linearly correlated.
• If there is more than one explanatory variable in the r/ship it is assumed
that they are not perfectly correlated with each other. Indeed the
repressors should not even be strongly correlated, they should not be
highly multicollinear.
Assumption 9: the macro variables should be correctly aggregated.
• Usually the variables X and Y are aggregative variables, representing the
sum of individual items. For example, in consumption function
C=bo+biY+u, C is the sum of the expenditures of all consumers and Y is the
sum of all individual incomes.it is assumed that the appropriate
aggregation procedure has been adopted in compiling the aggregate
variables.
13
Cont…
• Assumption 10: The relationship being estimated is identified.
• It is assumed that the relationship whose coefficients we want to estimate
has a unique mathematical form, that is, it does not contain the same
variables as any other equation related to the one being investigated.
Assumption 11: The relationship is correctly specified (no specification bias or
error).
Assumption 12: The number of observations n must be greater than
the number of parameters to be estimated.
Methods of estimation
• Specifying the model and stating its underlying assumptions are the
first stage of any econometric application. The next step is the
estimation of the numerical values of the parameters of economic
relationships. The parameters of the simple linear regression model
can be estimated by various methods. Three of the most commonly
used methods are:
• Ordinary least square method (OLS)
• Maximum likelihood method (MLM)
• Method of moments (MM)
• Here, however, we will deal only with the OLS and the ML methods of
estimation.
The Ordinary Least Squares (OLS) Method of Estimation
• In the regression model Yi = α + βXi + εi, the values of the
parameters α and β are not known. When they are estimated from a
sample of size n, we obtain the sample regression line given by:
Ŷi = α̂ + β̂Xi,  i = 1, 2, ..., n
where α̂ and β̂ are estimates of α and β, and Ŷi is the
estimated value of Y.
• The dominating and powerful estimation method of the parameters
(or regression coefficients) α and β is the method of least squares.
• The deviations between the observed and estimated values of Y are
called the residuals ε̂i, that is:
ε̂i = Yi − Ŷi,  i = 1, 2, ..., n
Cont….
• The magnitude of the residuals is the vertical distance between the
actual observed points and the estimating line (see the figure below).
• The estimating line will have a 'good fit' if it minimizes the error between the
estimated points on the line and the actual observed points that were used to draw it.
SSE = Σε̂i² = Σ(Yi − Ŷi)² = Σ(Yi − α̂ − β̂Xi)²
∂SSE/∂α̂ = −2Σ(Yi − α̂ − β̂Xi) = 0
∂SSE/∂β̂ = −2ΣXi(Yi − α̂ − β̂Xi) = 0
Re-arranging the two equations, we get the so-called normal equations:
Cont…
ΣYi = nα̂ + β̂ΣXi
ΣXiYi = α̂ΣXi + β̂ΣXi²
Thus, we have two equations with two unknowns α̂ and β̂. Solving for α̂
and β̂, we get:
β̂ = [nΣXiYi − (ΣXi)(ΣYi)] / [nΣXi² − (ΣXi)²] = [ΣXiYi − nX̄Ȳ] / [ΣXi² − nX̄²]
α̂ = Ȳ − β̂X̄
where X̄ and Ȳ are the mean values of the independent and dependent variables, respectively,
that is, X̄ = (1/n)ΣXi and Ȳ = (1/n)ΣYi.
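The closed-form solution above translates directly into code. A minimal sketch in plain Python (function and variable names are my own, not from the slides):

```python
# OLS estimates for the two-variable model Y = a + bX + u, using the
# closed-form formulas derived from the normal equations:
#   b_hat = (n*SumXY - SumX*SumY) / (n*SumXX - (SumX)^2)
#   a_hat = Ybar - b_hat * Xbar

def ols_simple(X, Y):
    n = len(X)
    sum_x, sum_y = sum(X), sum(Y)
    sum_xy = sum(x * y for x, y in zip(X, Y))
    sum_xx = sum(x * x for x in X)
    b_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a_hat = sum_y / n - b_hat * sum_x / n
    return a_hat, b_hat

# Points generated exactly on the line Y = 2 + 3X recover the parameters.
X = [1, 2, 3, 4, 5]
Y = [2 + 3 * x for x in X]
a, b = ols_simple(X, Y)   # a -> 2.0, b -> 3.0
```

Data generated exactly on a line recover the true intercept and slope, which is a quick sanity check for the formulas.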
Cont…
are said to be the ordinary least-square (OLS) estimators of α
and
and β, respectively. The line Y X is called the least square line or
i
Y X u ............(2)
Subtracting equation (2) from (1) we get
Yi Y ( X i X ) (ui u ).........(3)
Letting xi X i X , yi Yi Y and i (ui u ), we can write equation (3) as
yi xi i .........(4) 20
Cont…
• Equation (4) is the simple regression (or two-variable) model in
deviations form.
• The OLS estimator of β from equation (4) is given by:
β̂ = Σxiyi / Σxi²
Cont…
• In other words, the Total Sum of Squares (TSS) is decomposed into the
Regression (explained) Sum of Squares (RSS) and the Error (residual or
unexplained) Sum of Squares (ESS):
TSS = RSS + ESS
Computational formulas
• The TSS is a measure of dispersion of the observed values of Y about
their mean. It is computed as:
TSS = Σ(Yi − Ȳ)² = Σyi²
• The regression (explained) sum of squares (RSS) measures the amount
of the total variability in the observed values of Y that is accounted for
by the linear relationship between the observed values of X and Y. It is
computed as:
RSS = Σ(Ŷi − Ȳ)² = β̂²Σ(Xi − X̄)² = β̂²Σxi²
Cont…
• The error (residual or unexplained) sum of squares (ESS) measures
the amount of the total variability in the observed values of Y about
the regression line. It is computed as:
ESS = Σ(Yi − Ŷi)² = TSS − RSS
• If a regression equation does a good job of describing the relationship
between two variables, the explained sum of squares should
constitute a large portion of the total sum of squares.
• Thus, it would be of interest to determine the magnitude of this
proportion by computing the ratio of the explained sum of squares
to the total sum of squares. This proportion is called the sample
coefficient of determination, R². That is:
R² = RSS/TSS = 1 − ESS/TSS
R² = (Σxiyi)² / (Σxi² Σyi²), where xi = Xi − X̄ and yi = Yi − Ȳ.
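The decomposition and R² can be verified numerically. A small sketch (helper names and the illustrative data points are mine) that fits a line by OLS and computes TSS, RSS and ESS as defined above:

```python
# Fit Y = a + bX by OLS, then decompose TSS into RSS (explained) and
# ESS (residual), following the slides' convention TSS = RSS + ESS.

def decompose(X, Y):
    n = len(X)
    xbar, ybar = sum(X) / n, sum(Y) / n
    sxx = sum((x - xbar) ** 2 for x in X)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
    b_hat = sxy / sxx
    a_hat = ybar - b_hat * xbar
    fitted = [a_hat + b_hat * x for x in X]
    tss = sum((y - ybar) ** 2 for y in Y)               # total variation
    rss = sum((f - ybar) ** 2 for f in fitted)          # explained variation
    ess = sum((y - f) ** 2 for y, f in zip(Y, fitted))  # residual variation
    return tss, rss, ess

X = [1, 2, 3, 4]
Y = [2.1, 3.9, 6.2, 7.8]
tss, rss, ess = decompose(X, Y)
r2 = rss / tss   # for an OLS fit, tss == rss + ess holds (up to rounding)
```

For these nearly collinear points R² comes out close to 1, since almost all of the variation in Y is explained by the fitted line.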
Cont…
• Note
1) The proportion of total variation in the dependent variable (Y)
that is explained by changes in the independent variable (X) or by
the regression line is equal to: R² × 100%.
2) The proportion of total variation in the dependent variable (Y)
that is due to factors other than X (e.g., due to excluded variables,
chance, etc.) is equal to: (1 − R²) × 100%.
To test for the significance of R², we compare the variance ratio with the critical value from the F distribution with 1
and (n − 2) degrees of freedom in the numerator and denominator, respectively, for a given significance level α.
Decision: if the calculated variance ratio exceeds the tabulated value, that is, if
Fcal > Fα(1, n − 2), we then conclude that R² is significant (or that the linear regression
model is adequate).
Cont…
Note: the F test is designed to test the significance of all variables or a set of variables in
a regression model. In the two-variable model, however, it is used to test the explanatory
power of a single variable (X), and at the same time, is equivalent to the test of
significance of R².
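In the two-variable model the variance ratio can be written purely in terms of R²: F = (RSS/1)/(ESS/(n−2)) = R²(n−2)/(1−R²). A quick sketch, using as inputs the R² and n of the electricity example that follows:

```python
# Variance ratio for testing the significance of R^2 in the two-variable
# model: F = (RSS/1) / (ESS/(n-2)) = R^2 * (n-2) / (1 - R^2).

def f_from_r2(r2, n):
    return r2 * (n - 2) / (1.0 - r2)

f = f_from_r2(0.4978, 16)   # about 13.88
```

This exceeds the 5% critical value F(1, 14) ≈ 4.60 (the square of t0.025(14) ≈ 2.145), so R² would be judged significant.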
Illustrative Example
Consider the following data on the percentage rate of change in electricity consumption
(millions KWH) (Y) and the rate of change in the price of electricity (Birr/KWH) (X) for the
years 1979-1994.
Summary statistics (note here that xi = Xi − X̄ and yi = Yi − Ȳ):
n = 16, X̄ = 1.280625, Ȳ = 23.42688,
Σxi² = 92.20109, Σyi² = 13228.7, Σxiyi = −779.235
Cont…
• Estimation of regression coefficients
β̂ = Σxiyi / Σxi² = −779.235 / 92.20109 = −8.45147
α̂ = Ȳ − β̂X̄ = 23.42688 − (−8.45147)(1.280625) = 34.25004
Therefore, the estimated regression equation is:
Ŷ = α̂ + β̂X = 34.25004 − 8.45147X
RSS = Σ(Ŷi − Ȳ)² = β̂²Σ(Xi − X̄)² = β̂²Σxi² = (−8.45147)²(92.20109) = 6585.679
Cont…
ESS = TSS − RSS = 13228.7 − 6585.679 = 6643.016
R² = RSS/TSS = 6585.679/13228.7 = 0.4978
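These results can be reproduced from the summary statistics alone. A sketch in Python (the minus sign on Σxiyi is implied by the negative slope):

```python
# Reproduce the electricity example from the summary statistics.
n, xbar, ybar = 16, 1.280625, 23.42688
sxx, syy, sxy = 92.20109, 13228.7, -779.235  # sums of x^2, y^2, xy in deviations

beta_hat = sxy / sxx                 # about -8.45147
alpha_hat = ybar - beta_hat * xbar   # about 34.2500
rss = beta_hat ** 2 * sxx            # explained SS, about 6585.68
ess = syy - rss                      # residual SS, about 6643.02
r2 = rss / syy                       # about 0.4978
```

Small differences in the last digit relative to the slides come from rounding of the intermediate values.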
Estimation of the standard error of β̂ and test of its significance
• An unbiased estimator of the error variance σ² is given by:
σ̂² = (1/(n − 2)) Σε̂i² = ESS/(n − 2) = 6643.016/(16 − 2) = 474.5011
Thus, an unbiased estimator of Var(β̂) is given by:
V̂(β̂) = σ̂²/Σxi² = 474.5011/92.20109 = 5.146372
The standard error of β̂ is:
s.e.(β̂) = √V̂(β̂) = √5.146372 = 2.268562
The hypothesis of interest is:
H0: β = 0
H1: β ≠ 0
We calculate the test statistic:
t = β̂/s.e.(β̂) = −8.45147/2.268562 = −3.72548
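The variance estimate, standard error and t statistic can be checked the same way (values taken from the slides):

```python
import math

# Standard error of beta_hat and its t statistic for H0: beta = 0.
n, sxx = 16, 92.20109
ess = 6643.016          # residual sum of squares
beta_hat = -8.45147

sigma2_hat = ess / (n - 2)       # about 474.501
var_beta = sigma2_hat / sxx      # about 5.14637
se_beta = math.sqrt(var_beta)    # about 2.26856
t_stat = beta_hat / se_beta      # about -3.7255
```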
Cont…
• For α = 0.05, the critical value from the Student's t distribution with (n − 2) = 14
degrees of freedom is:
tα/2(n − 2) = t0.025(14) = 2.145
• Decision: Since |t| = 3.72548 > 2.145, we reject the null hypothesis and conclude that β
is significantly different from zero. In other words, the price of electricity
significantly and negatively affects electricity consumption.
• The interpretation of the estimated regression coefficient β̂ = −8.45147 is that for a
one percent drop (increase) in the growth rate of the price of electricity, there is
an 8.45 percent increase (decrease) in the growth rate of electricity
consumption.
Properties of OLS Estimators
• The ideal or optimum properties that the OLS estimates possess may be
summarized by the well-known Gauss-Markov Theorem.
• According to this theorem, under the basic assumptions of the classical linear
regression model, the least squares estimators are linear, unbiased and have
minimum variance (i.e., are best of all linear unbiased estimators).
• Sometimes the theorem is referred to as the BLUE theorem, i.e., Best, Linear,
Unbiased Estimator. An estimator is called BLUE if it is:
• Linear: a linear function of a random variable, such as the dependent
variable Y.
• Unbiased: its average or expected value is equal to the true population
parameter.
• Minimum variance: it has minimum variance in the class of linear
unbiased estimators. An unbiased estimator with the least variance is known as
an efficient estimator.
Cont….
a) β̂ is a linear estimator:
β̂ = Σxiyi/Σxi² = Σaiyi
where ai = xi/Σxi², xi = Xi − X̄, and yi = Yi − Ȳ. Thus, we can see that β̂ is linear
in the observations on the dependent variable.
Cont….
b) To show that β̂ is an unbiased estimator of β:
Note: An estimator β̂ of β is said to be unbiased if E(β̂) = β.
Consider the model in deviations form: yi = βxi + εi. Then
β̂ = Σxiyi/Σxi² = Σxi(βxi + εi)/Σxi² = βΣxi²/Σxi² + Σxiεi/Σxi² = β + Σxiεi/Σxi²   (*)
Now we have
E(β̂) = β + E(Σxiεi)/Σxi²   (since β is constant)
E(Σxiεi) = ΣxiE(εi) = Σxi(0) = 0   (since xi is non-stochastic and E(εi) = 0)
Thus,
E(β̂) = β + 0/Σxi² = β,
so β̂ is an unbiased estimator of β.
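The unbiasedness result can be illustrated with a small Monte Carlo experiment (a sketch; the data-generating values α = 1 and β = 2 are arbitrary assumptions of mine): averaging β̂ over many samples drawn with E(ε) = 0 gives a value close to the true β.

```python
import random

# Monte Carlo check of unbiasedness: with E(eps) = 0, the average of beta_hat
# across repeated samples should be close to the true beta (here 2.0).
random.seed(0)
true_alpha, true_beta, n, reps = 1.0, 2.0, 50, 2000
X = [random.uniform(0, 10) for _ in range(n)]   # fixed (non-stochastic) regressor
xbar = sum(X) / n
sxx = sum((x - xbar) ** 2 for x in X)

estimates = []
for _ in range(reps):
    Y = [true_alpha + true_beta * x + random.gauss(0, 1) for x in X]
    ybar = sum(Y) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
    estimates.append(sxy / sxx)

mean_beta = sum(estimates) / reps   # close to 2.0
```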
Cont….
c) To show that β̂ has the smallest variance out of all linear unbiased estimators of β:
Note:
1. The OLS estimators α̂ and β̂ are calculated from a specific sample of observations of
the dependent and independent variables. If we consider a different sample of observations,
α̂ and β̂ may vary from one sample to another, and hence are random variables.
2. The variance of an estimator (a random variable) β̂ is given by:
Var(β̂) = E[β̂ − E(β̂)]² = E(β̂ − β)²
3. The expression (Σxi)² can be written in expanded form as:
(Σxi)² = Σxi² + Σ xixj (summing over all i ≠ j)
This is simply the sum of squares (Σxi²) plus the sum of cross-product terms (xixj for i ≠ j).
Cont….
• Note that (**) follows from the assumptions that the error terms have constant
variance and no autocorrelation, that is, E(εi²) = σ² for all i and E(εiεj) = 0 for all i ≠ j.
• We have seen above (in proof (a)) that the OLS estimator of β can be
expressed as:
β̂ = Σaiyi, where ai = xi/Σxi²
• Now let β* be another linear unbiased estimator of β given by:
β* = Σciyi, where ci = ai + di
Cont….
β* = Σciyi = Σ(xi/Σxi² + di)(βxi + εi)   (since yi = βxi + εi)
= β + βΣdixi + Σxiεi/Σxi² + Σdiεi
For β* to be unbiased we require E(β*) = β, which (given E(εi) = 0) means Σdixi = 0.
The variance of β* is then given by:
Var(β*) = σ²Σci² = σ²Σ(ai + di)² = σ²Σai² + σ²Σdi²   (since Σaidi = Σdixi/Σxi² = 0)
= Var(β̂) + σ²Σdi² ≥ Var(β̂)
Thus β̂ has the smallest variance in the class of linear unbiased estimators.
The Confidence Interval Approach
to Hypothesis Testing
• A confidence interval is a range of values of a sample statistic that is likely (at a given level of
probability, i.e., a confidence level) to contain the population parameter.
• It is the interval that will include the population parameter a certain percentage (= the confidence
level) of the time.
• An example of its usage: We estimate a parameter, say β̂, to be 0.93, and a "95% confidence
interval" to be (0.77, 1.09). This means that we are 95% confident that the interval
contains the true (but unknown) value of β.
The steps are:
1. Calculate α̂, β̂ and SE(α̂), SE(β̂) as before.
2. Choose a significance level, α (again the convention is 5%). This is equivalent to choosing
a (1 − α)×100% confidence interval, i.e., 5% significance level = 95% confidence interval.
3. Use the t-tables to find the appropriate critical value, which will again have T − 2 degrees of
freedom.
4. The confidence interval is given by:
(β̂ − tcrit·SE(β̂), β̂ + tcrit·SE(β̂))
5. Perform the test: If the hypothesised value of β (β*) lies outside the confidence interval,
then reject the null hypothesis that β = β*; otherwise do not reject the null.
Confidence Intervals Versus Tests of Significance
• Note that the Test of Significance and Confidence Interval approaches always give
the same answer.
• Under the test of significance approach, we would not reject H0: β = β* if the test
statistic lies within the non-rejection region, i.e., if
−tcrit ≤ (β̂ − β*)/SE(β̂) ≤ +tcrit
• Rearranging, we would not reject if
−tcrit·SE(β̂) ≤ β̂ − β* ≤ +tcrit·SE(β̂)
β̂ − tcrit·SE(β̂) ≤ β* ≤ β̂ + tcrit·SE(β̂)
• But this is just the rule under the confidence interval approach.
Constructing Tests of Significance and
Confidence Intervals: An Example
• Using both the test of significance (standard error test) and confidence interval
approaches, test the hypothesis that β = 1 against a two-sided alternative.
• The first step is to obtain the critical value. We want tcrit = t20;5% = ±2.086.
Determining the Rejection Region
[Figure: f(x), the t distribution with rejection regions beyond the critical values −2.086 and +2.086]
Performing the Test
The 5% confidence interval is:
β̂ ± tcrit·SE(β̂) = 0.5091 ± 2.086 × 0.2561 = (−0.0251, 1.0433)
Since the hypothesised value β* = 1 lies inside this interval, we do not reject H0 at the 5% level.
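The equivalence of the two approaches can be checked with the numbers above (β̂ = 0.5091, SE = 0.2561, tcrit = 2.086):

```python
# Confidence interval and t test for H0: beta = 1, using the example's numbers.
beta_hat, se, t_crit = 0.5091, 0.2561, 2.086   # t(20) at the 5% two-sided level
ci = (beta_hat - t_crit * se, beta_hat + t_crit * se)  # about (-0.0251, 1.0433)

t_stat = (beta_hat - 1.0) / se          # about -1.917
in_ci = ci[0] <= 1.0 <= ci[1]           # hypothesised value inside the interval
not_rejected = abs(t_stat) <= t_crit    # same conclusion from the t test
```

Both routes lead to the same decision: do not reject H0 at the 5% level.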
Changing the Size of the Test
• But note that we have considered only a 5% size of test. In marginal cases, a
different size of test could lead to a different conclusion.
Changing the Size of the Test:
The New Rejection Regions
[Figure: f(x), the t distribution with rejection regions beyond the critical values −1.725 and +1.725]
Changing the Size of the Test:
The Conclusion
• t20;10% = 1.725. So now, as the test statistic lies in the rejection region, we would
reject H0.
• If we reject the null hypothesis at a given significance level, we say that the result of the
test is statistically significant.
The Size of the Hypothesis Test and the
Type I and Type II Errors
• While using sample statistics to draw conclusions about the parameters
of the population as a whole, there is always the possibility that the
sample collected does not accurately represent the population.
• Consequently, statistical tests carried out using such sample data may
yield incorrect results that may lead to erroneous rejection of the null
hypothesis. We have two types of errors:
Cont’d
• Type I Error
• A type I error occurs when we reject a true null hypothesis. For example, a type I
error would manifest in the form of rejecting H0: β = 0 when β is actually zero.
• Type II Error
• Type II error occurs when we fail to reject a false null hypothesis. In such a
scenario, the test provides insufficient evidence to reject the null hypothesis
when it’s false.
Cont’d
• The level of significance, denoted by α, represents the probability of making a type I
error, i.e., rejecting the null hypothesis when, in fact, it is true. It is distinct from
β, which here denotes the probability of making a type II error within the bounds
of statistical testing (not to be confused with the regression coefficient β).
• We use α to determine the critical values that subdivide the distribution into the rejection
and the non-rejection regions.
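The meaning of α as the probability of a type I error can be illustrated by simulation (a sketch under assumed settings: n = 16, normal errors, and a true β of zero): the fraction of samples in which a true H0: β = 0 is rejected should be close to α = 0.05.

```python
import math
import random

# Simulating the size of the test: with a true null (beta = 0) and a 5% critical
# value, the rejection rate (type I error rate) should be close to alpha = 0.05.
random.seed(1)
n, reps, t_crit = 16, 4000, 2.145     # t_crit = t(14) at the 5% two-sided level
X = [float(i) for i in range(n)]      # fixed regressor
xbar = sum(X) / n
sxx = sum((x - xbar) ** 2 for x in X)

rejections = 0
for _ in range(reps):
    Y = [random.gauss(0, 1) for _ in range(n)]   # beta = 0: X has no effect on Y
    ybar = sum(Y) / n
    beta_hat = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / sxx
    alpha_hat = ybar - beta_hat * xbar
    ess = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(X, Y))
    se = math.sqrt(ess / (n - 2) / sxx)
    if abs(beta_hat / se) > t_crit:
        rejections += 1

rate = rejections / reps   # close to 0.05
```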
End of chapter 2.
Thank you!