
CHAPTER TWO

SIMPLE LINEAR REGRESSION


2.1 The Concept of Regression Analysis
2.2 The Simple Linear Regression Model
2.3 The Method of Least Squares
2.4 Properties of Least-Squares Estimators and the
Gauss-Markov Theorem
2.5 Residuals and Goodness of Fit
2.6 Confidence Intervals and Hypothesis Testing in
Regression Analysis
2.7 Prediction with the Simple Linear Regression Model
2.1 The Concept of Regression Analysis
 Regression analysis is one of the most commonly
used tools in econometric work.
 Definition: Regression analysis is concerned with
describing and evaluating the relationship between a
given variable (often called the dependent variable)
and one or more variables which are assumed to
influence the given variable (often called independent
or explanatory variables).

If Y = f(X), the common terminology and notation for Y and X are as follows:

Y                        X
Explained variable       Explanatory variable
Predicted variable       Predictor
Regressand               Regressor
Response                 Stimulus variable
Endogenous variable      Exogenous variable
Dependent variable       Independent variable
Target variable          Control variable
2.2 The Simple Linear Regression Model
 The simplest economic relationship is represented
through a two-variable model (also called the simple
linear regression model) which is given by:
Y = a + bX
where a and b are unknown parameters (also called
regression coefficients) that we estimate using sample
data.
Here Y is the dependent variable and X is the
independent variable.

Example: Suppose the relationship between
expenditure (Y) and income (X) of households
is expressed as:
Y = 0.6X + 120
Here, on the basis of income, we can predict
expenditure. For instance, if the income of a
certain household is 1500 Birr, then the
estimated expenditure will be:
Expenditure = 0.6(1500) + 120 = 1020 Birr
Note that since expenditure is estimated on the
basis of income, expenditure is the dependent
variable and income is the independent
variable.
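In code, this deterministic rule is a one-line function (a minimal sketch; the function name is ours, not the note's):

```python
def predicted_expenditure(income):
    """Predicted expenditure under the deterministic rule Y = 0.6X + 120."""
    return 0.6 * income + 120.0

print(predicted_expenditure(1500))  # 1020.0 Birr, as computed above
```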
 The purely mathematical model stated above is of limited interest to the econometrician, as it assumes an exact or deterministic relationship between Y and the explanatory variable X. But relationships between economic variables are generally inexact (stochastic).
 Such a set of behavioral equations derived from an economic model is represented by an econometric model.
A simple linear regression model is a relationship between two variables related in a linear form.
 A relationship between variables has two important forms, stochastic and non-stochastic, of which we shall be using the former in econometric analysis.
A relationship between X and Y, characterized as Y = f(X), is said to be deterministic or non-stochastic if for each value of the independent variable (X) there is one and only one corresponding value of the dependent variable (Y).
 On the other hand, a relationship between X and
Y is said to be stochastic if for a particular
value of X there is a whole probabilistic
distribution of values of Y. In such a case, for
any given value of X, the dependent variable Y
assumes some specific value only with some
probability.
Let's illustrate the distinction between stochastic and non-stochastic relationships with the help of a supply function.
Assuming that the supply for a certain commodity depends on its price (other determinants taken to be constant) and that the function is linear, the relationship can be put as:

Q = f(P) = α + βP ……………………… (2.1)

The above relationship between P and Q is such that for a particular value of P, there is only one corresponding value of Q. This is, therefore, a deterministic (non-stochastic) relationship.
It implies that all the variation in Q is due solely to changes in P, and that there are no other factors affecting the dependent variable.
If this were true, all the price-quantity pairs, plotted on a two-dimensional plane, would fall on a straight line.
 However, if we gather observations on the quantity actually supplied in the market at various prices and plot them on a diagram, we see that they do not fall on a straight line.
 The deviation of the observations from the line may be attributed to several factors.
1. Omission of variables from the function
In economic reality each variable is influenced by a very large number of factors, and some of these may not be included in the function because:
a) Some of the factors may not be known.
b) Even if known, some factors cannot be measured statistically; for example, psychological factors (tastes, preferences, expectations, etc.) are not measurable.
c) Some factors are random, appearing in an unpredictable way and time, e.g. epidemics, earthquakes, etc.
d) Some factors may be omitted due to their small influence on the dependent variable.
e) Even if all factors are known, the available data may not be adequate to measure all the factors influencing the relationship.
2. Random behavior of human beings: human behavior is erratic and may deviate from the normal situation to a certain extent in an unpredictable way.
3. Imperfect specification of the mathematical form of the model
We may wrongly specify the relationship between variables, e.g. fit a linear function to non-linearly related variables, and vice versa.
4. Error of aggregation
The data on many economic variables (e.g. consumption, income) are available only in aggregate form, obtained by adding magnitudes referring to individuals whose behavior is dissimilar.
5. Error of measurement
When collecting data we may commit errors of measurement.
6. Sampling error
Consider a model relating consumption (Y) to income (X) of households. The sample we randomly choose to examine the relationship may turn out to be predominantly poor households. In such cases, our estimates of α and β from this sample may not be as good as those from a balanced sample group.
 In order to take into account the above sources of error, we introduce into econometric functions a random variable, usually denoted by the letter 'u' or 'ε', called the error term, random disturbance, or stochastic term of the function, because u is supposed to 'disturb' the exact linear relationship assumed to exist between X and Y.
By introducing this random variable into the function, the model is rendered stochastic, of the form:

Yᵢ = α + βXᵢ + uᵢ ……………………… (2.2)
 This stochastic model is a model in which the
dependent variable is not only determined by the
explanatory variable(s) included in the model but
also by others which are not included in the model.
 The true relationship which connects the variables
involved is split into two parts:
 a part represented by a line &
 a part represented by the random term ‘u’.

 The scatter of observations represents the true
relationship between Y and X.
 The line represents the exact part of the
relationship, and
 the deviation of the observation from the line
represents the random component of the
relationship.
If there were no errors in the model, we would observe all the points Y′₁, Y′₂, …, Y′ₙ on the line, corresponding to X₁, X₂, …, Xₙ. However, because of the random disturbance, we observe Y₁, Y₂, …, Yₙ corresponding to X₁, X₂, …, Xₙ. These points diverge from the regression line by u₁, u₂, …, uₙ:

Yᵢ = (α + βXᵢ) + uᵢ
(dependent variable = regression line + random variable)

The first component, α + βXᵢ, is the part of Y explained by the changes in X, and the second is the part of Y not explained by X; that is to say, the change in Y is due to the random influence of uᵢ.
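A small simulation makes the split concrete (an illustrative sketch, not from the note; the parameter values and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, sigma = 120.0, 0.6, 25.0    # illustrative values only

X = np.linspace(100.0, 1000.0, 20)       # fixed values of the regressor
line = alpha + beta * X                  # exact part: the regression line
u = rng.normal(0.0, sigma, size=X.size)  # random component 'disturbing' the line
Y = line + u                             # observed values scatter about the line

print(np.round(Y - line, 2))             # the deviations are exactly the u's
```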
The Simple Linear Stochastic Regression Model
 The simplest economic relationship is represented through a two-variable model (also called the simple linear regression model) which is given by:

Yᵢ = a + bXᵢ + Uᵢ

where a and b are unknown parameters (also called regression coefficients) that we estimate using sample data. Here Y is the dependent variable, X is the independent variable, and U is the random disturbance term.
2.2.1 Assumptions of the Classical Linear Stochastic Regression Model
 The classical econometricians made important assumptions in their analysis of regression. The most important of these are:
1. The model is linear in parameters
 The model should be linear in the parameters, regardless of whether the explanatory and dependent variables enter linearly or not. If the parameters enter non-linearly, it is difficult to estimate them from data on the dependent and independent variables.
 Example 1: Yᵢ = α + βXᵢ + uᵢ is linear in both the parameters and the variables, so it satisfies the assumption.
 Example 2: ln Y = α + β ln X + u is linear only in the parameters. Since the classicals worry only about the parameters, the model satisfies the assumption.
Exercise: Check for yourself whether the following models satisfy the above assumption:
A. ln Y² = α + β ln X² + U
B. Yᵢ = α + βXᵢ + Uᵢ
2. Uᵢ is a random real variable
This means that the value which uᵢ may assume in any one period depends on chance; it may be positive, negative, or zero. Every value has a certain probability of being assumed by uᵢ in any particular instance.
3. The mean value of the random variable (U) in any particular period is zero
This means that for each value of X, the random variable u may assume various values, some greater than zero and some smaller than zero, but if we consider all the positive and negative values of u for any given value of X, they would have an average value equal to zero. In other words, the positive and negative values of u cancel each other.
Mathematically: E(Uᵢ) = 0 ……………………… (2.3)
4. The variance of the random variable (U) is constant in each period (the assumption of homoscedasticity)
 For all values of X, the uᵢ's show the same dispersion around their mean. In Fig. 2.c this assumption is denoted by the fact that the values that uᵢ can assume lie within the same limits, irrespective of the value of X: for X₁, u can assume any value within the range AB; for X₂, uᵢ can assume any value within the range CD, which is equal to AB, and so on.

Mathematically: Var(Uᵢ) = E[Uᵢ − E(Uᵢ)]² = E(Uᵢ²) = σ², since E(Uᵢ) = 0.

This constant variance is called the homoscedasticity assumption, and the constant variance itself is called homoscedastic variance.
5. The random variable (U) has a normal distribution
This means the values of u (for each X) have a bell-shaped symmetrical distribution about their zero mean and constant variance σ², i.e.

Uᵢ ~ N(0, σ²)
6. The assumption of no autocorrelation
 The random terms of different observations (Uᵢ, Uⱼ) are independent. This means the value which the random term assumed in one period does not depend on the value which it assumed in any other period.
Algebraically:

Cov(uᵢ, uⱼ) = E[(uᵢ − E(uᵢ))(uⱼ − E(uⱼ))] = E(uᵢuⱼ) = 0
7. The Xᵢ are a set of fixed values in the hypothetical process of repeated sampling which underlies the linear regression model.
 This means that, in taking a large number of samples on Y and X, the X values are the same in all samples, but the uᵢ values do differ from sample to sample, and so of course do the values of Yᵢ.
8. The random variable (U) is independent of the explanatory variables.
 This means there is no correlation between the random variable and the explanatory variable. If two variables are unrelated, their covariance is zero, i.e. Cov(Xᵢ, Uᵢ) = 0:

Cov(Xᵢ, Uᵢ) = E[(Xᵢ − E(Xᵢ))(Uᵢ − E(Uᵢ))]
            = E[(Xᵢ − E(Xᵢ))Uᵢ]          given E(Uᵢ) = 0
            = E(XᵢUᵢ) − E(Xᵢ)E(Uᵢ)
            = E(XᵢUᵢ)
            = XᵢE(Uᵢ)                     since the Xᵢ are fixed
            = 0
9. The explanatory variables are measured without error
 Uᵢ absorbs the influence of omitted variables and possibly errors of measurement in the Yᵢ's; i.e., we assume that the regressors are error-free, while the Y values may or may not include errors of measurement.
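Before using these assumptions, here is a Monte Carlo sketch (our illustration, not part of the note) that checks assumptions 3, 4 and 8 numerically for disturbances drawn as assumed:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # fixed in repeated sampling (assumption 7)
sigma, reps = 2.0, 100_000

U = rng.normal(0.0, sigma, size=(reps, X.size))  # fresh disturbances each sample

print(U.mean(axis=0).round(2))  # ≈ 0 for every X       (assumption 3)
print(U.var(axis=0).round(2))   # ≈ σ² = 4 for every X  (assumption 4)
print((((X - X.mean()) * U).mean(axis=1)).mean().round(4))  # Cov(X, U) ≈ 0 (assumption 8)
```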
 We can now use the above assumptions to derive the following basic concepts.
A. The dependent variable Yᵢ is normally distributed, i.e. Yᵢ ~ N(α + βXᵢ, σ²).
Proof:
Mean:     E(Yᵢ) = E(α + βXᵢ + uᵢ) = α + βXᵢ, since E(uᵢ) = 0
Variance: Var(Yᵢ) = E[Yᵢ − E(Yᵢ)]²
                  = E[α + βXᵢ + uᵢ − (α + βXᵢ)]²
                  = E(uᵢ²)
                  = σ²

The shape of the distribution of Yᵢ is determined by the shape of the distribution of uᵢ, which is normal by assumption 5. Since α and β are constants, they don't affect the distribution of Yᵢ. Furthermore, the values of the explanatory variable Xᵢ are a set of fixed values by assumption 7 and therefore don't affect the shape of the distribution of Yᵢ.

Yᵢ ~ N(α + βXᵢ, σ²)
B. The successive values of the dependent variable are independent, i.e. Cov(Yᵢ, Yⱼ) = 0.
Proof:
Cov(Yᵢ, Yⱼ) = E{[Yᵢ − E(Yᵢ)][Yⱼ − E(Yⱼ)]}
            = E{[α + βXᵢ + Uᵢ − E(α + βXᵢ + Uᵢ)][α + βXⱼ + Uⱼ − E(α + βXⱼ + Uⱼ)]}
              (since Yᵢ = α + βXᵢ + Uᵢ and Yⱼ = α + βXⱼ + Uⱼ)
            = E[(α + βXᵢ + Uᵢ − α − βXᵢ)(α + βXⱼ + Uⱼ − α − βXⱼ)]   since E(uᵢ) = 0
            = E(UᵢUⱼ) = 0

Therefore, Cov(Yᵢ, Yⱼ) = 0.
Methods of Estimation for the Simple Linear Regression Model
Objective of regression analysis: to learn how the average value of the dependent variable (regressand) varies with the values of the explanatory variables (regressors):

E[Y | X] = f(X)

This function is called the conditional expectation function (CEF), or population regression function (PRF).

 The stochastic PRF, written as an econometric model, is used for empirical analysis:

Yᵢ = E[Y | Xᵢ] + εᵢ

 The stochastic disturbance term εᵢ plays a critical role in estimating the PRF.
The PRF is an idealized concept, since in practice one rarely has access to the entire population of interest. Hence, we use the stochastic sample regression function (SRF) to estimate the PRF; i.e., we use Yᵢ = Ŷᵢ + eᵢ to estimate Yᵢ = E[Y | Xᵢ] + εᵢ, where Ŷᵢ = f(Xᵢ).
where:
 Using the theoretical relationship between X and Y, Yᵢ is decomposed into a non-stochastic/systematic component α + βXᵢ and a random component uᵢ:

Yᵢ = (α + βXᵢ) + uᵢ
(dependent variable = regression line + random variable)

 This is a theoretical decomposition, because we do not know the values of α and β, or the values of ε.
 The operational decomposition of Yᵢ is with reference to the fitted line (from sample observations on Y and X): the actual value Yᵢ is equal to the fitted value plus the residual eᵢ, i.e.

Yᵢ = α̂ + β̂Xᵢ + eᵢ

 The residuals eᵢ serve a similar purpose as the stochastic term εᵢ, but the two are not identical.
 From the PRF: Yᵢ = E[Y | Xᵢ] + εᵢ ⇒ εᵢ = Yᵢ − E[Y | Xᵢ]; but E[Y | Xᵢ] = α + βXᵢ, so εᵢ = Yᵢ − α − βXᵢ.
 From the SRF: Yᵢ = Ŷᵢ + eᵢ ⇒ eᵢ = Yᵢ − Ŷᵢ; but Ŷᵢ = α̂ + β̂Xᵢ, so eᵢ = Yᵢ − α̂ − β̂Xᵢ.
[Figure: the population regression line E[Y | Xᵢ] = α + βXᵢ with observed points O₁–O₄ and their disturbances ε₁–ε₄.]

[Figure: the SRF Ŷ = α̂ + β̂X drawn alongside the PRF E[Y | Xᵢ] = α + βXᵢ. For each observation the disturbance εᵢ (from the PRF) and the residual eᵢ (from the SRF) differ: here ε₁ < e₁, ε₂ = e₂, ε₃ < e₃, ε₄ > e₄, so εᵢ and eᵢ are not identical.]
 Our sample is only one of a large number of possibilities.
 Implication: the SRF line is just one of the possible SRFs. Each SRF line has unique α̂ and β̂ values.
 Then, which of these lines should we choose?
 Generally we will look for the SRF which is very close to the (unknown) PRF.
 We need a rule that makes the SRF as close as possible to the observed data points.
 But how can we devise such a rule? Equivalently, how can we choose the best technique to estimate the parameters of interest (α and β)?
Generally, there are three methods of estimation:
 the method of least squares,
 the method of moments, and
 maximum likelihood estimation.
2.3 The Method of Least Squares (OLS)
The most common method for fitting a regression line is the method of least squares, specifically Ordinary Least Squares (OLS).
Reasons to use OLS:
i. The computational procedure of OLS is fairly simple compared to other econometric methods.
ii. The parameters obtained by this method have some optimal properties, i.e. they are BLUE (Best, Linear, Unbiased Estimators).
What does OLS do?
A line fits a dataset well if observations are close to it, i.e., if predicted values obtained using the line are close to the values actually observed.
Meaning, the residuals should be small. Therefore, when assessing the fit of a line, the vertical distances of the points from the line are the only distances that matter.
The OLS method calculates the best-fitting line for a dataset by minimizing the sum of the squares of the vertical deviations from each data point to the line (the residual sum of squares, RSS):

Minimize RSS = Σᵢ₌₁ⁿ eᵢ²

We could think of minimizing RSS by successively choosing pairs of values for α̂ and β̂ until RSS is made as small as possible.
To find the values of α and β that minimize this sum, we differentiate with respect to α̂ and β̂ and set the partial derivatives equal to zero.
 Why the sum of the squared residuals? Why not just minimize the sum of the residuals?
 To prevent negative residuals from cancelling positive ones.
 If we used Σeᵢ, all the residuals eᵢ would receive equal importance no matter how closely or widely scattered the individual observations are about the SRF.
 If so, the algebraic sum of the eᵢ's may be small (even zero) though the eᵢ's are widely scattered about the SRF, as the sketch below illustrates.
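A tiny check (ours, not the note's) of how Σeᵢ can be zero for a badly scattered fit while Σeᵢ² is not:

```python
# Residuals from a tight fit and from a badly scattered one:
close = [0.1, -0.1, 0.05, -0.05]
scattered = [5.0, -5.0, 4.0, -4.0]

for e in (close, scattered):
    print(sum(e), sum(v * v for v in e))
# Σe is 0 in both cases, but Σe² is 0.025 versus 82.0
```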
OLS: minimize over α̂, β̂:  Σᵢ₌₁ⁿ eᵢ² = Σ(Yᵢ − Ŷᵢ)² = Σ(Yᵢ − α̂ − β̂Xᵢ)²

F.O.C. (1): ∂(Σeᵢ²)/∂α̂ = 0 ⇒ ∂[Σ(Yᵢ − α̂ − β̂Xᵢ)²]/∂α̂ = 0
⇒ −2Σ(Yᵢ − α̂ − β̂Xᵢ) = 0 ⇒ Σ(Yᵢ − α̂ − β̂Xᵢ) = 0
⇒ ΣYᵢ − nα̂ − β̂ΣXᵢ = 0
⇒ Ȳ − α̂ − β̂X̄ = 0 ⇒ α̂ = Ȳ − β̂X̄
F.O.C. (2): ∂(Σeᵢ²)/∂β̂ = 0 ⇒ ∂[Σ(Yᵢ − α̂ − β̂Xᵢ)²]/∂β̂ = 0
⇒ −2Σ(Yᵢ − α̂ − β̂Xᵢ)Xᵢ = 0
⇒ Σ(Yᵢ − α̂ − β̂Xᵢ)Xᵢ = 0
⇒ ΣYᵢXᵢ − α̂ΣXᵢ − β̂ΣXᵢ² = 0
⇒ ΣYᵢXᵢ = α̂ΣXᵢ + β̂ΣXᵢ²
Solve α̂ = Ȳ − β̂X̄ and ΣYᵢXᵢ = α̂ΣXᵢ + β̂ΣXᵢ² (called the normal equations) simultaneously:

ΣYᵢXᵢ = α̂ΣXᵢ + β̂ΣXᵢ² ⇒ ΣYᵢXᵢ = (Ȳ − β̂X̄)ΣXᵢ + β̂ΣXᵢ²
⇒ ΣYᵢXᵢ = ȲΣXᵢ − β̂X̄ΣXᵢ + β̂ΣXᵢ²
⇒ ΣYᵢXᵢ − ȲΣXᵢ = β̂(ΣXᵢ² − X̄ΣXᵢ)
⇒ ΣYᵢXᵢ − nX̄Ȳ = β̂(ΣXᵢ² − nX̄²)

because X̄ = ΣXᵢ/n ⇒ ΣXᵢ = nX̄.
Thus:

1. β̂ = (ΣYᵢXᵢ − nX̄Ȳ) / (ΣXᵢ² − nX̄²)

Alternative expressions for β̂:

2. β̂ = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)² = Σxy / Σx²,  where xᵢ = Xᵢ − X̄ and yᵢ = Yᵢ − Ȳ

3. β̂ = Cov(X, Y) / Var(X)

4. β̂ = [nΣYᵢXᵢ − (ΣXᵢ)(ΣYᵢ)] / [nΣXᵢ² − (ΣXᵢ)²]
For α̂, just use α̂ = Ȳ − β̂X̄, or:

α̂ = Ȳ − X̄·[(ΣYᵢXᵢ − nX̄Ȳ) / (ΣXᵢ² − nX̄²)]

⇒ α̂ = [(ΣYᵢ)(ΣXᵢ²) − (ΣXᵢ)(ΣXᵢYᵢ)] / [n(ΣXᵢ² − nX̄²)]
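The whole derivation collapses to two formulas, β̂ = Σxy/Σx² and α̂ = Ȳ − β̂X̄. A minimal from-scratch sketch (our code; the function name is ours, and numpy is assumed available):

```python
import numpy as np

def ols_simple(X, Y):
    """OLS for Y = a + bX via the deviation formulas derived above."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    x = X - X.mean()                            # deviations from the mean
    y = Y - Y.mean()
    beta_hat = (x * y).sum() / (x * x).sum()    # β̂ = Σxy / Σx²
    alpha_hat = Y.mean() - beta_hat * X.mean()  # α̂ = Ȳ − β̂X̄
    return alpha_hat, beta_hat
```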
Previously, we came across two normal equations:

1. Σᵢ₌₁ⁿ (Yᵢ − α̂ − β̂Xᵢ) = 0, which is equivalent to Σeᵢ = 0
2. Σᵢ₌₁ⁿ (Yᵢ − α̂ − β̂Xᵢ)Xᵢ = 0, equivalently ΣeᵢXᵢ = 0

Note also the following property: the mean of the fitted values equals the mean of the actual values:

Yᵢ = Ŷᵢ + eᵢ ⇒ ΣYᵢ = ΣŶᵢ + Σeᵢ ⇒ ΣYᵢ/n = ΣŶᵢ/n, since Σeᵢ = 0.
These facts imply:

Ȳ = α̂ + β̂X̄

That is, the sample regression line Ŷ = α̂ + β̂X passes through the sample mean point (X̄, Ȳ).

[Figure: the fitted line Ŷ = α̂ + β̂X passing through the point (X̄, Ȳ).]
Example
1. Assume the following hypothetical weekly data on Y (demand for a normal good) and X (its price) are obtained from a certain market.

Yᵢ: 1  3  5  6  5  6  4  7  8  7
Xᵢ: 7  6  5  4  4  4  4  3  3  4

a. Assuming a relationship Yᵢ = α + βXᵢ + Uᵢ, obtain the OLS estimators of α and β.
b. Estimate the conditional mean of the good demanded when its price is 10.
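A quick numerical check of this example, using the formulas above (our computation; verify against your own hand calculation):

```python
import numpy as np

X = np.array([7, 6, 5, 4, 4, 4, 4, 3, 3, 4], dtype=float)  # price
Y = np.array([1, 3, 5, 6, 5, 6, 4, 7, 8, 7], dtype=float)  # quantity demanded

x, y = X - X.mean(), Y - Y.mean()
b_hat = (x * y).sum() / (x * x).sum()  # Σxy/Σx² = -21.8/14.4 ≈ -1.514
a_hat = Y.mean() - b_hat * X.mean()    # 5.2 - (-1.514)(4.4) ≈ 11.861
print(a_hat, b_hat)
print(a_hat + b_hat * 10)              # part (b): conditional mean at X = 10 ≈ -3.28
```

The negative slope is what one expects for a demand curve; note that X = 10 lies well outside the sample range of prices, and the extrapolated mean is negative, a caution about out-of-sample prediction.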
2. Explaining sales = f(advertising). Sales are in thousands of Birr and advertising expenses are in hundreds of Birr.

Firm (i):                  1   2   3   4   5   6   7   8   9  10
Sales (Yᵢ):               11  10  12   6  10   7   9  10  11  10
Advertising expense (Xᵢ): 10   7  10   5   8   8   6   7   9  10
Computing the deviations (Ȳ = ΣYᵢ/n = 96/10 = 9.6 and X̄ = ΣXᵢ/n = 80/10 = 8):

 i   Yᵢ   Xᵢ   yᵢ = Yᵢ − Ȳ   xᵢ = Xᵢ − X̄   xᵢyᵢ
 1   11   10      1.4            2           2.8
 2   10    7      0.4           −1          −0.4
 3   12   10      2.4            2           4.8
 4    6    5     −3.6           −3          10.8
 5   10    8      0.4            0           0
 6    7    8     −2.6            0           0
 7    9    6     −0.6           −2           1.2
 8   10    7      0.4           −1          −0.4
 9   11    9      1.4            1           1.4
10   10   10      0.4            2           0.8
 Σ   96   80      0              0          21
 i    yᵢ    xᵢ    yᵢ²     xᵢ²
 1    1.4    2    1.96     4
 2    0.4   −1    0.16     1
 3    2.4    2    5.76     4
 4   −3.6   −3   12.96     9
 5    0.4    0    0.16     0
 6   −2.6    0    6.76     0
 7   −0.6   −2    0.36     4
 8    0.4   −1    0.16     1
 9    1.4    1    1.96     1
10    0.4    2    0.16     4
 Σ    0      0   30.4     28

β̂ = Σxᵢyᵢ / Σxᵢ² = 21/28 = 0.75
α̂ = Ȳ − β̂X̄ = 9.6 − 0.75(8) = 3.6
The fitted line is Ŷ = 3.6 + 0.75X, and the residuals are eᵢ = Yᵢ − Ŷᵢ:

 i    Ŷᵢ      eᵢ       eᵢ²
 1   11.10   −0.10    0.01
 2    8.85    1.15    1.3225
 3   11.10    0.90    0.81
 4    7.35   −1.35    1.8225
 5    9.60    0.40    0.16
 6    9.60   −2.60    6.76
 7    8.10    0.90    0.81
 8    8.85    1.15    1.3225
 9   10.35    0.65    0.4225
10   11.10   −1.10    1.21
 Σ   96       0      14.65

So Σeᵢ² = 14.65, Σŷᵢ² = 15.75, and Σyᵢ² = 30.4. Note that ΣŶᵢ = ΣYᵢ = 96 and Σyᵢ = Σxᵢ = Σŷᵢ = Σeᵢ = 0, as the normal equations require.
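The whole worked example can be reproduced in a few lines (our sketch; the printed values should match the tables above):

```python
import numpy as np

Y = np.array([11, 10, 12, 6, 10, 7, 9, 10, 11, 10], dtype=float)  # sales
X = np.array([10, 7, 10, 5, 8, 8, 6, 7, 9, 10], dtype=float)      # advertising

x, y = X - X.mean(), Y - Y.mean()
b_hat = (x * y).sum() / (x * x).sum()  # 21/28 = 0.75
a_hat = Y.mean() - b_hat * X.mean()    # 9.6 - 0.75*8 = 3.6

Y_fit = a_hat + b_hat * X
e = Y - Y_fit
print(a_hat, b_hat)                               # 3.6, 0.75
print(round((e ** 2).sum(), 2))                   # Σe² = 14.65
print(round(((Y_fit - Y.mean()) ** 2).sum(), 2))  # Σŷ² = 15.75
```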
Assumptions Underlying the Method of Least Squares
 To obtain α̂ and β̂ in the model Yᵢ = α̂ + β̂Xᵢ + eᵢ, the only assumption we need is that X must take at least two distinct values (and the number of observations must be at least the number of parameters).
 But the objective in regression analysis is not only to obtain α̂ and β̂ but also to draw inferences about the true parameters α and β.
 For example, we'd like to know how close α̂ and β̂ are to α and β, or how close Ŷᵢ is to E[Y | Xᵢ].
 To that end, we must also make certain assumptions about the manner in which the Yᵢ's are generated.
2.4 Properties of OLS Estimators and the Gauss-Markov Theorem
☞Given the assumptions of the classical linear
regression model, the least-squares estimators
possess some ideal or optimum properties.
These statistical properties are extremely important
because they provide criteria for choosing among
alternative estimators.
These properties are contained in the well-known
Gauss–Markov Theorem.

Gauss-Markov Theorem:
"Given the assumptions of the classical linear regression model, the OLS estimators α̂ and β̂ have, in the class of linear and unbiased estimators, the minimum variance; i.e., the OLS estimators are BLUE."
An estimator is called BLUE if it is:
Linear: a linear function of the random variable Y.
Unbiased: its average or expected value is equal to the true population parameter, e.g. E(β̂) = β.
Minimum variance: it has the minimum variance in the class of linear and unbiased estimators. An unbiased estimator with the least variance is known as an efficient estimator.
Variances of the OLS estimators:

var(β̂) = σ² / Σxᵢ²
var(α̂) = σ² ΣXᵢ² / (n Σxᵢ²)
2.5 Residuals and Goodness of Fit
Decomposing the variation in Y:
One measure of the variation in Y is the sum of its squared deviations around its sample mean, often described as the total sum of squares, TSS. TSS can be decomposed into two parts:
ESS, the 'explained' sum of squares, and
RSS, the residual ('unexplained') sum of squares.

TSS = ESS + RSS
Σ(Yᵢ − Ȳ)² = Σ(Ŷᵢ − Ȳ)² + Σeᵢ²
Proof:
Yᵢ = Ŷᵢ + eᵢ ⇒ Yᵢ − Ȳ = (Ŷᵢ − Ȳ) + eᵢ
⇒ Σ(Yᵢ − Ȳ)² = Σ(Ŷᵢ − Ȳ + eᵢ)²
⇒ Σyᵢ² = Σ(ŷᵢ + eᵢ)² = Σŷᵢ² + Σeᵢ² + 2Σŷᵢeᵢ

The last term equals zero:
Σŷᵢeᵢ = Σ(Ŷᵢ − Ȳ)eᵢ = ΣŶᵢeᵢ − ȲΣeᵢ
      = Σ(α̂ + β̂Xᵢ)eᵢ − ȲΣeᵢ
      = α̂Σeᵢ + β̂ΣXᵢeᵢ − ȲΣeᵢ = 0
(using Σeᵢ = 0 and ΣXᵢeᵢ = 0)
⇒ Σŷᵢeᵢ = 0

Hence: Σyᵢ² = Σŷᵢ² + Σeᵢ², i.e. TSS = ESS + RSS.
Coefficient of Determination (R²):
 R² is another measure of the variation in the dependent variable: it is the proportion of the variation in the dependent variable that is explained by the model.
1. R² = ESS/TSS = Σŷ² / Σy²

2. R² = ESS/TSS = Σ(β̂x)² / Σy² = β̂² Σx² / Σy²

 The OLS regression coefficients are chosen in such a way as to minimize the sum of squared residuals. It automatically follows that they maximize R².

From TSS = ESS + RSS, dividing both sides by TSS: 1 = ESS/TSS + RSS/TSS, so

3. R² = ESS/TSS = 1 − RSS/TSS = 1 − Σeᵢ² / Σy²
Coefficient of Determination (R²), continued. Since ESS = Σŷᵢ² = β̂²Σx² and β̂ = Σxy/Σx²:

4. R² = ESS/TSS = β̂ Σxy / Σy² = 15.75/30.4 = 0.5181

5. R² = (Σxy)² / (Σx² Σy²)

6. R² = [cov(X, Y)]² / [var(X) · var(Y)]
 A natural criterion of goodness of fit is the correlation between the actual and fitted values of Y. The least squares principle also maximizes this.
 Note: R² = (r_Ŷ,Y)² = (r_X,Y)²
where r_Ŷ,Y and r_X,Y are the coefficients of correlation between Ŷ and Y, and between X and Y, defined as:

r_Ŷ,Y = cov(Ŷ, Y) / (σ_Ŷ σ_Y)   and   r_X,Y = cov(X, Y) / (σ_X σ_Y), respectively.

Note: RSS = (1 − R²) Σy²
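All of these R² formulas can be verified on the running example (our sketch):

```python
import numpy as np

Y = np.array([11, 10, 12, 6, 10, 7, 9, 10, 11, 10], dtype=float)
X = np.array([10, 7, 10, 5, 8, 8, 6, 7, 9, 10], dtype=float)

x, y = X - X.mean(), Y - Y.mean()
b_hat = (x * y).sum() / (x * x).sum()
Y_fit = Y.mean() + b_hat * x            # fitted values
e = Y - Y_fit

TSS, ESS, RSS = (y**2).sum(), ((Y_fit - Y.mean())**2).sum(), (e**2).sum()
print(ESS / TSS)                                             # formula 1: 0.5181...
print(1 - RSS / TSS)                                         # formula 3: same
print((x * y).sum() ** 2 / ((x * x).sum() * (y * y).sum()))  # formula 5: same
print(np.corrcoef(X, Y)[0, 1] ** 2)                          # (r_X,Y)²:  same
```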
To sum up:
Use Ŷᵢ = α̂ + β̂Xᵢ to estimate E[Y | Xᵢ] = α + βXᵢ.

OLS: minimize over α̂, β̂:  Σᵢ₌₁ⁿ eᵢ² = Σ(Yᵢ − Ŷᵢ)² = Σ(Yᵢ − α̂ − β̂Xᵢ)²

β̂ = Σxy / Σx²    α̂ = Ȳ − β̂X̄

Given the assumptions of the linear regression model, the estimators α̂ and β̂ have the smallest variance of all linear and unbiased estimators of α and β:

var(β̂) = σ² / Σxᵢ²
var(α̂) = σ²(1/n + X̄²/Σxᵢ²) = σ² ΣXᵢ² / (n Σxᵢ²)
To sum up (continued), for the sales-advertising example:

Σyᵢ² = Σŷᵢ² + Σeᵢ²   (TSS = ESS + RSS)
R² = ESS/TSS = Σŷ² / Σy²
RSS = (1 − R²) Σy²
β̂ = Σxy / Σx²

var(β̂) = σ² / Σxᵢ² = σ²/28 = 0.0357σ²
var(α̂) = σ²(1/n + X̄²/Σxᵢ²) = σ²(1/10 + 64/28) = 2.3857σ²

But σ² = ?
An unbiased estimator for σ²:

E(RSS) = E(Σeᵢ²) = (n − 2)σ²

Thus, if we define σ̂² = Σeᵢ² / (n − 2), then:

E(σ̂²) = [1/(n − 2)] E(Σeᵢ²) = [1/(n − 2)](n − 2)σ² = σ²

⇒ σ̂² = Σeᵢ² / (n − 2) is an unbiased estimator of σ².
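Putting this together for the running example (a sketch; the inputs are the sums computed earlier):

```python
import numpy as np

n = 10
RSS = 14.65     # Σe² from the example
sum_x2 = 28.0   # Σx², deviations of X from its mean
sum_X2 = 668.0  # ΣXᵢ² (raw sum of squares) = Σx² + n·X̄² = 28 + 10·8²

sigma2_hat = RSS / (n - 2)                          # unbiased: 14.65/8 = 1.83125
se_b = np.sqrt(sigma2_hat / sum_x2)                 # se(β̂) ≈ 0.256
se_a = np.sqrt(sigma2_hat * sum_X2 / (n * sum_x2))  # se(α̂) ≈ 2.09
print(sigma2_hat, round(se_b, 3), round(se_a, 2))
```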
2.6 Confidence Intervals and Hypothesis Testing in Regression Analysis
Why is the Error Normality Assumption Important?
The normality assumption permits us to derive the functional form of the sampling distributions of α̂, β̂ and σ̂².
Knowing the form of the sampling distributions enables us to derive feasible test statistics for the OLS coefficient estimators.
These feasible test statistics enable us to conduct statistical inference, i.e.,
1) to construct confidence intervals for α, β and σ², and
2) to test hypotheses about the values of α, β and σ².
εᵢ ~ N(0, σ²)  ⇒  Yᵢ ~ N(α + βXᵢ, σ²)

β̂ ~ N(β, σ²/Σxᵢ²)    α̂ ~ N(α, σ² ΣXᵢ²/(n Σxᵢ²))

⇒ (β̂ − β)·√(Σxᵢ²)/σ ~ N(0, 1)

Replacing the unknown σ by its estimate σ̂, where σ̂² = Σeᵢ²/(n − 2):

(β̂ − β)/se(β̂) ~ t(n−2),  with se(β̂) = σ̂/√(Σxᵢ²)
(α̂ − α)/se(α̂) ~ t(n−2),  with se(α̂) = σ̂·√(ΣXᵢ²/(n Σxᵢ²))
Confidence Intervals for α and β:

P{ −t(n−2, α/2) ≤ (α̂ − α)/se(α̂) ≤ t(n−2, α/2) } = 1 − α

100(1 − α)% two-sided CI for α:  α̂ ± t(n−2, α/2)·se(α̂)

Similarly, the 100(1 − α)% two-sided CI for β:  β̂ ± t(n−2, α/2)·se(β̂)

(Here α denotes both the intercept and the significance level; the context makes clear which is meant.)
Let us continue with our earlier example. We have: n = 10, α̂ = 3.6, β̂ = 0.75, R² = 0.5181, var(α̂) = 2.3857σ², var(β̂) = 0.0357σ², and Σeᵢ² = 14.65.

σ² is estimated by: σ̂² = Σeᵢ²/(n − 2) = 14.65/8 = 1.83125
⇒ σ̂ = √1.83125 = 1.3532

Thus, var̂(α̂) = 2.3857(1.83125) = 4.3688 ⇒ se(α̂) = √4.3688 = 2.09
      var̂(β̂) = 0.0357(1.83125) = 0.0654 ⇒ se(β̂) = √0.0654 = 0.256
95% CIs for α and β:

For a 95% CI, 1 − α = 0.95 ⇒ α = 0.05 ⇒ α/2 = 0.025, and t(8, 0.025) = 2.306.

95% CI for α:  3.6 ± 2.306(2.09) = 3.6 ± 4.8195  ⇒  [−1.2195, 8.4195]
95% CI for β:  0.75 ± 2.306(0.256) = 0.75 ± 0.5903  ⇒  [0.1597, 1.3403]
The confidence intervals we have constructed for α and β are two-sided intervals. Sometimes we want either the upper or the lower limit only, in which case we construct one-sided intervals.
For instance, let us construct a one-sided (upper-limit) 95% confidence interval for β. From the t-table, t(8, 0.05) = 1.86. Hence:

β̂ + t(8, 0.05)·se(β̂) = 0.75 + 1.86(0.256) = 0.75 + 0.48 = 1.23

The confidence interval is (−∞, 1.23].
 Similarly, the 95% lower limit is:

β̂ − t(8, 0.05)·se(β̂) = 0.75 − 1.86(0.256) = 0.75 − 0.48 = 0.27

 Hence, the 95% CI is [0.27, ∞).

Hypothesis Testing:
Suppose that from a sample of 10 observations we estimate the following sales function, where Xᵢ is advertising expense (standard errors in parentheses):

Ŷᵢ = 3.6 + 0.75Xᵢ
    (2.09)  (0.256)

1. Test the claim that sales do not depend on advertising expense (at the 5% level of significance), and construct the CI; check by both methods!
2. Test whether the intercept is greater than 3.5.
3. Can you reject the claim that a unit increase in advertising expense raises sales by one unit? If so, at what level of significance?
Solution:
1. H₀: β = 0 against Hₐ: β ≠ 0.
 Test statistic: t_c = (β̂ − β)/se(β̂) = (0.75 − 0)/0.256 = 2.93
 Critical value (t_t = t-tabulated): α = 0.05 ⇒ α/2 = 0.025, so t_t = t(n−2, α/2) = t(8, 0.025) = 2.306
 Since |t_c| > t_t, we reject the null (the alternative is supported). That is, the slope coefficient is statistically significantly different from zero: advertising has a significant influence on sales.
2. H₀: α = 3.5 against Hₐ: α > 3.5.
 Test statistic: t_c = (α̂ − α)/se(α̂) = (3.6 − 3.5)/2.09 = 0.1/2.09 = 0.05
 Critical value (t_t = t-tabulated): at the 5% level of significance (α = 0.05), t_t = t(n−2, α) = t(8, 0.05) = 1.86
 Since t_c < t_t, we do not reject the null (the null is supported). That is, the intercept is not statistically significantly greater than 3.5.
3. H₀: β = 1 against Hₐ: β ≠ 1.
 Test statistic: t_c = (β̂ − β)/se(β̂) = (0.75 − 1)/0.256 = −0.25/0.256 = −0.98
 At α = 0.05, t(8, 0.025) = 2.306, and thus H₀ can't be rejected.
 Similarly, at α = 0.10, t(8, 0.05) = 1.86: H₀ can't be rejected.
 At α = 0.20, t(8, 0.10) = 1.397, and thus H₀ can't be rejected.
 At α = 0.50, t(8, 0.25) = 0.706, and thus H₀ is rejected.
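The same three tests with exact p-values instead of table lookups (our sketch; scipy assumed):

```python
from scipy import stats

df = 8
t1 = (0.75 - 0.0) / 0.256  # H0: β = 0, two-sided        → 2.93
t2 = (3.6 - 3.5) / 2.09    # H0: α = 3.5, Ha: α > 3.5    → 0.05
t3 = (0.75 - 1.0) / 0.256  # H0: β = 1, two-sided        → -0.98

print(2 * stats.t.sf(abs(t1), df))  # ≈ 0.019: reject at the 5% level
print(stats.t.sf(t2, df))           # ≈ 0.48: cannot reject
print(2 * stats.t.sf(abs(t3), df))  # ≈ 0.36: cannot reject at conventional levels
```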
Exercise
The following intermediate results are obtained from data collected on the quantity supplied (Y) of commodity A and its price (X) for 10 years:

ΣY = 60    ΣX² = 304    Σxy = 53    Σy² = 53
ΣX = 60    ΣY² = 413    Σx² = 54

a. Assuming a relationship Yᵢ = α + βXᵢ + Uᵢ, obtain the OLS estimators of α and β.
b. Compute the value of R² (the coefficient of determination) and interpret the result.
c. Test the hypothesis that price influences supply, using a t-test at the 5% level of significance, and interpret the result.
d. Estimate the conditional mean of Y corresponding to a value of X fixed at X = 10.