
SIMPLE LINEAR REGRESSION MODELS

DEBRE BERHAN UNIVERSITY


COLLEGE OF BUSINESS AND ECONOMICS
DEPARTMENT OF ECONOMICS

Solomon Estifanos

MARCH, 2022
DEBRE BERHAN, ETHIOPIA
Outline

Simple Linear Regression Models

• Concept of Regression Function
• Ordinary Least Squares (OLS) Estimation Method
• Residuals and Goodness-of-Fit
• Properties of OLS Estimates and the Gauss-Markov Theorem
• Confidence Intervals and Hypothesis Testing
• Predictions Using the Simple Linear Regression Model
Simple Linear Regression Models
• Regression analysis refers to estimating functions that show the relationship between two or more variables, together with the corresponding tests.

Introduction to Simple Linear Regression

• Correlation analysis is concerned with expressing and measuring the closeness of the relationship between variables. It does not identify any cause-and-effect relationship between them.
• Regression, by contrast, is suited to identifying cause-and-effect relationships between variables.
• Regression analysis is concerned with the study of the relationship between one variable (known as the dependent variable) and one or more other variables (known as the independent variable(s)).
• A central task in regression analysis is estimating regression functions, accompanied by some preceding and some succeeding steps in the econometric methodology.
Simple Linear . . . Con’t . . .
$Y_i = f(X_1, X_2, X_3, \ldots)$

Dependent Variable      Independent Variable
Explained Variable      Explanatory Variable
Endogenous Variable     Exogenous Variable
Regressand              Regressors

• In addition to showing the causal relationship among the variables, regression analysis has the following objectives and uses:
1) To estimate the mean or average value of the dependent variable, given the value of the independent variable(s);
2) To test hypotheses about the sign and magnitude of the relationship between the dependent variable and one or more independent variable(s); and
3) To predict or forecast future value(s) of the dependent variable, which is in turn used in policy formulation.
Simple Linear . . . Con’t . . .
✓ Having covered regression functions in general, let us now look more closely at our current topic, 'simple linear regression analysis'.

✓ Consider the meaning of the words 'simple' and 'linear' in the topic. In a simple linear regression function or analysis, the term 'simple' refers to the fact that we use only two variables (one dependent and one independent variable).

✓ If the number of independent or explanatory variables is greater than one, we do not use the term 'simple'; instead we use the term 'multiple'.
Simple Linear . . . Con’t . . .

Example: Theory of Demand

Economic theory provides the following information with regard to the demand function.

a) The dependent variable is quantity demanded and the independent variable is price:

$Y_i = f(X_i)$, where $Y_i$ = quantity demanded and $X_i$ = price.

b) Economic theory does not specify whether demand should be studied using one equation or a more elaborate system of simultaneous equations. Let us use a single-equation system.

c) Economic theory is not clear about the mathematical form (linear or non-linear) of the demand function. Let us choose a linear function. Therefore, the demand function is given as:

$Y_i = \alpha + \beta X_i$
Simple Linear . . . Con’t . . .
d) The above form of the demand function implies that the relation between Y and X is exact: the whole variation in Y is due to changes in X only, and there are no other factors or variables affecting the dependent variable except X. If this were true, all quantity-price pairs would fall on a straight line when plotted in the X-Y plane. However, if we gather information from the market and plot it in the X-Y plane, the price-quantity pairs do not all fall on the straight line: most of the pairs lie close to the line, while some points lie above it and others below it.

[Figure: scatter of observed price-quantity pairs around a fitted straight line]
Simple Linear . . . Con’t . . .
The deviation of points from the straight line may be attributed to several reasons. These are:

1) Omitted Variables from the Function

a) Some of the factors may not even be known;

b) Some of the factors cannot be measured statistically;

c) Some of the factors are random, appearing in unpredictable ways and at unpredictable times;

d) Some of the factors have only a very small influence on the dependent variable;

e) Some of the factors may lack reliable statistical data.

2) Misspecification of the Mathematical Form of the Model: linear vs. non-linear.

3) Errors of Aggregation: we often use aggregate data in which we add together magnitudes referring to dissimilar individual behaviors. In this case, variables expressing individual peculiarities are missing.

4) Errors of Measurement: errors in the measurement of variables, which are inevitable due to the methods of collecting and processing statistical information.
Simple Linear . . . Con’t . . .
• In order to take into account the above sources of error, we include in econometric functions a random variable, usually denoted by the letter 'U' and called the error term, the disturbance term, or the random term:

$Y_i = \alpha + \beta X_i + U_i$

• The true relationship connecting the variables involved is split into two parts:

  • a part represented by the line, and

  • a part represented by the random term.

• Such models are called stochastic or probabilistic models and are common in econometrics. The above model shows that the relationship between the two variables is inexact, and the total variation of the dependent variable is split into two additive components, explained variation and residual variation:

Total Sum of Squares (TSS) = Explained Sum of Squares (ESS) + Residual Sum of Squares (RSS)
Simple Linear . . . Con’t . . .
• The variation in the dependent variable is not one hundred percent explained by the variation in the explanatory variable. Thus, the variation in the dependent variable is expressed as the sum of explained variation and random variation:

(Variation in Yi) = (Systematic variation) + (Random variation)

(Variation in Yi) = (Explained variation) + (Unexplained variation)

• The term 'linear' in the simple linear regression model refers to the following:
  • Mathematically, a function is said to be linear in variables if the conditional expectation of the dependent variable, given the values of the independent variable, is a linear function of that variable.
  • As opposed to mathematical functions, in econometrics linearity does not necessarily mean linearity in variables; it basically means linearity in parameters.
Estimation of Simple Linear Regression Function
• There are various methods of estimating regression functions, such as:

  • the Method of Moments,

  • the Method of Least Squares or Ordinary Least Squares (OLS) method, and

  • the Maximum Likelihood Estimation (MLE) method.
Ordinary Least Square Method (OLS) and Classical Linear Regression
Model (CLRM) Assumptions

CLRM Assumptions
Assumption 1
The error terms $U_i$ are randomly distributed: $U_i$ is a random real variable. The value which $U_i$ may assume in any period depends on chance; some values may be positive, some negative, and some zero.

Assumption 2
• The disturbance terms $U_i$ have zero mean: the values of some of the disturbance terms are negative, some are zero, and some are positive, and their sum, and hence their average, is zero. That is, $E(U_i) = \frac{\sum U_i}{n} = 0$; multiplying both sides by the sample size $n$, we obtain $\sum U_i = 0$.
CLRM Assumptions Con’t . . .
Assumption 3
• Independence of the $U_i$'s: the disturbance terms are not correlated. This means that there is no systematic variation or relation among the values of the error terms $U_i$ and $U_j$, where $i = 1, 2, 3, \ldots$; $j = 1, 2, 3, \ldots$; and $i \ne j$.
• This is represented by zero covariance among the error terms: $\mathrm{Cov}(U_i, U_j) = 0$ for $i \ne j$. Otherwise, we face an autocorrelation problem.

Assumption 4
• The disturbance terms have constant variance in each period:

$\mathrm{Var}(U_i) = E(U_i^2) = \sigma_u^2$, a constant.

• This assumption is known as the assumption of homoscedasticity. If this condition is not fulfilled, i.e., if the variance of the error terms varies as the sample size changes or as the values of the explanatory variables change, this leads to a heteroscedasticity problem.
CLRM Assumptions Con’t . . .
Assumption 5
• The explanatory variables $X_i$ and the disturbance terms $U_i$ are uncorrelated or independent: all the covariances of the explanatory variable with the error term are equal to zero. This condition is given by the following identity:

$\mathrm{Cov}(X_i, U_i) = E(X_i U_i) = 0 \Rightarrow \sum X_i U_i = 0$

• If this condition is not met by our data or variables, our regression function and the conclusions drawn from it will be invalid.

Assumption 6
• Linearity of the model in parameters. Simple linear regression requires linearity in parameters, but not necessarily linearity in variables. The same technique can be applied to estimate regression functions that are non-linear in the variables but linear in the parameters (for example, a form such as $Y_i = \alpha + \beta X_i^2 + U_i$).
CLRM Assumptions Con’t . . .
Assumption 7
• The explanatory variable $X_i$ is fixed in repeated samples: each value of $X_i$ does not vary, for instance owing to a change in sample size. This means the explanatory variables are non-random and hence distribution-free.

Assumption 8
• The disturbance term $U_i$ is assumed to have a normal distribution with zero mean and constant variance.
• This assumption combines the zero-mean and constant-variance (homoscedasticity) assumptions on the error term with normality: $U_i \sim N(0, \sigma_u^2)$.
CLRM Assumptions Con’t . . .
Assumption 9
• The explanatory variables should not be perfectly (or highly) linearly correlated. Using explanatory variables that are highly or perfectly correlated in a regression function produces a biased function or model. It also results in a multicollinearity problem.

Assumption 10
• The variables are measured without error (the data are error-free). Since wrong data lead to wrong conclusions, it is important to make sure that our data are free from any type of error.

Assumption 11
• The relationship between the variables (or the model) is correctly specified. For instance, all the necessary variables are included in the model, and the variables are in the form that best describes the functional relationship: $Y = f(X^2)$ may reflect the relationship between Y and X better than $Y = f(X)$.
Estimation Using OLS Method
• Estimating a linear regression function using the Ordinary Least Squares (OLS) method amounts to calculating the parameters of the regression function for which the sum of the squared error terms is minimized. Suppose we want to estimate the following equation:

$Y_i = \alpha + \beta X_i + U_i$

• Since most of the time we use a sample (population data being difficult to obtain), the corresponding sample regression function is given as:

$\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$

• From this identity we solve for the residual term $e_i$, square both sides, and then sum over all observations:

$e_i = Y_i - \hat{Y}_i = Y_i - \hat{\alpha} - \hat{\beta} X_i$

$\sum e_i^2 = \sum \left(Y_i - \hat{\alpha} - \hat{\beta} X_i\right)^2 = \text{RSS}$

where RSS = Residual Sum of Squares.
Estimation Using OLS Method Con’t . . .
 The method of OLS involves finding the estimates of the intercept and the slope for which the
RSS is minimized. To minimize the RSS we take the first order partial derivatives and equate
them to zero.
Partial derivative with respect to $\hat{\alpha}$:

$\frac{\partial \sum e_i^2}{\partial \hat{\alpha}} = 2\sum \left(Y_i - \hat{\alpha} - \hat{\beta} X_i\right)(-1) = 0$

$\sum Y_i - n\hat{\alpha} - \hat{\beta}\sum X_i = 0$

$\sum Y_i = n\hat{\alpha} + \hat{\beta}\sum X_i$ ------------------------------------------------------- (1)

where n is the sample size.

Partial derivative with respect to $\hat{\beta}$:

$\frac{\partial \sum e_i^2}{\partial \hat{\beta}} = 2\sum \left(Y_i - \hat{\alpha} - \hat{\beta} X_i\right)(-X_i) = 0$

$\sum Y_i X_i - \hat{\alpha}\sum X_i - \hat{\beta}\sum X_i^2 = 0$

$\sum Y_i X_i = \hat{\alpha}\sum X_i + \hat{\beta}\sum X_i^2$ ----------------------------------------------- (2)
Estimation Using OLS Method Con’t . . .
• Let us represent equations (1) and (2) in matrix form:

$\begin{pmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{pmatrix} \begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} = \begin{pmatrix} \sum Y_i \\ \sum Y_i X_i \end{pmatrix}$

• Applying Cramer's rule to solve for $\hat{\beta}$, we get:

$\hat{\beta} = \frac{n\sum Y_i X_i - \sum Y_i \sum X_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}$ or $\hat{\beta} = \frac{\sum Y_i X_i - n\bar{Y}\bar{X}}{\sum X_i^2 - n\bar{X}^2}$

• In deviation form, $\hat{\beta}$ can be computed as:

$\hat{\beta} = \frac{\sum y_i x_i}{\sum x_i^2}$, where $y_i = Y_i - \bar{Y}$ and $x_i = X_i - \bar{X}$

• And by dividing equation (1) by n, we solve for $\hat{\alpha}$ as:

$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$
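These closed-form estimators translate directly into code. Below is a minimal sketch in Python/NumPy of the deviation-form formulas; the helper name `ols_fit` is illustrative, not from the slides.

```python
import numpy as np

def ols_fit(X, Y):
    """OLS estimates via the deviation-form formulas derived above."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    x = X - X.mean()                              # x_i = X_i - X-bar
    y = Y - Y.mean()                              # y_i = Y_i - Y-bar
    beta_hat = (x * y).sum() / (x ** 2).sum()     # beta-hat = sum(x_i*y_i) / sum(x_i^2)
    alpha_hat = Y.mean() - beta_hat * X.mean()    # alpha-hat = Y-bar - beta-hat*X-bar
    return alpha_hat, beta_hat
```

Applied to the data of Example 2.1 below, this returns $\hat{\alpha} = 33.75$ and $\hat{\beta} = 3.25$.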
Examples
Example 2.1: The following table gives the quantity supplied (in tons) and the price (pounds per ton) of a commodity over a period of twelve years.

Yi 69 76 52 56 57 77 58 55 67 53 72 64

Xi  9 12  6 10  9 10  7  8 12  6 11  8

Fit the simple linear regression equation $Y = f(X)$ and interpret your result.
Solution
$\sum Y_i = 756$, $\sum X_i = 108$, $\bar{Y} = 63$, $\bar{X} = 9$, $\sum y_i^2 = 894$, $\sum x_i^2 = 48$, $\sum y_i x_i = 156$

$\hat{\beta} = \frac{\sum y_i x_i}{\sum x_i^2} = \frac{156}{48} = 3.25$, $\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = 63 - (3.25)(9) = 33.75$, so the fitted line is $\hat{Y}_i = 33.75 + 3.25 X_i$.

Interpretation: The value of the intercept term, 33.75, implies that the value of the dependent variable Y is 33.75 when the value of the explanatory variable is zero. The value of the slope coefficient $\hat{\beta}$ measures the marginal change in the dependent variable Y when the value of the explanatory variable increases by one. In this model, Y increases on average by 3.25 units when X increases by one.
The Coefficient of Determination (R²) – Measure of Goodness of Fit

• The coefficient of determination measures the amount or proportion of the total variation of the dependent variable that is determined or explained by the model, i.e., by the presence of the explanatory variable in the model.
• The total variation of the dependent variable is split into two additive components: a part explained by the model and a part represented by the random term.

  • Total variation in $Y_i$: $\sum\left(Y_i - \bar{Y}\right)^2$

  • Total explained variation: $\sum\left(\hat{Y}_i - \bar{Y}\right)^2$

  • Total unexplained variation: $\sum e_i^2$

• The total variation of the dependent variable is TSS = ESS + RSS: the total sum of squares of the dependent variable is split into the explained sum of squares and the residual sum of squares.
Coefficient of Determination (R²) Con’t . . .
• The coefficient of determination is given by the formula:

$R^2 = \frac{ESS}{TSS} = \frac{\sum\left(\hat{Y}_i - \bar{Y}\right)^2}{\sum\left(Y_i - \bar{Y}\right)^2}$ or $R^2 = 1 - \frac{RSS}{TSS}$ or $R^2 = \frac{\hat{\beta}\sum y_i x_i}{\sum y_i^2}$

• The higher the coefficient of determination, the better the fit; conversely, the smaller the coefficient of determination, the poorer the fit. That is why the coefficient of determination is used to compare two or more models. One minus the coefficient of determination is called the coefficient of non-determination; it gives the proportion of the variation in the dependent variable that remains undetermined or unexplained by the model.
Coefficient of Determination (R²) Con’t . . .
Example 2.2: Refer to Example 2.1. Determine what percentage of the variation in the quantity supplied is explained by the price of the commodity and what percentage remains unexplained.

$R^2 = \frac{\sum \hat{y}_i^2}{\sum y_i^2} = \frac{\hat{\beta}\sum y_i x_i}{\sum y_i^2} = \frac{3.25 \times 156}{894} = 0.5671 = 56.71\%$

$1 - R^2 = 1 - 0.5671 = 0.4329 = 43.29\%$

• This result shows that 56.71% of the variation in the quantity supplied of the commodity under consideration is explained by the variation in its price, and the remaining 43.29% is left unexplained by the price of the commodity. In other words, there may be other important explanatory variables left out that could contribute to the variation in the quantity supplied of the commodity under consideration.
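As a quick check, the R² arithmetic of Example 2.2 can be reproduced with the sums computed above (a small sketch; the variable names are illustrative):

```python
beta_hat = 3.25
sum_xy = 156.0        # sum of y_i * x_i in deviation form
sum_y2 = 894.0        # total sum of squares of y_i

r_squared = beta_hat * sum_xy / sum_y2    # ESS / TSS
print(round(r_squared, 4))                # 0.5671 -> 56.71% explained
print(round(1 - r_squared, 4))            # 0.4329 -> 43.29% unexplained
```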
Properties of OLS Estimates and Gauss-Markov Theorem
• We would like an estimate to be as close as possible to the value of the true population parameter, i.e., to vary within only a small range around the true parameter. How are we to choose, among the different econometric methods, the one that gives 'good' estimates? We need some criteria for judging the 'goodness' of an estimate.

• 'Closeness' of the estimate to the population parameter is measured by the mean and the variance (or standard deviation) of the sampling distribution of the estimates obtained from the different econometric methods.

• The ideal or optimum properties that the OLS estimates possess are summarized by a well-known theorem, the Gauss-Markov Theorem.

• Statement of the theorem: "Given the assumptions of the classical linear regression model, the OLS estimators, in the class of linear and unbiased estimators, have the minimum variance; i.e., the OLS estimators are BLUE." The theorem is sometimes referred to as the BLUE theorem, where BLUE stands for Best, Linear, Unbiased Estimator.
Properties of OLS Estimates . . . Con’t . . .

a) Linear: the estimator is a linear function of a random variable, such as the dependent variable Y.

b) Unbiased: its average or expected value is equal to the true population parameter.

c) Minimum Variance: it has minimum variance in the class of linear and unbiased estimators. An unbiased estimator with the least variance is known as an efficient estimator.
Properties of OLS Estimates . . . Con’t . . .

Linearity (for $\hat{\beta}$)

Proposition: $\hat{\beta}$ is linear in Y.

$\hat{\beta} = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{\sum x_i (Y_i - \bar{Y})}{\sum x_i^2} = \frac{\sum x_i Y_i - \bar{Y}\sum x_i}{\sum x_i^2}$,

but $\sum x_i = \sum(X_i - \bar{X}) = \sum X_i - n\bar{X} = n\bar{X} - n\bar{X} = 0$, so

$\hat{\beta} = \frac{\sum x_i Y_i}{\sum x_i^2}$.

Now let $k_i = \frac{x_i}{\sum x_i^2}$ $(i = 1, 2, \ldots, n)$. Then

$\hat{\beta} = \sum k_i Y_i = k_1 Y_1 + k_2 Y_2 + k_3 Y_3 + \cdots + k_n Y_n$

$\therefore \hat{\beta}$ is linear in Y.

Unbiasedness (for $\hat{\beta}$)

Proposition: $\hat{\beta}$ is an unbiased estimator of the true parameter $\beta$. If $\hat{\theta}$ is an estimator of $\theta$, then $E(\hat{\theta}) - \theta$ is the amount of bias, and $\hat{\theta}$ is an unbiased estimator of $\theta$ if bias $= 0$, i.e., $E(\hat{\theta}) = \theta$.

We know that

$\hat{\beta} = \sum k_i Y_i = \sum k_i(\alpha + \beta X_i + u_i) = \alpha\sum k_i + \beta\sum k_i X_i + \sum k_i u_i$,

but

$\sum k_i = \frac{\sum x_i}{\sum x_i^2} = 0$ (since $\sum x_i = 0$), and

$\sum k_i X_i = \frac{\sum x_i X_i}{\sum x_i^2} = \frac{\sum(X_i - \bar{X})X_i}{\sum x_i^2} = \frac{\sum X_i^2 - \bar{X}\sum X_i}{\sum X_i^2 - n\bar{X}^2} = \frac{\sum X_i^2 - n\bar{X}^2}{\sum X_i^2 - n\bar{X}^2} = 1$.

Therefore $\hat{\beta} = \beta + \sum k_i u_i$, i.e., $\hat{\beta} - \beta = \sum k_i u_i$, and

$E(\hat{\beta}) = \beta + \sum k_i E(u_i) = \beta$, since the $k_i$ are fixed and $E(u_i) = 0$.

Therefore, $\hat{\beta}$ is an unbiased estimator of $\beta$.
Properties of OLS Estimates . . . Con’t . . .
Minimum Variance

• Now we have to establish that, out of the class of linear and unbiased estimators of $\alpha$ and $\beta$, $\hat{\alpha}$ and $\hat{\beta}$ possess the smallest sampling variances. For this, we first obtain the variances of $\hat{\alpha}$ and $\hat{\beta}$ and then establish that each has the minimum variance in comparison with other linear and unbiased estimators obtained by econometric methods other than OLS.

$\mathrm{var}(\hat{\beta}) = E\left(\sum k_i u_i\right)^2$

$= E\left[k_1^2 u_1^2 + k_2^2 u_2^2 + \cdots + k_n^2 u_n^2 + 2k_1 k_2 u_1 u_2 + \cdots + 2k_{n-1}k_n u_{n-1}u_n\right]$

$= E\left(\sum k_i^2 u_i^2\right) + E\left(\sum_{i \ne j} k_i k_j u_i u_j\right)$

$= \sum k_i^2 E(u_i^2) + 2\sum_{i \ne j} k_i k_j E(u_i u_j) = \sigma^2 \sum k_i^2$, since $E(u_i u_j) = 0$ for $i \ne j$.
Properties of OLS Estimates . . . Con’t . . .
Since $\sum k_i^2 = \sum \frac{x_i^2}{\left(\sum x_i^2\right)^2} = \frac{1}{\sum x_i^2}$, therefore:

$\mathrm{Var}(\hat{\beta}) = \frac{\sigma_u^2}{\sum x_i^2}$, and similarly $\mathrm{Var}(\hat{\alpha}) = \frac{\sigma_u^2 \sum X_i^2}{n \sum x_i^2}$

• To check whether $\hat{\beta}$ possesses the minimum variance property, we compare its variance with the variance of some other linear and unbiased estimator of $\beta$, say $\beta^*$. We want to prove that any other linear and unbiased estimator of the true population parameter, obtained from any econometric method other than OLS, has a larger variance than the OLS estimator.
Properties of OLS Estimates . . . Con’t . . .
Suppose $\beta^*$ is an alternative linear and unbiased estimator of $\beta$, given by

$\beta^* = \sum w_i Y_i$, where $w_i \ne k_i$ but $w_i = k_i + c_i$, with $k_i = \frac{x_i}{\sum x_i^2}$.

Since $Y_i = \alpha + \beta X_i + u_i$,

$\beta^* = \sum w_i(\alpha + \beta X_i + u_i) = \alpha\sum w_i + \beta\sum w_i X_i + \sum w_i u_i$

$E(\beta^*) = \alpha\sum w_i + \beta\sum w_i X_i$, since $E(u_i) = 0$.

Since $\beta^*$ is assumed to be an unbiased estimator of $\beta$, it must be true that $\sum w_i = 0$ and $\sum w_i X_i = 1$.

But $w_i = k_i + c_i$, so $\sum w_i = \sum k_i + \sum c_i$; therefore $\sum c_i = 0$, since $\sum k_i = \sum w_i = 0$.

Again, $\sum w_i X_i = \sum(k_i + c_i)X_i = \sum k_i X_i + \sum c_i X_i$; since $\sum w_i X_i = 1$ and $\sum k_i X_i = 1$, it follows that $\sum c_i X_i = 0$.

From these values we can derive $\sum c_i x_i = 0$, where $x_i = X_i - \bar{X}$:

$\sum c_i x_i = \sum c_i(X_i - \bar{X}) = \sum c_i X_i - \bar{X}\sum c_i = 0$

Thus, from the above calculations we can summarize the following results: $\sum w_i = 0$, $\sum w_i X_i = 1$, $\sum c_i = 0$, $\sum c_i X_i = 0$, $\sum c_i x_i = 0$.

To check whether $\hat{\beta}$ has minimum variance, let us compute $\mathrm{var}(\beta^*)$ and compare it with $\mathrm{var}(\hat{\beta})$:

$\mathrm{var}(\beta^*) = \mathrm{var}\left(\sum w_i Y_i\right) = \sum w_i^2\, \mathrm{var}(Y_i) = \sigma^2 \sum w_i^2$, since $\mathrm{var}(Y_i) = \sigma^2$.

But $\sum w_i^2 = \sum(k_i + c_i)^2 = \sum k_i^2 + 2\sum k_i c_i + \sum c_i^2 = \sum k_i^2 + \sum c_i^2$, since $\sum k_i c_i = \frac{\sum c_i x_i}{\sum x_i^2} = 0$.

Therefore, using $\sum k_i^2 = \frac{1}{\sum x_i^2}$:

$\mathrm{var}(\beta^*) = \sigma^2\sum k_i^2 + \sigma^2\sum c_i^2 = \frac{\sigma^2}{\sum x_i^2} + \sigma^2\sum c_i^2$

$\mathrm{var}(\beta^*) = \mathrm{var}(\hat{\beta}) + \sigma^2\sum c_i^2$

Given that $c_i$ is an arbitrary constant, $\sigma^2\sum c_i^2$ is non-negative, thus $\mathrm{var}(\beta^*) \ge \mathrm{var}(\hat{\beta})$. This proves that $\hat{\beta}$ possesses the minimum variance property. In a similar way we can prove that the least squares estimate of the intercept, $\hat{\alpha}$, possesses minimum variance.
Variance of the Error/Random Term ($U_i$)
✓ You may observe that the variances of the OLS estimates involve $\sigma_u^2$, the population variance of the random disturbance term. It is difficult to obtain population data on the disturbance term for technical and economic reasons, so $\sigma_u^2$ is difficult to compute, which implies that the variances of the OLS estimates are also difficult to compute. But we can compute these variances if we use the unbiased estimate of $\sigma_u^2$, namely $\hat{\sigma}_u^2$, computed from the sample residuals $e_i$ from the expression:

$\hat{\sigma}_u^2 = \frac{\sum e_i^2}{n-2}$

To use $\hat{\sigma}^2$ in the expressions for the variances of $\hat{\alpha}$ and $\hat{\beta}$, we have to prove that $\hat{\sigma}^2$ is an unbiased estimator of $\sigma^2$, i.e., that $E(\hat{\sigma}^2) = E\left(\frac{\sum e_i^2}{n-2}\right) = \sigma^2$.
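Combining $\hat{\sigma}_u^2$ with the variance formulas gives the standard errors used in the tests below. Here is a hedged sketch (the helper name is illustrative, not from the slides); with the data of Example 2.1 it yields $\hat{\sigma}_u^2 = 38.7$, $se(\hat{\alpha}) \approx 8.3$ and $se(\hat{\beta}) \approx 0.898$, the figures used later in this chapter.

```python
import numpy as np

def ols_standard_errors(X, Y, alpha_hat, beta_hat):
    """Estimate sigma_u^2 and the standard errors of the OLS coefficients."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    n = len(Y)
    e = Y - (alpha_hat + beta_hat * X)               # residuals e_i
    sigma2_hat = (e ** 2).sum() / (n - 2)            # sum(e_i^2) / (n - 2)
    sxx = ((X - X.mean()) ** 2).sum()                # sum of x_i^2 in deviation form
    se_beta = np.sqrt(sigma2_hat / sxx)              # se(beta-hat)
    se_alpha = np.sqrt(sigma2_hat * (X ** 2).sum() / (n * sxx))  # se(alpha-hat)
    return sigma2_hat, se_alpha, se_beta
```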
Confidence Interval and Hypothesis Testing

Testing the significance of a given regression coefficient

• Since the sample values of the intercept and the coefficient are estimates of the true population parameters, we have to test them for their statistical reliability.
• The significance of a model can be seen in terms of:

  • the amount of variation in the dependent variable that it explains, and

  • the significance of the regression coefficients.

• There are different tests available to assess the statistical reliability of the parameter estimates. The following are the common ones:
A) The standard error test
B) The Student's t-test
Confidence Interval and … Con’t . . .
A) The Standard Error Test

• This test first establishes the two hypotheses to be tested, commonly known as the null and alternative hypotheses. The null hypothesis states that the sample comes from a population whose parameter is not significantly different from zero, while the alternative hypothesis states that the sample comes from a population whose parameter is significantly different from zero. The two hypotheses are given as follows:

H0: βi = 0
H1: βi ≠ 0

• The standard error test is outlined as follows:
Confidence Interval and … Con’t . . .
1. Compute the standard deviations of the parameter estimates using the variance formulas above (the standard deviation is the positive square root of the variance):

$se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}_u^2}{\sum x_i^2}}$

$se(\hat{\beta}_0) = \sqrt{\frac{\hat{\sigma}_u^2 \sum X_i^2}{n\sum x_i^2}}$

2. Compare the standard errors of the estimates with the numerical values of the estimates and make a decision:

a) If $se(\hat{\beta}_i) < \frac{1}{2}\left|\hat{\beta}_i\right|$, reject the null hypothesis and conclude that the estimate is statistically significant.

b) If $se(\hat{\beta}_i) \ge \frac{1}{2}\left|\hat{\beta}_i\right|$, do not reject the null hypothesis and conclude that the estimate is not statistically significant.
Confidence Interval and … Con’t . . .
• Example 2.4: The estimated regression of supply on price is

$\hat{Y}_i = 33.75 + 3.25 X_i$
         (8.3)    (0.9)

where the numbers in parentheses are standard errors. Test the statistical significance of the estimates using the standard error test.

Solution: The following information is given for the decision:

$\hat{\beta}_0 = 33.75$, $se(\hat{\beta}_0) = 8.3$; $\hat{\beta}_1 = 3.25$, $se(\hat{\beta}_1) = 0.9$

✓ Testing for $\hat{\beta}_0$: since $se(\hat{\beta}_0) = 8.3 < \frac{1}{2}(33.75) = 16.875$, we reject the null hypothesis and conclude that the parameter estimate $\hat{\beta}_0$ is statistically significant.

✓ Testing for $\hat{\beta}_1$: since $se(\hat{\beta}_1) = 0.9 < \frac{1}{2}(3.25) = 1.625$, we reject the null hypothesis and conclude that the parameter estimate $\hat{\beta}_1$ is statistically significant.
Confidence Interval and … Con’t . . .

B) The Student's t-Test

• In conditions where the Z-test does not apply (small samples), the t-test can be used to test the statistical reliability of the parameter estimates. The test depends on the degrees of freedom of the sample. The procedures of the t-test are similar to those of the Z-test and are outlined as follows:

1. Set up the hypotheses. The hypotheses for testing a given regression coefficient are given by:

H0: βi = 0
H1: βi ≠ 0

2. Determine the level of significance for carrying out the test. We usually use either a 1% or a 5% level of significance in applied econometric research.
Confidence Interval and … Con’t . . .
3. Determine the tabulated value of t from the table with n − k degrees of freedom, where k is the number of parameters estimated.

4. Determine the calculated value of t. The test statistic (using the t-test) is given by:

$t_{cal} = \frac{\hat{\beta}_i}{se(\hat{\beta}_i)}$

• The test rule or decision is given as follows:

o Reject H0 if $\left|t_{cal}\right| > t_{\alpha/2,\, n-k}$

o Do not reject H0 if $\left|t_{cal}\right| \le t_{\alpha/2,\, n-k}$
Confidence Interval and … Con’t . . .
• Example 2.5: Refer to Example 2.1. Is the price of the commodity significant in determining the quantity supplied of the commodity under consideration? Use α = 0.05.

• The hypothesis to be tested is:

H0: β1 = 0
H1: β1 ≠ 0

• Using the results from Example 2.1,

$\hat{\beta}_1 = 3.25$, $se(\hat{\beta}_1) = 0.8979$

$t_{cal} = \frac{\hat{\beta}_1}{se(\hat{\beta}_1)} = \frac{3.25}{0.8979} = 3.62$

• The tabulated value, for 10 degrees of freedom and α/2 = 0.025, is 2.228. Since the calculated t is greater than the tabulated value, we reject the null hypothesis and conclude that the price of the commodity is statistically significant in determining the quantity supplied of the commodity at the 5% level of significance.
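The calculation in Example 2.5 can be sketched in a few lines (a hedged example assuming SciPy is available for the critical value; variable names are illustrative):

```python
from scipy import stats

beta_hat, se_beta = 3.25, 0.8979
n, k = 12, 2                                   # 12 observations, 2 estimated parameters
t_cal = beta_hat / se_beta                     # calculated t, about 3.62
t_tab = stats.t.ppf(1 - 0.05 / 2, df=n - k)    # tabulated t(0.025, 10), about 2.228

print(round(t_cal, 2), round(t_tab, 3))
print("reject H0" if abs(t_cal) > t_tab else "do not reject H0")
```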
Confidence Interval and … Con’t . . .
Confidence Interval Estimation of the Regression Coefficients

✓ In the above section we saw how to test the reliability of parameter estimates. One thing that must be clear is that rejecting the null hypothesis does not mean that the parameter estimates are correct estimates of the true population parameters.
✓ It means that the estimates come from a sample drawn from a population whose parameter is significantly different from zero. In order to define how close to the estimate the true parameter lies, we must construct a confidence interval for the parameter.
✓ We can construct 100(1 − α)% confidence intervals for the sample regression coefficients. To do so we need the standard errors of the sample regression coefficients. The standard error of a given coefficient is the positive square root of the variance of the coefficient.
Confidence Interval and … Con’t . . .
✓ Variance of the intercept ($\hat{\beta}_0$):

$\mathrm{var}(\hat{\beta}_0) = \hat{\sigma}_u^2 \frac{\sum X_i^2}{n\sum x_i^2}$

✓ Variance of the slope ($\hat{\beta}_1$):

$\mathrm{var}(\hat{\beta}_1) = \frac{\hat{\sigma}_u^2}{\sum x_i^2}$, where $\hat{\sigma}_u^2 = \frac{\sum e_i^2}{n-k}$

• The standard errors are the positive square roots of these variances, and the 100(1 − α)% confidence interval for the slope is:
Confidence Interval and … Con’t . . .
$\hat{\beta}_1 - t_{\alpha/2}(n-k)\, se(\hat{\beta}_1) \le \beta_1 \le \hat{\beta}_1 + t_{\alpha/2}(n-k)\, se(\hat{\beta}_1)$

• Example 2.6: From Example 2.1 above, determine the 95% confidence interval for the slope.

$\hat{\beta}_1 = \frac{\sum xy}{\sum x^2} = \frac{156}{48} = 3.25$, $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X} = 63 - (3.25)(9) = 33.75$

$\hat{\sigma}_u^2 = \frac{\sum e_i^2}{n-k} = \frac{387}{12-2} = \frac{387}{10} = 38.7$

$\mathrm{var}(\hat{\beta}_1) = \hat{\sigma}_u^2 \frac{1}{\sum x_i^2} = 38.7 \times \frac{1}{48} = 0.80625$

• The standard error of the slope is:

$se(\hat{\beta}_1) = \sqrt{\mathrm{var}(\hat{\beta}_1)} = \sqrt{0.80625} = 0.8979$

• The tabulated value of t for n − k = 12 − 2 = 10 degrees of freedom and α/2 = 0.025 is 2.228. Hence, the 95% confidence interval for the slope is given by:

$\hat{\beta}_1 \pm t_{\alpha/2}(n-k)\, se(\hat{\beta}_1) = 3.25 \pm (2.228)(0.8979) = 3.25 \pm 2 = (1.25,\ 5.25)$
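The interval computation is a one-liner once the pieces are in place (a brief sketch; SciPy assumed for the t critical value):

```python
from scipy import stats

beta_hat, se_beta, n, k = 3.25, 0.8979, 12, 2
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - k)   # about 2.228 for 10 d.f.
half_width = t_crit * se_beta                  # about 2.0

print((round(beta_hat - half_width, 2), round(beta_hat + half_width, 2)))  # (1.25, 5.25)
```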
Test of Model Adequacy (Overall Significance Test)
• Is the estimated equation a useful one? To answer this, an objective measure of some sort is desirable.

• The total variation in the dependent variable Y can be partitioned into two parts: one that accounts for variation due to the regression equation (the explained portion) and another associated with the unexplained portion of the model:

$\sum\left(Y - \bar{Y}\right)^2 = \sum\left(\hat{Y} - \bar{Y}\right)^2 + \sum\left(Y - \hat{Y}\right)^2$

TSS = ESS + RSS
Test of Model Adequacy Con’t . . .
• In other words, the total sum of squares (TSS) is decomposed into the regression (explained) sum of squares (ESS) and the error (residual or unexplained) sum of squares (RSS).
• Thus, a small value of R² casts doubt on the usefulness of the regression equation. We do not, however, pass final judgment on the equation until it has been subjected to an objective statistical test. Such a test is accomplished by means of analysis of variance (ANOVA), which enables us to test the significance of R² (i.e., the adequacy of the linear regression model).
• In the simple linear regression model, if the slope coefficient is statistically significant, the regression model is adequate. The ANOVA table for simple linear regression is given below:

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square
Regression            ESS              k − 1                ESS/(k − 1)
Residual              RSS              n − k                RSS/(n − k)
Total                 TSS              n − 1                TSS/(n − 1)

Variance ratio: $F_{cal} = \dfrac{ESS/(k-1)}{RSS/(n-k)}$
Test of Model Adequacy
• To test for the significance of R², we compare the variance ratio with the critical value from the F distribution with (k − 1) and (n − k) degrees of freedom in the numerator and denominator, respectively, for a given significance level α.

• Decision: If the calculated variance ratio exceeds the tabulated value, that is, if $F_{cal} > F_\alpha(k-1,\ n-k)$, we conclude that R² is significant (or that the linear regression model is adequate or statistically significant).

• Note that the F-test is designed to test the significance of all variables, or of a set of variables, in a regression model. In the two-variable model, however, it is used to test the explanatory power of a single variable (X) and is, at the same time, equivalent to the test of significance of R².
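For Example 2.1, TSS = 894 and ESS = $\hat{\beta}\sum y_i x_i = 3.25 \times 156 = 507$, so the F-test can be sketched as follows (SciPy assumed for the critical value; names are illustrative):

```python
from scipy import stats

tss, n, k = 894.0, 12, 2
ess = 3.25 * 156.0                               # beta-hat * sum(y_i x_i) = 507
rss = tss - ess                                  # 387
f_cal = (ess / (k - 1)) / (rss / (n - k))        # about 13.10 (equal to t_cal squared)
f_tab = stats.f.ppf(0.95, dfn=k - 1, dfd=n - k)  # about 4.96

print(round(f_cal, 2), round(f_tab, 2))
print("model adequate" if f_cal > f_tab else "model not adequate")
```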
Prediction Using Simple Linear Regression Model
Predicting future values of the dependent variable is one of the key tasks in econometric analysis. The estimated regression equation $\hat{Y}_i = \hat{\alpha} + \hat{\beta}X_i$ is used for predicting the value of Y for a given value of X. To proceed, let $X_0$ be the given value of X. Then we predict the corresponding value $Y_0$ of Y by:

$\hat{Y}_0 = \hat{\alpha} + \hat{\beta}X_0$

The true value of Y is given by $Y_0 = \alpha + \beta X_0 + U_0$, where $U_0$ is the error term.

Hence, the prediction error is:

$\hat{Y}_0 - Y_0 = (\hat{\alpha} - \alpha) + (\hat{\beta} - \beta)X_0 - U_0$

Since $E(\hat{\alpha} - \alpha) = 0$, $E(\hat{\beta} - \beta) = 0$ and $E(U_0) = 0$, we have $E(\hat{Y}_0 - Y_0) = 0$.

This shows that the predictor is unbiased, in the sense that $E(\hat{Y}_0) = E(Y_0)$, since both $\hat{Y}_0$ and $Y_0$ are random variables.
Example
Using the following information about money demand (M), measured in billion USD, and the interest rate (R), measured in percent, for a sample of eight economies:

Money Demand (M) 56 50 46 30 20 35 37 61


Interest Rate (R) 6.3 4.6 5.1 7.3 8.9 5.3 6.7 3.5

A. Assuming the relationship

$M\ (\text{Money Demand}) = \beta_0 + \beta_1\, \text{Interest Rate}\ (R) + e$,

obtain the OLS estimators of $\beta_0$ and $\beta_1$.
Example Con’t . . .
M (Y)  R (X)  y  x  y²  x²  yx
56 6.3 14.125 0.337 199.516 0.114 4.767
50 4.6 8.125 -1.363 66.016 1.856 -11.070
46 5.1 4.125 -0.863 17.016 0.744 -3.558
30 7.3 -11.875 1.338 141.016 1.789 -15.883
20 8.9 -21.875 2.938 478.516 8.629 -64.258
35 5.3 -6.875 -0.663 47.266 0.439 4.555
37 6.7 -4.875 0.738 23.766 0.544 -3.595
61 3.5 19.125 -2.463 365.766 6.064 -47.095

Sum: 335  47.7  0  0  1338.875  20.179  −136.138
Example Con’t . . .
$\hat{\beta}_1 = \frac{\sum yx}{\sum x^2} = \frac{-136.138}{20.179} = -6.747$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X} = 41.875 - (-6.747)(5.9625) = 82.10$

Then, the estimated money demand function becomes:

$\hat{M}\ (\text{Money Demand}) = \hat{\beta}_0 + \hat{\beta}_1\, \text{Interest Rate}\ (R)$

$\hat{M}\ (\text{Money Demand}) = 82.10 - 6.747 \times \text{Interest Rate}\ (R)$

B. If in a 9th economy the interest rate is R = 8.1, predict the demand for money (M) in this economy:

$\hat{M}\ (\text{Money Demand}) = 82.10 - 6.747 \times 8.1 = 82.10 - 54.65 = 27.45$
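Assuming the `ols_fit` helper sketched earlier, the estimates and the prediction can be reproduced directly from the raw data:

```python
M = [56, 50, 46, 30, 20, 35, 37, 61]           # money demand, billion USD
R = [6.3, 4.6, 5.1, 7.3, 8.9, 5.3, 6.7, 3.5]   # interest rate, percent

b0, b1 = ols_fit(R, M)                         # intercept ~ 82.10, slope ~ -6.747
print(round(b0, 2), round(b1, 3))
print(round(b0 + b1 * 8.1, 2))                 # predicted M at R = 8.1, ~ 27.45
```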
Example using STATA Output
. reg qtybeer pricebeer

      Source |       SS       df       MS              Number of obs =      30
-------------+------------------------------           F(  1,    28) =   51.14
       Model |  1156.92449     1  1156.92449           Prob > F      =  0.0000
    Residual |  633.490177    28  22.6246492           R-squared     =  0.6462
-------------+------------------------------           Adj R-squared =  0.6335
       Total |  1790.41467    29  61.7384368           Root MSE      =  4.7565

     qtybeer |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+---------------------------------------------------------------
   pricebeer |  -9.835284   1.375388    -7.15   0.000    -12.65264   -7.017929
       _cons |   86.40601   4.324293    19.98   0.000     77.5481     95.26392

Estimated equation: qtybeer = 86.406 − 9.835 × pricebeer

Reading the output:
• qtybeer is the outcome variable (Y); pricebeer is the predictor variable (X).
• Prob > F is the p-value of the model. It tests whether R² is different from 0; usually we need a p-value lower than 0.05 to show a statistically significant relationship between X and Y.
• R-squared shows the proportion of the variance of Y explained by X. In this case, the price of beer explains 64.62% of the variance in the quantity of beer demanded.
• Adj R-squared shows the same as R-squared but adjusted for the number of cases and the number of variables. When the number of variables is small and the number of cases is very large, Adj R-squared is closer to R-squared; it provides a more honest measure of the association between X and Y.
• For each one-point increase in the price of beer, the quantity of beer demanded decreases by 9.835 million liters.
• The two-tail p-values test the hypothesis that each coefficient is different from 0. To reject this, the p-value has to be lower than 0.05 (one could also choose α = 0.01). In this case, the price of beer is statistically significant in explaining the quantity of beer demanded.
• The t-values test the hypothesis that the coefficient is different from 0. To reject this, you need a t-value greater than 1.96 (for 95% confidence). You can get the t-value by dividing the coefficient by its standard error. The t-values also show the importance of a variable in the model.
• The 95% confidence interval shows the lower and upper limits within which to find the true population parameter.
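For readers working in Python rather than STATA, here is a hedged sketch of the equivalent regression using statsmodels (the function name is illustrative, and `pricebeer`/`qtybeer` stand for the 30 observations used above):

```python
import numpy as np
import statsmodels.api as sm

def fit_beer_demand(pricebeer, qtybeer):
    """Replicate the STATA call `reg qtybeer pricebeer` for the given arrays."""
    X = sm.add_constant(np.asarray(pricebeer, dtype=float))  # adds the _cons column
    return sm.OLS(np.asarray(qtybeer, dtype=float), X).fit()

# result = fit_beer_demand(pricebeer, qtybeer)
# print(result.summary())   # Coef., Std. Err., t, P>|t|, R-squared, F, 95% CI
```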
