Econometrics Chapter 1 & 2

This document provides an overview of econometrics by: 1) Defining econometrics as the intersection of economics, mathematics, and statistics for empirical analysis of economic relationships. 2) Distinguishing between economic models, which are theoretical representations, and econometric models, which incorporate empirical data and statistical analysis. 3) Outlining the aims and methodology of econometrics, which include formulating testable models, estimating models with data, testing hypotheses, and using models to make predictions. The classical methodology proceeds from economic theory to mathematical then econometric specification, data collection, parameter estimation, and hypothesis testing.


Chapter 1

Basic Concepts of Econometrics


Chapter objectives: At the end of this chapter students will be able to:
 Understand what econometrics is all about.
 Distinguish among a model, an economic model, and an econometric model.
 Understand the aims and methodology of econometrics.
1.1 Introduction
Definition - Econometrics is the field of economics concerned with the application of
mathematics, statistics, economic theory and computer science (statistical inference) to
the empirical analysis of relationships between variables. To put it differently,
econometrics is basically concerned with the empirical estimation (analysis) of economic
phenomena using the tools of economic theory, mathematics and statistical inference.
Hence, econometrics is the intersection (blend) of mathematics, statistics and economics.
Econometrics is by no means the same as economic statistics. Nor is it identical with
what we call general economic theory, although a considerable portion of this theory has
a definitely quantitative character. Nor should econometrics be taken as synonymous with
the application of mathematics to economics. Experience has shown that each of these
three viewpoints, that of statistics, economic theory, and mathematics, is a necessary, but
not by itself a sufficient, condition for a real understanding of the quantitative relations in
modern economic life. It is the unification of all three that is powerful. And it is this
unification that constitutes Econometrics.
Different economists define econometrics differently, but all of them arrive at the same
conclusion, which can be boiled down to the definition given above.

1.2 Economic model and Econometric model

The first task an econometrician faces is that of formulating an econometric model. What
is a model?
A model is a simplified representation of actual phenomena. For instance, saying that ‘the
quantity demanded of oranges depends on the price of oranges' is a simplified
representation because there are a host of other variables that one can think of that
determine the demand for oranges. For example, the price of related goods such as apples,
the income of the consumer, and the taste and preference of the consumer are determinants
of the demand for oranges. However, there is no end to this stream of other variables. In a
remote sense even the price of gasoline can affect the demand for oranges.

An economic model is a set of assumptions that approximately describes the behavior of
an economy (or a sector of an economy).

An econometric model consists of the following:


 A set of behavioral equations derived from the economic theory/ model. These
equations involve some observed variables and some ‘disturbances’ (which are a
catchall for all the variables considered as irrelevant for the purpose of this model
as well as all unforeseen events).
 A statement of whether there are errors of observation in the observed variables.
 A specification of the probability distribution of the ‘disturbances’ (and errors of
measurement)
With these specifications we can proceed to test the empirical validity of the economic
model and use it to make forecasts or use it in policy analysis.

Here we can distinguish between two types of algebraic model.


1. Deterministic (mathematical or exact) model - where there is a unique value of
the dependent variable for a given value of the independent variable. Example:
Y = β0 + β1X, where β0 and β1 are constants, Y = selling price, and X = advertising
expenditure.

2. Econometric (Stochastic) model


The element of randomness in human behavior necessitates statistical modeling,
introducing probabilistic discourse into the analysis of economic variables. Hence the
relation between Y and X is no longer exact; Y is likely to assume one of several values.
The stochastic relationship can be expressed as:
Y = β0 + β1X + u, where 'u' is the random or error term.

In fact, u is the source of inexactness in this model. Assume u is a random variable
taking the value +100 or −100 with equal probability (0.5, 0.5). Let also
β0 = 50 and β1 = 1. For a given value of X (say X0), Y is likely to take one of two
possible values. That is:

Y = 50 + X0 + 100 with probability 0.5
or Y = 50 + X0 − 100 with probability 0.5

Then, E(Y / X = X0) = 0.5[50 + X0 + 100] + 0.5[50 + X0 − 100] = 50 + X0 = the average
value of Y
Note: - The value of Y is not certain now, it is rather determined probabilistically
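The two-outcome example above can be checked with a short simulation (a minimal sketch; the fixed value X0 = 80 is an arbitrary illustrative choice, not from the text):

```python
import random

# Sketch of the stochastic model above: Y = 50 + X + u, where the
# disturbance u takes the value +100 or -100 with probability 0.5 each.
beta0, beta1 = 50, 1
x0 = 80  # an arbitrary fixed value of X (illustrative)

rng = random.Random(42)
draws = [beta0 + beta1 * x0 + rng.choice([+100, -100]) for _ in range(100_000)]
avg = sum(draws) / len(draws)

# E(Y | X = x0) = 0.5(50 + x0 + 100) + 0.5(50 + x0 - 100) = 50 + x0 = 130
print(round(avg, 1))  # close to 130
```

Any single draw is either 230 or 30; only the conditional mean is pinned down at 50 + X0.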

1.3 The Aims and Methodology of Econometrics

i) Aims of Econometrics Analysis are: -


1. Formulation of econometric models in an empirically testable form. Usually, there are
several ways of formulating the econometric model from an economic model, because we
have to choose the functional form, the specification of the stochastic structure of the
variables, and so on. This part constitutes the specification aspect of the econometric
work.
2. Estimation and testing of models with observed data; this part constitutes
hypothesis testing and inference.
3. Use of models for prediction.

ii) Basic Steps in Econometrics


How do econometricians proceed in their analysis of an economic problem? Although
there are several schools of thought on econometric methodology, the traditional or
classical methodology, which still dominates empirical research in economics and other
social sciences, proceeds along the following lines.
1. Economic Theory or model

This is a statement of a theoretical proposition (or hypothesis) about the relationship
between economic variables.

For example, in economics there is a theory that states a positive relationship between
household consumption expenditure and the level of income, other things being constant:
as income increases, consumption expenditure will also increase.
Other accepted theories in economics include a negative relation between investment and
the interest rate, a positive relation between demand for a normal good and the level of
income, a negative relation between quantity demanded and the price of a product, etc.

2. Specification of the mathematical model

The relationship between variables proposed by economic theories will be presented
mathematically. The relationship between consumption and income could be expressed
mathematically as follows:
Y = β0 + β1X, where Y = consumption expenditure (the dependent variable),
X = level of income (the independent variable), and β0 and β1, known as the parameters
of the model, are the intercept and slope coefficients, respectively.
The mathematical model assumes an exact relationship between Y and X; hence it is
also referred to as a Deterministic Model.

3. Specification of the Econometric model


Since relationships between economic variables are generally inexact, the econometrician
would modify the mathematical (deterministic) consumption function as follows:
Y = β0 + β1X + u, where u, known as the disturbance or error term, is a random
(stochastic) variable that has well-defined probabilistic properties. The disturbance
term u may well represent all those factors that affect consumption but are not taken
into account explicitly.

The significance of the stochastic disturbance term

As noted above, the disturbance term u is a surrogate for all those variables that are
omitted from the model but that collectively affect the dependent variable Y. But the
question is: why not introduce all those variables into the model explicitly? Some of
the reasons are:

i) Effects of omitted variables
ii) Statistical errors
iii) Vagueness of theory
iv) Unavailability of data
v) Intrinsic randomness in human behavior
vi) Wrong functional form, etc.

4. Obtaining the data - To estimate the econometric model specified above, that is, to
obtain the numerical values of β0 and β1, we need data. The econometrician or
researcher can collect either primary or secondary data depending on his interest or
justification. Of course, we will have more to say about the nature and sources of data
for econometric analysis in the next chapter.

5. Estimation of the parameters of the econometric model - After obtaining the data,
the next task is to estimate the parameters of the consumption function. The numerical
estimates of the parameters give empirical content to the consumption function. The
actual mechanics of estimating the parameters will be discussed in the next chapter.
6. Hypothesis Testing - Assuming that the fitted model is a reasonably good
approximation of reality, we have to develop suitable criteria to find out whether the
estimates (for the values of β 0 and β 1) obtained from the econometric model are in
accordance with the expectations of the theory that is being tested.
Given the model: Y = β0 + β1X + u, where Y is consumption and X is income.
According to accepted economic theory, there is a positive relationship between
consumption expenditure and income. The marginal propensity to consume (MPC,
measured by β1) has to be positive: 0 < β1 < 1.
Thus, the value of β1 is expected to lie between zero and one; if our estimation yields
a negative sign (β1 < 0), the result contradicts the theory.

Before we accept the finding, we must enquire whether this estimate is a valid result
rather than a chance occurrence. We test whether the estimated coefficients are
statistically significant, whether the estimated model is a close approximation of the
actual phenomenon, etc. The confirmation or refutation of economic theories on the
basis of sample evidence is based on a branch of statistical theory known as
statistical inference (hypothesis testing).
Hypothesis testing enables the researcher to decide whether to use the model for
prediction or further analysis, and whether the interpretation of the estimation
result is valid.

7. Forecasting or prediction - If the chosen model does not refute the hypothesis or
theory under consideration, we may use it to predict the future value(s) of the dependent
variable Y on the basis of the known or expected future value(s) of the independent
variable X. For example, from a simple investment model, one can predict that for a
one-unit fall (rise) in the real interest rate, gross investment will increase (decrease)
by an amount approximately equal to β1.

8. Using the model for policy purposes

If the regression result seems reasonable, any policy maker or the government might use
appropriate fiscal and monetary policy mix to manipulate the control variable X (say
interest rate) to produce the desired level of the target variable Y (say investment).
1.4 The Types of Data for Econometric Analysis

Sources of data:

1. Primary data source
2. Secondary data source
Data could be primary or secondary data. Primary data are data collected by the
researcher for his/her study from a selected sample using different instruments: such as
interview, questionnaire, focus group discussion, experiments, etc.
Secondary data are data that are available from documents, reports or different
organizations. They are data that were not originally collected by the researcher but
by someone else. For example, in Ethiopia, secondary data on macroeconomic variables
can be obtained from: CSA (Central Statistics Authority), NBE (National Bank of
Ethiopia), MOFD (Ministry of Finance and Development), EEA (Ethiopian Economic
Association), etc.

International organizations such as the World Bank (WB database), IMF, etc. are also
major sources of macroeconomic data on almost every country.
In general, there are three types of data used in empirical analysis: cross-section
data, time series data and pooled data (including longitudinal or panel data).

1) Cross-section data: - are data on one or more variables collected at the same point
in time. The most important example of cross-sectional data is the population census,
which provides data on demographic and socio-economic characteristics of households at
a given point in time. In Ethiopia, the population census is taken once every 10 years;
the most recent was taken in 2007.
Example 1 – monthly income of selected households in 2005

Subject | Income in birr | Sex | Schooling | Work experience in years
      1 |            600 |  M  | Secondary | 4
      2 |           1200 |  M  | Secondary | 2
      3 |            900 |  F  | Tertiary  | 3
      ⋮ |              ⋮ |  ⋮  |     ⋮     | ⋮
    100 |           5000 |  M  | Tertiary  | 12

We have 100 sampled households (subjects) and 4 cross-sectional observations for each
subject. Thus, we have a total of 400 cross-sectional data points (4 × 100).
Example 2 – prices for sampled fruits in 2005

Items   | Output in tons | Price per kg (in birr)
Banana  |            200 | 8
Orange  |             60 | 12
Avocado |             30 | 10
Apple   |              5 | 60
Mango   |             75 | 8

We have two cross-sectional observations for 5 sampled fruits. Hence, we have
2 × 5 = 10 cross-sectional data points.
Cross - sectional data suffer from the problem of heterogeneity.

2) Time series data: - A time series is a set of observations on the values that a
variable takes at different times. Such data may be collected at regular time intervals:
daily, weekly, monthly, quarterly, annually, etc. The most important time series data
are macroeconomic data such as GDP, GNP, PCI (per capita income), etc., which are
reported annually.

Example - Ethiopian per capita income (PCI) at constant 2005 US dollars, 2000-2012
(World Bank database, 2014)

Year | PCI
2000 | 135
2001 | 142
2002 | 140
2003 | 133
2004 | 147
2005 | 160
2006 | 172
2007 | 187
2008 | 202
2009 | 214
2010 | 234
2011 | 253
2012 | 269

In time series data the problem of non-stationarity is encountered, where the mean and
variance of the data vary over time. A time series is said to be stationary if its mean
and variance do not vary over time (if mean and variance are constant).
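As a rough illustration of the non-stationarity point (not a formal test such as the augmented Dickey-Fuller test), one can compare the mean of the PCI series above across the two halves of the sample:

```python
# PCI series from the table above (Ethiopia, 2000-2012, constant 2005 US$)
pci = [135, 142, 140, 133, 147, 160, 172, 187, 202, 214, 234, 253, 269]

def mean(xs):
    return sum(xs) / len(xs)

half = len(pci) // 2
m1, m2 = mean(pci[:half]), mean(pci[half:])

# The mean drifts upward over time: a sign that the series is non-stationary.
print(round(m1, 1), round(m2, 1))
```

A stationary series would show roughly the same mean and variance in both halves.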
3) Pooled data: - In this case data have elements of both time series and cross-section.
For example, the following are macroeconomic data for the Ethiopian economy, 2000-2012
(Source: World Bank 2014 database).

Year | PCI |     g |     I |     S
2000 | 135 |  3.0  | 23.11 | 11.0
2001 | 142 |  5.22 | 24.53 | 12.68
2002 | 140 |  1.38 | 27.31 | 13.16
2003 | 133 | -5.0  | 25.0  | 10.74
2004 | 147 | 10.4  | 29.76 | 12.86
2005 | 160 |  8.73 | 26.53 |  5.92
2006 | 172 |  7.83 | 27.94 |  4.98
2007 | 187 |  8.48 | 24.46 |  4.92
2008 | 202 |  7.86 | 24.7  |  4.88
2009 | 214 |  6.0  | 25.59 |  7.0
2010 | 234 |  9.63 | 27.43 |  7.55
2011 | 254 |  8.32 | 27.87 | 12.73
2012 | 269 |  6.0  | 33.08 | 15.0

PCI - per capita income at constant 2005 US$; g - GDP per capita growth rate in %;
I - gross domestic investment as % of GDP; S - gross domestic saving as % of GDP.

We have 4 cross-sectional variables (per capita income, growth rate, saving and
investment), and for each variable we have 13 time series observations: a total of
4 × 13 = 52 observations.
4) Panel or longitudinal data: - This is a special type of pooled data in which the same
cross-sectional unit is surveyed over time. It is very closely related to pooled data.
e.g., the Ethiopian Rural Household Survey data since 1989.

Chapter Two –
Classical Linear Regression Model (CLRM)
2.1. The concept of Regression

Definition: Regression analysis is the study of the relationship between two or more
variables. There are dependent and explanatory (independent) variables; hence,
regression is used to study the dependence of one variable (the dependent variable) on
one or more explanatory variables.
We regress the dependent variable on the explanatory variables (regress Y on X), and we
estimate or predict the mean value of the dependent variable in terms of the known
(fixed) values of the independent variables.

Regression analysis is concerned with statistical dependence, not functional or
deterministic relationships, among variables. The dependent variable is assumed random
or stochastic with a certain pattern of probability distribution. The mean or expected
value of the dependent variable for a given fixed value of the independent variable is
a function of the independent variable. If the econometric model is given as:

Yi = β0 + β1Xi + ui

The Population Regression Function (PRF) which shows the expected value of the
dependent variable (conditional upon X, the independent variable) is given as:

E(Y/X) = β0 + β1X

Y is the dependent and X the independent variable. E(Y/X) is the conditional mean of Y.

β0 and β1 are parameters, called the regression coefficients: β0 is the intercept
coefficient and β1 is the slope coefficient.

Suppose the relation between age and height of children, where height of children is the
dependent variable (Y) while age of children (X) is the explanatory variable; that is,
height increases as age increases:

Yi = β0 + β1Xi + ui

where Y is the actual height of a child measured in inches, X is the age of a child in
years, and ui denotes the error term. The expected height of children for any given age
is given as the conditional mean of Y at X:

E(Y/X) = β0 + β1X - the PRF, which shows the conditional mean of Y

Using the population regression line, the PRF is graphically shown as follows

[Figure: the PRF line E(Y/X) = β0 + β1X, with height (Y, in inches, 40 to 70) on the
vertical axis and age (X, in years, 10 to 16) on the horizontal axis; points A, B, C and
D mark the conditional means at ages 10, 12, 14 and 16.]
The population regression line contains the average distribution of heights (Y) at a
given age (X) of children. Height of children increases as age increases, on average.
Note that at any given age (for any fixed value of the explanatory variable, X),
children's height is random; it could be above or below the regression line. That is, Y
is randomly distributed while X is statistically fixed. However, the conditional mean of
height (E(Y/X)) of children in each age group is predictable, and it is denoted by
points on the regression line such as A, B, C and D.
 For example, at age 10, assume the expected height of children is 48 inches (point A
on the PRF line). But the actual height of a child can be anywhere above or below
point A (it could be 50, 53, 58, or 38, 40, 45, etc. inches).
Since we don’t have the entire population data, we rely on sample data to estimate the
population mean value. Thus, the sample counterpart of the population regression
function is referred to as the Sample Regression Function (SRF), given as:

Ŷi = β̂0 + β̂1Xi

Ŷi - estimator of the population conditional mean E(Y/X); simply the estimator of Yi
β̂0 - estimator of β0
β̂1 - estimator of β1

If we draw the line for the estimated mean values (the SRF), it will not necessarily
overlap with the PRF line; it is only approximately close to it (below, the broken line
shows the estimated SRF).

[Figure: the SRF (broken line) plotted against the PRF (solid line) for height (Y)
versus age (X); the two lines are close but do not coincide.]
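The PRF/SRF distinction can be illustrated by simulation: draw a sample from a known population line, estimate the SRF by OLS, and note that the estimates lie close to, but not exactly on, the true parameters. All numbers here (true β0 = 2, β1 = 0.5, the error spread, the sample size) are invented for illustration:

```python
import random

rng = random.Random(0)

# Population line (PRF): E(Y|X) = 2 + 0.5 X, known only inside this simulation
beta0, beta1 = 2.0, 0.5
X = [float(i % 50 + 1) for i in range(200)]           # fixed regressor values
Y = [beta0 + beta1 * x + rng.gauss(0, 1) for x in X]  # add random disturbances

# OLS estimates (SRF), using the deviation-form formulas of this chapter
n = len(X)
xb, yb = sum(X) / n, sum(Y) / n
b1 = sum((x - xb) * (y - yb) for x, y in zip(X, Y)) / sum((x - xb) ** 2 for x in X)
b0 = yb - b1 * xb

print(round(b0, 3), round(b1, 3))  # near, but not equal to, 2 and 0.5
```

Re-running with a different seed gives a different SRF, while the PRF stays fixed: that is exactly the sampling variability the broken line in the figure represents.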

Note: the dependent variable is also called the explained variable, regressand,
predictand, endogenous variable, outcome, etc.
The explanatory variable is also called the independent variable, predictor, regressor,
exogenous variable, stimulus variable, etc.

Although regression analysis deals with the dependence of one variable on other
variables, it does not necessarily imply causation (in the sense that one variable is a
cause of another). It only shows a statistical relationship between the dependent and
explanatory variables.

There is a fundamental difference between Regression Analysis & Correlation Analysis.


In correlation analysis the primary objective is to measure the strength or degree of
linear association between two variables. The correlation coefficient measures this
strength of (linear) association.

In Regression Analysis the primary objective is to estimate or predict the average value
of one variable on the basis of the fixed values of other variables (explanatory).

In regression analysis there is an asymmetry in the way the dependent and explanatory
variables are treated. The dependent variable is assumed to be statistical, random, or
stochastic, that is, to have a probability distribution. The explanatory variables, on the
other hand, are assumed to have fixed values (in repeated sampling).

2.2. Simple Linear Regression model (Two – variable case)

Regression analysis could involve simple or multiple regression models. A multiple
regression model is a model that contains three or more variables:

Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + u (a 5-variable model)

The simple linear regression model is a regression model that contains only two
variables: one dependent and one explanatory variable.
Y = β0 + β1X + u
Our discussion in this chapter is focused on two variables model (Simple Regression
Model)

2.2.1 Assumptions of Classical Simple Regression model (CLRM)


The Classical Linear Regression Model (CLRM) is based on certain key assumptions.
The CLRM assumptions are discussed below;
Assumption 1): - The model is linear in parameters. That is, the expected value of the
dependent variable, E(Y/Xi), is a linear function of the parameters, the β's; however,
it may or may not be linear with respect to the explanatory variable X.
Example:

Y = β0 + β1X + u and Y = β0 + β1X² + u are both linear-in-parameters regression
functions.

But:

Y = β0 + β1²X + u and Y = α0 + (α2/α1)X + u are non-linear-in-parameters regression
functions.
Assumption 2): X values are fixed in repeated sampling. Values taken by the regressor X
are considered fixed in repeated samples. More technically, X is assumed to be non-
stochastic or non- random.

Assumption 3): - Zero mean value of the disturbance term: E(ui) = 0.

This assumption states that the factors not explicitly included in the model (which are
captured by ui) do not systematically affect the mean value of Y; i.e., their effect
cancels out on average, so that E(ui) = 0 for each i.

Assumption 4): - Homoscedasticity or equal variance of ui. Given the value of X, the
variance of ui is the same for all observations. Symbolically:

Var(ui/Xi) = E[ui − E(ui/Xi)]² = E(ui²) = σ², for i = 1, …, n

Hence var(ui) = σ².
The opposite of homoscedasticity is heteroscedasticity, which means the variance of the
error term is not constant.
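A quick simulated contrast may help (all numbers invented): under homoscedasticity the spread of the disturbances does not depend on X, while under heteroscedasticity it does, here with the error standard deviation growing in proportion to X:

```python
import random

rng = random.Random(1)
X = list(range(1, 101))

u_homo = [rng.gauss(0, 5) for _ in X]          # constant sd: homoscedastic
u_hetero = [rng.gauss(0, 0.1 * x) for x in X]  # sd grows with X: heteroscedastic

def var(xs):
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

# Compare disturbance variance for small-X versus large-X observations
lo_het, hi_het = var(u_hetero[:50]), var(u_hetero[50:])
lo_hom, hi_hom = var(u_homo[:50]), var(u_homo[50:])
print(round(lo_het, 1), round(hi_het, 1))  # variance rises with X
print(round(lo_hom, 1), round(hi_hom, 1))  # variance roughly constant
```

Under the CLRM only the first pattern, constant variance, is assumed; the second violates Assumption 4.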

Assumption 5): No autocorrelation between the disturbance or error terms. Given any two
X values, Xi and Xj (i ≠ j), the correlation between any two error terms ui and uj
(i ≠ j) is zero; i.e.:

cov(ui, uj) = E{[ui − E(ui)][uj − E(uj)]}
            = E[(ui − 0)(uj − 0)]
            = E(ui uj) = 0

Assumption 6): Zero covariance between the error term ui and the explanatory variable
Xi; i.e., E(uiXi) = 0.

cov(ui, Xi) = E{[ui − E(ui)][Xi − E(Xi)]}
            = E(uiXi) − E(Xi)E(ui)   (since Xi is non-random)
            = E(uiXi) = 0, by assumption

If the error terms are correlated with the explanatory variables, it is difficult to
isolate the influence of X and u on the dependent variable Y. Assumption 6 is
automatically fulfilled if the X variable is non-random (non-stochastic) and
Assumption 3 holds.

Assumption 7): - The number of observations n must be greater than the number of
explanatory variables; in other words, n must be greater than the number of parameters
to be estimated.

Assumption 8): Variability of the explanatory variable; the values of X have to vary.
According to this assumption, the X values in a given sample must not all be the same.
Technically, var(X) has to be positive; equivalently, the sum of squared deviations of
the X values from their mean has to be positive: Σxi² > 0, where xi² = (Xi − X̄)².

Assumption 9: The regression model has to be specified correctly. Alternatively, there is


no specification bias or error in the model used in econometric analysis. An econometric
investigation has to begin with the specification of the econometric model underlying the
phenomenon.

Model specification has to include the following points:


 What variables should be included in the model?
 What is the functional form of the model? Is it linear in the parameters, the
variables, or both?
 What are the probabilistic assumptions made about the Y i , the X i and the ui
entering the model?

If important variables are omitted from the model, the wrong functional form is chosen,
or wrong stochastic assumptions are made about the variables of the model, the validity
of interpreting the estimated regression will be highly questionable.

Assumption 10): - No perfect multicollinearity among the regressors (explanatory
variables). This assumption is relevant in multiple regression models, discussed in the
next chapter.

Normality Assumption: - The disturbance or error terms are normally distributed for all
i; i.e., ui ~ N(0, σ²).

The disturbance term ui is normally distributed with zero mean and variance σ².
The normality assumption is an outcome of the Central Limit Theorem (CLT) of statistics,
which implies that as the sample size increases, the distribution of the variable
becomes approximately normal. With the inclusion of the normality assumption about the
probability distribution of the disturbance term, the model is called the Classical
Normal Linear Regression Model (CNLRM).
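The CLT intuition behind this assumption can be sketched numerically (an illustration, not a proof): averages of shocks drawn from a decidedly non-normal uniform distribution behave approximately normally, so roughly 95% of standardized sample means fall within ±1.96 standard errors.

```python
import math
import random

rng = random.Random(7)
n, reps = 50, 2000

# Each "disturbance" is the average of n uniform(-1, 1) shocks (non-normal inputs)
se = math.sqrt(1 / 3) / math.sqrt(n)  # sd of uniform(-1, 1) is sqrt(1/3)
means = [sum(rng.uniform(-1, 1) for _ in range(n)) / n for _ in range(reps)]

share = sum(abs(m) <= 1.96 * se for m in means) / reps
print(round(share, 3))  # close to 0.95, as the normal approximation predicts
```

This is the sense in which a disturbance built from many small independent influences can plausibly be treated as normal.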

An assumption about the nature of the probability distribution is required for
hypothesis testing and for inference about the true population parameters based on the
sample estimates.

The probability distribution of the OLS (Ordinary Least Squares) estimators depends on
the distribution of the error. Since knowledge of the probability distributions of OLS
estimators is necessary to draw inferences about their population values, the nature of
the probability distribution of ui assumes an extremely important role in hypothesis
testing.

2.2.3 Method of Ordinary least square (OLS)


The Ordinary Least Squares (OLS) method is one of the most commonly used methods of
estimation. The OLS method has some very attractive statistical properties that have
made it one of the most powerful and popular methods of regression analysis, as will be
discussed later.
Given a two-variable PRF (Population Regression Function):

Y = β0 + β1X + u

Since the PRF is not observable, we estimate it from the SRF (Sample Regression
Function):

Ŷi = β̂0 + β̂1Xi
Yi = Ŷi + ui
Yi = β̂0 + β̂1Xi + ui,  i = 1, …, n
⇒ ui = Yi − (β̂0 + β̂1Xi), or ui = Yi − Ŷi
Now, given data (observations on Y and X), we would like to determine the SRF in such a
manner that the estimated value (Ŷi) is as close as possible to the actual Y. To this
end, the OLS method determines or estimates β̂0 and β̂1 in such a way that the residual
sum of squares (RSS) is as small as possible. The OLS method of estimation is shown
below.

The objective is to minimize Σui² with respect to β̂0 and β̂1:

ui = Yi − β̂0 − β̂1Xi; squaring and summing both sides:

min Σui² = Σ(Yi − β̂0 − β̂1Xi)²  over β̂0, β̂1

Then take the partial derivatives with respect to β̂0 and β̂1. The first-order
conditions are:

∂(Σui²)/∂β̂0 = −2Σ(Yi − β̂0 − β̂1Xi) = −2Σui = 0 …… (1)
∂(Σui²)/∂β̂1 = −2Σ(Yi − β̂0 − β̂1Xi)Xi = −2ΣXiui = 0 …… (2)
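The first-order conditions above pick out the (β̂0, β̂1) pair that minimizes the residual sum of squares. As a sanity check (with made-up data), a brute-force grid search over candidate pairs lands on the same point as the closed-form OLS solution derived next:

```python
# Illustrative data (invented for this sketch)
Y = [3.0, 4.5, 5.0, 7.0, 8.5]
X = [1.0, 2.0, 3.0, 4.0, 5.0]
n = len(Y)

def rss(b0, b1):
    # residual sum of squares for a candidate line
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(X, Y))

# Closed-form OLS estimates
xb, yb = sum(X) / n, sum(Y) / n
b1 = sum((x - xb) * (y - yb) for x, y in zip(X, Y)) / sum((x - xb) ** 2 for x in X)
b0 = yb - b1 * xb

# Brute-force grid search over candidate (b0, b1) pairs, step 0.01
grid = [(a / 100, b / 100) for a in range(0, 500) for b in range(0, 300)]
best = min(grid, key=lambda p: rss(*p))

print(b0, b1, best)  # the grid minimum coincides with the closed-form estimates
```

Because the RSS is a convex function of (β̂0, β̂1), the point where both partial derivatives vanish is the unique minimum.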
From equation (1), we have:

ΣYi − nβ̂0 − β̂1ΣXi = 0; dividing both sides by the sample size n:

ΣYi/n − β̂0 − β̂1ΣXi/n = 0; thus,

β̂0 = Ȳ − β̂1X̄

β̂0 is the least squares point estimator for β0, the intercept term, where Ȳ and X̄ are
the average (mean) values of Y and X, respectively.
From equation (2), we have:

−2[Σ(YiXi − β̂0Xi − β̂1Xi²)] = 0
ΣXiYi − β̂0ΣXi − β̂1ΣXi² = 0

Note that β̂0 = Ȳ − β̂1X̄ and ΣXi can be written as nX̄:

ΣXiYi − (Ȳ − β̂1X̄)nX̄ − β̂1ΣXi² = 0
ΣXiYi − nȲX̄ + β̂1nX̄² − β̂1ΣXi² = 0
ΣXiYi − nȲX̄ = β̂1ΣXi² − β̂1nX̄² = β̂1[ΣXi² − nX̄²]

Hence:

β̂1 = (ΣXiYi − nX̄Ȳ) / (ΣXi² − nX̄²)

β̂1 is the least squares point estimator for β1.

In deviation form: given that ΣXiYi − nȲX̄ = Σ(Yi − Ȳ)(Xi − X̄) = Σxiyi and
ΣXi² − nX̄² = Σxi², the formula for β̂1 can be written in deviation form as follows:

β̂1 = Σxiyi / Σxi²
The lowercase letters yi and xi denote deviations from the mean: the deviation of an
individual observation from its mean (average) value.
yi = Yi − Ȳ - the deviation of Y values from their mean
xi = Xi − X̄ - the deviation of X values from their mean
[Note that in most of our subsequent discussion we write variables in deviation form.]
The estimators as derived above are called point estimators; that is, given the sample,
each estimator provides only a single (point) value of the relevant population parameter.

Numerical Example 1: - Consider hypothetical data on output (Y) produced and labor
input (X) used by a firm, given as follows:
Y 11 10 12 6 10 7 9 10 11 10
X 10 7 10 5 8 8 6 7 9 10

Then we have two variables, Y (the dependent variable) and X (the explanatory variable),
with sample size n = 10, and:
ΣYi = 96, ΣXi = 80, ΣYi² = 952, ΣXi² = 668, ΣYiXi = 789, Ȳ = 9.6, X̄ = 8
The model is specified as: Y = β0 + β1X + u

Then estimate the values of the regression coefficients based on the data:

β̂1 = (ΣYiXi − nȲX̄) / (ΣXi² − nX̄²) = (789 − (10)(8)(9.6)) / (668 − (10)(8)²)
    = 21/28 = 0.75

β̂0 = Ȳ − β̂1X̄ = 9.6 − 0.75(8) = 3.6

Thus, we have: Yi = 3.6 + 0.75Xi + ui and Ŷi = 3.6 + 0.75Xi,
where β̂0 = 3.6 and β̂1 = 0.75 are point estimates of the true parameters. The value of
β̂1 (= 0.75) is interpreted as the marginal product of labor: for a one-unit increase in
labor employment, total output will increase by 0.75 units. Note also the following
summations in deviation form:

Σyixi = 21, Σxi² = 28, Σyi² = 30.4, Σŷi² = β̂1²Σxi² = 0.75² × 28 = 15.75

Note: β̂1 = Σxiyi/Σxi² = 21/28 = 0.75
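The hand calculation above can be reproduced in a few lines using the deviation-form formulas of this section:

```python
# Output (Y) and labor input (X) data from Numerical Example 1
Y = [11, 10, 12, 6, 10, 7, 9, 10, 11, 10]
X = [10, 7, 10, 5, 8, 8, 6, 7, 9, 10]
n = len(Y)

y_bar = sum(Y) / n  # 9.6
x_bar = sum(X) / n  # 8.0

# beta1_hat = sum(xi yi) / sum(xi^2), with xi and yi in deviation form
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))  # 21
sxx = sum((x - x_bar) ** 2 for x in X)                      # 28
beta1_hat = sxy / sxx                  # 0.75
beta0_hat = y_bar - beta1_hat * x_bar  # 3.6

print(beta0_hat, beta1_hat)
```

The same two formulas, β̂1 = Σxiyi/Σxi² and β̂0 = Ȳ − β̂1X̄, work for any simple regression data set.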
2.2.4 Important properties of OLS estimators: -

1. The SRF passes through the sample means of Y and X (Ȳ and X̄ respectively); in other
words, the SRF contains both mean values:

β̂0 = Ȳ − β̂1X̄  ⇒  Ȳ = β̂0 + β̂1X̄

[Figure: the SRF line Ŷ = β̂0 + β̂1X passing through the point (X̄, Ȳ), with intercept
β̂0 on the vertical axis.]

In the above example we have X̄ = 8, so the mean of Y can be determined as follows:
Ŷ = β̂0 + β̂1X̄ = 3.6 + 0.75(8) = 9.6. Hence, Ȳ = 9.6 given X̄ = 8. As long as X̄ is
known, the mean of the dependent variable can be determined using the SRF.

2) The mean of the estimated values (^Y i) is equal to the mean of the actual Y values:
Y i = ^Y i + u i. Summing both sides and dividing by n, we have:
∑Y i/n = ∑^Y i/n + ∑u i/n, where ∑u i/n = 0 by assumption
Ȳ = ^Ȳ + 0; hence, Ȳ = ^Ȳ = ^β 0 + ^β 1 X̄
3. The sum of the residuals is zero, which can be proved as follows:
u i = Y i − ^β 0 − ^β 1 X i
∑u i = ∑(Y i − ^β 0 − ^β 1 X i) = ∑Y i − ∑(^β 0 + ^β 1 X i) = ∑Y i − ∑^Y i = nȲ − n^Ȳ = 0
(since Ȳ = ^Ȳ). Hence: ∑u i = 0
4. The OLS model can be written in deviation form as follows:
Y i = ^Y i + u i = ^β 0 + ^β 1 X i + u i
Y i = ^Y i + u i; subtracting Ȳ from both sides: Y i − Ȳ = ^Y i − Ȳ + u i
2a) y i = ^y i + u i
Again, observe that ^y i = ^Y i − Ȳ = (^β 0 + ^β 1 X i) − (^β 0 + ^β 1 X̄) = ^β 1(X i − X̄)
2b) ^y i = ^β 1 x i - the predicted value written in deviation form
y i = ^y i + u i = ^β 1 x i + u i
2c) y i = ^β 1 x i + u i - the deviation form of the actual value

5. The residuals and the predicted values are uncorrelated, i.e. cov(^Y i, u i) = 0.
The estimated value ^Y i depends only on the explanatory variable; it is not affected by
the error term u i:
cov(^Y i, u i) = ∑^Y i u i/n = ∑u i(^β 0 + ^β 1 X i)/n = (^β 0 ∑u i + ^β 1 ∑u i X i)/n = 0 + 0 = 0
Hence, there is no correlation between the estimated values of the dependent variable and
the residuals.

6. The residuals are not correlated with the explanatory variable, i.e. cov(X i, u i) = 0.
This implies that the error term has no systematic relation or correlation with the
explanatory variable.
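Properties 3, 5 and 6 can be checked numerically for the firm example. A small sketch (the helper names are mine, not from the text):

```python
# Illustrative check of residual properties 3, 5 and 6 for the firm data,
# using the estimates b0 = 3.6 and b1 = 0.75 from the example.
Y = [11, 10, 12, 6, 10, 7, 9, 10, 11, 10]
X = [10, 7, 10, 5, 8, 8, 6, 7, 9, 10]
b0, b1 = 3.6, 0.75

Yhat = [b0 + b1 * x for x in X]           # fitted values
u = [y - yh for y, yh in zip(Y, Yhat)]    # residuals

sum_u = sum(u)                                        # property 3: sum of residuals = 0
sum_uX = sum(ui * x for ui, x in zip(u, X))           # property 6: sum(u*X) = 0
sum_uYhat = sum(ui * yh for ui, yh in zip(u, Yhat))   # property 5: sum(u*Yhat) = 0
print(sum_u, sum_uX, sum_uYhat)   # all zero up to floating-point rounding
```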

2.2.5. Measure of Goodness of Fit (Coefficient of Determination,r 2)

Having estimated a particular linear model, a natural question that comes up is:
 How well does the estimated regression line fit the observations?
 The coefficient of determination r 2 is a summary measure that tells how well the
sample regression line fits the data.

Since the population regression function (PRF) is estimated using a sample, there is
always some deviation from the actual values. Using the sample observations, we produce
the SRF (Sample Regression Function). The measure of 'goodness of fit', denoted by r²
in the simple two-variable regression, helps us to see how close the estimated sample
regression line is to the population regression line. r² measures the proportion of the
variation in the dependent variable that can be accounted for by the explanatory variable.
r² is computed from the sample information and the amount of error made. Recall that:

Y i = ^Y i + u i ⋯ (a)
Ȳ = ^Ȳ ⋯ (b)

- Deducting equation (b) from equation (a), we have:
(Y i − Ȳ) = (^Y i − ^Ȳ) + u i
Written in deviation form, where y i = Y i − Ȳ and ^y i = ^Y i − ^Ȳ:
y i = ^y i + u i

Squaring and summing both sides, we have:
∑y i² = ∑(^y i + u i)² = ∑^y i² + 2∑^y i u i + ∑u i²
Since ∑u i ^y i = 0, we have:
∑y i² = ∑^y i² + ∑u i² ⋯ (c)
TSS = ESS + RSS, where:
TSS = ∑y i² - total sum of squares (the total variation of the dependent variable)
ESS = ∑^y i² = ^β 1² ∑x i² - explained sum of squares (the explained variation: the
variation of the dependent variable accounted for by the explanatory variable)
RSS = ∑u i² = ∑y i² − ∑^y i² = ∑y i² − ^β 1² ∑x i² - residual sum of squares (the
unexplained variation: the variation in the dependent variable that is not explained by
the explanatory variables in the model)
Dividing both sides of equation (c) by TSS, we have:
TSS/TSS = ESS/TSS + RSS/TSS
Hence: r² = ESS/TSS = 1 − RSS/TSS - two equivalent ways of computing the goodness of fit

Note: - r² measures the proportion (percentage) of the total variation in the dependent
variable (Y) explained by the regression model (the regressor). r² is taken as a measure
of the 'goodness of fit' of the model.

The coefficient of correlation (r) is a measure of the degree of association or correlation
between two variables. The sample correlation coefficient of two variables X and Y can
be computed in two ways, as follows:
r = √r² or r = cov(y, x)/(σ y σ x) = ∑y i x i/(√∑y i² √∑x i²)
'r' has a different interpretation from r². 'r' is a measure of the degree of association
between two variables; it is not a measure of the goodness of fit of a model. r lies
between negative one and positive one: −1 ≤ r ≤ 1.

22
When r = −1, there is perfect negative correlation between X and Y. When r = 1, there is
perfect positive correlation between X and Y.

Note the following points about r 2:-


1. r² is non-negative; it can't be negative.
2. It always lies between zero and one: 0 ≤ r² ≤ 1.
3. If r² = 0, the model doesn't explain anything: the explanatory variable doesn't explain
the changes in the dependent variable.
4. When r² is close to 1, the explanatory variable explains much of the change observed
in the dependent variable.
5. r² = 1 means a perfect fit: Y i = ^Y i.
6. As the number of explanatory variables increases, the value of r² also increases.
Given observations on Y & X, r 2 can also be calculated using the formula shown above.
Example: - For the numerical example given above (on output and employment of labor),
compute TSS, ESS, RSS and r².

Solution: - we have the following summations in deviation form: ∑y i x i = 21,
∑x i² = 28, ∑y i² = 30.4, ∑^y i² = ^β 1² ∑x i² = 0.75² × 28 = 15.75; hence,


TSS = ∑y i² = 30.4, ESS = ^β 1 ∑y i x i = 0.75 × 21 = 15.75, RSS = 30.4 − 15.75 = 14.65
r² = ^β 1² ∑x i²/∑y i² = (0.75² × 28)/30.4 = 15.75/30.4 = 0.52, or
r² = ^β 1 ∑y i x i/∑y i² = (0.75)(21)/30.4 = 0.52
r = √0.52 = 0.72, or r = ∑y i x i/(√∑y i² √∑x i²) = 21/√(30.4 × 28) = 21/29.175 = 0.72

Interpretation: r 2 = 0.52 means that about 52% of the variation in output is explained by
the variation in labor hour.
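The decomposition can be verified in Python; a sketch using the deviation sums quoted in the text:

```python
# Illustrative sketch: TSS = ESS + RSS and r^2 for the firm example.
# Deviation sums from the text: sum(xy) = 21, sum(x^2) = 28, sum(y^2) = 30.4.
Sxy, Sxx, Syy = 21.0, 28.0, 30.4
b1 = Sxy / Sxx            # 0.75

TSS = Syy                 # total sum of squares
ESS = b1 * Sxy            # explained: 0.75 * 21 = 15.75 (equals b1^2 * Sxx)
RSS = TSS - ESS           # residual: 30.4 - 15.75 = 14.65

r2 = ESS / TSS            # equivalently 1 - RSS/TSS
r = r2 ** 0.5
print(f"r2 = {r2:.2f}, r = {r:.2f}")   # r2 = 0.52, r = 0.72
```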

Students Activity: - show that r² = ^β 1 ∑x i y i/∑y i² = (∑x i y i)²/(∑x i² ∑y i²)
2.3. The BLUE property of OLS Estimators: Gauss - Markov Theorem
The Gauss-Markov Theorem states that, given the assumptions of the classical linear
regression model, the OLS estimators satisfy the Best Linear Unbiased Estimator (BLUE)
property.
An estimator, say the OLS estimator ^β 1, is said to be a Best Linear Unbiased Estimator
(BLUE) of β 1 if the following are satisfied:

1) It is linear, that is, a linear function of a random variable, such as the dependent
variable Y in the regression model.
2) It is unbiased, that is, the expected value of the estimator is equal to its true value:
E(^β 1) = β 1 and E(^β 0) = β 0. The expected values of the OLS estimators are the same
as their respective true parameters.
3) It has minimum variance (efficient estimator) in the class of all such linear
unbiased estimators.
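The unbiasedness property can be illustrated with a small Monte Carlo experiment (this simulation is mine, not from the text): for a model with known parameters, the OLS slope averaged over many repeated samples comes out close to the true value.

```python
# Illustrative simulation of unbiasedness: Y = 2 + 0.5*X + u, u ~ N(0, 1).
# Averaging the OLS slope over many samples recovers the true slope 0.5.
import random

random.seed(42)
true_b0, true_b1 = 2.0, 0.5
X = list(range(30))                       # fixed regressor values
Xbar = sum(X) / len(X)
Sxx = sum((x - Xbar) ** 2 for x in X)

slopes = []
for _ in range(2000):
    Y = [true_b0 + true_b1 * x + random.gauss(0, 1) for x in X]
    Ybar = sum(Y) / len(Y)
    b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / Sxx
    slopes.append(b1)

mean_b1 = sum(slopes) / len(slopes)
print(round(mean_b1, 3))   # close to the true slope 0.5
```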

^ ^
2.4. Variance & covariance of OLS estimators β 0 ∧ β1

^ ^
Estimates β 0 ∧ β1 differ from sample to sample. Since we have only one sample at a
time, we rely on the precision of these estimates in representing the true parameters (
β 0∧β 1 ¿ . The measure of such precision is the standard error. It can be shown that the
variances of the estimators are:

a) var(^β 0) = δ²[1/n + X̄²/∑x i²] = δ² ∑X i²/(n ∑x i²), and se(^β 0) = σ√∑X i²/√(n ∑x i²)
b) var(^β 1) = δ²/∑x i², and se(^β 1) = σ/√∑x i²
c) cov(^β 0, ^β 1) = −X̄ var(^β 1)
d) The population variance: δ² = var(Y) = ∑y²/n
e) If the population variance δ² is not known, it can be estimated from the sample as follows:
^δ² = RSS/(n−2) = ∑u²/(n−2)
Where RSS = Residual Sum of Squares and n−2 is the degrees of freedom. More variation
in X (the explanatory variable) and an increase in sample size (n) increase the precision
of the estimators (^β 0 and ^β 1); this is so because both reduce the variances of the
estimators (remember assumption 8, which states that var(X) has to be positive).

Example: - From the previous example of the firm:
Y i = 3.6 + 0.75X i + u i, r² = 0.52
RSS = ∑u i² = ∑y i² − ^β 1 ∑y i x i = 30.4 − (0.75)(21) = 14.65
Therefore, ^δ² = ∑u²/(n−2) = 14.65/8 = 1.83 - the estimate of the population variance
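These formulas can be evaluated directly; a sketch using the sums from the firm example:

```python
# Illustrative sketch: variance estimate and standard errors for the firm
# example (sums from the text: RSS = 14.65, sum x^2 = 28, sum X^2 = 668).
import math

n, Sxx, SXX, RSS = 10, 28.0, 668.0, 14.65

var_hat = RSS / (n - 2)                       # sigma-hat^2 = 14.65/8
se_b1 = math.sqrt(var_hat / Sxx)              # se(b1)
se_b0 = math.sqrt(var_hat * SXX / (n * Sxx))  # se(b0)
print(f"{var_hat:.2f}, {se_b1:.3f}, {se_b0:.2f}")   # 1.83, 0.256, 2.09
```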

2.5. Implications of the Normality Assumption about the Error Term

The classical Normal Linear Regression Model (CNLRM) assumes that:


u i ∼ N(0, δ²): the error term is normally distributed with mean zero and variance δ².
This implies that:
- Y i ∼ N(β 0 + β 1 X i, δ²), because Y i is a linear function of u i; hence, Y is also
normally distributed.
- Similarly, ^β 0 and ^β 1 are normally distributed with the following means and
variances:
^β 1 ∼ N(β 1, δ²/∑x i²) and ^β 0 ∼ N(β 0, δ²[1/n + X̄²/∑x i²])

With the assumption that u i follows a normal distribution, the OLS estimators have the
following properties:

- They are unbiased.
- They have minimum variance; they are minimum-variance unbiased estimators.
- They are consistent; that is, as the sample size increases indefinitely, the estimators
converge to their true population values.
- The coefficients, being linear functions of the error term, are normally distributed with
constant mean and variance.

Then, by the properties of the normal distribution, the standardized values of the
coefficients can be computed (the standard normal Z distribution applies when δ² is
known; with ^δ² estimated from the sample, the t distribution is used):
(^β 1 − β 1)/se(^β 1) ∼ t(n−2), and similarly (^β 0 − β 0)/se(^β 0) ∼ t(n−2)
Later we are going to use these t distributions for hypothesis testing.


2.6. Interval Estimation for the regression coefficients β 0 and β 1
The reliability of a point estimator is measured by its standard error. Therefore, instead
of relying on the point estimate alone, we may construct an interval around the point
estimators (^β 0 and ^β 1), say within two or three standard errors on either side of the
point estimator, such that this interval has, say, a 95% probability of including the true
parameter value (β 0 or β 1).
Symbolically: Pr(^β 1 − δ ≤ β 1 ≤ ^β 1 + δ) = 1 − α, where 0 < α < 1
Where α is the level of significance (the probability of committing a Type I error).
Note: - Type I error: rejecting a true hypothesis
Type II error: accepting a false hypothesis

i) Confidence interval for the regression coefficients β 0 and β 1

Interval estimation gives the estimate of a coefficient in the form of a range/interval
between two values, with some level of confidence.
Suppose we have estimated: ^y i = ^β 0 + ^β 1 x i
(^β 0 − β 0)/se(^β 0) ∼ t(n−2), where se(^β 0) = √(^δ²[1/n + X̄²/∑x i²])
n−2 denotes the degrees of freedom, since we estimate two parameters (in the case of the
two-variable model); hence, df = n−2. If α = 5%, then a 95% confidence interval for β 0
is given as follows:
Pr[^β 0 − t(α/2, n−2) se(^β 0) ≤ β 0 ≤ ^β 0 + t(α/2, n−2) se(^β 0)] = 95%
Using the previous firm example, we have estimated the model as:
Y i = 3.6 + 0.75X i + u i
Sample size n = 10 and df = n−2 = 10−2 = 8, se(^β 0) = 2.09

- Construct a 95% confidence interval for β 0

Since we intend to construct a 95% confidence interval, α = 0.05 and α/2 = 0.025.
t(0.025, 8) = 2.306 (from the t-table). Then the confidence interval for β 0 will be:
Pr[^β 0 − t(0.025, 8) se(^β 0) ≤ β 0 ≤ ^β 0 + t(0.025, 8) se(^β 0)] = 95%
⇒ Pr[(3.6) − (2.306)(2.09) ≤ β 0 ≤ (3.6) + (2.306)(2.09)] = 95%
⇒ Pr[−1.22 ≤ β 0 ≤ 8.42] = 95%
Therefore, the 95% confidence interval for β 0 is [−1.22, 8.42].
Interpretation: - Given the confidence coefficient of 95%, in the long run, in 95 out of
100 cases intervals like (−1.22, 8.42) will contain the true β 0.

The confidence interval for β 1:
Pr[^β 1 − t(α/2, n−2) se(^β 1) ≤ β 1 ≤ ^β 1 + t(α/2, n−2) se(^β 1)] = 95%
Y i = 3.6 + 0.75X i + u i; n = 10, df = n−2 = 10−2 = 8
For the 95% confidence interval for β 1: α = 0.05, α/2 = 0.025, t(0.025, 8) = 2.306
∑x i² = 28, RSS = 14.65, and se(^β 1) = √var(^β 1) = √(^δ²/∑x i²) = √(1.83/28) = 0.256
Then the 95% confidence interval for β 1 will be:
Pr[^β 1 − t(0.025, 8) se(^β 1) ≤ β 1 ≤ ^β 1 + t(0.025, 8) se(^β 1)] = 95%
Pr[(0.75) − (2.306)(0.256) ≤ β 1 ≤ (0.75) + (2.306)(0.256)] = 95%
Pr[0.16 ≤ β 1 ≤ 1.34] = 95%
[0.16, 1.34] - the interval estimate of the true value of β 1 at 95% confidence
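Both intervals can be computed in a few lines; a sketch that hard-codes the table value t(0.025, 8) = 2.306, as in the text:

```python
# Illustrative sketch: 95% confidence intervals for b0 and b1 in the firm
# example. The critical value t(0.025, 8) = 2.306 is taken from the t-table.
b0, se_b0 = 3.6, 2.09
b1, se_b1 = 0.75, 0.256
t_crit = 2.306

ci_b0 = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
print([round(v, 2) for v in ci_b0])   # [-1.22, 8.42]
print([round(v, 2) for v in ci_b1])   # [0.16, 1.34]
```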

ii) Confidence Interval for the true variance σ²

χ²cal = (n−2)^δ²/σ² - follows the chi-square distribution with n−2 df.
^δ² is the variance estimated from the sample data and σ² is the true population variance.
Rearranging the above statement, we write the confidence interval for the variance at the
1−α level of confidence:
P[(n−2)^δ²/χ²(α/2, n−2) ≤ σ² ≤ (n−2)^δ²/χ²(1−α/2, n−2)] = 1−α
χ²(α/2, n−2) and χ²(1−α/2, n−2) are obtained from the χ² table.

Example: - returning to the numerical illustration above, construct a 95% confidence
interval for the true variance.
We have the estimated sample variance ^δ² = 1.83, n = 10, df = n−2 = 8, α = 0.05.
From the table we have: χ²(0.025, 8) = 17.5346 and χ²(0.975, 8) = 2.1797
P[(8)(1.83)/17.5346 ≤ σ² ≤ (8)(1.83)/2.1797] = 95%
P[0.835 ≤ σ² ≤ 6.72] = 95%
The true variance lies between 0.835 and 6.72 with 95% confidence.
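A sketch of the same computation, with the χ² critical values hard-coded from the table:

```python
# Illustrative sketch: 95% confidence interval for the true variance.
# Chi-square critical values chi2(0.975, 8) and chi2(0.025, 8) are from the table.
var_hat, df = 1.83, 8
chi2_lo, chi2_hi = 2.1797, 17.5346

lower = df * var_hat / chi2_hi    # (8)(1.83)/17.5346
upper = df * var_hat / chi2_lo    # (8)(1.83)/2.1797
print(round(lower, 3), round(upper, 2))   # 0.835 6.72
```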

2.7) Hypothesis Testing

Hypothesis testing is the process of determining whether or not a given hypothesized
statement is true. There are a number of hypothesis tests in econometric analysis: tests
about individual parameters, tests of the basic assumptions of the CLRM, tests of
normality, correlation tests, tests of the overall significance of the model, etc. The goal
of hypothesis testing is to ascertain whether the statistics from sampled data are reliable
for making inferences about population values.

Hypothesis testing could be a two- or one-tail test. Whether one uses a two- or one-tail
test of significance will depend upon how the alternative hypothesis is formulated, which,
in turn, may depend upon some a priori considerations or prior empirical experience.

i) Testing the significance of regression coefficients: The t-test

The t-test is used to test a hypothesis about the statistical significance of the estimated
value of an individual coefficient. In the language of significance tests, a statistic (a
particular value of an estimated coefficient) is said to be statistically significant if the
value of the test statistic lies in the critical region; in this case the null hypothesis is
rejected. Similarly, a test is said to be statistically insignificant if the value of the test
statistic lies in the acceptance region; in this case, the null hypothesis can't be rejected.

[Figure: the density f(x) of the t distribution, with rejection regions in the two tails
beyond the critical values −t(α/2, df) and t(α/2, df).]
−t(α/2, df) and t(α/2, df) are the critical t-values that can be obtained from the t-table.
From the sample information, the t-value of each estimated coefficient is computed as
follows:
t = (^β 1 − β 1)/se(^β 1) - follows the t distribution with n−2 df, where:
t c = (^β 1 − β 1)/se(^β 1) - the test statistic computed from the data (t-calculated), and
similarly t c = (^β 0 − β 0)/se(^β 0)

Two – sided or Two – Tail Test


We use two – tail test when we do not have a strong a priori or theoretical expectation
about the direction in which the alternative hypothesis should move relative to the null
hypothesis.
Decision rule: - Reject H0 if the computed t-value lies outside the interval
[−t(α/2, df), t(α/2, df)]; accept the null hypothesis if the computed value lies within the
interval.
Example: From our previous example of the firm, we have assumed a linear production
function Y = β 0 + β 1 X i + u i, where MP L = β 1, and the estimated result is
Y i = 3.6 + 0.75X i + u i.
Suppose we want to test the following at α = 5%:
H 0: β 1 = 0 and H 1: β 1 ≠ 0
t c = (^β 1 − β 1)/se(^β 1) = (0.75 − 0)/0.256 = 2.929
The critical value from the t-table at α/2 = 0.025 and df = 8 is ±2.306, giving the
acceptance region [−2.306, 2.306]. But the computed t-value is 2.929, which is outside
this interval; in other words, the estimated value lies in the critical (rejection) region.
Conclusion: Since t c lies outside the acceptance region, we reject H0: the marginal
product of labor (MP L, or β 1) is significantly different from zero.
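The test can be sketched as follows; the decision logic mirrors the two-tail rule above, with the table value t(0.025, 8) = 2.306 hard-coded:

```python
# Illustrative sketch of the two-tail t-test: H0: beta1 = 0 vs H1: beta1 != 0
# at alpha = 5% for the firm example.
b1, se_b1, beta1_H0 = 0.75, 0.256, 0.0
t_crit = 2.306                        # t(0.025, 8), from the t-table

t_c = (b1 - beta1_H0) / se_b1         # computed test statistic
reject_H0 = abs(t_c) > t_crit         # two-tail decision rule
print(round(t_c, 3), reject_H0)       # 2.93 True
```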

Note: a larger computed |t| value is always stronger evidence against the null hypothesis.

Note: If, on the basis of a test of significance, we decide to 'accept' the null hypothesis,
all we are saying is that on the basis of the sample evidence we have no reason to reject
it. In 'accepting' or 'rejecting' a null hypothesis, we should always be aware that another
null hypothesis may be equally compatible with the data.
The '2-t' rule of thumb: - If the number of df is 20 or more, and if α, the level of
significance, is set at 0.05, then the null hypothesis β 1 = 0 can be rejected if
|t c| = |^β 1/se(^β 1)| > 2.

The exact level of significance (P-value): The P-value is defined as the lowest
significance level at which the null hypothesis can be rejected. From our illustrative
example of the firm, given the null hypothesis that β 1 = 0, we have computed a t-value
of 2.929.
What is the P-value of obtaining a t-value as large as or greater than 2.929?
Using a computer, it can be shown that the probability of obtaining a t-value of 2.929 or
greater (for 8 df) is about 0.019, i.e., Pr(|t| > 2.929) ≈ 0.02. Since this is a low
probability, we reject the null hypothesis.

ii) Testing the overall significance of the model: The F-test

Given a general regression function of the form:
Y i = β 0 + β 1 X 1i + β 2 X 2i + ⋯ + β n X ni + u i
If we formulate the null and alternative hypotheses as:
H 0: β 1 = β 2 = ⋯ = β n = 0, and H 1: at least one β j ≠ 0
A test of such a hypothesis is called a test of the overall significance of the estimated
regression line. The overall significance of a regression model is tested using the F-test.
The F-value is computed from the data as follows:
TSS = ESS + RSS; dividing both sides by δ²:
TSS/δ² = ESS/δ² + RSS/δ² → ∑y i²/δ² = ∑^y i²/δ² + ∑u i²/δ²
χ²(n−1) = χ²(k−1) + χ²(n−k)
Thus, (ESS/(k−1))/(RSS/(n−k)) ∼ F(k−1, n−k)

Decision Rule: based on the comparison of the F-value computed from the sample using
the above formula (denoted by F c) with the F-value from the table (denoted simply by F)
at the given level of significance (α) and degrees of freedom (k−1 and n−k). Here, k
denotes the number of parameters in the model (including the intercept) and n the sample
size. The F-test has two degrees of freedom:
- one for the numerator (ESS), which is k−1 (in some statistical tables denoted by v 1)
- another for the denominator (RSS), which is n−k (in statistical tables denoted by v 2)

Hence, we reject H0 if F c > F and conclude that the regression coefficients are jointly
statistically significantly different from zero.

If F c < F, we accept the null hypothesis and conclude that the regression coefficients are
not jointly statistically significant.
Example: - Given the previously estimated model for output and labor, test the overall
significance of the model at the 0.05 level of significance.
Y i = 3.6 + 0.75X i + u i
We have: ESS = 15.75 with k−1 = 1, and RSS = 14.65 with n−k = 8
H 0: β 1 = 0, and H 1: β 1 ≠ 0 (the intercept is not part of the overall significance test)
The null hypothesis asserts that the slope coefficient is not significant, while the
alternative hypothesis states that it is significantly different from zero.
The significance of the model is tested through the F-test. First compute the F-value
from the sample data:
F c(1, 8) = (ESS/(k−1))/(RSS/(n−k)) = (15.75/1)/(14.65/8) = 8.6
Next, read the F-value from the F table at df = (1, 8) and the α = 0.05 level of
significance: F(1, 8) at the 5% level is 5.32.
Note: F(1, 8) = 5.32 is lower than the computed F; hence, we reject H0 and conclude
that the model is statistically significant.
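The F computation above can be sketched as follows, with the table value F(1, 8) = 5.32 hard-coded:

```python
# Illustrative sketch of the overall-significance F-test for the firm example.
# The critical value F(1, 8) = 5.32 at the 5% level is taken from the F table.
ESS, RSS = 15.75, 14.65
k, n = 2, 10                               # parameters (incl. intercept), sample size

F_c = (ESS / (k - 1)) / (RSS / (n - k))    # (15.75/1)/(14.65/8)
F_crit = 5.32
print(round(F_c, 1), F_c > F_crit)         # 8.6 True
```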

Reporting the Results of Regression Analysis

^Y i = 3.6 + 0.75X i
se = (2.09) (0.256), r² = 0.52
t = (1.72) (2.93), df = 8, F(1, 8) = 8.6

Note: the t-statistics for ^β 0 and ^β 1 are calculated under the null hypotheses β 0 = 0
and β 1 = 0.
Example 2: - Data on weekly household consumption expenditure (Y) and income (X):
Y 70 65 90 95 110 115 120 140 155 150
X 80 100 120 140 160 180 200 220 240 260
We have two variables, Y (the dependent variable) and X (the explanatory variable), and
sample size n = 10.
∑X i² = 322,000, ∑Y i² = 132,100, ∑Y i X i = 205,500, Ȳ = 111, X̄ = 170
In deviation form: ∑y i x i = 16,800, ∑x i² = 33,000
The model is specified as: Y = β 0 + β 1 X + u
a) Estimate the two regression coefficients β 0 and β 1
^β 1 = ∑y i x i/∑x i² = 16,800/33,000 = 0.5091 - the estimated value of the slope
coefficient, which in this case is interpreted as the MPC (marginal propensity to consume
of households). It tells us that whenever household income increases by 1 birr,
consumption expenditure will increase by about 0.51 birr (51 cents).
^β 0 = Ȳ − ^β 1 X̄ = 111 − 0.5091 × 170 = 111 − 86.55 = 24.45
b) Compute TSS, ESS and RSS
ESS = ^β 1² ∑x i² = 0.5091² × 33,000 = 8,553
TSS = ∑y i² = ∑Y i² − n Ȳ² = 132,100 − 10 × 111² = 132,100 − 123,210 = 8,890
RSS = ∑u i² = TSS − ESS = 8,890 − 8,553 = 337

c) Compute r² (the goodness of fit) and r
r² = ESS/TSS = 8,553/8,890 = 0.962 and r = √r² = √0.962 = 0.981

d) Compute the estimated (sample) variance ^δ² and the standard error of the regression
^δ² = RSS/(n−2) = 337/8 = 42.125 and ^σ = √^δ² = √42.125 = 6.49
e) Compute the standard errors of the coefficients
var(^β 1) = ^δ²/∑x i² = 42.125/33,000 = 0.00128; se(^β 1) = √0.00128 = 0.0357
var(^β 0) = ^δ² ∑X i²/(n ∑x i²) = 42.125 × 322,000/(10 × 33,000) = 13,564,250/330,000 = 41.104
se(^β 0) = √41.104 = 6.411

f) Compute the t-values for the coefficients
t c = ^β 0/se(^β 0) = 24.45/6.4138 = 3.812 and t c = ^β 1/se(^β 1) = 0.5091/0.0357 = 14.261

The estimated SRF is reported as follows:
^Y i = 24.45 + 0.5091X i
se (6.411) (0.0357)
t c (3.812) (14.261)
r² = 0.962, r = 0.981, ESS = 8,553, RSS = 337, n = 10, α = 0.05, F(1, 8) = 203.04
If we use STATA 12 (statistical software), the results are displayed as follows:

Source   |      SS    df       MS       Number of obs = 10
Model    | 8552.73     1  8552.73      F(1, 8)       = 202.87
Residual | 337.273     8  42.1591     R² = 0.9621, adj. R² = 0.9573
Total    |    8890     9   987.78      Root MSE      = 6.493

Y     |     Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
X     | 0.5090909   0.0357428  14.24    0.000   0.4266678   0.591514
_cons |  24.45455    6.413817   3.81    0.005    9.664256   39.24483
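The whole of Example 2 can be replicated from the summary sums. A sketch in Python (carrying unrounded intermediate values, so the t-values match the STATA output above rather than the hand-rounded figures):

```python
# Illustrative sketch replicating Example 2 from the summary statistics
# given in the text (n, raw sums, deviation sums, and means).
import math

n = 10
Sxy, Sxx = 16800.0, 33000.0      # deviation sums
SYY, SXX = 132100.0, 322000.0    # raw sums of squares
Ybar, Xbar = 111.0, 170.0

b1 = Sxy / Sxx                   # MPC: 0.5091
b0 = Ybar - b1 * Xbar            # intercept: 24.45
TSS = SYY - n * Ybar ** 2        # 8890
ESS = b1 ** 2 * Sxx              # ~8552.73
RSS = TSS - ESS                  # ~337.27
r2 = ESS / TSS                   # ~0.9621

var_hat = RSS / (n - 2)                        # ~42.159 (STATA's MS residual)
se_b1 = math.sqrt(var_hat / Sxx)               # ~0.0357
se_b0 = math.sqrt(var_hat * SXX / (n * Sxx))   # ~6.414
t_b1, t_b0 = b1 / se_b1, b0 / se_b0            # ~14.24 and ~3.81
print(round(b1, 4), round(b0, 2), round(r2, 4), round(t_b1, 2), round(t_b0, 2))
```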

