Econometrics Chapter 1-3
affect the relationship and make it stochastic. Furthermore, they do not provide numerical values
for the coefficients of economic relationships.
Econometrics differs from mathematical economics in that, although econometrics presupposes that economic relationships be expressed in mathematical form, it does not assume exact or deterministic relationships. Econometrics assumes random (stochastic) relationships among economic variables. Furthermore, econometric methods provide numerical values for the coefficients of economic relationships.
Econometrics vs. Statistics
Econometrics differs from both mathematical statistics and economic statistics. Economic
statistics is concerned with collecting, processing and presenting economic data (descriptive
statistics). An economic statistician gathers empirical data, records them, tabulates them or charts
them, and attempts to describe the pattern in their development over time and perhaps detect
some relationship between various economic magnitudes. Economic statistics is mainly a descriptive aspect of economics. It does not provide explanations of the development of the various variables, and it does not provide measurements of the coefficients of economic relationships.
Mathematical (or inferential) statistics deals with methods of measurement developed on the basis of controlled experiments. But these statistical methods of measurement are not appropriate for a number of economic relationships, because controlled or carefully planned experiments cannot be designed for most of them: the relationships among economic variables are stochastic (random).
Economic Models vs. Econometric Models
i) Economic Models
Any economic theory is an abstraction from the real world. For one thing, the immense complexity of the real-world economy makes it impossible for us to understand all interrelationships at once. For another, not all interrelationships are equally important for understanding the economic phenomenon under study. The sensible
procedure is therefore, to pick up the important factors and relationships relevant to our problem
and to focus our attention on these alone. Such a deliberately simplified analytical framework is
called an economic model. It is an organized set of relationships that describes the functioning of
an economic entity under a set of simplifying assumptions. All economic reasoning is ultimately
based on models. Economic models consist of the following three basic structural elements.
1. A set of variables
2. A list of fundamental relationships and
3. A number of strategic coefficients
ii) Econometric Models
The most important characteristic of econometric relationships is that they contain a random
element which is ignored by mathematical economic models which postulate exact relationships
between economic variables.
Example: Economic theory postulates that the demand for a commodity depends on its price, on the prices of other related commodities, on consumers' income and on tastes. This is an exact relationship which can be written mathematically as:

Q = β0 + β1P + β2P0 + β3Y + β4t

The above demand equation is exact. However, many more factors may affect demand. In econometrics the influence of these other factors is taken into account by introducing a random variable into the economic relationship. In our example, the demand function studied with the tools of econometrics would be of the stochastic form:

Q = β0 + β1P + β2P0 + β3Y + β4t + u

where u stands for the random factors which affect the quantity demanded.
1.2. Goals of Econometrics
Basically there are three main goals of Econometrics. They are:
i) Analysis i.e. testing economic theory
ii) Policy making i.e. obtaining numerical estimates of the coefficients of economic
relationships for policy simulations.
iii) Forecasting i.e. using the numerical estimates of the coefficients in order to forecast the
future values of economic magnitudes.
1.3. Methodology of Econometrics
Econometric research is concerned with the measurement of the parameters of economic relationships and with the prediction of the values of economic variables.
Econometric analyses usually follow the following steps:
1. Statement of theory or hypothesis
What does economic theory tell you about the relationship between two or more variables? For
example, Keynes stated: “Consumption increases as income increases, but not by as much as the
increase in income”. It means that “The marginal propensity to consume (MPC) for a unit change
in income is greater than zero but less than a unit”.
2. Specification of the mathematical model of the theory
What functional relationship exists between the two variables? Is it linear or non-linear?
According to Keynes, consumption expenditure and income are linearly related. That is,
Y = β0 + β1X, with 0 < β1 < 1, where Y = consumption expenditure, X = income, and β0 and β1 are the parameters of the model (β1 is the MPC).
6. Hypothesis testing
Do the estimates accord with the expectations of the theory that is being tested? Is the MPC statistically less than 1? If so, this may support Keynes' theory. Confirmation or refutation of economic theories based on sample evidence is the object of statistical inference.
7. Forecasting or prediction
If we have time series data, then given future value(s) of X we can forecast the future value(s) of Y. For example, suppose the above data are the yearly consumption expenditures of an individual from 1982 to 1984. If income = Birr 7000 in 1985, what is the forecast consumption expenditure?
It has the dimensions of both time series and cross-sections; data of this kind are called panel (or pooled) data. An example is a household survey (census) conducted every 10 years in Ethiopia.
2. Specification
This is about the specification of the econometric model that we think generated the sample data.
3. Estimation
This consists of using the assembled sample data on the observable variables in the model to
compute estimates of the numerical values of all the unknown parameters in the model.
4. Inference
This consists of using the parameter estimates computed from sample data to test hypotheses
about the numerical values of the unknown population parameters that describe the behavior of
the population from which the sample was selected.
Measurement Scales of Variables
The variables that we will generally encounter fall into four broad categories.
1. Ratio scale
For a variable X, taking two values, X1 and X2, the ratio X1/X2 and the distance (X2 − X1) are
meaningful quantities. Also, there is a natural ordering (ascending or descending) of the values
along the scale. Therefore, comparisons such as X2 ≤ X1 or X2 ≥ X1 are meaningful.
2. Interval scale
An interval scale is a scale on which equal intervals between objects represent equal differences. The interval differences are meaningful, but we cannot defend ratio relationships. The distance between two time periods, say (2000 − 1995), is meaningful, but not the ratio of two time periods (2000/1995).
3. Ordinal scale
A variable belongs to this category only if it satisfies the third property of the ratio scale (i.e.,
natural ordering). Examples are grading systems (A, B, C grades) or income class (upper,
middle, lower). For these variables the ordering exists but the distances between the categories
cannot be quantified.
4. Nominal scale
Variables in this category have none of the features of the ratio scale variables. Examples are gender (male, female) and marital status (married, unmarried, divorced, separated).
Chapter Two: Correlation Theory
2.1. Basic Concepts of Correlation
Economic variables have a great tendency to move together, and very often data come in pairs of observations in which the change in one variable is on average accompanied by a change in the other variable. This situation is known as correlation.
Correlation is a statistical measure that indicates the extent to which two or more variables
fluctuate together.
Correlation may be defined as the degree of relationship existing between two or more variables. The degree of relationship between the variables under consideration is measured through correlation analysis, which deals with the association between two or more variables. The measure of correlation is called the correlation coefficient. The degree of relationship is expressed by a coefficient that ranges from −1 to +1, and the direction of change is indicated by its sign.
2.2. Types of Correlation
1. Type I
Positive Correlation: The correlation is said to be positive if the values of the two variables change in the same direction. For example, the correlation between the price of a commodity and its quantity supplied is positive: as price rises, quantity supplied increases, and vice versa.
Negative Correlation: The correlation is said to be negative when the values of the variables change in opposite directions. For example, the correlation between the price of a commodity and its quantity demanded is negative: as price rises, quantity demanded decreases, and vice versa.
Zero Correlation: means no relationship between the two variables X and Y; i.e., the change in one variable (X) is not associated with the change in the other variable (Y). Examples: body weight and intelligence, or shoe size and monthly salary.
2. Type II
Simple correlation: Under simple correlation only two variables are studied.
Multiple correlation: Under multiple correlation three or more variables are studied.
Partial correlation: Partial correlation analysis recognizes more than two variables but considers only two of them at a time, keeping the others constant.
3. Type III
Linear Correlation: Correlation is said to be linear when the amount of change in one
variable tends to bear a constant ratio to the amount of change in the other. The graph of the
variables having a linear relationship will form a straight line.
Non Linear Correlation: The correlation would be non-linear if the amount of change in
one variable does not bear a constant ratio to the amount of change in the other variable.
Figure 1: Perfect linear correlations
A perfect positive correlation is given the value of 1. A perfect negative correlation is
given the value of -1. If there is absolutely no correlation present the value given is 0. The
closer the number is to 1 or -1, the stronger the correlation, or the stronger the relationship
between the variables. The closer the number is to 0, the weaker the correlation.
Two variables may have a positive correlation, negative correlation, or they may be
uncorrelated. This holds true both for linear and nonlinear correlation. Two variables are said
to be positively correlated if they tend to change together in the same direction, that is, if they
tend to increase or decrease together. Such positive correlation is postulated by economic
theory for the quantity of a commodity supplied and its price. When the price increases the
quantity supplied increases. Conversely, when price falls the quantity supplied decreases.
Two variables are said to be negatively correlated if they tend to change in opposite directions: when X increases Y decreases, and vice versa. Such negative correlation is postulated by economic theory for the quantity demanded of a commodity and its price. When price increases, demand for the commodity decreases, and when price falls, demand increases.
2. Simple Linear Correlation coefficient
For a precise quantitative measurement of the degree of correlation between Y and X we use a parameter called the correlation coefficient, usually designated by the Greek letter ρ (rho). With the variables whose correlation it measures as subscripts, ρ refers to the correlation of all the values of the population of X and Y. Its estimate from any particular sample (the sample statistic for correlation) is denoted by r with the relevant subscripts. For example, if we measure the correlation between X and Y, the population correlation coefficient is represented by ρxy and its sample estimate by rxy. The simple correlation coefficient is used to measure relationships which are simple and linear only. It cannot help us in measuring nonlinear or multiple correlations. The sample correlation coefficient is defined by the formula:

rxy = (n∑XiYi − ∑Xi∑Yi) / [√(n∑Xi² − (∑Xi)²) · √(n∑Yi² − (∑Yi)²)]

or, in deviation form,

rxy = ∑xiyi / √(∑xi²·∑yi²)

where xi = Xi − X̄ and yi = Yi − Ȳ.
We will use a simple example from the theory of supply. Economic theory suggests that the
quantity of a commodity supplied in the market depends on its price, ceteris paribus. When
price increases the quantity supplied increases, and vice versa. When the market price falls
producers offer smaller quantities of their commodity for sale. In other words, economic
theory postulates that price (X) and quantity supplied (Y) are positively correlated.
Example 2.1: The following table shows the quantity supplied for a commodity with the
corresponding price values. Determine the type of correlation that exists between these two
variables.
Table 1: Data for computation of correlation coefficient
Time period (in days)  Quantity supplied Yi (in tons)  Price Xi (in shillings)
1 10 2
2 20 4
3 50 6
4 40 8
5 50 10
6 60 12
7 80 14
8 90 16
9 90 18
10 120 20
r = (n∑XY − ∑X∑Y) / [√(n∑X² − (∑X)²) · √(n∑Y² − (∑Y)²)]
  = [10(8520) − (110)(610)] / [√(10(1540) − (110)²) · √(10(47700) − (610)²)]
  = 18100 / [√3300 · √104900] ≈ 0.973

Or, using the deviation form, the correlation coefficient can be computed as:

r = ∑xy / √(∑x²·∑y²) = 1810 / √(330 × 10490) ≈ 0.973
11
This result shows that there is a strong positive correlation between the quantity supplied and
the price of the commodity under consideration.
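To make the computation concrete, here is a minimal Python sketch (assuming NumPy is available; variable names are illustrative) that reproduces the correlation coefficient for the data in Table 1:

```python
import numpy as np

# Price (X, in shillings) and quantity supplied (Y, in tons) from Table 1
X = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20], dtype=float)
Y = np.array([10, 20, 50, 40, 50, 60, 80, 90, 90, 120], dtype=float)

n = len(X)
# Raw-sums formula
num = n * (X * Y).sum() - X.sum() * Y.sum()
den = np.sqrt(n * (X**2).sum() - X.sum()**2) * np.sqrt(n * (Y**2).sum() - Y.sum()**2)
r = num / den

# Deviation-form cross-check: r = sum(xy) / sqrt(sum(x^2) * sum(y^2))
x, y = X - X.mean(), Y - Y.mean()
r_dev = (x * y).sum() / np.sqrt((x**2).sum() * (y**2).sum())

print(round(r, 4), round(r_dev, 4))  # both ≈ 0.9728
```

Both formulas give the same value, since one is an algebraic rearrangement of the other.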
Properties of Simple Correlation Coefficient
The simple correlation coefficient has the following important properties:
1. The value of the correlation coefficient always ranges between −1 and +1.
2. The correlation coefficient is symmetric: rxy = ryx, where rxy is the correlation coefficient of X and Y, and ryx is that of Y and X.
3. The correlation coefficient is independent of change of origin and change of scale. By change of origin we mean subtracting or adding a constant from or to every value of a variable, and by change of scale we mean multiplying or dividing every value of a variable by a constant.
4. If the X and Y variables are independent, the correlation coefficient is zero. But the converse is not true.
5. The correlation coefficient has the same sign as the regression coefficients.
6. The correlation coefficient is the geometric mean of the two regression coefficients:

r = ±√(byx × bxy)
Though the correlation coefficient is most popular in applied statistics and econometrics, it has its own limitations. The major limitations of the method are:
1. The coefficient requires the quantitative measurement of both variables. If one of the two variables is not quantitatively measured, the coefficient cannot be computed.
2. Great care must be exercised in interpreting the value of this coefficient, as it is very often misinterpreted. For example, a high correlation between lung cancer and smoking does not by itself show that smoking causes lung cancer.
3. The value of the coefficient is unduly affected by extreme values.
3. Spearman's rank correlation coefficient
When the data are in the form of ranks, the degree of association is measured by Spearman's rank correlation coefficient:

r' = 1 − 6∑D² / [n(n² − 1)]    (2.3)

Where,
D = difference between the ranks of corresponding pairs of X and Y
n = number of observations.
The values that r' may assume range from +1 to −1.
Two points are of interest when applying the rank correlation coefficient. First, it does not matter whether we rank the observations in ascending or descending order; however, we must use the same rule of ranking for both variables. Second, if two (or more) observations have the same value, we assign them the mean rank (common rank) of the tied items. When there is a repetition of ranks, a correction factor m(m² − 1)/12 is added to ∑D² in Spearman's rank correlation coefficient formula, where m is the number of times a rank is repeated. It is very important to know that this correction factor is added for every repetition of rank in both variables. Thus, in the case of tied or repeated ranks, Spearman's rank correlation coefficient formula becomes:

r' = 1 − 6[∑D² + ∑m(m² − 1)/12] / [n(n² − 1)]
Example 2.2: A market researcher asks experts to express their preference for twelve different
brands of soap. Their replies are shown in the following table.
Person II 7 8 3 1 10 12 2 6 5 4 11 9
The figures in this table are ranks, not quantities. We have to use the rank correlation coefficient to determine the type of association between the preferences of the two persons. With ∑D² = 50 and n = 12, this can be done as follows:

r' = 1 − 6∑D² / [n(n² − 1)] = 1 − 6(50) / [12(12² − 1)] = 1 − 300/1716 ≈ 0.825

This figure shows a marked similarity of preferences of the two persons for the various brands of soap.
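For reference, Spearman's formula is easy to evaluate in Python. The sketch below is illustrative (the two rank vectors are hypothetical, since only Person II's ranks are reproduced above) and assumes there are no tied ranks:

```python
import numpy as np

def spearman_rho(rank_x, rank_y):
    """Spearman's r' = 1 - 6*sum(D^2) / (n*(n^2 - 1)), assuming no tied ranks."""
    d = np.asarray(rank_x, dtype=float) - np.asarray(rank_y, dtype=float)
    n = len(d)
    return 1 - 6 * (d**2).sum() / (n * (n**2 - 1))

# Hypothetical rankings of six items by two judges
judge1 = [1, 2, 3, 4, 5, 6]
judge2 = [2, 1, 4, 3, 6, 5]
print(spearman_rho(judge1, judge2))  # 1 - 6*6/210 ≈ 0.829
```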
Example 2.3: Calculate the rank correlation coefficient from the data in Table 5. The computed coefficient is negative, so there is a negative association between expenditure on advertisement and profit.
4. Coefficient of concurrent deviation (Kendall's tau correlation)
The coefficient of concurrent deviation, or Kendall's tau, is a correlation coefficient that measures the relationship between two variables. In contrast to Pearson's correlation, it is a non-parametric procedure: the data need not be normally distributed, and the two variables need only have ordinal scale levels. Kendall's tau is very similar to Spearman's rank correlation coefficient, but Kendall's tau should be preferred when only very few data with many rank ties are available. Kendall's tau correlation coefficient is given by the formula

τ = (C − D) / [n(n − 1)/2]

where C is the number of concordant pairs, D is the number of discordant pairs, and n is the number of observations. For example, suppose two doctors rank six patients as follows. Ordering the cases by Doctor A's ranks, each row compares Doctor B's rank with the ranks in the rows above it (+ for a concordant pair, − for a discordant pair):

Doctor A   Doctor B   Signs of pairwise comparisons
1          3
2          1          −
3          4          +  +
4          2          −  +  −
5          6          +  +  +  +
6          5          +  +  +  +  −

Counting the signs gives C = 11 concordant and D = 4 discordant pairs out of n(n − 1)/2 = 15 pairs, so τ = (11 − 4)/15 ≈ 0.47, a moderate positive agreement between the two doctors' rankings.
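The tally above can be checked by brute force in Python; this sketch (illustrative) counts concordant and discordant pairs directly from the two doctors' ranks:

```python
from itertools import combinations

# Ranks assigned to six patients by the two doctors
a = [1, 2, 3, 4, 5, 6]
b = [3, 1, 4, 2, 6, 5]

concordant = discordant = 0
for i, j in combinations(range(len(a)), 2):
    # A pair is concordant if both rankings order the two patients the same way
    s = (a[i] - a[j]) * (b[i] - b[j])
    if s > 0:
        concordant += 1
    elif s < 0:
        discordant += 1

n = len(a)
tau = (concordant - discordant) / (n * (n - 1) / 2)
print(concordant, discordant, round(tau, 3))  # 11 4 0.467
```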
Partial correlation coefficients can be determined in terms of the simple correlation coefficients among the various variables involved in a multiple relationship. In our example there are three simple correlation coefficients:
r12 = correlation coefficient between X1 and X2
r13 = correlation coefficient between X1 and X3
r23 = correlation coefficient between X2 and X3
The partial correlation coefficient between X1 and X2, keeping the effect of X3 constant is given
by:
r12.3 = (r12 − r13·r23) / √[(1 − r13²)(1 − r23²)]
Similarly, the partial correlation between X1 and X3, keeping the effect of X2 constant, is given by:

r13.2 = (r13 − r12·r23) / √[(1 − r12²)(1 − r23²)]
Example 2.4: The following table gives data on the yield of corn per acre (Y), the amount of fertilizer used (X1) and the amount of insecticide used (X2). Compute the partial correlation coefficient between the yield of corn and the fertilizer used, keeping the effect of insecticide constant.
Table 6: Data on yield of corn, fertilizer and insecticides used
Year 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980
Y 40 44 46 48 52 58 60 68 74 80
X1 6 10 12 14 16 18 22 24 26 32
X2 4 4 5 7 9 12 14 20 21 24
The computations are done as follows:
Table 7: Computation for partial correlation coefficients
Year   Y    X1   X2   y    x1   x2   x1y   x2y   x1x2   x1²   x2²   y²
1971 40 6 4 -17 -12 -8 204 136 96 144 64 289
1972 44 10 4 -13 -8 -8 104 104 64 64 64 169
1973 46 12 5 -11 -6 -7 66 77 42 36 49 121
1974 48 14 7 -9 -4 -5 36 45 20 16 25 81
1975 52 16 9 -5 -2 -3 10 15 6 4 9 25
1976 58 18 12 1 0 0 0 0 0 0 0 1
1977 60 22 14 3 4 2 12 6 8 16 4 9
1978 68 24 20 11 6 8 66 88 48 36 64 121
1979 74 26 21 17 8 9 136 153 72 64 81 289
1980 80 32 24 23 14 12 322 276 168 196 144 529
Sum 570 180 120 0 0 0 956 900 524 576 504 1634
Mean 57 18 12
ryx1 = ∑x1y/√(∑x1²·∑y²) = 956/√(576 × 1634) = 0.9854
ryx2 = ∑x2y/√(∑x2²·∑y²) = 900/√(504 × 1634) = 0.9917
rx1x2 = ∑x1x2/√(∑x1²·∑x2²) = 524/√(576 × 504) = 0.9725
Then the partial correlation between yield and fertilizer, keeping insecticide constant, is:

ryx1.x2 = (ryx1 − ryx2·rx1x2) / √[(1 − ryx2²)(1 − rx1x2²)]
        = (0.9854 − 0.9917 × 0.9725) / √[(1 − 0.9917²)(1 − 0.9725²)] ≈ 0.70
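The same result can be checked with a short Python sketch (illustrative variable names; NumPy assumed):

```python
import numpy as np

# Corn yield (Y), fertilizer (X1) and insecticide (X2) from Table 6
Y  = np.array([40, 44, 46, 48, 52, 58, 60, 68, 74, 80], dtype=float)
X1 = np.array([6, 10, 12, 14, 16, 18, 22, 24, 26, 32], dtype=float)
X2 = np.array([4, 4, 5, 7, 9, 12, 14, 20, 21, 24], dtype=float)

def r(u, v):
    """Simple correlation coefficient in deviation form."""
    du, dv = u - u.mean(), v - v.mean()
    return (du * dv).sum() / np.sqrt((du**2).sum() * (dv**2).sum())

r_yx1, r_yx2, r_x1x2 = r(Y, X1), r(Y, X2), r(X1, X2)
# Partial correlation of Y and X1, holding X2 constant
r_yx1_x2 = (r_yx1 - r_yx2 * r_x1x2) / np.sqrt((1 - r_yx2**2) * (1 - r_x1x2**2))
print(round(r_yx1, 4), round(r_yx2, 4), round(r_x1x2, 4), round(r_yx1_x2, 2))
# ≈ 0.9854 0.9917 0.9725 0.70
```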
Chapter Three: Simple Linear Regression
Table 3.1: Weekly family consumption expenditure Y ($) tabulated against weekly family income X ($), for income levels X = 80, 100, 120, 140, 160, 180, 200, 220, 240, 260
Conditional expected values, E(Y | Xi), are read as the expected value of Y given the value of X. The unconditional expected value of weekly consumption expenditure is E(Y).
"What is the expected value of the weekly consumption expenditure of a family?" The answer: $121.20 (the unconditional mean).
"What is the expected value of the weekly consumption expenditure of a family whose weekly income is $160?" The answer: $113.00 (the conditional mean). Each conditional mean E(Y | Xi) is a function of Xi, where Xi is a given value of X. Symbolically, E(Y | Xi) = f(Xi), where f(Xi) is some function of Xi. This is called the conditional expectation function (CEF) or population regression function (PRF). The concept of the population regression function states merely that the expected value of the distribution of Y given Xi is functionally related to Xi.
In the simple regression model, the population regression model, or simply the population model, is the following:

Yi = β0 + β1Xi + ui

(deterministic component: β0 + β1Xi; stochastic component: ui)
On the right-hand side of this equation we can distinguish two parts: the systematic component β0 + β1Xi and the random disturbance ui.
The dependent and independent variables can be indicated in different terminologies shown in
the following Table.
Table 3.2
Dependent variable Independent variable
Explained variable Explanatory variable
Predictand Predictor
Regressand Regressor
Response Stimulus
Endogenous Exogenous
Outcome Covariate
Controlled variable Control variable
2. The mean value of the error term ui is zero. This implies that the sum of the individual disturbance terms is zero: some of the disturbances are negative, some are zero and some are positive, and their sum (or average) is zero. Mathematically, E(ui) = 0.
3. The random terms of different observations (ui, uj) are independent (the assumption of no autocorrelation). This means that there is no systematic variation or relation among the values of the error terms. Algebraically, Cov(ui, uj) = E[ui − E(ui)][uj − E(uj)] = E(ui·uj) = 0 for i ≠ j.
4. The variance of the random variable ui is constant:

Var(ui) = E[ui − E(ui)]² = E(ui²) = σu²

This is called the homoscedasticity assumption, and the constant variance itself is called homoscedastic variance.
5. The random variable u is independent of the explanatory variables. This means there is no correlation between the random variable and the explanatory variables. If two variables are unrelated, their covariance is zero:

Cov(Xi, ui) = E{[Xi − E(Xi)][ui − E(ui)]}
            = E[Xi(ui − E(ui))]   (treating Xi as nonstochastic)
            = E(Xi·ui) − E(Xi)·E(ui)
            = E(Xi·ui) = Xi·E(ui)   (since E(ui) = 0)
            = 0, by assumption 2
6. Normality assumption: The disturbance term ui is assumed to have a normal distribution with zero mean and a constant variance. This assumption is written as:

ui ~ N(0, σu²)
This assumption is a combination of zero mean of error term assumption and homoscedasticity
assumption. This assumption or combination of assumptions is used in testing hypotheses about
significance of parameters.
7. The number of observations n must be greater than the number of parameters to be
estimated. Alternatively, the number of observations n must be greater than the number of
explanatory variables.
The parameters of the simple linear regression model can be estimated by one of the following methods:
- Method of least squares,
- Method of moments, and
- Maximum likelihood estimation.
The most common method for fitting a regression line is the method of least squares. Estimating a linear regression function using the Ordinary Least Squares (OLS) criterion simply means calculating the parameters of the regression function for which the sum of the squared error terms is minimized. Each squared term is the squared vertical distance between a data point and the regression line. Linear regression thus calculates the line that best fits the observations in this least-squares sense.
The method chooses the estimated or fitted parameters (β̂0 and β̂1) that minimize the sum of the squared vertical differences between the observations and the regression line (the sum of squared errors, SSE). The procedure is given as follows. Suppose we want to estimate the equation:

Ŷi = β̂0 + β̂1Xi
Q. What is the difference between the error term (ui) and the residual (ei)?
The error term is generally unobservable, while the residual is observable and calculable, making it much easier to quantify and visualize. In effect, the error term represents the way the observed data differ from the true population regression function, whereas the residual represents the way the observed data differ from the fitted sample regression line.
Since most of the time we work with sample data (it is difficult to get population data), the corresponding sample regression equation is:

Yi = β̂0 + β̂1Xi + ei

From this identity, we solve for the residual term ei, square both sides, and then take the sum of both sides. These three steps give (respectively):

ei = Yi − β̂0 − β̂1Xi

∑ei² = ∑(Yi − β̂0 − β̂1Xi)²

To minimize the residual sum of squares we take the first-order partial derivatives of ∑ei² and equate them to zero:

∂∑ei²/∂β̂0 = 2∑(Yi − β̂0 − β̂1Xi)(−1) = 0

⟹ ∑(Yi − β̂0 − β̂1Xi) = 0

⟹ ∑Yi − nβ̂0 − β̂1∑Xi = 0
∑Yi = nβ̂0 + β̂1∑Xi

But ∑Yi = nȲ and ∑Xi = nX̄, so solving for β̂0:

β̂0 = Ȳ − β̂1X̄

From the derivative with respect to β̂1 we have ∑Xi(Yi − β̂0 − β̂1Xi) = 0. Substituting β̂0 = Ȳ − β̂1X̄:

∑Xi(Yi − Ȳ + β̂1X̄ − β̂1Xi) = 0
∑XiYi − Ȳ∑Xi + β̂1X̄∑Xi − β̂1∑Xi² = 0
∑XiYi − nX̄Ȳ + β̂1nX̄² − β̂1∑Xi² = 0

Solving for β̂1:

β̂1(∑Xi² − nX̄²) = ∑XiYi − nX̄Ȳ

So overall we have:

β̂1 = (∑XiYi − nX̄Ȳ) / (∑Xi² − nX̄²)
This criterion of finding the optimum is known as ordinary least squares (OLS). Now that we have the formulas to estimate the simple linear regression function, let us illustrate with an example.
Example 3.1: Given the following sample data of three pairs of Y (dependent variable) and X (independent variable):
Table 3.4
Yi Xi
10 30
20 50
30 60
a) find a simple linear regression function; Y = f(X)
b) Interpret your result.
c) Predict the value of Y when X is 45.
Solution: a. To fit the regression equation, we do the following computations.

Yi    Xi    YiXi    Xi²
10    30    300     900
20    50    1000    2500
30    60    1800    3600
Sum: ∑Yi = 60, ∑Xi = 140, ∑YiXi = 3100, ∑Xi² = 7000
Mean: Ȳ = 20, X̄ = 140/3 ≈ 46.67

β̂1 = (n∑XY − ∑X∑Y) / (n∑X² − (∑X)²) = [3(3100) − (140)(60)] / [3(7000) − (140)²] = 900/1400 ≈ 0.64

β̂0 = Ȳ − β̂1X̄ = 20 − (0.64)(46.67) = −10

so the fitted regression function is Ŷi = −10 + 0.64Xi.
b) Interpretation: the value of the intercept term, −10, implies that the value of the dependent variable Y is −10 when the value of the explanatory variable is zero. The value of the slope coefficient (β̂1 = 0.64) is a measure of the marginal change in the dependent variable Y when the value of the explanatory variable increases by one. For instance, in this model, the value of Y increases on average by 0.64 units when X increases by one.
c) Ŷi = −10 + 0.64Xi = −10 + (0.64)(45) = 18.8
That means when X assumes a value of 45, the value of Y on average is expected to be 18.8.
The regression coefficients can also be obtained by simple formulae by taking the deviations between the original values and their means. Now, if

xi = Xi − X̄ and yi = Yi − Ȳ

then the coefficients can be computed by the alternative formula given below:

β̂1 = ∑xiyi / ∑xi²

Yi    Xi    yi     xi       xiyi      xi²       yi²
10    30    -10    -16.67   166.67    277.78    100
20    50    0      3.33     0.00      11.11     0
30    60    10     13.33    133.33    177.78    100
Sum: ∑yi = 0, ∑xi = 0, ∑xiyi = 300.00, ∑xi² = 466.67, ∑yi² = 200
Mean: Ȳ = 20, X̄ ≈ 46.67

Then:

β̂1 = ∑xiyi/∑xi² = 300/466.67 ≈ 0.64 and β̂0 = Ȳ − β̂1X̄ = 20 − (0.64)(46.67) = −10,

the same results as before.
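For readers who want to reproduce these hand computations, here is a minimal Python sketch (NumPy assumed; names are illustrative) that applies the deviation-form OLS formulas to the three observations above:

```python
import numpy as np

# The three sample observations from Table 3.4
Y = np.array([10.0, 20.0, 30.0])
X = np.array([30.0, 50.0, 60.0])

# Deviation-form OLS: slope = sum(x*y)/sum(x^2), intercept = Ybar - slope*Xbar
x, y = X - X.mean(), Y - Y.mean()
b1 = (x * y).sum() / (x**2).sum()
b0 = Y.mean() - b1 * X.mean()
print(round(b0, 2), round(b1, 4))  # ≈ -10.0 0.6429 (≈ 0.64)

# Prediction at X = 45
print(round(b0 + b1 * 45, 1))  # ≈ 18.9 (18.8 if the rounded slope 0.64 is used)
```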
Mean and Variance of Parameter Estimates
Formulas for the mean and variance of the respective parameter estimates and for the error variance are given below.
1. The mean of β̂1: E(β̂1) = β1
2. The variance of β̂1: Var(β̂1) = E[β̂1 − E(β̂1)]² = σu² / ∑xi²
3. The mean of β̂0: E(β̂0) = β0
4. The variance of β̂0: Var(β̂0) = E[β̂0 − E(β̂0)]² = σu²·∑Xi² / (n∑xi²)
5. The estimated variance of the error term: σ̂u² = ∑ei² / (n − 2)
3.3. Normal Equations of OLS
In linear regression analysis, the normal equations are a system of equations whose solution is
the Ordinary Least Squares (OLS) estimator of the regression coefficients. The normal equations
are derived from the first-order condition of the Least Squares minimization problem.
Let us start from the simple linear regression model specified as Yi = β0 + β1Xi + ui. However, as we noted in 3.1 and 3.2, the PRF is not directly observable. We estimate it from the SRF, given as Yi = β̂0 + β̂1Xi + ûi, where ûi = Yi − Ŷi = Yi − β̂0 − β̂1Xi is the residual. OLS chooses the estimates to

minimize RSS = ∑ûi²
If we were instead to minimize ∑ûi (the simple sum of residuals), all the residuals ûi would receive equal importance no matter how closely or widely scattered the individual observations are from the SRF. If so, the algebraic sum of the ûi may be small (even zero) even though the ûi are widely scattered about the SRF; squaring the residuals avoids this problem.
But ûi = Yi − β̂0 − β̂1Xi, so

∑ûi² = ∑(Yi − β̂0 − β̂1Xi)²

OLS: choose β̂0 and β̂1 to minimize ∑ûi² = ∑(Yi − β̂0 − β̂1Xi)².

FOC 1: ∂∑ûi²/∂β̂0 = 2∑(Yi − β̂0 − β̂1Xi)(−1) = 0
⟹ ∑(Yi − β̂0 − β̂1Xi) = 0
⟹ ∑Yi = nβ̂0 + β̂1∑Xi ......... (1)
⟹ β̂0 = Ȳ − β̂1X̄

FOC 2: ∂∑ûi²/∂β̂1 = 2∑(Yi − β̂0 − β̂1Xi)(−Xi) = 0
⟹ ∑Xi(Yi − β̂0 − β̂1Xi) = 0
⟹ ∑XiYi = β̂0∑Xi + β̂1∑Xi² ......... (2)

Substituting β̂0 = Ȳ − β̂1X̄ into (2):

∑XiYi = (Ȳ − β̂1X̄)∑Xi + β̂1∑Xi²
∑XiYi − Ȳ∑Xi = β̂1(∑Xi² − X̄∑Xi)
∑XiYi − nX̄Ȳ = β̂1(∑Xi² − nX̄²)

so that

β̂1 = (∑XiYi − nX̄Ȳ) / (∑Xi² − nX̄²)

or, equivalently,

β̂1 = ∑(Xi − X̄)(Yi − Ȳ) / ∑(Xi − X̄)² = ∑xiyi / ∑xi²

or, in terms of raw sums,

β̂1 = (n∑XiYi − ∑Xi∑Yi) / (n∑Xi² − (∑Xi)²)
Thus, in the case of a simple linear regression, the normal equations are a system of two equations, (1) and (2), in the two unknowns β̂0 and β̂1. If the system has a unique solution, then the two values of β̂0 and β̂1 that solve the system are the OLS estimators of the intercept and the slope, respectively.
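Because the normal equations are just a 2×2 linear system, they can also be solved directly. The sketch below (illustrative; NumPy assumed) sets up equations (1) and (2) for the three-observation example above and solves them:

```python
import numpy as np

Y = np.array([10.0, 20.0, 30.0])
X = np.array([30.0, 50.0, 60.0])
n = len(X)

# Normal equations:
#   (1)  n*b0      + sum(X)*b1   = sum(Y)
#   (2)  sum(X)*b0 + sum(X^2)*b1 = sum(X*Y)
A = np.array([[n,       X.sum()],
              [X.sum(), (X**2).sum()]])
b = np.array([Y.sum(), (X * Y).sum()])

b0, b1 = np.linalg.solve(A, b)
print(round(b0, 2), round(b1, 4))  # same estimates as before: -10.0 0.6429
```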
3.4. Coefficient of Correlation and Determination
The coefficient of determination r² (two-variable case) or R² (multiple regression) is a summary measure that tells how well the sample regression line fits the data. Before we show how r² is computed, let us consider an explanation of r² in terms of a graphical device known as the Venn diagram.
In this figure the circle Y represents variation in the dependent variable Y and the circle X
represents variation in the explanatory variable X. The overlap of the two circles (the shaded
area) indicates the extent to which the variation in Y is explained by the variation in X (say, via
an OLS regression). The greater the extent of the overlap, the greater the variation in Y that is explained by X. r² is simply a numerical measure of this overlap. In the figure, as we move from left to right, the area of the overlap increases; that is, successively a greater proportion of the variation in Y is explained by X. In short, r² increases. When there is no overlap, r² is obviously zero, but when the overlap is complete, r² is 1, since 100 percent of the variation in Y is explained by X. As we shall show shortly, r² lies between 0 and 1.
In deviation form, yi = ŷi + ûi. Squaring and summing over the sample:

∑yi² = ∑ŷi² + ∑ûi² + 2∑ŷiûi. Since ∑ŷiûi = 0, ∑yi² = ∑ŷi² + ∑ûi².

That means TSS = ESS + RSS: the total sum of squares equals the explained sum of squares plus the residual sum of squares.

Mathematically, r² is the explained variation expressed as a percentage of the total variation:

r² = ESS/TSS = ∑ŷi²/∑yi² = ∑(Ŷi − Ȳ)² / ∑(Yi − Ȳ)²

Alternatively,

r² = 1 − RSS/TSS = 1 − ∑ûi² / ∑(Yi − Ȳ)²

or

r² = (∑xiyi)² / (∑xi²·∑yi²)

Given the definition of r², we can express ESS and RSS as follows:

ESS = r²·∑yi² and RSS = ∑ûi² = (1 − r²)·∑yi²
A quantity closely related to, but conceptually very different from, r² is the coefficient of correlation which, as noted in Chapter 2, is a measure of the degree of association between two variables; in the two-variable case, r = ±√r². Note that r² is a non-negative quantity whose limits are 0 ≤ r² ≤ 1. An r² of 1 means a perfect fit; an r² of zero means that there is no relationship between the regressand and the regressor.
The higher the coefficient of determination, the better the fit; conversely, the smaller the coefficient of determination, the poorer the fit. That is why the coefficient of determination is used to compare two or more models. One minus the coefficient of determination is called the coefficient of non-determination; it gives the proportion of the variation in the dependent variable that remains undetermined or unexplained by the model.
Interpretation of r²
Suppose r² = 0.90. This means that the regression line gives a good fit to the observed data, since this line explains 90% of the total variation of the Y values around their mean. The remaining 10% of the total variation in Y is unaccounted for by the regression line and is attributed to the factors included in the disturbance term ui.
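A short Python sketch (illustrative; it reuses the fit from the three-observation example above) shows the equivalence of the ESS/TSS and 1 − RSS/TSS definitions:

```python
import numpy as np

Y = np.array([10.0, 20.0, 30.0])
X = np.array([30.0, 50.0, 60.0])

x, y = X - X.mean(), Y - Y.mean()
b1 = (x * y).sum() / (x**2).sum()
b0 = Y.mean() - b1 * X.mean()

Y_hat = b0 + b1 * X
e = Y - Y_hat                        # residuals
TSS = ((Y - Y.mean())**2).sum()      # total sum of squares
ESS = ((Y_hat - Y.mean())**2).sum()  # explained sum of squares
RSS = (e**2).sum()                   # residual sum of squares

print(round(ESS / TSS, 4), round(1 - RSS / TSS, 4))  # identical r-squared values
```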
1. Standard error test
1. Compute the standard deviations (standard errors) of the parameter estimates using the above formulas for the variances of the parameter estimates. (This is because the standard deviation is the positive square root of the variance.)
se(β̂1) = √(σ̂u² / ∑xi²)

se(β̂0) = √(σ̂u²·∑Xi² / (n∑xi²))
2. Compare the standard errors of the estimates with the numerical values of the estimates and make a decision.
A. If the standard error of the estimate is less than half of the numerical value of the estimate, we can conclude that the estimate is statistically significant. That is, if se(β̂i) < (1/2)·β̂i, reject the null hypothesis.
B. If the standard error of the estimate is greater than half of the numerical value of the estimate, we can conclude that the parameter estimate is not statistically significant. That is, if se(β̂i) > (1/2)·β̂i, we accept (fail to reject) the null hypothesis.
Example: Suppose that from a sample we estimate a supply function, with the standard errors of the estimates reported in parentheses, and we wish to test the significance of the slope parameter at the 5% level of significance using the standard error test. The rule is to compare se(β̂1) with β̂1/2: if se(β̂1) is less than half of β̂1, the slope parameter is statistically significant. (A fully worked application of this test is given in the supply example at the end of this chapter.)
2. Student's t-test
The t-test can be used to test the statistical reliability of the parameter estimates. The test depends on the degrees of freedom of the sample. The test procedure of the t-test is similar to that of the z-test.
Since we have two parameters in simple linear regression, our degrees of freedom are n − 2. Like the standard error test, we formally test the hypothesis H0: β1 = 0 against the alternative H1: β1 ≠ 0 for the slope parameter, and H0: β0 = 0 against the alternative H1: β0 ≠ 0 for the intercept.
To undertake the above test, we follow the following steps.
Step 1: Compute t*, which is called the computed value of t, by taking the value of β in the null hypothesis. In our case β = 0, so t* becomes:

t* = (β̂ − β) / se(β̂) = β̂ / se(β̂)
Step 2: Choose the level of significance. The level of significance is the probability of making a 'wrong' decision, i.e. the probability of rejecting the hypothesis when it is actually true, or the probability of committing a type I error. It is customary in econometric research to choose the 5% or the 1% level of significance. This means that in making our decision we allow (tolerate) five times out of a hundred to be 'wrong', i.e. to reject the hypothesis when it is actually true.
Step 3: Check whether it is a one-tail or a two-tail test. If the inequality sign in the alternative hypothesis is ≠, then it implies a two-tail test: divide the chosen level of significance by two and determine the critical region or critical value of t, called tc. But if the inequality sign is either > or <, then it indicates a one-tail test, and there is no need to divide the chosen level of significance by two to obtain the critical value of t from the t-table.
Example:
If we have H0: β1 = 0
against H1: β1 ≠ 0,
then this is a two-tail test. If the level of significance is 5%, divide it by two to obtain the critical value of t from the t-table.
Step 4: Obtain the critical value of t, called tc, at α/2 and n − 2 degrees of freedom for a two-tail test.
Step 5: Compare t* (the computed value of t) and tc (the critical value of t).
If |t*| > tc, reject H0 and accept H1. The conclusion is that β̂0 or β̂1 is statistically significant.
If |t*| < tc, accept H0 and reject H1. The conclusion is that β̂0 or β̂1 is statistically insignificant.
Numerical Example:
Suppose that from a sample of size n = 20 we estimate a consumption function, with the standard errors of the estimates given in brackets. We want to test the null hypothesis H0: β1 = 0 against the alternative H1: β1 ≠ 0 using the t-test at the 5% level of significance.
a. The t-value for the test statistic is:

t* = β̂1 / se(β̂1)

b. Since the alternative hypothesis (H1) is stated by an inequality sign (≠), it is a two-tail test; hence we use α/2 = 0.025 with n − 2 = 18 degrees of freedom (df). From the t-table, tc at the 0.025 level of significance and 18 df is 2.10.
c. Since the computed t* is greater than 2.10 (t* > tc), β̂1 is statistically significant.
The test rule or decision is given as follows: reject H0 if |t*| > tc at n − k degrees of freedom, where k is the number of estimated parameters.
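These steps can be automated. The sketch below is illustrative (SciPy assumed), using the slope estimate and standard error from the supply example worked at the end of this chapter:

```python
from scipy import stats

# Slope estimate and its standard error (from the supply example below)
b1, se_b1 = 3.25, 0.8979
n, k = 12, 2                                 # observations and estimated parameters
alpha = 0.05

t_star = b1 / se_b1                          # Step 1: computed t under H0: beta1 = 0
t_c = stats.t.ppf(1 - alpha / 2, df=n - k)   # Steps 3-4: two-tail critical value

print(round(t_star, 3), round(t_c, 3))       # ≈ 3.62 2.228
print("reject H0" if abs(t_star) > t_c else "fail to reject H0")
```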
3. Confidence interval
In order to define how close the estimate is to the true parameter, we must construct a confidence interval for the parameter. In other words, we must establish limiting values around the estimate within which the true parameter is expected to lie with a certain degree of confidence. In this respect we say that, with a given probability, the population parameter will be within the defined confidence interval (confidence limits).
We choose a probability in advance and refer to it as the confidence level (confidence coefficient). It is customary in econometrics to choose the 95% confidence level. This means that in repeated sampling the confidence limits, computed from the sample, would include the true population parameter in 95% of the cases; in the other 5% of the cases the population parameter will fall outside the confidence interval.
In a two-tail test at the α level of significance, the probability of obtaining the specific t-value, either −tc or +tc, is α/2 at n − 2 degrees of freedom. The probability of obtaining any value of t = (β̂ − β)/se(β̂) lying between −tc and +tc at n − 2 degrees of freedom is 1 − (α/2 + α/2) = 1 − α.
The limits within which the true β lies at the (1 − α) degree of confidence are:

β̂ ± tc·se(β̂)
Decision rule: If the hypothesized value of β in the null hypothesis is within the confidence interval, accept H0 and reject H1; the implication is that β̂ is statistically insignificant. If the hypothesized value of β in the null hypothesis is outside the limits, reject H0 and accept H1; this indicates that β̂ is statistically significant.
Numerical Example:
Suppose we have estimated a regression line from a sample of 20 observations, with standard errors in brackets: (38.2) for the intercept and (0.85) for the slope.
a. Construct a 95% confidence interval for the slope parameter.
b. Test the significance of the slope parameter using the constructed confidence interval.
Solution:
a. The limits within which the true β1 lies at the 95% confidence level are:

β̂1 ± tc·se(β̂1)

tc at the 0.025 level of significance and 18 degrees of freedom is 2.10, so the interval is:

β̂1 ± (2.10)(0.85) = β̂1 ± 1.785

b. The slope is statistically significant at the 5% level if and only if the hypothesized value β1 = 0 lies outside this interval.
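A confidence interval of this form is straightforward to compute. The sketch below is illustrative (SciPy assumed), and b1 is a hypothetical placeholder, since the slope estimate itself is not reported above:

```python
from scipy import stats

b1 = 2.5          # placeholder slope estimate (hypothetical; not given above)
se_b1 = 0.85      # standard error of the slope, as reported
n = 20
t_c = stats.t.ppf(0.975, df=n - 2)   # ≈ 2.10 at 18 df

lower, upper = b1 - t_c * se_b1, b1 + t_c * se_b1
print(round(lower, 3), round(upper, 3))
# beta1-hat is significant at the 5% level iff 0 lies outside (lower, upper)
```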
Example: The following table gives the quantity supplied (Y in tons) and its price (X pound per
ton) for a commodity over a period of twelve years.
Table 3.6: Data on supply and price for given commodity
Y 69 76 52 56 57 77 58 55 67 53 72 64
X 9 12 6 10 9 10 7 8 12 6 11 8
Table 3.7: Computations for the supply example (X̄ = 9, Ȳ = 63; Ŷi = 33.75 + 3.25Xi)
t     Y    X    XY    X²    Y²     x    y    xy   x²   y²    Ŷ       e       e²
1     69   9    621   81    4761   0    6    0    0    36    63.00   6.00    36.00
2     76   12   912   144   5776   3    13   39   9    169   72.75   3.25    10.56
3     52   6    312   36    2704   -3   -11  33   9    121   53.25   -1.25   1.56
4     56   10   560   100   3136   1    -7   -7   1    49    66.25   -10.25  105.06
5     57   9    513   81    3249   0    -6   0    0    36    63.00   -6.00   36.00
6     77   10   770   100   5929   1    14   14   1    196   66.25   10.75   115.56
7     58   7    406   49    3364   -2   -5   10   4    25    56.50   1.50    2.25
8     55   8    440   64    3025   -1   -8   8    1    64    59.75   -4.75   22.56
9     67   12   804   144   4489   3    4    12   9    16    72.75   -5.75   33.06
10    53   6    318   36    2809   -3   -10  30   9    100   53.25   -0.25   0.06
11    72   11   792   121   5184   2    9    18   4    81    69.50   2.50    6.25
12    64   8    512   64    4096   -1   1    -1   1    1     59.75   4.25    18.06
Sum   756  108  6960  1020  48522  0    0    156  48   894   756.00  0.00    387.00
(where x = X − X̄, y = Y − Ȳ, and e = Y − Ŷ)
Use Table 3.6 and Table 3.7 to answer the following questions.
1. Estimate the Coefficient of determination (r2)
2. Run significance test of regression coefficients using the following test methods
A) The standard error test
B) The students t-test
3. Fit the linear regression equation and determine the 95% confidence interval for the slope.
Solution
1. Estimate the coefficient of determination (r²)
Use the data in Table 3.7 to estimate r² with the formula given below:

r² = 1 − ∑ei²/∑yi² = 1 − 387/894 = 1 − 0.43 = 0.57
This result shows that 57% of the variation in the quantity supplied of the commodity under consideration is explained by the variation in the price of the commodity, and the remaining 43% is left unexplained by the price of the commodity. In other words, there may be other important explanatory variables left out that could contribute to the variation in the quantity supplied of the commodity under consideration.
2. Run significance tests of the regression coefficients using the following test methods.
The fitted regression line for the given data is:

Ŷi = 33.75 + 3.25Xi
      (8.3)   (0.9)

where the numbers in parentheses are the standard errors of the respective coefficients.
A. Standard error test
In testing the statistical significance of the estimates using the standard error test, the following information is needed for the decision. Since there are two parameter estimates in the model, we have to test them separately.
Testing for β1
We have the following information about β̂1: β̂1 = 3.25 and se(β̂1) = 0.9.
The following are the null and alternative hypotheses to be tested:
H0: β1 = 0
H1: β1 ≠ 0
Since the standard error of β̂1 is less than half of the value of β̂1 (0.9 < 3.25/2), we reject the null hypothesis and conclude that the parameter estimate β̂1 is statistically significant.
Testing for β0
Again, we have the following information about β̂0: β̂0 = 33.75 and se(β̂0) = 8.3.
The hypotheses to be tested are given as follows:
H0: β0 = 0
H1: β0 ≠ 0
Since the standard error of β̂0 is less than half of the numerical value of β̂0 (8.3 < 33.75/2), we reject the null hypothesis and conclude that β̂0 is statistically significant.
B. Student's t-test
In the illustrative example, we can apply the t-test to see whether the price of the commodity is significant in determining the quantity supplied of the commodity under consideration. Use α = 0.05.
The hypothesis to be tested is:
H0: β1 = 0
H1: β1 ≠ 0
The estimates are known:
β̂1 = 3.25, se(β̂1) = 0.8979
so the computed value is t* = 3.25/0.8979 ≈ 3.62.
Further, the tabulated value of t is 2.228. When we compare these two values, the calculated t is greater than the tabulated value. Hence, we reject the null hypothesis. Rejecting the null hypothesis means concluding that the price of the commodity is significant in determining the quantity supplied of the commodity.
3. Fit the linear regression equation and determine the 95% confidence interval for the slope.
The fitted regression model, as indicated above, is Ŷi = 33.75 + 3.25Xi, with standard errors (8.3) and (0.9) for the intercept and the slope, respectively. To construct the confidence interval, we need the standard error of the slope, determined as follows:

σ̂u² = ∑ei²/(n − k) = 387/(12 − 2) = 387/10 = 38.7

Var(β̂1) = σ̂u²/∑xi² = 38.7 × (1/48) = 0.80625

The standard error of the slope is se(β̂1) = √Var(β̂1) = √0.80625 = 0.8979.
The tabulated value of t for 12 − 2 = 10 degrees of freedom and α/2 = 0.025 is 2.228.
Hence the 95% confidence interval for the slope is given by:

β̂1 ± tc·se(β̂1) = 3.25 ± (2.228)(0.8979) = 3.25 ± 2, i.e., (1.25, 5.25)

The value of β1 in the null hypothesis is zero, which lies outside the confidence interval. Hence β̂1 is statistically significant.
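As a final cross-check, this Python sketch (illustrative; NumPy and SciPy assumed) reproduces the entire supply example: the OLS estimates, r², the t-test and the 95% confidence interval:

```python
import numpy as np
from scipy import stats

# Quantity supplied (Y, tons) and price (X, pounds per ton), Table 3.6
Y = np.array([69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64], dtype=float)
X = np.array([9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8], dtype=float)
n, k = len(Y), 2

# OLS estimates in deviation form
x, y = X - X.mean(), Y - Y.mean()
b1 = (x * y).sum() / (x**2).sum()          # 156/48 = 3.25
b0 = Y.mean() - b1 * X.mean()              # 63 - 3.25*9 = 33.75

e = Y - (b0 + b1 * X)                      # residuals
r2 = 1 - (e**2).sum() / (y**2).sum()       # 1 - 387/894 ≈ 0.57

sigma2 = (e**2).sum() / (n - k)            # 38.7
se_b1 = np.sqrt(sigma2 / (x**2).sum())     # ≈ 0.8979

t_star = b1 / se_b1                        # ≈ 3.62
t_c = stats.t.ppf(0.975, df=n - k)         # ≈ 2.228
ci = (b1 - t_c * se_b1, b1 + t_c * se_b1)  # ≈ (1.25, 5.25)

print(b0, b1, round(r2, 2), round(t_star, 2),
      tuple(round(v, 2) for v in ci))
```

Running it reproduces every figure computed by hand above, which is a useful sanity check on the arithmetic.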