
Paper No and Title 8, Fundamentals of Econometrics Module No and Title 2, Estimation of Regression Analysis Module Tag BSE - P8 - M2

Uploaded by

Maadhav Sehgal

____________________________________________________________________________________________________

Subject Business Economics

Paper No and Title 8, Fundamentals of Econometrics

Module No and Title 2, Estimation of regression analysis

Module Tag BSE_P8_M2

BUSINESS PAPER No. : 8, FUNDAMENTALS OF ECONOMETRICS


ECONOMICS MODULE No. : 2, ESTIMATION OF REGRESSION
ANALYSIS
1
____________________________________________________________________________________________________

TABLE OF CONTENTS
1. POPULATION AND SAMPLE REGRESSION FUNCTION
2. METHODS FOR ESTIMATING REGRESSION MODEL
3. THE LEAST SQUARE METHODS
3.1. NECESSARY REQUIREMENT FOR OLS ESTIMATES
3.2. INTERPRETATION OF THE COEFFICIENTS
4. VARIANCE AND STANDARD ERROR OF THE OLS ESTIMATES
4.1. THE VARIANCE AND STANDARD ERROR OF THE OLS ESTIMATES
4.2. VARIANCE ESTIMATES OF DISTURBANCE TERM
5. SPURIOUS REGRESSION MODEL
6. NUMERICAL PROPERTIES OF OLS
7. SUMMARY


1. POPULATION AND SAMPLE REGRESSION FUNCTION


The Population Regression Function (PRF) can be defined as follows:

$$Y = \beta_0 + \beta_1 X + u$$

where $Y$ is the dependent variable, $X$ is the explanatory variable, $u$ is the disturbance term, and $\beta_0$ and $\beta_1$ are fixed but unknown parameters.

The right-hand side of the Population Regression Function can be divided into two parts:
(1) The systematic component: $\beta_0 + \beta_1 X = E(Y \mid X)$
(2) The disturbance term: $u$

We need to estimate the population parameters $\beta_0$ and $\beta_1$ from a given sample, since it is not possible to observe the whole population. The sample counterpart of the PRF is known as the Sample Regression Function (SRF), which can be expressed as follows:

$$Y_i = b_0 + b_1 X_i + e_i = \hat{Y}_i + e_i, \qquad i = 1, 2, \ldots, n$$

where $\hat{Y}_i = b_0 + b_1 X_i$ is known as the fitted value of $Y_i$. $\hat{Y}_i$ is also known as the estimated conditional mean of $Y_i$. The residual $e_i$ measures the deviation of the sample value $Y_i$ from the estimated conditional mean $\hat{Y}_i$.

The sample counterparts of the parameters and terms of the PRF are as follows:

| Population | Sample |
|---|---|
| $\beta_0$ | $b_0$ |
| $\beta_1$ | $b_1$ |
| $u_i$ | $e_i$ |
| $E(Y \mid X_i)$ | $\hat{Y}_i$ |

The sample estimates $b_0$ and $b_1$ can be calculated for a given sample, but these estimates will change from sample to sample. The population parameters $\beta_0$ and $\beta_1$, however, are fixed but remain unknown. The relationship between the sample and population regression lines can be seen in figure 1 below:


2. METHODS FOR ESTIMATING REGRESSION MODEL

We want the deviation of $Y_i$ from $\hat{Y}_i$ to be as small as possible. In other words, we want the residuals $e_i$ to be as small as possible. This can be done in three different ways:

(1) Minimize the sum of the deviations
(2) Minimize the sum of the absolute deviations
(3) Minimize the sum of the squared deviations

Criterion 1: Minimize the sum of the deviations

According to this criterion, the values of $b_0$ and $b_1$ would be chosen in such a way that the sum of all the errors is (near) zero. This can be achieved by minimizing the following function:

$$\min \left| \sum_{i=1}^{n} e_i \right|$$

Although this criterion is intuitively appealing, it has a serious problem. Residuals with positive signs can be offset by residuals with negative signs. There are therefore infinitely many lines whose sum of residuals $\sum_{i=1}^{n} e_i$ equals zero: every line that passes through the point of sample means $(\bar{X}, \bar{Y})$ has this property, whatever its slope.

Criterion 2: Minimize the sum of the absolute deviations


According to this criterion, the values of $b_0$ and $b_1$ would be chosen in such a way that they minimize the sum of the absolute deviations. This can be achieved by minimizing the following function:

$$\min \sum_{i=1}^{n} |e_i|$$

This approach is also known as the 'minimum absolute distance' (MAD) estimator because it minimizes the absolute distance between $Y_i$ and $\hat{Y}_i$. It avoids the possibility of positive errors being offset by negative errors. All deviations are given equal weight, which makes the estimator more resistant to the influence of outliers. However, it is not very popular because its calculation is complicated and involves linear programming or iterative methods.

Criterion 3: Minimize the sum of the squared deviations

According to this criterion, the values of $b_0$ and $b_1$ would be chosen in such a way that they minimize the sum of the squared deviations. This can be achieved by minimizing the following function:

$$\min \sum_{i=1}^{n} e_i^2$$

This approach is known as the least squares estimation method. Because the residuals are squared, positive and negative residuals cannot offset each other. The approach puts more weight on observations with large deviations and less weight on observations with small deviations. The least squares estimates are also easy to calculate, and they have some very useful properties under relatively general conditions.
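The difference between the three criteria can be illustrated with a short Python sketch. It uses the income and consumption figures from Numerical Example 1 of this module; the candidate slopes and the helper names (`residuals`, `line_through_means`, `criteria`) are hypothetical choices made only for this illustration. Every candidate line is forced through the point of sample means, so criterion 1 cannot tell the lines apart, while criteria 2 and 3 can.

```python
# Weekly income (X) and consumption (Y) figures from Numerical Example 1.
income = [1000, 1500, 2000, 2500, 3000, 3500]
consumption = [800, 1200, 1450, 1500, 1850, 2250]


def residuals(b0, b1, x, y):
    """Return the residuals e_i = y_i - b0 - b1 * x_i of a candidate line."""
    return [yi - b0 - b1 * xi for xi, yi in zip(x, y)]


def line_through_means(slope, x, y):
    """Return (b0, b1) for the line with the given slope through (x_bar, y_bar)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    return y_bar - slope * x_bar, slope


def criteria(b0, b1, x, y):
    """Return the three criterion values: |sum e|, sum |e| and sum e^2."""
    e = residuals(b0, b1, x, y)
    return abs(sum(e)), sum(abs(v) for v in e), sum(v * v for v in e)


# Arbitrary illustrative slopes (hypothetical choices, not estimates).
for slope in (0.0, 0.5, 2.0):
    b0, b1 = line_through_means(slope, income, consumption)
    c1, c2, c3 = criteria(b0, b1, income, consumption)
    print(f"slope={slope}: |sum e|={c1:.6f}, sum|e|={c2:.2f}, sum e^2={c3:.2f}")
```

The first criterion value is zero (up to floating-point error) for all three lines, even though the lines fit the data very differently; the absolute and squared criteria separate them.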

3. THE LEAST SQUARE METHODS

Recall from the Sample Regression Function:

$$Y_i = \hat{Y}_i + e_i$$
$$e_i = Y_i - \hat{Y}_i$$
$$e_i = Y_i - b_0 - b_1 X_i$$

We want to minimize the sum of squared errors (ESS), also called the residual sum of squares (RSS), i.e. $\min \sum_{i=1}^{n} e_i^2$. This is shown graphically in figure 2 below:

The residual sum of squares can be expanded as follows:

$$\begin{aligned}
RSS &= \sum_{i=1}^{n} e_i^2 \\
&= \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \\
&= \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 \\
&= \sum_{i=1}^{n} \left( Y_i^2 - 2 b_0 Y_i - 2 b_1 X_i Y_i + b_0^2 + 2 b_0 b_1 X_i + b_1^2 X_i^2 \right)
\end{aligned}$$

The estimates $b_0$ and $b_1$ are obtained by partially differentiating the above expression with respect to $b_0$ and $b_1$:

$$\begin{aligned}
\frac{\partial RSS}{\partial b_0} &= \sum_{i=1}^{n} (-2 Y_i + 2 b_0 + 2 b_1 X_i) \\
&= -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) \\
&= -2 \sum_{i=1}^{n} e_i
\end{aligned}$$

$$\begin{aligned}
\frac{\partial RSS}{\partial b_1} &= \sum_{i=1}^{n} (-2 X_i Y_i + 2 b_0 X_i + 2 b_1 X_i^2) \\
&= -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) X_i \\
&= -2 \sum_{i=1}^{n} e_i X_i
\end{aligned}$$

We set both derivatives equal to zero:

$$\frac{\partial RSS}{\partial b_0} = -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) = 0$$

$$\frac{\partial RSS}{\partial b_1} = -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) X_i = 0$$

We then obtain

$$\sum_{i=1}^{n} Y_i = n b_0 + b_1 \sum_{i=1}^{n} X_i$$

$$\sum_{i=1}^{n} X_i Y_i = b_0 \sum_{i=1}^{n} X_i + b_1 \sum_{i=1}^{n} X_i^2$$

These two equations are known as the Ordinary Least Squares normal equations. They are two equations in two unknowns. Solving them gives $b_1$ and $b_0$:

$$b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$

$$b_0 = \bar{Y} - b_1 \bar{X}$$
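As a minimal sketch, the two normal equations can be solved directly as a 2x2 linear system and compared with the deviation-form solution. The function names and the small data set below are hypothetical and serve only as an illustration; Cramer's rule is one convenient way to solve the system, not the only one.

```python
def ols_from_normal_equations(x, y):
    """Solve  n*b0 + b1*sum(x) = sum(y)  and  b0*sum(x) + b1*sum(x^2) = sum(x*y)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    det = n * sum_xx - sum_x * sum_x  # equals n * sum((x_i - x_bar)^2)
    if det == 0:
        raise ValueError("no variation in x: the OLS estimates are not defined")
    b0 = (sum_y * sum_xx - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


def ols_from_deviations(x, y):
    """Closed-form solution written in deviations from the sample means."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical illustrative data (not taken from the module).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(ols_from_normal_equations(x, y))
print(ols_from_deviations(x, y))
```

Both routes return the same pair of estimates. The determinant check also shows why some variation in the explanatory variable is needed: when all values of `x` are equal the system has no unique solution.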

3.1. Necessary Requirement for OLS estimate

We can always compute the OLS estimates for a particular sample as long as $\sum_{i=1}^{n} (X_i - \bar{X})^2 > 0$. In other words, the $X_i$ must not all be equal: there must be some variation in $X_i$.

3.2. Interpretation of the Coefficients

$b_0$: Geometrically, $b_0$ represents the intercept and denotes the point where the regression line cuts the y-axis. Econometrically, it represents the average value of $Y$ when $X = 0$. It may or may not have a substantive meaning, depending on the problem.

$b_1$: It captures the change in $Y$ when $X$ changes by 1 unit. It can also be interpreted in terms of derivatives and marginal effects.

Consider a simple regression model

$$Y_i = b_0 + b_1 X_i + e_i$$

The marginal effect of $X$ on $Y$ is calculated as

$$\frac{\partial Y}{\partial X} = b_1$$

Thus $b_1$ measures the marginal effect of $X$ on $Y$. When we have an additive model like the one above, the marginal effect of $X$ on $Y$ is the same as the effect of a one-unit increase in $X$ on $Y$. However, when we have a multiplicative or non-linear model, the two are not the same.
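To see why the two notions can differ, consider a hypothetical quadratic model. The functional form below is an assumption chosen only for illustration and is not part of this module:

$$Y = b_0 + b_1 X^2$$

The marginal effect is the derivative, which depends on the level of $X$:

$$\frac{\partial Y}{\partial X} = 2 b_1 X$$

The effect of a one-unit increase in $X$ is the discrete change:

$$Y(X + 1) - Y(X) = b_1 (X + 1)^2 - b_1 X^2 = b_1 (2X + 1)$$

The two expressions differ by $b_1$, whereas in the linear model both equal $b_1$.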

4. VARIANCE AND STANDARD ERROR OF THE OLS ESTIMATES

The variance of a random variable measures the dispersion of that variable around its mean. If the variance is small, individual realizations of the variable tend to lie close to the mean. An estimator with a smaller variance also yields a narrower confidence interval for the parameter. The precision of an estimator is therefore captured by its variance, so it is worth computing the variances of the ordinary least squares estimates $b_0$ and $b_1$.

4.1 The variance and standard error of the OLS estimates:

It should be noted that the OLS estimates $b_0$ and $b_1$ depend on the dependent variable $Y_i$. The $Y_i$ in turn depend on the disturbance terms $u_1, u_2, \ldots, u_n$. Therefore the OLS estimates are random variables with associated distributions.
The variances and standard errors of the ordinary least squares estimates $b_0$ and $b_1$ are given below:

$$Var(b_0) = \frac{\sum_i X_i^2}{n \sum_i (X_i - \bar{X})^2} \, \sigma^2$$

$$se(b_0) = \sqrt{\frac{\sum_i X_i^2}{n \sum_i (X_i - \bar{X})^2}} \, \sigma$$

where $\sigma^2$ is the homoscedastic variance of the disturbance term.

$$Var(b_1) = \frac{\sigma^2}{\sum_i (X_i - \bar{X})^2}$$

$$se(b_1) = \frac{\sigma}{\sqrt{\sum_i (X_i - \bar{X})^2}}$$

The covariance between the OLS estimates $b_0$ and $b_1$ is

$$Cov(b_0, b_1) = -\bar{X} \, Var(b_1) = -\bar{X} \, \frac{\sigma^2}{\sum_i (X_i - \bar{X})^2}$$

From the above formulae, it is clear that when there is large variation in $X_i$ and the sample size is large, the variances and corresponding standard errors of the estimates are smaller. Smaller variances in turn improve the precision of the ordinary least squares estimates.

The problem with the above expressions is that the variances of the estimates cannot be computed directly, because $\sigma^2$ is unknown.
4.2 Variance estimates of disturbance term

The population variance $\sigma^2$ can be estimated from the sample. Consider the following equation:

$$\hat{Y}_i = b_0 + b_1 X_i$$

The above equation is a straight line. So the estimate of $u_i$ can be obtained as $e_i = Y_i - b_0 - b_1 X_i$, where $e_i$ is the estimated residual. An estimator of $\sigma^2$ is found by estimating the variance of the error term and correcting it for the degrees of freedom lost in calculating $b_0$ and $b_1$. An unbiased estimator of $\sigma^2$ can therefore be obtained as follows:

$$\hat{\sigma}^2 = \frac{\sum_i e_i^2}{n - 2} = \frac{RSS}{n - 2}$$

where $\hat{\sigma}^2$ is the OLS estimator of the true but unknown $\sigma^2$ and $n - 2$ is the number of degrees of freedom. The standard error of the error term is found by taking the square root of $\hat{\sigma}^2$. It is also known as the root mean square error or the standard error of the disturbance term.

$$\hat{\sigma} = \sqrt{\frac{\sum_i e_i^2}{n - 2}}$$

Since $\hat{\sigma}^2$ is an estimate of the variance of the error term $u_i$, it is also an estimate of the variance of $Y_i$ conditional on $X_i$.
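The formulas of sections 4.1 and 4.2 can be combined in a short Python sketch: the unknown $\sigma^2$ is replaced by its estimate $\hat{\sigma}^2 = RSS/(n-2)$, which gives estimated variances and standard errors. The function name `ols_precision`, the dictionary keys and the small data set are hypothetical and chosen only for illustration.

```python
import math


def ols_precision(x, y):
    """Return OLS estimates with estimated variances, standard errors and covariance."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar

    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)  # two degrees of freedom lost for b0 and b1

    var_b1 = sigma2_hat / s_xx
    var_b0 = sigma2_hat * sum(xi * xi for xi in x) / (n * s_xx)
    return {
        "b0": b0,
        "b1": b1,
        "sigma2_hat": sigma2_hat,
        "se_b0": math.sqrt(var_b0),
        "se_b1": math.sqrt(var_b1),
        "cov_b0_b1": -x_bar * var_b1,
    }


# Hypothetical illustrative data (not taken from the module).
result = ols_precision([0, 1, 2, 3], [1, 3, 2, 4])
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The sketch needs at least three observations with some variation in `x`; otherwise the divisions by `n - 2` or by `s_xx` are not defined.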

Numerical Example 1

We illustrate an economic theory by considering a Keynesian consumption function. The Fundamental Psychological Law states that, on average, consumption rises as income increases, but the increase in consumption is less than the increase in income. In other words, the marginal propensity to consume is greater than zero but less than one. The exact functional relationship between income and consumption is not specified by Keynes. However, for the sake of simplicity, we assume that the relationship between consumption and income is linear. The raw data for the weekly consumption expenditure and weekly income of a family are given below:

| Weekly consumption expenditure (in Rupees) | Income (in Rupees) |
|---|---|
| 800 | 1000 |
| 1200 | 1500 |
| 1450 | 2000 |
| 1500 | 2500 |
| 1850 | 3000 |
| 2250 | 3500 |

From the data given above:

(1) Find the ordinary least squares estimates of the intercept and slope coefficients.
(2) Find the estimated regression equation for weekly consumption expenditure and weekly income.
(3) Interpret the economic meaning of the intercept and slope coefficients.
(4) Find the predicted value of weekly consumption expenditure when the weekly income is Rs 3300.
(5) Calculate the variance and standard deviation of the disturbance term. Interpret its meaning.

Solution:
Here weekly consumption expenditure is the dependent variable and income is the explanatory variable. Let $Y$ represent the dependent variable and $X$ represent the explanatory variable. In the simple linear regression model, the relationship between weekly consumption expenditure and weekly income can be written as follows:

$$Y_i = b_0 + b_1 X_i + e_i$$

The calculation of the slope and intercept coefficients is given in the table below:

| $Y$ | $X$ | $Y - \bar{Y}$ | $X - \bar{X}$ | $(X - \bar{X})^2$ | $(X - \bar{X})(Y - \bar{Y})$ |
|---|---|---|---|---|---|
| 800 | 1000 | -708.33 | -1250 | 1562500 | 885412.5 |
| 1200 | 1500 | -308.33 | -750 | 562500 | 231247.5 |
| 1450 | 2000 | -58.33 | -250 | 62500 | 14582.5 |
| 1500 | 2500 | -8.33 | 250 | 62500 | -2082.5 |
| 1850 | 3000 | 341.67 | 750 | 562500 | 256252.5 |
| 2250 | 3500 | 741.67 | 1250 | 1562500 | 927087.5 |
| $\sum Y = 9050$ | $\sum X = 13500$ | | | $\sum (X - \bar{X})^2 = 4375000$ | $\sum (X - \bar{X})(Y - \bar{Y}) = 2312500$ |


The mean value of weekly consumption expenditure is calculated as follows:

$$\bar{Y} = \frac{\sum Y_i}{n} = \frac{9050}{6} = 1508.33$$

Thus the average weekly consumption expenditure is Rs 1508.33.

Similarly, the mean value of weekly income is calculated as follows:

$$\bar{X} = \frac{\sum X_i}{n} = \frac{13500}{6} = 2250$$

Thus the average weekly income of a family is Rs 2250.

Now we calculate the slope and intercept coefficients:

$$b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{2312500}{4375000} = 0.52857$$

$$b_0 = \bar{Y} - b_1 \bar{X} = 319.0476$$
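The calculation above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the table; the variable names are chosen for this illustration.

```python
# Data from Numerical Example 1: weekly income (X) and consumption (Y) in Rupees.
income = [1000, 1500, 2000, 2500, 3000, 3500]
consumption = [800, 1200, 1450, 1500, 1850, 2250]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(consumption) / n

# Sums of squared deviations and cross products (last two columns of the table).
s_xx = sum((x - x_bar) ** 2 for x in income)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, consumption))

b1 = s_xy / s_xx           # slope coefficient
b0 = y_bar - b1 * x_bar    # intercept coefficient

print(f"mean of Y = {y_bar:.2f}, mean of X = {x_bar:.2f}")
print(f"sum of squared deviations = {s_xx:.0f}, sum of cross products = {s_xy:.0f}")
print(f"b1 = {b1:.5f}, b0 = {b0:.4f}")
```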

(a) The slope coefficient is 0.52857 and the intercept coefficient is 319.0476.

(b) The estimated regression equation for weekly consumption expenditure and weekly income can be written as follows:

$$\hat{Y}_i = 319.0476 + 0.52857 \, X_i$$

(c) The value of $b_1 = 0.52857$ measures the slope of the regression line. It shows that, within the sample range of weekly income (Rs 1000 to Rs 3500), an increase in income of Re 1 raises estimated weekly consumption expenditure by Re 0.52857 on average. So if weekly income increases by Rs 100, weekly consumption expenditure will rise on average by approximately Rs 53.

The value of $b_0 = 319.0476$ measures the intercept of the regression line. It indicates that the average level of weekly consumption expenditure of a family is Rs 319 when weekly income is zero. So if a family does not have any weekly income, it will have to finance some basic level of consumption either by dissaving or by borrowing. This is economically plausible. However, such a mechanical interpretation of the intercept term may not always be meaningful in other regression models. It is therefore best to interpret the intercept term as the mean or average effect on the dependent variable of all the variables omitted from the regression model.

(d) The predicted value of weekly consumption when weekly income is Rs 3300 can be obtained as follows:

$$\hat{Y} = 319.0476 + 0.52857 \times 3300 = 2063.33$$

Therefore, when weekly income is Rs 3300, the expected weekly consumption expenditure is Rs 2063. This illustrates the usefulness of regression analysis: it can be used to predict the value of one variable from a given value of another variable.

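(e) The variance and standard deviation of the disturbance term can be calculated as follows:

$$\hat{\sigma}^2 = \frac{\sum e_i^2}{n - 2} = \frac{RSS}{n - 2}$$

where

$$\sum e_i^2 = \sum Y_i^2 - b_0 \sum Y_i - b_1 \sum X_i Y_i$$

Here $\sum Y_i^2 = 14917500$, $\sum Y_i = 9050$ and $\sum X_i Y_i = 22675000$.

So

$$\sum e_i^2 = 14917500 - 319.0476 \times 9050 - 0.52857 \times 22675000 = 44794.47$$

Therefore,

$$\hat{\sigma}^2 = \frac{44794.47}{4} = 11198.617$$

and

$$\hat{\sigma} = \sqrt{11198.617} = 105.8235$$

The variance and standard deviation of the disturbance term are 11198.617 and 105.8235 respectively. These figures are based on the rounded coefficients shown above. Because the shortcut formula subtracts large, nearly equal numbers, carrying the coefficients at full precision gives $\sum e_i^2 \approx 44761.90$, $\hat{\sigma}^2 \approx 11190.48$ and $\hat{\sigma} \approx 105.79$.

The interpretation of the standard deviation is as follows: the standard deviation of 105.8235 is the magnitude of a typical deviation from the estimated regression line. Some points lie closer to the regression line and other points lie farther away from it.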
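The sensitivity of the shortcut formula $\sum e_i^2 = \sum Y_i^2 - b_0 \sum Y_i - b_1 \sum X_i Y_i$ to rounding can be checked with a short Python sketch. It evaluates the formula once with the rounded coefficients 319.0476 and 0.52857 and once with unrounded coefficients, and compares both with the residuals computed directly. The helper name `rss_shortcut` is hypothetical.

```python
import math

# Data from Numerical Example 1: weekly income (X) and consumption (Y) in Rupees.
income = [1000, 1500, 2000, 2500, 3000, 3500]
consumption = [800, 1200, 1450, 1500, 1850, 2250]
n = len(income)

sum_y = sum(consumption)
sum_y2 = sum(y * y for y in consumption)
sum_xy = sum(x * y for x, y in zip(income, consumption))

# Unrounded OLS estimates.
x_bar, y_bar = sum(income) / n, sum_y / n
s_xx = sum((x - x_bar) ** 2 for x in income)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, consumption)) / s_xx
b0 = y_bar - b1 * x_bar


def rss_shortcut(intercept, slope):
    """Shortcut formula: sum(Y^2) - b0 * sum(Y) - b1 * sum(X * Y)."""
    return sum_y2 - intercept * sum_y - slope * sum_xy


rss_rounded = rss_shortcut(319.0476, 0.52857)  # coefficients as rounded in the text
rss_full = rss_shortcut(b0, b1)                # coefficients at full precision
rss_direct = sum((y - b0 - b1 * x) ** 2 for x, y in zip(income, consumption))

for label, rss in (("rounded", rss_rounded), ("full", rss_full), ("direct", rss_direct)):
    variance = rss / (n - 2)
    print(f"{label}: RSS={rss:.2f}, variance={variance:.3f}, std dev={math.sqrt(variance):.4f}")
```

Summing the squared residuals directly avoids the subtraction of large numbers and is the more robust way to obtain the residual sum of squares.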

Numerical Example 2

It is widely believed in labour economics that wage earnings depend on the level of education. The estimated relationship between monthly earnings and the number of years of education is as follows:


̂� = . + . �
= .
= .
= .
= .
�= .

(a) Interpret the slope and intercept coefficients of the above wage-education regression model.
(b) What does the standard deviation of the disturbance term indicate?
Answer:
(a) There is a positive relationship between the level of education and monthly wage earnings. Each additional year of schooling raises monthly wage earnings by approximately 42%. The intercept term is positive; however, no economic meaning is attached to it.

(b) The standard deviation of the disturbance term is small, indicating that the individual sample observations do not deviate far from the regression line.

5. SPURIOUS REGRESSION MODEL

Consider the following model:

$$Y = \beta_0 + \beta_1 X + u$$

In the above specification of the model we implicitly assumed that $X$ causes $Y$. We generally use $R^2$ as a measure of goodness of fit. However, it cannot be used to identify the direction of causality. So even if $X$ and $Y$ are highly correlated, this provides no clue as to whether changes in $X$ cause changes in $Y$ or changes in $Y$ cause changes in $X$. For instance, the correlation coefficient between the elephant population and the human population may be quite high in Assam. Does this mean that a change in the elephant population causes a change in the human population, or vice versa? This is clearly not the case, and we have a situation of spurious correlation. In such cases, if we regress one variable on the other, we will have a spurious regression.

Consider a second, more realistic example. Suppose we run a regression model in which the number of crimes in a city is the dependent variable and the number of policemen is the explanatory variable. Let us assume that we obtain a positive slope coefficient. In such a situation, can we say that a larger number of policemen in a city increases the number of crimes in that city? Common sense tells us that the answer is definitely no. Instead, it is possible that the city employs more policemen because crime has increased. A more plausible regression model would treat the number of crimes as the independent variable and the number of policemen as the dependent variable. It is also possible that there are other factors that need to be incorporated in the regression model. Therefore due care, based on economic theory and other information, needs to be taken before formulating a regression model.
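A small Python sketch illustrates why goodness of fit cannot reveal the direction of causality. In a two-variable regression, $R^2$ equals the squared correlation coefficient, so it is identical whichever variable is treated as dependent. The sketch reuses the consumption and income figures from Numerical Example 1; the function names are hypothetical.

```python
# Data from Numerical Example 1: weekly income and consumption in Rupees.
income = [1000, 1500, 2000, 2500, 3000, 3500]
consumption = [800, 1200, 1450, 1500, 1850, 2250]


def deviations_sums(x, y):
    """Return (S_xx, S_yy, S_xy), the sums of squared deviations and cross products."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xx, s_yy, s_xy


def slope(x, y):
    """OLS slope from regressing y on x."""
    s_xx, _, s_xy = deviations_sums(x, y)
    return s_xy / s_xx


def r_squared(x, y):
    """R^2 of the regression of y on x (squared correlation coefficient)."""
    s_xx, s_yy, s_xy = deviations_sums(x, y)
    return s_xy ** 2 / (s_xx * s_yy)


print("consumption on income: slope", slope(income, consumption),
      "R^2", r_squared(income, consumption))
print("income on consumption: slope", slope(consumption, income),
      "R^2", r_squared(consumption, income))
```

The two regressions have different slopes but exactly the same $R^2$, so the fit statistic alone says nothing about which variable drives the other.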

6. NUMERICAL PROPERTIES OF OLS


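The OLS estimates obtained from the sample data always satisfy the least squares criterion. They can be used to draw the sample regression line. The sample regression line obtained from the OLS estimates has the following numerical properties.

1. The sample regression line always passes through the sample means of $X$ and $Y$:

$$\bar{Y} = b_0 + b_1 \bar{X}$$

This property holds when the sample regression model has an intercept term $b_0$. To derive the above equation, recall that $b_0 = \bar{Y} - b_1 \bar{X}$, which can be rewritten as $\bar{Y} = b_0 + b_1 \bar{X}$. So the predicted value of the dependent variable is $\bar{Y}$ when the explanatory variable equals $\bar{X}$.

2. The sum and the average value of the residuals $e_i$ are zero:

$$\sum e_i = 0 \quad \text{and} \quad \bar{e} = 0$$

To prove the above property, recall the first-order condition from the derivation of the least squares estimates:

$$\frac{\partial RSS}{\partial b_0} = -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) = 0$$

or $-2 \sum_{i=1}^{n} e_i = 0$. So $\sum_{i=1}^{n} e_i = 0$ and hence $\bar{e} = 0$.

3. The mean value of the predicted $Y$ (say $\bar{\hat{Y}}$) is equal to the mean value of the actual $Y$:

$$\hat{Y}_i = b_0 + b_1 X_i = \bar{Y} - b_1 \bar{X} + b_1 X_i = \bar{Y} + b_1 (X_i - \bar{X})$$

Summing both sides over the sample values and dividing by the sample size $n$, we get

$$\frac{\sum_{i=1}^{n} \hat{Y}_i}{n} = \frac{\sum_{i=1}^{n} \bar{Y}}{n} + b_1 \frac{\sum_{i=1}^{n} (X_i - \bar{X})}{n}$$

Since $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$, it follows that

$$\bar{\hat{Y}} = \bar{Y}$$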

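4. The regressors and the residuals are uncorrelated, i.e. the covariance between the regressors and the residuals is zero:

$$Cov(X_i, e_i) = 0 \quad \text{or} \quad \sum_{i=1}^{n} X_i e_i = 0$$

To prove the above, recall the first-order condition from the derivation of the OLS estimates:

$$\frac{\partial RSS}{\partial b_1} = -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) X_i = 0$$

so that

$$\sum_{i=1}^{n} X_i e_i = 0$$

5. The predicted values of $Y_i$ and the residuals are uncorrelated, i.e. the covariance between the predicted values of $Y_i$ and the residuals is zero:

$$Cov(\hat{Y}_i, e_i) = 0 \quad \text{or} \quad \sum_{i=1}^{n} \hat{Y}_i e_i = 0$$

Proof:

$$\sum_{i=1}^{n} \hat{Y}_i e_i = \sum_{i=1}^{n} (b_0 + b_1 X_i) e_i = b_0 \sum_{i=1}^{n} e_i + b_1 \sum_{i=1}^{n} X_i e_i = 0$$

because $\sum_{i=1}^{n} X_i e_i = 0$ and $\sum_{i=1}^{n} e_i = 0$.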
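The five numerical properties can be checked on the data of Numerical Example 1 with a minimal Python sketch. Each quantity in the `checks` dictionary (a name chosen for this illustration) should be zero up to floating-point error.

```python
# Data from Numerical Example 1: weekly income (X) and consumption (Y) in Rupees.
x = [1000, 1500, 2000, 2500, 3000, 3500]
y = [800, 1200, 1450, 1500, 1850, 2250]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

checks = {
    "1. line passes through the means": y_bar - (b0 + b1 * x_bar),
    "2. sum of residuals": sum(resid),
    "3. mean fitted minus mean actual": sum(fitted) / n - y_bar,
    "4. sum of X_i * e_i": sum(xi * ei for xi, ei in zip(x, resid)),
    "5. sum of fitted_i * e_i": sum(fi * ei for fi, ei in zip(fitted, resid)),
}

for name, value in checks.items():
    print(f"{name}: {value:.6e} (zero within tolerance: {abs(value) < 1e-6})")
```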

7. SUMMARY
1. The sample regression function is used to estimate the population regression function because it is not possible to observe the population parameters.

2. There are basically three methods that can be used to estimate the sample regression function. They are: (a) minimizing the sum of the deviations, (b) minimizing the sum of the absolute deviations and (c) minimizing the sum of the squared deviations.

3. The least squares method is preferred for estimating the sample regression function because the least squares estimates possess useful statistical properties.

4. The intercept coefficient of ordinary least squares represents the average value of the dependent variable when the explanatory variable is zero. The slope coefficient captures the change in the dependent variable when the explanatory variable changes by one unit.

5. The standard errors measure the precision of the ordinary least squares estimates. If the standard errors are small, then the coefficients are precisely estimated.

6. In order to avoid running a spurious regression, one must be careful to choose the dependent and independent variables using economic theory and other prior information.

7. The ordinary least squares estimates have several useful numerical properties: (a) the sample regression line always passes through the sample means of $X$ and $Y$; (b) the sum and the average value of the residuals $e_i$ are zero; (c) the mean value of the predicted $Y$ (say $\bar{\hat{Y}}$) is equal to the mean value of the actual $Y$; (d) the regressors and the residuals are uncorrelated; (e) the predicted values of $Y_i$ and the residuals are uncorrelated.
