Introduction To Econometrics For Finance
Credit hour: 3
Course outline
Semester: 1st
Econometrics is concerned with summarizing relevant data & information by means of a model.
Such econometric models help us to understand the relation between economic and business variables
economic models.
By: Amare Mitiku
Course Objective
Chapters
3.1. The three variable model: notation and assumptions
3.2. Interpretation of multiple regressions
4.1. Heteroscedasticity
4.2. Autocorrelation
4.3. Multicollinearity
4.4. Distributed lag models and expectations
References
Wooldridge, Jeffrey M. (2002), Introductory Econometrics: A Modern Approach, 2nd edition.
Heij, Christiaan, de Boer, Paul, Franses, Philip Hans, Kloek, Teun, and van Dijk, Herman K. (2007), Econometric Methods with Applications in Business and Economics, Oxford University Press, 1st edition.
Salvatore, Dominick, and Reagle, Derrick (2002), Statistics and Econometrics, 2nd edition, McGraw-Hill.
Greene, W. (2003), Econometric Analysis, Prentice Hall, 3rd edition.
Gujarati, D.N. (2009), Basic Econometrics, McGraw-Hill, New York, 5th edition.
define econometrics.
Econometrics is the amalgamation of statistics and
mathematical Economics together with economic theories.
In other words,
Mathematical economics states economic theory in terms of
mathematical symbols.
There is no essential difference between mathematical economics
and economic theory.
Both state the same relationships, but while economic theory uses
verbal exposition(words and sentences), mathematics uses
symbols and equations.
Both express economic relationships in an exact or deterministic
form.
Neither mathematical economics nor economic theory allows for
random elements which might affect the relationship and make it
stochastic.
economic variables.
Econometric methods are designed to take into account random
economic statistics.
An economic statistician gathers empirical data, records them,
relationships.
Mathematical (or inferential) statistics deals with the method
Any economic theory is an observation from the real world. For one
The most important characteristics of economic relationships is
as: Q=b0+b1P+b2P0+b3Y+b4t
The above demand equation is exact. However, many more factors may
affect demand.
In econometrics the influence of these ‘other’ factors is taken into account by adding a random variable u to the equation:
Q=b0+b1P+b2P0+b3Y+b4t+u
where u stands for the random factors which affect the quantity
demanded.
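The difference between the exact and the stochastic demand equation can be sketched in a few lines of Python. The coefficient values, the input values and the normal distribution chosen for u are assumptions made only for illustration; they do not come from the course material.

```python
import random

# Hypothetical coefficients for Q = b0 + b1*P + b2*P0 + b3*Y + b4*t
# (illustrative values only, not estimates from any data set).
B0, B1, B2, B3, B4 = 100.0, -2.0, 1.5, 0.05, 0.3


def exact_demand(p, p0, y, t):
    """Exact (deterministic) demand: the same inputs always give the same Q."""
    return B0 + B1 * p + B2 * p0 + B3 * y + B4 * t


def stochastic_demand(p, p0, y, t, rng, sigma=5.0):
    """Stochastic demand: the exact part plus a random disturbance u."""
    u = rng.gauss(0.0, sigma)  # assumed: u is normal with mean zero
    return exact_demand(p, p0, y, t) + u


rng = random.Random(0)  # fixed seed so the sketch is reproducible
q_exact = exact_demand(10.0, 8.0, 1000.0, 1.0)
draws = [stochastic_demand(10.0, 8.0, 1000.0, 1.0, rng) for _ in range(5)]

print("exact Q:", q_exact)
print("stochastic Q:", [round(q, 2) for q in draws])
```

With identical values of P, P0, Y and t, the exact equation returns one number, while the stochastic equation returns a different quantity on each draw because of u.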
1.5 Methodology of econometrics
Econometric research is concerned with the measurement of the parameters of
economic relationships and with the prediction of the values of economic variables.
The relationships of economic theory which can be measured with econometric
techniques are relationships in which some variables are postulated as causes of the
variation of other variables.
Starting with the postulated theoretical relationships among economic variables,
In this step the econometrician has to express the relationships between
economic variables in mathematical form.
This step involves the determination of three important tasks:
i) the dependent and independent (explanatory) variables which will be included
in the model.
ii) the a priori theoretical expectations about the size and sign of the parameters of
the function.
iii) the mathematical form of the model (number of equations, specific form of the
equations, etc.)
Note: The specification of the econometric model will be based on economic
theory and on any available information related to the phenomena under
investigation.
Thus, specification of the econometric model presupposes knowledge of
economic theory and familiarity with the particular phenomenon being studied.
Specification of the model is the most important and the most difficult stage of any
econometric research.
It is often the weakest point of most econometric applications.
In this stage there is a high likelihood of committing errors or
incorrectly specifying the model.
Some of the common reasons for incorrect specification of econometric models
are:
1. the imperfections and looseness of statements in economic theories.
2. the limitation of our knowledge of the factors which are operative in any particular
case.
3. the formidable obstacles presented by data requirements in the estimation of large
models.
The most common errors of specification are:
a. Omissions of some important variables from the function.
b. The omissions of some equations (for example, in simultaneous equations model).
c. The mistaken mathematical form of the functions.
2. Estimation of the model
econometric methods, their assumptions and the economic implications for the
multicollinearity).
This stage consists of deciding whether the estimates of
the parameters are theoretically meaningful and
statistically satisfactory.
This stage enables the econometrician to evaluate the
results of calculations and determine the reliability of the
results.
For this purpose we use various criteria which may be
classified into three groups:
i. Economic a priori criteria:
These criteria are determined by economic theory and
refer to the size and sign of the parameters of economic
relationships.
magnitudes.
A time series data set consists of observations on a variable or several variables over
time.
Examples of time series data include stock prices, money supply, consumer price
index, gross domestic product, annual homicide rates, and automobile sales figures.
Because past events can influence future events and lags in behaviour are prevalent in
the social sciences, time is an important dimension in a time series data set.
Unlike the arrangement of cross-sectional data, the chronological ordering of
observations in a time series conveys potentially important information.
A key feature of time series data that makes them more difficult to analyze than cross-
sectional data is that economic observations can rarely, if ever, be assumed to be
independent across time.
Most economic and other time series are related, often strongly related, to their recent
histories.
For example, knowing something about the gross domestic product from last quarter
tells us quite a bit about the likely range of the GDP during this quarter, because GDP
tends to remain fairly stable from one quarter to the next.
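The dependence of a time series on its recent history can be illustrated with a simulated series. The sketch below is only an illustration under stated assumptions: it uses a first-order autoregressive process with a persistence parameter of 0.9, which is a hypothetical choice and not a model of actual GDP.

```python
import random


def simulate_ar1(n, rho, sigma, seed):
    """Simulate y_t = rho * y_{t-1} + e_t with e_t ~ N(0, sigma^2), y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(rho * y[-1] + rng.gauss(0.0, sigma))
    return y


def lag1_correlation(series):
    """Sample correlation between y_t and y_{t-1}."""
    prev, curr = series[:-1], series[1:]
    mean_prev = sum(prev) / len(prev)
    mean_curr = sum(curr) / len(curr)
    cov = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    var_prev = sum((a - mean_prev) ** 2 for a in prev)
    var_curr = sum((b - mean_curr) ** 2 for b in curr)
    return cov / (var_prev * var_curr) ** 0.5


# A persistent series (rho = 0.9) versus one with no memory (rho = 0).
persistent = simulate_ar1(2000, 0.9, 1.0, seed=1)
independent = simulate_ar1(2000, 0.0, 1.0, seed=1)

r_persistent = lag1_correlation(persistent)
r_independent = lag1_correlation(independent)
print("lag-1 correlation, persistent series:", round(r_persistent, 3))
print("lag-1 correlation, independent series:", round(r_independent, 3))
```

The persistent series shows a lag-1 correlation close to its persistence parameter, while the series with no memory shows a correlation close to zero, which is the independence that cross-sectional methods usually take for granted.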
A panel data (or longitudinal data) set consists of a time series for
each cross-sectional member in the data set.
As an example, suppose we have wage, education, and
employment history for a set of individuals followed over a ten-
year period.
Or we might collect information, such as investment and
financial data, about the same set of firms over a five-year time
period.
Panel data can also be collected on geographical units.
For example, we can collect data for the same set of regions in
Ethiopia on immigration flows, tax rates, wage rates, government
expenditures, and so on, for the years 1990, 2000, and 2008.
some specific value only with some probability. Let’s illustrate the
distinction between stochastic and non stochastic relationships with the help
of a supply function.
several factors.
a. Omission of variables from the function
b. Random behaviour of human beings
c. Imperfect specification of the mathematical form of the model
d. Error of aggregation
e. Error of measurement
and X.
The line represents the exact part of the relationship and the deviation
Yi =α+βXi+Ui
them since their value is not known but you are given with the data of the
dependent and independent variables.
This means that the value which u may assume in any one Period depends on
instance.
2. The mean value of the random variable (u) in any particular period is zero.
This means that for each value of X, the random variable (u) may assume various
values, some greater than zero and some smaller than zero, but if we considered
all the positive and negative values of u, for any given value of X, they would
have an average value equal to zero. In other words, the positive and negative
values of u cancel each other.
Mathematically, E(Ui)=0
mean.
In the following figure this assumption is shown by the fact that the values that
u can assume lie within the same limits, irrespective of the value of X.
For X1, u can assume any value within the range AB; for X2, u can
assume any value within the range CD, which is equal to AB, and so on.
Graphically:
[Figure: homoscedastic variance. Vertical axis Y, horizontal axis X, regression line E(Y) = α + βX.]
Mathematically:
Var(Ui) = E[Ui − E(Ui)]² = E(Ui²) = σ² (since E(Ui) = 0).
This constant variance is called the homoscedasticity assumption, and the constant variance itself is called homoscedastic variance.
This means that the values of the random variable u (for each X) have a bell-shaped distribution:
Ui ∼ N(0, σ²) (2.4)
values are the same in all samples, but the ui values differ from sample to sample.
This means that there is no correlation between the random variable and the explanatory variable.
Cov(Xi, Ui) = E(XiUi) − E(Xi)E(Ui)
= E(XiUi), since E(Ui) = 0
= 0
U absorbs the influence of omitted variables and possible errors of measurement in the Y’s, i.e. we
assume that the regressors are error free, while the Y values may or may not include errors of
measurement.
We can now use the above assumptions to derive the following basic concepts.
A. The dependent variable Yi is normally distributed,
i.e. Yi ~ N(α + βXi, σ²) (2.7)
Proof
Mean: E(Yi) = E(α + βXi + Ui)
• = α + βXi + E(Ui)
• = α + βXi, since E(Ui) = 0
Variance
• Var(Yi) = E[Yi − E(Yi)]²
• = E[α + βXi + Ui − (α + βXi)]²
• = E(Ui²)
• Var(Yi) = σ² (2.8)
The shape of the distribution of Yi is determined by the shape of the distribution of
Ui, which is normal by assumption 4.
Since α and β are constants, they do not affect the distribution of Yi.
Furthermore, the values of the explanatory variable, Xi, are a set of fixed values by
assumption 5 and therefore do not affect the shape of the distribution of Yi.
∴ Yi ~ N(α + βXi, σ²)
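The result that Yi has mean α + βXi and variance σ² at a fixed Xi can be checked numerically. The sketch below is a simulation under assumed values (α = 2, β = 0.5, σ = 1, Xi = 10 are hypothetical and chosen only for illustration); it is a rough check, not a proof.

```python
import random
import statistics

# Hypothetical "true" parameters and a fixed value of the regressor.
ALPHA, BETA, SIGMA = 2.0, 0.5, 1.0
X_FIXED = 10.0

rng = random.Random(42)  # fixed seed for reproducibility

# Draw many values of Yi = alpha + beta*Xi + Ui with Ui ~ N(0, sigma^2).
y_draws = [ALPHA + BETA * X_FIXED + rng.gauss(0.0, SIGMA) for _ in range(20000)]

mean_y = statistics.fmean(y_draws)      # should be close to alpha + beta*Xi
var_y = statistics.pvariance(y_draws)   # should be close to sigma^2

print("theoretical mean:", ALPHA + BETA * X_FIXED, "simulated:", round(mean_y, 3))
print("theoretical variance:", SIGMA ** 2, "simulated:", round(var_y, 3))
```

The simulated mean and variance settle near α + βXi and σ², which is what equations (2.7) and (2.8) state.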
B. Successive values of the dependent variable are
independent,
i.e. Cov(Yi, Yj) = 0 for i ≠ j.
Proof:
Cov(Yi, Yj) = E{[Yi − E(Yi)][Yj − E(Yj)]} (2.6)
= E{[α + βXi + Ui − (α + βXi)][α + βXj + Uj − (α + βXj)]}
= E(UiUj) = 0, from Equation 2.5
Therefore, Cov(Yi, Yj) = 0.
Specifying the model and stating its underlying assumptions are the first stage of any
econometric application.
The next step is the estimation of the numerical values of the parameters of economic
relationships.
The parameters of the simple linear regression model can be estimated by various methods.
The model Yi = α + βXi + Ui is called the true relationship between Y and X because Y
and X represent their respective population values, and α and β are called the true
parameters since they are estimated from the population values of Y and X.
But it is difficult to obtain the population values of Y and X because of technical or
economic reasons.
So we are forced to take sample values of Y and X. The parameters estimated
from the sample values of Y and X are called the estimators of the true parameters α
and β and are symbolized as α̂ and β̂.
The model Yi = α̂ + β̂Xi + ei is called the estimated relationship between Y and X since α̂
and β̂ are estimated from the sample of Y and X, and ei represents the sample residual.
Estimation of α and β by the ordinary least squares (OLS) method, also called classical least squares (CLS), involves
finding values for the estimates α̂ and β̂ which minimize the sum of squared
residuals (Σei²).
• ei = Yi − α̂ − β̂Xi (2.6)
To find the values of α̂ and β̂ that minimize this sum, we partially differentiate Σei²
with respect to α̂ and β̂ and set the partial derivatives equal to zero.
Errors are given by \(e_i = y_i - a - \beta x_i\). Some of them are positive and some others are negative. Since the mean of these errors is zero, \(E(e_i) = 0\), it is customary to take the sum of squared errors and estimate the unknown parameters \(a\) and \(\beta\) that minimise the sum of squared errors.

\[ S = \sum_i e_i^2 = \sum_i (y_i - a - \beta x_i)^2 \qquad (1) \]

Each squared error is non-negative, \(e_i^2 \ge 0\). When \(e_i \sim N(0, \sigma^2)\), the scaled sum \(S/\sigma^2 = \sum_i e_i^2 / \sigma^2\) follows a \(\chi^2_n\) distribution, where the subscript \(n\) stands for the degrees of freedom, which equals the number of terms in \(S\).
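The first-order conditions are

\[ \frac{\partial S}{\partial a} = -2\sum_i (y_i - a - \beta x_i) = 0 \qquad (2) \]

and

\[ \frac{\partial S}{\partial \beta} = -2\sum_i (y_i - a - \beta x_i)\,x_i = 0 \qquad (3) \]

Rearranging (2) and multiplying it by \(\sum_i x_i\), and rearranging (3) and multiplying it by \(N\), gives

\[ \sum_i x_i \sum_i y_i = N a \sum_i x_i + \beta \Big(\sum_i x_i\Big)^2 \qquad (4) \]

\[ N \sum_i x_i y_i = a N \sum_i x_i + \beta N \sum_i x_i^2 \qquad (5) \]

Now subtracting (4) from (5) we get the estimator for \(\beta\).

\[ \hat{\beta} = \frac{N \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{N \sum_i x_i^2 - \big(\sum_i x_i\big)^2} \qquad (6) \]

The estimator for \(a\) can be found by dividing both sides of (2) by \(N\) and using the average values \(\bar{x}\) and \(\bar{y}\).

\[ \hat{a} = \bar{y} - \hat{\beta}\,\bar{x} \qquad (7) \]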
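The slope and intercept estimators in (6) and (7) translate directly into code. The following is a minimal sketch; the function name `ols_simple` and the four-point data set are hypothetical and used only to check the formulas on a case where the answer is known.

```python
def ols_simple(x, y):
    """Return (a_hat, beta_hat) for y = a + beta*x using formulas (6) and (7)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; slope is not identified")

    beta_hat = (n * sum_xy - sum_x * sum_y) / denominator   # equation (6)
    a_hat = sum_y / n - beta_hat * sum_x / n                 # equation (7)
    return a_hat, beta_hat


# Points lying exactly on y = 1 + 2x, so the estimates should recover 1 and 2.
a_hat, beta_hat = ols_simple([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print("intercept:", a_hat, "slope:", beta_hat)
```

Because the sample points lie exactly on a line, the residuals are all zero and the estimates coincide with the line's intercept and slope.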
An Example of OLS Estimation
Food expenditure and income: data and prediction
Y     X     XY     X2     Y2     Ypred       Sqpredy     prede       sqprede
4     5     20     25     16     2.866285    8.21559     1.133715    1.28531
6     8     48     64     36     5.742472    32.97598    0.257528    0.066321
7     10    70     100    49     7.65993     58.67453    -0.65993    0.435508
8     12    96     144    64     9.577388    91.72636    -1.57739    2.488153
11    14    154    196    121    11.49485    132.1315    -0.49485    0.244873
15    17    255    289    225    14.37103    206.5266    0.628967    0.395599
18    20    360    400    324    17.24722    297.4666    0.75278     0.566678
22    25    550    625    484    22.04087    485.7997    -0.04087    0.00167
Column sums (Sumy, Sumx, Sumxy, sumxsq, sumysq, Ypred, Smsqpredy, prede, smsqprede):
91    111   1553   1843   1319   127.4218    1313.517    -3.9E-05    5.484111
Prediction at X = 40: Ypred = 36.4218 (the Ypred column total of 127.4218 includes this value).
Estimates
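\[ \hat{\beta} = \frac{N \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{N \sum_i x_i^2 - \big(\sum_i x_i\big)^2} = \frac{8(1553) - (111)(91)}{8(1843) - (111)^2} = \frac{2323}{2423} \approx 0.9587 \qquad (8) \]

\[ \hat{a} = \bar{y} - \hat{\beta}\,\bar{x} = \frac{91}{8} - 0.95872\left(\frac{111}{8}\right) = 11.375 - 0.95872(13.875) = 11.375 - 13.30224 = -1.927 \qquad (9) \]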
Both the slope and the intercept make economic sense. In this sample, expenditure
on food is determined by the weekly income of an individual: people spend about
95.9 percent of each additional unit of weekly income on food. The intercept of −1.93
is read as people who do not have any income receiving an income subsidy of 1.93 per week.
Mean prediction
We can use equation (10) to find the predicted values \(\hat{y}_i\) for each
observation on \(x_i\). These are reported as Ypred in the above table. If the
weekly income is 40, predicted food expenditure will be 36.422. Error terms
are also estimated using the fact that

\[ \hat{e}_i = y_i - \hat{y}_i = y_i - \hat{a} - \hat{\beta} x_i = y_i + 1.92724 - 0.95873\,x_i \]

These predicted errors are reported as prede in the above table. Note that, as
expected, some of the errors are negative and some others are positive.
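The estimates, predictions and residuals for the food expenditure example can be reproduced from the Y and X columns of the table. The sketch below uses only those two columns; the variable names are hypothetical. Small differences from the tabulated values in the last decimal places can arise because the table appears to use rounded coefficients.

```python
# Food expenditure (Y) and weekly income (X) from the table above.
Y = [4, 6, 7, 8, 11, 15, 18, 22]
X = [5, 8, 10, 12, 14, 17, 20, 25]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

# OLS slope and intercept from the closed-form formulas.
beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a_hat = sum_y / n - beta_hat * sum_x / n

# In-sample predictions, residuals and the residual sum of squares.
y_pred = [a_hat + beta_hat * x for x in X]
residuals = [y - yp for y, yp in zip(Y, y_pred)]
ssr = sum(e * e for e in residuals)

# Out-of-sample prediction for a weekly income of 40.
pred_at_40 = a_hat + beta_hat * 40

print("slope:", round(beta_hat, 5), "intercept:", round(a_hat, 5))
print("residual sum of squares:", round(ssr, 4))
print("prediction at X = 40:", round(pred_at_40, 4))
```

The residuals sum to zero up to floating-point error, which is a property of OLS with an intercept and matches the near-zero total of the prede column.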
2.2.2.2 Estimation of a function with zero intercept
ΣXi(Yi − β̂Xi) = 0
ΣXiYi − β̂ΣXi² = 0
β̂ = ΣXiYi / ΣXi²
This formula involves the actual values (observations) of the variables and not
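The zero-intercept estimator β̂ = ΣXiYi / ΣXi² can be sketched as follows. The function name `ols_through_origin` is hypothetical, and applying it to the food expenditure data from the earlier example is an illustration only, not something the course material does.

```python
def ols_through_origin(x, y):
    """Slope of the regression line forced through the origin: sum(xy) / sum(x^2)."""
    sum_x2 = sum(xi * xi for xi in x)
    if sum_x2 == 0:
        raise ValueError("all x values are zero; slope is not identified")
    return sum(xi * yi for xi, yi in zip(x, y)) / sum_x2


# Actual observations (not deviations from the means).
X = [5, 8, 10, 12, 14, 17, 20, 25]
Y = [4, 6, 7, 8, 11, 15, 18, 22]

beta_zero_intercept = ols_through_origin(X, Y)

# The normal equation sum(X * (Y - beta*X)) = 0 should hold at the estimate.
normal_equation = sum(x * (y - beta_zero_intercept * x) for x, y in zip(X, Y))

print("slope with zero intercept:", round(beta_zero_intercept, 5))
print("normal equation residual:", normal_equation)
```

Forcing the line through the origin changes the slope relative to the model with an intercept, so the two specifications should not be compared on the slope alone.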
There are various econometric methods with which we may obtain the estimates of
the parameters of economic relationships.
We would like the estimated parameters to be as close as possible to the values of
the true population parameters, i.e. to vary within only a small range around the true
parameter.
How are we to choose among the different econometric methods, the one that gives
‘good’ estimates? We need some criteria for judging the ‘goodness’ of an estimate.
‘Closeness’ of the estimate to the population parameter is measured by the mean and
variance or standard deviation of the sampling distribution of the estimates of the
different econometric methods.
We assume the usual process of repeated sampling, i.e. we assume that we get a very
large number of samples, each of size n; we compute the estimates β̂ from each
sample and for each econometric method, and we form their distribution.
We next compare the mean (expected value) and the variances of these distributions
and we choose among the alternative estimates the one whose distribution is
concentrated as close as possible around the population parameter.
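The repeated-sampling comparison can be sketched as a small Monte Carlo experiment. Everything specific here is an assumption made for illustration: the true parameters, the fixed X values, the normal disturbances, and the choice of a "two-point" slope (the line through the first and last observations) as the competing method.

```python
import random
import statistics

# Hypothetical true model Y = ALPHA + BETA*X + u, with u ~ N(0, SIGMA^2).
ALPHA, BETA, SIGMA = 1.0, 2.0, 1.0
X = [float(v) for v in range(1, 11)]  # fixed regressor values 1..10
X_BAR = sum(X) / len(X)
SXX = sum((x - X_BAR) ** 2 for x in X)


def ols_slope(y):
    """OLS slope in deviation form."""
    y_bar = sum(y) / len(y)
    return sum((x - X_BAR) * (yi - y_bar) for x, yi in zip(X, y)) / SXX


def two_point_slope(y):
    """Alternative estimator: slope of the line through first and last points."""
    return (y[-1] - y[0]) / (X[-1] - X[0])


rng = random.Random(123)
ols_draws, two_point_draws = [], []
for _ in range(5000):  # 5000 repeated samples of size 10
    y = [ALPHA + BETA * x + rng.gauss(0.0, SIGMA) for x in X]
    ols_draws.append(ols_slope(y))
    two_point_draws.append(two_point_slope(y))

mean_ols = statistics.fmean(ols_draws)
mean_two_point = statistics.fmean(two_point_draws)
var_ols = statistics.pvariance(ols_draws)
var_two_point = statistics.pvariance(two_point_draws)

print("mean of OLS slopes:", round(mean_ols, 4))
print("mean of two-point slopes:", round(mean_two_point, 4))
print("variance of OLS slopes:", round(var_ols, 5))
print("variance of two-point slopes:", round(var_two_point, 5))
```

Both estimators are centred on the true slope, but the OLS estimates are more tightly concentrated around it, which is the criterion described above for preferring one method over another.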
PROPERTIES OF OLS ESTIMATORS
The ideal or optimum properties that the OLS estimates possess may
be summarized by the Gauss-Markov theorem.
According to this theorem, under the basic assumptions of the classical
linear regression model, the least squares estimators are linear, unbiased and
have minimum variance (i.e. they are the best of all linear unbiased estimators).
Sometimes the theorem is referred to as the BLUE theorem, i.e. Best Linear
Unbiased Estimator.
1998 65 5
1999 80 4.5
2000 83 4.5
2001 76 5.5
2002 77 6
2003 71 6
2004 75 6
2005 74 5
2006 69 7
2007 78 6.5
Then,
Based upon the economic a priori criterion:
1.Specify the simple linear regression model
With zero intercept ?
With non zero intercept?
2. Estimate the values of the parameters ?
By what amount will investment demand increase if interest declines
by 1%?
What will be the level of investment if rate of interest is 10%?
3. Calculate the sample error sum of squares?
4. Estimate the variance of the intercept?
Determination?
8. Interpret the values of R2?
9. Test whether the estimated values of the parameters are
significant or not using
Standard Error test?
Student-test? Given LS=5%=2.10 for two tail and 3.63 for one tail.
Confidence Interval? At ls=5%=1.96 for two tail test.