Lecture 1

The document outlines the course structure for Advanced Econometrics - I at Samara University, focusing on key topics such as linear regression, endogeneity, and discrete variable models. It details the stages of econometric research, including model specification, estimation, evaluation, and forecasting. Additionally, it provides references for further reading on econometric analysis and methods.

Uploaded by

Hussen Mahammed

Samara University

College of Business and Economics


Department of Economics

Advanced Econometrics - I (DEC 531)

For Development Economics Extension Students

By: Getachew W.
April, 2023
Getachew W. SU 2023 1
Course outlines

❖ Chapter One: Brief review of linear regression


➢Introduction
➢Simple linear regression
➢Ordinary least squares estimation: assumptions
➢Multiple linear regression
➢Hypothesis testing
➢Applications using statistical software

Outlines…

❖Chapter Two: Endogeneity and Simultaneous Equation Models


➢The causes of the problem of endogeneity
➢Simultaneous equation models & identification issues
➢Instrumental variables
➢Two stage least squares
➢Generalized method of moments
➢Maximum likelihood estimation
➢Applications using statistical software

Outlines…

❖Chapter Three: Discrete and Limited Dependent Variable models


➢Binary Outcome Models
✓ Linear probability model (LPM)
✓ Logit model
✓ Probit model
➢Multi Response Models
✓ Ordered Response Models
✓ Multinomial Response Models
➢Limited Dependent Variable models
✓ Tobit model
✓ Multivariate Models
➢Applications using statistical software

References

❖Greene, W. (2003). Econometric Analysis (5th Ed.). Pearson Education, Inc., Upper Saddle River, New Jersey.

❖Verbeek, M. (2004). A Guide to Modern Econometrics (2nd Ed.). John Wiley and Sons Ltd.

❖Wooldridge, J.M. (2002). Econometric Analysis of Cross Section and Panel Data. The MIT Press.

❖Hayashi, F. (2000). Econometrics. Princeton University Press.

❖Gujarati, D. (1993). Basic Econometrics (3rd Ed.). McGraw Hill.

Chapter One: Brief review of linear regression

1.1. Introduction
❖ In earlier courses we have met many economic theories that suggest relationships among economic variables.
❖ For instance, in microeconomics we learn that the quantity demanded and the quantity supplied of a commodity depend on its price. In macroeconomics, we learn that a person's consumption depends on disposable income, and that the amount of investment in the economy depends on the rate of interest.
➢ Once such a relationship is determined, how can it be used to make forecasts?
➢ Moreover, what is the effect of a change in one economic variable on another?
❖ The discipline of econometrics addresses these questions and broader issues as well.

Introduction…

❖ Starting with the postulated theoretical relationships among economic variables, econometric research generally proceeds through the following stages:
1. Specification of the model,
2. Estimation of the model,
3. Evaluation of the estimates, and
4. Evaluation of the forecasting power of the estimated model.

1. Specification of the model: The specification of the econometric model is based on economic theory and on any available information related to the phenomenon under investigation.
❖ In this step the econometrician expresses the relationships between economic variables in mathematical form, choosing the dependent and independent (explanatory) variables to be included in the model. For example: $Q_d = \beta_0 - \beta_1 P + u$

Introduction…

2. Estimation of the model: This is a purely technical stage which requires knowledge of the various econometric methods, their assumptions, and their economic implications for the estimates of the parameters. For example: $Q_d = 10 - 0.2P + u$

❖ This stage runs from gathering data on the variables included in the model to choosing the appropriate econometric technique for estimation, i.e. deciding which specific econometric method to apply, such as the multiple linear regression model (MLRM), logit, or probit.

3. Evaluation of the estimates: This stage consists of deciding whether the estimates of the parameters are theoretically meaningful and statistically satisfactory.

Introduction…
❖For this purpose we use various criteria which may be classified into three groups:

i. Economic a priori criteria: These criteria are determined by economic theory and refer to the size
and sign of the parameters of economic relationships.

ii. Statistical criteria (first-order tests): These are determined by statistical theory and aim at evaluating the statistical reliability of the parameter estimates, using the correlation coefficient, the standard error test, the t-test, the F-test, and $R^2$.

iii. Econometric criteria (second-order tests): These are set by the theory of econometrics and aim at
the investigation of whether the assumptions of the econometric method employed are satisfied or
not in any particular case. Econometric criteria aim at the detection of the violation or validity of the
assumptions of the various econometric techniques.
Introduction…

4. Evaluation of the forecasting power of the model: Forecasting is one of the aims of econometric research.

❖ The model may be economically meaningful and statistically and econometrically correct for the sample period, yet it may not be suitable for forecasting for various reasons.

❖ Therefore, this stage involves investigating the stability of the estimates and their sensitivity to changes in the size of the sample.

Introduction…

❖ In general, there are three main goals of econometrics:

i) Analysis, i.e. testing economic theory.

ii) Policy making, i.e. obtaining numerical estimates of the coefficients of economic relationships for policy simulations.

iii) Forecasting, i.e. using the numerical estimates of the coefficients to forecast the future values of economic magnitudes.

1.2. Simple linear regression

❖ Much of applied econometric analysis begins with the premise that y and x are two variables, representing some population, and we are interested in ‘explaining y in terms of x’ or in ‘studying how y varies with changes in x’.

❖ The variables y and x have several different names that are used interchangeably, as follows.

❖ y is called the dependent variable, the explained variable, the response variable, the predicted variable, or the regressand.

❖ x is called the independent variable, the explanatory variable, the control variable, the predictor variable, or the regressor (covariate).

SLRM…
❖ A relationship between x and y, characterized as y = f(x) is said to be deterministic or non-
stochastic, if for each value of the independent variable (x) there is only one corresponding value
of dependent variable (y).

❖ On the other hand, a relationship between x and y is said to be stochastic if, for a particular value of x, there is a whole probability distribution of values of y.

$$y = \beta_0 + \beta_1 x \qquad (1)$$

❖ In equation 1 above, there is only one corresponding value of y for each value of x, so it is a deterministic (non-stochastic) relationship between x and y.

❖ This implies that all the variation in y is due solely to changes in x, and that there are no other factors affecting the dependent variable.

SLRM…

❖ If this were true, all the points of (x, y) pairs plotted on a two-dimensional plane would fall on a straight line.

❖ However, if we gather observations on both x and y and plot them on a diagram, we see that they do not fall on a straight line.

[Figure: scatter plot of the observations against X]

SLRM…

❖The deviation of the observation from the line may be attributed to several factors.

a. Omission of variables from the function

b. Random behavior of human beings

c. Imperfect specification of the mathematical form of the model

d. Error of aggregation

e. Error of measurement

SLRM…
❖ However, many factors other than x may affect y. In econometrics the influence of these ‘other’ factors is taken into account by introducing a random variable into the economic relationship, which gives it a stochastic form.
❖ To take into account the above sources of error, we introduce a random variable into econometric functions, usually denoted by the letter ‘u’ or ‘ε’.
$$y = \beta_0 + \beta_1 x + u \qquad (2)$$
❖ The variable u, called the error term or disturbance in the relationship, represents factors other than x that affect y. It is called a disturbance because it is supposed to ‘disturb’ the exact linear relationship which is assumed to exist between x and y.

SLRM…

❖ Equation 2, which is assumed to hold in the population of interest, defines the simple
linear regression model. It is also called the two-variable linear regression model or
bivariate linear regression model because it relates the two variables x and y.

$$y = \beta_0 + \beta_1 x + u$$

➢ y: the dependent variable.
➢ $\beta_0$: the intercept, or constant term.
➢ $\beta_1$: the slope parameter.
➢ $\beta_0 + \beta_1 x$: the regression line, which shows the part of y explained by the changes in x.
➢ u: the random variable that shows the part of y not explained by x, that is to say the change in y that is due to the random influence of u.

1.3. Ordinary least squares estimation: assumptions

❖ Once we specify the model, the next task is to estimate the parameters (the β's). There are three common methods for estimating the parameters of the simple linear regression model.
1. Ordinary least squares (OLS): finds the estimates that minimize the sum of the squared residuals, where a residual is the gap between the actual value and the predicted value of the dependent variable.
2. Maximum likelihood method (MLM): chooses the estimates that maximize the likelihood function, i.e. that make the probability of observing the given Y's as high as possible.
3. Method of moments (MM): chooses the estimates so that sample moments match the corresponding population moment conditions; under the classical assumptions it gives the same estimates as OLS.

❖ Here we will deal only with the OLS estimation procedure.

Assumptions …

1. The model is linear in parameters: the classical approach assumes that the model is linear in the parameters, regardless of whether it is linear in the explanatory and dependent variables.

❖ Example:

a) $y = \beta_0 + \beta_1 x + u$ is linear in both the parameters and the variables, so it satisfies the assumption.

b) $\ln y = \beta_0 + \beta_1 \ln x + u$ is linear only in the parameters, so it satisfies the assumption.

c) $y = \beta_0 + \beta_1^2 x + u$ is not linear in the parameters, so it does not satisfy the assumption.

Assumptions …

2. $U_i$ is a real random variable: This means that the value which u may assume in any one period depends on chance; it may be positive, negative or zero. Every value has a certain probability of being assumed by u in any particular instance.

3. The mean value of the random variable (U) in any particular period is zero: This means that for each value of x, the random variable u may assume various values, some greater than zero and some smaller than zero, but if we consider all the possible positive and negative values of u for any given value of x, they have an average value equal to zero.

In other words, the positive and negative values of u cancel each other out.

Mathematically, $E(U_i) = 0$

Assumptions …

4. The assumption of homoscedasticity: The variance of the random variable (U) is constant in each period. That is, for all values of X, the u's show the same dispersion around their mean.

Mathematically, $Var(U_i) = E[U_i - E(U_i)]^2 = \delta^2$

5. The random variable (U) has a normal distribution: This means the values of u (for each x) have a bell-shaped, symmetrical distribution about their zero mean, with constant variance $\delta^2$. That is,
$$U_i \sim N(0, \delta^2)$$

Assumptions …

6. No autocorrelation: The random terms of different observations, $U_i$ and $U_j$ ($i \neq j$), are independent. This means the value which the random term assumes in one period does not depend on the value which it assumed in any other period.

❖ Algebraically, $Cov(U_i, U_j) = E\{[U_i - E(U_i)][U_j - E(U_j)]\} = 0$ for $i \neq j$

7. The random variable (U) is independent of the explanatory variables: This means there is no correlation between the random variable and the explanatory variable. If two variables are unrelated, their covariance is zero.

Mathematically, $Cov(U_i, X_i) = 0$

OLS estimations…

❖ The model $Y = \beta_0 + \beta_1 X + U$ is called the true relationship between Y and X, because Y and X represent their respective population values, and $\beta_0$ and $\beta_1$ are called the true parameters since they describe the relationship in the population. But it is difficult to obtain the population values of Y and X for technical or economic reasons.
❖ So we are forced to take sample values of Y and X. The parameters estimated from the sample values of Y and X are called the estimators of the true parameters $\beta_0$ and $\beta_1$, and are symbolized as $\hat{\beta}_0$ and $\hat{\beta}_1$.

❖ The model $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + e_i$ is called the estimated relationship between Y and X, since $\hat{\beta}_0$ and $\hat{\beta}_1$ are estimated from the sample of Y and X, and $e_i$ represents the sample counterpart of the population random disturbance $U_i$.

OLS estimations…
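❖ Estimation of $\beta_0$ and $\beta_1$ by the ordinary least squares (OLS) method, also called classical least squares (CLS), involves finding values for the estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize the sum of the squared residuals ($\sum e_i^2$).

From the estimated relationship $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + e_i$, we obtain
$$e_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$$
$$\sum e_i^2 = \sum (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2$$

❖ To find the values of $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize this sum, we partially differentiate $\sum e_i^2$ with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$ and set the partial derivatives equal to zero:
$$\frac{\partial \sum e_i^2}{\partial \hat{\beta}_0} = -2\sum (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$$
$$\frac{\partial \sum e_i^2}{\partial \hat{\beta}_1} = -2\sum X_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$$

❖ Rearranging these expressions, we get the normal equations
$$\sum Y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum X_i \qquad (3)$$
$$\sum Y_i X_i = \hat{\beta}_0 \sum X_i + \hat{\beta}_1 \sum X_i^2 \qquad (4)$$
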
OLS estimations…
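❖ Equations 3 and 4 can be written in matrix notation as

$$\begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix} \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum Y_i X_i \end{bmatrix}$$

Solving with determinants:

$$\hat{\beta}_0 = \frac{\begin{vmatrix} \sum Y_i & \sum X_i \\ \sum Y_i X_i & \sum X_i^2 \end{vmatrix}}{\begin{vmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{vmatrix}}, \qquad \hat{\beta}_1 = \frac{\begin{vmatrix} n & \sum Y_i \\ \sum X_i & \sum Y_i X_i \end{vmatrix}}{\begin{vmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{vmatrix}}$$

Finally,
$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \qquad (5)$$
$$\hat{\beta}_1 = \frac{\sum Y_i X_i - n\bar{Y}\bar{X}}{\sum X_i^2 - n\bar{X}^2} \qquad (6)$$
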
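The closed-form OLS formulas for the slope and intercept can be tried out on a small data set. The following is a minimal sketch in plain Python; the function name `ols_simple` and the five data points are made up for illustration and do not come from the lecture.

```python
def ols_simple(x, y):
    """Return (b0, b1), the OLS intercept and slope for y = b0 + b1*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    # Slope: (sum of X*Y - n * mean(Y) * mean(X)) / (sum of X^2 - n * mean(X)^2)
    b1 = (sum_xy - n * y_bar * x_bar) / (sum_xx - n * x_bar ** 2)
    # Intercept: mean(Y) - slope * mean(X)
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = ols_simple(x, y)
print("intercept:", round(b0, 4), "slope:", round(b1, 4))

# The estimates should satisfy both normal equations (up to rounding error).
n = len(x)
residual_eq3 = sum(y) - (n * b0 + b1 * sum(x))
residual_eq4 = sum(xi * yi for xi, yi in zip(x, y)) - (
    b0 * sum(x) + b1 * sum(xi * xi for xi in x)
)
```

For these numbers the sample means are 3 and 4, so the slope works out to (66 − 60)/(55 − 45) = 0.6 and the intercept to 4 − 0.6 × 3 = 2.2.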
1.4. Properties of OLS estimators
❖ There are various econometric methods with which we may obtain estimates of the parameters of economic relationships. We would like an estimate to be as close as possible to the value of the true population parameter, i.e. to vary within only a small range around the true parameter.

❖ How are we to choose among the different econometric methods, the one that gives
‘good’ estimates? We need some criteria for judging the ‘goodness’ of an estimate.

❖‘Closeness’ of the estimate to the population parameter is measured by the mean and
variance or standard deviation of the sampling distribution of the estimates of the
different econometric methods.

Properties…

❖ The ideal or optimum properties that the OLS estimates possess may be summarized by a well-known theorem, the Gauss-Markov Theorem.

❖ Statement of the theorem: “Given the assumptions of the classical linear regression model, the OLS estimators, in the class of linear and unbiased estimators, have the minimum variance, i.e. the OLS estimators are BLUE (best linear unbiased estimators).”
a. Linear: a linear function of a random variable, such as the dependent variable Y.
b. Unbiased: its average or expected value is equal to the true population parameter.
c. Minimum variance: it has the minimum variance in the class of linear and unbiased estimators. An unbiased estimator with the least variance is known as an efficient estimator.

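Properties…
a. Linearity (for $\hat{\beta}_1$)
❖ Proposition: $\hat{\beta}_0$ and $\hat{\beta}_1$ are linear in Y.

❖ Proof: Write $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ for deviations from the sample means. Since $\sum x_i y_i = \sum X_i Y_i - n\bar{X}\bar{Y}$ and $\sum x_i^2 = \sum X_i^2 - n\bar{X}^2$, equation 6 gives the OLS estimator of $\beta_1$ as:

$$\hat{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{\sum x_i (Y_i - \bar{Y})}{\sum x_i^2} = \frac{\sum x_i Y_i}{\sum x_i^2}$$

where the last step uses $\sum x_i = 0$.

Let $K_i = \frac{x_i}{\sum x_i^2}$; then $\hat{\beta}_1 = \sum K_i Y_i$, that is, $\hat{\beta}_1 = K_1 Y_1 + K_2 Y_2 + K_3 Y_3 + \dots + K_n Y_n$. Therefore, $\hat{\beta}_1$ is linear in Y.

❖ Exercise: show that $\hat{\beta}_0$ is linear in Y. Hint: $\hat{\beta}_0 = \sum \left(\frac{1}{n} - \bar{X} K_i\right) Y_i$
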
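Properties…
b. Unbiasedness: Proposition: $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased estimators of the true parameters $\beta_0$ and $\beta_1$. That is, $E(\hat{\beta}_0) = \beta_0$ and $E(\hat{\beta}_1) = \beta_1$.

Proof:
From the proof of the linearity property we have $\hat{\beta}_1 = \sum K_i Y_i$, where the model is $Y_i = \beta_0 + \beta_1 X_i + U_i$. This implies
$$\hat{\beta}_1 = \sum K_i (\beta_0 + \beta_1 X_i + U_i) = \beta_0 \sum K_i + \beta_1 \sum K_i X_i + \sum K_i U_i$$
But $\sum K_i = 0$ and $\sum K_i X_i = 1$, so
$$\hat{\beta}_1 = \beta_1 + \sum K_i U_i, \quad \text{that is,} \quad \hat{\beta}_1 - \beta_1 = \sum K_i U_i$$
Taking expectations,
$$E(\hat{\beta}_1) = E(\beta_1) + \sum K_i E(U_i)$$
Since $E(U_i) = 0$, we get $E(\hat{\beta}_1) = \beta_1$. Therefore, $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$.

Exercise: prove that $E(\hat{\beta}_0) = \beta_0$. (Hint: $\hat{\beta}_0 = \sum \left(\frac{1}{n} - \bar{X} K_i\right) Y_i$)
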
Properties…
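c. Minimum variance of $\hat{\beta}_0$ and $\hat{\beta}_1$: $\hat{\beta}_0$ and $\hat{\beta}_1$ possess the smallest sampling variances. To show this, we first obtain the variances of $\hat{\beta}_0$ and $\hat{\beta}_1$ and then establish that each has the minimum variance in comparison with the variances of other linear and unbiased estimators obtained by any econometric method other than OLS.

$$var(\hat{\beta}_1) = E[\hat{\beta}_1 - E(\hat{\beta}_1)]^2 = E(\hat{\beta}_1 - \beta_1)^2 = E\left(\sum K_i U_i\right)^2$$
$$= E[K_1^2 U_1^2 + K_2^2 U_2^2 + \dots + K_n^2 U_n^2 + 2K_1 K_2 U_1 U_2 + \dots + 2K_{n-1} K_n U_{n-1} U_n]$$
$$= E\left(\sum K_i^2 U_i^2\right) + E\left(2\sum_{i<j} K_i K_j U_i U_j\right)$$
$$= \sum K_i^2 E(U_i^2) + 2\sum_{i<j} K_i K_j E(U_i U_j)$$

Since $E(U_i^2) = \delta^2$ (homoscedasticity) and $E(U_i U_j) = 0$ for $i \neq j$ (no autocorrelation),
$$var(\hat{\beta}_1) = \delta^2 \sum K_i^2 = \frac{\delta^2}{\sum x_i^2}$$

For the variance of $\hat{\beta}_0$:
$$var(\hat{\beta}_0) = E[\hat{\beta}_0 - E(\hat{\beta}_0)]^2 = E(\hat{\beta}_0 - \beta_0)^2 = E\left[\sum \left(\frac{1}{n} - \bar{X} K_i\right) U_i\right]^2$$
$$= \delta^2 \sum \left(\frac{1}{n} - \bar{X} K_i\right)^2 = \delta^2 \left(\frac{1}{n} + \frac{\bar{X}^2}{\sum x_i^2}\right)$$

Finally, we get $var(\hat{\beta}_0) = \delta^2 \frac{\sum X_i^2}{n \sum x_i^2}$.
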
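The unbiasedness result and the variance formula for the slope estimator can be checked numerically by repeated sampling. The following is a rough simulation sketch, not part of the lecture: the true parameter values, the regressor values, the number of replications and the seed are all assumptions chosen for illustration, and the disturbances are drawn as independent normal variables in line with the classical assumptions.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical "true" population values, chosen only for this illustration.
beta0, beta1, delta = 1.0, 2.0, 1.5

x = [float(i) for i in range(1, 21)]      # regressor values held fixed across samples
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)  # sum of squared deviations of x
k = [(xi - x_bar) / sxx for xi in x]      # the weights K_i from the linearity proof

reps = 5000
estimates = []
for _ in range(reps):
    # Draw a fresh sample of Y with independent N(0, delta^2) disturbances.
    y = [beta0 + beta1 * xi + random.gauss(0.0, delta) for xi in x]
    # Slope estimate written as a linear function of Y: sum of K_i * Y_i.
    estimates.append(sum(ki * yi for ki, yi in zip(k, y)))

mean_b1 = sum(estimates) / reps
var_b1 = sum((b - mean_b1) ** 2 for b in estimates) / (reps - 1)
theory_var = delta ** 2 / sxx             # delta^2 divided by the sum of squared deviations

print("average slope estimate:", round(mean_b1, 4))
print("simulated variance:", round(var_b1, 6), "theoretical variance:", round(theory_var, 6))
```

With enough replications the average estimate should sit close to the assumed true slope, and the simulated variance close to the theoretical one; the weights also satisfy the two identities used in the unbiasedness proof (they sum to zero, and their products with x sum to one).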
Properties…
❖ We have computed the variances of the OLS estimators. These variances are the smallest among the variances of all linear and unbiased estimators of the true $\beta_0$ and $\beta_1$.

❖ To establish that $\hat{\beta}_0$ and $\hat{\beta}_1$ possess the minimum variance property, we compare their variances with the variances of some alternative linear and unbiased estimators of $\beta_0$ and $\beta_1$, say $\beta_0^*$ and $\beta_1^*$.
❖ For the proof that any other linear and unbiased estimator of the true population parameter, obtained from any other econometric method, has a larger variance than the OLS estimator, see Gujarati.

1.5. Hypothesis testing
❖ After estimating the parameters and determining the least squares regression line, we need to know how ‘good’ the fit of this line is to the sample observations of Y and X; that is, we need to measure the dispersion of the observations around the regression line.

❖ This matters because the closer the observations are to the line, the better the goodness of fit, i.e. the better the explanation of the variations of Y by the changes in the explanatory variables.

Hypothesis…

❖ The two most commonly used tests in econometric analysis are:

1. The coefficient of determination ($R^2$, the square of the correlation coefficient). This test is used for judging the explanatory power of the independent variable(s).
2. The standard error tests of the estimators. This test is used for judging the statistical reliability of the estimates of the regression coefficients.

1. TESTS OF THE ‘GOODNESS OF FIT’ WITH $R^2$: $R^2$ shows the percentage of the total variation of the dependent variable that can be explained by the changes in the explanatory variable(s) included in the model.

❖ TSS = ESS + RSS, where TSS is the total sum of squares, ESS the explained sum of squares, and RSS the residual sum of squares, so that $R^2 = ESS/TSS = 1 - RSS/TSS$.

Hypothesis…

❖ The value of $R^2$ falls between zero and one, i.e. $0 \le R^2 \le 1$.

❖ If $R^2 = 0.9$, the regression line gives a good fit to the observed data, since this line explains 90% of the total variation of the Y values around their mean. The remaining 10% of the total variation in Y is unaccounted for by the regression line and is attributed to the factors included in the disturbance variable $U_i$.
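
The decomposition of the total variation can be illustrated with a short calculation. The sketch below is an assumption-laden example rather than lecture material: the function name `fit_and_decompose` and the data are hypothetical, and the slope is computed in deviation form.

```python
def fit_and_decompose(x, y):
    """Fit y = b0 + b1*x by OLS and return (tss, ess, rss, r2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope in deviation form: sum of x_i*y_i over sum of x_i^2 (deviations from means).
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total variation of Y
    ess = sum((fi - y_bar) ** 2 for fi in fitted)            # variation explained by the line
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained (residual) variation
    return tss, ess, rss, ess / tss


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

tss, ess, rss, r2 = fit_and_decompose(x, y)
print("TSS:", round(tss, 4), "ESS:", round(ess, 4), "RSS:", round(rss, 4), "R2:", round(r2, 4))
```

Here the total variation of 6 splits into an explained part of 3.6 and a residual part of 2.4, so the line accounts for 60% of the variation in y.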

Hypothesis…
2. TESTING THE SIGNIFICANCE OF OLS PARAMETERS
❖ The OLS estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ are obtained from a sample of observations on Y and X.

❖ Since sampling errors are inevitable in all estimates, it is necessary to apply tests of significance in order to measure the size of the error and determine the degree of confidence in the validity of these estimates.
❖ This can be done by using various tests. The most common ones are:
i) Standard error test ii) Student’s t-test iii) Confidence interval
❖ These testing procedures generally lead to the same conclusion. Let us now see these testing methods one by one.

Hypothesis…
i) Standard error test: This test helps us decide whether the estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ are significantly different from zero, i.e. whether the sample from which they have been estimated might have come from a population whose true parameters are zero ($\beta_0 = 0$ and/or $\beta_1 = 0$).

❖ Formally we test the null hypothesis $H_0: \beta_i = 0$ against the alternative hypothesis $H_1: \beta_i \neq 0$.

Steps: 1) Compute the standard error of the parameter estimates: $SE(\hat{\beta}_i) = \sqrt{var(\hat{\beta}_i)}$

2) Compare the standard errors with the numerical values of the corresponding estimates $\hat{\beta}_i$.

3) Decision: if $SE(\hat{\beta}_i) > \frac{1}{2}|\hat{\beta}_i|$, accept the null hypothesis and conclude that $\hat{\beta}_i$ is statistically insignificant. If $SE(\hat{\beta}_i) < \frac{1}{2}|\hat{\beta}_i|$, reject the null hypothesis and conclude that $\hat{\beta}_i$ is statistically significant.

Hypothesis…
❖ The acceptance or rejection of the null hypothesis has a definite economic meaning.

❖ Namely, acceptance of the null hypothesis $\beta_1 = 0$ (the slope parameter is zero) implies that the explanatory variable X does not in fact influence the dependent variable Y, since the test provides evidence that changes in X leave Y unaffected.

❖ In other words, acceptance of $H_0$ implies that there is no relationship between X and Y.

❖ Example: Suppose that from a sample of size n = 30, we estimate the following supply function (standard errors in parentheses):
$$Q_S = \underset{(1.7)}{120} + \underset{(0.025)}{0.6}\,P + e$$

❖ Test the significance of the slope parameter at the 5% level of significance using the standard error test.

Hypothesis…

$SE(\hat{\beta}_1) = 0.025$, $\hat{\beta}_1 = 0.6$, $\frac{1}{2}\hat{\beta}_1 = 0.3$

❖ Since $SE(\hat{\beta}_1) < \frac{1}{2}\hat{\beta}_1$, i.e. 0.025 < 0.3, we reject the null hypothesis.

❖ Therefore price is statistically significant at the 5% level of significance and has a significant influence on the quantity supplied of the product.

Hypothesis…
ii) Student’s t-test: Like the standard error test, this test is used to test the significance of the parameters.

❖ We test the null hypothesis $H_0: \beta_i = 0$ against the alternative hypothesis $H_1: \beta_i \neq 0$.

❖ Important steps:

1. Compute t*, the computed value of t, using the value of β stated in the null hypothesis:
$$t^* = \frac{\hat{\beta} - \beta}{SE(\hat{\beta})}$$
In our case β = 0, so t* becomes $t^* = \frac{\hat{\beta}}{SE(\hat{\beta})}$.

2. Choose the level of significance. The level of significance is the probability of making a ‘wrong’ decision, i.e. the probability of rejecting the hypothesis when it is actually true, which is the probability of committing a type I error.

Hypothesis…

❖ It is customary in econometric research to choose the 5% or the 1% level of significance. At the 5% level, this means that in making our decision we tolerate being ‘wrong’ five times out of a hundred, i.e. rejecting the hypothesis when it is actually true.

3. Check whether the test is one-tailed or two-tailed. If the sign in the alternative hypothesis is ≠, it is a two-tailed test, and we divide the chosen level of significance by two to determine the critical region, or critical value of t, called tc.

❖ But if the inequality sign is either > or <, it is a one-tailed test and there is no need to divide the chosen level of significance by two to obtain the critical value of t from the t-table.

Hypothesis…

Example: if we have $H_0: \beta_i = 0$ against $H_1: \beta_i \neq 0$, then this is a two-tailed test. If the level of significance is 5%, divide it by two to obtain the critical value of t from the t-table.

4. Obtain the critical value of t, called tc, at $\frac{\alpha}{2}$ and n − 2 degrees of freedom for a two-tailed test, and at α for a one-tailed test.

5. Compare t* (the computed value of t) and tc (the critical value of t):

• If |t*| > tc, reject H0 and accept H1. The conclusion is that $\hat{\beta}$ is statistically significant.
• If |t*| < tc, accept H0 and reject H1. The conclusion is that $\hat{\beta}$ is statistically insignificant.

Hypothesis…
❖ Numerical example: Suppose that from a sample of size n = 20 we estimate the following consumption function (standard errors in parentheses):
$$C = \underset{(75.5)}{100} + \underset{(0.21)}{0.7}\,I + e$$

❖ We want to test the null hypothesis $H_0: \beta_i = 0$ against the alternative hypothesis $H_1: \beta_i \neq 0$ (here for the slope coefficient) using the t-test at the 5% level of significance.

a. The t-value for the test statistic is $t^* = \frac{\hat{\beta} - \beta}{SE(\hat{\beta})} = \frac{0.7 - 0}{0.21} = 3.33$

b. α = 0.05

c. Since the alternative hypothesis (H1) is stated with the inequality sign (≠), it is a two-tailed test, hence we divide α by two, $\frac{\alpha}{2} = \frac{0.05}{2} = 0.025$, and obtain the critical value of t at $\frac{\alpha}{2} = 0.025$ and 18 degrees of freedom (df), i.e. n − 2 = 20 − 2.

Hypothesis…

d. From the t-table, tc at the 0.025 level of significance and 18 df is 2.10.

e. Since t* = 3.33 and tc = 2.10, t* > tc. This implies that $\hat{\beta}$ is statistically significant.
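
The steps of the t-test can be written as a small helper. This is only an illustrative sketch: the function name `t_test` is hypothetical, and the critical value is not computed but passed in as read from a t-table (2.10 for 18 degrees of freedom, as in the example).

```python
def t_test(estimate, std_error, t_critical, hypothesized=0.0):
    """Return (t_star, reject_h0) for a two-tailed test of H0: beta = hypothesized."""
    t_star = (estimate - hypothesized) / std_error
    # Two-tailed rule: reject H0 when the absolute computed t exceeds the critical t.
    return t_star, abs(t_star) > t_critical


# Slope of the consumption function example: estimate 0.7, standard error 0.21.
t_star, reject_h0 = t_test(0.7, 0.21, t_critical=2.10)
print(round(t_star, 2), reject_h0)  # 3.33 True
```

The helper takes the absolute value of t*, so the same rule also covers negative coefficient estimates.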

Hypothesis…
iii) Confidence interval: Rejection of the null hypothesis does not mean that our estimate $\hat{\beta}$ is the correct estimate of the true population parameter β. It simply means that our estimate comes from a sample drawn from a population whose parameter is different from zero.

❖ To determine how close the estimate is to the true parameter, we must construct a confidence interval for the true parameter; in other words, we must establish limiting values around the estimate within which the true parameter is expected to lie with a certain “degree of confidence”. In this respect we say that, with a given probability, the population parameter will be within the defined confidence interval (confidence limits).

Hypothesis…
❖ We choose a probability in advance and refer to it as the confidence level (confidence coefficient). It is customary in econometrics to choose the 95% confidence level.

❖ This means that in repeated sampling the confidence limits, computed from each sample, would include the true population parameter in 95% of the cases. In the other 5% of the cases the population parameter would fall outside the confidence interval.

❖ Decision rule: If the hypothesized value of β in the null hypothesis is within the confidence interval, accept H0 and reject H1. The implication is that $\hat{\beta}$ is statistically insignificant. If the hypothesized value of β in the null hypothesis is outside the limits, reject H0 and accept H1. This indicates that $\hat{\beta}$ is statistically significant.
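
As a worked illustration of the decision rule, the sketch below builds a 95% interval for the slope of the earlier consumption function example (estimate 0.7, standard error 0.21, n = 20). It assumes the usual textbook interval, the estimate plus or minus the critical t value times the standard error, which is not written out in these slides; the function name is hypothetical and the critical value 2.10 is taken from the t-table for 18 degrees of freedom.

```python
def confidence_interval(estimate, std_error, t_critical):
    """Return (lower, upper) limits: estimate -/+ t_critical * std_error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin


lower, upper = confidence_interval(0.7, 0.21, t_critical=2.10)
print("95% interval:", round(lower, 3), "to", round(upper, 3))

# Decision rule: is the hypothesized value (zero) inside the interval?
hypothesized = 0.0
reject_h0 = not (lower <= hypothesized <= upper)
print("reject H0:", reject_h0)
```

The interval runs from about 0.259 to 1.141. Zero lies outside it, so H0 is rejected, which agrees with the t-test conclusion for the same example.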

1.6. The Multiple Linear Regression Model

❖ In simple regression we study the relationship between a dependent variable and a single explanatory (independent) variable.

❖ But it is rarely the case that economic relationships involve just two variables. Rather, a dependent variable Y can depend on a whole series of explanatory variables or regressors.

❖ For instance, in demand studies we study the relationship between the quantity demanded of a good and the price of the good, the price of substitute goods, and the consumer’s income.

$$Q_i = \beta_0 + \beta_1 P_1 + \beta_2 P_2 + \beta_3 I_i + U_i$$

where $Q_i$ is the quantity demanded, $P_1$ is the price of the good, $P_2$ is the price of substitute goods, $I_i$ is the consumer’s income, the β’s are unknown parameters, and $U_i$ is the disturbance.

MLRM…

❖ The above model is a multiple regression with three explanatory variables.

❖ In general, for k explanatory variables we can write the model as follows:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki} + U_i$$
❖ where $X_{ji}$ (j = 1, 2, 3, ..., k) are the explanatory variables, $Y_i$ is the dependent variable, $\beta_j$ (j = 0, 1, 2, ..., k) are the k + 1 unknown parameters, and $U_i$ is the disturbance term.

❖ The multiple regression model has the same assumptions as the single explanatory variable model developed earlier, plus the additional assumption of no perfect multicollinearity.
❖ No perfect multicollinearity: the explanatory variables are not perfectly linearly correlated.

MLRM…

❖ In short, in matrix notation: $Y = X\beta + U$

MLRM…

❖ To derive the OLS estimators of β under the usual (classical) assumptions mentioned earlier, we define two vectors, $\hat{\beta}$ (the estimated coefficients) and e (the residuals).

❖ Thus we can write $Y = X\hat{\beta} + e$ and $e = Y - X\hat{\beta}$.

❖ We have to minimize $\sum_{i=1}^{n} e_i^2 = e_1^2 + e_2^2 + e_3^2 + \dots + e_n^2 = e'e$

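The minimization of e'e can be sketched numerically. The surviving slide text does not show the solution, so the example below assumes the standard textbook result that the minimizer solves the normal equations (X'X)b = X'Y. It uses NumPy; the simulated data, the "true" coefficients and the seed are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Simulated data: n observations, two explanatory variables plus an intercept.
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])   # first column of ones carries the intercept
beta_true = np.array([1.0, 2.0, -0.5])      # hypothetical population parameters
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Solve the normal equations (X'X) b = X'y for the OLS coefficient vector.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

e = y - X @ beta_hat   # residual vector e = Y - X * beta_hat
rss = e @ e            # the minimized sum of squared residuals e'e

print("estimated coefficients:", np.round(beta_hat, 3))
print("sum of squared residuals:", round(float(rss), 3))
```

Solving the linear system directly avoids forming an explicit matrix inverse. The solve step fails if X'X is singular, which is what perfect multicollinearity among the columns of X would cause.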
1.7. Applications using Statistical software
❖ Exercise 1: Using the data in wage1.csv, answer the following questions:

a) Which model is appropriate to explain wage in terms of education, experience, and nonwhite?

b) Run the model, write the regression line, and interpret the parameters.

c) Is the model output compatible with economic theory and statistically reliable?

d) Test the hypothesis that education and experience have no significant effect on a person's wage at the 5% level of significance.

e) Which of the explanatory variables are practically important? Rank the variables by their economic importance.

f) Is the model output econometrically meaningful?

Answer key for exercise 1

a) Multiple linear regression model, because the dependent variable wage is continuous and it is affected by three variables.

b) Regression line: $wage = -3.387 + 0.644\,educ + 0.07\,exper - 0.016\,nonwhite$

VARIABLES       wage
educ            0.644***
                (0.054)
exper           0.070***
                (0.011)
nonwhite        -0.016
                (0.470)
Constant        -3.387***
                (0.775)
Observations    526
R-squared       0.225
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1

Interpretation:
The constant term: if all explanatory variables are zero, the predicted wage of the person is negative; technically, the regression line has its vertical intercept at -3.387.
Coefficient of education: if a person's years of schooling increase by one grade/level, the wage of that individual increases by 0.644 units, other variables held constant.
Coefficient of experience: if a person's experience increases by one year, the wage of that individual increases by 0.070 units, other variables held constant.
Coefficient of nonwhite: the wage of a nonwhite individual is lower than that of a white individual by 0.016 units.

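To make the interpretation of the coefficients concrete, the reported regression line can be evaluated for particular individuals. The sketch below hard-codes the coefficients from the output table; the function name `predict_wage` and the example person (12 years of education, 10 years of experience) are assumptions for illustration, not cases taken from the data.

```python
# Coefficients copied from the reported regression output.
COEF = {"const": -3.387, "educ": 0.644, "exper": 0.070, "nonwhite": -0.016}


def predict_wage(educ, exper, nonwhite):
    """Predicted hourly wage from the estimated regression line."""
    return (
        COEF["const"]
        + COEF["educ"] * educ
        + COEF["exper"] * exper
        + COEF["nonwhite"] * nonwhite
    )


base = predict_wage(educ=12, exper=10, nonwhite=0)
more_school = predict_wage(educ=13, exper=10, nonwhite=0)

print("predicted wage:", round(base, 3))
print("effect of one more year of education:", round(more_school - base, 3))
```

For this hypothetical person the predicted wage is about 5.041, and one extra year of schooling with experience and race unchanged raises the prediction by the education coefficient, 0.644.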
Answer key…
c) The compatibility of the model output with economic theory is checked by looking at the sign and size of the coefficients. All coefficients have the expected sign, so the output is compatible with theory. Statistical reliability is checked by $R^2 = 0.225$, which means the three explanatory variables together explain 22.5% of the variation in wage; the remaining 77.5% is explained by factors other than education, experience, and race.

d) $H_0: \beta_1 = \beta_2 = 0$ against $H_1: \beta_1 \neq 0$ or $\beta_2 \neq 0$, at the 5% level of significance.

The F statistic is significant at the 1% level of significance, so we reject H0, meaning that education and experience have an influence on wage.

e) The important variables, in rank order, are education, experience, and race.

Answer key…

f) The econometric meaningfulness of the model can be verified by checking whether the assumptions of the CLRM are violated or not. That is:

i. Are the residuals from the model estimation homoscedastic (constant variance)?

ii. Is there specification error or omitted variable bias in the model?

iii. Are the explanatory variables seriously correlated with each other?

Statistics > Postestimation > Reports and statistics

i. Homoscedasticity: The test probability is significant at the 1% level of significance, so we reject H0, meaning that there is a problem of heteroscedasticity.

Answer key…

ii. Omitted variable bias: H0 is rejected, suggesting that the model has the problem of omitted variable bias.

iii. Multicollinearity: The test for multicollinearity suggests that there is no serious problem of multicollinearity among the explanatory variables, because the mean VIF is about 1.07. As a rule of thumb, a VIF greater than 5 signals substantial multicollinearity, and a VIF above 10 is considered severe.

Assignment (5%)
2. Consider the model $lwage = \beta_0 + \beta_1\,married + \beta_2\,female + \beta_3\,exper + \beta_4\,expersq + \beta_5\,tenure + \beta_6\,tenursq + U_i$

a) Run the appropriate model, interpret it, and present your model results in Word format to four decimal places.

b) Check the plausibility of the model with respect to theory, statistics, and econometrics.

c) Test whether the marriage premium is the same for male and female workers.

d) Which of the explanatory variables are practically important? Rank the variables by their economic importance.

e) Predict the wage for all individuals in the sample and for those with and without experience. Which predicted value is greater? Interpret the results.

❖Description of variables
➢wage average hourly earnings
➢educ years of education
➢exper years potential experience
➢tenure years with current employer
➢female =1 if female
➢married =1 if married
➢lwage log(wage)
➢expersq exper^2
➢tenursq tenure^2

End of the chapter!
Thank you!
