
ECONOMETRICS I

AAE 316

WISDOM MGOMEZULU

Email: [email protected]
MEANING OF
ECONOMETRICS
• define econometrics
• state the importance of econometrics
• list types of econometrics
• give examples of applications and use of
econometrics in real world
DEFINITION
• There are several aspects of the quantitative approach to
economics, and no single one of these aspects, taken by
itself, should be confused with econometrics.
• Thus, econometrics is by no means the same as economic
statistics. Nor is it identical with what we call general
economic theory, although a considerable portion of this
theory has a definitely quantitative character.
• Nor should econometrics be taken as synonymous with the
application of mathematics to economics.
• Experience has shown that each of these three viewpoints,
that of statistics, economic theory, and mathematics, is a
necessary, but not by itself a sufficient, condition for a real
understanding of the quantitative relations in modern
economic life.
• It is however the unification of all three that is powerful.
And it is this unification that constitutes econometrics.
• Econometrics may be defined as the social science in
which the tools of economic theory, mathematics, and
statistical inference are applied to the analysis of economic
phenomena.
• A social science is, in its broadest sense, the study of
society and the manner in which people behave and
influence the world around us
• Economic phenomenon refers to observed situations or
problems that economists deal with. For example,
explaining changes in commodity prices. Econometrics is
concerned with the empirical determination of economic
laws.
• Econometrics can also be defined as the application of
statistical and mathematical methods to the analysis of
economic data, with the purpose of giving empirical content
to economic theories and of verifying or refuting them.

• The art of the econometrician consists of finding the set of
assumptions that are both sufficiently specific and
sufficiently realistic to allow him to take the best possible
advantage of the data available to him.
• Econometrics is thus seen as the vehicle by which
economics can claim scientific validity.
AIMS OF ECONOMETRICS
1. Specification of econometric models
•The economic models are formulated in an
empirically testable form. Several
econometric models can be derived from an
economic model. Such models differ in their choice of
functional form, the specification of the stochastic
structure of the variables, and so on.
2. Estimation and testing of models:
•The models are estimated on the basis of
observed set of data and are tested for their
suitability. This is the part of statistical
inference of the modeling. Various estimation
procedures are used to know the numerical
values of the unknown parameters of the
model. Based on various formulations of
statistical models, a suitable and appropriate
model is selected.
3. Use of models:
•The estimated models are used for forecasting
and policy formulation, which is an essential
part of any policy decision. Such forecasts
help the policy makers to judge the goodness
of fitted model and take necessary measures
in order to re-adjust the relevant economic
variables.
IMPORTANCE OF
ECONOMETRICS
1. Econometrics provides necessary tools to an economist to
derive useful information about economic policies.
•Economics as a science involves knowing the theory and
establishing the set of hypotheses to be tested, which is
understood by studying econometrics. Once the theory is known,
it is tested using various techniques; this testing of theories
and hypotheses is achieved through studying econometrics.
2. Econometrics contains statistical tools to help you defend
or test assertions in economic theory.
•For example, you think that the production in an economy is
in Cobb-Douglas form. But do data support your hypothesis?
Econometrics can help you in this case.

•The econometrician uses the mathematical equations
proposed by the mathematical economist but puts these
equations in such a form that they lend themselves to
empirical testing.
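• As an illustration (a minimal sketch with simulated data, not part of the lecture), the Cobb-Douglas form Q = A·K^a·L^b can be put into an empirically testable form by taking logs and estimating the resulting linear equation by ordinary least squares. The inputs and the coefficient values 0.3 and 0.6 below are assumptions made only for the example.

# A minimal illustrative sketch (simulated data; the coefficients are assumed):
# ln Q = ln A + a ln K + b ln L + e can be estimated by OLS.
import numpy as np

rng = np.random.default_rng(0)
n = 200
K = rng.uniform(1, 100, n)                       # capital input (simulated)
L = rng.uniform(1, 100, n)                       # labour input (simulated)
lnQ = 0.5 + 0.3 * np.log(K) + 0.6 * np.log(L) + rng.normal(0, 0.1, n)

X = np.column_stack([np.ones(n), np.log(K), np.log(L)])   # constant, ln K, ln L
beta, *_ = np.linalg.lstsq(X, lnQ, rcond=None)            # OLS estimates of (ln A, a, b)
print(np.round(beta, 2))                                  # roughly [0.5, 0.3, 0.6]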
3. Studying econometrics is crucial in
understanding economic policy issues.
•Econometricians are interested in the
empirical verification of economic theory.
Econometrics uses economic data to test
economic theories.
APPLICATION OF ECONOMETRICS IN
REAL WORLD
1. Forecasting macroeconomic indicators:
•Some macroeconomists are concerned with the expected
effects of monetary and fiscal policy on the aggregate
performance of the economy. Time-series models can be used
to make predictions about these economic indicators.
2. Estimating the impact of immigration on native workers:
•Immigration increases the supply of workers, so standard
economic theory predicts that equilibrium wages will
decrease for all workers. However, since immigration can
also have positive demand effects, econometric estimates are
necessary to determine the net impact of immigration in the
labor market.
3. Identifying the factors that affect a firm’s entry into and
exit from a market:
•The microeconomic field of industrial organization, among
many issues of interest, is concerned with firm concentration
and market power. Econometric estimation can help
determine which factors are the most important for firm entry
and exit.
4. Determining the influence of minimum-wage laws on
employment levels:
•The minimum wage is an example of a price floor, so higher
minimum wages are supposed to create a surplus of labor
(higher levels of unemployment). However, the impact of
price floors like the minimum wage depends on the shapes of
the demand and supply curves. Therefore, labor economists
use econometric techniques to estimate the actual effect of
such policies.
5. Finding the relationship between management techniques
and worker productivity:
•The use of high-performance work practices (such as worker
autonomy, flexible work schedules, and other policies
designed to keep workers happy) has become more popular
among managers. At some point, however, the cost of
implementing these policies can exceed the productivity
benefits. Econometric models can be used to determine which
policies lead to the highest returns and improve managerial
efficiency.
6. Measuring the association between insurance coverage
and individual health outcomes:
•One of the arguments for increasing the availability (and
affordability) of medical insurance coverage is that it should
improve health outcomes and reduce overall medical
expenditures. Econometric estimates can be used to measure
whether such an association is actually present in the data.
7. Predicting revenue increases in response to a marketing
campaign:
•The field of marketing has become increasingly dependent
on empirical methods. A marketing or sales manager may
want to determine the relationship between marketing efforts
and sales. How much additional revenue is generated from an
additional dollar spent on advertising?
8. Calculating the impact of a firm’s tax credits on R&D
expenditure:
•Tax credits for research and development (R&D) are
designed to provide an incentive for firms to engage in
activities related to product innovation and quality
improvement. Econometric estimates can be used to
determine how changes in the tax credits influence R&D
expenditure and how distributional effects may produce tax-
credit effects that vary by firm size.
VARIABLES
• define a random variable.
• classify variables
• describe levels of measurement of
variables.
• categorize data
I. Review of Basic Statistics

- Discrete and continuous random variables
- Probability distribution of random variables
- Expected values of random variables
- Joint pdf
- Covariance and Correlation
- Bias, Efficiency, MSE, and Consistency
I.1. Basic Statistics

Random variable:
A variable whose value is unknown until it is observed.
The value of a random variable results from an experiment.

The term random variable implies the existence of some
known or unknown probability distribution defined over
the set of all possible values of that variable.

In contrast, an arbitrary variable does not have a
probability distribution associated with its values.
An example of tossing a pair of coins

First coin    Second coin    No. of heads

T             T              0
T             H              1
H             T              1
H             H              2
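• A minimal simulation sketch (illustration only, not from the slides): toss the two coins many times and tally the random variable "number of heads".

import numpy as np

rng = np.random.default_rng(0)
tosses = rng.integers(0, 2, size=(100_000, 2))   # each row: (first coin, second coin); 0 = T, 1 = H
heads = tosses.sum(axis=1)                       # number of heads in each repetition

values, counts = np.unique(heads, return_counts=True)
for v, c in zip(values, counts):
    print(f"P(no. of heads = {v}) is approximately {c / len(heads):.3f}")   # about .25, .50, .25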
1.1. Discrete Random Variable

Discrete random variable:
A discrete random variable can take only a finite number
of values, which can be counted using the positive integers.
Example: Prize money from the following
lottery is a discrete random variable:
first prize: K1,000
second prize: K50
third prize: K5.75
since it has only four (a finite number)
(count: 1,2,3,4) of possible outcomes:
K0.00; K5.75; K50.00; K1,000.00
1.2. Continuous Random Variable

Continuous random variable:
A continuous random variable can take any real value
(not just whole numbers) in at least one interval on the
real number line.

Examples:
Gross national product (GNP)
money supply
interest rates
price of eggs
household income
expenditure on clothing
Dummy Variable

A discrete random variable that is restricted to two possible
values (usually 0 and 1) is called a dummy variable (also a
binary or indicator variable).

Dummy variables account for qualitative differences,
e.g., gender (0=male, 1=female),
residence (0=rural, 1=urban),
citizenship (0=Malawian, 1=non-Malawian),
income class (0=poor, 1=rich).
Nominal Variable

A nominal random variable is characterized by data that
consist of names, labels, or categories only.

Nominal variables account for qualitative differences,
e.g., name of an EPA,
district name,
village name,
regions of the country.
1.3 Probability Distribution of Discrete Random Variables

A list of all of the possible values taken by a discrete
random variable, along with their chances of occurring, is
called a probability function or probability density function
(pdf).

die x f(x)
one dot 1 1/6
two dots 2 1/6
three dots 3 1/6
four dots 4 1/6
five dots 5 1/6
six dots 6 1/6
A discrete random variable X has pdf, f(x), which is the
probability that X takes on the value x.

f(x) = P(X=x)

Therefore, 0 ≤ f(x) ≤ 1.

If X takes on the n values x1, x2, . . . , xn,
then f(x1) + f(x2) + . . . + f(xn) = 1.
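• A minimal sketch, using the fair die from the table above, of these two properties of a discrete pdf:

import numpy as np

x = np.arange(1, 7)                  # possible values of the die
f = np.full(6, 1 / 6)                # pdf values f(x) = P(X = x) = 1/6

assert np.all((f >= 0) & (f <= 1))   # every f(x) lies between 0 and 1
assert np.isclose(f.sum(), 1.0)      # f(x1) + ... + f(xn) = 1
print(dict(zip(x.tolist(), f.tolist())))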
Probability, f(x), for a discrete random
variable, X, can be represented by height:

[Bar chart: the probability f(x), with heights between 0.1 and 0.4, plotted against X = 0, 1, 2, 3, the number, X, on the Dean’s List of three roommates.]
1.4 Probability Distribution of Continuous Random Variables

A continuous random variable uses area under a curve


rather than the height, f(x), to represent probability:

[Density curve f(x) for per capita income, X, in the United States, with the area under the curve split into two shaded regions of 0.8676 and 0.1324 around the values $34,000 and $55,000.]


Since a continuous random variable has an uncountably
infinite number of values, the probability of one occurring
is zero.

P[X=a] = P[a ≤ X ≤ a] = 0

=> Probability is represented by area.
=> Height alone has no area.
=> An interval for X is needed to get an area under the curve.
The area under a curve is the integral of the
equation that generates the curve:

P[a < X < b] = ∫_a^b f(x) dx

For continuous random variables it is the integral of f(x),
and not f(x) itself, which defines the area and, therefore,
the probability.
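• A minimal sketch of this idea; the standard normal density is used only as an assumed example of f(x):

import numpy as np
from scipy.integrate import quad

f = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # an example density f(x)

a, b = 0.0, 1.0
prob, _ = quad(f, a, b)      # P[a < X < b] = integral of f(x) from a to b
print(round(prob, 4))        # about 0.3413

point, _ = quad(f, a, a)     # P[X = a]: a zero-width interval has zero area
print(point)                 # 0.0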
1.5 Mean of a Random Variable: Expected Value

There are two entirely different, but mathematically
equivalent, ways of determining the expected value:

1. Empirically:
The expected value of a random variable,
X, is the average value of the random variable
in an infinite number of repetitions of the
experiment.
In other words, draw an infinite number of samples,
and average the values of X that you get.
2. Analytically:
The expected value of a discrete random
variable, X, is determined by weighting all
the possible values of X by the corresponding
probability density function values, f(x), and
summing them up.

In other words:

E[X] = x1f(x1) + x2f(x2) + . . . + xnf(xn)


Empirical (sample) mean:

x̄ = (1/n) Σ_{i=1}^{n} xi

where n is the number of sample observations (# of repetitions).

Analytical mean:

E[X] = Σ_{i=1}^{n} xi f(xi)

where n is the number of possible values of xi
(# of possible outcomes).
The expected value of X:

E[X] = Σ_{i=1}^{n} xi f(xi)

The expected value of X-squared:

E[X²] = Σ_{i=1}^{n} xi² f(xi)

It is important to notice that f(xi) does not change!

The expected value of X-cubed:

E[X³] = Σ_{i=1}^{n} xi³ f(xi)
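• A minimal sketch computing these expected values analytically for the fair-die pdf (an assumed running example): weight each possible value by f(x) and sum.

import numpy as np

x = np.arange(1, 7, dtype=float)
f = np.full(6, 1 / 6)

E_X  = np.sum(x * f)         # E[X]   = sum of xi   f(xi) = 3.5
E_X2 = np.sum(x**2 * f)      # E[X^2] = sum of xi^2 f(xi), about 15.17
E_X3 = np.sum(x**3 * f)      # E[X^3] = sum of xi^3 f(xi) = 73.5
print(E_X, E_X2, E_X3)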
Adding and Subtracting Random Variables

E(X+Y) = E(X) + E(Y)
E(X-Y) = E(X) - E(Y)

If c is a constant, E[c] = c.
If c is a constant and X is a random variable, E[cX] = cE[X].
If a and c are constants, E[a+cX] = a + cE[X].

Variance of Random Variables

1. var(X) = the average squared deviation around the mean of X;
the expected value of the squared deviations around the
expected value of X:

var(X) = σ² = E[X - E(X)]²
            = E[X²] - [E(X)]²

2. Variance of a discrete random variable, X: the weighted
average of the squared differences between the values x of
the random variable X and the mean of the random variable:

var(X) = Σ_{i=1}^{n} [xi - E(X)]² f(xi)

The standard deviation is the positive square root of the variance.

How to calculate the variance for a discrete
random variable, X?
xi    f(xi)    xi - E(X)         [xi - E(X)]² f(xi)
2     .1       2 - 4.3 = -2.3    5.29 (.1) = .529
3     .3       3 - 4.3 = -1.3    1.69 (.3) = .507
4     .1       4 - 4.3 = -0.3    0.09 (.1) = .009
5     .2       5 - 4.3 =  0.7    0.49 (.2) = .098
6     .3       6 - 4.3 =  1.7    2.89 (.3) = .867

E(X) = Σ xi f(xi) = .2 + .9 + .4 + 1.0 + 1.8 = 4.3

var(X) = Σ [xi - E(X)]² f(xi) = .529 + .507 + .009 + .098 + .867
       = 2.01
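• A minimal sketch reproducing the calculation in the table above:

import numpy as np

x = np.array([2, 3, 4, 5, 6], dtype=float)
f = np.array([0.1, 0.3, 0.1, 0.2, 0.3])

E_X   = np.sum(x * f)                 # 4.3
var_X = np.sum((x - E_X)**2 * f)      # 2.01
sd_X  = np.sqrt(var_X)                # standard deviation, about 1.42

# the shortcut formula gives the same answer: var(X) = E[X^2] - (E[X])^2
assert np.isclose(var_X, np.sum(x**2 * f) - E_X**2)
print(E_X, var_X, sd_X)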
Useful Property of Variances

Random variable Z = a + cX:

var(Z) = var(a + cX)
       = E[(a + cX) - E(a + cX)]²
       = c² var(X)

Hence var(a + cX) = c² var(X).


1.7 Joint pdf

A joint probability density function, f(x,y), provides the
probabilities associated with the joint occurrence
(co-movement) of all of the possible pairs of X and Y.

                      Male (y=0)    Female (y=1)    Party totals, f(x)
Democrat (x=0)        200 (.20)     270 (.27)       470 (.47)
Republican (x=1)      300 (.30)     100 (.10)       400 (.40)
Other (x=2)            60 (.06)      70 (.07)       130 (.13)
Gender totals, f(y)   560 (.56)     440 (.44)       1000
Marginal pdf

The marginal probability density functions, f(x) and f(y),
for discrete random variables can be obtained by summing
f(x,y) over the values of Y to obtain f(x), and over the
values of X to obtain f(y):

f(xi) = Σj f(xi, yj)        f(yj) = Σi f(xi, yj)

For continuous random variables, the marginal pdfs are obtained
by integrating f(x,y) over the other variable rather than summing.

                      Y=0           Y=1           marginal pdf for X
X=0                   .20           .27           f(X=0) = .47
X=1                   .30           .10           f(X=1) = .40
X=2                   .06           .07           f(X=2) = .13
marginal pdf for Y:   .56 f(Y=0)    .44 f(Y=1)
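• A minimal sketch computing these marginal pdfs from the joint pdf of the party/gender example:

import numpy as np

f_xy = np.array([[0.20, 0.27],    # rows: X = 0, 1, 2 (party); columns: Y = 0, 1 (gender)
                 [0.30, 0.10],
                 [0.06, 0.07]])

f_x = f_xy.sum(axis=1)   # marginal pdf for X: [0.47, 0.40, 0.13]
f_y = f_xy.sum(axis=0)   # marginal pdf for Y: [0.56, 0.44]
print(f_x, f_y)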
Conditional pdf

The conditional probability density functions of X


given Y=y , f(x|y), and of Y given X=x , f(y|x), are
obtained by dividing f(x,y) by f(y) to get f(x |y) and by
f(x) to get f(y|x).

f(x,y)
f(x|y) =
f(y)
f(x,y)
f(y|x) =
f(x)
                      Y=0                   Y=1                   marginal pdf for X
X=0                   .20                   .27                   f(X=0) = .47
                      f(Y=0|X=0) = .43      f(Y=1|X=0) = .57
                      f(X=0|Y=0) = .36      f(X=0|Y=1) = .61
X=1                   .30                   .10                   f(X=1) = .40
                      f(Y=0|X=1) = .75      f(Y=1|X=1) = .25
                      f(X=1|Y=0) = .54      f(X=1|Y=1) = .23
X=2                   .06                   .07                   f(X=2) = .13
                      f(Y=0|X=2) = .46      f(Y=1|X=2) = .54
                      f(X=2|Y=0) = .11      f(X=2|Y=1) = .16
marginal pdf for Y:   .56 f(Y=0)            .44 f(Y=1)
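• A minimal sketch reproducing the conditional pdfs in the table above by dividing the joint pdf by the relevant marginal:

import numpy as np

f_xy = np.array([[0.20, 0.27],
                 [0.30, 0.10],
                 [0.06, 0.07]])
f_x = f_xy.sum(axis=1)
f_y = f_xy.sum(axis=0)

f_y_given_x = f_xy / f_x[:, None]    # row i holds f(y | X = i) = f(x,y) / f(x)
f_x_given_y = f_xy / f_y[None, :]    # column j holds f(x | Y = j) = f(x,y) / f(y)

print(np.round(f_y_given_x, 2))      # e.g. f(Y=0 | X=1) = .30/.40 = .75
print(np.round(f_x_given_y, 2))      # e.g. f(X=0 | Y=1) = .27/.44 = .61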
Independence

X and Y are independent random variables if their


joint pdf, f(x,y),is the product of their respective
marginal pdfs, f(x) and f(y).

~ if knowing the value that one will take does NOT


reveal anything about what value the other may take

f(xi,yj) = f(xi) f(yj)


If X and Y were independent, the joint pdf would equal
f(xi, yj) = f(xi) f(yj):

                      Y=0                       Y=1                       marginal pdf for X
X=0                   .20 (vs .47×.56 = .26)    .27 (vs .47×.44 = .21)    f(X=0) = .47
X=1                   .30 (vs .40×.56 = .22)    .10 (vs .40×.44 = .18)    f(X=1) = .40
X=2                   .06 (vs .13×.56 = .07)    .07 (vs .13×.44 = .06)    f(X=2) = .13
marginal pdf for Y:   .56 f(Y=0)                .44 f(Y=1)

Since the observed joint probabilities differ from these
products, X and Y are not independent in this example.
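• A minimal sketch of this independence check, comparing the joint pdf with the product of the marginals:

import numpy as np

f_xy = np.array([[0.20, 0.27],
                 [0.30, 0.10],
                 [0.06, 0.07]])
f_x = f_xy.sum(axis=1)
f_y = f_xy.sum(axis=0)

product = np.outer(f_x, f_y)                     # what f(x,y) would be under independence
print(np.round(product, 2))                      # [[.26 .21] [.22 .18] [.07 .06]]
print(np.allclose(f_xy, product, atol=0.01))     # False, so X and Y are not independent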
1.8 Covariance and Correlation

The covariance between two random variables, X and Y,
measures the linear association between them:

cov(X,Y) = E[(X - E[X])(Y - E[Y])]
         = E(XY) - E[X]E[Y]

Note that variance is a special case of covariance:

cov(X,X) = var(X) = E[(X - E[X])²]
Cov(X,Y) = E[XY] - E[X]E[Y], where E[X] = Σ xi f(xi).

For i = 1, 2, 3 and j = 1, 2:

E(XY) = Σ_{i=1}^{3} Σ_{j=1}^{2} xi yj f(xi, yj)
      = x1y1 f(x1,y1) + x2y1 f(x2,y1) + x3y1 f(x3,y1)
      + x1y2 f(x1,y2) + x2y2 f(x2,y2) + x3y2 f(x3,y2)

In general, E(XY) = Σi Σj xi yj f(xi, yj).
                      Y=0           Y=1           marginal pdf for X
X=0                   .20           .27           f(X=0) = .47
X=1                   .30           .10           f(X=1) = .40
X=2                   .06           .07           f(X=2) = .13
marginal pdf for Y:   .56 f(Y=0)    .44 f(Y=1)

Cov(X,Y) = E[XY] - E[X]E[Y]

E(XY) = (0)(0)(.2) + (0)(1)(.27) + (1)(0)(.3) + (1)(1)(.1) + (2)(0)(.06) + (2)(1)(.07) = .24
E(X) = (0)(.47) + (1)(.40) + (2)(.13) = .66
E(Y) = (0)(.56) + (1)(.44) = .44

Cov(X,Y) = .24 - (.66)(.44) = .24 - .29 = -.05
The magnitude of the covariance is difficult to interpret
because it depends on the units of the variables.

The correlation between two random variables, X and Y, is their
covariance divided by the square roots of their respective
variances:

ρ(X,Y) = cov(X,Y) / √[var(X) var(Y)]

where var(X) = E[X²] - [E(X)]².

Correlation is a pure number falling between -1 and 1.

                      Y=0           Y=1           marginal pdf for X
X=0                   .20           .27           f(X=0) = .47
X=1                   .30           .10           f(X=1) = .40
X=2                   .06           .07           f(X=2) = .13
marginal pdf for Y:   .56 f(Y=0)    .44 f(Y=1)

corr(X,Y) = cov(X,Y) / √[var(X) var(Y)]

var(X) = E[X²] - [E(X)]² = (0²)(.47) + (1²)(.40) + (2²)(.13) - (.66)² = .49
var(Y) = E[Y²] - [E(Y)]² = (0²)(.56) + (1²)(.44) - (.44)² = .25

corr(X,Y) = -.05 / √(.49 × .25) = -.05 / .35 = -.14
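• A minimal sketch reproducing this covariance and correlation calculation (at full precision the correlation is about -0.15; the -0.14 above comes from rounding the variances and covariance first):

import numpy as np

x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0])
f_xy = np.array([[0.20, 0.27],
                 [0.30, 0.10],
                 [0.06, 0.07]])
f_x, f_y = f_xy.sum(axis=1), f_xy.sum(axis=0)

E_X  = np.sum(x * f_x)                    # 0.66
E_Y  = np.sum(y * f_y)                    # 0.44
E_XY = np.sum(np.outer(x, y) * f_xy)      # 0.24
cov  = E_XY - E_X * E_Y                   # about -0.05

var_X = np.sum(x**2 * f_x) - E_X**2       # about 0.48
var_Y = np.sum(y**2 * f_y) - E_Y**2       # about 0.25
corr  = cov / np.sqrt(var_X * var_Y)      # about -0.15
print(round(cov, 3), round(corr, 3))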


Independent random variables have zero covariance and,
therefore, zero correlation; but the converse is not true.

Zero covariance or correlation does NOT necessarily mean
“independence.”

Zero correlation or covariance just means that there is no
linear relationship between the two random variables.
1.9 Bias, Efficiency, MSE, and Consistency

How do we find the best estimator for a given sample?

Unbiased: the mean or expected value of the estimator equals
the true value of the parameter, that is, E(X̄) = μ or, more
generally, E(θ̂) = θ; the bias is E(θ̂) - θ.

Efficient: the estimator whose sampling distribution has
the lowest variance (standard error) is the “efficient”
or “best” estimator.

var(X) = E[X²] - [E(X)]²


What if an unbiased and efficient estimator is not available?

It depends on the estimation purpose:
1. Policy evaluation
2. Prediction

Choose the minimum mean squared error (MSE) estimator!

The mean squared error criterion is MSE = bias² + variance.
Large-sample (asymptotic) properties of estimators

Consistency: as N increases, the estimator approaches the true
value. In other words, an estimator is consistent if the probability
distribution of the estimator collapses to a single point (the true
parameter) as the sample size gets arbitrarily large:

lim_{N→∞} Prob(|θ̂ - θ| < ε) = 1

Does unbiasedness imply consistency? Or does consistency
imply unbiasedness?

What is the difference between unbiasedness and consistency?
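• A minimal simulation sketch of these ideas (the values μ = 5 and σ = 2 are assumptions made only for the example): the sample mean is unbiased at every sample size, and its MSE = bias² + variance shrinks toward zero as N grows, which is what consistency looks like in practice.

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, reps = 5.0, 2.0, 10_000

for N in (10, 100, 1000):
    # sampling distribution of the sample mean X-bar for sample size N
    estimates = rng.normal(mu, sigma, size=(reps, N)).mean(axis=1)
    bias = estimates.mean() - mu
    variance = estimates.var()
    mse = bias**2 + variance
    print(f"N={N:5d}  bias={bias:+.4f}  variance={variance:.5f}  MSE={mse:.5f}")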


TYPES OF RANDOM
VARIABLES
• Random variables can be categorized in
different ways.
• Quantitative and qualitative random
variables
Quantitative random variable
• A variable is quantitative if the description
of a characteristic of interest results in a
numerical value. A quantitative variable
can either be discrete or continuous.
• If the random variable can assume only a
particular finite set of values, it is said to be
a discrete random variable. The value of a
discrete random variable is observed by
counting.
• Examples of discrete random variables
include:
• a. Number of minibuses passing through a
road block at lunch hour.
• b. Number of courses in the BSc in
Agricultural Economics Programme at
LUANAR.
• c. Number of postgraduate degree
programmes offered at LUANAR in 2018.
• d. Household size
• A random variable is said to be continuous if it
can assume any value in a certain range. The
value of a continuous random variable is
usually obtained by measurement.
• Examples of continuous random variables are:
• a. Income of a family in the southern region of
Malawi.
• b. Quantity of soya bean demanded
• c. Height of a maize plant
• d. Distance from home to the nearest market
Qualitative random variables
• Not all random variables describe a
characteristic using numerical values.
Suppose you want to report the sex of a
respondent: the response is going to be
either male or female. Neither of these two
responses is a numerical value.
• A qualitative random variable is a variable
whose description of a characteristic of
interest results in a non-numerical value.
• Examples of qualitative variables are:
• a. Gender
• b. marital status
• c. location
• d. Participation in a project
Dummy Variable
• In economics, qualitative variables are
coded using discrete variables. When a
discrete variable is used to recode a
qualitative characteristic, it is called a
dummy variable. A dummy variable is also
called a design variable or indicator
variable
• Example
• Let us create a dummy variable for the sex of
the household head that takes a value of 1 if the
household head is male and 0 if the household
head is female:
D = 1 if the household head is male, 0 otherwise.
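• A minimal sketch (with hypothetical data) of this coding:

import numpy as np

sex = np.array(["male", "female", "female", "male", "male"])   # hypothetical responses

D = (sex == "male").astype(int)   # D = 1 if the household head is male, 0 otherwise
print(D)                          # [1 0 0 1 1]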
Independent and dependent
variables
• In an econometric model, a random variable
can be either a dependent or an independent
variable. A dependent variable is the
variable being explained by the model. An
independent variable is a variable that
explains changes in the dependent variable.
Like any other variable, a dummy variable
can either be a dependent variable or an
independent variable
• The nature of the dependent variable is one
of the factors that determine the choice of
an econometric model to be used in data
analysis.
• For example, you may use a regression
model if the dependent variable is a
continuous variable and you can use a
probit model if the dependent variable is
dichotomous.
DATA SET
• Data are observations that have been
collected on variables. Data are sometimes
used to calculate statistics.
• The success of any econometric analysis
ultimately depends on the availability of the
appropriate data. Data collection is a very
important stage in research
Types of Data
• Four types of data may be available for
empirical analysis: time series, cross-
section, pooled data and panel data.
Time Series Data
• A time series is a set of observations on the
values that a variable takes at different
times. Such data may be collected at regular
time intervals.
• Examples
• a. daily (e.g., stock prices, weather reports)
• b. weekly (e.g., money supply figures)
Cross-Section Data
• Cross-section data are data on one or more
variables collected at the same point in time.
• Examples
• a. Population Census data collected by the
National Statistical Office (NSO) in a given
year.
Pooled data
• These are data with combined elements of
both time series and cross-section data. In
pooled data we have a “time series of cross
sections,” but the observations in each cross
section do not necessarily refer to the same
unit.
Panel data
• National Statistical Office carries out a
census of housing at periodic intervals. At
each periodic survey the same household
(or the people living at the same address) is
interviewed to find out if there has been any
change in the housing and financial
conditions of that household since the last
survey
• By interviewing the same household
periodically, the data becomes panel data
that provides very useful information on the
dynamics of household behavior. Panel data
are data from samples of the same cross-
sectional units observed at multiple points
in time.
Why should we know types of
data?
• Each type of data is analysed using specific
econometric models. There are some
models that can be used to analyse cross-
sectional data but cannot be used to analyse
time series data.
• For example, you can use autoregressive (AR)
models to analyse time series data, as sketched
below, but you cannot use these models to
analyse cross-section data.
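• A minimal sketch (the value φ = 0.7 is assumed only for the example) of why AR models need time series data: each observation depends on the previous one, and the AR(1) coefficient can be recovered by regressing y_t on y_{t-1}.

import numpy as np

rng = np.random.default_rng(0)
T, phi = 500, 0.7
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.normal()   # AR(1): y_t = phi * y_{t-1} + e_t

# OLS slope without a constant: phi_hat = sum(y_t * y_{t-1}) / sum(y_{t-1}^2)
phi_hat = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)
print(round(phi_hat, 2))          # close to 0.7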
• Therefore, before choosing a model for
analysing data, a researcher needs to
understand the type of data that is available
for analysis.
