Lecture 1

Econometrics II: Econometric Modelling

Jürgen Meinecke

Research School of Economics, Australian National University

27 July, 2018
Welcome
Welcome to your second course in econometrics!

Q: Wait! What is econometrics?

Definition
Econometrics is the science of using economic theory and
statistical techniques to analyze economic data.

▶ Econometric methods are used in many branches of economics and business, including finance, labor economics, development economics, behavioral economics, macroeconomics, microeconomics, marketing, and economic policy

▶ They are also used in other social sciences such as political science and sociology
▶ Econometrics is a nice combination of economics and statistics

▶ Econometrics gives you skills that are rewarded in the workplace (private banks, central banks, consulting firms, insurance companies, and government agencies all have big teams of econometricians trying to make sense of a broad array of data)

▶ Econometrics can be quite mathematical, but this semester I will focus on the big ideas, the important concepts, and intuition

▶ But before we get started with econometrics, let’s first briefly discuss . . .
Staff

You can seek help on matters academic from
▶ your friendly lecturer (me): Juergen Meinecke
▶ your friendly tutor: again me!

Feel free to e-mail anytime, stop by my office, randomly stop me on campus or call me on a Sunday afternoon (or not)

You can seek help on matters administrative from
▶ Course administrator: Nicole Millar
▶ School administrator: Finola Wijnberg

Nicole and Finola are very friendly, they are happy to help, and you can find them on the first floor of the Arndt building
Indicative work load

▶ two hours of lecture per week
▶ one hour of computer tutorial per week
▶ 7 hours of private study per week

These are guidelines

If you miss a lecture or tute you should make up for it as soon as possible!
Stata

For those of you who are not familiar with Stata:
▶ Visit the class website and click on “Stata help”
▶ There you will find resources to teach yourself Stata
▶ Dedicate some time to teach yourself Stata
▶ Feel free to stop by my office if you need help
▶ I’m also happy to use the weekly computer tutorials to answer your Stata related questions
Website

Now let’s take a look at the course website

https://juergenmeinecke.github.io/EMET3004

(That’s right, I’m not using Wattle)

(One exception however: audio and video recordings will go up on Wattle automatically after each session.)

Also make sure to check out this website

https://juergenmeinecke.github.io/EMET2007

There you’ll find lots of prereq material!


Roadmap

We will cover (more or less) the following chapters in the textbook:
▶ weeks 1 through 6
  ▶ chapters 1 through 8 (STAT1008 and EMET2007 prereq)
  ▶ chapter 9
  ▶ chapter 12
▶ weeks 7 through 12
  ▶ chapter 13
  ▶ chapter 10
  ▶ chapter 11
Roadmap

Introduction

The Big Ideas from STAT1008 and EMET2007


Expected Value, Standard Deviation and Variance
Population versus Sample
Sample Average
Central Limit Theorem
Hypothesis Testing, Confidence Intervals
Definition
Suppose the random variable Y takes on k possible values
y1 , . . . , yk . The expected value is given by
E[Y] := ∑_{j=1}^{k} y_j · Pr(Y = y_j)    (1)

Occasionally we also call this the population mean or simply the mean or the expectation.

Oftentimes, the expected value is also denoted µY.
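To make definition (1) concrete, here is a minimal Python sketch (my own illustration, not from the course materials) computing the expected value of a fair six-sided die:

```python
# Computing E[Y] for a fair six-sided die directly from definition (1).
values = [1, 2, 3, 4, 5, 6]   # the possible values y_1, ..., y_k
probs = [1 / 6] * 6           # the probabilities Pr(Y = y_j)

expected_value = sum(y * p for y, p in zip(values, probs))
print(expected_value)         # 3.5
```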


Properties of the expected value

1. Let c be a constant, then E[c] = c

2. Let c be a constant and Y be a random variable, then
   E[c + Y] = c + E[Y]
   E[c · Y] = c · E[Y]
   It follows that for two constants c and d, E[c + d · Y] = c + d · E[Y]

3. Let X and Y be random variables, then
   E[X + Y] = E[X] + E[Y]
   E[X − Y] = E[X] − E[Y]

(Can you prove all of these?)


Definition
The r-th moment of a random variable Y is given by
m_r(Y) := E[Y^r], for r = 1, 2, 3, . . .

▶ m_1(Y) equals the expected value
▶ m_2(Y) − µY² equals the variance
▶ m_3(Y) is related to the skewness (degree of symmetry)
▶ m_4(Y) is related to the kurtosis (thickness of tails)
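Raw moments are also easy to estimate by simulation. Here is a quick sketch of mine (assuming numpy is available) using a skewed example distribution:

```python
# Estimating the first four raw moments m_r(Y) = E[Y^r] by simulation.
import numpy as np

rng = np.random.default_rng(seed=42)
y = rng.exponential(scale=2.0, size=1_000_000)  # a skewed example distribution

for r in range(1, 5):
    print(f"m_{r}(Y) is approximately {np.mean(y ** r):.2f}")

# The variance then follows as m_2(Y) minus the squared mean:
print(np.mean(y ** 2) - np.mean(y) ** 2)  # approximately 4.0
```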
Definition
The population variance is defined by

Var[Y] := ∑_{j=1}^{k} (y_j − µY)² · Pr(Y = y_j)

Oftentimes, the variance is denoted by σY².

Definition
The population standard deviation is defined by
StD[Y] := √Var[Y]

It follows immediately that the standard deviation is simply σY.
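Continuing the die example from above, a short sketch (again my own illustration) computes the population variance and standard deviation straight from these definitions:

```python
# Population variance and standard deviation of a fair six-sided die.
import math

values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

mu = sum(y * p for y, p in zip(values, probs))
var = sum((y - mu) ** 2 * p for y, p in zip(values, probs))
std = math.sqrt(var)

print(mu, var, std)   # 3.5, about 2.92, about 1.71
```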


Properties of the variance

1. Let c be a constant, then Var[c] = 0

2. Let c be a constant and Y be a random variable, then
   Var[c + Y] = Var[Y]
   Var[c · Y] = c² · Var[Y]

3. Let X and Y be random variables, then
   Var[X + Y] = Var[X] + Var[Y] + 2 · Cov(X, Y)
   Var[X − Y] = Var[X] + Var[Y] − 2 · Cov(X, Y)

(Can you prove all of these?)

We haven’t yet defined what we mean by ‘Cov(X, Y)’; we’ll do this later when we discuss bivariate analysis
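As a sanity check on property 3, here is a small simulation sketch of mine (assuming numpy) with correlated X and Y:

```python
# Verifying Var[X + Y] = Var[X] + Var[Y] + 2·Cov(X, Y) by simulation.
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.normal(size=1_000_000)
y = 0.5 * x + rng.normal(size=1_000_000)   # Y is correlated with X

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y)[0, 1]
print(lhs, rhs)   # the two numbers agree closely (both about 3.25)
```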
Roadmap

Introduction

The Big Ideas from STAT1008 and EMET2007


Expected Value, Standard Deviation and Variance
Population versus Sample
Sample Average
Central Limit Theorem
Hypothesis Testing, Confidence Intervals
Definition
A population is a well-defined group of subjects.

The population contains all the information on the underlying probability distribution

Subjects don’t need to be people

Examples
▶ Australian citizens
▶ kangaroos in Tidbinbilla
▶ leukocytes in the bloodstream
▶ protons in an atom
▶ lactobacilli in yogurt
Definition
The population size N is the number of subjects in the
population.

We typically think that N is ‘very large’

In fact, it is so large that observing the entire population becomes impossible

Mathematically, we think that N = ∞, even though in many applications this is clearly not the case

Setting N = ∞ merely symbolizes that we are not able to observe the entire population
Example: population of Australian citizens

Clearly, N = 24,986,984
(at the time of writing this)

For all practical purposes it is so large that it might as well have been N = ∞

Example: kangaroos in Tidbinbilla

I have no idea how many kangaroos live in Tidbinbilla
(therefore, I do not know the actual population size)

I could ask the park ranger, but suppose she also doesn’t know

We treat the population size as unimaginably large: N = ∞

The point is:
for some reason we are not able to observe the entire population (too difficult, too big, too costly)

Instead, we only have a random sample of the population


Definition
In a random sample, n subjects are selected
(without replacement) at random from the population.

Each subject of the population is equally likely to be included in the random sample.

Typically, n is much smaller than N

Most importantly, n < N ≤ ∞


The random variable for the i-th randomly drawn subject is
denoted Yi

Definition
Because each subject is equally likely to be drawn and the distribution is the same for all i, the random variables Y1, . . . , Yn are independently and identically distributed (i.i.d.) with mean µY and variance σY².

We write Yi ∼ i.i.d.(µY, σY²).

Given a random sample, we observe the n realizations y1, . . . , yn of the i.i.d. random variables Y1, . . . , Yn
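As a concrete illustration, here is a minimal sketch of mine (the population of weights is made up) of drawing a random sample without replacement in Python:

```python
# Drawing a random sample of n = 30 subjects without replacement,
# so each subject is equally likely to be included.
import numpy as np

rng = np.random.default_rng(seed=1)
population = rng.normal(loc=70, scale=10, size=100_000)  # hypothetical weights

sample = rng.choice(population, size=30, replace=False)
print(sample[:5])   # the first few realizations y_1, y_2, ...
```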

What do we do with a random sample of i.i.d. data?


Roadmap

Introduction

The Big Ideas from STAT1008 and EMET2007


Expected Value, Standard Deviation and Variance
Population versus Sample
Sample Average
Central Limit Theorem
Hypothesis Testing, Confidence Intervals
In analogy to the mean of a population,
we define the mean of a subset of the population:

Definition
The sample average is the average outcome in the sample:
Ȳ := (1/n) ∑_{i=1}^{n} Yi

Sometimes we also call the sample average the sample mean.

It should be obvious that this is a sensible definition


Let’s say we are interested in learning about the weights of kangaroos in Tidbinbilla

We drive to Tidbinbilla and somehow randomly collect 30 roos and measure their weights

This will give us a random sample of size 30 of kangaroo weights

It’s easy to calculate the average weight of these 30 roos

Suppose we obtain a sample average of 70kg


There is a huge difference between the population mean and the sample mean

There is only one population, therefore there is only one population mean

But there are many different random subsets (samples) of the population, each of which results in a (potentially) different sample average

Let’s say we drive to Tidbinbilla for a second time, again randomly collect 30 roos and measure their weights

Should we expect to obtain a sample average of 70kg?

It is unlikely that the second time around we collect exactly the same 30 roos (while it is possible, it is not probable)

If we collect a different subset of 30 kangaroos, chances are that we come up with a different sample average

Suppose we obtain a sample average of 66kg

And now we collect a third random sample . . .

. . . and obtain a sample average of 75kg

And so forth . . .
This illustrates that the sample average itself is a random variable!
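A quick simulation mirrors these repeated trips to Tidbinbilla (my own sketch; the population of roo weights is made up):

```python
# Repeatedly draw random samples of 30 "roo weights" and watch the
# sample average change from sample to sample.
import numpy as np

rng = np.random.default_rng(seed=2)
population = rng.normal(loc=70, scale=12, size=100_000)  # hypothetical weights in kg

for trip in range(1, 4):
    sample = rng.choice(population, size=30, replace=False)
    print(f"trip {trip}: sample average = {sample.mean():.1f}kg")
```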

Random variables have statistical distributions

What distribution does the sample average have?
▶ what is its expected value?
▶ what is its variance?
▶ what is its standard deviation?
▶ what is its shape?
Let Yi ∼ i.i.d.(µY, σY²) for all i

We don’t know exactly which distribution generates the Yi, but at least we know its expected value and its variance (turns out this is all we need to know!)

Each random variable Yi has
▶ population mean µY
▶ variance σY²
Expected value

E[Ȳ] = E[(1/n) ∑_{i=1}^{n} Yi]
     = (1/n) · E[∑_{i=1}^{n} Yi]
     = (1/n) ∑_{i=1}^{n} E[Yi]
     = (1/n) ∑_{i=1}^{n} µY
     = (1/n) · n · µY
     = µY

(all of this follows by the properties of expected values)


Variance

Var[Ȳ] = Var[(1/n) ∑_{i=1}^{n} Yi]
       = (1/n²) · Var[∑_{i=1}^{n} Yi]
       = (1/n²) ∑_{i=1}^{n} Var[Yi]
       = (1/n²) ∑_{i=1}^{n} σY²
       = (1/n²) · n · σY²
       = σY²/n

(all of this follows by the properties of variances, and realizing that Cov(Yi, Yj) = 0 for i ≠ j (why?))
Standard deviation

StD(Ȳ) = σY/√n

(that’s an easy one, given that we know the variance)


In summary, we have figured out these three parameters for the sample average:
▶ expected value is µY
▶ variance is σY²/n
▶ standard deviation is σY/√n

Also, we understand that the sample average itself is a random variable

It therefore must have a statistical distribution; we write

Ȳ ∼ P(µY, σY²/n)

where P abbreviates some unknown statistical distribution
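A simulation can confirm all three parameters at once (my own sketch; the normal population is just for illustration):

```python
# The mean of many sample averages should be near µ_Y, their variance
# near σ²_Y / n, and their standard deviation near σ_Y / √n.
import numpy as np

rng = np.random.default_rng(seed=3)
mu, sigma, n = 70, 12, 30

ybars = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)
print(ybars.mean())   # about 70   (µ_Y)
print(ybars.var())    # about 4.8  (σ²_Y / n = 144/30)
print(ybars.std())    # about 2.19 (σ_Y / √n)
```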


Roadmap

Introduction

The Big Ideas from STAT1008 and EMET2007


Expected Value, Standard Deviation and Variance
Population versus Sample
Sample Average
Central Limit Theorem
Hypothesis Testing, Confidence Intervals
But what is the actual distribution P?

Is it binomial, normal, logistic, exponential, gamma, or what?


(you do not need to know exactly what these are, just accept
that they are different shapes of probability distributions)

Perhaps not too surprisingly, the exact distribution of Ȳ depends on the distribution of the underlying components of Ȳ, i.e., the distribution of Y1, . . . , Yn

But instead of the exact distribution, we look at the approximate distribution (which is easier to obtain)
▶ if the underlying distribution of Y1, . . . , Yn is binomial, the resulting distribution of Ȳ is approximately normal
▶ if the underlying distribution of Y1, . . . , Yn is normal, the resulting distribution of Ȳ is exactly normal
▶ if the underlying distribution of Y1, . . . , Yn is logistic, the resulting distribution of Ȳ is approximately normal
▶ if the underlying distribution of Y1, . . . , Yn is exponential, the resulting distribution of Ȳ is approximately normal
▶ if the underlying distribution of Y1, . . . , Yn is gamma, the resulting distribution of Ȳ is approximately normal

(‘approximately’ means ‘almost’)


Does this look surprising?

Where does this come from?

Answer: the Central Limit Theorem

Most generally, applying the CLT to the sample average Ȳ results in the following statement:

Given an i.i.d. random sample, the sample average has an approximate normal distribution irrespective of the underlying distribution of Y1, . . . , Yn (as long as they are well-behaved).

When the underlying distribution of Y1, . . . , Yn is normal, you can replace the word ‘approximate’ by the word ‘exact’.
Practical meaning of the CLT:
▶ when the sample size n is large . . .
▶ the sample average Ȳ has almost a normal distribution . . .
▶ around the population mean µY . . .
▶ with variance σY²/n . . .
▶ irrespective of what the underlying distribution of the Y1, . . . , Yn is

But when is n ‘large’ enough?

Rule of thumb: n = 30 is oftentimes good enough!


Illustration of CLT

The underlying distribution of Y1 , . . . , Yn is exponential
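For those who want to replicate an illustration like this, here is a rough sketch of mine (assuming numpy and matplotlib are available):

```python
# Sample averages of exponential draws look approximately normal
# once n is around 30.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=4)
n, reps = 30, 100_000

ybars = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

plt.hist(ybars, bins=100, density=True)
plt.title("Sampling distribution of the sample average (exponential Y, n = 30)")
plt.show()   # the histogram is bell-shaped around µ_Y = 1
```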


Roadmap

Introduction

The Big Ideas from STAT1008 and EMET2007


Expected Value, Standard Deviation and Variance
Population versus Sample
Sample Average
Central Limit Theorem
Hypothesis Testing, Confidence Intervals
Main use of CLT: hypothesis testing

Whenever we calculate a sample average, we need to remember that it should be interpreted as the outcome of a random variable

In other words: the sample average is random

For a different random draw from the population, we would have calculated a different sample average
Example: bus arrival time in Lyneham
▶ the bus schedule says that the bus comes at 8:10am
▶ I assembled a random sample: during the last 30 workdays, the bus came, on average, at 8:14am
▶ is that consistent with the bus schedule?

Here the bus company claims that µY = 810 (population mean)

I get a sample average of Ȳ = 814

How does the CLT help me now?


I understand that my random sample is, well, random

Had I collected my data on different days, perhaps I would have calculated a sample average closer to the bus company’s claim

In any case, I only have the one random sample of 30 observations

I don’t know the actual distribution of the underlying Yi (bus arrival times on day i), but thanks to the CLT I don’t need to

The CLT says that, approximately, Ȳ30 ∼ N(810, σY²/30)

Let’s say an oracle told me that σY² = 45


Bus arrival time distribution

How should we read this picture?

If what the bus company claims (that the bus arrives at 8:10am) is correct, then it would be very unlikely for me to obtain a sample average of 8:14am (because that number is far in the right-hand tail of the distribution)

Yet, I have obtained a sample average of 8:14am

I conclude that the bus company is probably misstating the actual bus arrival time

While it is theoretically possible that the claim of the bus company is correct, it is improbable

This is an example of a probabilistic conclusion

Turns out, we just conducted our first hypothesis test

Null hypothesis: µY = 810

Alternative hypothesis: µY ≠ 810

If the sample average obtained from the random sample is too far away from the hypothesized population mean of 8:10am, then we conclude that the null hypothesis probably does not hold

In that case we reject the null in favor of the alternative hypothesis
But what do we mean by too far?

How far away can the sample mean be from the hypothesized
population mean to imply rejection of the hypothesized value?

Answer:
if the sample mean has less than a 5% chance of occurring under the hypothesized population mean, we declare this ‘too far’

Exploiting the features of the normal distribution, this translates into the following mathematical statement:

Everything smaller than µY − 1.96 · σY/√n and
everything larger than µY + 1.96 · σY/√n

In the bus example (where σY/√n = √(45/30) = √1.5 ≈ 1.22) too far means

everything smaller than 810 − 1.96 · √1.5 = 807.60 and
everything larger than 810 + 1.96 · √1.5 = 812.40
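The same rejection-region arithmetic in code (a sketch of mine, just restating the slide’s calculation):

```python
# Under H0: µ_Y = 810 with σ²_Y = 45 and n = 30, check whether
# Ȳ = 814 falls outside the 95% acceptance region.
import math

mu0, var, n, ybar = 810, 45, 30, 814

se = math.sqrt(var / n)     # σ_Y / √n = √1.5, about 1.22
lower = mu0 - 1.96 * se     # about 807.60
upper = mu0 + 1.96 * se     # about 812.40

print(lower, upper)
print("reject H0" if ybar < lower or ybar > upper else "do not reject H0")
```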
The sample average of 8:14 lies outside the symmetric 95% area which is centered around the hypothesized true value of the population mean

To repeat: our sample average of 8:14 is unlikely to occur if the true population mean was really equal to 8:10

We therefore reject the null hypothesis that the true population mean is equal to 8:10

This raises the question:
What would µY need to be for us not to reject the null hypothesis?

Which population mean would be in line with our sample average of 8:14?
Currently our approach is to propose one particular hypothesized value for the true (unobserved) population mean µY and compare it to the sample average obtained from the data

If the sample average lies beyond 2.40 to the left/right of the hypothesized population mean, we conclude that the hypothesized population mean is probably not equal to the true population mean

But what population mean could be true given the sample average of 8:14?

Wouldn’t it seem clever to study this thing instead:

[814 − 1.96 · √1.5, 814 + 1.96 · √1.5]
That thing is called a confidence interval

Instead of looking 2.40 to the left and to the right of the hypothesized population mean, we look 2.40 to the left and 2.40 to the right of the sample average

This gives us the set of values the hypothesized population mean could take on in order to not be rejected

Next, a more formal definition


Definition
A confidence interval for the population mean is the set of
values the true population mean can be equal to for it not to be
rejected at a 5% significance level.

Mathematically, the interval is defined by

CI(µY) := [Ȳ − 1.96 · σY/√n, Ȳ + 1.96 · σY/√n]

To be able to calculate CI we need to know Ȳ, σY, and n

But we only know two of these (which?)


We do not know σY , the standard deviation in the population

Remember: we do not observe the population, therefore we know neither its mean nor its variance nor its standard deviation

Whenever we do not know a population parameter (such as the mean or the variance or the standard deviation) we just use the sample analog instead

Therefore, we replace σY (the standard deviation in the population) by the standard deviation in the sample
Definition
The sample variance is the variance in the sample:
sY² := (1/n) ∑_{i=1}^{n} (Yi − Ȳ)²

Corollary

The sample standard deviation is simply equal to sY .


An operational version of the confidence interval therefore is given by

CI(µY) := [Ȳ − 1.96 · sY/√n, Ȳ + 1.96 · sY/√n]

The ratio sY/√n has a special name

Definition
The standard error of Ȳ is defined as SE(Ȳ) := sY/√n.

It is the estimated standard deviation of the sample average Ȳ.

The confidence interval therefore becomes

CI(µY) := [Ȳ − 1.96 · SE(Ȳ), Ȳ + 1.96 · SE(Ȳ)]
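Tying the last few slides together, here is a short sketch of mine (the data are simulated for illustration) that computes Ȳ, sY, SE(Ȳ), and the 95% confidence interval:

```python
# Sample average, sample standard deviation (1/n convention, as in the
# definition above), standard error, and 95% confidence interval.
import numpy as np

rng = np.random.default_rng(seed=5)
y = rng.normal(loc=810, scale=np.sqrt(45), size=30)  # hypothetical arrival times

n = len(y)
ybar = y.mean()
s_y = np.sqrt(np.mean((y - ybar) ** 2))
se = s_y / np.sqrt(n)

lower, upper = ybar - 1.96 * se, ybar + 1.96 * se
print(f"Ybar = {ybar:.2f}, SE = {se:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```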
