Lecture 1: Introduction and Review of Prerequisite Concepts

Dr Jay Lee
jay.lee@unsw.edu.au

UNSW

What is Econometrics?

- Ragnar Frisch: “It is the unification of all three [statistics, economic theory, and mathematics] that is powerful. And it is this unification that constitutes econometrics.”
- Econometric theory and applied econometrics

Questions

- What is the causal relationship of interest? e.g., the effect of class size on children’s test scores
- What would be the appropriate model?
- What econometric techniques could be used?

Econometric Terms and Notation

- Data, dataset, or sample: a set of repeated measurements on a set of (random) variables.
- Observation: an element of the data, dataset, or sample; it often corresponds to a specific economic unit.
- Upper-case letters X, Y, Z denote random variables (or vectors).
- Lower-case letters x, y, z denote realizations or specific values of X, Y, Z.
- Greek letters α, β, γ, θ, σ denote unknown parameters of an econometric model.

Standard Data Structures

- Cross-section, time series, panel.
- We focus on cross-section data and assume the data are independent and identically distributed (iid).
- Random sample: the observations are iid.
- We view an observation as a realization of a random variable. Complete knowledge of the probabilistic nature of the random variables is called the population.

Econometric Software

- STATA: easy to use; popular among applied econometricians.
- MATLAB, R, GAUSS: more flexible; popular among theoretical econometricians.
- Fortran, C: gains in computational speed, but less popular among econometricians.

Review on Statistics

- A solid statistics background is critical for understanding econometric methods.
- References:
  - Appendix of Hansen’s Econometrics lecture notes
  - Casella and Berger (2002), Statistical Inference, Duxbury
  - Hogg, McKean, and Craig (2004), Introduction to Mathematical Statistics, Pearson

Probability Theory

- The set, S, of all possible outcomes of a particular experiment is called the sample space for the experiment.
- Examples:
  1. Coin toss
  2. Time remaining in a soccer game
- An event is any collection of possible outcomes of an experiment, that is, any subset of S (including S itself).

Probability Theory

- A collection of subsets of S is called a σ-field (aka σ-algebra or Borel field), denoted by B, if it satisfies the following three properties:
  1. ∅ ∈ B,
  2. If A ∈ B, then Aᶜ ∈ B,
  3. If A1, A2, ... ∈ B, then ∪_{i=1}^∞ A_i ∈ B.
- Example: If S = {1, 2, 3}, then what is B? (One possible answer is checked in the sketch below.)

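To make the example concrete, here is a minimal Python sketch (an addition, not part of the slides) that takes the power set of S = {1, 2, 3} as a candidate for B and verifies the three properties by brute force; since S is finite, checking pairwise unions is enough for property 3.

from itertools import chain, combinations

S = {1, 2, 3}

def power_set(s):
    # All subsets of s, returned as frozensets so they can sit inside a set.
    elems = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(elems, r) for r in range(len(elems) + 1))]

B = set(power_set(S))  # the power set of S, the largest sigma-field on S

assert frozenset() in B                             # property 1: contains the empty set
assert all(frozenset(S - A) in B for A in B)        # property 2: closed under complements
assert all((A | C) in B for A in B for C in B)      # property 3: closed under unions
print(len(B))                                       # 8 = 2^3 subsets

The trivial σ-field {∅, S} also satisfies the three properties; any σ-field on S lies between these two extremes.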
Probability Theory

- (Kolmogorov Axioms) Given S and B, a probability function is a function P with domain B that satisfies
  1. P(A) ≥ 0 for all A ∈ B,
  2. P(S) = 1,
  3. If A1, A2, ... ∈ B are pairwise disjoint, then P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i).
- If P is a probability function and A is any set in B, then
  1. P(∅) = 0,
  2. P(A) ≤ 1,
  3. P(Aᶜ) = 1 − P(A).

Probability Theory

- If A and B are events in S, and P(B) > 0, then the conditional probability of A given B, written P(A|B), is

  P(A|B) = P(A ∩ B) / P(B).

- Example: the Monty Hall problem (look it up yourself; a simulation sketch follows below).
- A collection of events A1, ..., An is mutually independent if for any subcollection A_{i_1}, ..., A_{i_k}, we have

  P(∩_{j=1}^k A_{i_j}) = ∏_{j=1}^k P(A_{i_j}).

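The slide leaves the Monty Hall problem as an exercise; the following Monte Carlo sketch (an addition, not from the slides) illustrates the well-known answer: under the host's door-opening rule, switching wins with probability about 2/3 and staying with probability about 1/3.

import random

def monty_hall_trial(switch):
    # One game: returns True if the contestant wins the car.
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a door that hides a goat and was not picked.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

n = 100_000
stay = sum(monty_hall_trial(switch=False) for _ in range(n)) / n
swap = sum(monty_hall_trial(switch=True) for _ in range(n)) / n
print(stay)  # ≈ 1/3
print(swap)  # ≈ 2/3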
Probability Theory

- A random variable is a function from a sample space S into the real numbers.

  Experiment                Random variable
  Toss two dice             X = sum of the numbers
  Toss a coin 25 times      X = number of heads in 25 tosses
  Get a PhD                 X = having a PhD degree

- Example: three coin tosses (X = number of heads); see the sketch below.

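A short Python sketch (an addition) for the three-coin-toss example: it enumerates the 2³ = 8 equally likely outcomes and tabulates the pmf of X = number of heads.

from itertools import product
from collections import Counter
from fractions import Fraction

outcomes = list(product("HT", repeat=3))              # the 8 equally likely outcomes
counts = Counter(seq.count("H") for seq in outcomes)  # tally the number of heads

pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}
print(pmf)  # {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}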
Probability Theory

- The cumulative distribution function (cdf) of a random variable X, denoted by F_X(x), is defined by

  F_X(x) = P_X(X ≤ x), for all x.

- The function F(x) is a cdf if and only if the following three conditions hold:
  1. lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1.
  2. F(x) is a nondecreasing function of x.
  3. F(x) is right-continuous; that is, for every number x0, lim_{x↓x0} F(x) = F(x0).

Probability Theory

- A random variable X is continuous if F_X(x) is a continuous function of x. A random variable X is discrete if F_X(x) is a step function of x.
- The following two statements are equivalent:
  1. The random variables X and Y are identically distributed.
  2. F_X(x) = F_Y(x) for every x.

Probability Theory

- The probability mass function (pmf) of a discrete random variable X is given by

  f_X(x) = P(X = x) for all x.

- The probability density function (pdf), f_X(x), of a continuous random variable X is the function that satisfies

  F_X(x) = ∫_{−∞}^{x} f_X(t) dt for all x.

- A function f_X(x) is a pdf (or pmf) of a random variable X if and only if
  1. f_X(x) ≥ 0 for all x,
  2. Σ_x f_X(x) = 1 (pmf) or ∫_{−∞}^{∞} f_X(x) dx = 1 (pdf).
  (A numerical check of these conditions follows below.)

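As a quick numerical check of these two conditions, the sketch below (an addition; the exponential density f(x) = e^(−x) on [0, ∞) is my choice of example) verifies non-negativity on a grid and that the density integrates to 1.

import numpy as np
from scipy.integrate import quad

f = lambda x: np.exp(-x)              # exponential(1) density on [0, infinity)

grid = np.linspace(0.0, 50.0, 1001)
assert np.all(f(grid) >= 0)           # condition 1: non-negativity

total, abserr = quad(f, 0.0, np.inf)  # condition 2: total probability
print(total)                          # ≈ 1.0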
Expectations

- The expected value or mean of a random variable X, denoted by EX, is

  EX = ∫_{−∞}^{∞} x · f_X(x) dx              if X is continuous,
  EX = Σ_x x · f_X(x) = Σ_x x · P(X = x)     if X is discrete,

  provided that the integral or sum exists.

- The expectation is a linear operator: that is, for any constants a and b,

  E(aX + b) = a · EX + b.

  (A small worked example follows below.)

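A small worked example in Python (an addition; the fair six-sided die is my assumed example): it computes EX from the pmf and confirms the linearity property exactly, using fractions.

from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}   # fair six-sided die

EX = sum(x * p for x, p in pmf.items())
print(EX)                                        # 7/2

a, b = 2, 3                                      # check E(aX + b) = a*EX + b
E_aXb = sum((a * x + b) * p for x, p in pmf.items())
assert E_aXb == a * EX + b
print(E_aXb)                                     # 10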
Expectations

- For each integer n, the nth moment of X is EX^n. The nth central moment of X is E(X − µ)^n, where µ = EX.
- The variance of a random variable X is its second central moment, Var(X) = E(X − EX)². The positive square root of Var(X) is the standard deviation of X.
- If X has a finite variance, then for any constants a and b,

  Var(aX + b) = a² · Var(X).

- Alternative variance formula: Var(X) = EX² − (EX)². (Checked numerically in the sketch below.)

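Continuing the fair-die example (again an added sketch), the following checks that the definition E(X − EX)² and the alternative formula EX² − (EX)² give the same variance.

from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}   # fair six-sided die

EX  = sum(x * p for x, p in pmf.items())
EX2 = sum(x**2 * p for x, p in pmf.items())

var_central  = sum((x - EX)**2 * p for x, p in pmf.items())  # E(X - EX)^2
var_shortcut = EX2 - EX**2                                   # EX^2 - (EX)^2
assert var_central == var_shortcut
print(var_central)                                           # 35/12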
Multiple Random Variables
Joint and Marginal Distributions

- Let (X, Y) be a discrete bivariate random vector. Then the function f(x, y) from R² into R defined by f(x, y) = P(X = x, Y = y) is called the joint probability mass function or joint pmf of (X, Y).
- Let (X, Y) be a discrete bivariate random vector with joint pmf f(x, y). Then the marginal pmfs of X and Y, f_X(x) = P(X = x) and f_Y(y) = P(Y = y), are given by

  f_X(x) = Σ_y f(x, y)   and   f_Y(y) = Σ_x f(x, y).

Multiple Random Variables
Joint and Marginal Distributions

- For any real-valued function g(x, y),

  E g(X, Y) = Σ_{(x,y)} g(x, y) f(x, y).

- Example: joint and marginal pmf for dice (see the sketch below).
- A function f(x, y) from R² into R is called a joint probability density function or joint pdf of the continuous random vector (X, Y) if, for every A ⊂ R²,

  P((X, Y) ∈ A) = ∫∫_A f(x, y) dx dy.

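For the dice example, one concrete version (my assumption: X = face of the first die, Y = sum of the two dice, both dice fair and independent) is sketched below in Python; it builds the joint pmf, recovers the marginals by summing out the other variable, and evaluates E g(X, Y) for g(x, y) = xy.

from fractions import Fraction
from collections import defaultdict

joint = defaultdict(Fraction)                  # joint pmf of (X, Y)
for d1 in range(1, 7):
    for d2 in range(1, 7):
        joint[(d1, d1 + d2)] += Fraction(1, 36)

f_X = defaultdict(Fraction)                    # marginal of X: sum over y
f_Y = defaultdict(Fraction)                    # marginal of Y: sum over x
for (x, y), p in joint.items():
    f_X[x] += p
    f_Y[y] += p

print(dict(f_X))                               # uniform: each face has probability 1/6
print(f_Y[7])                                  # 1/6, the most likely sum

E_g = sum(x * y * p for (x, y), p in joint.items())   # E(XY)
print(E_g)                                     # 329/12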
Multiple Random Variables
Joint and Marginal Distributions

- The marginal pdfs of X and Y are given by

  f_X(x) = ∫_{−∞}^{∞} f(x, y) dy,   −∞ < x < ∞,
  f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx,   −∞ < y < ∞.

- The definitions of joint and marginal distributions for a bivariate random vector generalize directly to multivariate random vectors.

Multiple Random Variables
Conditional Distributions and Independence

- Let (X, Y) be a discrete bivariate random vector with joint pmf f(x, y) and marginal pmfs f_X(x) and f_Y(y). For any x such that P(X = x) = f_X(x) > 0, the conditional pmf of Y given that X = x is the function of y denoted by f(y|x) and defined by

  f(y|x) = P(Y = y | X = x) = f(x, y) / f_X(x).

  The conditional pmf of X given Y = y is defined similarly.

Multiple Random Variables
Conditional Distributions and Independence

- Let (X, Y) be a continuous bivariate random vector with joint pdf f(x, y) and marginal pdfs f_X(x) and f_Y(y). For any x such that f_X(x) > 0, the conditional pdf of Y given that X = x is the function of y denoted by f(y|x) and defined by

  f(y|x) = f(x, y) / f_X(x).

  The conditional pdf of X given Y = y is defined similarly.

Multiple Random Variables
Conditional Distributions and Independence

- If g(Y) is a function of Y, then the conditional expected value of g(Y) given that X = x is denoted by E(g(Y)|x) and is given by

  E(g(Y)|x) = Σ_y g(y) f(y|x)   and   E(g(Y)|x) = ∫_{−∞}^{∞} g(y) f(y|x) dy,

  in the discrete and continuous cases, respectively.

- The conditional variance of Y given X = x is given by Var(Y|x) = E(Y²|x) − (E(Y|x))². (A worked dice example follows below.)

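A short continuation of the assumed dice example (X = face of the first die, Y = sum of both dice), added here as a sketch: it forms the conditional pmf f(y|x) and computes E(Y|x) and Var(Y|x) for x = 3.

from fractions import Fraction

joint = {(x, x + d2): Fraction(1, 36) for x in range(1, 7) for d2 in range(1, 7)}
f_X = {x: Fraction(1, 6) for x in range(1, 7)}

x0 = 3
f_cond = {y: p / f_X[x0] for (x, y), p in joint.items() if x == x0}   # f(y | x0)

E_Y  = sum(y * p for y, p in f_cond.items())
E_Y2 = sum(y**2 * p for y, p in f_cond.items())
Var_Y = E_Y2 - E_Y**2                 # Var(Y | x0) = E(Y^2 | x0) - (E(Y | x0))^2

print(E_Y)    # 13/2: the fixed first die (3) plus the mean of the second die (7/2)
print(Var_Y)  # 35/12: only the second die is still random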
Multiple Random Variables
Conditional Distributions and Independence

- If X and Y are independent, then
  1. f(x, y) = f_X(x) f_Y(y),
  2. P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B),
  3. E(g(X)h(Y)) = E g(X) · E h(Y), where g(x) is a function of x and h(y) is a function of y.

Multiple Random Variables
Conditional Distributions and Independence

- Law of Iterated Expectations: if X and Y are any two random variables, then

  EX = E(E(X|Y)).

- Conditioning Theorem: for any function g(x),

  E(g(X) · Y | X) = g(X) · E(Y|X).

  (A simulation illustrating the Law of Iterated Expectations follows below.)

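A simulation sketch (an addition; the data-generating process is my choice: Y uniform on (0, 1) and, given Y = y, X normal with mean y and variance 1, so that E(X|Y) = Y) illustrating the Law of Iterated Expectations: both sample means estimate EX = E(E(X|Y)) = 0.5.

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

Y = rng.uniform(0.0, 1.0, size=n)     # outer variable
X = rng.normal(loc=Y, scale=1.0)      # X | Y = y  ~  N(y, 1), so E(X | Y) = Y

print(X.mean())                       # direct estimate of EX, ≈ 0.5
print(Y.mean())                       # estimate of E(E(X|Y)) = EY, ≈ 0.5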
Covariance and Correlation

- The covariance of X and Y is the number defined by

  Cov(X, Y) = E((X − EX)(Y − EY)).

- The correlation of X and Y is the number defined by

  ρ_XY = Cov(X, Y) / (σ_X σ_Y),   |ρ_XY| ≤ 1,

  where σ_X = √Var(X) and σ_Y = √Var(Y).

Covariance and Correlation

- For any random variables X and Y,

  Cov(X, Y) = EXY − EX · EY.

- If X and Y are independent random variables, then Cov(X, Y) = 0 and ρ_XY = 0.
- For any constants a and b,

  Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab · Cov(X, Y).

  (Checked by simulation in the sketch below.)

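A simulation check (an addition; I take X standard normal and Y = 0.5X plus independent noise) of the covariance, the correlation bound, and the variance-of-a-linear-combination formula.

import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

X = rng.normal(size=n)
Y = 0.5 * X + rng.normal(size=n)

cov_xy = np.cov(X, Y, ddof=0)[0, 1]
rho_xy = np.corrcoef(X, Y)[0, 1]
print(cov_xy, rho_xy)                 # ≈ 0.5 and ≈ 0.447; |rho| <= 1

a, b = 2.0, -3.0                      # Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)
lhs = np.var(a * X + b * Y)
rhs = a**2 * np.var(X) + b**2 * np.var(Y) + 2 * a * b * cov_xy
print(lhs, rhs)                       # the two agree up to simulation error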
Matrix Algebra: Notation
- A scalar a is a single number.
- A vector a is a k × 1 list of numbers:

  a = [ a1 ]
      [ a2 ]
      [ ⋮  ]
      [ ak ]

- A matrix A is a k × r rectangular array of numbers:

  A = [ a11 a12 ··· a1r ]
      [ a21 a22 ··· a2r ]
      [  ⋮   ⋮        ⋮ ]
      [ ak1 ak2 ··· akr ]
    = ( a1  a2  ···  ar ),

  where a_i, i = 1, ..., r, is a k × 1 column vector.


Matrix Algebra: Notation (cont.)

- Transpose of a matrix: A′ is obtained by flipping A on its diagonal:

  A′ = [ a11 a21 ··· ak1 ]
       [ a12 a22 ··· ak2 ]
       [  ⋮   ⋮        ⋮ ]
       [ a1r a2r ··· akr ]

- A matrix is square if k = r.
- A square matrix is symmetric if A = A′.
- A square matrix is diagonal if the off-diagonal elements are all zero (similarly, upper/lower triangular matrices have zeros below/above the diagonal).

Matrix Algebra: Notation (cont.)

- The identity matrix is a diagonal (thus square) matrix whose diagonal terms are all ones.
- The k × k identity matrix is denoted as

  I_k = [ 1 0 ··· 0 ]
        [ 0 1 ··· 0 ]
        [ ⋮ ⋮  ⋱  ⋮ ]
        [ 0 0 ··· 1 ]

  (A numpy sketch of this matrix notation follows below.)

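A brief numpy sketch (an addition, not from the slides) illustrating the notation: a k × r matrix, its transpose, a symmetry check, and the identity matrix.

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # a 2 x 3 matrix (k = 2, r = 3)
print(A.T)                             # the transpose A', a 3 x 2 matrix

S = A @ A.T                            # A A' is square and symmetric
print(np.allclose(S, S.T))             # True: S = S'

I3 = np.eye(3)                         # the 3 x 3 identity matrix
print(I3)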
Matrix Addition

- For matrices A and B with the same number of rows and columns,

  A + B = (a_{ij} + b_{ij}),

  where a_{ij} and b_{ij} are the elements of A and B, respectively.
- The commutative and associative laws hold:

  A + B = B + A,
  A + (B + C) = (A + B) + C.

Matrix Multiplication

- For a scalar c, cA = Ac = (c · a_{ij}).
- For k × 1 vectors a and b,

  a′b = a1 b1 + a2 b2 + ··· + ak bk = Σ_{j=1}^{k} a_j b_j = b′a.

- Two vectors a and b are orthogonal if a′b = 0.
- To multiply matrices A and B, say A × B, the number of columns of A must equal the number of rows of B.
- If A is k × r and B is r × m, then AB is the k × m matrix whose (i, j) element is Σ_{l=1}^{r} a_{il} b_{lj}.

Matrix Multiplication (cont.)

- Matrix multiplication is not commutative: AB ≠ BA in general.
- It is associative and distributive:

  A(BC) = (AB)C,
  A(B + C) = AB + AC.

- For the identity matrix and a k × r matrix A,

  A I_r = A,   I_k A = A.

  (A numpy sketch of these rules follows below.)

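A closing numpy sketch (an addition) for the multiplication rules: the inner product a′b, non-commutativity of AB, associativity, and multiplication by the identity.

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 0.0, -1.0])
print(a @ b)                                    # a'b = 1*4 + 2*0 + 3*(-1) = 1

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
C = np.array([[2.0, 0.0], [0.0, 2.0]])

print(np.allclose(A @ B, B @ A))                # False: AB != BA in general
print(np.allclose(A @ (B @ C), (A @ B) @ C))    # True: associativity
print(np.allclose(A @ np.eye(2), A))            # True: A I = A
print(np.allclose(np.eye(2) @ A, A))            # True: I A = A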