
Introductory Chapter:
Review of Probability and Statistics and Introduction to Econometrics
(Course: Econometrics)

Lê Phương
Faculty of Economic Mathematics
University of Economics and Law
Vietnam National University, Ho Chi Minh City
Content

1 Probability Theory
Random Variables and Probability Distributions
Characteristics of Random Variables
Common Probability Distributions
Random Vector

2 Applied Statistics
Characteristic Statistical Parameters
Parameter Estimation Problems
Hypothesis Testing Problems

3 Introduction to Econometrics
Introduction to Econometrics
Methodology of Econometrics
Research Data
Regression Relationships
RVs and Probability Distributions

Let Ω denote the sample space of a random experiment.
Definition
A function X : Ω → R is called a random variable.
Let X (Ω) denote the set of values that X can take.
Discrete Random Variables
A random variable X is discrete if its set of values is countable:
• Finite: X (Ω) = {x1 , x2 , . . . , xn }, or
• Countably infinite: X (Ω) = {x1 , x2 , . . . , xn , . . . }.

Continuous Random Variables


A random variable X is continuous if its set of values is an interval (or
a set of intervals or the entire real line R).
RVs and Probability Distributions
Probability Distribution of a Discrete Random Variable
The probability distribution of a discrete random variable X is a
function
f (xi ) = P(X = xi )
where xi ∈ X (Ω).
Probability Distribution Table

X | x1      x2      ···   xn      ···
P | f(x1)   f(x2)   ···   f(xn)   ···

Properties
1 $0 \le f(x_i) \le 1$ for $x_i \in X(\Omega)$,
2 $\sum_i f(x_i) = 1$,
3 $P(a < X \le b) = \sum_{a < x_i \le b} f(x_i)$.
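As a quick illustration (the numbers are made up for this sketch, not taken from the slides), the following Python snippet encodes a small pmf, checks the first two properties, and computes an interval probability using the third:

```python
# Hypothetical pmf of a discrete random variable X (values chosen for illustration).
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

# Property 1: every probability lies in [0, 1].
assert all(0 <= p <= 1 for p in pmf.values())

# Property 2: the probabilities sum to 1 (up to floating-point error).
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# Property 3: P(a < X <= b) is the sum of f(x_i) over a < x_i <= b.
def interval_prob(pmf, a, b):
    return sum(p for x, p in pmf.items() if a < x <= b)

print(interval_prob(pmf, 0, 2))  # P(0 < X <= 2) = 0.3 + 0.4 = 0.7
```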
RVs and Probability Distributions

Probability Distribution of a Continuous Random Variable

The probability distribution of a continuous random variable X is a
probability density function f : R → R of X that satisfies the following
conditions:
1 $f(x) \ge 0$ for all $x \in \mathbb{R}$,
2 $\int_{-\infty}^{+\infty} f(x)\,dx = 1$,
3 $P(a < X \le b) = \int_a^b f(x)\,dx$ for $a < b$.

Note: For a continuous random variable X , we have P(X = x) = 0 for
all x ∈ R. Hence,

P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b).


RVs and Probability Distributions

Cumulative Distribution Function


The cumulative distribution function of a random variable X , denoted
by FX (x) or F (x), is defined as:

F (x) = P(X ≤ x) ∀x ∈ R.

Interpretation
The cumulative distribution function gives the probability that X takes
a value less than or equal to the real number x.

Calculation

$$F(x) = \begin{cases} \sum_{x_i \le x} f(x_i) & \text{if } X \text{ is discrete}, \\[4pt] \int_{-\infty}^{x} f(s)\,ds & \text{if } X \text{ is continuous}. \end{cases}$$
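Both branches of this formula are easy to check numerically; a short sketch (illustrative values, scipy assumed available) using the pmf from above for the discrete case and the standard normal density for the continuous case:

```python
import numpy as np
from scipy import integrate

# Discrete case: F(x) = sum of f(x_i) over x_i <= x (cumulative sums of the pmf).
xs = np.array([0, 1, 2, 3])
ps = np.array([0.1, 0.3, 0.4, 0.2])
F_discrete = np.cumsum(ps)
print(dict(zip(xs, F_discrete)))  # {0: 0.1, 1: 0.4, 2: 0.8, 3: 1.0}

# Continuous case: F(x) = integral of f(s) ds from -infinity to x,
# illustrated here with the standard normal density.
def f(s):
    return np.exp(-s**2 / 2) / np.sqrt(2 * np.pi)

F_at_0, _ = integrate.quad(f, -np.inf, 0)
print(F_at_0)  # ~0.5, as expected by symmetry
```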
RVs and Probability Distributions

Properties of the Cumulative Distribution Function

1 $0 \le F(x) \le 1$ for all $x \in \mathbb{R}$.
2 $F$ is non-decreasing; it is continuous when $X$ is continuous.
3 $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$.
4 $P(a < X \le b) = F(b) - F(a)$.

Relation to Probability Distributions


• For discrete random variables: f (xi ) = F (xi ) − F (xi−1 ).
• For continuous random variables:
1 F is a continuous function,
2 F ′ (x) = f (x) at points where the density function f is continuous.
Characteristics of Random Variables

Definition
The expectation (or mean) of a random variable X , denoted by E(X )
or EX , is defined as:
$$E(X) = \begin{cases} \sum_i x_i f(x_i) & \text{if } X \text{ is discrete}, \\[4pt] \int_{-\infty}^{+\infty} x f(x)\,dx & \text{if } X \text{ is continuous}. \end{cases}$$

Meaning
The expectation represents the long-run average value of the random
variable X over many repetitions of the experiment.
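A small numerical sketch of both cases of the definition (the discrete pmf is the illustrative one used earlier; the continuous example uses an Exp(1) density):

```python
import numpy as np
from scipy import integrate

# Discrete case: E(X) = sum of x_i * f(x_i).
xs = np.array([0, 1, 2, 3])
ps = np.array([0.1, 0.3, 0.4, 0.2])
print(np.sum(xs * ps))  # E(X) = 1.7

# Continuous case: E(X) = integral of x * f(x) dx, here for the
# exponential density f(x) = e^{-x} on x > 0 (mean should be 1).
EX, _ = integrate.quad(lambda x: x * np.exp(-x), 0, np.inf)
print(EX)  # ~1.0
```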
Characteristics of Random Variables

Properties of Expectation
1 $E(a) = a$ for any constant $a$,
2 $E(a + bX) = a + bE(X)$,
3 If $g$ is a function,
$$E(g(X)) = \begin{cases} \sum_i g(x_i) f(x_i) & \text{if } X \text{ is discrete}, \\[4pt] \int_{-\infty}^{+\infty} g(x) f(x)\,dx & \text{if } X \text{ is continuous}. \end{cases}$$
4 If $X$, $Y$ are independent, then $E(XY) = E(X)E(Y)$.

Note: Two random variables X and Y are independent if the events
(X ≤ x) and (Y ≤ y) are independent for all x, y ∈ R.
Characteristics of Random Variables
Definition
The variance of a random variable X , denoted by Var (X ), V (X ), or
VX , is defined as:
$$VX = E\left( (X - EX)^2 \right).$$
The standard deviation of a random variable X , denoted by σ(X ), is
given by $\sigma(X) = \sqrt{VX}$.

From the definition and the third property of expectation, we have:
$$VX = \begin{cases} \sum_i (x_i - EX)^2 f(x_i) & \text{if } X \text{ is discrete}, \\[4pt] \int_{-\infty}^{+\infty} (x - EX)^2 f(x)\,dx & \text{if } X \text{ is continuous}. \end{cases}$$

Meaning
The variance represents the degree of dispersion of the random
variable around the mean value.
Characteristics of Random Variables

Properties of Variance
1 $V(X) = E(X^2) - (EX)^2$,
2 $V(a) = 0$ for any constant $a$,
3 $V(a + bX) = b^2 V(X)$,
4 If $X$, $Y$ are independent, then $V(X \pm Y) = V(X) + V(Y)$.
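A quick Monte Carlo sanity check of properties 1 and 3 (a sketch with simulated data, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=3.0, size=1_000_000)  # Var(X) should be ~9

# Property 1: V(X) = E(X^2) - (EX)^2.
print(np.var(X), np.mean(X**2) - np.mean(X)**2)  # both ~9

# Property 3: V(a + bX) = b^2 V(X), here with a = 5, b = -2.
print(np.var(5 - 2 * X), 4 * np.var(X))          # both ~36
```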
Common Probability Distributions

Normal Distribution
A continuous random variable X is said to have a normal distribution
with parameters µ and σ 2 (µ ∈ R, σ > 0), denoted by X ∼ N(µ, σ 2 ), if
its probability density function is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$

X is said to have the standard normal distribution if X ∼ N(0, 1).
Common Probability Distributions

Chi-Squared Distribution
A random variable X has a chi-squared distribution with n degrees of
freedom, denoted by X ∼ χ2 (n), if its probability density function is

$$f(x) = \begin{cases} \dfrac{1}{2^{n/2}\,\Gamma(n/2)}\, e^{-x/2}\, x^{n/2-1}, & \text{for } x > 0; \\[4pt] 0, & \text{for } x \le 0, \end{cases}$$

where $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt$ is called the Gamma function.
Common Probability Distributions

Student’s t-Distribution
X is said to have a Student’s t-distribution with n degrees of freedom,
denoted by X ∼ t(n), if its probability density function is

$$f(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{\pi n}\,\Gamma\left(\frac{n}{2}\right)} \left( 1 + \frac{x^2}{n} \right)^{-\frac{n+1}{2}}, \quad x \in \mathbb{R},$$

where $\Gamma(x)$ is the Gamma function. (Unlike the chi-squared density,
the t density is positive on all of $\mathbb{R}$ and symmetric about 0.)


Common Probability Distributions

Fisher’s F-Distribution
X is said to have a Fisher’s F-distribution with n and m degrees of
freedom, denoted by X ∼ F (n, m), if its probability density function is
 n m
 n 2 m 2 Γ( n+m 2 )
n−2 n+m
2 (m + nx)− 2 ,
f (x) = Γ( n2 )Γ( m2 ) x for x > 0;
 0, for x ≤ 0.

where Γ(x) is the Gamma function.
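In coursework these densities and their critical values are usually obtained from tables or software rather than by hand; a sketch using scipy.stats (the library choice is an assumption of this note, not part of the slides):

```python
from scipy import stats

# Densities evaluated at a point.
print(stats.norm.pdf(0.0))              # standard normal density at 0, ~0.3989
print(stats.chi2.pdf(3.0, df=5))        # chi-squared(5) density at 3
print(stats.t.pdf(0.0, df=10))          # Student's t(10) density at 0
print(stats.f.pdf(1.0, dfn=3, dfd=20))  # Fisher F(3, 20) density at 1

# Critical values (quantiles), as used later in estimation and testing.
alpha = 0.05
print(stats.norm.ppf(1 - alpha / 2))          # z_{alpha/2} ~ 1.96
print(stats.t.ppf(1 - alpha / 2, df=10))      # t critical value
print(stats.chi2.ppf(1 - alpha, df=5))        # chi-squared critical value
print(stats.f.ppf(1 - alpha, dfn=3, dfd=20))  # F critical value
```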


Random Vector

Definition
A mapping X = (X1 , X2 , . . . , Xn ) : Ω → Rn is called an n-dimensional
random vector.

Classification
An n-dimensional random vector X = (X1 , X2 , . . . , Xn ) is called
continuous (respectively, discrete) if all of its component random
variables X1 , X2 , . . . , Xn are continuous (respectively, discrete).
Random Vector
Discrete Joint Probability Distribution
Let X (Ω) = {x1 , x2 , . . . } and Y (Ω) = {y1 , y2 , . . . }. The joint
probability distribution of a two-dimensional discrete random vector
(X , Y ) is the function

$$f(x_i, y_j) = P(X = x_i;\, Y = y_j)$$

with $x_i \in X(\Omega)$, $y_j \in Y(\Omega)$.


The joint probability distribution table:

X \ Y   y1          y2          ···   ym
x1      f(x1, y1)   f(x1, y2)   ···   f(x1, ym)
x2      f(x2, y1)   f(x2, y2)   ···   f(x2, ym)
···     ···         ···         ···   ···
xn      f(xn, y1)   f(xn, y2)   ···   f(xn, ym)
Random Vector

Properties
1 $0 \le f(x_i, y_j) \le 1$ for $x_i \in X(\Omega)$, $y_j \in Y(\Omega)$,
2 $\sum_{i,j} f(x_i, y_j) = 1$,
3 $P(a < X \le b;\, c < Y \le d) = \sum_{a < x_i \le b} \sum_{c < y_j \le d} f(x_i, y_j)$.
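A joint pmf can be stored as a matrix whose rows index X and columns index Y; the sketch below (illustrative numbers) checks property 2 and previews the marginal distributions and independence criterion defined on the following slides:

```python
import numpy as np

# A hypothetical 2x3 joint pmf for (X, Y); rows index X, columns index Y.
f = np.array([[0.10, 0.20, 0.10],
              [0.15, 0.30, 0.15]])
assert abs(f.sum() - 1.0) < 1e-12  # property 2: total mass is 1

# Marginal distributions, obtained by summing out the other variable.
f_X = f.sum(axis=1)  # P(X = x_i)
f_Y = f.sum(axis=0)  # P(Y = y_j)

# Independence check: does f(x_i, y_j) = f_X(x_i) * f_Y(y_j) in every cell?
print(np.allclose(f, np.outer(f_X, f_Y)))  # True for this particular table
```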
Random Vector

Continuous Joint Probability Distribution


The joint probability distribution of a two-dimensional continuous
random vector (X , Y ) is the joint probability density function
f : R × R → R of (X , Y ) satisfying the following conditions:
1 $f(x, y) \ge 0$ for all $x, y \in \mathbb{R}$,
2 $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y)\,dy\,dx = 1$,
3 $P(a < X \le b;\, c < Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx$ with $a < b$ and $c < d$.
Random Vector

Cumulative Distribution Function


The joint cumulative distribution function of a two-dimensional
random vector (X , Y ) is defined as follows:

F (x, y) = P(X ≤ x; Y ≤ y) ∀x, y ∈ R.

Calculation Formula:

$$F(x, y) = \begin{cases} \sum_{x_i \le x} \sum_{y_j \le y} f(x_i, y_j) & \text{if } (X, Y) \text{ is discrete}, \\[4pt] \int_{-\infty}^{x} \int_{-\infty}^{y} f(t, s)\,ds\,dt & \text{if } (X, Y) \text{ is continuous}. \end{cases}$$
Random Vector

Marginal Probability Distribution

The probability distribution of a component random variable is called
a marginal probability distribution of the random vector:

$$f_X(x) = \begin{cases} \sum_y f(x, y) & \text{if } (X, Y) \text{ is discrete}, \\[4pt] \int_{-\infty}^{\infty} f(x, y)\,dy & \text{if } (X, Y) \text{ is continuous}, \end{cases}$$

and similarly for $f_Y(y)$.

Theorem
Two random variables X and Y are independent

$$\iff f(x, y) = f_X(x) f_Y(y) \iff F(x, y) = F_X(x) F_Y(y).$$


Random Vector

Covariance
The covariance of two random variables X and Y is

$$\mathrm{Cov}(X, Y) = E\left( (X - EX)(Y - EY) \right) = E(XY) - E(X)E(Y).$$

Note:
1 If X , Y are independent, then Cov(X , Y ) = 0 (However, the
reverse is not true).
2 Cov(X , X ) = Var (X ).
Interpretation: Covariance measures the extent to which X and Y
move together or in opposite directions.
1 Cov(X , Y ) > 0: X and Y move together.
2 Cov(X , Y ) < 0: X and Y move in opposite directions.
Random Vector
Correlation Coefficient
The correlation coefficient of two random variables X and Y is
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{E(XY) - E(X)E(Y)}{\sigma(X)\,\sigma(Y)}.$$

If ρ(X , Y ) = 0, we say X and Y are uncorrelated; if ρ(X , Y ) ̸= 0, we
say X and Y are correlated.
Note:
1 −1 ≤ ρ(X , Y ) ≤ 1.
2 ρ(X , Y ) = 1 ⇔ X and Y are linearly dependent with a positive
coefficient (i.e., X = aY + b with a > 0).
3 ρ(X , Y ) = −1 ⇔ X and Y are linearly dependent with a negative
coefficient (i.e., X = aY + b with a < 0).
4 If X , Y are independent, then ρ(X , Y ) = 0 (X and Y are
uncorrelated).
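Both quantities are easy to estimate from data; a simulation sketch (made-up data-generating process) that also illustrates why the converse of note 4 fails:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=100_000)
Y = 2.0 * X + rng.normal(scale=0.5, size=100_000)  # Y moves with X

print(np.cov(X, Y))       # 2x2 sample covariance matrix
print(np.corrcoef(X, Y))  # off-diagonal entries ~0.97: strong positive correlation

# Uncorrelated but dependent: for X symmetric about 0, Cov(X, X^2) = E(X^3) = 0,
# yet X^2 is completely determined by X.
print(np.corrcoef(X, X**2)[0, 1])  # ~0
```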
Random Vector

Covariance Matrix
The covariance matrix (also called the variance-covariance matrix) of
the random vector (X , Y ) is the matrix

$$\mathrm{Var}(X, Y) = \begin{bmatrix} \mathrm{Cov}(X, X) & \mathrm{Cov}(X, Y) \\ \mathrm{Cov}(Y, X) & \mathrm{Cov}(Y, Y) \end{bmatrix} = \begin{bmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X, Y) \\ \mathrm{Cov}(Y, X) & \mathrm{Var}(Y) \end{bmatrix}.$$

The covariance matrix provides an assessment of data dispersion.


Characteristic Statistical Parameters

Given a general sample (X1 , X2 , . . . , Xn ) and a specific sample
(x1 , x2 , . . . , xn ) from the population under study, the population
parameters of interest are:
• Population proportion p (the proportion of units exhibiting
property A in the population, applicable to both qualitative and
quantitative characteristics),
• Population mean µ (µ = E(X ), applicable to quantitative
characteristics),
• Population variance σ 2 (σ 2 = V (X ), applicable to quantitative
characteristics).
To study them, we use the following statistics:
Characteristic Statistical Parameters

General sample characteristics:

Mean: $\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$

Proportion: $F = \dfrac{X_A}{n}$, where $X_A$ is the number of sample units exhibiting property A

Variance: $S^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2$

Specific sample characteristics:

Mean: $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$

Proportion: $f = \dfrac{x_A}{n}$, where $x_A$ is the observed number of units exhibiting property A

Variance: $s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
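These statistics map directly onto numpy; a sketch with made-up observations (note ddof=1 for the 1/(n−1) divisor in the sample variance):

```python
import numpy as np

# Hypothetical observed sample of n = 8 measurements.
x = np.array([9.5, 12.0, 8.7, 15.2, 11.1, 10.4, 13.8, 9.9])

x_bar = np.mean(x)      # sample mean
s2 = np.var(x, ddof=1)  # sample variance with the 1/(n-1) divisor
s = np.std(x, ddof=1)   # sample standard deviation

# Sample proportion of units with property A, here "value above 10".
f = np.mean(x > 10)     # x_A / n = 5/8

print(x_bar, s2, s, f)
```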
Parameter Estimation Problems

Problem: Find an estimate for the parameter θ of a population, where
θ is one of the following:
1 µ (population mean),
2 p (population proportion),
3 σ 2 (population variance).

We can use a single value to estimate θ. This is known as a point
estimate.

Alternatively, we can provide an interval (θ1 , θ2 ) that is likely to
contain θ. This is known as an interval estimate.
Parameter Estimation Problems
Point Estimation Problem
Find a statistic θ̂(X1 , X2 , ..., Xn ) to estimate the unknown parameter θ.
The statistic θ̂ is called the estimation function for θ.
From a specific sample (x1 , ..., xn ), we can calculate the value
θ̂∗ = θ̂(x1 , ..., xn ). This value is called the point estimate of θ.

Unbiased Estimation
The statistic θ̂ is called an unbiased estimate of θ if E θ̂ = θ.

Efficient Estimation
The statistic θ̂ is called an efficient estimate of θ if it is an unbiased
estimate and has the smallest variance among all unbiased estimates
of θ.

Consistent Estimation
The statistic θ̂ is called a consistent estimate of θ if
$\hat{\theta}(X_1, ..., X_n) \xrightarrow{P} \theta$ (convergence in probability).
Thus, for large n, with probability close to 1, we have θ̂ ≃ θ.
Parameter Estimation Problems

Interval Estimation Problem


Given probability 1 − α, from a general sample (X1 , ..., Xn ) find
statistics θ̂1 , θ̂2 such that

P(θ̂1 < θ < θ̂2 ) = 1 − α.

With a specific sample (x1 , x2 , ..., xn ), θ̂1 takes the value θ1 and θ̂2
takes the value θ2 . The interval (θ1 , θ2 ) is called the interval estimate
of θ, where
• 1 − α: confidence level of the estimate,
• (θ1 , θ2 ): confidence interval of the estimate,
• $\varepsilon = \dfrac{\theta_2 - \theta_1}{2}$: precision (error) of the estimate.

Example: The interval estimate with confidence level 1 − α for the
mean, for a large sample size, is $(\bar{x} - \varepsilon,\, \bar{x} + \varepsilon)$
with $\varepsilon = z_{\alpha/2} \dfrac{s}{\sqrt{n}}$. The value $\dfrac{s}{\sqrt{n}}$ is called the standard error.
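A sketch of this large-sample interval in Python (the data are the made-up observations from earlier; with n this small one would normally use a t quantile, but z is used here to mirror the formula above):

```python
import numpy as np
from scipy import stats

x = np.array([9.5, 12.0, 8.7, 15.2, 11.1, 10.4, 13.8, 9.9])
n = len(x)
x_bar = np.mean(x)
se = np.std(x, ddof=1) / np.sqrt(n)  # standard error s / sqrt(n)

alpha = 0.05
z = stats.norm.ppf(1 - alpha / 2)    # z_{alpha/2} ~ 1.96
eps = z * se                         # precision of the estimate
print((x_bar - eps, x_bar + eps))    # confidence interval (theta_1, theta_2)
```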
Hypothesis Testing Problems

Let H0 be the null hypothesis that we wish to test and H1 be the
statement opposing H0 which we suspect to be true. H1 is called the
alternative hypothesis to H0 .

H0 and H1 form a pair of hypotheses that are tested simultaneously to
reach one of the following conclusions:
• Reject H0 (which implies accepting H1 ),
• Do not reject H0 (therefore, accept H0 and reject H1 ).
Hypothesis Testing Problems

Two methods for hypothesis testing with a significance level α:


1 Based on the p-value:
• p-value < α: reject the null hypothesis H0 ,
• p-value ≥ α: insufficient evidence to reject the null hypothesis H0 .
2 Based on critical value:
• Compute the test statistic,
• Compute the critical value corresponding to α,
• If the test statistic falls in the rejection region defined by the critical
value, reject the null hypothesis H0 ; otherwise, insufficient evidence
to reject H0 .
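Both decision rules in one sketch, testing H0: µ = 10 against H1: µ ≠ 10 on the made-up data used above (scipy assumed available):

```python
import numpy as np
from scipy import stats

x = np.array([9.5, 12.0, 8.7, 15.2, 11.1, 10.4, 13.8, 9.9])
alpha = 0.05

# Method 1: p-value rule.
t_stat, p_value = stats.ttest_1samp(x, popmean=10)
print(p_value < alpha)  # True -> reject H0; False -> do not reject H0

# Method 2: critical-value rule for a two-sided test, n - 1 degrees of freedom.
t_crit = stats.t.ppf(1 - alpha / 2, df=len(x) - 1)
print(abs(t_stat) > t_crit)  # reaches the same decision as the p-value rule
```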
Hypothesis Testing Problems
Since conclusions about H0 are based on a sample, errors may
occur:
• Type I error: H0 is true but we reject it,
• Type II error: H0 is false but we do not reject it.

Decision \ Reality    H0 True              H0 False
Do not reject H0      Correct decision,    Type II error,
                      probability 1 − α    probability β
Reject H0             Type I error,        Correct decision,
                      probability α        probability 1 − β
In hypothesis testing, for a fixed sample size, decreasing the
probability of a Type I error increases the probability of a Type II
error. The significance level, denoted by α, is set before testing to
control the probability of a Type I error.
Introduction to Econometrics

Econometrics
• Estimating and measuring economic relationships.
• Comparing economic theories with real-world data to test the
validity of economic theories.
• Forecasting economic variables.

Econometrics uses results from:


• Economic theory
• Economic modeling
• Probability and Statistics
Methodology of Econometrics

1 Identifying the Research Problem


2 Collecting Data
3 Building the Model
4 Estimating the Parameters
5 Testing:
• If the results are satisfactory, use the model.
• If not, return to the previous steps.
Methodology of Econometrics

1. Identifying the Research Problem: Define the scope, nature, and
characteristics of the objects, and their relationships.
2. Collecting Data: from primary or secondary sources.
3. Building the Model:
• Identify a reasonable economic theoretical model
• Construct an economic mathematical model:
• Each object represented by one or more variables.
• Each relationship: equation, function, inequality, etc.
• Parameter values: indicate the nature of the relationship.

4. Estimating the Parameters: Using a given dataset and a specific
estimation method, obtain concrete numerical estimates of the
parameters.
Methodology of Econometrics

5. Testing:
• Use statistical testing methods to test the values of the parameters
and the nature of the relationships,
• Test the adequacy of the model,
• If the model is not appropriate, return to the previous steps and
modify it or construct a new model until the results are satisfactory.
6. Using the model:
• Based on results judged satisfactory, forecast the relationships and
the objects under defined conditions.
• Evaluate decisions.
Research Data

Types of Data
• Time series data: data on one or more economic variables
observed at multiple points in time.
• Cross-sectional data: data on multiple economic units (individuals,
firms, regions, etc.) observed at the same point in time.
• Panel data: a combination of the two types above.

Data Sources
• Experimental data
• Non-experimental data
Regression Relationships

Regression Relationship
Regression studies the dependence of one economic variable
(dependent variable) on one or more other economic variables
(independent variables, explanatory variables) based on the idea of
estimating the mean value of the dependent variable given the
known values of the independent variables.
Thus:
• Independent variables have predetermined values
• The dependent variable is a random quantity following probability
distribution laws.
Regression Relationships
Terminology for the two types of variables in econometrics:

Dependent variable        Independent variable
Explained variable        Explanatory variable
Regressand                Regressor
Response                  Stimulus
Endogenous variable       Exogenous variable
Predictand                Predictor
Regression Relationships

Note: Distinguish regression relationships from other types of
relationships: causal relationships, correlation relationships, and
functional relationships.
• Function: Y = f (X )
• Regression function: Y = f (X ) + U where U is the error term.
Why does the error term U always exist in a regression model?
• Because not all factors affecting the dependent variable Y are
known
• Because not all factors affecting Y can be included in the model
(would make the model complex)
• Because not all necessary data is available
• Due to errors and inaccuracies in data collection.
Regression Relationships

Population Regression Function (PRF)


This is the regression function constructed based on data from all
objects to be studied

Yi = f (X1i , X2i , . . . , Xki ) + Ui

or
E(Y |X1i , X2i , . . . , Xki ) = f (X1i , X2i , . . . , Xki ),
where
• Y : dependent variable
• Yi : specific actual value of the dependent variable
• X1 , X2 , . . . , Xk : independent variables
• X1i , X2i , . . . , Xki : specific values of the independent variables
• Ui : random error term corresponding to the i-th observation.
Regression Relationships

Sample Regression Function (SRF)


In practice, it is very difficult to study the entire population, so typically
a regression function is constructed based on a sample

Yi = f (X1i , X2i , . . . , Xki ) + ei

or
Ŷi = f (X1i , X2i , . . . , Xki ),
where ei is the error term in the sample, the residual, an estimate of
Ui .
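To make the PRF/SRF distinction concrete, here is a simulation sketch (the data-generating process and the use of statsmodels are assumptions of this note): data are generated from a known PRF, the SRF is estimated by ordinary least squares, and the residuals e_i serve as estimates of the U_i:

```python
import numpy as np
import statsmodels.api as sm

# Simulate from a known PRF: Y_i = 2 + 0.5 * X_i + U_i (coefficients illustrative).
rng = np.random.default_rng(42)
n = 200
X = rng.uniform(0, 10, size=n)
U = rng.normal(scale=1.0, size=n)  # unobservable random errors
Y = 2.0 + 0.5 * X + U

# Estimate the SRF by ordinary least squares.
results = sm.OLS(Y, sm.add_constant(X)).fit()
print(results.params)             # estimates close to (2.0, 0.5)

e = results.resid                 # residuals e_i, estimates of the U_i
Y_hat = results.fittedvalues      # the SRF evaluated at the sample points
print(np.allclose(Y, Y_hat + e))  # True: Y_i = Y_hat_i + e_i
```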
