2A2. Review of Probability

ECO2001 Econometrics I, 2024/2025-1

FUNDAMENTALS OF PROBABILITY

Random variable and probability distribution


The probability of an event is the proportion of times the event will occur in repeated
trials of an experiment.

A random variable (Y) is a variable whose value (y) is determined by the outcome of a
chance experiment. A discrete random variable takes on only a finite, or countable,
number of values. A continuous random variable can take any value in some interval of
values.

A probability density function (pdf) of a random variable, 𝑓𝑌(𝑦) ≥ 0, summarizes the probabilities of possible outcomes. For example, 𝑓𝑌(𝑦) = 𝑃(𝑌 = 𝑦) for discrete Y (so that 0 ≤ 𝑓𝑌(𝑦) ≤ 1), and 𝑃(𝑎 ≤ 𝑌 ≤ 𝑏) = ∫ₐᵇ 𝑓𝑌(𝑦)𝑑𝑦 for continuous Y.

[Figure: pdf 𝑓𝑌(𝑦) of a discrete Y (left) and of a continuous Y (right); the shaded area under the continuous pdf is P(a ≤ Y ≤ 𝑏).]

The cumulative distribution function (cdf) of the random variable Y, 𝐹𝑌(𝑦), gives the probability that Y is less than or equal to a specific value y, that is, 𝐹𝑌(𝑦) = 𝑃(𝑌 ≤ 𝑦).
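
A minimal Python sketch (arbitrary distribution and interval) checking that P(a ≤ Y ≤ b) = F(b) − F(a) matches the integral of the pdf over [a, b]:

    # P(a <= Y <= b) as a difference of cdf values, using an arbitrary normal Y
    from scipy.stats import norm
    from scipy.integrate import quad

    mu, sigma = 2.0, 1.5          # arbitrary parameters
    a, b = 1.0, 4.0               # arbitrary interval

    # Via the cdf: F(b) - F(a)
    p_cdf = norm.cdf(b, loc=mu, scale=sigma) - norm.cdf(a, loc=mu, scale=sigma)

    # The same probability by integrating the pdf over [a, b]
    p_pdf, _ = quad(lambda y: norm.pdf(y, loc=mu, scale=sigma), a, b)

    print(p_cdf, p_pdf)           # the two values agree (up to numerical error)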

A probability distribution can often be summarized in terms of a few of its


characteristics, known as the moments of the distribution.

EXPECTED VALUE / MEAN [measure of central tendency] {first moment}
The expected value of a random variable is the average value that occurs in many
repeated trials of an experiment.
𝐸(𝑌) = 𝜇𝑌 = ∑𝑦 𝑦𝑓𝑌(𝑦)
This is a weighted average of all possible values of Y, with weights being the
probabilities that those values occur.

If Y represents some variable in a population, it is called the population mean.
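
A minimal Python sketch, using a made-up pmf, of the expected value as a probability-weighted average:

    import numpy as np

    # Hypothetical discrete distribution: values of Y and probabilities f_Y(y)
    y = np.array([0, 1, 2, 3])
    f = np.array([0.1, 0.2, 0.4, 0.3])   # probabilities sum to 1

    # E(Y) = sum over y of y * f_Y(y): a weighted average of the possible values
    mu = np.sum(y * f)
    print(mu)   # 1.9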

Properties of expected value


Given any constants a, b and c,
1. E(c) = c, for any constant c
2. E(cY) = cE(Y) , for any constant c
3. E(aY + b) = aE(Y) + b, for any constants a, b

VARIANCE [measure of variability] {second moment}


The variance of a random variable measures the spread of the values of Y around its
expected value.

𝑉𝑎𝑟(𝑌) = 𝜎𝑌² = ∑𝑦[𝑦 − 𝐸(𝑌)]²𝑓𝑌(𝑦) = 𝐸[𝑌 − 𝐸(𝑌)]² = 𝐸(𝑌²) − [𝐸(𝑌)]²


The larger the variance of a random variable, the more “spread out” its values are.
[Figure: pdf of Y, illustrating spread around the mean.]
If Y is some random variable from a population, it is called the population variance.

The square root of the variance is the standard deviation, sd(𝑌) = 𝜎𝑌.

Properties of variance
(a) Var(a) = 0, for any constant a
(b) Var(aY) = a2Var(Y), for any constant a
(c) Var(aY + b) = a2Var(Y), for any constants a, b
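
Continuing the made-up pmf from above, a sketch computing the variance by both equivalent expressions and checking property (c):

    import numpy as np

    y = np.array([0, 1, 2, 3])
    f = np.array([0.1, 0.2, 0.4, 0.3])
    mu = np.sum(y * f)                       # E(Y) = 1.9

    # Two equivalent expressions for the variance
    var1 = np.sum((y - mu) ** 2 * f)         # E[Y - E(Y)]^2
    var2 = np.sum(y ** 2 * f) - mu ** 2      # E(Y^2) - [E(Y)]^2
    print(var1, var2)                        # identical: 0.89

    # Property (c): Var(aY + b) = a^2 Var(Y); the shift b drops out
    a, b = 3.0, 5.0
    z = a * y + b
    var_z = np.sum((z - np.sum(z * f)) ** 2 * f)
    print(var_z, a ** 2 * var1)              # both 8.01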

▶ Standardizing a random variable: Given a random variable Y with mean 𝜇𝑌 and
variance 𝜎𝑌², a standardized random variable with mean 0 and variance 1 can be
defined as:

𝑍 = [𝑌 − 𝐸(𝑌)]/√𝑉𝑎𝑟(𝑌) = (𝑌 − 𝜇𝑌)/𝜎𝑌, with

𝐸(𝑍) = [𝐸(𝑌) − 𝜇𝑌]/𝜎𝑌 = 0 and 𝑉𝑎𝑟(𝑍) = 𝑉𝑎𝑟(𝑌)/𝜎𝑌² = 1

Standardization is useful in formulating a test statistic in hypothesis testing.
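
A simulation sketch (arbitrary mean and standard deviation) verifying that the standardized variable has mean 0 and variance 1:

    import numpy as np

    rng = np.random.default_rng(0)
    y = rng.normal(loc=10.0, scale=4.0, size=100_000)   # arbitrary mu and sigma

    # Z = (Y - E(Y)) / sd(Y), using the known population parameters
    z = (y - 10.0) / 4.0
    print(z.mean(), z.var())   # approximately 0 and 1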

SKEWNESS [measure of symmetry of a distribution around the mean] {third moment}


skewness = 𝐸[𝑌 − 𝐸(𝑌)]³/𝜎³
  < 0: left skewed (with long left tail)
  = 0: symmetric
  > 0: right skewed (with long right tail)

KURTOSIS [measure of thickness of tails of a distribution] {fourth moment}


kurtosis = 𝐸[𝑌 − 𝐸(𝑌)]⁴/𝜎⁴
  < 3: thinner tails, with a smaller chance of extreme values than the normal distribution
  = 3: e.g., normal distribution
  > 3: thicker tails, with a greater chance of extreme values than the normal distribution (e.g., t distribution)
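
A simulation sketch of these moments using scipy; note that scipy.stats.kurtosis reports excess kurtosis by default, so fisher=False is needed for the "normal = 3" convention used here:

    import numpy as np
    from scipy.stats import skew, kurtosis

    rng = np.random.default_rng(0)
    z = rng.standard_normal(200_000)           # normal: skewness 0, kurtosis 3
    t = rng.standard_t(df=5, size=200_000)     # t(5): symmetric but thicker tails

    # fisher=False reports kurtosis on the "normal = 3" scale used above
    print(skew(z), kurtosis(z, fisher=False))  # ~0 and ~3
    print(skew(t), kurtosis(t, fisher=False))  # ~0 and > 3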

Joint distributions, conditional distributions, and independence


These are related to the occurrence of events involving more than one random variable.
Joint distribution: 𝑓𝑋,𝑌(𝑥, 𝑦)
If X and Y are independent, then 𝑓𝑋,𝑌(𝑥, 𝑦) = 𝑓𝑋(𝑥)𝑓𝑌(𝑦), with 𝑓𝑋(∙) and 𝑓𝑌(∙)
being the marginal probability density functions

In econometrics, we consider the probability that the error terms 𝑢𝑖 and 𝑢𝑗 take particular values at the same time.

Conditional distribution: 𝑓𝑌|𝑋(𝑦|𝑥) = 𝑓𝑋,𝑌(𝑥, 𝑦)/𝑓𝑋(𝑥)


In econometrics, we consider the probability distribution of 𝑢𝑖 conditional on the values of a certain factor 𝑋𝑘.
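
A sketch with a made-up joint pmf for discrete X and Y, recovering the marginals, checking independence, and forming a conditional distribution:

    import numpy as np

    # Hypothetical joint pmf f_{X,Y}(x, y): rows index x, columns index y
    f_xy = np.array([[0.10, 0.20],
                     [0.30, 0.40]])

    f_x = f_xy.sum(axis=1)   # marginal pmf of X: [0.3, 0.7]
    f_y = f_xy.sum(axis=0)   # marginal pmf of Y: [0.4, 0.6]

    # Independence check: does f_{X,Y}(x, y) = f_X(x) f_Y(y) everywhere?
    print(np.allclose(f_xy, np.outer(f_x, f_y)))   # False: X and Y are dependent

    # Conditional distribution of Y given X = x (row 0): f_{X,Y}(x, y) / f_X(x)
    print(f_xy[0] / f_x[0])   # [1/3, 2/3]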
COVARIANCE [measure of linear association between 2 random variables]
𝐶𝑜𝑣(𝑋, 𝑌) = 𝜎𝑋𝑌 = 𝐸{[𝑋 − 𝐸(𝑋)][𝑌 − 𝐸(𝑌)]} = 𝐸(𝑋𝑌) − 𝐸(𝑋)𝐸(𝑌)

This is positive (negative) if the two random variables move in the same direction
(opposite directions).

If X and Y are some random variables from a population, it is called the population
covariance.

Properties of covariance
(a) Cov(a, d) = 0, for any constants a, d
(b) Cov(aX , cY) = acCov(X,Y), for any constants a, c
(c) Cov(aX + b, cY + d) = acCov(X, Y), for any constants a, b, c, d
(d) Cov(X, Y) = 0 if X and Y are independent (but the reverse does not necessarily hold)
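
A simulation sketch of the caveat in property (d): with Y = X² and symmetric X (a standard textbook construction), the covariance is approximately zero even though Y is completely determined by X:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(500_000)
    y = x ** 2                      # Y is a deterministic function of X: dependent

    # Cov(X, Y) = E(XY) - E(X)E(Y); for symmetric X and Y = X^2 this is ~0
    cov = np.mean(x * y) - np.mean(x) * np.mean(y)
    print(cov)                      # approximately 0, despite full dependence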

CORRELATION COEFFICIENT [unit free measurement]


Interpreting the actual value of 𝜎𝑋𝑌 is difficult because X and Y may have different units
of measurement. Scaling the covariance by the standard deviations of the variables
eliminates the units of measurement, and defines the correlation between X and Y:

−1 ≤ 𝐶𝑜𝑟𝑟(𝑋, 𝑌) = 𝜌𝑋𝑌 = 𝐶𝑜𝑣(𝑋, 𝑌)/√[𝑉𝑎𝑟(𝑋)𝑉𝑎𝑟(𝑌)] = 𝜎𝑋𝑌/(𝜎𝑋𝜎𝑌) ≤ 1

A unit-free measure:
  +1: perfect positive linear association
  −1: perfect negative linear association
  0: no linear association (uncorrelated; as with covariance, this does not imply independence)
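
A simulation sketch of the unit-free property: rescaling a variable (changing its units) changes the covariance but leaves the correlation unchanged:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(100_000)
    y = 2.0 * x + rng.standard_normal(100_000)   # positively related to x

    r = np.corrcoef(x, y)[0, 1]
    # Changing units (e.g., measuring x in cm instead of m) rescales the
    # covariance, but the correlation coefficient is unaffected
    r_scaled = np.corrcoef(100.0 * x, y)[0, 1]
    print(r, r_scaled)   # identical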

More properties of expected value and variance
1. E(aX + bY + c) = aE(X) + b E(Y) + c
1’. If {𝑎1, 𝑎2, …, 𝑎𝑁} are constants and {𝑌1, 𝑌2, …, 𝑌𝑁} are random variables, then
𝐸(∑ᵢ₌₁ᴺ 𝑎𝑖𝑌𝑖) = 𝐸(𝑎1𝑌1 + 𝑎2𝑌2 + ⋯ + 𝑎𝑁𝑌𝑁) = 𝑎1𝐸(𝑌1) + 𝑎2𝐸(𝑌2) + ⋯ + 𝑎𝑁𝐸(𝑌𝑁) = ∑ᵢ₌₁ᴺ 𝑎𝑖𝐸(𝑌𝑖).
[The expected value of a sum is the sum of the expected values.]
2. E(XY) = E(X)E(Y) if X and Y are independent.
3. Var(𝑎𝑋 ± 𝑏𝑌 ± 𝑐𝑍) = 𝑎2Var(𝑋) + 𝑏2Var(𝑌) + 𝑐2Var(𝑍)
±2𝑎𝑏Cov(𝑋, 𝑌) ± 2𝑎𝑐Cov(𝑋, 𝑍) ± 2𝑏𝑐Cov(𝑌, 𝑍)
3’. Var(∑ᵢ₌₁ᴺ 𝑎𝑖𝑌𝑖) = ∑ᵢ₌₁ᴺ 𝑎𝑖²Var(𝑌𝑖) + 2 ∑ᵢ<ⱼ 𝑎𝑖𝑎𝑗Cov(𝑌𝑖, 𝑌𝑗)
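
A simulation sketch (with an arbitrary pair of correlated variables) verifying property 3 for Var(aX + bY):

    import numpy as np

    rng = np.random.default_rng(0)
    # Two correlated random variables (the construction is an arbitrary example)
    x = rng.standard_normal(500_000)
    y = 0.5 * x + rng.standard_normal(500_000)

    a, b = 2.0, -3.0
    c = np.cov(x, y)       # 2x2 matrix: Var(X), Var(Y) on the diagonal, Cov(X, Y) off it
    lhs = np.var(a * x + b * y, ddof=1)
    rhs = a**2 * c[0, 0] + b**2 * c[1, 1] + 2 * a * b * c[0, 1]
    print(lhs, rhs)        # agree up to simulation error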

CONDITIONAL EXPECTATION
𝐸(𝑌|𝑋 = 𝑥) = 𝜇𝑌|𝑋=𝑥 = ∑𝑦 𝑦𝑓𝑌|𝑋(𝑦|𝑥)

Properties of conditional expectation


1. E[c(X)|X] = c(X) [If we know X, then we also know c(X).]
2. E{[a(X)Y + b(X)]|X} = a(X)E(Y|X) + b(X), for any functions a(∙) and b(∙)
3. E(Y|X) = E(Y) if X and Y are independent.

CONDITIONAL VARIANCE

Var(𝑌|𝑋 = 𝑥) = ∑𝑦[𝑦 − 𝐸(𝑌|𝑋 = 𝑥)]²𝑓𝑌|𝑋(𝑦|𝑥)

Properties of conditional variance


1. Var(Y|X) = Var(Y) if X and Y are independent.
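
A simulation sketch (a made-up design in which E(Y|X = x) = 2x and Var(Y|X = x) = 1): the sample analogues of the conditional mean and variance are the within-group statistics:

    import numpy as np

    rng = np.random.default_rng(0)
    # Discrete X and a Y whose distribution depends on X
    x = rng.integers(0, 3, size=300_000)          # X takes values 0, 1, 2
    y = 2.0 * x + rng.standard_normal(300_000)    # E(Y|X=x) = 2x, Var(Y|X=x) = 1

    # Sample analogues of E(Y|X=x) and Var(Y|X=x): statistics within each X-group
    for val in (0, 1, 2):
        group = y[x == val]
        print(val, group.mean(), group.var(ddof=1))   # means ~0, 2, 4; variances ~1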

Standard distributions
NORMAL DISTRIBUTION

pdf of Y: 𝑓𝑌(𝑦) = (1/𝜎𝑌√2𝜋) 𝑒𝑥𝑝[−(𝑦 − 𝜇𝑌)²/2𝜎𝑌²], −∞ < y < ∞,

where 𝜇𝑌 = 𝐸(𝑌) and 𝜎𝑌² = Var(𝑌) are the parameters of the distribution: 𝑌~𝑁(𝜇𝑌, 𝜎𝑌²)

[Figure: pdf of the normal distribution.]

1. Symmetric distribution: skewness = 0
2. kurtosis = 3
3. 𝑍 = (𝑌 − 𝜇𝑌)/𝜎𝑌 ~ 𝑁(0,1) is a standard normal random variable
4. Any linear combination of independent normal random variables has a normal distribution:
𝑌 = 𝑎𝑌1 + 𝑏𝑌2 ~ 𝑁(𝜇𝑌 = 𝑎𝜇1 + 𝑏𝜇2, 𝜎𝑌² = 𝑎²𝜎1² + 𝑏²𝜎2²)
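
A simulation sketch (arbitrary parameters) checking the mean and variance of the linear combination in property 4:

    import numpy as np

    rng = np.random.default_rng(0)
    # Independent normals with arbitrary parameters
    y1 = rng.normal(1.0, 2.0, size=500_000)    # N(mu1 = 1, sigma1^2 = 4)
    y2 = rng.normal(-2.0, 3.0, size=500_000)   # N(mu2 = -2, sigma2^2 = 9)

    a, b = 2.0, 0.5
    y = a * y1 + b * y2
    # Property 4: Y ~ N(a*mu1 + b*mu2, a^2 sigma1^2 + b^2 sigma2^2) = N(1, 18.25)
    print(y.mean(), y.var())                   # ~1 and ~18.25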

CHI-SQUARE DISTRIBUTION

Let 𝑍𝑖~𝑁(0,1), i = 1, …, N, be N independent random variables.

𝑋 = ∑ᵢ₌₁ᴺ 𝑍𝑖² ~ 𝜒𝑁², where N is the degrees of freedom (df), with 𝐸(𝑋) = 𝑁, Var(𝑋) = 2𝑁

[Figure: pdf of the chi-square distribution.]
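
A simulation sketch: summing N squared standard normal draws reproduces E(X) = N and Var(X) = 2N:

    import numpy as np

    rng = np.random.default_rng(0)
    N = 5                                      # degrees of freedom
    z = rng.standard_normal((200_000, N))      # N independent N(0,1) draws per row
    x = (z ** 2).sum(axis=1)                   # each row sum: one chi-square(N) draw

    print(x.mean(), x.var())                   # ~N = 5 and ~2N = 10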

t DISTRIBUTION

Let 𝑍~𝑁(0,1) and 𝑋~𝜒𝑁² be two independent random variables.

𝑇 = 𝑍/√(𝑋/𝑁) ~ 𝑡𝑁, with 𝐸(𝑇) = 0, Var(𝑇) = 𝑁/(𝑁 − 2) (for N > 2)

It is symmetric, and has thicker tails than the standard normal distribution.

[Figure: pdf of the t distribution.]

F DISTRIBUTION

Let 𝑋1~𝜒𝑁₁² and 𝑋2~𝜒𝑁₂² be two independent random variables.

𝐹 = (𝑋1/𝑁1)/(𝑋2/𝑁2) ~ 𝐹𝑁1,𝑁2, where N1: numerator df, N2: denominator df,

𝐸(𝐹) = 𝑁2/(𝑁2 − 2) (for N2 > 2), Var(𝐹) = 2𝑁2²(𝑁1 + 𝑁2 − 2)/[𝑁1(𝑁2 − 2)²(𝑁2 − 4)] (for N2 > 4)

𝑡𝑁² = 𝐹1,𝑁 (the square of a 𝑡𝑁 random variable has an 𝐹1,𝑁 distribution)

[Figure: pdf of the F distribution.]
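
A quick check of 𝑡𝑁² = 𝐹1,𝑁 using scipy critical values (N = 20 is an arbitrary choice):

    import numpy as np
    from scipy.stats import t, f

    N = 20                                   # denominator df (arbitrary choice)
    # Two-sided t critical value at 5% vs. F(1, N) critical value at 5%:
    t_crit = t.ppf(0.975, df=N)              # P(|T| > t_crit) = 0.05
    f_crit = f.ppf(0.95, dfn=1, dfd=N)       # P(F > f_crit) = 0.05

    print(t_crit ** 2, f_crit)               # equal: t_N^2 = F(1, N)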

For large df, the chi-square, t and F distributions converge to the normal distribution.
