
Lecture 1

Distributions and Normal Random Variables

1 Random variables

1.1 Basic Definitions


Given a random variable X, we define a cumulative distribution function (cdf), FX : R → [0, 1], such that
FX(t) = P{X ≤ t} for all t ∈ R. Here P{X ≤ t} denotes the probability that X ≤ t. To emphasize that the
random variable X has cdf FX, we write X ∼ FX. Note that FX(t) is a nondecreasing function of t.
There are 3 types of random variables: discrete, continuous, and mixed.
A discrete random variable X is characterized by a list of possible values, X = {x1, ..., xn}, and their
probabilities, p = {p1, ..., pn}, where pi denotes the probability that X takes the value xi, i.e. pi = P{X = xi}
for all i = 1, ..., n. Note that p1 + ... + pn = 1 and pi ≥ 0 for all i = 1, ..., n by the definition of probability.
Then the cdf of X is given by

FX(t) = Σ_{j=1,...,n: xj ≤ t} pj.
A continuous random variable Y is characterized by its probability density function (pdf), fY : R → R,
such that P{a < Y ≤ b} = ∫_a^b fY(s)ds. Note that ∫_{−∞}^{+∞} fY(s)ds = 1 and fY(s) ≥ 0 for all s ∈ R by
the definition of probability. Then the cdf of Y is given by FY(t) = ∫_{−∞}^t fY(s)ds. By the Fundamental
Theorem of Calculus, fY(t) = dFY(t)/dt.
A random variable is referred to as mixed if it is neither discrete nor continuous.
If the cdf F of some random variable X is strictly increasing and continuous, then it has an inverse,
q(x) = F⁻¹(x), defined for all x ∈ (0, 1). Note that

P{X ≤ q(x)} = P{X ≤ F⁻¹(x)} = F(F⁻¹(x)) = x

for all x ∈ (0, 1). Therefore q(x) is called the x-quantile of X: it is the number such that the random
variable X takes a value smaller than or equal to it with probability x. If F is not strictly increasing or
not continuous, then we define q(x) as the generalized inverse of F, i.e. q(x) = inf{t ∈ R : F(t) ≥ x} for
all x ∈ (0, 1). In other words, q(x) is a number such that F(q(x) + ε) ≥ x and F(q(x) − ε) < x for any
ε > 0. As an exercise, check that P{X ≤ q(x)} ≥ x.
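As a numerical sketch of the generalized inverse (the helper `generalized_inverse_quantile` and the grid-search approach below are illustration choices, not part of the notes), one can compute q(x) = inf{t : F(t) ≥ x} on a fine grid and compare it with SciPy's exact normal quantile:

```python
import numpy as np
from scipy.stats import norm

def generalized_inverse_quantile(F, x, grid):
    """Approximate q(x) = inf{t in grid : F(t) >= x} on a finite grid."""
    # np.argmax on a boolean array returns the index of the first True entry
    return grid[np.argmax(F(grid) >= x)]

grid = np.linspace(-10.0, 10.0, 200_001)
for x in (0.1, 0.5, 0.975):
    print(generalized_inverse_quantile(norm.cdf, x, grid), norm.ppf(x))
```

For a strictly increasing continuous cdf the grid search converges to the ordinary inverse F⁻¹(x) as the grid is refined; for cdfs with flat pieces or jumps it picks out the infimum, exactly as in the definition above.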

1.2 Functions of Random Variables
Suppose we have a random variable X and a function g : R → R. Then we can define another random
variable Y = g(X). The cdf of Y can be calculated as follows:

FY(t) = P{Y ≤ t} = P{g(X) ≤ t} = P{X ∈ g⁻¹((−∞, t])},

where g⁻¹ may be the set-valued inverse of g. The set g⁻¹((−∞, t]) consists of all s ∈ R such that
g(s) ∈ (−∞, t], i.e. g(s) ≤ t. If g is strictly increasing and continuously differentiable, then it has a
strictly increasing and continuously differentiable inverse g⁻¹ defined on the set g(R). In this case
P{X ∈ g⁻¹((−∞, t])} = P{X ≤ g⁻¹(t)} = FX(g⁻¹(t)) for all t ∈ g(R). If, in addition, X is a continuous
random variable, then

fY(t) = dFY(t)/dt = dFX(g⁻¹(t))/dt = (dFX(s)/ds)(dg(s)/ds)⁻¹ |_{s=g⁻¹(t)} = fX(g⁻¹(t)) · (dg(s)/ds |_{s=g⁻¹(t)})⁻¹

for all t ∈ g(R). If t ∉ g(R), then fY(t) = 0.


One important type of function is a linear transformation. If Y = X − a for some a ∈ R, then

FY (t) = P {Y ≤ t} = P {X − a ≤ t} = P {X ≤ t + a} = FX (t + a).

In particular, if X is continuous, then Y is also continuous with fY (t) = fX (t + a). If Y = bX with b > 0,
then
FY (t) = P {bX ≤ t} = P {X ≤ t/b} = FX (t/b).

In particular, if X is continuous, then Y is also continuous with fY (t) = fX (t/b)/b.
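A quick numerical check of these two formulas, using SciPy's `norm` for the density of X ∼ N(0, 1) (the values of a and b are arbitrary illustration choices):

```python
import numpy as np
from scipy.stats import norm

a, b = 1.0, 2.5
t = np.linspace(-5.0, 5.0, 11)

# Shift: Y = X - a has density fY(t) = fX(t + a), i.e. Y ~ N(-a, 1).
print(np.allclose(norm.pdf(t + a), norm(loc=-a).pdf(t)))       # True

# Scale: Y = bX has density fY(t) = fX(t/b)/b, i.e. Y ~ N(0, b^2).
print(np.allclose(norm.pdf(t / b) / b, norm(scale=b).pdf(t)))  # True
```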

1.3 Expected Value


Informally, the expected value of some random variable can be interpreted as its average. Formally, if X is
a random variable and g : R → R is some function, then, by definition,

E[g(X)] = Σ_i g(xi)pi

for discrete random variables and

E[g(X)] = ∫_{−∞}^{+∞} g(x)fX(x)dx

for continuous random variables.


Expected values for some functions g deserve special names:

• mean: g(x) = x, E[X]

• second moment: g(x) = x², E[X²]

• variance: g(x) = (x − E[X])², E[(X − E[X])²]

• k-th moment: g(x) = x^k, E[X^k]

• k-th central moment: g(x) = (x − E[X])^k, E[(X − E[X])^k]

The variance of a random variable X is commonly denoted by V(X).
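These definitions translate directly into code. A minimal sketch, using a fair six-sided die for the discrete case and U(0, 1) for the continuous case (both are illustrative choices):

```python
import numpy as np
from scipy import integrate

# Discrete: fair die, xs = {1,...,6}, each with probability 1/6.
xs = np.arange(1, 7)
ps = np.full(6, 1 / 6)
mean = np.sum(xs * ps)                # E[X] = 3.5
second = np.sum(xs**2 * ps)           # E[X^2] = 91/6
var = np.sum((xs - mean) ** 2 * ps)   # V(X) = E[(X - E[X])^2] = 35/12
print(mean, second, var)

# Continuous: X ~ U(0,1) with f(x) = 1 on (0,1); E[g(X)] = integral of g(x)f(x)dx.
second_moment, _ = integrate.quad(lambda x: x**2 * 1.0, 0.0, 1.0)
print(second_moment)                  # 1/3
```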

1.3.1 Properties of expectation

1) For any constant a (non-random), E[a] = a.


2) The most useful property of an expectation is its linearity: if X and Y are two random variables and
a and b are two constants, then E[aX + bY ] = aE[X] + bE[Y ].
3) If X is a random variable, then V(X) = E[X²] − (E[X])². Indeed,

V(X) = E[(X − E[X])²]
     = E[X² − 2XE[X] + (E[X])²]
     = E[X²] − E[2XE[X]] + E[(E[X])²]
     = E[X²] − 2E[X]E[X] + (E[X])²
     = E[X²] − (E[X])².

4) If X is a random variable and a is a constant, then V(aX) = a²V(X) and V(X + a) = V(X).
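A short Monte Carlo check of properties 3) and 4) (the exponential distribution, sample size, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)

# Property 3: V(X) = E[X^2] - (E[X])^2
print(np.var(x), np.mean(x**2) - np.mean(x) ** 2)  # the two values agree

# Property 4: V(aX) = a^2 V(X) and V(X + a) = V(X)
a = 3.0
print(np.var(a * x), a**2 * np.var(x))
print(np.var(x + a), np.var(x))
```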

1.4 Examples of Random Variables


Discrete random variables:

• Bernoulli(p): random variable X has a Bernoulli(p) distribution if it takes values from X = {0, 1},
P{X = 0} = 1 − p and P{X = 1} = p. Its expectation E[X] = 1·p + 0·(1 − p) = p. Its second moment
E[X²] = 1²·p + 0²·(1 − p) = p. Thus, its variance V(X) = E[X²] − (E[X])² = p − p² = p(1 − p).
Notation: X ∼ Bernoulli(p).

• Poisson(λ): random variable X has a Poisson(λ) distribution if it takes values from X = {0, 1, 2, ...}
and P{X = j} = e^{−λ}λ^j/j!. As an exercise, check that E[X] = λ and V(X) = λ. Notation: X ∼
Poisson(λ).

Continuous random variables:

• Uniform (a, b): random variable X has a Uniform(a, b) distribution if its density fX (x) = 1/(b − a) for
x ∈ (a, b) and fX (x) = 0 otherwise. Notation: X ∼ U (a, b).

• Normal(µ, σ²): random variable X has a Normal(µ, σ²) distribution if its density fX(x) = exp(−(x −
µ)²/(2σ²))/(√(2π)σ) for all x ∈ R. Its expectation E[X] = µ and its variance V(X) = σ². Notation:
X ∼ N(µ, σ²). As an exercise, check that if X ∼ N(µ, σ²), then Y = (X − µ)/σ ∼ N(0, 1). Y is
said to have a standard normal distribution. It is known that the cdf of N(µ, σ²) is not analytical,
i.e. it cannot be written as a composition of simple functions. However, there exist tables that give
its approximate values. The cdf of a standard normal distribution is commonly denoted by Φ, i.e. if
Y ∼ N(0, 1), then FY(t) = P{Y ≤ t} = Φ(t).
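SciPy implements all four of these distributions, so the stated means and variances can be verified directly (the parameter values below are arbitrary illustration choices):

```python
from scipy import stats

print(stats.bernoulli(0.3).mean(), stats.bernoulli(0.3).var())  # p, p(1-p): 0.3, 0.21
print(stats.poisson(4.0).mean(), stats.poisson(4.0).var())      # lambda, lambda: 4.0, 4.0
print(stats.uniform(loc=2, scale=3).mean())                     # (a+b)/2 for U(2, 5): 3.5
print(stats.norm(loc=1, scale=2).var())                         # sigma^2 = 4.0

# Phi has no closed form; scipy evaluates the standard normal cdf numerically.
print(stats.norm.cdf(1.96))                                     # about 0.975
```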

2 Bivariate (multivariate) distributions

2.1 Joint, marginal, conditional


If X and Y are two random variables, then FX,Y(x, y) = P{X ≤ x, Y ≤ y} denotes their joint cdf. X and Y
are said to have joint pdf fX,Y if fX,Y(x, y) ≥ 0 for all x, y ∈ R and

FX,Y(x, y) = ∫_{−∞}^x ∫_{−∞}^y fX,Y(s, t)dtds.

Under some mild regularity conditions (for example, if fX,Y(x, y) is continuous),

fX,Y(x, y) = ∂²FX,Y(x, y)/(∂x∂y).

From the joint pdf fX,Y one can calculate the pdf of, say, X. Indeed,

FX(x) = P{X ≤ x} = ∫_{−∞}^x ∫_{−∞}^{+∞} fX,Y(s, t)dtds.

Therefore fX(s) = ∫_{−∞}^{+∞} fX,Y(s, t)dt. The pdf of X is called marginal to emphasize that it comes from
the joint pdf of X and Y.
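As a numerical illustration of marginalization, the sketch below integrates out the second argument of a standard bivariate normal joint pdf (the correlation value 0.5 is an arbitrary choice) and recovers the N(0, 1) marginal density:

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

rho = 0.5  # illustrative correlation

def f_xy(x, y):
    """Joint pdf of a standard bivariate normal with correlation rho."""
    q = (x**2 - 2 * rho * x * y + y**2) / (1 - rho**2)
    return np.exp(-q / 2) / (2 * np.pi * np.sqrt(1 - rho**2))

# f_X(s) = integral over t of f_{X,Y}(s, t): should match the N(0,1) pdf.
for s in (-1.0, 0.0, 2.0):
    marginal, _ = integrate.quad(lambda t: f_xy(s, t), -np.inf, np.inf)
    print(marginal, norm.pdf(s))  # the two columns agree
```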
If X and Y have a joint pdf, then we can define the conditional pdf of Y given X = x (for x such that
fX(x) > 0): fY|X(y|x) = fX,Y(x, y)/fX(x). The conditional pdf is a full characterization of how Y is
distributed for any given X = x. The probability that Y ∈ A for some set A given that X = x can
be calculated as P{Y ∈ A|X = x} = ∫_A fY|X(y|x)dy. In a similar manner we can calculate the conditional
expectation of Y given X = x: E[Y|X = x] = ∫_{−∞}^{+∞} y fY|X(y|x)dy. As an exercise, think about how to
define the conditional distribution of Y given X = x when X and Y are discrete random variables.
Two extremely useful properties of conditional expectation are: for any random variables X and Y,

• E[f(X)Y|X = x] = f(x)E[Y|X = x];

• the law of iterated expectations: E[E[Y|X]] = E[Y], where E[Y|X] denotes the random variable obtained by evaluating E[Y|X = x] at x = X.
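The law of iterated expectations is easy to see in a simulation. A minimal sketch, using the illustrative model X ∼ U(0, 1) and Y|X = x ∼ N(2x, 1), so that E[Y|X = x] = 2x:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x = rng.uniform(size=n)
y = rng.normal(loc=2 * x, scale=1.0)  # given X = x, Y ~ N(2x, 1)

# Law of iterated expectations: E[E[Y|X]] = E[Y] (here E[Y|X] = 2X).
print(np.mean(2 * x), np.mean(y))     # both close to 2 E[X] = 1
```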

2.2 Independence
Random variables X and Y are said to be independent if fY|X(y|x) = fY(y) for all x ∈ R, i.e. if the marginal
pdf of Y equals the conditional pdf of Y given X = x for all x ∈ R. Note that fY|X(y|x) = fY(y) if and only if
fX,Y(x, y) = fX(x)fY(y). If X and Y are independent, then g(X) and f(Y) are also independent for any
functions g : R → R and f : R → R. In addition, if X and Y are independent, then E[XY] = E[X]E[Y].

Indeed,
E[XY] = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} xy fX,Y(x, y) dxdy
      = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} xy fX(x)fY(y) dxdy
      = (∫_{−∞}^{+∞} x fX(x) dx)(∫_{−∞}^{+∞} y fY(y) dy)
      = E[X]E[Y].

2.3 Covariance
For any two random variables X and Y we can define the covariance as

cov(X, Y ) = E[(X − E[X])(Y − E[Y ])].

As an exercise, check that cov(X, Y ) = E[XY ] − E[X]E[Y ].


Covariances have several useful properties:

1. cov(X, Y) = 0 whenever X and Y are independent

2. cov(aX, bY) = ab cov(X, Y) for any random variables X and Y and any constants a and b

3. cov(X + a, Y) = cov(X, Y) for any random variables X and Y and any constant a

4. cov(X, Y) = cov(Y, X) for any random variables X and Y

5. |cov(X, Y)| ≤ √(V(X)V(Y)) for any random variables X and Y

6. V(X + Y) = V(X) + V(Y) + 2cov(X, Y) for any random variables X and Y

7. V(Σ_{i=1}^n Xi) = Σ_{i=1}^n V(Xi) whenever X1, ..., Xn are independent

To prove property 5, consider the random variable X − aY with a = cov(X, Y)/V(Y). On the one hand, its
variance V(X − aY) ≥ 0. On the other hand,

V(X − aY) = V(X) − 2a cov(X, Y) + a²V(Y)
          = V(X) − 2(cov(X, Y))²/V(Y) + (cov(X, Y))²/V(Y)
          = V(X) − (cov(X, Y))²/V(Y).

Thus, the last expression is nonnegative as well. Multiplying it by V(Y) yields the result.

The correlation of two random variables X and Y is defined by corr(X, Y) = cov(X, Y)/√(V(X)V(Y)).
By property 5 above, |corr(X, Y)| ≤ 1. If |corr(X, Y)| = 1, then X and Y are linearly dependent, i.e.
there exist constants a and b such that X = a + bY.
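The covariance identity, properties 1 and 6, and the |corr| = 1 case can all be checked numerically (the distributions, seed, and the linear map z = 3 − 2x below are illustration choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
x = rng.normal(size=n)
y = rng.normal(size=n)  # drawn independently of x

# cov(X, Y) = E[XY] - E[X]E[Y]; close to 0 for independent samples (property 1)
print(np.mean(x * y) - np.mean(x) * np.mean(y))

# Property 6: V(X + Y) = V(X) + V(Y) + 2 cov(X, Y)
cov_xy = np.cov(x, y, ddof=0)[0, 1]
print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov_xy)

# |corr(X, Z)| = 1 when Z is a linear function of X
z = 3.0 - 2.0 * x
print(np.corrcoef(x, z)[0, 1])  # -1.0
```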

3 Normal Random Variables

Let us begin with the definition of a multivariate normal distribution. Let Σ be a positive definite n × n
matrix. Remember that the n × n matrix Σ is positive definite if aᵀΣa > 0 for any non-zero n × 1 vector a.
Here the superscript T denotes transposition. Let µ be an n × 1 vector. Then X ∼ N(µ, Σ) if X is continuous
and its pdf is given by

fX(x) = exp(−(x − µ)ᵀΣ⁻¹(x − µ)/2) / ((2π)^{n/2} √det(Σ))

for any n × 1 vector x.
A normal distribution has several useful properties:

1. if X ∼ N(µ, Σ), then Σij = cov(Xi, Xj) for any i, j = 1, ..., n, where X = (X1, ..., Xn)ᵀ

2. if X ∼ N(µ, Σ), then µi = E[Xi] for any i = 1, ..., n

3. if X ∼ N(µ, Σ), then any subset of components of X is normal as well. In particular, Xi ∼ N(µi, Σii)

4. if X and Y are jointly normal and uncorrelated, then X and Y are independent. As an exercise,
check this statement

5. if X ∼ N(µX, σX²), Y ∼ N(µY, σY²), and X and Y are independent, then X + Y ∼ N(µX + µY, σX² + σY²)

6. Any linear combination of normals is normal. That is, if X ∼ N(µ, Σ) is an n × 1 dimensional normal
vector, and A is a fixed k × n full-rank matrix with k ≤ n, then Y = AX is a normal k × 1 vector:
Y ∼ N(Aµ, AΣAᵀ).
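Properties 1, 2, and 6 can be verified by simulation. A minimal sketch (the particular µ, Σ, and A below are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])  # positive definite
X = rng.multivariate_normal(mu, Sigma, size=500_000)  # each row is one draw

# Properties 1 and 2: sample mean and covariance recover mu and Sigma.
print(X.mean(axis=0))   # close to mu
print(np.cov(X.T))      # close to Sigma

# Property 6: Y = AX ~ N(A mu, A Sigma A^T) for a fixed full-rank 2 x 3 matrix A.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, -1.0]])
Y = X @ A.T
print(Y.mean(axis=0), A @ mu)        # means agree
print(np.cov(Y.T), A @ Sigma @ A.T)  # covariances agree
```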

3.1 Conditional distribution


Another useful property of a normal distribution is that its conditional distribution is normal as well. If

X = [ X1 ]  ∼  N( [ µ1 ],  [ Σ11  Σ12 ] )
    [ X2 ]        [ µ2 ]   [ Σ21  Σ22 ]

then X1 | X2 = x2 ∼ N(µ̃, Σ̃) with µ̃ = µ1 + Σ12Σ22⁻¹(x2 − µ2) and Σ̃ = Σ11 − Σ12Σ22⁻¹Σ21. If X1 and X2 are both
random variables (as opposed to random vectors), then E[X1|X2 = x2] = µ1 + cov(X1, X2)(x2 − µ2)/V(X2).
Let us prove the last statement. Let

Σ = [ σ11  σ12 ]
    [ σ12  σ22 ]

be the covariance matrix of the 2 × 1 normal random vector X = (X1, X2)ᵀ with mean µ = (µ1, µ2)ᵀ. Note that
Σ12 = Σ21 = σ12 since cov(X1, X2) = cov(X2, X1). From linear algebra, we know that det(Σ) = σ11σ22 − σ12²
and

Σ⁻¹ = (1/det(Σ)) [ σ22   −σ12 ]
                 [ −σ12   σ11 ].

Thus the pdf of X is

fX(x1, x2) = exp{−[(x1 − µ1)²σ22 + (x2 − µ2)²σ11 − 2(x1 − µ1)(x2 − µ2)σ12]/(2 det(Σ))} / (2π √det(Σ)),

and the pdf of X2 is

fX2(x2) = exp{−(x2 − µ2)²/(2σ22)} / √(2πσ22).
Note that

σ11/det(Σ) − 1/σ22 = (σ11σ22 − (σ11σ22 − σ12²))/(det(Σ)σ22) = σ12²/(det(Σ)σ22).
Therefore the conditional pdf of X1, given X2 = x2, is

fX1|X2(x1|x2) = fX(x1, x2)/fX2(x2)
  = exp{−[(x1 − µ1)²σ22 + (x2 − µ2)²σ12²/σ22 − 2(x1 − µ1)(x2 − µ2)σ12]/(2 det(Σ))} / (√(2π) √(det(Σ)/σ22))
  = exp{−[(x1 − µ1)² + (x2 − µ2)²σ12²/σ22² − 2(x1 − µ1)(x2 − µ2)σ12/σ22]/(2 det(Σ)/σ22)} / (√(2π) √(det(Σ)/σ22))
  = exp{−[x1 − µ1 − (x2 − µ2)σ12/σ22]²/(2 det(Σ)/σ22)} / (√(2π) √(det(Σ)/σ22))
  = exp{−(x1 − µ̃)²/(2σ̃)} / (√(2π) √σ̃),

where µ̃ = µ1 + (x2 − µ2)σ12/σ22 and σ̃ = det(Σ)/σ22. Noting that the last expression equals the pdf of a
normal random variable with mean µ̃ and variance σ̃ yields the result.
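The conditional-mean and conditional-variance formulas can also be checked by simulation, conditioning on X2 falling in a narrow window around x2 since the event {X2 = x2} itself has probability zero (µ, Σ, x2, the window width, and the seed below are illustration choices):

```python
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
draws = rng.multivariate_normal(mu, Sigma, size=2_000_000)

x2 = 2.5
near = np.abs(draws[:, 1] - x2) < 0.01  # approximate the event X2 = x2
x1_given = draws[near, 0]

mu_tilde = mu[0] + Sigma[0, 1] * (x2 - mu[1]) / Sigma[1, 1]  # 1.4
sigma_tilde = Sigma[0, 0] - Sigma[0, 1] ** 2 / Sigma[1, 1]   # 1.36 = det(Sigma)/sigma22
print(x1_given.mean(), mu_tilde)
print(x1_given.var(), sigma_tilde)
```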

MIT OpenCourseWare
https://ocw.mit.edu

14.381 Statistical Method in Economics


Fall 2018

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms
