
Transformations and Expectations of random variables

X ∼ FX (x): a random variable X distributed with CDF FX .


Any function Y = g(X) is also a random variable.
If both X and Y are continuous random variables, can we find a simple way to characterize
FY and fY (the CDF and PDF of Y ), based on the CDF and PDF of X?

For the CDF:

F_Y(y) = P_Y(Y \le y)
       = P_Y(g(X) \le y)
       = P_X(\{x \in \mathcal{X} : g(x) \le y\})   (\mathcal{X} is the sample space for X)
       = \int_{\{x \in \mathcal{X} : g(x) \le y\}} f_X(s)\, ds.

PDF: f_Y(y) = F_Y'(y)


Caution: we need to keep track of the support of Y.
Consider several examples:

1. X ∼ U [−1, 1] and Y = exp(X)


That is:

f_X(x) = \begin{cases} \tfrac{1}{2} & \text{if } x \in [-1, 1] \\ 0 & \text{otherwise} \end{cases}

F_X(x) = \tfrac{1}{2} + \tfrac{1}{2}x, \quad \text{for } x \in [-1, 1].

F_Y(y) = \mathrm{Prob}(\exp(X) \le y)
       = \mathrm{Prob}(X \le \log y)
       = F_X(\log y) = \tfrac{1}{2} + \tfrac{1}{2}\log y, \quad \text{for } y \in [\tfrac{1}{e}, e].

Be careful about the bounds of the support!

f_Y(y) = \frac{\partial}{\partial y} F_Y(y)
       = f_X(\log y)\, \frac{1}{y} = \frac{1}{2y}, \quad \text{for } y \in [\tfrac{1}{e}, e].

2. X ∼ U [−1, 1] and Y = X^2

F_Y(y) = \mathrm{Prob}(X^2 \le y)
       = \mathrm{Prob}(-\sqrt{y} \le X \le \sqrt{y})
       = F_X(\sqrt{y}) - F_X(-\sqrt{y})
       = 2 F_X(\sqrt{y}) - 1, \quad \text{by symmetry: } F_X(-\sqrt{y}) = 1 - F_X(\sqrt{y}).

f_Y(y) = \frac{\partial}{\partial y} F_Y(y)
       = 2 f_X(\sqrt{y}) \cdot \frac{1}{2\sqrt{y}} = \frac{1}{2\sqrt{y}}, \quad \text{for } y \in [0, 1].
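Both examples can be checked quickly by simulation. The sketch below (my addition, not part of the original notes; it assumes numpy is available and is only an informal check) draws X ∼ U[−1, 1] and compares histogram estimates of the densities of Y = exp(X) and Y = X^2 with the formulas derived above.

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200_000)

# Example 1: Y = exp(X); derived density f_Y(y) = 1/(2y) on [1/e, e]
y1 = np.exp(x)
hist1, edges1 = np.histogram(y1, bins=50, density=True)
mid1 = 0.5 * (edges1[:-1] + edges1[1:])
print(np.max(np.abs(hist1 - 1.0 / (2.0 * mid1))))           # small (sampling/binning error only)

# Example 2: Y = X^2; derived density f_Y(y) = 1/(2*sqrt(y)) on (0, 1]
y2 = x**2
hist2, edges2 = np.histogram(y2, bins=50, density=True)
mid2 = 0.5 * (edges2[:-1] + edges2[1:])
print(np.max(np.abs(hist2 - 1.0 / (2.0 * np.sqrt(mid2)))))  # largest near y = 0, where f_Y blows up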


As the first example above showed, it’s easy to derive the CDF and PDF of Y when g(·) is
a strictly monotonic function:
Theorems 2.1.3, 2.1.5: When g(·) is a strictly increasing function, then
F_Y(y) = \int_{-\infty}^{g^{-1}(y)} f_X(x)\, dx = F_X(g^{-1}(y))

f_Y(y) = f_X(g^{-1}(y))\, \frac{\partial}{\partial y} g^{-1}(y), \quad \text{using the chain rule.}

Note: by the inverse function theorem,


\frac{\partial}{\partial y} g^{-1}(y) = \frac{1}{g'(x)} \bigg|_{x = g^{-1}(y)}.

When g(·) is a strictly decreasing function, then


F_Y(y) = \int_{g^{-1}(y)}^{\infty} f_X(x)\, dx = 1 - F_X(g^{-1}(y))

f_Y(y) = -f_X(g^{-1}(y))\, \frac{\partial}{\partial y} g^{-1}(y), \quad \text{using the chain rule.}

These are the change of variables formulas for transformations of univariate random variables.
Thm 2.1.8 generalizes this to piecewise monotonic transformations.
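As an illustration of the increasing-g formula (my addition; it assumes numpy and uses a hypothetical helper name pdf_of_increasing_transform), the sketch below codes the change-of-variables formula and checks it against Example 1:

import numpy as np

def pdf_of_increasing_transform(f_X, g_inv, g_inv_prime, y):
    # Change of variables for strictly increasing g:
    # f_Y(y) = f_X(g^{-1}(y)) * d/dy g^{-1}(y)
    return f_X(g_inv(y)) * g_inv_prime(y)

# Example 1 again: X ~ U[-1, 1] and Y = exp(X), so g^{-1}(y) = log y.
f_X = lambda x: np.where((x >= -1.0) & (x <= 1.0), 0.5, 0.0)
y = np.linspace(1.0 / np.e + 1e-6, np.e - 1e-6, 5)
f_Y = pdf_of_increasing_transform(f_X, np.log, lambda y: 1.0 / y, y)
print(np.allclose(f_Y, 1.0 / (2.0 * y)))   # True: matches the hand-derived 1/(2y)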


Here is a special case of a transformation:
Thm 2.1.10: Let X have a continuous CDF FX (·) and define the random variable Y =
FX (X). Then Y ∼ U [0, 1], i.e., FY (y) = y, for y ∈ [0, 1].
Note: all that is required is that the CDF FX is continuous, not that it must be strictly
increasing. The result also goes through when FX is continuous but has flat parts (cf.
discussion in CB, pg. 34).
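A small simulation sketch of Thm 2.1.10 (my addition; it assumes numpy and scipy): draw X from an exponential distribution and check that Y = F_X(X) behaves like a U[0, 1] draw.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=100_000)

u = stats.expon(scale=2.0).cdf(x)     # Y = F_X(X), the probability integral transform

# Kolmogorov-Smirnov test against U[0, 1]: the statistic should be close to zero.
print(stats.kstest(u, "uniform").statistic)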

Expected value (Definition 2.2.1): The expected value, or mean, of a random variable
g(X) is

Eg(X) = \begin{cases} \int_{-\infty}^{\infty} g(x) f_X(x)\, dx & \text{if } X \text{ continuous} \\ \sum_{x \in \mathcal{X}} g(x) P(X = x) & \text{if } X \text{ discrete,} \end{cases}

provided that the integral or the sum exists.
The expectation is a linear operator (just like integration), so that

E\left[\sum_{i=1}^{n} \alpha\, g_i(X) + b\right] = \sum_{i=1}^{n} \alpha\, E g_i(X) + b.

Note: Expectation is a population average, i.e., you average values of the random variable
g(X) weighting by the population density fX (x).

A statistical experiment yields sample observations X1 , X2 , . . . , Xn ∼ FX . From these sample
observations, we can calculate the sample average X̄n ≡ (1/n) Σi Xi . In general, X̄n ≠ EX. But under
some conditions, as n → ∞, X̄n → EX in some sense (which we discuss later).
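A brief numerical illustration of this (my addition, assuming numpy): for X ∼ U[−1, 1], where EX = 0, the running sample average drifts toward EX as n grows.

import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, size=1_000_000)              # X ~ U[-1, 1], so EX = 0

running_mean = np.cumsum(x) / np.arange(1, x.size + 1)  # sample averages for n = 1, 2, ...
for n in (10, 1_000, 100_000, 1_000_000):
    print(n, running_mean[n - 1])                       # moves toward EX = 0 as n grows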

Expected value is a commonly used measure of “central tendency” of a random variable X.
Example: But the mean may not exist. Consider the Cauchy random variable with density
f(x) = \frac{1}{\pi(1 + x^2)} for x ∈ (−∞, ∞). Note that
\int_{-\infty}^{\infty} \frac{x}{\pi(1+x^2)}\, dx = \int_{-\infty}^{0} \frac{x}{\pi(1+x^2)}\, dx + \int_{0}^{\infty} \frac{x}{\pi(1+x^2)}\, dx
  = \lim_{a \to -\infty} \int_{a}^{0} \frac{x}{\pi(1+x^2)}\, dx + \lim_{b \to \infty} \int_{0}^{b} \frac{x}{\pi(1+x^2)}\, dx
  = \lim_{a \to -\infty} \frac{1}{2\pi} \left[\log(1+x^2)\right]_{a}^{0} + \lim_{b \to \infty} \frac{1}{2\pi} \left[\log(1+x^2)\right]_{0}^{b}
  = -\infty + \infty, \quad \text{which is undefined.}
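A quick simulation sketch of the practical consequence (my addition, assuming numpy): running averages of Cauchy draws never settle down, in contrast to the uniform example above.

import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_cauchy(size=1_000_000)

running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
for n in (10, 1_000, 100_000, 1_000_000):
    print(n, running_mean[n - 1])   # keeps jumping around; there is no mean to converge to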


Other measures:

1. Median: med(X) = m such that FX (m) = 0.5. Robust to outliers, and has a nice
invariance property: if Y = g(X) and g(·) is monotonically increasing, then med(Y ) =
g(med(X)) (see the sketch after this list).

2. Mode: Mode(X) = arg maxx fX (x).
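A tiny numerical check of the median's invariance property (my addition, assuming numpy), using g = exp:

import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(loc=1.0, scale=2.0, size=100_001)   # odd sample size: the sample median is an observed value

med_of_gx = np.median(np.exp(x))                   # med(g(X))
g_of_medx = np.exp(np.median(x))                   # g(med(X))
print(np.isclose(med_of_gx, g_of_medx))            # True: the median commutes with monotone increasing g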


Moments: an important class of expectations
For each integer n, the n-th (uncentred) moment of X ∼ FX (·) is µ′n ≡ EX n .
The n-th centred moment is µn ≡ E(X − µ)n = E(X − EX)n . (It is centred around the
mean µ = EX.)

For n = 2: µ2 = E(X − EX)2 is the variance of X, and √µ2 is the standard deviation.
Important formulas:

• Var(aX + b) = a^2 VarX (variance is not a linear operation)

• VarX = E(X^2) − (EX)^2 : an alternative formula for the variance (both formulas are checked numerically in the sketch below)
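A quick simulation check of these two formulas (my addition, assuming numpy):

import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-1.0, 1.0, size=1_000_000)          # VarX = (1 - (-1))^2 / 12 = 1/3
a, b = 3.0, 7.0

print(np.var(a * x + b), a**2 * np.var(x))          # Var(aX + b) ≈ a^2 VarX
print(np.var(x), np.mean(x**2) - np.mean(x)**2)     # VarX = E(X^2) - (EX)^2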


The moments of a random variable are summarized in the moment generating function.
Definition: the moment-generating function of X is MX (t) ≡ E exp(tX), provided that the
expectation exists in some neighborhood t ∈ [−h, h] of zero.
This is also called the “Laplace transform”.
Specifically:

M_X(t) = \begin{cases} \int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx & \text{for } X \text{ continuous} \\ \sum_{x \in \mathcal{X}} e^{tx} P(X = x) & \text{for } X \text{ discrete.} \end{cases}

The uncentred moments of X are generated from this function by:

EX^n = M_X^{(n)}(0) \equiv \left. \frac{d^n}{dt^n} M_X(t) \right|_{t=0},

which is the n-th derivative of the MGF, evaluated at t = 0.
Example: standard normal distribution:
M_X(t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(tx - \frac{x^2}{2}\right) dx
       = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left((x - t)^2 - t^2\right)\right) dx
       = \exp\left(\tfrac{1}{2} t^2\right) \cdot \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}(x - t)^2\right) dx
       = \exp\left(\tfrac{1}{2} t^2\right) \cdot 1,

where the last term on the RHS is the integral over the density function of N (t, 1), which integrates to one.

First moment: EX = M_X'(0) = t \exp(\tfrac{1}{2} t^2) \big|_{t=0} = 0.
Second moment: EX^2 = M_X''(0) = \left[ \exp(\tfrac{1}{2} t^2) + t^2 \exp(\tfrac{1}{2} t^2) \right] \big|_{t=0} = 1.
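This example can be cross-checked numerically (my addition, assuming scipy): compute M_X(t) = E exp(tX) by quadrature, compare with exp(t^2/2), and recover the first two moments by finite differences at t = 0.

import numpy as np
from scipy import integrate, stats

def mgf_std_normal(t):
    # M_X(t) = E[exp(tX)] for X ~ N(0, 1), computed by numerical integration
    integrand = lambda x: np.exp(t * x) * stats.norm.pdf(x)
    val, _ = integrate.quad(integrand, -np.inf, np.inf)
    return val

t = 0.7
print(mgf_std_normal(t), np.exp(0.5 * t**2))        # the two values should agree closely

# Finite-difference approximations of M'(0) and M''(0); h is kept moderate so that
# quadrature noise stays well below the difference quotients.
h = 1e-2
m_plus, m_zero, m_minus = mgf_std_normal(h), mgf_std_normal(0.0), mgf_std_normal(-h)
print((m_plus - m_minus) / (2 * h))                 # ≈ 0 = EX
print((m_plus - 2 * m_zero + m_minus) / h**2)       # ≈ 1 = EX^2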

In many cases, the MGF can characterize a distribution. But one problem is that it may not
exist (e.g., the Cauchy distribution).
For a RV X, is its distribution uniquely determined by its moment generating function?
Thm 2.3.11: For X ∼ FX and Y ∼ FY , if MX and MY exist, and MX (t) = MY (t) for all t
in some neighborhood of zero, then FX (u) = FY (u) for all u.
Note that if the MGF exists, then it characterizes a random variable with an infinite number
of moments (because the MGF is infinitely differentiable). The converse is not necessarily true:
a random variable can have all its moments and yet have no MGF (e.g., the log-normal random
variable Y = exp(X) with X ∼ N (0, 1)).

Characteristic function:
The characteristic function of a random variable g(x) is defined as

\phi_{g(x)}(t) = E_x \exp(it\, g(x)) = \int_{-\infty}^{+\infty} \exp(it\, g(x)) f(x)\, dx,

where f (x) is the density for x.


This is also called the “Fourier transform”.
Features of the characteristic function:

• The CF always exists. This follows from the equality e^{itx} = cos(tx) + i · sin(tx), since
both the real and imaginary parts of the integrand are bounded functions.

• Consider a symmetric density function, with f (−x) = f (x) (symmetric around zero).
Then the resulting φ(t) is real-valued and symmetric around zero.

• The CF completely determines the distribution of X (every cdf has a unique charac-
teristic function).

• Let X have characteristic function φX (t). Then Y = aX + b has characteristic function


φY (t) = e^{ibt} φX (at).

• If X and Y are independent, with characteristic functions φX (t) and φY (t), then φX+Y (t) =
φX (t)φY (t).

• φ(0) = 1.
• For a given characteristic function φX (t) such that \int_{-\infty}^{+\infty} |\phi_X(t)|\, dt < \infty,^1 the corre-
sponding density fX (x) is given by the inverse Fourier transform, which is

f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \phi_X(t) \exp(-itx)\, dt.

Example: N (0, 1) distribution, with density f (x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2).

Take as given that the characteristic function of N (0, 1) is


\phi_{N(0,1)}(t) = \frac{1}{\sqrt{2\pi}} \int \exp\left(itx - x^2/2\right) dx = \exp(-t^2/2).    (1)

Hence the inversion formula yields

f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \exp(-t^2/2) \exp(-itx)\, dt.

Now making the substitution z = −t, we get

\frac{1}{2\pi} \int_{-\infty}^{+\infty} \exp\left(izx - z^2/2\right) dz
  = \frac{1}{\sqrt{2\pi}} \phi_{N(0,1)}(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) = f_{N(0,1)}(x). \quad \text{(Use Eq. (1))}
^1 Here | · | denotes the modulus of a complex number. For x + iy, we have |x + iy| = \sqrt{x^2 + y^2}.
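As a numerical companion to this inversion example (my addition, assuming scipy), the inversion integral can be evaluated by quadrature and compared with the N (0, 1) density:

import numpy as np
from scipy import integrate, stats

def density_from_cf(cf, x):
    # f_X(x) = (1/2π) ∫ φ_X(t) exp(-itx) dt; only the real part contributes,
    # since the imaginary part integrates to zero for a real-valued density.
    integrand = lambda t: np.real(cf(t) * np.exp(-1j * t * x))
    val, _ = integrate.quad(integrand, -np.inf, np.inf)
    return val / (2.0 * np.pi)

cf_std_normal = lambda t: np.exp(-t**2 / 2.0)       # φ_{N(0,1)}(t), taken as given in Eq. (1)

for x in (0.0, 1.0, 2.5):
    print(density_from_cf(cf_std_normal, x), stats.norm.pdf(x))   # each pair should match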

• The characteristic function also summarizes the moments of a random variable. Specifi-
cally, note that the h-th derivative of φ(t) is

\phi^{(h)}(t) = i^h \int_{-\infty}^{+\infty} g(x)^h \exp(it\, g(x)) f(x)\, dx.    (2)

Hence, assuming the h-th moment, denoted \mu^h_{g(x)} \equiv E[g(x)]^h, exists, it is equal to

\mu^h_{g(x)} = \phi^{(h)}(0) / i^h.

Hence, assuming that the required moments exist, we can use Taylor's theorem to
expand the characteristic function around t = 0 to get

\phi(t) = 1 + \frac{it}{1!}\, \mu^1_{g(x)} + \frac{(it)^2}{2!}\, \mu^2_{g(x)} + \cdots + \frac{(it)^k}{k!}\, \mu^k_{g(x)} + o(t^k).

• Cauchy distribution, cont’d: The characteristic function for the Cauchy distribution is

\phi(t) = \exp(-|t|).

This is not differentiable at t = 0, which by Eq. (2) is saying that its mean does not
exist. Hence, the expansion of the characteristic function in this case is invalid (see the
sketch after this list).
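A small numerical sketch contrasting the two cases (my addition, assuming numpy): one-sided difference quotients of φ at t = 0 agree for the N (0, 1) CF (giving φ′(0) = i·EX = 0), but disagree for the Cauchy CF exp(−|t|), reflecting the kink at zero and the nonexistent mean.

import numpy as np

phi_normal = lambda t: np.exp(-t**2 / 2.0)    # CF of N(0, 1)
phi_cauchy = lambda t: np.exp(-np.abs(t))     # CF of the standard Cauchy

def one_sided_slopes(phi, h=1e-6):
    # Right and left difference quotients of φ at t = 0
    right = (phi(h) - phi(0.0)) / h
    left = (phi(0.0) - phi(-h)) / h
    return right, left

print(one_sided_slopes(phi_normal))   # both ≈ 0: the derivative exists and EX = 0
print(one_sided_slopes(phi_cauchy))   # ≈ (-1.0, +1.0): slopes disagree, so φ is not differentiable at 0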
