
10510134: Probability Theory and Mathematical Statistics Fall 2023

Recitation 4

4.1 Famous Continuous Distributions

4.1.1 Review

Definition 4.1 (Uniform distribution). A continuous r.v. U is said to have the Uniform distribution
on the interval (a, b) if its PDF is

 1
if a < x < b
b−a
f (x) =
0 otherwise

We denote this by U ∼ Unif(a, b).

Definition 4.2 (Exponential distribution). A continuous r.v. X is said to have the Exponential
distribution with parameter λ, where λ > 0, if its PDF is

f(x) = λe^{−λx},   x > 0

We denote this by X ∼ Expo(λ).

Definition 4.3 (Standard Normal distribution). A continuous r.v. Z is said to have the standard
Normal distribution if its PDF φ is given by
φ(z) = (1/√(2π)) e^{−z²/2},   −∞ < z < ∞

We write this as Z ∼ N (0, 1).

Definition 4.4 (Gamma distribution). A r.v. Y is said to have the Gamma distribution with
parameters a and λ, where a > 0 and λ > 0, if its PDF is
f(y) = (λ^a / Γ(a)) y^{a−1} e^{−λy},   y > 0,

where Γ(a) = ∫_0^∞ x^{a−1} e^{−x} dx is the Gamma function. We write Y ∼ Gamma(a, λ).

Definition 4.5 (Beta distribution). An r.v. X is said to have the Beta distribution with parameters
a and b, where a > 0 and b > 0, if its PDF is
f(x) = (1/β(a, b)) x^{a−1} (1 − x)^{b−1},   0 < x < 1,   β(a, b) = Γ(a)Γ(b)/Γ(a + b),
where the constant β(a, b) is chosen to make the PDF integrate to 1. We write this as X ∼ Beta(a, b).


4.1.2 Exercise

Exercise 1. Joe lives in a city where buses always arrive exactly on time, with the time between
successive buses fixed at 10 minutes. Having lost his watch, he arrives at the bus stop at a random
time (assume that buses run 24 hours a day, and that the time that Joe arrives is uniformly random
on a particular day).

1. What is the distribution of how long Joe has to wait for the next bus? What is the average time
that Joe has to wait?
2. Given that the bus has not yet arrived after 6 minutes, what is the probability that Joe will have
to wait at least 3 more minutes?
3. Joe moves to a new city with inferior urban planning and where buses are much more erratic.
Now, when any bus arrives, the time until the next bus arrives is an Exponential random variable
with mean 10 minutes. Joe arrives at the bus stop at a random time, not knowing how long ago
the previous bus came. What is the distribution of Joe’s waiting time for the next bus? What is
the average time that Joe has to wait?

Answer:

1. The distribution is Uniform on (0, 10), so the mean is 5 minutes.


2. Let T be the waiting time. Then
P(T ≥ 6 + 3 | T > 6) = P(T ≥ 9, T > 6)/P(T > 6) = P(T ≥ 9)/P(T > 6) = (1/10)/(4/10) = 1/4.
In particular, Joe's waiting time is not memoryless: conditional on having waited 6 minutes already, there is only a 1/4 chance that he will have to wait at least another 3 minutes, whereas if he had just shown up, there would be a P(T ≥ 3) = 7/10 chance of having to wait at least 3 minutes.
3. By the memoryless property, the distribution is Exponential with parameter 1/10 (and mean
10 minutes) regardless of when Joe arrives; how much longer the next bus will take to arrive is
independent of how long ago the previous bus arrived. The average time that Joe has to wait is
10 minutes.
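A quick Monte Carlo check of parts 2 and 3 is easy to run; the following is a minimal sketch, assuming NumPy is available, that estimates P(T ≥ 9 | T > 6) under the Uniform(0, 10) waiting time and under the Expo(1/10) waiting time.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

# Part 2: waiting time is Uniform(0, 10).
t_unif = rng.uniform(0, 10, n)
p_unif = np.mean(t_unif[t_unif > 6] >= 9)   # P(T >= 9 | T > 6)
print(f"Uniform:     P(T >= 9 | T > 6) ~ {p_unif:.3f}  (exact 0.25)")

# Part 3: waiting time is Exponential with mean 10 (rate 1/10).
t_expo = rng.exponential(scale=10, size=n)
p_expo = np.mean(t_expo[t_expo > 6] >= 9)   # memoryless: equals P(T >= 3)
print(f"Exponential: P(T >= 9 | T > 6) ~ {p_expo:.3f}  (exact e^(-0.3) ~ 0.741)")
```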

Exercise 2. Joe is trying to transmit to Donald the answer to a yes-no question, using a noisy channel.
He encodes “yes” as 1 and “no” as 0, and sends the appropriate value. However, the channel adds noise;
specifically, Donald receives what Joe sends plus an N(0, σ²) noise term (the noise is independent of
what Joe sends). If Donald receives a value greater than 1/2 he interprets it as “yes”; otherwise, he
interprets it as “no”.

1. Find the probability that Donald understands Joe correctly.



2. What happens to the result from question 1 if σ is very small? What about if σ is very large?
Explain intuitively why the results in these extreme cases make sense.

Answer:

1. Let a be the value that Joe sends and ϵ be the noise, so B = a + ϵ is what Donald receives. If
a = 1, then Donald will understand correctly if and only if ϵ > −1/2. If a = 0, then Donald will
understand correctly if and only if ϵ ≤ 1/2. By symmetry of the Normal, P (ϵ > −1/2) = P (ϵ ≤
1/2), so the probability that Donald understands does not depend on a. This probability is
     
P(ϵ ≤ 1/2) = P(ϵ/σ ≤ 1/(2σ)) = Φ(1/(2σ)).
2. If σ is very small, then Φ(1/(2σ)) ≈ 1, since Φ(x) (like any CDF) goes to 1 as x → ∞. This makes sense intuitively: when the standard deviation σ is very small, the noise ϵ is unlikely to take values very different from 0 (i.e., there is very little noise), so it is easy for Donald to understand Joe. If σ is very large, then Φ(1/(2σ)) ≈ Φ(0) = 1/2.

Again this makes sense intuitively: if there is a huge amount of noise, then Joe’s message will get
drowned out (the noise dominates over the signal).
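To see the two regimes numerically, one can evaluate Φ(1/(2σ)) over a range of σ values; a minimal sketch, assuming SciPy is available:

```python
from scipy.stats import norm

# P(correct) = Phi(1/(2*sigma)); small sigma -> close to 1, large sigma -> close to 1/2.
for sigma in [0.05, 0.2, 1.0, 5.0, 50.0]:
    print(f"sigma = {sigma:6.2f}:  P(correct) = {norm.cdf(1 / (2 * sigma)):.4f}")
```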

Exercise 3. In studies of anticancer drugs it was found that if mice are injected with cancer cells, the
survival time (in hours) can be modeled with the exponential distribution Expo(λ) with λ = 0.1. What
is the probability that a randomly selected mouse will survive at least 8 hours? At most 12 hours?
Between 8 and 12 hours?

Answer: Let T ∼ Expo(0.1) be the survival time (in hours) of a randomly selected mouse. Then P(T ≥ 8) = 1 − (1 − e^{−0.1×8}) = e^{−0.1×8} ≈ 0.4493 and P(T ≤ 12) = 1 − e^{−0.1×12} ≈ 0.6988. Combining these two answers, P(8 ≤ T ≤ 12) = P(T ≤ 12) − P(T < 8) = 0.6988 − (1 − 0.4493) = 0.1481.
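The same three probabilities can be checked against the Exponential CDF directly; a minimal sketch, assuming SciPy is available:

```python
from scipy.stats import expon

T = expon(scale=1 / 0.1)   # Expo(lambda = 0.1) has mean 1/lambda = 10 hours
print("P(T >= 8)       =", T.sf(8))                # survival function: ~0.4493
print("P(T <= 12)      =", T.cdf(12))              # ~0.6988
print("P(8 <= T <= 12) =", T.cdf(12) - T.cdf(8))   # ~0.1481
```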

Exercise 4. Let T be the lifetime of a certain person (how long that person lives), and let T have
CDF F and PDF f . The survival function G is defined as

G(t) = P (T > t) = 1 − F (t),

which is the probability of survival at time t. The hazard function of T is defined by


h(t) = f(t)/G(t) = f(t)/(1 − F(t)).

1. Explain why h is called the hazard function and in particular, why h(t) is the probability density
for death at time t, given that the person survived up until then.
2. Show that the CDF and PDF are related to the hazard function by

F(t) = 1 − exp(−∫_0^t h(s) ds),   f(t) = h(t) exp(−∫_0^t h(s) ds),   ∀t > 0.

3. Show that an Exponential r.v. has constant hazard function and conversely, if the hazard function
of T is a constant then T must be Expo(λ) for some λ.
4. What is the hazard function of a Weibull(γ, λ) distribution? (A Weibull(γ, λ) distribution has CDF F(x) = 1 − exp(−(λx)^γ) for x ≥ 0 and F(x) = 0 otherwise, where γ > 0 is a parameter.)

Answer:

1. Given that T ≥ t0, the conditional CDF of T is

P(T ≤ t | T ≥ t0) = P(t0 ≤ T ≤ t)/P(T ≥ t0) = (F(t) − F(t0))/(1 − F(t0))

for t ≥ t0 (and 0 otherwise). So the conditional PDF of T given T ≥ t0 is

f(t | T ≥ t0) = f(t)/(1 − F(t0)),   for t ≥ t0.

The conditional PDF of T at t0, given that the person survived up until t0, is then f(t0)/(1 − F(t0)) = h(t0). Alternatively, we can use the hybrid form of Bayes' rule:

f(t | T ≥ t) = P(T ≥ t | T = t) f(t)/P(T ≥ t) = f(t)/(1 − F(t)).
In the last display, densities are treated like probabilities in the application of Bayes' rule. This is actually legitimate; we will see more about this in the next chapter.
Thus, h(t) gives the probability density of death at time t given that the person has survived up
until then. For a very small ϵ > 0, the probability of dying before t0 + ϵ given survival up to t0 is
P(T ≤ t0 + ϵ | T ≥ t0) = ∫_{t0}^{t0+ϵ} f(t | T ≥ t0) dt ≈ h(t0) ϵ.

Thus the hazard function gives a natural way to measure the instantaneous hazard of death at
some time since it accounts for the person having survived up until that time.
2. Since G(s) = 1 − F(s) for any s > 0, we have

h(s) = f(s)/G(s) = F′(s)/G(s) = −G′(s)/G(s).

Integrating both sides with respect to s from 0 to t gives, for any t > 0,

∫_0^t h(s) ds = −∫_0^t G′(s)/G(s) ds = −∫_0^t d log G(s).

Obviously,

∫_0^t d log G(s) = log G(t) − log G(0) = log G(t),

since G(0) = 1 − F(0) = 1.


Thus,

log(1 − F(t)) = log G(t) = −∫_0^t h(s) ds,

which shows that

F(t) = 1 − exp(−∫_0^t h(s) ds).

Differentiating both sides of the above equation, we have

f(t) = h(t) exp(−∫_0^t h(s) ds),

since (d/dt) ∫_0^t h(s) ds = h(t) (by the fundamental theorem of calculus).
3. Let T ∼ Expo(λ). Then the hazard function is

h(t) = λe^{−λt}/e^{−λt} = λ,   ∀t > 0,

which is a constant.
Conversely, suppose that h(t) = λ for all t, with λ > 0 a constant. According to the conclusion
in Question 2, we have
F(t) = 1 − exp(−∫_0^t h(s) ds) = 1 − exp(−λt).

Thus, T ∼ Expo(λ).
4. The CDF and PDF of the Weibull(γ, λ) distribution are

F(t) = 1 − exp(−(λt)^γ),

f(t) = λγ(λt)^{γ−1} exp(−(λt)^γ).

So we have

h(t) = f(t)/(1 − F(t)) = λγ(λt)^{γ−1} exp(−(λt)^γ)/exp(−(λt)^γ) = λγ(λt)^{γ−1}.

We can observe that when γ = 1, the hazard rate is a constant so the Weibull distribution reduces
to an Exponential distribution. When γ > 1, the hazard function increases with t. This can be
used to model the wear-and-tear effects in survival times. For example, let the lifetime of a certain
machine be a random variable X. If X has an Exponential distribution, then X is memoryless
and has a constant hazard rate. This means that no matter how long the machine has been used,
its remaining lifetime has the same distribution as that of a new machine. However, in practice, the machine often ages as it is used, so the risk of breakdown should increase as time goes by. This can be captured by an increasing hazard function.
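A numerical sanity check of the Weibull hazard formula is straightforward: compare f(t)/(1 − F(t)) against λγ(λt)^{γ−1} on a grid of t values. A minimal sketch, assuming NumPy is available; the values λ = 2 and γ = 1.5 are arbitrary choices for illustration.

```python
import numpy as np

lam, gamma = 2.0, 1.5               # arbitrary illustrative parameters
t = np.linspace(0.1, 3.0, 6)

# Weibull(gamma, lambda) as parameterized in the exercise: F(t) = 1 - exp(-(lam*t)^gamma).
F = 1 - np.exp(-(lam * t) ** gamma)
f = lam * gamma * (lam * t) ** (gamma - 1) * np.exp(-(lam * t) ** gamma)

hazard_from_def = f / (1 - F)                          # h(t) = f(t) / (1 - F(t))
hazard_formula  = lam * gamma * (lam * t) ** (gamma - 1)

print(np.allclose(hazard_from_def, hazard_formula))    # True
```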

4.2 Universality of the Uniform Distribution

Exercise 5. Let F be a CDF which is a continuous function and strictly increasing on the support
of the distribution. This ensures that the inverse function F −1 exists, as a function from (0, 1) to R.
Prove the following statements:

1. Let U ∼ Unif(0, 1) and X = F^{−1}(U). Then X is an r.v. with CDF F.


2. Let X be an r.v. with CDF F . Then F (X) ∼ Unif(0, 1).

Answer:

1. Let U ∼ Unif(0, 1) and X = F^{−1}(U). For all real x,

P(X ≤ x) = P(F^{−1}(U) ≤ x) = P(U ≤ F(x)) = F(x),

so the CDF of X is F, as claimed. For the last equality, we used the fact that P(U ≤ u) = u for u ∈ (0, 1) since U ∼ Unif(0, 1).
2. Let X have CDF F , and find the CDF of Y = F (X). Since F (·) takes values in (0, 1), P (Y ≤ y)
equals 0 for y ≤ 0 and equals 1 for y ≥ 1. For any y ∈ (0, 1),
 
P(Y ≤ y) = P(F(X) ≤ y) = P(X ≤ F^{−1}(y)) = F(F^{−1}(y)) = y.

Thus Y has the Unif(0, 1) CDF.


Note that there is a potential notational confusion: F (x) = P (X ≤ x) by definition, but it would
be incorrect to say “F (X) = P (X ≤ X) = 1”. Rather, we should first find an expression for the
CDF F (x) as a function of x, then replace x with X to obtain an r.v. For example, if the CDF
of X is F (x) = 1 − e−x for x > 0, then F (X) = 1 − e−X when X > 0.
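Both directions of the universality property are easy to check by simulation. The sketch below, assuming NumPy is available, uses the Expo(1) CDF F(x) = 1 − e^{−x}, for which F^{−1}(u) = −log(1 − u).

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(size=10**6)

# Direction 1: X = F^{-1}(U) should be Expo(1).  Check P(X <= 1) against F(1) = 1 - e^{-1}.
x = -np.log(1 - u)
print(np.mean(x <= 1.0), 1 - np.exp(-1.0))     # both ~0.632

# Direction 2: if X ~ Expo(1), then F(X) = 1 - e^{-X} should be Unif(0, 1).
x = rng.exponential(size=10**6)
v = 1 - np.exp(-x)
print(np.mean(v <= 0.3), np.mean(v <= 0.7))    # ~0.3 and ~0.7
```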

Exercise 6. Suppose that a random variable U ∼ Unif(0, 1). Based on the results in Exercise 5, discuss how you can construct a random variable X with the following CDFs from U:

1. F_X(x) = e^x/(1 + e^x) for any x ∈ R. (This is called the Logistic distribution.)


2. F_X(x) = 1 − e^{−x²/2} for any x > 0. (This is called the Rayleigh distribution.)

3. F_X(x) = 1 − e^{−λx} for x > 0. (This is the Exponential distribution with parameter λ.)

Answer:

1. Suppose we have U ∼ Unif(0, 1) and wish to generate a Logistic random variable. Statement 1
in Exercise 5 says that F −1 (U ) ∼ Logistic if F is the CDF of the Logistic distribution. Thus we
first invert the CDF to get F −1 :
 
F^{−1}(u) = log(u/(1 − u)).

Then we plug in U for u:

F^{−1}(U) = log(U/(1 − U)).

Therefore log(U/(1 − U)) ∼ Logistic. We can verify directly that log(U/(1 − U)) has the required CDF: start from the definition of the CDF, do some algebra to isolate U on one side of the inequality, and then use the CDF of the Uniform distribution. Let's work through these calculations once for practice:
     
P(log(U/(1 − U)) ≤ x) = P(U/(1 − U) ≤ e^x)
= P(U ≤ e^x(1 − U))
= P(U ≤ e^x/(1 + e^x))
= e^x/(1 + e^x),

which is indeed the Logistic CDF.
2. The quantile function (the inverse of the CDF) is F^{−1}(u) = √(−2 log(1 − u)), so if U ∼ Unif(0, 1), then F^{−1}(U) = √(−2 log(1 − U)) ∼ Rayleigh.
3. The quantile function (the inverse of the CDF) is F^{−1}(u) = −log(1 − u)/λ, so if U ∼ Unif(0, 1), then F^{−1}(U) = −log(1 − U)/λ has the Exponential distribution with parameter λ.
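The three constructions translate directly into code; a minimal sketch, assuming NumPy is available, where λ = 2 is an arbitrary rate chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(size=10**6)
lam = 2.0   # arbitrary rate for the Exponential example

logistic    = np.log(u / (1 - u))          # F^{-1}(u) = log(u / (1 - u))
rayleigh    = np.sqrt(-2 * np.log(1 - u))  # F^{-1}(u) = sqrt(-2 log(1 - u))
exponential = -np.log(1 - u) / lam         # F^{-1}(u) = -log(1 - u) / lam

# Spot-check each empirical CDF at one point against the target CDF.
print(np.mean(logistic <= 1.0),    np.exp(1) / (1 + np.exp(1)))   # ~0.731
print(np.mean(rayleigh <= 1.0),    1 - np.exp(-0.5))              # ~0.393
print(np.mean(exponential <= 1.0), 1 - np.exp(-lam))              # ~0.865
```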

Exercise 7. In Exercises 5 and 6, we considered continuous r.v.s X. Discuss how to construct, from U ∼ Unif(0, 1), a discrete random variable X taking values in {0, . . . , n} with P(X = j) = p_j, where Σ_{j=0}^n p_j = 1.

Figure 4.1: Given a PMF, chop up the interval (0, 1) into pieces, with lengths given by the PMF values.

Answer: Suppose we want to use U ∼ Unif(0, 1) to construct a discrete r.v. X with pj = P (X = j) for
j = 0, 1, 2, . . . , n. As illustrated in Figure 4.1, we can chop up the interval (0, 1) into pieces
of lengths p0 , p1 , . . . , pn . By the properties of a valid PMF, the sum of the pj ’s is 1, so this perfectly
divides up the interval, without overshooting or undershooting.

Now define X to be the r.v. which equals 0 if U falls into the p0 interval, 1 if U falls into the p1 interval,
2 if U falls into the p2 interval, and so on. Then X is a discrete r.v. taking on values 0 through n.
The probability that X = j is the probability that U falls into the interval of length pj . But for a Unif
(0, 1) r.v., the probability that this r.v. falls into an interval in (0, 1) is just the length of this interval,
so P (X = j) is precisely pj , as desired!
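This construction is exactly how discrete sampling from a PMF is often implemented: take cumulative sums of the p_j's and find which piece of (0, 1) each uniform draw falls into. A minimal sketch, assuming NumPy is available, with an arbitrary example PMF:

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # example PMF on {0, 1, 2, 3, 4}; sums to 1

# Right endpoints of the pieces of (0, 1): (0, p0], (p0, p0+p1], ...
cuts = np.cumsum(p)

u = rng.uniform(size=10**6)
x = np.searchsorted(cuts, u)               # index of the piece containing each U

# The empirical frequencies should match the PMF.
print(np.bincount(x, minlength=len(p)) / len(u))
```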

4.3 Functions of a Random Variable

4.3.1 Review

A general approach to finding the PDF of Y = g(X) for continuous X:

• Find FY (y) by converting {Y ≤ y} to {X ∈ {x : g(x) ≤ y}}.

• Differentiate FY (y) to obtain the PDF fY (y).

For a continuous and strictly monotonic transformation, we can also apply the following change-of-
variable formula.

Theorem 4.6 (Change of variables). Let X be a continuous r.v. with PDF fX , and let Y = g(X),
where g is differentiable and strictly increasing (or strictly decreasing) over the support of X. Then the
PDF of Y is given by
f_Y(y) = f_X(x) |dx/dy|,

where x = g^{−1}(y). The support of Y is all g(x) with x in the support of X, namely Supp(Y) = {g(x) : x ∈ Supp(X)}.

4.3.2 Exercise

Exercise 8. If the length of a side of a square X is random with the PDF fX (x) = x/8, 0 < x < 4,
and Y is the area of the square, find the PDF of Y .

Answer: Given the area of the square Y = X², we first have, for 0 < y < 16,

F_Y(y) = P(X² ≤ y) = P(X ≤ √y) = F_X(√y) = ∫_0^{√y} (x/8) dx = y/16.

Thus, the PDF of Y is f_Y(y) = F_Y′(y) = 1/16 for 0 < y < 16. That is, the area Y is uniform on (0, 16).
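A simulation can confirm that the area is Uniform on (0, 16). By the universality result of Exercise 5, a side length X with CDF F_X(x) = x²/16 can be generated as X = 4√U; a minimal sketch, assuming NumPy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(size=10**6)

x = 4 * np.sqrt(u)    # side length with PDF f_X(x) = x/8 on (0, 4), via inverse CDF x = 4*sqrt(u)
y = x ** 2            # area of the square

# If Y ~ Unif(0, 16), then P(Y <= c) = c/16.
for c in [4, 8, 12]:
    print(c, np.mean(y <= c), c / 16)
```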

Exercise 9. Find the PDF of the random variable Y in the following settings:

1. Y = |X| for X ∼ Unif[−1, 1].


2. Y = X² for X ∼ Unif[−1, 1].
3. Y = X² for X ∼ Unif[−1, 3].

Answer:

1. Given that X is uniformly distributed on the interval [−1, 1], the probability density function (PDF) of X is f_X(x) = 1/2 for −1 ≤ x ≤ 1, and f_X(x) = 0 otherwise.
To find the distribution of Y = |X|, we consider the range of Y :

For x ∈ [−1, 0], y = |x| ranges from 1 to 0.


For x ∈ [0, 1], y = |x| ranges from 0 to 1.

Thus, the range of Y is [0, 1]. Now, we can derive the CDF of Y : for 0 ≤ y ≤ 1,
F_Y(y) = P(|X| ≤ y) = P(−y ≤ X ≤ y) = 2y × (1/2) = y.
Thus the CDF of Y is F_Y(y) = 0 for y < 0, F_Y(y) = y for 0 ≤ y < 1, and F_Y(y) = 1 for y ≥ 1.

Differentiating the CDF to find the PDF: f_Y(y) = F_Y′(y) = 1 for 0 < y < 1. So the PDF of Y is f_Y(y) = 1 for 0 ≤ y ≤ 1, and f_Y(y) = 0 otherwise, which means that the random variable Y is uniformly distributed on the interval [0, 1].
2. Given that X is uniformly distributed on the interval [−1, 1], the probability density function (PDF) of X is f_X(x) = 1/2 for −1 ≤ x ≤ 1, and f_X(x) = 0 otherwise.

To find the distribution of Y = X², we consider the range of Y:

For x ∈ [−1, 0], y = x² ranges from 1 to 0.

For x ∈ [0, 1], y = x² ranges from 0 to 1.

Thus, the range of Y is [0, 1]. Now, we can derive the CDF of Y: for 0 ≤ y ≤ 1,

F_Y(y) = P(X² ≤ y) = P(−√y ≤ X ≤ √y) = 2√y × (1/2) = √y.
Thus the CDF of Y is F_Y(y) = 0 for y < 0, F_Y(y) = √y for 0 ≤ y < 1, and F_Y(y) = 1 for y ≥ 1.

Consequently, for 0 < y ≤ 1, the PDF of Y is f_Y(y) = F_Y′(y) = 1/(2√y). So the PDF of the random variable Y is f_Y(y) = 1/(2y^{1/2}) for 0 < y ≤ 1, and f_Y(y) = 0 otherwise.
3. Given that X is uniformly distributed on the interval [−1, 3], the probability density function (PDF) of X is f_X(x) = 1/4 for −1 ≤ x ≤ 3, and f_X(x) = 0 otherwise.

To find the distribution of Y = X², we consider the range of Y:

For x ∈ [−1, 0], y = x² ranges from 1 to 0.

For x ∈ [0, 3], y = x² ranges from 0 to 9.

Thus, the range of Y is [0, 9].


Now, the cumulative distribution function (CDF) of Y is:

F_Y(y) = P(Y ≤ y) = P(X² ≤ y).

For 0 ≤ y < 1:
F_Y(y) = P(X² ≤ y) = P(−√y ≤ X ≤ √y) = 2√y × (1/4) = √y/2.

For 1 ≤ y ≤ 9:
F_Y(y) = P(X² ≤ y) = P(X ≤ √y) = (√y + 1)/4.
Thus, the CDF is F_Y(y) = 0 for y < 0; F_Y(y) = √y/2 for 0 ≤ y < 1; F_Y(y) = (√y + 1)/4 for 1 ≤ y ≤ 9; and F_Y(y) = 1 for y > 9.

Differentiating the CDF to find the PDF: for 0 < y < 1, f_Y(y) = 1/(4y^{1/2}); for 1 ≤ y ≤ 9, f_Y(y) = 1/(8y^{1/2}). So the PDF is f_Y(y) = 1/(4y^{1/2}) for 0 < y < 1, f_Y(y) = 1/(8y^{1/2}) for 1 ≤ y ≤ 9, and f_Y(y) = 0 otherwise.
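A quick simulation check of the piecewise CDF in part 3; a minimal sketch, assuming NumPy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 3, size=10**6)
y = x ** 2

def F_Y(y):
    # Piecewise CDF derived above for Y = X^2 with X ~ Unif[-1, 3].
    if y < 0:
        return 0.0
    if y < 1:
        return np.sqrt(y) / 2
    if y <= 9:
        return (np.sqrt(y) + 1) / 4
    return 1.0

for c in [0.25, 0.5, 2.0, 5.0, 8.0]:
    print(c, np.mean(y <= c), F_Y(c))   # empirical vs. exact CDF
```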

4.4 Expectation and Variance

4.4.1 Review

Definition 4.7 (Expectation of a discrete r.v.). The expected value (also called the expectation
or mean) of a discrete r.v. X with distinct possible values x1 , x2 , . . . is defined by

E(X) = Σ_{j=1}^∞ x_j P(X = x_j).

If the support is finite, then this is replaced by a finite sum. We can also write E(X) = Σ_{x ∈ Supp(X)} x p_X(x), where each term is a possible value x times the PMF evaluated at x.

Definition 4.8 (Expectation of a continuous r.v.). The expected value of a continuous r.v. X with
PDF f is
E(X) = ∫_{−∞}^∞ x f(x) dx.

Theorem 4.9 (Linearity of expectation). For any r.v.s X, Y (whose expectations exist) and any
constant c,

E(X + Y ) = E(X) + E(Y ),


E(cX) = cE(X)

Theorem 4.10 (Law of the unconscious statistician (LOTUS)). The expected value or mean of a
random variable g(X), denoted by E(g(X)), is
E(g(X)) = ∫_{−∞}^∞ g(x) f_X(x) dx if X is continuous, and E(g(X)) = Σ_{x ∈ Supp(X)} g(x) p_X(x) = Σ_{x ∈ Supp(X)} g(x) P(X = x) if X is discrete.

Definition 4.11 (Variance and standard deviation). The variance of an r.v. X is

Var(X) = E(X − EX)².

The square root of the variance is called the standard deviation:


SD(X) = √Var(X).

Theorem 4.12 (Variance shortcut formula). For any r.v. X,



Var(X) = E(X²) − (EX)².

Proposition 4.13 (Properties of Variance). For any constants a, b,

Var(aX + b) = a² Var(X),   SD(aX + b) = |a| SD(X).

4.4.2 Exercise

Exercise 10. The PMF for X = the number of major defects on a randomly selected appliance of a
certain type is
x 0 1 2 3 4
p(x) 0.08 0.15 0.45 0.27 0.05
Compute the following:

1. E(X).
2. V (X) directly from the definition.
3. The standard deviation of X.
4. E(X 2 ) and E(X 3 ).
5. V (X) using the shortcut formula.

Answer:
1. E(X) = Σ_{x=0}^4 x p(x) = 0 × 0.08 + 1 × 0.15 + 2 × 0.45 + 3 × 0.27 + 4 × 0.05 = 2.06;
2. V(X) = Σ_{x=0}^4 (x − E(X))² p(x) = (0 − 2.06)² × 0.08 + (1 − 2.06)² × 0.15 + (2 − 2.06)² × 0.45 + (3 − 2.06)² × 0.27 + (4 − 2.06)² × 0.05 = 0.9364;
3. SD(X) = √V(X) = √0.9364 = 0.9677;
4. E(X²) = Σ_{x=0}^4 x² p(x) = 0² × 0.08 + 1² × 0.15 + 2² × 0.45 + 3² × 0.27 + 4² × 0.05 = 5.18;
E(X³) = Σ_{x=0}^4 x³ p(x) = 0³ × 0.08 + 1³ × 0.15 + 2³ × 0.45 + 3³ × 0.27 + 4³ × 0.05 = 14.24;
5. Using the shortcut formula, V(X) = E(X²) − (E(X))² = 5.18 − 2.06² = 0.9364, the same answer as in part 2.
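The same arithmetic can be reproduced in a few lines; a minimal sketch, assuming NumPy is available:

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4])
p = np.array([0.08, 0.15, 0.45, 0.27, 0.05])

EX  = np.sum(x * p)                    # 2.06
VX  = np.sum((x - EX) ** 2 * p)        # 0.9364 (definition)
SD  = np.sqrt(VX)                      # 0.9677
EX2 = np.sum(x ** 2 * p)               # 5.18
EX3 = np.sum(x ** 3 * p)               # 14.24
print(EX, VX, SD, EX2, EX3, EX2 - EX ** 2)   # shortcut gives 0.9364 again
```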

Exercise 11. Let Z ∼ N (0, 1), and c be a nonnegative constant. Find E(max(Z, 0)) and E(min(Z, 0)),
in terms of the standard Normal CDF Φ and PDF φ.

Answer: By LOTUS,

E(max(Z, 0)) = ∫_{−∞}^∞ max(z, 0) φ(z) dz = ∫_0^∞ z φ(z) dz = [−(1/√(2π)) e^{−z²/2}]_0^∞ = 1/√(2π).

E(min(Z, 0)) = ∫_{−∞}^∞ min(z, 0) φ(z) dz = ∫_{−∞}^0 z φ(z) dz = ∫_{−∞}^∞ z φ(z) dz − ∫_0^∞ z φ(z) dz = 0 − 1/√(2π) = −1/√(2π).

Remark: we can use these two results to obtain E(|Z|). Note that |Z| = max(Z, 0) − min(Z, 0). So

E(|Z|) = E(max(Z, 0) − min(Z, 0)) = E(max(Z, 0)) − E(min(Z, 0)) = √(2/π).
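A Monte Carlo check of the three expectations; a minimal sketch, assuming NumPy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(10**7)

print(np.mean(np.maximum(z, 0)), 1 / np.sqrt(2 * np.pi))    # ~0.3989
print(np.mean(np.minimum(z, 0)), -1 / np.sqrt(2 * np.pi))   # ~-0.3989
print(np.mean(np.abs(z)), np.sqrt(2 / np.pi))               # ~0.7979
```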

4.5 Moment Generating Function

4.5.1 Review

Definition 4.14 (Moment generating function). The moment generating function (MGF) of an r.v.

X is M(t) = E(e^{tX}), as a function of t, if this is finite on some open interval (−a, a) containing 0.
Otherwise we say the MGF of X does not exist.

Proposition 4.15 (MGF of location-scale transformation). If X has MGF M (t), then the MGF of
a + bX is
  
E(e^{t(a+bX)}) = E(e^{at} e^{btX}) = e^{at} E(e^{btX}) = e^{at} M(bt).

Theorem 4.16 (Moments via derivatives of the MGF). Given the MGF of X, we can get the nth
moment of X by evaluating the nth derivative of the MGF at 0:
E(X^n) = M^{(n)}(0) = (d^n/dt^n) M(t)|_{t=0}.
Theorem 4.17 (MGF determines the distribution). The MGF of a random variable determines its
distribution: if two r.v.s have the same MGF, they must have the same distribution.

Mathematically, let X, Y be two r.v.s with MGFs MX , MY and CDFs FX , FY . If MX (t) = MY (t) for
all t in an interval (−a, a) containing 0, then FX (u) = FY (u) for all u.

• The fact that an MGF uniquely determines a distribution is very useful for

– finding the distribution of the sum of multiple independent random variables


– establishing the Central Limit Theorem.

4.5.2 Exercises

Exercise 12. Let X ∼ Expo(λ). Find E(X³) in the following two ways:

1. Use LOTUS, the facts that E(X) = 1/λ and Var(X) = 1/λ², and integration by parts.

2. Use the MGF of X.

Answer:

1. By LOTUS, we have:

E(X³) = ∫_0^∞ x³ λe^{−λx} dx
= −∫_0^∞ x³ de^{−λx}
= −[x³ e^{−λx}]_0^∞ + 3 ∫_0^∞ x² e^{−λx} dx   (integration by parts)
= (3/λ) ∫_0^∞ x² λe^{−λx} dx
= (3/λ) E(X²) = (3/λ)(Var(X) + (EX)²) = 6/λ³.
2. The MGF of an Exponential random variable with rate parameter λ is

M(t) = E(e^{tX}) = ∫_0^∞ e^{tx} λe^{−λx} dx
= λ ∫_0^∞ e^{−(λ−t)x} dx
= (λ/(λ − t)) ∫_0^∞ (λ − t) e^{−(λ−t)x} dx
= λ/(λ − t),   ∀t < λ.

Here we use the fact that ∫_0^∞ (λ − t) e^{−(λ−t)x} dx = 1 for any t < λ. You can easily verify this, or you can simply think of (λ − t) e^{−(λ−t)x} as the density function of Expo(λ − t), which is valid since λ − t > 0. Thus there is an open interval containing 0 on which M(t) is finite.
To get the third moment, we can take the third derivative of the MGF and evaluate at t = 0:

E(X³) = d³M(t)/dt³ |_{t=0} = (6/λ³)(1 − t/λ)^{−4} |_{t=0} = 6/λ³.

But a much nicer way to use the MGF here is via pattern recognition: note that M (t) looks like
it came from a geometric series:
M(t) = λ/(λ − t) = 1/(1 − t/λ) = Σ_{n=0}^∞ (t/λ)^n = Σ_{n=0}^∞ (n!/λ^n)(t^n/n!),   ∀|t| < λ.

Recall that the Taylor expansion of M(t) around t = 0 is

M(t) = Σ_{n=0}^∞ (d^n/dt^n) M(t)|_{t=0} · (t^n/n!).

Thus the coefficient of t^n/n! here is exactly (d^n/dt^n) M(t)|_{t=0}, namely the nth moment of X. Therefore, we have E(X^n) = n!/λ^n for all nonnegative integers n. So again we get E(X³) = 6/λ³. This method not only avoids the need to compute the third derivative of M(t) directly, but also gives us all the moments of X.
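Both routes can be reproduced symbolically; a minimal sketch, assuming SymPy is available, computing E(X³) by LOTUS and by the third derivative of the MGF:

```python
import sympy as sp

x = sp.symbols("x", positive=True)
t = sp.symbols("t", real=True)
lam = sp.symbols("lam", positive=True)

# LOTUS: E(X^3) = integral of x^3 * lam * exp(-lam*x) over (0, infinity).
lotus = sp.integrate(x**3 * lam * sp.exp(-lam * x), (x, 0, sp.oo))
print(sp.simplify(lotus))                           # 6/lam**3

# MGF route: M(t) = lam / (lam - t); E(X^3) = M'''(0).
M = lam / (lam - t)
print(sp.simplify(sp.diff(M, t, 3).subs(t, 0)))     # 6/lam**3
```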

Exercise 13. Let M(t) be the moment generating function (MGF) of a random variable X. The cumulant generating function is defined to be g(t) = ln M(t). Expanding g(t) as a Taylor series

g(t) = Σ_{j=1}^∞ (c_j/j!) t^j

(the summation starts at j = 1 because g(0) = 0), the coefficient c_j is called the j-th cumulant of X. Find the j-th cumulant of X, for all j ≥ 1.

1. Prove that g(0) = 0, g′(0) = c₁ = E[X], and g″(0) = c₂ = Var(X).


2. Now consider a Poisson random variable X ∼ Pois(λ). Derive the cumulant generating function of X and find the j-th cumulant of X, for all j ≥ 1.

Answer:
1. g(0) = ln M(0) = ln E(e^{0·X}) = 0; g′(0) = M′(0)/M(0) = E[X]/1 = E[X]; and
g″(0) = (M″(0)M(0) − (M′(0))²)/(M(0))² = E[X²] − (E[X])² = Var(X).

2. Recall that the probability mass function (PMF) of the Poisson distribution with parameter λ > 0 is P(X = k) = (λ^k/k!) e^{−λ}. Then, by definition, the MGF of X has the form

M(t) = E[e^{tX}] = Σ_{k=0}^∞ e^{tk} (λ^k/k!) e^{−λ} = e^{−λ} Σ_{k=0}^∞ (e^t λ)^k/k! = e^{−λ} e^{λe^t} = e^{λ(e^t − 1)}.

Therefore, the cumulant generating function g(t) = ln M(t) can be expanded as a Taylor series of the form

g(t) = ln M(t) = λ(e^t − 1) = λ Σ_{k=1}^∞ t^k/k!,

which means that the k-th cumulant is c_k = λ for all k ≥ 1.
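The cumulants can also be read off symbolically by differentiating g(t) = ln M(t) at 0; a minimal sketch, assuming SymPy is available, using the Poisson MGF M(t) = exp(λ(e^t − 1)):

```python
import sympy as sp

t = sp.symbols("t", real=True)
lam = sp.symbols("lam", positive=True)

M = sp.exp(lam * (sp.exp(t) - 1))   # MGF of Pois(lam)
g = sp.log(M)                        # cumulant generating function

# The j-th cumulant is the j-th derivative of g at t = 0; for the Poisson it is always lam.
for j in range(1, 6):
    print(j, sp.simplify(sp.diff(g, t, j).subs(t, 0)))
```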
