Probability Chapter 5
Wang Fei
Department of Mathematics
Office: S17-06-16
Tel: 6516-2937
Distribution of a Function of a Random Variable
  Function of Random Variables
  Examples
Continuous Random Variable
Examples
• Let X be a continuous random variable with density function

    f(x) = C(4x − 2x²) for 0 < x < 2, and f(x) = 0 otherwise.

  (a) Find the value of C.
  (b) Find P(X > 1).

  Solution.
  (a) 1 = ∫_{−∞}^{∞} f(x) dx = ∫_0^2 C(4x − 2x²) dx = C · 8/3  ⇒  C = 3/8.
  (b) P(X > 1) = ∫_1^{∞} f(x) dx = (3/8) ∫_1^2 (4x − 2x²) dx = 1/2.
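As a quick sanity check on the computation above, a midpoint-rule integration in Python recovers C = 3/8 and P(X > 1) = 1/2 (the helper `integrate` and the grid size are my own choices, not from the slides):

```python
# Numeric check of the example: f(x) = C(4x - 2x^2) on (0, 2).
def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

g = lambda x: 4 * x - 2 * x ** 2      # density up to the constant C
C = 1 / integrate(g, 0, 2)            # total mass must equal 1
p = C * integrate(g, 1, 2)            # P(X > 1)
print(round(C, 6), round(p, 6))       # 0.375 0.5
```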
Examples
• The number of hours X that a computer functions has density function

    f(x) = λe^{−x/100} for x ≥ 0, and f(x) = 0 for x < 0.

  Find the probability that
  (a) the computer functions between 50 and 150 hours;
  (b) the computer functions for fewer than 100 hours.

  Solution. 1 = ∫_{−∞}^{∞} f(x) dx = ∫_0^{∞} λe^{−x/100} dx = 100λ  ⇒  λ = 1/100.
  (a) P(50 < X < 150) = ∫_{50}^{150} (1/100) e^{−x/100} dx = e^{−1/2} − e^{−3/2} ≈ 0.3834.
  (b) P(X < 100) = ∫_0^{100} (1/100) e^{−x/100} dx = 1 − e^{−1} ≈ 0.6321.
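The two probabilities can be evaluated directly from the exponential tail e^{−x/100}; a minimal check, with values rounded as on the slide:

```python
import math

lam = 1 / 100                                        # from 1 = 100*lambda
p_a = math.exp(-50 * lam) - math.exp(-150 * lam)     # P(50 < X < 150)
p_b = 1 - math.exp(-100 * lam)                       # P(X < 100)
print(round(p_a, 4), round(p_b, 4))                  # 0.3834 0.6321
```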
Examples
• Let X be a continuous random variable with c.d.f. F_X and p.d.f. f_X.
  Find the c.d.f. and p.d.f. of Y = 2X and Z = −2X.

  Solution.
  (a) F_Y(a) = P(Y ≤ a) = P(2X ≤ a) = P(X ≤ a/2) = F_X(a/2).
      Differentiate with respect to a: f_Y(a) = F_Y′(a) = (d/da) F_X(a/2) = (1/2) f_X(a/2).
  (b) F_Z(a) = P(Z ≤ a) = P(−2X ≤ a) = P(X ≥ −a/2) = 1 − F_X(−a/2).
      Differentiate with respect to a: f_Z(a) = F_Z′(a) = (d/da)[1 − F_X(−a/2)] = (1/2) f_X(−a/2).
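The identity F_Y(a) = F_X(a/2) can also be tested empirically; the sketch below takes X standard normal as a concrete instance (my choice of distribution and of the test point a, not specified on the slide):

```python
import random
from statistics import NormalDist

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(200_000)]   # X ~ N(0, 1)
a = 1.2
emp = sum(2 * x <= a for x in xs) / len(xs)   # empirical F_Y(a) for Y = 2X
thy = NormalDist().cdf(a / 2)                 # theoretical F_X(a/2)
print(abs(emp - thy) < 0.01)                  # True: they agree closely
```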
Examples
• Let X be a continuous random variable with density function f. Then

    P(a < X < b) = ∫_a^b f(x) dx.

  In particular, for small ε > 0,

    P(a < X < a + ε) = ∫_a^{a+ε} f(x) dx.

  Assume that f is continuous at a. Then

    P(a < X < a + ε) ≈ εf(a).

  More generally, if f is continuous at a and I ∋ a is an interval of small length ε, then

    P(X ∈ I) ≈ εf(a).
Examples
• Let X be a continuous random variable with c.d.f. F_X and p.d.f. f_X.
  Find the c.d.f. and p.d.f. of Y = 2X.

  Solution. Recall that P(a < X < a + ε) ≈ εf_X(a). Then

    P(a < Y < a + ε) = P(a/2 < X < a/2 + ε/2) ≈ (ε/2) f_X(a/2),

  so f_Y(a) = (1/2) f_X(a/2), agreeing with the earlier computation.
Expectation
• If X is a discrete random variable, then E[X] = Σ_x x P(X = x).
• Suppose that X is continuous with probability density function f.
  Assume that f is zero outside [a, b].
  Divide [a, b] into n equal subintervals [x_0, x_1], …, [x_{n−1}, x_n], each of length Δx = (b − a)/n. Then

    f(x_i)Δx ≈ P(x_{i−1} < X ≤ x_i).

  Let Y be a discrete random variable with P(Y = x_i) = f(x_i)Δx. Then

    E[X] ≈ E[Y] = Σ_{i=1}^{n} x_i P(Y = x_i) = Σ_{i=1}^{n} x_i f(x_i)Δx ≈ ∫_a^b x f(x) dx.

• Definition. If X is a continuous random variable with density function f_X, the expected value of X is

    E[X] = ∫_{−∞}^{∞} x f_X(x) dx.
Examples
• Let X be a continuous random variable with density function f(x) = 2x for 0 ≤ x ≤ 1, and f(x) = 0 otherwise. Then

    E[X] = ∫_{−∞}^{∞} x f(x) dx = ∫_0^1 x(2x) dx = 2/3.

• Find E[e^X] if the density function of X is f(x) = 1 for 0 ≤ x ≤ 1, and f(x) = 0 otherwise.
  Let Y = e^X. Then 1 ≤ Y ≤ e, and for 1 ≤ y ≤ e,

    F_Y(y) = P(e^X ≤ y) = P(X ≤ ln y) = ∫_0^{ln y} 1 dx = ln y,

  so f_Y(y) = F_Y′(y) = 1/y for 1 ≤ y ≤ e. Hence

    E[Y] = ∫_{−∞}^{∞} y f_Y(y) dy = ∫_1^e y(1/y) dy = e − 1.
Properties
• Proposition. Suppose a continuous r.v. X has p.d.f. f. For any function g,

    E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.

• Example. Let X have density function f(x) = 1 for 0 ≤ x ≤ 1, and f(x) = 0 otherwise. Then

    E[e^X] = ∫_{−∞}^{∞} e^x f(x) dx = ∫_0^1 e^x · 1 dx = e − 1,
    E[ln X] = ∫_{−∞}^{∞} (ln x) f(x) dx = ∫_0^1 ln x dx = −1.
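Both expectations are easy to confirm numerically (the midpoint-rule helper is my own, not from the slides; it also conveniently never evaluates ln x at the endpoint x = 0):

```python
import math

def integrate(f, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

e1 = integrate(math.exp, 0, 1)     # E[e^X] -> e - 1
e2 = integrate(math.log, 0, 1)     # E[ln X] -> -1 (singularity at 0 avoided)
print(round(e1, 4), round(e2, 3))  # 1.7183 -1.0
```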
Properties
• Let Y be a nonnegative continuous r.v. with p.d.f. f. Interchanging the order of integration,

    ∫_0^{∞} P(Y > y) dy = ∫_0^{∞} ∫_y^{∞} f(x) dx dy = ∫_0^{∞} ∫_0^x f(x) dy dx
                        = ∫_0^{∞} x f(x) dx = E[Y].
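For an exponential variable the tail formula can be checked directly, since P(Y > y) = e^{−λy}. A sketch (λ = 2 is my arbitrary choice; the infinite upper limit is truncated at 20, where the tail is negligible):

```python
import math

def integrate(f, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

lam = 2.0
# E[Y] via the tail formula: integrate P(Y > y) = e^{-lam*y} over y >= 0.
tail_integral = integrate(lambda y: math.exp(-lam * y), 0, 20)
print(round(tail_integral, 6))   # 0.5, i.e. E[Y] = 1/lam
```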
Properties
• Lemma. If X is a nonnegative continuous r.v. with p.d.f. f, then

    E[X] = ∫_0^{∞} P(X > y) dy.

• Proposition. If X is a continuous r.v. with p.d.f. f, then

    E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.
Examples
• A stick of length 1 is split at a point U with p.d.f. f(u) = 1 for 0 < u < 1, and f(u) = 0 otherwise.
  Fix 0 ≤ a ≤ 1 and find the expected length of the piece containing the point a.

  Solution. Let L(U) be the length of the substick containing a:

    L(U) = U if U > a, and L(U) = 1 − U if U < a.

  Then

    E[L(U)] = ∫_0^1 L(u) f(u) du = ∫_0^a (1 − u) du + ∫_a^1 u du
            = (1/2)[1 − (1 − a)²] + (1/2)(1 − a²) = 1/2 + a(1 − a).
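A Monte Carlo check of E[L(U)] = 1/2 + a(1 − a), with a = 0.3 as an arbitrary test point (my choice):

```python
import random

random.seed(1)
a = 0.3
n = 400_000
total = 0.0
for _ in range(n):
    u = random.random()                # break point, uniform on (0, 1)
    total += u if u > a else 1 - u     # length of the piece containing a
mc = total / n
exact = 0.5 + a * (1 - a)              # = 0.71 for a = 0.3
print(abs(mc - exact) < 0.005)         # True
```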
Examples
• Suppose the travel time to an appointment is continuous with density function f.
  If one is s minutes early for the appointment, the cost is cs.
  If one is s minutes late for the appointment, the cost is ks.
  Determine the time to depart that minimizes the expected cost.

  Solution. Let X be the travel time and let t be the time allowed for the trip. The cost is

    C_t(X) = c(t − X) if X ≤ t, and C_t(X) = k(X − t) if X ≥ t.

  Then

    E[C_t(X)] = ∫_0^{∞} C_t(x) f(x) dx = ∫_0^t c(t − x) f(x) dx + ∫_t^{∞} k(x − t) f(x) dx
              = ct ∫_0^t f(x) dx − c ∫_0^t x f(x) dx + k ∫_t^{∞} x f(x) dx − kt ∫_t^{∞} f(x) dx.

  Recall the fundamental theorem of calculus: (d/dt) ∫_a^t g(x) dx = g(t).
Examples
• Suppose the travel time is continuous with density function f.
  If one is s minutes early for the appointment, the cost is cs.
  If one is s minutes late for the appointment, the cost is ks.
  Determine the time to depart that minimizes the expected cost.

  Solution (continued). Let X be the travel time and t the time allowed. Differentiating,

    (d/dt) E[C_t(X)] = cF(t) + ctf(t) − ctf(t) − ktf(t) − k(1 − F(t)) + ktf(t)
                     = (c + k)F(t) − k.

  Thus (d/dt) E[C_t(X)] = 0 at t = t* with F(t*) = k/(c + k). Moreover,

    (d²/dt²) E[C_t(X)] = (c + k)f(t) ≥ 0,

  so E[C_t(X)] attains its minimum at t = t*.
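The first-order condition F(t*) = k/(c + k) can be verified numerically. The sketch below assumes an exponential travel time with mean 30 minutes and costs c = 1, k = 4 (all illustrative parameters of my own), and grid-searches t near the predicted optimum:

```python
import math

c, k, mean = 1.0, 4.0, 30.0                       # illustrative parameters
f = lambda x: math.exp(-x / mean) / mean          # exponential travel-time density

def integrate(g, a, b, n=4_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

def expected_cost(t):
    early = integrate(lambda x: c * (t - x) * f(x), 0, t)
    late = integrate(lambda x: k * (x - t) * f(x), t, 600)  # truncated tail
    return early + late

t_grid = [40 + i * 0.1 for i in range(200)]       # search over [40, 60)
t_star = min(t_grid, key=expected_cost)
t_pred = -mean * math.log(1 - k / (c + k))        # solves F(t) = k/(c+k), ~48.3
print(abs(t_star - t_pred) < 0.5)                 # True
```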
Variance
• Proposition. Let X be a continuous random variable. For a, b ∈ ℝ,

    E[aX + b] = aE[X] + b.
Uniform Random Variable
Uniform Random Variable
• Definition. A random variable X is uniformly distributed over (α, β) if its density function is

    f(x) = 1/(β − α) for α < x < β, and f(x) = 0 otherwise.

• Let X be uniform over (0, 1).
  For a > 0, Y = aX + b is uniform over (b, a + b).
  Hence Y = (β − α)X + α is uniform over (α, β), with

    E[Y] = (β − α)E[X] + α = (β − α) · 1/2 + α = (α + β)/2,
    Var(Y) = (β − α)² Var(X) = (β − α)²/12.

  Moreover, the cumulative distribution function of Y is

    F_Y(y) = 0 for y ≤ α;  F_Y(y) = (y − α)/(β − α) for α < y < β;  F_Y(y) = 1 for y ≥ β.
Examples
• Suppose X is uniform over (0, 10).

    P(X < 3) = P(0 < X < 3) = (3 − 0)/(10 − 0) = 3/10.
    P(X > 6) = P(6 < X < 10) = (10 − 6)/(10 − 0) = 2/5.
    P(3 < X < 12) = P(3 < X < 10) = (10 − 3)/(10 − 0) = 7/10.
Normal Random Variables

[Figure: the standard normal density f(x) = (1/√(2π)) e^{−x²/2}, plotted for −3 ≤ x ≤ 3; the peak value at x = 0 is 1/√(2π) ≈ 0.399.]
Standard Normal Random Variable
• Suppose Z has probability density function f(x) = (1/√(2π)) e^{−x²/2}. Then

    E[Z²] = (1/√(2π)) ∫_{−∞}^{∞} x² e^{−x²/2} dx.

  Integrate by parts: let u = x and dv/dx = x e^{−x²/2}, so that du/dx = 1 and v = −e^{−x²/2}. Then

    E[Z²] = (1/√(2π)) ( [−x e^{−x²/2}]_{−∞}^{∞} + ∫_{−∞}^{∞} e^{−x²/2} dx )
          = (1/√(2π)) (0 + √(2π)) = 1.

  Hence Var(Z) = E[Z²] − (E[Z])² = 1 − 0 = 1.
Examples
• Let X be a normal random variable with parameters µ = 3 and σ² = 9.
  Then Z = (X − µ)/σ = (X − 3)/3 is standard normally distributed.

    P(X > 0) = P((X − 3)/3 > (0 − 3)/3) = P(Z > −1)
             = 1 − Φ(−1) = Φ(1) ≈ 0.8413.

    P(|X − 3| > 6) = P(X > 9) + P(X < −3)
                   = P((X − 3)/3 > (9 − 3)/3) + P((X − 3)/3 < (−3 − 3)/3)
                   = P(Z > 2) + P(Z < −2) = 2P(Z > 2)
                   = 2[1 − Φ(2)] ≈ 0.0456.
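Φ can be expressed through the error function, Φ(z) = (1 + erf(z/√2))/2, so both probabilities are easy to reproduce. (The slide's 0.0456 reflects a four-decimal normal table; full precision gives 0.0455.)

```python
import math

Phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal c.d.f.

p1 = Phi(1.0)                        # P(X > 0) = Phi(1)
p2 = 2 * (1 - Phi(2.0))              # P(|X - 3| > 6) = 2[1 - Phi(2)]
print(round(p1, 4), round(p2, 4))    # 0.8413 0.0455
```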
Examples
• Let X be a normal random variable with parameters (µ, σ²).
  Let Z = (X − µ)/σ. Then Z is standard normally distributed, so

    P(X > µ + kσ) = P((X − µ)/σ > ((µ + kσ) − µ)/σ) = P(Z > k) = 1 − Φ(k).

  In particular:

    P(X > µ)      = 1 − Φ(0) = 0.5,
    P(X > µ + σ)  = 1 − Φ(1) ≈ 0.1587,
    P(X > µ + 2σ) = 1 − Φ(2) ≈ 0.0228,
    P(X > µ + 3σ) = 1 − Φ(3) ≈ 0.0013,
    P(X > µ + 4σ) = 1 − Φ(4) ≈ 0.00003.
Approximation to Binomial Distribution
• Consider independent trials with success probability p.
  Let S_n be the number of successes in the first n trials.
  Then S_n is binomial with parameters (n, p), with E[S_n] = np and Var(S_n) = np(1 − p).
  De Moivre and Laplace observed that S_n may be approximated by a normal random variable with parameters µ = np and σ² = np(1 − p). Equivalently, (S_n − np)/√(np(1 − p)) is approximately standard normal:

    P( a ≤ (S_n − np)/√(np(1 − p)) ≤ b ) ≈ Φ(b) − Φ(a).
Examples
• The class size is expected to be 150. Only 30% of those accepted for admission will attend, and 450 applications are accepted. Find the probability that the class size exceeds 150.

  Solution. Let X be the number of students who attend.
  Then X is binomial with parameters n = 450 and p = 0.3, so the exact probability is

    P(X > 150) = Σ_{i=151}^{450} C(450, i) (0.3)^i (0.7)^{450−i} ≈ 0.0565.

  Alternatively, E[X] = 135 and Var(X) = 94.5. Using the normal approximation with continuity correction,

    P(X > 150) = P(X ≥ 150.5) ≈ 1 − Φ((150.5 − 135)/√94.5) = 1 − Φ(1.59) ≈ 0.056.
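Both the exact binomial sum and the normal approximation are reproducible in a few lines (`math.comb` requires Python 3.8+):

```python
import math

n, p = 450, 0.3
# Exact: P(X > 150) as a binomial tail sum.
exact = sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
            for i in range(151, n + 1))

mu, var = n * p, n * p * (1 - p)                  # 135 and 94.5
Phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
approx = 1 - Phi((150.5 - mu) / math.sqrt(var))   # with continuity correction
print(round(exact, 4), round(approx, 4))
```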
Examples
• 52% of the balls in a bin are white, and a sample of n balls is randomly chosen. Find the probability that more than 50% of the balls in the sample are white.

  Solution. Let S_n be the number of white balls in the sample, and let N be the total number of balls.
  Then S_n is hypergeometric with parameters (n, N, 0.52N).
  For large N, S_n is approximately binomial with parameters (n, 0.52). Hence

    P(S_n > 0.5n) = P( (S_n − 0.52n)/√(n(0.52)(0.48)) > (0.5n − 0.52n)/√(n(0.52)(0.48)) )
                  = P( (S_n − 0.52n)/√(0.2496n) > −0.04√n ) ≈ Φ(0.04√n).

  For instance, P(S_1001 > 500.5) ≈ Φ(0.04√1001) ≈ 0.8973.
Exponential Random Variables
Exponential Random Variables
• Let X be exponential with parameter λ > 0.
  It is the waiting time until the first event of a Poisson process of rate λ occurs.
  Let Y = λX. Then Y is exponentially distributed with parameter 1, and E[Y] = Var(Y) = 1.
  It follows that

    E[X] = E[Y]/λ = 1/λ,
    Var(X) = Var(Y)/λ² = 1/λ².
Properties
• Let X be the lifetime of a component, and suppose the component has survived for t hours.
  The probability that it survives at least another s hours is

    P(X > s + t | X > t) = P(X > s + t)/P(X > t).

  Suppose that X is exponentially distributed with parameter λ. Then

    P(X > s + t | X > t) = e^{−λ(s+t)}/e^{−λt} = e^{−λs} = P(X > s).

  The component does not remember that it has already been in use for a time t.
• Definition. If P(X > s + t | X > t) = P(X > s) for all s, t ≥ 0, then X is said to be memoryless.
  In particular, exponential random variables are memoryless.
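Memorylessness can be checked by simulation; the sketch below compares the conditional and unconditional tails for an exponential sample (the parameter values λ, s, t are my own choices):

```python
import random

random.seed(2)
lam, s, t = 0.5, 1.0, 3.0
xs = [random.expovariate(lam) for _ in range(400_000)]   # X ~ Exp(lam)

survivors = [x for x in xs if x > t]                       # condition on X > t
cond = sum(x > s + t for x in survivors) / len(survivors)  # P(X > s+t | X > t)
uncond = sum(x > s for x in xs) / len(xs)                  # P(X > s)
print(abs(cond - uncond) < 0.01)                           # True
```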
Examples
• Suppose the distance that a car can run before its battery wears out is exponentially distributed with mean 10 000 km. If one wants to take a 5 000 km trip, find the probability that the battery will not need to be replaced.

  Solution. Let X be the lifetime of the battery (in units of 1 000 km), so λ = 1/10. By memorylessness, no matter how long the battery has already been in use,

    P(X > 5 + t | X > t) = P(X > 5) = e^{−5λ} = e^{−1/2} ≈ 0.607.

  If X were not exponential, the answer would depend on how long the battery has survived:

    P(X > 5 + t | X > t) = P(X > 5 + t)/P(X > t) = [1 − F(5 + t)]/[1 − F(t)].

• Remark. Conversely, one can verify that if X is memoryless, then X must be exponentially distributed.
Examples
• Definition. X is a double exponential r.v. with parameter λ > 0 if its probability density function is

    f(x) = (1/2)λe^{−λ|x|},  −∞ < x < ∞.

  Let Y = |X|. Then for y ≥ 0,

    F_Y(y) = P(|X| ≤ y) = ∫_{−y}^{y} (1/2)λe^{−λ|x|} dx = 1 − e^{−λy},

  so Y is exponentially distributed with parameter λ.
Hazard Rate Function
• Let X be the survival time of an item, with distribution F and density f.
  Suppose the item has survived for a time t.
  The probability that it fails within an additional (small) time ε is

    P(X < t + ε | X > t) = P(t < X < t + ε)/P(X > t) ≈ εf(t)/[1 − F(t)].

• Definition. Let X be a positive continuous random variable, and let F̄(t) = 1 − F(t).
  Then λ(t) = f(t)/F̄(t) is the hazard (failure) rate function.
• Example. Let X be exponential with parameter λ. Then

    λ(t) = f(t)/F̄(t) = λe^{−λt}/e^{−λt} = λ,

  which is why λ is also called the rate of X.
• The hazard rate determines the distribution:

    ∫_0^t λ(s) ds = ∫_0^t f(s)/[1 − F(s)] ds = [−ln(1 − F(s))]_0^t = −ln(1 − F(t)),

  so F(t) = 1 − exp(−∫_0^t λ(s) ds).
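The relation F(t) = 1 − exp(−∫₀ᵗ λ(s) ds) implied by this computation can be spot-checked numerically. With the (arbitrarily chosen) linear hazard λ(t) = 2t the closed form is F(t) = 1 − e^{−t²}:

```python
import math

def integrate(f, a, b, n=50_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

hazard = lambda s: 2 * s          # example hazard rate (my choice)
t = 1.3
F_from_hazard = 1 - math.exp(-integrate(hazard, 0, t))
F_closed_form = 1 - math.exp(-t ** 2)
print(abs(F_from_hazard - F_closed_form) < 1e-9)   # True
```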
Example
• The death rate of a smoker is, at each age, twice that of a non-smoker. What does this mean?

  Solution. Let λ_s(t) and λ_n(t) be the hazard rates of smokers and non-smokers, so λ_s(t) = 2λ_n(t).
  The probability P_s that an A-year-old smoker survives until age B (> A) is

    P_s = P(X_s > B | X_s > A) = [1 − F_s(B)]/[1 − F_s(A)]
        = e^{−∫_0^B λ_s(t) dt} / e^{−∫_0^A λ_s(t) dt} = e^{−∫_A^B λ_s(t) dt}.

  Similarly, the probability P_n that an A-year-old non-smoker survives until age B is

    P_n = e^{−∫_A^B λ_n(t) dt}.

  Hence

    P_s = e^{−∫_A^B 2λ_n(t) dt} = ( e^{−∫_A^B λ_n(t) dt} )² = P_n².
Gamma Distribution
• Let N(t) be a Poisson process with rate λ: P(N(t) = n) = e^{−λt}(λt)^n/n!.
  The waiting time T_1 until the first event is exponential of rate λ.
  Let T_2 be the waiting time until the second event:

    P(T_2 > t) = P(N(t) < 2) = e^{−λt}(1 + λt),

  so its density function is f_2(t) = −(d/dt) P(T_2 > t) = λe^{−λt}(λt).
  Let T_3 be the waiting time until the third event:

    P(T_3 > t) = P(N(t) < 3) = e^{−λt}(1 + λt + (λt)²/2),

  so its density function is f_3(t) = −(d/dt) P(T_3 > t) = (1/2)λe^{−λt}(λt)².
  In general, if T_n is the waiting time until the nth event, its density function is

    f_n(t) = λe^{−λt}(λt)^{n−1}/(n − 1)!.
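Since T_n is the sum of n independent Exp(λ) interarrival gaps, a simulation readily confirms E[T_n] = n/λ (λ = 2 and n = 3 are my illustrative values):

```python
import random

random.seed(3)
lam, n = 2.0, 3
# Waiting time to the n-th event = sum of n independent Exp(lam) gaps.
samples = [sum(random.expovariate(lam) for _ in range(n))
           for _ in range(200_000)]
mean = sum(samples) / len(samples)
print(abs(mean - n / lam) < 0.02)   # True: E[T_3] = 3/2
```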
Gamma Distribution
• Definition. A r.v. X is gamma with parameters (n, λ), λ > 0, if its density function is

    f(t) = λe^{−λt}(λt)^{n−1}/(n − 1)!,  t ≥ 0.

  It is the time at which the nth event of a Poisson process of rate λ occurs.
• Definition. More generally, a r.v. X is gamma with parameters (α, λ), α, λ > 0, if its density function is

    f(t) = λe^{−λt}(λt)^{α−1}/Γ(α),  t ≥ 0,

  where Γ(α) = ∫_0^{∞} λe^{−λt}(λt)^{α−1} dt = ∫_0^{∞} e^{−y} y^{α−1} dy is the gamma function.
• Remarks. One can prove that
  Γ(α + 1) = αΓ(α) for α > 0, and
  Γ(n) = (n − 1)! for every positive integer n.
Gamma Distribution
• Let X be gamma with parameters (α, λ), so f(t) = λe^{−λt}(λt)^{α−1}/Γ(α) for t ≥ 0. Then

    E[X] = (1/Γ(α)) ∫_0^{∞} t · λe^{−λt}(λt)^{α−1} dt
         = (1/Γ(α)) ∫_0^{∞} e^{−λt}(λt)^{α} dt
         = (1/(λΓ(α))) ∫_0^{∞} λe^{−λt}(λt)^{(α+1)−1} dt
         = Γ(α + 1)/(λΓ(α))
         = α/λ.
Gamma Distribution
• Let X be gamma with parameters (α, λ), so f(t) = λe^{−λt}(λt)^{α−1}/Γ(α) for t ≥ 0. Then

    E[X²] = (1/Γ(α)) ∫_0^{∞} t² · λe^{−λt}(λt)^{α−1} dt
          = (1/(λΓ(α))) ∫_0^{∞} e^{−λt}(λt)^{α+1} dt
          = (1/(λ²Γ(α))) ∫_0^{∞} λe^{−λt}(λt)^{(α+2)−1} dt
          = Γ(α + 2)/(λ²Γ(α)) = α(α + 1)/λ².

  Hence

    Var(X) = E[X²] − (E[X])² = α(α + 1)/λ² − α²/λ² = α/λ².
Beta Distribution
• A coin was flipped 100 times, of which 40 were heads.
  One expects the probability of showing heads to be p = 40/100 = 0.4.
  In fact, the data only show that this probability is most likely to be near 0.4: the probability itself has density function

    f(x) = x^{40−1}(1 − x)^{60−1}/B(40, 60),  0 < x < 1,

  where B(40, 60) = ∫_0^1 x^{40−1}(1 − x)^{60−1} dx.
• Definition. A r.v. X is beta with parameters (a, b), a, b > 0, if its density function is

    f(x) = x^{a−1}(1 − x)^{b−1}/B(a, b),  0 < x < 1,

  where B(a, b) = ∫_0^1 x^{a−1}(1 − x)^{b−1} dx = Γ(a)Γ(b)/Γ(a + b).
Beta Distribution
• Let X be beta with parameters (a, b), so f(x) = x^{a−1}(1 − x)^{b−1}/B(a, b) for 0 < x < 1, where B(a, b) = ∫_0^1 x^{a−1}(1 − x)^{b−1} dx.
  It represents the distribution of the probability of success, given that there are a successes and b failures in the first a + b trials.
• Remarks.
  If a = b = 1, then X is uniform on (0, 1).
  Since B(a, b) = Γ(a)Γ(b)/Γ(a + b),

    B(a + 1, b)/B(a, b) = [Γ(a + 1)Γ(b)/Γ(a + b + 1)] / [Γ(a)Γ(b)/Γ(a + b)] = a/(a + b).
Beta Distribution
• Let X be beta with parameters (a, b), so f(x) = x^{a−1}(1 − x)^{b−1}/B(a, b) for 0 < x < 1, where B(a, b) = Γ(a)Γ(b)/Γ(a + b). Then

    E[X] = ∫_0^1 x · x^{a−1}(1 − x)^{b−1}/B(a, b) dx = B(a + 1, b)/B(a, b) = a/(a + b),
    E[X²] = ∫_0^1 x² · x^{a−1}(1 − x)^{b−1}/B(a, b) dx = B(a + 2, b)/B(a, b) = (a + 1)a/[(a + b + 1)(a + b)],
    Var(X) = E[X²] − (E[X])² = ab/[(a + b)²(a + b + 1)].

  In particular, letting a = b = 1, X is uniform on (0, 1), with E[X] = 1/2 and Var(X) = 1/12.
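The moment formulas can be verified with `math.gamma` and numeric integration (a = 3, b = 5 are arbitrary test parameters of my own):

```python
import math

a, b = 3.0, 5.0
B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)   # B(a, b)
f = lambda x: x ** (a - 1) * (1 - x) ** (b - 1) / B     # beta density

def integrate(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

m1 = integrate(lambda x: x * f(x), 0, 1)       # E[X] = a/(a+b) = 3/8
m2 = integrate(lambda x: x * x * f(x), 0, 1)   # E[X^2]
var = m2 - m1 ** 2                             # ab/((a+b)^2 (a+b+1))
print(round(m1, 6), round(var, 6))             # 0.375 0.026042
```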
Distribution of a Function of a Random Variable
Examples
• Let X be a nonnegative continuous random variable with density function f_X.
  Let Y = X^n. Find the probability density function of Y.

  Solution. Let g(x) = x^n. Then g^{−1}(y) = y^{1/n} and (d/dy) g^{−1}(y) = (1/n) y^{1/n−1}.
  For y ≥ 0,

    f_Y(y) = f_X(g^{−1}(y)) · (d/dy) g^{−1}(y) = f_X(y^{1/n}) · (1/n) y^{1/n−1}.

  In particular, if n = 2,

    f_Y(y) = f_X(√y) · (1/2) y^{−1/2} = f_X(√y)/(2√y).
Examples
• Let X be a normal random variable with parameters (µ, σ²).
  Then Y = e^X is called lognormal with parameters (µ, σ²).
  In other words, Y is lognormal ⇔ ln(Y) is normal.
• Let S_n be the price of a security at the end of day n.
  One may expect S_n/S_{n−1} to be lognormal;
  in other words, X = ln(S_n/S_{n−1}) is normal, and S_n = S_{n−1} e^X.
Examples
• Let X be a normal random variable with parameters (µ, σ²). Find the probability density function of Y = e^X.

  Solution. The density of X is f_X(x) = (1/(√(2π)σ)) e^{−(x−µ)²/2σ²}.
  Let g(x) = e^x. Then g^{−1}(y) = ln y and (d/dy) g^{−1}(y) = 1/y.
  For y > 0,

    f_Y(y) = f_X(g^{−1}(y)) · (d/dy) g^{−1}(y) = f_X(ln y) · (1/y)
           = (1/(√(2π)σy)) exp(−(ln y − µ)²/2σ²).
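As a cross-check, the density above should equal the derivative of F_Y(y) = Φ((ln y − µ)/σ); comparing against a central difference (µ = 0.5, σ = 0.8, y = 2 are arbitrary test values of my own):

```python
import math

mu, sigma = 0.5, 0.8
Phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))   # standard normal c.d.f.

F_Y = lambda y: Phi((math.log(y) - mu) / sigma)          # c.d.f. of Y = e^X
f_Y = lambda y: (math.exp(-(math.log(y) - mu) ** 2 / (2 * sigma ** 2))
                 / (math.sqrt(2 * math.pi) * sigma * y))  # density derived above

y, h = 2.0, 1e-5
numeric = (F_Y(y + h) - F_Y(y - h)) / (2 * h)            # numerical derivative
print(abs(numeric - f_Y(y)) < 1e-6)                      # True
```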