Assignment 2 Solution
1.
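(a) Let $A$ = {a patient tests positive on test A} and $D$ = {a patient has disease D}. Given $\Pr(A) = 20\%$, $\Pr(D \mid A) = 21\%$, and $\Pr(D^c \mid A^c) = 99\%$, the law of total probability gives
\[
\Pr(D) = \Pr(D \mid A)\Pr(A) + \Pr(D \mid A^c)\Pr(A^c) = 21\% \times 20\% + 1\% \times 80\% = 5\%.
\]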
(c) From
\[
RR(D \mid A) = \frac{\Pr(D \mid A)}{\Pr(D \mid A^c)} = \frac{21\%}{1 - 99\%} = 21,
\]
\[
RR(D \mid B) = \frac{\Pr(D \mid B)}{\Pr(D \mid B^c)} = \frac{23\%}{3\%} \approx 7.667 < RR(D \mid A),
\]
we conclude that screening test A performs better than test B in detecting disease D.
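The arithmetic in parts (a) and (c) can be checked numerically. The snippet below is a minimal sketch: the variable names are illustrative and not part of the assignment, and the figures for test B (23% and 3%) are simply the ones quoted in the solution.

```python
# Numerical check of 1(a) and 1(c); variable names are illustrative only.
p_A = 0.20              # Pr(A): patient tests positive on test A
p_D_given_A = 0.21      # Pr(D | A)
p_Dc_given_Ac = 0.99    # Pr(D^c | A^c)

p_D_given_Ac = 1 - p_Dc_given_Ac   # Pr(D | A^c) = 1%

# Law of total probability: Pr(D) = Pr(D|A)Pr(A) + Pr(D|A^c)Pr(A^c)
p_D = p_D_given_A * p_A + p_D_given_Ac * (1 - p_A)

# Relative risks: Pr(D | test positive) / Pr(D | test negative)
rr_A = p_D_given_A / p_D_given_Ac
rr_B = 0.23 / 0.03      # values for test B as quoted in the solution

print(round(p_D, 4), round(rr_A, 3), round(rr_B, 3))
```

A larger relative risk means a positive result separates diseased from non-diseased patients more sharply, which is the sense in which test A is judged better here.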
2.
(a) The sample space of Father's genotype is $\Omega_f = \{AB, BO, AA, AO\}$ and the sample space of Mother's genotype is $\Omega_m = \{AB, AA, BB\}$. Father's and Mother's genotypes are independent, hence
\[
\Pr(\text{Father} = AB, \text{Mother} = AB) = \Pr(\text{Father} = AB)\Pr(\text{Mother} = AB) = \frac{1}{4} \times \frac{1}{2} = \frac{1}{8}.
\]
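(b) Let $A_f$ = {Father provides an allele A}, $B_f$ = {Father provides an allele B}, $A_m$ = {Mother provides an allele A}, $B_m$ = {Mother provides an allele B}. Note that the two alleles provided by Father and Mother are independent, hence
\[
\Pr(\text{Betty} = AB) = \Pr(A_f B_m \cup B_f A_m) = \Pr(A_f B_m) + \Pr(B_f A_m) = \Pr(A_f)\Pr(B_m) + \Pr(B_f)\Pr(A_m),
\]
where
\begin{align*}
\Pr(A_f) &= \Pr(\text{Father} = AB) \times \tfrac{1}{2} + \Pr(\text{Father} = AA) + \Pr(\text{Father} = AO) \times \tfrac{1}{2} \\
         &= \tfrac{1}{4} \times \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{4} \times \tfrac{1}{2} = \tfrac{1}{2}, \\
\Pr(B_f) &= \Pr(\text{Father} = AB) \times \tfrac{1}{2} + \Pr(\text{Father} = BO) \times \tfrac{1}{2} \\
         &= \tfrac{1}{4} \times \tfrac{1}{2} + \tfrac{1}{4} \times \tfrac{1}{2} = \tfrac{1}{4}, \\
\Pr(A_m) &= \Pr(\text{Mother} = AB) \times \tfrac{1}{2} + \Pr(\text{Mother} = AA) \\
         &= \tfrac{1}{2} \times \tfrac{1}{2} + \tfrac{1}{4} = \tfrac{1}{2}, \\
\Pr(B_m) &= \Pr(\text{Mother} = AB) \times \tfrac{1}{2} + \Pr(\text{Mother} = BB) \\
         &= \tfrac{1}{2} \times \tfrac{1}{2} + \tfrac{1}{4} = \tfrac{1}{2}.
\end{align*}
So $\Pr(\text{Betty} = AB) = \tfrac{1}{2} \times \tfrac{1}{2} + \tfrac{1}{4} \times \tfrac{1}{2} = \tfrac{3}{8}$.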
(c) $\Pr(\text{Betty's phenotype is A}) = \Pr(\text{Betty} = AA) + \Pr(\text{Betty} = AO)$, where
\begin{align*}
\Pr(\text{Betty} = AA) &= \Pr(A_f A_m) = \Pr(A_f)\Pr(A_m) = \tfrac{1}{4}, \\
O_f &= \{\text{Father provides an allele O}\}, \quad \Pr(O_f) = \tfrac{1}{4}, \\
O_m &= \{\text{Mother provides an allele O}\}, \quad \Pr(O_m) = 0, \\
\Pr(\text{Betty} = AO) &= \Pr(A_f O_m \cup O_f A_m) = 0 + \Pr(O_f)\Pr(A_m) = \tfrac{1}{8},
\end{align*}
hence $\Pr(\text{Betty's phenotype is A}) = \tfrac{1}{4} + \tfrac{1}{8} = \tfrac{3}{8}$.
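The results of (b) and (c) can be cross-checked by brute-force enumeration. This is a sketch under the same assumptions the solution uses: the genotype probabilities from part (a), independent parents, and each parent passing on either of their two alleles with probability 1/2. The helper name `child_genotypes` is hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Genotype distributions of the parents, as stated in part (a).
father = {"AB": F(1, 4), "BO": F(1, 4), "AA": F(1, 4), "AO": F(1, 4)}
mother = {"AB": F(1, 2), "AA": F(1, 4), "BB": F(1, 4)}

def child_genotypes(f, m):
    """Yield (genotype, probability) pairs for a child of parents f and m.

    Each parent contributes one of their two alleles with probability 1/2,
    so each of the four allele pairs has probability 1/4.
    """
    for a, b in product(f, m):
        yield "".join(sorted(a + b)), F(1, 4)

# Marginal distribution of Betty's genotype (law of total probability).
betty = {}
for (f, pf), (m, pm) in product(father.items(), mother.items()):
    for g, pg in child_genotypes(f, m):
        betty[g] = betty.get(g, 0) + pf * pm * pg

p_AB = betty["AB"]
p_pheno_A = betty.get("AA", 0) + betty.get("AO", 0)
print(p_AB, p_pheno_A)  # prints: 3/8 3/8
```

Using `Fraction` keeps the arithmetic exact, so the output can be compared directly with the hand-derived fractions.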
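(d) Recall $\Omega_f$ and $\Omega_m$, the sample spaces for Father's and Mother's genotypes. Once the parents' genotypes are given, Betty's and her brother's phenotypes are conditionally independent, with $\Pr(\text{Betty's phenotype is A}) = \Pr(\text{Brother's phenotype is A})$. They are not independent when the parents' genotypes are not given. The conditional probability of the target event is expressed, for instance, as
\[
\Pr(\text{both A} \mid F = AB, M = AB) = \Pr(\text{Betty has phenotype A} \mid F = AB, M = AB) \times \Pr(\text{Brother has phenotype A} \mid F = AB, M = AB),
\]
where "both A" denotes the event that both Betty and Brother have phenotype A. Summing over all pairs of parental genotypes,
\begin{align*}
\Pr(\text{both A})
&= \Pr(\text{both A} \mid F = AB, M = AB)\Pr(F = AB, M = AB) && \to (\tfrac{1}{4})^2 \times \tfrac{1}{4} \times \tfrac{1}{2} \\
&+ \Pr(\text{both A} \mid F = AB, M = AA)\Pr(F = AB, M = AA) && \to (\tfrac{1}{2})^2 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{both A} \mid F = AB, M = BB)\Pr(F = AB, M = BB) && \to 0 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{both A} \mid F = BO, M = AB)\Pr(F = BO, M = AB) && \to (\tfrac{1}{4})^2 \times \tfrac{1}{4} \times \tfrac{1}{2} \\
&+ \Pr(\text{both A} \mid F = BO, M = AA)\Pr(F = BO, M = AA) && \to (\tfrac{1}{2})^2 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{both A} \mid F = BO, M = BB)\Pr(F = BO, M = BB) && \to 0 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{both A} \mid F = AA, M = AB)\Pr(F = AA, M = AB) && \to (\tfrac{1}{2})^2 \times \tfrac{1}{4} \times \tfrac{1}{2} \\
&+ \Pr(\text{both A} \mid F = AA, M = AA)\Pr(F = AA, M = AA) && \to 1^2 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{both A} \mid F = AA, M = BB)\Pr(F = AA, M = BB) && \to 0 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{both A} \mid F = AO, M = AB)\Pr(F = AO, M = AB) && \to (\tfrac{1}{2})^2 \times \tfrac{1}{4} \times \tfrac{1}{2} \\
&+ \Pr(\text{both A} \mid F = AO, M = AA)\Pr(F = AO, M = AA) && \to 1^2 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{both A} \mid F = AO, M = BB)\Pr(F = AO, M = BB) && \to 0 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&= \tfrac{1}{128} + \tfrac{1}{64} + \tfrac{1}{128} + \tfrac{1}{64} + \tfrac{1}{32} + \tfrac{1}{16} + \tfrac{1}{32} + \tfrac{1}{16} \\
&= \tfrac{15}{64}.
\end{align*}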
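The twelve-term sum in (d) is easy to get wrong by hand, so a short enumeration is a useful check. The sketch below assumes the genotype probabilities from part (a) and equal-probability allele transmission; the helper name `pheno_A_prob` is hypothetical. It also compares the joint probability with the product of the marginals, which shows why the siblings' phenotypes are not unconditionally independent.

```python
from fractions import Fraction as F
from itertools import product

# Genotype distributions of the parents, as stated in part (a).
father = {"AB": F(1, 4), "BO": F(1, 4), "AA": F(1, 4), "AO": F(1, 4)}
mother = {"AB": F(1, 2), "AA": F(1, 4), "BB": F(1, 4)}

def pheno_A_prob(f, m):
    """Pr(child has phenotype A | parents' genotypes f and m)."""
    hits = sum(
        1 for a, b in product(f, m)
        if "".join(sorted(a + b)) in ("AA", "AO")
    )
    return F(hits, 4)  # four equally likely allele pairs

pairs = list(product(father.items(), mother.items()))

# Given the parents, the two children are conditionally independent,
# so the conditional probability that both have phenotype A is a square.
both_A = sum(pf * pm * pheno_A_prob(f, m) ** 2 for (f, pf), (m, pm) in pairs)

# Marginal probability for a single child, as in part (c).
one_A = sum(pf * pm * pheno_A_prob(f, m) for (f, pf), (m, pm) in pairs)

print(both_A, one_A ** 2)  # prints: 15/64 9/64
```

Since 15/64 differs from 9/64, the two phenotypes are dependent once the parents' genotypes are averaged out.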
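(e)
\[
\Pr(\text{Mother's phenotype is A}) = \Pr(\text{Mother} = AA) = \tfrac{1}{4}.
\]
\begin{align*}
\Pr(\text{Mother's phenotype is A} \cap \text{Betty's phenotype is A})
&= \Pr(\text{Mother} = AA, \text{Betty} = AA) + \Pr(\text{Mother} = AA, \text{Betty} = AO) \\
&= \Pr(\text{Mother} = AA)\Pr(A_f) + \Pr(\text{Mother} = AA)\Pr(O_f) \\
&= \tfrac{1}{4} \times (\tfrac{1}{2} + \tfrac{1}{4}) = \tfrac{3}{16}.
\end{align*}
Hence
\[
\Pr(\text{Betty's phenotype is A} \mid \text{Mother's phenotype is A}) = \frac{3/16}{1/4} = \frac{3}{4}.
\]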
(f) From (e) and (c),
\[
\Pr(\text{Mother's phenotype is A} \mid \text{Betty's phenotype is A}) = \frac{3/16}{3/8} = \frac{1}{2}.
\]
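Parts (e) and (f) divide the same joint probability by two different marginals. The following sketch makes that explicit with exact fractions; the variable names are illustrative, and the inputs are the values derived in (b), (c) and (e).

```python
from fractions import Fraction as F

p_mother_A = F(1, 4)            # Pr(Mother's phenotype is A) = Pr(Mother = AA)
p_betty_A = F(3, 8)             # Pr(Betty's phenotype is A), from part (c)
p_Af, p_Of = F(1, 2), F(1, 4)   # Father passes allele A / allele O

# A mother with genotype AA always passes A, so Betty has phenotype A
# exactly when Father passes A (giving AA) or O (giving AO).
p_joint = p_mother_A * (p_Af + p_Of)

p_betty_given_mother = p_joint / p_mother_A   # part (e)
p_mother_given_betty = p_joint / p_betty_A    # part (f)

print(p_joint, p_betty_given_mother, p_mother_given_betty)  # prints: 3/16 3/4 1/2
```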
3.
(b)
\[
\Pr(2 < X \le 3) = \Pr(X \le 3) - \Pr(X \le 2) = F(3) - F(2) = 0.8 - 0.6 = 0.2.
\]
(d) $\Pr(X = 6) = 0$.
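(e)
\[
\mu = E(X) = \sum_{i=1}^{n} x_i \Pr(X = x_i) = 1 \times 0.3 + 2 \times 0.3 + 3 \times 0.2 + 5 \times 0.2 = 2.5,
\]
\[
E(X^2) = \sum_{i=1}^{n} x_i^2 \Pr(X^2 = x_i^2) = 1 \times 0.3 + 4 \times 0.3 + 9 \times 0.2 + 25 \times 0.2 = 8.3,
\]
hence $\mathrm{Var}(X) = E(X^2) - \mu^2 = 8.3 - 2.5^2 = 2.05$.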
Also,
= F(3.932) − F(1.068)
(h) The transformation $Z = X^2$ is not linear, so the linearity property of expectation cannot be applied directly. Instead, we first derive the P.M.F. of $Z$:
\[
\begin{array}{c|cccc}
z & 1 & 4 & 9 & 25 \\ \hline
\Pr(Z = z) & 0.3 & 0.3 & 0.2 & 0.2
\end{array}
\]
The CDF then follows by accumulating the P.M.F.:
\[
F(z) =
\begin{cases}
0, & z < 1 \\
0.3, & 1 \le z < 4 \\
0.6, & 4 \le z < 9 \\
0.8, & 9 \le z < 25 \\
1, & z \ge 25
\end{cases}
\]
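The moments in (e) and the distribution of $Z = X^2$ in (h) can be reproduced in a few lines. This sketch takes the P.M.F. of $X$ as read off from the sum in part (e) (values 1, 2, 3, 5 with probabilities 0.3, 0.3, 0.2, 0.2); the names `pmf_x`, `pmf_z` and `cdf_z` are illustrative.

```python
from fractions import Fraction as F

# P.M.F. of X, read off from the expectation sum in part (e).
pmf_x = {1: F(3, 10), 2: F(3, 10), 3: F(2, 10), 5: F(2, 10)}

mean = sum(x * p for x, p in pmf_x.items())
second_moment = sum(x**2 * p for x, p in pmf_x.items())
var = second_moment - mean**2

# P.M.F. of Z = X^2: add up the probability of every x that maps to the same z.
pmf_z = {}
for x, p in pmf_x.items():
    pmf_z[x**2] = pmf_z.get(x**2, 0) + p

def cdf_z(z):
    """F(z) = Pr(Z <= z), obtained by accumulating the P.M.F."""
    return sum(p for v, p in pmf_z.items() if v <= z)

print(float(mean), float(second_moment), float(var))
print({z: float(p) for z, p in pmf_z.items()})
print([float(cdf_z(z)) for z in (0.5, 1, 4, 9, 25)])
```

Note that $E(Z) = E(X^2) = 8.3$, which is not $(E(X))^2 = 6.25$; this is the practical consequence of the transformation being non-linear.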
4.
\[
= 0.98^{135} + \binom{135}{1} \times 0.02 \times 0.98^{134} + \binom{135}{2} \times 0.02^{2} \times 0.98^{133} + \binom{135}{3} \times 0.02^{3} \times 0.98^{132} \approx 0.7148
\]
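The four-term sum above is the probability of at most three successes for a binomial variable with $n = 135$ and $p = 0.02$. A minimal sketch of the same calculation, with a hypothetical helper `binom_pmf`:

```python
from math import comb

n, p = 135, 0.02

def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Sum of the terms k = 0, 1, 2, 3, matching the expression above.
p_at_most_3 = sum(binom_pmf(k, n, p) for k in range(4))
print(round(p_at_most_3, 4))
```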
Figure 1: CDF
Figure 2: Density of X
(e) Since $n = 135 > 20$, $p = 0.02 < 0.1$, and $np = 2.7 < 5$, $X$ can be well approximated by $Y \sim \mathrm{Poisson}(\mu = 2.7)$.
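To see how good the approximation in (e) is, one can compare the two P.M.F.s point by point. The sketch below assumes $X \sim \mathrm{Bin}(135, 0.02)$ and $Y \sim \mathrm{Poisson}(2.7)$ as stated; the helper names and the choice to compare only $k = 0, \ldots, 10$ are illustrative.

```python
from math import comb, exp, factorial

n, p = 135, 0.02
mu = n * p  # Poisson mean matched to the binomial mean, 2.7

def binom_pmf(k):
    """Pr(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def pois_pmf(k):
    """Pr(Y = k) for Y ~ Poisson(mu)."""
    return exp(-mu) * mu**k / factorial(k)

# Largest pointwise discrepancy over the values that carry almost all the mass.
max_gap = max(abs(binom_pmf(k) - pois_pmf(k)) for k in range(11))

# Poisson counterpart of the "at most 3" probability.
pois_at_most_3 = sum(pois_pmf(k) for k in range(4))

print(round(pois_at_most_3, 4), round(max_gap, 4))
```

The Poisson value for "at most 3" should land close to the exact binomial figure of about 0.7148, which is what the rule of thumb on $n$, $p$ and $np$ is meant to guarantee.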
(f)
(g) Because this is a random sample, we can estimate the proportion from it: $\hat{p} = 5/10 = 0.5$. This estimate is far from the assumed $p = 0.02$. Possible explanations are: 1) by chance, this sample happens to contain a large proportion of students living on Lamma Island; 2) the students in this class were not randomly selected from the whole of Hong Kong, for example because the school is near Lamma Island; or 3) the assumed true proportion is wrong.
5.
= 8/27 = 0.2963
(c)
\begin{align*}
\Pr(X > Y) &= \sum_{i=0}^{2} \Pr(Y < X \mid X = i)\Pr(X = i) \\
&= \sum_{i=0}^{2} \Pr(Y < i)\Pr(X = i) \\
&= \Pr(Y = 0)\Pr(X = 1) + \left[\Pr(Y = 0) + \Pr(Y = 1)\right]\Pr(X = 2) \\
&= (1 - p)^4 \times 2p(1 - p) + \left[(1 - p)^4 + 4p(1 - p)^3\right] p^2 \\
&= 112/729 = 0.1536
\end{align*}
(d) The mode is the point that maximizes the P.M.F. Thus the modes of $X$ are 0 and 1, and the mode of $Y$ is 1. Since $p = 1/3 < 1/2$ for both $X$ and $Y$, both distributions are right-skewed.
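Parts (c) and (d) can be verified by enumerating the joint distribution. The sketch below assumes $X \sim \mathrm{Bin}(2, p)$ and $Y \sim \mathrm{Bin}(4, p)$, independent, with $p = 1/3$; these parameters are inferred from the terms that appear in the solution (for example $2p(1-p)$ and $4p(1-p)^3$), not quoted from the problem statement. The helper names are hypothetical.

```python
from fractions import Fraction as F
from math import comb

def binom_pmf(k, n, p):
    """Pr(K = k) for K ~ Bin(n, p), exact when p is a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def modes(pmf):
    """All points at which a P.M.F. (given as a list) attains its maximum."""
    top = max(pmf)
    return [k for k, q in enumerate(pmf) if q == top]

p = F(1, 3)                                       # inferred success probability
pmf_x = [binom_pmf(i, 2, p) for i in range(3)]    # X ~ Bin(2, 1/3), assumed
pmf_y = [binom_pmf(j, 4, p) for j in range(5)]    # Y ~ Bin(4, 1/3), assumed

# Pr(X > Y): sum the joint probabilities over all pairs with j < i.
p_x_gt_y = sum(pmf_x[i] * pmf_y[j] for i in range(3) for j in range(5) if j < i)

print(p_x_gt_y, modes(pmf_x), modes(pmf_y))  # prints: 112/729 [0, 1] [1]
```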
6.
(a) From the problem setting, $X \sim \mathrm{Bin}(2, 1/2)$, $Y \sim \mathrm{Bin}(3, 1/2)$, and $X$ is independent of $Y$.
(b) By the linearity of expectation, $E(Z) = E(X + Y) = E(X) + E(Y) = 2 \times 1/2 + 3 \times 1/2 = 5/2$. Also, by independence, $\mathrm{Var}(Z) = \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = 2 \times (1/2)^2 + 3 \times (1/2)^2 = 5/4$.
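(c) To obtain the P.M.F. of $Z$, we condition on the value of $X$. For $k = 0, 1, \ldots, 5$,
\begin{align*}
\Pr(Z = k) &= \sum_{j=0}^{2} \Pr(Z = k \mid X = j)\Pr(X = j) \\
&= \sum_{j=0}^{2} \Pr(X + Y = k \mid X = j)\Pr(X = j) \\
&= \sum_{j=0}^{2} \Pr(Y = k - j)\Pr(X = j)\,\mathbf{1}(0 \le k - j \le 3) \\
&= \sum_{j=0}^{2} \binom{3}{k-j} \left(\tfrac{1}{2}\right)^{k-j} \left(\tfrac{1}{2}\right)^{3-k+j} \binom{2}{j} \left(\tfrac{1}{2}\right)^{j} \left(\tfrac{1}{2}\right)^{2-j} \mathbf{1}(0 \le k - j \le 3) \\
&= \frac{1}{2^5} \sum_{j=0}^{2} \binom{3}{k-j} \binom{2}{j}\,\mathbf{1}(0 \le k - j \le 3) \\
&= \binom{5}{k} \frac{1}{2^5}.
\end{align*}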
Since the P.M.F. characterizes the distribution, $Z$ follows a Binomial distribution with $n = 5$ and $p = 1/2$.
Conclusion: for any two independent Binomial random variables $X \sim \mathrm{Bin}(n_1, p)$ and $Y \sim \mathrm{Bin}(n_2, p)$ with the same $p$, the sum $Z = X + Y$ follows $\mathrm{Bin}(n_1 + n_2, p)$.
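The conclusion for this problem can be checked by carrying out the convolution numerically. The sketch below assumes $X \sim \mathrm{Bin}(2, 1/2)$ and $Y \sim \mathrm{Bin}(3, 1/2)$, consistent with the means and variances used in (b); the helper `binom_pmf` is hypothetical.

```python
from fractions import Fraction as F
from math import comb

def binom_pmf(k, n, p):
    """Pr(K = k) for K ~ Bin(n, p); zero outside the support 0..n."""
    if k < 0 or k > n:
        return F(0)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

half = F(1, 2)

# Convolution: Pr(Z = k) = sum_j Pr(X = j) * Pr(Y = k - j).
pmf_z = {
    k: sum(binom_pmf(j, 2, half) * binom_pmf(k - j, 3, half) for j in range(3))
    for k in range(6)
}

matches_bin5 = all(pmf_z[k] == binom_pmf(k, 5, half) for k in range(6))
print(pmf_z[0], pmf_z[2], matches_bin5)  # prints: 1/32 5/16 True
```

Returning zero outside the support plays the role of the indicator function in the derivation.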
(e) For $W = X - Y$ we can use the same technique. Given $X = j$, the event $W = k$ requires $Y = j - k$, so for $k = -3, -2, \ldots, 2$,
\begin{align*}
\Pr(W = k) &= \sum_{j=0}^{2} \Pr(W = k \mid X = j)\Pr(X = j) \\
&= \sum_{j=0}^{2} \Pr(X - Y = k \mid X = j)\Pr(X = j) \\
&= \sum_{j=0}^{2} \Pr(Y = j - k)\Pr(X = j)\,\mathbf{1}(0 \le j - k \le 3) \\
&= \frac{1}{2^5} \sum_{j=0}^{2} \binom{3}{j-k} \binom{2}{j}\,\mathbf{1}(0 \le j - k \le 3).
\end{align*}
This is not a standard distribution. However, the expectation and variance are still easy to derive: $E(W) = E(X - Y) = E(X) - E(Y) = 1 - 3/2 = -1/2$, and, by independence, $\mathrm{Var}(W) = \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = 2/4 + 3/4 = 5/4$.
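The distribution of $W = X - Y$ can be tabulated directly, which also confirms the mean and variance. This is a sketch assuming $X \sim \mathrm{Bin}(2, 1/2)$ and $Y \sim \mathrm{Bin}(3, 1/2)$, independent; note that $X - Y = k$ together with $X = j$ requires $Y = j - k$. The helper `binom_pmf` is hypothetical.

```python
from fractions import Fraction as F
from math import comb

def binom_pmf(k, n, p):
    """Pr(K = k) for K ~ Bin(n, p); zero outside the support 0..n."""
    if k < 0 or k > n:
        return F(0)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

half = F(1, 2)

# Pr(W = k) = sum_j Pr(X = j) * Pr(Y = j - k), for k = -3, ..., 2.
pmf_w = {
    k: sum(binom_pmf(j, 2, half) * binom_pmf(j - k, 3, half) for j in range(3))
    for k in range(-3, 3)
}

mean_w = sum(k * q for k, q in pmf_w.items())
var_w = sum(k**2 * q for k, q in pmf_w.items()) - mean_w**2

print(pmf_w[-3], pmf_w[2], mean_w, var_w)  # prints: 1/32 1/32 -1/2 5/4
```

The P.M.F. is not symmetric about zero because $X$ and $Y$ have different numbers of trials.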
7.
(a) $X \sim \mathrm{Poisson}(\mu)$, $\Pr(X = k) = \dfrac{\mu^k e^{-\mu}}{k!}$.
\[
\Pr(X = 4) = \Pr(X = 5) \implies \frac{\mu^4}{4!} = \frac{\mu^5}{5!} \implies \mu = 5.
\]
Hence $E(X) = \mathrm{Var}(X) = \mu = 5$.
(b) $\Pr(X = k) = \dfrac{5^k e^{-5}}{k!}$. Recall that the Poisson distribution is right-skewed. Since $\Pr(X = 4) = \Pr(X = 5) = 0.1755$, $\Pr(X = 3) = e^{-5} \times \frac{5^3}{3!} = 0.140 < 0.1755$, and $\Pr(X = 6) = e^{-5} \times \frac{5^6}{6!} = 0.1462 < 0.1755$, we conclude that 4 or 5 cracks in 15 km have the largest probability, 0.1755.
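The mode in (b) can be located exactly by comparing the weights $\mu^k / k!$, since the common factor $e^{-\mu}$ does not change which $k$ is largest. The sketch below assumes $\mu = 5$ from part (a); truncating the search at $k = 20$ is an illustrative choice that is safe because the weights decrease once $k$ exceeds $\mu$.

```python
from fractions import Fraction
from math import exp, factorial

mu = 5  # from part (a)

# Unnormalised weights mu^k / k!, kept exact to detect ties reliably.
weights = {k: Fraction(mu**k, factorial(k)) for k in range(21)}
top = max(weights.values())
modes = [k for k, w in weights.items() if w == top]

# Probability at the mode(s), with the normalising factor restored.
p_mode = exp(-mu) * float(top)
print(modes, round(p_mode, 4))  # prints: [4, 5] 0.1755
```

Exact fractions are used for the comparison because two floating-point values that are equal in theory need not be bit-identical in practice.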
(c) Since the expected number of cracks in 15 km is $\mu = 5$, the expected number of cracks in 5 km is $5/3$. Let $Y$ be the number of cracks in 5 km; then $Y \sim \mathrm{Poisson}(5/3)$, and
\[
\Pr(\text{at least 1 crack in 5 km}) = \Pr(Y \ge 1) = 1 - \Pr(Y = 0) = 1 - e^{-5/3} = 0.8111.
\]
(d) $\Pr(Y = k) = \dfrac{e^{-5/3} (5/3)^k}{k!}$, $k = 0, 1, 2, \ldots$. Hence $\Pr(Y = 0) = e^{-5/3} = 0.1889$, $\Pr(Y \le 1) = 0.5037$, $\Pr(Y \le 2) = 0.766$, $\Pr(Y \le 3) = 0.9117$, $\Pr(Y \le 4) = 0.9725$.
(e) Suppose $n$ packages should be prepared to ensure $\Pr(n \ge Y) \ge 95\%$. From (d) we have $\Pr(Y \le 3) = 0.9117 < 95\%$ and $\Pr(Y \le 4) = 0.9725 > 95\%$. Hence $n$ should be 4.
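The cumulative probabilities in (d) and the threshold search in (e) amount to finding the smallest $n$ whose CDF value reaches 95%. A minimal sketch, assuming $Y \sim \mathrm{Poisson}(5/3)$ as derived in (c); the helper `pois_cdf` is hypothetical.

```python
from math import exp, factorial

mu = 5 / 3  # expected number of cracks in 5 km, from part (c)

def pois_cdf(n, mu):
    """Pr(Y <= n) for Y ~ Poisson(mu)."""
    return sum(exp(-mu) * mu**k / factorial(k) for k in range(n + 1))

# Cumulative probabilities for n = 0, ..., 4, as listed in part (d).
cdf_values = [pois_cdf(n, mu) for n in range(5)]

# Smallest n with Pr(Y <= n) >= 0.95, as required in part (e).
n_needed = 0
while pois_cdf(n_needed, mu) < 0.95:
    n_needed += 1

print([round(v, 4) for v in cdf_values], n_needed)
```

The same loop works for any coverage level; only the threshold 0.95 would change.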