
Assignment 2 Solution

This document contains solutions to STAT 5101 Assignment 2 problems. It includes calculations of probabilities using the total probability rule and conditional probabilities. It also covers probabilities of genotypes and phenotypes in genetic crosses, and calculating related conditional probabilities. The problems demonstrate applying probability concepts to genetics scenarios.

Uploaded by

godlionwolf

STAT 5101 Assignment 2 Solutions

October 15, 2021

1.

(a) Let A = {a patient tests positive on test A} and D = {a patient has disease D}. Given $\Pr(A) = 20\%$, $\Pr(D \mid A) = 21\%$, and $\Pr(D^c \mid A^c) = 99\%$, the total probability rule gives

\[
\begin{aligned}
\Pr(D) &= \Pr(D \mid A)\Pr(A) + \Pr(D \mid A^c)\Pr(A^c) \\
&= 21\% \times 20\% + (1 - 99\%) \times (1 - 20\%) \\
&= 5\%.
\end{aligned}
\]


(b) Let B = {a patient tests positive on test B}. Given that $\Pr(D \mid B) = 23\%$ and $\Pr(B) = 10\%$, we have

\[
\begin{aligned}
\Pr(D \mid B^c) &= \frac{\Pr(D \cap B^c)}{\Pr(B^c)} \\
&= \frac{\Pr(D) - \Pr(D \cap B)}{1 - \Pr(B)} \\
&= \frac{\Pr(D) - \Pr(D \mid B)\Pr(B)}{1 - \Pr(B)} \\
&= \frac{5\% - 2.3\%}{1 - 10\%} \\
&= 3\%.
\end{aligned}
\]

(c) From

\[
RR(D \mid A) = \frac{\Pr(D \mid A)}{\Pr(D \mid A^c)} = \frac{21\%}{1 - 99\%} = 21,
\]

\[
RR(D \mid B) = \frac{\Pr(D \mid B)}{\Pr(D \mid B^c)} = \frac{23\%}{3\%} \approx 7.667 < RR(D \mid A),
\]

we conclude that screening test A performs better than test B in detecting disease D.
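The arithmetic in parts (a) to (c) can be checked numerically. The following is a minimal sketch that uses only the probabilities quoted above; the variable names are illustrative and are not part of the original solution.

```python
# Inputs quoted in Problem 1 (variable names are illustrative).
p_A = 0.20             # Pr(A): positive on test A
p_D_given_A = 0.21     # Pr(D | A)
p_Dc_given_Ac = 0.99   # Pr(D^c | A^c)
p_B = 0.10             # Pr(B): positive on test B
p_D_given_B = 0.23     # Pr(D | B)

# (a) Total probability rule: Pr(D) = Pr(D|A)Pr(A) + Pr(D|A^c)Pr(A^c).
p_D = p_D_given_A * p_A + (1 - p_Dc_given_Ac) * (1 - p_A)

# (b) Pr(D | B^c) = (Pr(D) - Pr(D|B)Pr(B)) / (1 - Pr(B)).
p_D_given_Bc = (p_D - p_D_given_B * p_B) / (1 - p_B)

# (c) Relative risks of disease for a positive versus a negative result.
rr_A = p_D_given_A / (1 - p_Dc_given_Ac)
rr_B = p_D_given_B / p_D_given_Bc

print(f"Pr(D) = {p_D:.4f}, Pr(D|B^c) = {p_D_given_Bc:.4f}")
print(f"RR(D|A) = {rr_A:.3f}, RR(D|B) = {rr_B:.3f}")
```

Working in floating point is enough here because every quantity is compared only to three or four significant digits.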

2.

(a) The sample space of Father's genotype is $\Omega_f = \{AB, BO, AA, AO\}$ and the sample space of Mother's genotype is $\Omega_m = \{AB, AA, BB\}$. Father's and Mother's genotypes are independent, hence

\[
\Pr(\text{Father} = AB, \text{Mother} = AB) = \Pr(\text{Father} = AB)\Pr(\text{Mother} = AB) = \frac{1}{4} \times \frac{1}{2} = \frac{1}{8}.
\]

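(b) Let $A_f$ = {Father provides an allele A}, $B_f$ = {Father provides an allele B}, $A_m$ = {Mother provides an allele A}, $B_m$ = {Mother provides an allele B}. Note that the two alleles provided by Father and Mother are independent, hence

\[
\Pr(\text{Betty} = AB) = \Pr(A_f B_m \cup B_f A_m) = \Pr(A_f B_m) + \Pr(B_f A_m) = \Pr(A_f)\Pr(B_m) + \Pr(B_f)\Pr(A_m),
\]

where

\[
\begin{aligned}
\Pr(A_f) &= \Pr(\text{Father} = AB) \times \frac{1}{2} + \Pr(\text{Father} = AA) + \Pr(\text{Father} = AO) \times \frac{1}{2}
= \frac{1}{4} \times \frac{1}{2} + \frac{1}{4} + \frac{1}{4} \times \frac{1}{2} = \frac{1}{2}, \\
\Pr(B_f) &= \Pr(\text{Father} = AB) \times \frac{1}{2} + \Pr(\text{Father} = BO) \times \frac{1}{2}
= \frac{1}{4} \times \frac{1}{2} + \frac{1}{4} \times \frac{1}{2} = \frac{1}{4}, \\
\Pr(A_m) &= \Pr(\text{Mother} = AB) \times \frac{1}{2} + \Pr(\text{Mother} = AA)
= \frac{1}{2} \times \frac{1}{2} + \frac{1}{4} = \frac{1}{2}, \\
\Pr(B_m) &= \Pr(\text{Mother} = AB) \times \frac{1}{2} + \Pr(\text{Mother} = BB)
= \frac{1}{2} \times \frac{1}{2} + \frac{1}{4} = \frac{1}{2}.
\end{aligned}
\]

So $\Pr(\text{Betty} = AB) = \frac{1}{2} \times \frac{1}{2} + \frac{1}{4} \times \frac{1}{2} = \frac{3}{8}$.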

(c) $\Pr(\text{Betty's phenotype is A}) = \Pr(\text{Betty} = AA) + \Pr(\text{Betty} = AO)$, where

\[
\begin{aligned}
&\Pr(\text{Betty} = AA) = \Pr(A_f A_m) = \Pr(A_f)\Pr(A_m) = \frac{1}{4}, \\
&O_f = \{\text{Father provides an allele O}\}, \quad \Pr(O_f) = \frac{1}{4}, \\
&O_m = \{\text{Mother provides an allele O}\}, \quad \Pr(O_m) = 0, \\
&\Pr(\text{Betty} = AO) = \Pr(A_f O_m \cup O_f A_m) = 0 + \Pr(O_f)\Pr(A_m) = \frac{1}{8},
\end{aligned}
\]

hence $\Pr(\text{Betty's phenotype is A}) = \frac{1}{4} + \frac{1}{8} = \frac{3}{8}$.

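(d) Recall $\Omega_f$ and $\Omega_m$, the sample spaces of Father's and Mother's genotypes. Once the parents' genotypes are given, Betty's and her brother's phenotypes are conditionally independent and identically distributed, so $\Pr(\text{Betty's phenotype is A} \mid F, M) = \Pr(\text{Brother's phenotype is A} \mid F, M)$. Without conditioning on the parents' genotypes, the two phenotypes are not independent. The conditional probability of the target event is, for instance,

\[
\begin{aligned}
&\Pr(\text{Both Betty and Brother have phenotype A} \mid F = AB, M = AB) \\
&\quad = \Pr(\text{Betty has phenotype A} \mid F = AB, M = AB) \times \Pr(\text{Brother has phenotype A} \mid F = AB, M = AB) \\
&\quad = \Pr(\text{Betty has phenotype A} \mid F = AB, M = AB)^2 = \left(\frac{1}{4}\right)^2 = \frac{1}{16}.
\end{aligned}
\]

By the total probability rule, writing "Both A" for the event that both Betty and her brother have phenotype A,

\[
\begin{aligned}
\Pr(\text{Both A})
&= \Pr(\text{Both A} \mid F = AB, M = AB)\Pr(F = AB, M = AB) && \to \left(\tfrac{1}{4}\right)^2 \times \tfrac{1}{4} \times \tfrac{1}{2} \\
&+ \Pr(\text{Both A} \mid F = AB, M = AA)\Pr(F = AB, M = AA) && \to \left(\tfrac{1}{2}\right)^2 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{Both A} \mid F = AB, M = BB)\Pr(F = AB, M = BB) && \to 0 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{Both A} \mid F = BO, M = AB)\Pr(F = BO, M = AB) && \to \left(\tfrac{1}{4}\right)^2 \times \tfrac{1}{4} \times \tfrac{1}{2} \\
&+ \Pr(\text{Both A} \mid F = BO, M = AA)\Pr(F = BO, M = AA) && \to \left(\tfrac{1}{2}\right)^2 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{Both A} \mid F = BO, M = BB)\Pr(F = BO, M = BB) && \to 0 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{Both A} \mid F = AA, M = AB)\Pr(F = AA, M = AB) && \to \left(\tfrac{1}{2}\right)^2 \times \tfrac{1}{4} \times \tfrac{1}{2} \\
&+ \Pr(\text{Both A} \mid F = AA, M = AA)\Pr(F = AA, M = AA) && \to 1^2 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{Both A} \mid F = AA, M = BB)\Pr(F = AA, M = BB) && \to 0 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{Both A} \mid F = AO, M = AB)\Pr(F = AO, M = AB) && \to \left(\tfrac{1}{2}\right)^2 \times \tfrac{1}{4} \times \tfrac{1}{2} \\
&+ \Pr(\text{Both A} \mid F = AO, M = AA)\Pr(F = AO, M = AA) && \to 1^2 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&+ \Pr(\text{Both A} \mid F = AO, M = BB)\Pr(F = AO, M = BB) && \to 0 \times \tfrac{1}{4} \times \tfrac{1}{4} \\
&= \frac{1}{128} + \frac{1}{64} + \frac{1}{128} + \frac{1}{64} + \frac{1}{32} + \frac{1}{16} + \frac{1}{32} + \frac{1}{16} \\
&= \frac{15}{64}.
\end{aligned}
\]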

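(e) Since Mother's genotype is never AO, $\Pr(\text{Mother's phenotype is A}) = \Pr(\text{Mother} = AA) = \frac{1}{4}$. A mother with genotype AA always provides allele A, so

\[
\begin{aligned}
&\Pr(\text{Mother's phenotype is A} \cap \text{Betty's phenotype is A}) \\
&\quad = \Pr(\text{Mother} = AA, \text{Betty} = AA) + \Pr(\text{Mother} = AA, \text{Betty} = AO) \\
&\quad = \Pr(\text{Mother} = AA)\Pr(A_f) + \Pr(\text{Mother} = AA)\Pr(O_f) \\
&\quad = \frac{1}{4} \times \left(\frac{1}{2} + \frac{1}{4}\right) = \frac{3}{16}.
\end{aligned}
\]

Hence

\[
\Pr(\text{Betty's phenotype is A} \mid \text{Mother's phenotype is A}) = \frac{3/16}{1/4} = \frac{3}{4}.
\]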

(f) From (e) and (c),

\[
\Pr(\text{Mother's phenotype is A} \mid \text{Betty's phenotype is A}) = \frac{3/16}{3/8} = \frac{1}{2}.
\]
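The results of parts (b) to (f) can be cross-checked by brute-force enumeration over the parental genotypes. The sketch below assumes the genotype probabilities used above (Father uniform on $\Omega_f$; Mother AB with probability 1/2, AA and BB with probability 1/4 each) and the usual ABO rule that A and B are codominant while O is recessive. The helper names `phenotype`, `child_pheno_dist` and `p_child_A` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Genotype probabilities as used in the solution above.
father = {"AB": Fraction(1, 4), "BO": Fraction(1, 4),
          "AA": Fraction(1, 4), "AO": Fraction(1, 4)}
mother = {"AB": Fraction(1, 2), "AA": Fraction(1, 4), "BB": Fraction(1, 4)}


def phenotype(a1, a2):
    """ABO phenotype of two alleles: A and B codominant, O recessive."""
    expressed = {a1, a2} - {"O"}
    return "".join(sorted(expressed)) if expressed else "O"


def child_pheno_dist(f, m):
    """Phenotype distribution of one child given parental genotypes f and m."""
    dist = {}
    # Each parent passes on either of its two alleles with probability 1/2.
    for a, b in product(f, m):
        ph = phenotype(a, b)
        dist[ph] = dist.get(ph, Fraction(0)) + Fraction(1, 4)
    return dist


def p_child_A(f, m):
    """Pr(child has phenotype A | Father = f, Mother = m)."""
    return child_pheno_dist(f, m).get("A", Fraction(0))


# All (father, mother) genotype pairs with their joint probability.
pairs = [(f, m, pf * pm) for f, pf in father.items() for m, pm in mother.items()]

# (b) Genotype AB is the only genotype with phenotype AB.
p_betty_AB = sum(w * child_pheno_dist(f, m).get("AB", Fraction(0)) for f, m, w in pairs)
# (c) Pr(Betty has phenotype A).
p_betty_A = sum(w * p_child_A(f, m) for f, m, w in pairs)
# (d) Siblings are conditionally independent given the parents' genotypes.
p_both_A = sum(w * p_child_A(f, m) ** 2 for f, m, w in pairs)
# (e), (f) Mother has phenotype A only when her genotype is AA.
p_mother_A = mother["AA"]
p_joint = sum(pf * mother["AA"] * p_child_A(f, "AA") for f, pf in father.items())
p_betty_given_mother = p_joint / p_mother_A
p_mother_given_betty = p_joint / p_betty_A

print(p_betty_AB, p_betty_A, p_both_A, p_betty_given_mother, p_mother_given_betty)
```

Exact `Fraction` arithmetic is used so that the output can be compared directly with the fractions in the written solution.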

3. (a) The Probability Mass Function (PMF) of X can be represented as

\[
\begin{array}{c|cccc}
x & 1 & 2 & 3 & 5 \\ \hline
\Pr(X = x) & 0.3 & 0.3 & 0.2 & 0.2
\end{array}
\]

(b) $\Pr(2 < X \le 3) = \Pr(X \le 3) - \Pr(X \le 2) = F(3) - F(2) = 0.8 - 0.6 = 0.2$

$\Pr(X = 3) = \Pr(X \le 3) - \Pr(X < 3) = F(3) - F(3^-) = 0.8 - 0.6 = 0.2$

(c) $\Pr(X = 4) = \Pr(X \le 4) - \Pr(X < 4) = F(4) - F(4^-) = 0.8 - 0.8 = 0$

$\Pr(3 \le X \le 5) = \Pr(X \le 5) - \Pr(X < 3) = F(5) - F(3^-) = 1 - 0.6 = 0.4$

(d) $\Pr(X = 6) = 0$

$\Pr(3 < X \le 6) = \Pr(X \le 6) - \Pr(X \le 3) = F(6) - F(3) = 1 - 0.8 = 0.2$

(e) $\mu = E(X) = \sum_{i=1}^{n} x_i \Pr(X = x_i) = 1 \times 0.3 + 2 \times 0.3 + 3 \times 0.2 + 5 \times 0.2 = 2.5$

$E(X^2) = \sum_{i=1}^{n} x_i^2 \Pr(X = x_i) = 1 \times 0.3 + 4 \times 0.3 + 9 \times 0.2 + 25 \times 0.2 = 8.3$,

$\sigma^2 = Var(X) = E(X^2) - (E(X))^2 = 8.3 - 2.5^2 = 2.05$

(or, equivalently,

\[
\begin{aligned}
\sigma^2 = Var(X) = E(X - EX)^2 &= \sum_x (x - \mu)^2 \Pr(X = x) \\
&= (1 - 2.5)^2 \times 0.3 + (2 - 2.5)^2 \times 0.3 + (3 - 2.5)^2 \times 0.2 + (5 - 2.5)^2 \times 0.2 \\
&= 2.05 \text{ )}
\end{aligned}
\]

Also,

$E((X + 2)^2) = Var(X + 2) + [E(X + 2)]^2 = Var(X) + (E(X) + 2)^2 = 2.05 + 4.5^2 = 22.3$

(f) $\sigma = \sqrt{2.05} \approx 1.432$, hence

\[
\begin{aligned}
\Pr(\mu - \sigma < X \le \mu + \sigma) &= \Pr(1.068 < X \le 3.932) \\
&= F(3.932) - F(1.068) \\
&= 0.8 - 0.3 = 0.5
\end{aligned}
\]

(g) $Y = 3X - 2$ is a linear transformation. Hence,

$E(Y) = E(3X - 2) = 3E(X) - 2 = 3 \times 2.5 - 2 = 5.5$

$Var(Y) = Var(3X - 2) = 3^2 Var(X) = 9 \times 2.05 = 18.45$
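The moment and CDF calculations in parts (b) to (g) are easy to reproduce from the PMF table. The following is a minimal sketch; the helper `cdf` and the variable names are illustrative assumptions, not part of the original solution.

```python
import math

# PMF of X from part (a).
pmf = {1: 0.3, 2: 0.3, 3: 0.2, 5: 0.2}


def cdf(t):
    """F(t) = Pr(X <= t) for the discrete PMF above."""
    return sum(p for x, p in pmf.items() if x <= t)


# (b) and (d): interval probabilities as differences of the CDF.
p_2_to_3 = cdf(3) - cdf(2)   # Pr(2 < X <= 3)
p_3_to_6 = cdf(6) - cdf(3)   # Pr(3 < X <= 6)

# (e) Mean, second moment, variance, and E((X + 2)^2) computed directly.
mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x ** 2 * p for x, p in pmf.items())
var = second_moment - mean ** 2
e_shift_sq = sum((x + 2) ** 2 * p for x, p in pmf.items())

# (f) Probability of falling within one standard deviation of the mean.
sd = math.sqrt(var)
p_within = cdf(mean + sd) - cdf(mean - sd)

# (g) Linear transformation Y = 3X - 2.
mean_y = 3 * mean - 2
var_y = 3 ** 2 * var

print(mean, var, round(sd, 3), p_within, mean_y, var_y, e_shift_sq)
```

Here $E((X+2)^2)$ is computed directly from the PMF, which gives an independent check of the variance-plus-squared-mean identity used in part (e).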

(h) The transformation $Z = X^2$ is not linear, so the linearity property of expectation cannot be applied directly. We therefore derive the PMF of Z first:

\[
\begin{array}{c|cccc}
z & 1 & 4 & 9 & 25 \\ \hline
\Pr(Z = z) & 0.3 & 0.3 & 0.2 & 0.2
\end{array}
\]

The CDF then follows from the PMF:

\[
F(z) =
\begin{cases}
0 & : z < 1 \\
0.3 & : 1 \le z < 4 \\
0.6 & : 4 \le z < 9 \\
0.8 & : 9 \le z < 25 \\
1 & : z \ge 25
\end{cases}
\]

4.

(a) X ∼ Binomial(135, 0.02), n = 135, p = 0.02


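(b)

\[
\begin{aligned}
\Pr(X \le 3) &= \Pr(X = 0) + \Pr(X = 1) + \Pr(X = 2) + \Pr(X = 3) \\
&= 0.98^{135} + \binom{135}{1} \times 0.02 \times 0.98^{134} + \binom{135}{2} \times 0.02^2 \times 0.98^{133} + \binom{135}{3} \times 0.02^3 \times 0.98^{132} \\
&\approx 0.7148
\end{aligned}
\]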

(c) See Figure 1.

(d) $E(X) = np = 2.7$, $Var(X) = np(1 - p) = 2.646$. The density of X is shown in Figure 2. Its long right tail indicates a right-skewed distribution.

Figure 1: CDF

Figure 2: Density of X
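(e) Since $n = 135 > 20$, $p = 0.02 < 0.1$ and $np = 2.7 < 5$, X can be well approximated by $Y \sim \mathrm{Poisson}(\mu = 2.7)$.

(f)

\[
\begin{aligned}
\Pr(X \le 3) \approx \Pr(Y \le 3) &= \Pr(Y = 0) + \Pr(Y = 1) + \Pr(Y = 2) + \Pr(Y = 3) \\
&= e^{-\mu}\left(1 + \mu + \frac{\mu^2}{2!} + \frac{\mu^3}{3!}\right) \\
&\approx 0.7141
\end{aligned}
\]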
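To see how close the Poisson approximation in (e) and (f) is to the exact binomial value in (b), both sums can be evaluated directly. This is a minimal sketch using only the standard library (`math.comb` needs Python 3.8 or later); the helper names `binom_pmf` and `poisson_pmf` are illustrative.

```python
import math

n, p = 135, 0.02
mu = n * p  # Poisson mean matched to the binomial mean


def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """Pr(Y = k) for Y ~ Poisson(mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


exact = sum(binom_pmf(k, n, p) for k in range(4))    # Pr(X <= 3)
approx = sum(poisson_pmf(k, mu) for k in range(4))   # Pr(Y <= 3)

print(f"binomial: {exact:.4f}  poisson: {approx:.4f}  diff: {exact - approx:.4f}")
```

The two values differ by less than 0.001, which is consistent with the rule of thumb quoted in (e).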

(g) Because it is a random sample, we can estimate the proportion from this sample as $\hat{p} = 5/10 = 0.5$. This estimate is far from the assumed $p = 0.02$. Hence we may draw one of the following conclusions: 1) by chance, this sample happens to contain a large proportion of students living on Lamma Island; 2) the students in this class were not randomly selected from the whole of Hong Kong (for instance, the school might be near Lamma Island); or 3) the assumed true proportion is wrong.

5.

(a) By the problem setting, $X \sim \mathrm{Bin}(2, p)$, $Y \sim \mathrm{Bin}(4, p)$ and X is independent of Y. Since $\Pr(X \ge 1) = \frac{5}{9}$ implies $\Pr(X = 0) = (1 - p)^2 = 1 - \frac{5}{9} = \frac{4}{9}$, we have $p = \frac{1}{3}$ and

\[
\Pr(Y \ge 2) = 1 - \Pr(Y = 1) - \Pr(Y = 0) = 1 - 4p(1 - p)^3 - (1 - p)^4 = 33/81 \approx 0.4074
\]
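(b)

\[
\begin{aligned}
\Pr(X = Y) &= \sum_{i=0}^{2} \Pr(Y = X \mid X = i)\Pr(X = i) \\
&= \sum_{i=0}^{2} \Pr(Y = i)\Pr(X = i) \quad \text{(by independence)} \\
&= (1 - p)^2 \times (1 - p)^4 + 2p(1 - p) \times 4p(1 - p)^3 + p^2 \times 6p^2(1 - p)^2 \\
&= 8/27 \approx 0.2963
\end{aligned}
\]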

(c)

\[
\begin{aligned}
\Pr(X > Y) &= \sum_{i=0}^{2} \Pr(Y < X \mid X = i)\Pr(X = i) \\
&= \sum_{i=0}^{2} \Pr(Y < i)\Pr(X = i) \\
&= \Pr(Y = 0)\Pr(X = 1) + \left[\Pr(Y = 0) + \Pr(Y = 1)\right]\Pr(X = 2) \\
&= (1 - p)^4 \times 2p(1 - p) + \left[(1 - p)^4 + 4p(1 - p)^3\right] p^2 \\
&= 112/729 \approx 0.1536
\end{aligned}
\]

(d) The mode is the point that maximizes the PMF. Thus the modes of X are 0 and 1 (each with probability 4/9) and the mode of Y is 1. Both X and Y have $p = 1/3 < 1/2$, hence both are right-skewed.

6.

(a) By the problem setting, we obtain $X \sim \mathrm{Bin}(2, 1/2)$ and $Y \sim \mathrm{Bin}(3, 1/2)$, and X is independent of Y.
(b) By the linearity of expectation, $E(Z) = E(X + Y) = E(X) + E(Y) = 2 \times 1/2 + 3 \times 1/2 = 5/2$. Also, by independence, $Var(Z) = Var(X + Y) = Var(X) + Var(Y) = 2 \times (1/2)^2 + 3 \times (1/2)^2 = 5/4$.
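(c) To obtain the PMF of Z, we condition on X. For $k = 0, 1, \ldots, 5$,

\[
\begin{aligned}
\Pr(Z = k) &= \sum_{j=0}^{2} \Pr(Z = k \mid X = j)\Pr(X = j) \\
&= \sum_{j=0}^{2} \Pr(X + Y = k \mid X = j)\Pr(X = j) \\
&= \sum_{j=0}^{2} \Pr(Y = k - j)\Pr(X = j)\,\mathbf{1}(0 \le k - j \le 3) \\
&= \sum_{j=0}^{2} \binom{3}{k-j} \left(\frac{1}{2}\right)^{k-j} \left(\frac{1}{2}\right)^{3-k+j} \binom{2}{j} \left(\frac{1}{2}\right)^{j} \left(\frac{1}{2}\right)^{2-j} \mathbf{1}(0 \le k - j \le 3) \\
&= \frac{1}{2^5} \sum_{j=0}^{2} \binom{3}{k-j} \binom{2}{j} \mathbf{1}(0 \le k - j \le 3) \\
&= \binom{5}{k} \frac{1}{2^5},
\end{aligned}
\]

where the last step uses Vandermonde's identity and $\mathbf{1}(x)$ is the indicator function: $\mathbf{1}(x) = 1$ if x is true and $\mathbf{1}(x) = 0$ otherwise.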


(d) From (c), we know the PMF of Z is $\Pr(Z = k) = \binom{5}{k} \frac{1}{2^5}$, which is the same as that of $\mathrm{Bin}(5, 1/2)$. Since the PMF characterizes the distribution, Z follows a Binomial distribution with $n = 5$ and $p = 1/2$.
Conclusion: For any two independent Binomial random variables $X \sim \mathrm{Bin}(n_1, p)$ and $Y \sim \mathrm{Bin}(n_2, p)$ with the same p, $Z = X + Y$ follows $\mathrm{Bin}(n_1 + n_2, p)$.
(e) For $W = X - Y$, we can use the same technique. Given $X = j$, the event $X - Y = k$ is the event $Y = j - k$, so for $k = -3, -2, \ldots, 2$,

\[
\begin{aligned}
\Pr(W = k) &= \sum_{j=0}^{2} \Pr(W = k \mid X = j)\Pr(X = j) \\
&= \sum_{j=0}^{2} \Pr(X - Y = k \mid X = j)\Pr(X = j) \\
&= \sum_{j=0}^{2} \Pr(Y = j - k)\Pr(X = j)\,\mathbf{1}(0 \le j - k \le 3) \\
&= \frac{1}{2^5} \sum_{j=0}^{2} \binom{3}{j-k} \binom{2}{j} \mathbf{1}(0 \le j - k \le 3)
\end{aligned}
\]

This is not one of the standard named distributions. However, we can still easily derive the expectation and variance: $E(W) = E(X - Y) = E(X) - E(Y) = 1 - 3/2 = -1/2$, and by independence $Var(W) = Var(X - Y) = Var(X) + Var(Y) = 2/4 + 3/4 = 5/4$.
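The PMFs of $Z = X + Y$ and $W = X - Y$ can be tabulated by enumerating all $(X, Y)$ pairs, which gives an independent check of parts (b) to (e). The sketch below assumes $X \sim \mathrm{Bin}(2, 1/2)$ and $Y \sim \mathrm{Bin}(3, 1/2)$ as in the expectation and variance calculations; the dictionary names are illustrative.

```python
from fractions import Fraction
from math import comb

half = Fraction(1, 2)
px = {j: comb(2, j) * half ** 2 for j in range(3)}   # X ~ Bin(2, 1/2)
py = {i: comb(3, i) * half ** 3 for i in range(4)}   # Y ~ Bin(3, 1/2)

pz = {}  # PMF of Z = X + Y
pw = {}  # PMF of W = X - Y
for j, pj in px.items():
    for i, pi in py.items():
        # Independence: Pr(X = j, Y = i) = Pr(X = j) Pr(Y = i).
        pz[j + i] = pz.get(j + i, Fraction(0)) + pj * pi
        pw[j - i] = pw.get(j - i, Fraction(0)) + pj * pi

mean_w = sum(k * q for k, q in pw.items())
var_w = sum(k ** 2 * q for k, q in pw.items()) - mean_w ** 2

for k in sorted(pw):
    print(f"Pr(W = {k:2d}) = {pw[k]}")
print("E(W) =", mean_w, " Var(W) =", var_w)
```

The table for Z reproduces $\binom{5}{k}/2^5$. As a side observation that holds only because $p = 1/2$ here, $3 - Y$ is again $\mathrm{Bin}(3, 1/2)$, so the table for W coincides with that of a $\mathrm{Bin}(5, 1/2)$ variable shifted down by 3.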

7.

(a) $X \sim \mathrm{Poisson}(\mu)$, $\Pr(X = k) = \frac{\mu^k e^{-\mu}}{k!}$.

\[
\Pr(X = 4) = \Pr(X = 5) \implies \frac{\mu^4}{4!} = \frac{\mu^5}{5!} \implies \mu = 5
\]

Hence $EX = Var(X) = \mu = 5$.
(b) $\Pr(X = k) = \frac{5^k e^{-5}}{k!}$. Recall that the Poisson distribution is right-skewed. Since $\Pr(X = 4) = \Pr(X = 5) = 0.1755$, $\Pr(X = 3) = e^{-5} \times \frac{5^3}{3!} = 0.140 < 0.1755$ and $\Pr(X = 6) = e^{-5} \times \frac{5^6}{6!} = 0.1462 < 0.1755$, we conclude that 4 or 5 cracks in 15km have the largest probability, 0.1755.
(c) Since the expected number of cracks in 15km is $\mu = 5$, the expected number of cracks in 5km is 5/3. Let Y be the number of cracks in 5km, then $Y \sim \mathrm{Poisson}(5/3)$. $\Pr(\text{at least 1 crack in 5km}) = \Pr(Y \ge 1) = 1 - \Pr(Y = 0) = 1 - e^{-5/3} = 0.8111$.
(d) $\Pr(Y = k) = \frac{e^{-5/3} (5/3)^k}{k!}$, $k = 0, 1, 2, \cdots$. Hence $\Pr(Y = 0) = e^{-5/3} = 0.1889$, $\Pr(Y \le 1) = 0.5037$, $\Pr(Y \le 2) = 0.766$, $\Pr(Y \le 3) = 0.9117$, $\Pr(Y \le 4) = 0.9725$.

(e) Suppose n packages should be prepared to ensure $\Pr(n \ge Y) \ge 95\%$. From (d) we have $\Pr(Y \le 3) = 0.9117 < 95\%$ and $\Pr(Y \le 4) = 0.9725 > 95\%$. Hence n should be 4.
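The Poisson values quoted in parts (b) to (e) can be reproduced with a few lines of code. This is a minimal sketch using only the standard library; the helper names `poisson_pmf` and `poisson_cdf` are illustrative, and the search for n simply scans upward until the 95% requirement is met.

```python
import math


def poisson_pmf(k, mu):
    """Pr(Y = k) for Y ~ Poisson(mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def poisson_cdf(k, mu):
    """Pr(Y <= k) for Y ~ Poisson(mu)."""
    return sum(poisson_pmf(i, mu) for i in range(k + 1))


mu_15km = 5.0            # from Pr(X = 4) = Pr(X = 5)
mu_5km = mu_15km / 3     # rate scales with road length

# (b) The PMF peaks at k = 4 and k = 5.
peak = poisson_pmf(4, mu_15km)

# (c) At least one crack in 5km.
p_at_least_one = 1 - poisson_pmf(0, mu_5km)

# (e) Smallest n with Pr(Y <= n) >= 95%.
n = 0
while poisson_cdf(n, mu_5km) < 0.95:
    n += 1

print(f"peak = {peak:.4f}, Pr(Y >= 1) = {p_at_least_one:.4f}, n = {n}")
for k in range(5):
    print(f"Pr(Y <= {k}) = {poisson_cdf(k, mu_5km):.4f}")
```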
