621labex2 2018 Answers
621
First Term, 2018-2019
Laboratory Exercise 2
Probability Concepts and Binomial Distribution
Answer Key
1. The table below, from a US National Medical Expenditure Survey (NMES), shows the
joint distribution of smoking and chronic obstructive pulmonary disease (COPD) status.
                    Smoking
COPD      Never    Former   Current     Total
No        5,030     3,282     2,876    11,188
Yes          20        98        65       183
Total     5,050     3,380     2,941    11,371
a) If a person is chosen at random from the 11,371 individuals in the NMES sample,
what is the probability that he or she will be:
i) a current smoker
b) Calculate the probability that a COPD patient is a current or former smoker using
Bayes theorem.
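\[
\Pr(\text{Smoke} \mid \text{COPD})
= \frac{\dfrac{98+65}{3380+2941} \times \dfrac{3380+2941}{11{,}371}}
       {\dfrac{98+65}{3380+2941} \times \dfrac{3380+2941}{11{,}371} + \dfrac{20}{5050} \times \dfrac{5050}{11{,}371}}
= \frac{163}{183} = 0.89
\]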
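The Bayes calculation in 1.b) can be checked numerically. The sketch below uses only the counts from the NMES table; the variable names are illustrative and are not part of the original exercise.

```python
# Counts taken from the NMES table in question 1.
total = 11_371
ever_total = 3_380 + 2_941   # former + current smokers
never_total = 5_050          # never smokers
copd_ever = 98 + 65          # COPD cases among former + current smokers
copd_never = 20              # COPD cases among never smokers

# Prior probabilities of smoking status.
p_ever = ever_total / total
p_never = never_total / total

# Conditional probabilities of COPD given smoking status.
p_copd_given_ever = copd_ever / ever_total
p_copd_given_never = copd_never / never_total

# Bayes theorem: Pr(Smoke | COPD).
numerator = p_copd_given_ever * p_ever
denominator = numerator + p_copd_given_never * p_never
p_ever_given_copd = numerator / denominator

print(round(p_ever_given_copd, 2))  # 0.89
```

The result equals 163/183, the share of the 183 COPD cases who are current or former smokers, which is the same number read directly from the "Yes" row of the table.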
d) Estimate the three conditional probabilities of having COPD given smoking status is
never, former, and current. Make a table of these probabilities that allows the viewer
to observe the association between smoking and the risk of COPD.
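As a rough sketch of how the table for 1.d) could be produced, the snippet below divides the COPD counts by the column totals from the NMES table. The dictionary names are illustrative assumptions, not part of the original answer key.

```python
# COPD cases and column totals by smoking status, from the NMES table.
copd = {"Never": 20, "Former": 98, "Current": 65}
totals = {"Never": 5_050, "Former": 3_380, "Current": 2_941}

# Pr(COPD | smoking status) = COPD cases in the group / group total.
risk = {status: copd[status] / totals[status] for status in copd}

print("Smoking   Pr(COPD | smoking)")
for status, p in risk.items():
    print(f"{status:<9} {p:.4f}")
```

Under these counts the estimated risk is lowest for never smokers and highest for former smokers, with current smokers in between.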
e) Hypothesize about the biological and behavioral processes that might have given rise
to the data summarized in step 1.d).
We observe that the probability of COPD is higher in former smokers than in current
smokers. Possible reasons include: former smokers may be older; former smokers may
have quit smoking because of illness; and the data are cross-sectional, so they
provide no information on changes over time.
2. The following 3x2 tables derived from the Nepal mortality data show mortality at 16 months
of follow-up for different ages of girls in both treatment groups.
-> sex = Female trt= Placebo
Age of | Vital status
child | Alive Dead | Total
-----------+----------------------+----------
< 1 | 1219 69 | 1288
| 94.64 5.36 | 100.00
-----------+----------------------+----------
1-2 | 2615 72 | 2687
| 97.32 2.68 | 100.00
-----------+----------------------+----------
3-4 | 2542 25 | 2567
| 99.03 0.97 | 100.00
-----------+----------------------+----------
Total | 6376 166 | 6542
| 97.46 2.54 | 100.00
3. Suppose 3 girls (call them J, R and Y) were randomly selected from the Vitamin A-treated
group. Define the random variable of interest, X, as the number out of the 3 who die during
follow-up.
a) What are the possible outcomes (values) that may be observed for this random
variable?
b) What is the probability that only the first girl (J) would die during follow-up?
c) What is the probability that only one (exactly one) girl would die during follow-up?
P(only one girl dies) = P(only the 1st dies) + P(only the 2nd dies) + P(only the 3rd dies)
= 0.017 + 0.017 + 0.017 ≈ 0.052 (each term is 0.0174 before rounding)
Note: This is the same as $P(X=1) = \binom{n}{x} p^x q^{n-x} = \binom{3}{1}(0.018)^1(0.982)^{3-1} = 0.052$
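The sum over the "only one girl dies" outcomes can be checked by listing all eight possible outcomes for J, R and Y and adding up their probabilities by number of deaths. This is a minimal sketch that assumes independence and p = 0.018 as in the answer key; the names `p`, `q` and `prob_by_deaths` are illustrative.

```python
from itertools import product

p = 0.018   # probability that a girl dies during follow-up (from the answer key)
q = 1 - p   # probability that a girl survives

# Total probability for each possible number of deaths among the 3 girls.
prob_by_deaths = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0}

# Each outcome is a triple such as ('D', 'S', 'S') for (J, R, Y).
for outcome in product("DS", repeat=3):
    prob = 1.0
    for status in outcome:
        prob *= p if status == "D" else q   # independence: multiply the terms
    prob_by_deaths[outcome.count("D")] += prob

print(round(prob_by_deaths[1], 3))  # 0.052
```

Three of the eight outcomes contain exactly one death, and each has probability p × q², which is where the factor of 3 in the binomial formula comes from.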
d) What is the probability that only the first two girls (J and R) would die during follow-up?
P(only the first 2 girls die) = P(the first 2 girls die and the third girl survives)
$= P(\text{girl dies})^2 \times P(\text{girl survives})^1$
$= P(D\,D\,S)$
$= p^x q^{n-x} = (0.018)^2 (0.982)^1 = 0.0003$
e) What is the probability that exactly two girls would die during follow-up?
P(only 2 girls die) = P(only 1st and 2nd die) + P(only 1st and 3rd die) + P(only 2nd and 3rd die)
= 0.0003 + 0.0003 + 0.0003 = 0.0009
$= \binom{3}{2}(0.018)^2(0.982)^1 \approx 0.0009$ (0.00095 before rounding the individual terms)
f) Describe the probability distribution for the number of deaths of Vitamin A-treated
girls during 16 months of follow-up by filling in the table below:
$P(X=0) = \binom{n}{x} p^x q^{n-x} = \binom{3}{0}(0.018)^0(0.982)^3 = 0.947$
$P(X=1) = \binom{n}{x} p^x q^{n-x} = \binom{3}{1}(0.018)^1(0.982)^2 = 0.052$
$P(X=2) = \binom{n}{x} p^x q^{n-x} = \binom{3}{2}(0.018)^2(0.982)^1 = 0.0009$
$P(X=3) = \binom{n}{x} p^x q^{n-x} = \binom{3}{3}(0.018)^3(0.982)^0 \approx 0$
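The four probabilities in 3.f) follow from the binomial formula with n = 3 and p = 0.018. The sketch below recomputes them; the helper `binomial_pmf` is a hypothetical name introduced for illustration, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x events in n independent trials with event probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 3, 0.018   # values used in the answer key

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
for x, prob in enumerate(pmf):
    print(f"P(X = {x}) = {prob:.6f}")
```

The four values sum to 1, which is a quick check that the table covers every possible outcome.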
Assumptions:
1. Assume that the n girls are independent individuals; the outcome for one
girl does not influence the outcome for any other girl.
2. Assume that the probability of death = p = 0.018.
3. Assume that the probability of death is the same for all 3 girls.