Textbook Practice Problems 1

The document discusses a study on the effects of inoculating soybean plants with nitrogen-fixing bacteria. It provides the pod weight data for 8 inoculated plants and 8 uninoculated plants. Descriptive statistics are computed, showing that the inoculated plants had higher average and median pod weights than the uninoculated plants. Graphical methods are also used to compare the two groups, indicating that inoculated plants tended to have higher pod weights.

Uploaded by

this hihi

Unit 2:

Lesson 2:

A man runs 1 mile approximately once per weekend. He records his time
over an 18-week period. The individual times and summary statistics are
given in Table 2.14.

2.9 What is the standard deviation of the 1-mile running time over 18 weeks?

Solution:
d1 <- c(12.80,11.57,12.20,11.73,12.25,12.67,12.18,11.92,11.53,11.67,12.47,11.80,
12.30,12.33,12.08,12.55,11.72,11.83)

sd(d1) = 0.3874181
_________________________________________________________________

Suppose we construct a new variable called time_100 = 100 × time (e.g.,
for week 1, time_100 = 1280).
2.10 What is the mean and standard deviation of time_100?

Solution:
d1 <- c(12.80,11.57,12.20,11.73,12.25,12.67,12.18,11.92,11.53,11.67,12.47,11.80,
12.30,12.33,12.08,12.55,11.72,11.83)
d2 <- 100*(d1)
d2
mean(d2) = 1208.889
sd(d2) = 38.74181
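
The result in 2.10 reflects a general property: multiplying every observation by a constant c multiplies the mean by c and the standard deviation by |c|. A minimal base-R check on the same running times is sketched below; the names `times` and `scale_factor` are illustrative and do not come from the original solution.

```r
# Running times from Table 2.14, as entered in the solution above
times <- c(12.80, 11.57, 12.20, 11.73, 12.25, 12.67, 12.18, 11.92, 11.53,
           11.67, 12.47, 11.80, 12.30, 12.33, 12.08, 12.55, 11.72, 11.83)
scale_factor <- 100  # illustrative name for the constant c

# Rescaling multiplies both the mean and the SD by the same constant
mean_ok <- isTRUE(all.equal(mean(scale_factor * times), scale_factor * mean(times)))
sd_ok   <- isTRUE(all.equal(sd(scale_factor * times), scale_factor * sd(times)))
c(mean_ok, sd_ok)
```

So time_100 needs no separate calculation once the mean and standard deviation of the original times are known.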
2.11 Construct a stem and leaf plot of time_100 using the first 3 most
significant digits for the stem and the least significant digit for the leaf.
So, for week 1, time_100 = 1280 which has a stem = 128 and a leaf = 0.

Solution:
stem (d2, scale=3)
115 | 37
116 | 7
117 | 23
118 | 03
119 | 2
120 | 8
121 | 8
122 | 05
123 | 03
124 | 7
125 | 5
126 | 7
127 |
128 | 0

Hypertension:
In an experiment that examined the effect of body position on blood
pressure [8], 32 participants had their blood pressures measured while
lying down with their arms at their sides and again standing with their
arms supported at heart level. The data are given in Table 2.16.
#SBP taken in recumbent position
rec_sbp <- c(99,126,108,122,104,108, 116,106,118,92,110,138,120,142,118,134,
118,126,108,136,110,120,108,132,102,118,116,118,110,122,106,146)

#DBP taken in recumbent position

rec_dbp <- c(71,74,72,68,64,60,70,74,82,58,78,80,70,88,58,
76,72,78,78,86,78,74,74,92,68,70,76,80,74,72,62,90)

#SBP taken in standing position

st_sbp <- c(105,124,102,114,96,96,106,106,120,88,102,124,118,136,92,126,108,
114,94,144,100,106,94,128,96,102,88,100,96,118,94,138)

#DBP taken in standing position

st_dbp <- c(79,76,68,72,62,56,70,76,90,60,80,75,84,90,58,68,68,76,70,88,64,
70,74,88,64,68,60,84,70,78,56,94)

#difference in sys BP (rec - Standing)

diff_sbp <- rec_sbp - st_sbp

#difference in dias BP (rec - Standing)

diff_dbp <- rec_dbp - st_dbp

diff_sbp
[1] -6 2 6 8 8 12 10 0 -2 4 8 14 2 6 26 8 10 12 14 -8 10 14 14 4 6 16 28
[28] 18 14 4 12 8

diff_dbp
[1] -8 -2 4 -4 2 4 0 -2 -8 -2 -2 5 -14 -2 0 8 4 2 8 -2
[21] 14 4 0 4 4 2 16 -4 4 -6 6 -4

2.20 Construct stem-and-leaf and box plots for the difference scores for
each type of blood pressure.

> stem(diff_sbp)

  -0 | 86
  -0 | 2
   0 | 022444
   0 | 66688888
   1 | 00022244444
   1 | 68
   2 |
   2 | 68

> stem(diff_dbp)

  -1 | 4
  -0 | 886
  -0 | 444222222
   0 | 0002224444444
   0 | 5688
   1 | 4
   1 | 6

> boxplot(diff_sbp)

> boxplot(diff_dbp)

2.21 Based on your answers to Problems 2.19 and 2.20, comment on the
effect of body position on the levels of systolic and diastolic blood
pressure.

Systolic blood pressure clearly seems to be higher in the recumbent position than in
the standing position. Diastolic blood pressure appears to be comparable in the two
positions. The distributions are each reasonably symmetric.
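
A short numeric check can back up this reading of the plots. The sketch below re-enters the difference scores printed above and summarizes them with base R; the variable names `mean_sbp`, `mean_dbp`, and `prop_sbp_lower_standing` are illustrative.

```r
# Difference scores (recumbent - standing), copied from the output above
diff_sbp <- c(-6, 2, 6, 8, 8, 12, 10, 0, -2, 4, 8, 14, 2, 6, 26, 8, 10, 12, 14,
              -8, 10, 14, 14, 4, 6, 16, 28, 18, 14, 4, 12, 8)
diff_dbp <- c(-8, -2, 4, -4, 2, 4, 0, -2, -8, -2, -2, 5, -14, -2, 0, 8, 4, 2, 8,
              -2, 14, 4, 0, 4, 4, 2, 16, -4, 4, -6, 6, -4)

mean_sbp <- mean(diff_sbp)                      # average systolic difference
mean_dbp <- mean(diff_dbp)                      # average diastolic difference
prop_sbp_lower_standing <- mean(diff_sbp > 0)   # share with higher recumbent SBP
c(mean_sbp, mean_dbp, prop_sbp_lower_standing)
```

The mean systolic difference is close to 9 mm Hg while the mean diastolic difference is close to 1 mm Hg, which is consistent with the comment above.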

2.22 Orthostatic hypertension is sometimes defined based on an unusual

change in blood pressure after changing position. Suppose we define a
normal range for change in systolic blood pressure (SBP) based on change
in SBP from the recumbent to the standing position in Table 2.16 that is
between the upper and lower decile. What should the normal range be?

#The normal range (10th to 90th percentile) of diff_sbp:

quantile (diff_sbp, probs= c (0.1,0.9), na.rm= TRUE, type=2)

10% 90%
0 16

#The normal range (10th to 90th percentile) of diff_dbp:

quantile (diff_dbp, probs= c (0.1,0.9), na.rm= TRUE, type=2)

10% 90%
-6 8

Lesson 3:

#preparing data

#1. Set working directory:

setwd("~/Desktop/3rd Semester/Methods I/Datasets")

#2. Load the data set:

lead.df <- read.table(file="LEAD.DAT.txt", header=TRUE, sep=",") # header defaults to FALSE

#3. View the file:

View(lead.df)
dim(lead.df) #Num of rows, Num of columns
names(lead.df) #columns names
head(lead.df)
str(lead.df) #structure of the variables

#4. Create a list of the variables that you want to change to factor:
fac_var <- c ("area", "sex", "iq_type", "lead_grp","Group", "fst2yrs",
"pica", "colic", "irrit", "convul")

lead.df[,fac_var] <- lapply(lead.df[,fac_var],factor)

str(lead.df)

2.31 Compare the exposed and control groups regarding age and gender,
using appropriate numeric and graphic descriptive measures.

#2.31:
# We want numerical (e.g., mean, median, five-number summary, sd)
# and graphical (e.g., box plot, histogram, density plot) summaries:
# 1* comparing the ages of the individuals in the exposed and control groups
# 2* comparing the distribution of sex in the exposed and control groups

#1. comparison of ages by groups:

names(lead.df) #"ageyrs" "Group"

require(mosaic)
#Num:
favstats(~ageyrs|Group,data=lead.df)
#Graph:
bwplot(Group~ageyrs,data=lead.df)

#2. comparison of sex by groups:

#Num:
xtabs(~Group+sex,data=lead.df)
install.packages("tigerstats")
require("tigerstats")
rowPerc(xtabs(~Group+sex,data=lead.df))

#Graph:
bargraph(~sex,groups=Group,data=lead.df) #for count
bargraph(~sex,groups=Group,data=lead.df,type="percent") #for percentage

2.32 Compare the exposed and control groups regarding verbal and
performance IQ, using appropriate numeric and graphic descriptive
measures.

# We want numerical (e.g., mean, median, five-number summary, sd)
# and graphical (e.g., box plot, histogram, density plot) summaries:
# 1* comparing the verbal IQ in the exposed and control groups
# 2* comparing performance IQ in the exposed and control groups

names(lead.df)
str(lead.df)

# "iqv" , "iqp" , "Group"

#1. comparison of verbal IQ "iqv" by groups:
#NUM:
favstats(~iqv|Group,data=lead.df)
  Group min Q1 median Q3 max     mean       sd  n missing
1     1  57 74     85 95 126 85.14103 14.68609 78       0
2     2  51 76     83 91 116 83.84783 11.56809 46       0

#Graphic:
bwplot(Group~iqv,data=lead.df)

#2. comparison of performance IQ "iqp" by groups:

#NUM:
favstats(~iqp|Group,data=lead.df)
  Group min    Q1 median    Q3 max      mean       sd  n missing
1     1  51 92.00    101 113.0 149 102.70513 16.78675 78       0
2     2  51 85.25     97 105.5 121  94.93478 13.34733 46       0

#Graphic:
bwplot(Group~iqp,data=lead.df)

The exposed children have somewhat lower mean and median IQ scores compared
to the unexposed children, but the differences don't appear to be very large.

Microbiology

A study was conducted to demonstrate that soybeans inoculated with nitrogen-fixing
bacteria yield more and grow adequately without expensive environmentally deleterious
synthesized fertilizers. The trial was conducted under controlled conditions with uniform
amounts of soil. The initial hypothesis was that inoculated plants would outperform their
uninoculated counterparts. This assumption is based on the facts that plants need nitrogen
to manufacture vital proteins and amino acids and that nitrogen-fixing bacteria would
make more of this substance available to plants, increasing their size and yield. There were
8 inoculated plants (I) and 8 uninoculated plants (U). The plant yield as measured by pod
weight for each plant is given in Table 2.20.

2.35 Compute appropriate descriptive statistics for I and U plants.

I <- c(1.76,1.45,1.03,1.53,2.34,1.96,1.79,1.21)
U <- c(0.49,0.85,1.00,1.54,1.01,0.75,2.11,0.92)

> favstats(I)
  min   Q1 median     Q3  max    mean        sd n missing
 1.03 1.39  1.645 1.8325 2.34 1.63375 0.4198958 8       0

> favstats(U)
  min    Q1 median     Q3  max    mean        sd n missing
 0.49 0.825   0.96 1.1425 2.11 1.08375 0.5097881 8       0
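
2.36 Use graphic methods to compare the two groups.

Solution:
# I and U are independent samples (the plants are not paired), so side-by-side
# box plots are used here rather than a scatter plot or a correlation.
boxplot(I, U, names = c("I", "U"), ylab = "Pod weight")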

2.37 What is your overall impression concerning the pod weight in the two
groups?

Inoculated plants (I) tend to have higher pod weight than uninoculated plants (U).

Unit 3:

Mental Health

Estimates of the prevalence of Alzheimer’s disease have recently been provided by

Pfeffer et al. [8]. The estimates are given in Table 3.5.

Suppose an unrelated 77-year-old man, 76-year-old woman, and 82-year-old

woman are selected from a community.

A = event that the 77-year-old man has Alzheimer’s disease, P(A) = 0.049

B = event that the 76-year-old woman has Alzheimer’s disease, P(B) = 0.023

C = event that the 82-year-old woman has Alzheimer’s disease, P(C) = 0.078

3.19: What is the probability that exactly one of the three people has Alzheimer’s
disease?

P(exactly one affected) = P(A ∩ B̄ ∩ C̄) + P(Ā ∩ B ∩ C̄) + P(Ā ∩ B̄ ∩ C)

= (.049 × .977 × .922) + (.951 × .023 × .922) + (.951 × .977 × .078)
= .04414 + .02017 + .07247 = .1368

3.20 Suppose we know one of the three people has Alzheimer’s

disease, but we don’t know which one. What is the conditional
probability that the affected person is a woman?

P(woman | exactly one affected)
= [P(Ā ∩ B ∩ C̄) + P(Ā ∩ B̄ ∩ C)] / [P(A ∩ B̄ ∩ C̄) + P(Ā ∩ B ∩ C̄) + P(Ā ∩ B̄ ∩ C)]
= (.02017 + .07247) / .1368 = .677

3.21 Suppose we know two of the three people have Alzheimer’s
disease. What is the conditional probability that they are both
women?

P(both women | exactly two affected)
= P(Ā ∩ B ∩ C) / [P(A ∩ B ∩ C̄) + P(A ∩ B̄ ∩ C) + P(Ā ∩ B ∩ C)]
= (.951 × .023 × .078) / [(.049 × .023 × .922) + (.049 × .977 × .078) + (.951 × .023 × .078)]
= .001706 / .006479 = .2633
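
The three answers above (3.19 to 3.21) can be cross-checked by listing all eight disease patterns for the three people and adding up the relevant probabilities. This is a minimal sketch assuming independence between the three people, as the solutions do; the names `patterns`, `pattern_prob`, and `n_affected` are illustrative.

```r
# Prevalences quoted in the solution above
p <- c(A = 0.049, B = 0.023, C = 0.078)

# All 2^3 disease patterns; 1 = affected, 0 = not affected
patterns <- expand.grid(A = 0:1, B = 0:1, C = 0:1)

# Probability of each pattern under independence
pattern_prob <- apply(patterns, 1, function(row) prod(ifelse(row == 1, p, 1 - p)))
n_affected <- rowSums(patterns)

# 3.19: exactly one of the three is affected
p_exactly_one <- sum(pattern_prob[n_affected == 1])
# 3.20: the affected person is a woman, given exactly one is affected
p_woman_given_one <- sum(pattern_prob[n_affected == 1 & patterns$A == 0]) / p_exactly_one
# 3.21: both affected people are women, given exactly two are affected
p_both_women_given_two <- sum(pattern_prob[n_affected == 2 & patterns$A == 0]) /
  sum(pattern_prob[n_affected == 2])

round(c(p_exactly_one, p_woman_given_one, p_both_women_given_two), 4)
```

Working from unrounded products gives about 0.263 for 3.21; small differences in the last digit come from rounding the intermediate terms.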

--------------------------------------------------------------------------------------------

Suppose the probability that both members of a married couple,

each of whom is 75–79 years of age, will have Alzheimer’s disease
is .0015.

A = {the man, aged 75–79, has Alzheimer’s disease}, P(A) = 0.049

B = {the woman, aged 75–79, has Alzheimer’s disease}, P(B) = 0.023

P(A ∩ B) = 0.0015

3.25 What is the probability that at least one member of the couple is affected?

P (A ∪ B) = P(A) + P(B) – P (A ∩ B)

= 0.049 + 0.023 – 0.0015 = 0.0705

3.26 What is the expected overall prevalence of Alzheimer’s

disease in the community if the prevalence estimates in Table 3.5
for specific age–gender groups hold?
Let A={Alzheimer’s}.
P(A) = P(A | 65-69 y.o.male) x P(65-69 y.o.male)
+ P(A | 65-69 y.o.female) x P(65-69 y.o.female)
+ P(A | 70-74 y.o.male) x P(70-74 y.o.male)
+ P(A | 70-74 y.o.female) x P(70-74 y.o.female)
+ P(A | 75-79 y.o.male) x P(75-79 y.o.male)
+ P(A | 75-79 y.o.female) x P(75-79 y.o.female)
+ P(A | 80-84 y.o.male) x P(80-84 y.o.male)
+ P(A | 80-84 y.o.female) x P(80-84 y.o.female)
+ P(A | 85+ y.o.male) x P(85+ y.o.male)
+ P(A | 85+ y.o.female) x P(85+ y.o.female)
= (.05 x .016) + (.10 x 0.0) + (.09 x 0.0) + (.17 x .022) + (.11 x .049) + (.18 x .023)
+ (.08 x .086) + (.12 x .078) + (.04 x .35) + (.06 x .279) = .061

The expected overall prevalence is 6.1% (or, 6.1 per 100 population).
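
The law of total probability used in 3.26 is a weighted sum, which is easy to verify numerically. The sketch below uses the group shares and prevalences in the order they appear in the solution above; the vector names are illustrative.

```r
# Share of the community in each age-sex group, in the order used above
# (65-69 M, 65-69 F, 70-74 M, 70-74 F, 75-79 M, 75-79 F, 80-84 M, 80-84 F, 85+ M, 85+ F)
group_share <- c(0.05, 0.10, 0.09, 0.17, 0.11, 0.18, 0.08, 0.12, 0.04, 0.06)
# Alzheimer's prevalence within each group, same order
prevalence  <- c(0.016, 0.000, 0.000, 0.022, 0.049, 0.023, 0.086, 0.078, 0.350, 0.279)

shares_sum_to_one <- isTRUE(all.equal(sum(group_share), 1))  # groups cover everyone
overall_prevalence <- sum(group_share * prevalence)          # law of total probability
round(overall_prevalence, 3)
```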

--------------------------------------------------------------------------------------------
Hypertension

Laboratory measures of cardiovascular reactivity are receiving increasing attention. Much of the
expanded interest is based on the belief that these measures, obtained under challenge from
physical and psychological stressors, may yield a more biologically meaningful index of
cardiovascular function than more traditional static measures. Typically, measurement of
cardiovascular reactivity involves the use of an automated blood-pressure monitor to examine the
changes in blood pressure before and after a stimulating experience (such as playing a video
game). For this purpose, blood-pressure measurements were made with the Vita-Stat
blood-pressure machine both before and after playing a video game. Similar measurements were
obtained using manual methods for measuring blood pressure. A person was classified as a
“reactor” if his or her DBP increased by 10 mm Hg or more after playing the game and as a
nonreactor otherwise. The results are given in Table 3.11.

3.78 If the population tested is representative of the general population, then what are the PV+
and PV− using this test?

PV+ = P(D | S+) = P(D ∩ S+) / P(S+) = (6/79) / (21/79) = 6 / 21 = .286

PV− = P(no D | S−) = P(no D ∩ S−) / P(S−) = (51/79) / (58/79) = 51 / 58 = .879
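
The same predictive values follow directly from the counts. The sketch below uses only the counts implied by the fractions above (Table 3.11 itself is not reproduced here), and the variable names are illustrative.

```r
# Counts implied by the fractions in the solution above
true_pos <- 6    # test positive and truly a reactor
test_pos <- 21   # all test positives
true_neg <- 51   # test negative and truly a nonreactor
test_neg <- 58   # all test negatives

pv_pos <- true_pos / test_pos   # P(D | S+)
pv_neg <- true_neg / test_neg   # P(no D | S-)
round(c(PV_pos = pv_pos, PV_neg = pv_neg), 3)
```

Dividing both numerator and denominator by the total of 79 leaves the ratios unchanged, which is why the 79 cancels in the hand calculation.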

Mental Health

The Chinese Mini-Mental Status Test (CMMS) consists of 114 items intended to identify people
with Alzheimer’s disease and senile dementia among people in China [14]. An extensive clinical
evaluation of this instrument was performed, whereby participants were interviewed by
psychiatrists and nurses and a definitive diagnosis of dementia was made. Table 3.13 shows the
results obtained for the subgroup of people with at least some formal education.

Suppose a cutoff value of ≤ 20 on the test is used to identify people with dementia.

3.87 What is the sensitivity of the test?

Sensitivity = P(test+ | disease) = 12/16 = 0.75

3.88 What is the specificity of the test?

Specificity = P(test− | no disease) = 34/46 = 0.739

3.89 The cutoff value of 20 on the CMMS used to identify people with dementia is arbitrary.
Suppose we consider changing the cutoff. What are the sensitivity and specificity if cutoffs of 5,
10, 15, 20, 25, or 30 are used? Make a table of your results.

CMMS cutoff   Specificity       Sensitivity       False-positive rate (1 − specificity)
5             46/46 = 1         2/16 = 0.125      1 − 1 = 0
10            46/46 = 1         3/16 = 0.188      1 − 1 = 0
15            43/46 = 0.935     7/16 = 0.438      1 − 0.935 = 0.065
20            34/46 = 0.739     12/16 = 0.750     1 − 0.739 = 0.261
25            18/46 = 0.391     15/16 = 0.938     1 − 0.391 = 0.609
30            0/46 = 0          16/16 = 1         1 − 0 = 1

3.90 Construct a ROC curve based on the table constructed in Problem 3.89.
3.91 Suppose we want both the sensitivity and specificity to be at least 70%. Use the ROC curve
to identify the possible value(s) to use as the cutoff for identifying people with dementia, based
on these criteria.

Both the sensitivity and the specificity must be at least 70%. From the table in Problem 3.89,
only the cutoff CMMS score ≤ 20 meets both conditions: its sensitivity is 0.750 and its
specificity is 0.739. On the ROC curve from Problem 3.90 this is the point (0.261, 0.750). At a
cutoff of 15 the sensitivity (0.438) is too low, and at a cutoff of 25 the specificity (0.391) is
too low. Hence, among the cutoffs considered, the cutoff for identifying people with dementia is
a CMMS score ≤ 20.

3.92 Calculate the area under the ROC curve. Interpret what this area means in words in the
context of this problem.

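Using the trapezoidal rule on the ROC points from Problem 3.89:

Area = 0.5 × [(0.188 + 0.438)(0.065) + (0.438 + 0.750)(0.261 − 0.065) + (0.750 + 0.938)(0.609 − 0.261)
+ (0.938 + 1)(1 − 0.609)] = 0.809

Interpretation: for a randomly selected person with dementia and a randomly selected person
without dementia, there is about an 81% probability that the person with dementia has the lower
CMMS score.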
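
The trapezoidal calculation can be reproduced from the fractions in the 3.89 table. This is a minimal sketch that adds the (0, 0) corner to the ROC points; `fpr`, `sens`, and `auc` are illustrative names.

```r
# ROC points for cutoffs 5, 10, 15, 20, 25, 30, with the (0, 0) corner added first
fpr  <- c(0, 0, 0, 3, 12, 28, 46) / 46   # false-positive rate = 1 - specificity
sens <- c(0, 2, 3, 7, 12, 15, 16) / 16   # sensitivity

# Trapezoidal rule: width of each step times the average of its two heights
auc <- sum(diff(fpr) * (head(sens, -1) + tail(sens, -1)) / 2)
round(auc, 3)
```

The first two steps have zero width (the false-positive rate stays at 0), so they add nothing to the area, which matches the four terms in the hand calculation.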

Suppose a birth defect has a recessive form of inheritance. In a

study population, the recessive gene (a) initially has a prevalence
of 25%. A subject has the birth defect if both maternal and
paternal genes are of type a.

3.115 In the general population, what is the probability that an

individual will have the birth defect, assuming that maternal and
paternal genes are inherited independently?

**Answer**

The recessive gene (a) initially has a prevalence of 25%.

P(maternal gene is a) = 25/100 = 0.25

P(paternal gene is a) = 25/100 = 0.25

P(birth defect) = P(maternal gene is a) × P(paternal gene is a), because the two genes are
inherited independently

= 0.25 × 0.25 = 0.0625

**Supporting Work**

The recessive gene (a) has a prevalence of 25% in the study population, so the probability that
the maternal gene is a is 0.25, and likewise for the paternal gene. Because the two genes are
inherited independently, P(A ∩ B) = P(A) × P(B) gives the probability that both are a. The
result is 0.0625, so about 6.25% of individuals in the general population are expected to have
the birth defect.

A further study finds that after 10 generations (≈200 years) a lot

of inbreeding has taken place in the population. Two
subpopulations (populations A and B), consisting of 30% and 70%
of the general population, respectively, have formed. Within
population A, prevalence of the recessive gene is 40%, whereas in
population B it is 10%.

P(A) = 0.30, P(B) = 0.70

P(A) × P(B) = 0.30 × 0.70 = 0.21

3.116 Suppose that in 25% of marriages both people are from

population A, in 65% both are from population B, and in 10%
there is one partner from population A and one from population B.
What is the probability of a birth defect in the next generation?

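*Answer*

The prevalence of gene a is 0.40 in population A and 0.10 in population B. A child has the
defect only if both parents pass on gene a.

Both parents from A (25% of marriages): 0.40 × 0.40 × 0.25 = 0.16 × 0.25 = 0.04

Both parents from B (65% of marriages): 0.10 × 0.10 × 0.65 = 0.01 × 0.65 = 0.0065

One parent from each (10% of marriages): 0.40 × 0.10 × 0.10 = 0.04 × 0.10 = 0.004

Probability of a birth defect in the next generation = 0.04 + 0.0065 + 0.004 = 0.0505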

3.117 Suppose that a baby is born with a birth defect, but the
baby’s ancestry is unknown. What is the posterior probability that
the baby will have both parents from population A, both parents
from population B, or mixed ancestry, respectively? (Hint: Use
Bayes’ rule.)

## Problem 3.117

The Prevalence of Population A (PRE-A) = 0.40

The Prevalence of Population B (PRE-B) = 0.10

The Probability of Both Parents from Population A = P(A) = 0.25

The Probability of Both Parents from Population B = P(B) = 0.65

The Probability of One from A and B = P(AB) = 0.10

The Probability of a birth defect in the next generation = P(D) = 0.0505

### Both Parents from Population A:

**Answer**

The Posterior Probability = 0.792

**Supporting Work**
The posterior probability that the baby will have both parents
from population A = ((PRE-A) ^2 * P(A)) /P(D)

The posterior probability = ((0.40) ^2 * 0.25) /0.0505 = 0.792

### Both Parents from Population B:

**Answer**

The Posterior Probability = 0.1287

**Supporting Work**

The posterior probability that the baby will have both parents
from population B = ((PRE-B) ^2 * P(B)) / P(D)

The posterior probability = ((0.10) ^2 * 0.65)/0.0505 = 0.1287

### Mixed Ancestry:

**Answer**

The Posterior Probability = 0.0792

**Supporting Work**

The posterior probability that the parents are of mixed ancestry

= (PRE-A * PRE-B * P(AB))/P(D)

The posterior probability = (0.40 * 0.10 * 0.10)/ 0.0505 = 0.0792
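
The three posterior probabilities in 3.117 all come from one application of Bayes' rule, which can be written compactly. The sketch below assumes, as the solutions do, that a child has the defect only when both parents pass on gene a; `gene_prev`, `prior`, and `likelihood` are illustrative names.

```r
gene_prev <- c(A = 0.40, B = 0.10)            # recessive-gene prevalence by population
prior <- c(AA = 0.25, BB = 0.65, AB = 0.10)   # share of each marriage type

# P(defect | marriage type): each parent passes on gene a independently
likelihood <- c(AA = gene_prev[["A"]]^2,
                BB = gene_prev[["B"]]^2,
                AB = gene_prev[["A"]] * gene_prev[["B"]])

p_defect  <- sum(prior * likelihood)         # total probability, as in 3.116
posterior <- prior * likelihood / p_defect   # Bayes' rule, as in 3.117
round(c(p_defect = p_defect, posterior), 4)
```

The posteriors sum to 1, which is a quick sanity check on the three hand calculations.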

Unit 4:
Let X be the random variable representing the number of
hypertensive adults in Example 3.12.

Hypertension Genetics Suppose we are conducting a

hypertension-screening program in the home. Consider all
possible pairs of DBP measurements of the mother and father
within a given family, assuming that the mother and father are
not genetically related. This sample space consists of all pairs of
numbers of the form (X, Y) where X > 0, Y > 0. Certain specific
events might be of interest in this context. In particular, we might
be interested in whether the mother or father is hypertensive,
which is described, respectively, by events A = {mother’s DBP ≥
90}, B = {father’s DBP ≥ 90}. These events are diagrammed in
Figure 3.4.

Suppose we know that Pr(A) = .1, Pr(B) = .2. What can we say
about Pr(A ∩ B) = Pr(mother’s DBP ≥ 90 and father’s DBP ≥ 90) =
Pr(both mother and father are hypertensive)? We can say nothing
unless we are willing to make certain assumptions.

4.1 Derive the probability-mass function for X.

Let X be the number of hypertensive adults (mother and father) in a family, and let
A = {mother is hypertensive} and B = {father is hypertensive}, as in the example above.

It is given that P(A) = 0.1 and P(B) = 0.2. Because the mother and father are not genetically
related, A and B are treated as independent.

X can take the values 0, 1, and 2:

• "0" means that neither the mother nor the father is affected,
• "1" means that exactly one adult is affected, and
• "2" means that both the mother and the father are affected.

Outcome    Neither hypertensive    Exactly one hypertensive    Both hypertensive
X          0                       1                           2
P(X = x)   0.72                    0.26                        0.02
CDF        0.72                    0.98                        1

1. When X = 0, the probability mass function is as follows:

P(neither is affected) = (1 − P(A)) × (1 − P(B)) = (1 − 0.1) × (1 − 0.2) = 0.72

2. When X = 1, the probability mass function is as follows:

P(exactly one is affected) = P(A)(1 − P(B)) + P(B)(1 − P(A))

= (0.1 × (1 − 0.2)) + (0.2 × (1 − 0.1)) = 0.26

3. When X = 2, the probability mass function is as follows:

P(both are affected) = P(A) × P(B) = 0.1 × 0.2 = 0.02

4.2 What is its expected value?

E(X) = μ_X = Σ x · P(X = x)

E(X) = (0 × 0.72) + (1 × 0.26) + (2 × 0.02) = 0.3

4.3 What is its variance?

Var(X) = σ_X^2 = E(X^2) − [E(X)]^2

E(X^2) = ((0^2) * 0.72) + ((1^2) * 0.26) + ((2^2) * 0.02) = 0.34

Var(X) = 0.34 -(0.3^2) = 0.25

4.4 What is its cumulative-distribution function?

F_X(x) = P(X ≤ x)

F(x) = 0 for x < 0

F(x) = P(X = 0) = 0.72 for 0 ≤ x < 1

F(x) = P(X = 0) + P(X = 1) = 0.72 + 0.26 = 0.98 for 1 ≤ x < 2

F(x) = P(X = 0) + P(X = 1) + P(X = 2) = 0.72 + 0.26 + 0.02 = 1.0 for x ≥ 2
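
Problems 4.1 to 4.4 can be checked together in a few lines of base R. The sketch assumes independence of the two parents, as in the derivation above; `p_mother`, `p_father`, `pmf`, `ex`, `varx`, and `cdf` are illustrative names.

```r
p_mother <- 0.1   # Pr(A) from the example
p_father <- 0.2   # Pr(B) from the example

x   <- 0:2
pmf <- c((1 - p_mother) * (1 - p_father),                        # X = 0
         p_mother * (1 - p_father) + (1 - p_mother) * p_father,  # X = 1
         p_mother * p_father)                                    # X = 2

ex   <- sum(x * pmf)            # E(X)
varx <- sum(x^2 * pmf) - ex^2   # Var(X) = E(X^2) - [E(X)]^2
cdf  <- cumsum(pmf)             # F(0), F(1), F(2)
rbind(x, pmf, cdf)
```

Note that E(X) = 0.1 + 0.2, the sum of the two individual probabilities, which is a quick way to confirm the expected value.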

Suppose we want to check the accuracy of self-reported

diagnoses of angina by getting further medical records on
a subset of the cases.

4.5 If we have 50 reported cases of angina and we want to

select 5 for further review, then how many ways can we
select these cases if order of selection matters?

nPk = n! / (n − k)!

50P5 = 50! / (50 − 5)! = (50 × 49 × 48 × 47 × 46) × 45! / 45! = 50 × 49 × 48 × 47 × 46

= 254,251,200

4.6 Answer Problem 4.5 assuming order of selection does

not matter.

nCk = n! / (k! × (n − k)!)

50C5 = 50! / (5! × (50 − 5)!) = 2,118,760
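
Both counts are quick to confirm in base R. The sketch below avoids computing 50! directly by multiplying only the five factors that survive the cancellation; `ordered_ways` and `unordered_ways` are illustrative names.

```r
n <- 50
k <- 5

ordered_ways   <- prod((n - k + 1):n)   # nPk = n! / (n - k)! = 50 * 49 * 48 * 47 * 46
unordered_ways <- choose(n, k)          # nCk = n! / (k! (n - k)!)

# Each unordered selection corresponds to k! orderings
c(ordered_ways, unordered_ways, ordered_ways / factorial(k))
```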

4.9 Suppose 6 of 15 students in a grade-school class

develop influenza, whereas 20% of grade-school students
nationwide develop influenza. Is there evidence of an
excessive number of cases in the class? That is, what is
the probability of obtaining at least 6 cases in this class if
the nationwide rate holds true?

X ~ Binom (n=15, Prop=0.20)

P(X>= 6) = 1- P(X ≤ 5)

= 1- (P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) + P(X=5))

= 1 - pbinom(q = 5, size = 15, prob = 0.20) = 0.061

4.10 What is the expected number of students in the class

who will develop influenza?

E(X) = np = 15 × 0.20 = 3
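
The upper-tail probability in 4.9 can be obtained either from the cumulative distribution or by adding the individual terms, and the two routes should agree. A minimal sketch, with illustrative variable names:

```r
n <- 15
p <- 0.20

tail_by_cdf <- 1 - pbinom(5, size = n, prob = p)      # 1 - P(X <= 5)
tail_by_sum <- sum(dbinom(6:n, size = n, prob = p))   # P(X = 6) + ... + P(X = 15)
expected    <- n * p                                  # E(X) for 4.10

round(c(tail_by_cdf, tail_by_sum, expected), 3)
```

A tail probability of about 0.06 is just above the conventional 0.05 threshold, so 6 cases in a class of 15 is somewhat high but not clearly excessive.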

Hypertension

A national study found that treating people appropriately

for high blood pressure reduced their overall mortality by
20%. Treating people adequately for hypertension has
been difficult because it is estimated that 50% of
hypertensives do not know they have high blood pressure,
50% of those who do know are inadequately treated by
their physicians, and 50% who are appropriately treated
fail to follow this treatment by taking the right number of
pills.

4.30 What is the probability that among 10 true

hypertensives at least 50% are being treated
appropriately and are complying with this treatment?

P=0.5*0.5*0.5=0.125

X ~ Binom (n=10, Prop=0.125)

P(X>= 5) = 1- P(X ≤ 4)

= 1 - pbinom(q = 4, size = 10, prob = 0.125) = 0.00445

4.31 What is the probability that at least 7 of the 10

hypertensives know they have high blood pressure?

X ~ Binom (n=10, Prop=0.5)

P (X>= 7) = 1- P (X ≤ 6)

= 1 - pbinom(q = 6, size = 10, prob = 0.5) = 0.172

4.32 If the preceding 50% rates were each reduced to 40%

by a massive education program, then what effect would
this change have on the overall mortality rate among true
hypertensives; that is, would the mortality rate decrease
and, if so, what percentage of deaths among
hypertensives could be prevented by the education
program?

Cardiovascular Disease

An article was published [13] concerning the incidence of

cardiac death attributable to the earthquake in Los
Angeles County on January 17, 1994. In the week before
the earthquake there were an average of 15.6 cardiac
deaths per day in Los Angeles County. On the day of the
earthquake, there were 51 cardiac deaths.

4.64 What is the exact probability of 51 deaths occurring

on one day if the cardiac death rate in the previous week
continued to hold on the day of the earthquake?
X ~ Pois (μ = 15.6)

μ = λt = 15.6 × 1 = 15.6

P(X=51) = dpois(x = 51, lambda = 15.6) = 7.650953e-13

4.65 Is the occurrence of 51 deaths unusual? (Hint: Use

the same methodology as in Example 4.32.)

P(X ≥ 51) = 1 - P(X ≤ 50)

1 - ppois (q = 50, lambda = 15.6) = 1.089351e-12

4.66 What is the maximum number of cardiac deaths that

could have occurred on the day of the earthquake to be
consistent with the rate of cardiac deaths in the past
week? (Hint: Use a cutoff probability of .05 to determine
the maximum number.)

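Find the largest x with P(X ≥ x) > 0.05, where P(X ≥ x) = 1 − P(X ≤ x − 1).

myx <- seq(21, 23, 1)

1 - ppois (q = myx - 1, lambda = 15.6)   # P(X >= 21), P(X >= 22), P(X >= 23)

P(X ≥ 22) is above 0.05 and P(X ≥ 23) is below 0.05, so x = 22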
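
Rather than inspecting a few hand-picked values, the cutoff can be found by scanning a range of counts. This is a sketch of that search; the range 16 to 60 is an arbitrary choice that starts just above the mean, and the names are illustrative.

```r
lambda <- 15.6
x <- 16:60   # candidate counts above the mean (arbitrary upper limit)

# P(X >= x) for every candidate; lower.tail = FALSE gives P(X > x - 1)
upper_tail <- ppois(x - 1, lambda, lower.tail = FALSE)

# Largest count whose upper-tail probability still exceeds 0.05
max_consistent <- max(x[upper_tail > 0.05])
max_consistent
```

The subtraction of 1 matters: `ppois(q, ...)` is P(X ≤ q), so the upper tail starting at x needs q = x − 1.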

Hospital Epidemiology

Suppose the number of admissions to the emergency

room at a small hospital follows a Poisson distribution but
the incidence rate changes on different days of the week.
On a weekday there are on average two admissions per
day, while on a weekend day there is on average one
admission per day.
4.90 What is the probability of at least one admission on a
Saturday?

X ~ Pois (μ = 1)

P (X ≥ 1) = 1- P (X ≤ 0)

1- ppois (q=0, lambda=1) = 0.6321206

4.91 What is the probability of having 0, 1, and 2+

admissions for an entire week, if the results for different
days during the week are assumed to be independent?

μ = 2+2+2+2+2 = 10 for the five weekdays, μ = 1+1 = 2 for the weekend

1. P (X = 0) = dpois (x=0, lambda=10) * dpois(x=0, lambda=2)

= 6.144212e-06

2. P (X = 1) = dpois (x=1, lambda=10) * dpois(x=0, lambda=2) + dpois (x=1, lambda=2) * dpois(x=0, lambda=10)

= 7.373055e-05

3. P (X ≥ 2) = 1 − [P (X = 0) + P (X = 1)] = 1 − (6.144212e-06 + 7.373055e-05)

= 0.9999201
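
Because sums of independent Poisson counts are again Poisson, the weekly total can also be treated as a single Poisson variable with mean 10 + 2 = 12. The sketch below checks that this shortcut matches the piece-by-piece calculation; the variable names are illustrative.

```r
lambda_weekdays <- 5 * 2   # five weekdays at 2 admissions per day
lambda_weekend  <- 2 * 1   # two weekend days at 1 admission per day
lambda_week     <- lambda_weekdays + lambda_weekend

p0 <- dpois(0, lambda_week)
p1 <- dpois(1, lambda_week)
p2_plus <- 1 - (p0 + p1)

# P(X = 1) built from the two independent pieces, as in the solution above
p1_pieces <- dpois(1, lambda_weekdays) * dpois(0, lambda_weekend) +
  dpois(0, lambda_weekdays) * dpois(1, lambda_weekend)

c(p0, p1, p2_plus)
```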

Obstetrics
Suppose the incidence of a specific birth defect in a high
socioeconomic status (SES) census tract is 50 cases per
100,000 births.

4.92 If there are 5000 births in the census tract in 1 year,

then what is the probability that there will be exactly 5
cases of the birth defect during the year (census tract A in
Table 4.21)?

p = 50 / 100,000 = 0.0005

n = 5000, μ = np = 5000* 0.0005 = 2.5

X ~ Pois (μ = 2.5)

P (X =5) = dpois (x=5, lambda=2.5) = 0.06680094

Suppose the incidence of the same birth defect in a low

SES census tract is 100 cases per 100,000 births.

4.93 If there are 12,000 births in the census tract in 1

year, then what is the probability that there will be at
least 8 cases of the birth defect during the year (census
tract B in Table 4.21)?

p = 100 / 100,000 = 0.001

n = 12,000, μ = np = 12000* 0.001 = 12

X ~ Pois (μ = 12)

P (X ≥ 8) = 1- P (X ≤ 7)

= 1- ppois (q=7, lambda=12) = 0.9104955

Suppose a city is divided into eight census tracts as shown
in Table 4.21.

4.94 Suppose a child is born with the birth defect but the
address of the mother is unknown. What is the probability
that the child comes from a low SES census tract?

High SES μ = 10000*0.0005 = 5

Low SES μ = (12000+10000+8000+7000+20000+3000) * 0.001

= 60

Total = 65

P (low SES | birth defect) = 60 / 65 = 0.9230769

4.95 What is the expected number of cases over 1 year in

the city?

High SES μ = 10000*0.0005 = 5

Low SES μ = (12000+10000+8000+7000+20000+3000) * 0.001

= 60

The expected number of cases over 1 year = High SES μ + Low SES μ

= 5 + 60 = 65

Unit 5:
Cardiovascular Disease

Because serum cholesterol is related to age and sex, some

investigators prefer to express it in terms of z-scores. If
X = raw serum cholesterol, then Z = (X − μ) / σ, where μ is the
mean and σ is the standard deviation of serum cholesterol
for a given age–gender group. Suppose Z is regarded as a
standard normal random variable.

5.1 What is Pr (Z < 0.5)?

Z ~ N (0, 1)

P (Z < 0.5) = pnorm (q=0.5, mean=0, sd=1) = 0.6914625

5.2 What is Pr (Z > 0.5)?

Z ~ N (0, 1)

P (Z > 0.5) = 1- P (Z < 0.5) = 1- pnorm (q=0.5, mean=0, sd=1) =

0.3085375
5.3 What is Pr (−1.0 < Z < 1.5)?

Z ~ N (0, 1)

P (−1.0 < Z < 1.5) = P (Z < 1.5) - P (Z < -1) = P (Z < 1.5) - (1 - P (Z < 1))

pnormGC (bound = c(-1, 1.5), region = "between", mean = 0, sd = 1, graph = TRUE) = 0.7745375
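
`pnormGC` comes from the tigerstats package. If that package is not available, the same "between" probability can be computed with base R's `pnorm` alone. The helper below is a hypothetical convenience function, not part of any package:

```r
# Hypothetical helper: P(lower < X < upper) for a normal variable
between_prob <- function(lower, upper, mean = 0, sd = 1) {
  pnorm(upper, mean, sd) - pnorm(lower, mean, sd)
}

p_5_3 <- between_prob(-1, 1.5)   # Problem 5.3
p_5_3
```

The same helper works for any mean and standard deviation by passing them as the third and fourth arguments.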

Suppose a person is regarded as having high cholesterol if

Z > 2.0 and borderline cholesterol if 1.5 < Z < 2.0.

5.4 What proportion of people have high cholesterol?

Z ~ N (0, 1)

P (people have high cholesterol) = P (Z > 2.0) = 1- P (Z < 2)

= 1- pnorm (q=2, mean=0, sd=1) = 0.02275013

= pnormGC (bound = 2, region = "above", mean = 0, sd = 1, graph = TRUE) =
0.02275013

5.5 What proportion of people have borderline

cholesterol?

Z ~ N (0, 1)

P (people have borderline cholesterol) = P (1.5 < Z < 2.0.)

P (1.5 < Z < 2.0) = P (Z < 2.0) - P (Z < 1.5)

= pnorm (q=2, mean=0, sd=1) - pnorm (q=1.5, mean=0, sd=1)
= 0.04405707

= pnormGC (bound = c(1.5, 2.0), region = "between", mean = 0, sd = 1, graph =

TRUE) = 0.04405707

Cardiovascular Disease

Serum cholesterol is an important risk factor for coronary

disease. We can show that serum cholesterol is
approximately normally distributed, with mean = 219
mg/dL and standard deviation = 50 mg/dL.

5.14 If the clinically desirable range for cholesterol is <

200 mg/dL, what proportion of people have clinically
desirable levels of cholesterol?

X ~ N (μ = 219, σ = 50)

P (people have clinically desirable levels of cholesterol) = P (X <

200)

= pnorm (q=200, mean=219, sd=50) = 0.3519727

5.15 Some investigators believe that only cholesterol
levels over 250 mg/dL indicate a high-enough risk for
heart disease to warrant treatment. What proportion of
the population does this group represent?

X ~ N (μ = 219, σ = 50)

P (high-enough risk for heart disease to warrant treatment) = P (X

> 250)

= 1- P (X < 250) = 1- pnorm (q=250, mean=219, sd=50) =

0.2676289

5.16 What proportion of the general population has

borderline high-cholesterol levels—that is, > 200 but <
250 mg/dL?

X ~ N (μ = 219, σ = 50)

P (population has borderline high-cholesterol levels) = P (200 < X

< 250)

P (200 < X < 250) = P (X < 250) - P (X < 200)

pnormGC (bound = c(200, 250), region = "between", mean = 219, sd = 50, graph
= TRUE) = 0.3803984

Hepatic Disease

Suppose we observe 84 alcoholics with cirrhosis of the

liver, of whom 29 have hepatomas—that is, liver-cell
carcinoma. Suppose we know, based on a large sample,
that the risk of hepatoma among alcoholics without
cirrhosis of the liver is 24%.

5.50 What is the probability that we observe exactly 29

alcoholics with cirrhosis of the liver who have hepatomas
if the true rate of hepatoma among alcoholics (with or
without cirrhosis of the liver) is .24?

X ~ Binom (n=84, Prop=0.24)

P (X= 29) = dbinom (x = 29, size = 84, prob = 0.24) =

0.008730478

5.51 What is the probability of observing at least 29

hepatomas among the 84 alcoholics with cirrhosis of the
liver under the assumptions in Problem 5.50?

X ~ Binom (n=84, Prop=0.24)

P (X ≥ 29) = 1- P (X ≤ 28)

= 1- pbinom (q = 28, size = 84, prob = 0.24) = 0.01935102

5.52 What is the smallest number of hepatomas that

would have to be observed among the alcoholics with
cirrhosis of the liver for the hepatoma experience in this
group to differ from the hepatoma experience among
alcoholics without cirrhosis of the liver? (Hint: Use a 5%
probability of getting a result at least as extreme to
denote differences between the hepatoma experiences of
the two groups.)

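Find the smallest x with P(X ≥ x) < 0.05, where P(X ≥ x) = 1 − P(X ≤ x − 1).

myx <- seq (26, 28, 1)

1 - pbinom (q = myx - 1, size = 84, prob = 0.24)   # P(X >= 26), P(X >= 27), P(X >= 28)

P(X ≥ 27) is above 0.05 and P(X ≥ 28) is below 0.05, so x = 28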
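
The threshold can also be located by scanning every count above the expected value instead of checking a few candidates by hand. A minimal sketch, with illustrative names:

```r
n <- 84
p <- 0.24
x <- 21:n   # counts above the expected value n * p = 20.16

# P(X >= x) for every candidate; lower.tail = FALSE gives P(X > x - 1)
upper_tail <- pbinom(x - 1, n, p, lower.tail = FALSE)

# Smallest count whose upper-tail probability falls below 0.05
smallest_unusual <- min(x[upper_tail < 0.05])
smallest_unusual
```

The observed count of 29 is above this threshold, in line with the tail probability of about 0.019 found in 5.51.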

Environmental Health

5.58 A study was conducted relating particulate air
pollution and daily mortality in Steubenville, Ohio [4]. On
average over the past 10 years there have been 3 deaths
per day in Steubenville. Suppose that on 90 high-pollution
days (days in which the total suspended particulates are in the
highest quartile among all days) the death rate is 3.2
deaths per day, or 288 deaths observed over the 90
high-pollution days. Are there an unusual number of deaths on
high-pollution days?

Expected deaths over 90 days at the long-run rate = 3 × 90 = 270

X ~ Pois (μ = 270)

Since μ ≥ 10, X ≈ Y ~ N (mean = 270, sd = sqrt (270))

P (X ≥ 288) ≈ P (Y ≥ 287.5) = 1- P (Y < 287.5)

= 1- pnorm (q = 287.5, mean = 270, sd = sqrt (270)) ≈ 0.14 > 0.05

Not unusual
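At the long-run rate of 3 deaths per day, 90 days would be expected to produce 3 × 90 = 270 deaths, so the question is how surprising 288 or more deaths is when the Poisson mean is 270. The sketch below compares the exact Poisson tail with the normal approximation (using a continuity correction); variable names are illustrative.

```r
# Expected count under the long-run rate, and the observed count
expected_deaths <- 3 * 90   # 270
observed_deaths <- 288

# Exact Poisson tail: P(X >= 288) = P(X > 287)
p_exact <- ppois(observed_deaths - 1, lambda = expected_deaths, lower.tail = FALSE)

# Normal approximation with continuity correction: P(Y >= 287.5)
p_normal <- pnorm(observed_deaths - 0.5, mean = expected_deaths,
                  sd = sqrt(expected_deaths), lower.tail = FALSE)

print(c(exact = p_exact, normal = p_normal))
```

Both probabilities are well above 0.05, so 288 deaths is not an unusual count under this model.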

5.106 What is the 40th percentile of a normal distribution
with mean = 5 and variance = 9?

X ~ N (mean = 5, variance = 9)

X ~ N (mean = 5, SD = sqrt (9) = 3)

P (X < x0.40) = 0.40, so x0.40 = qnorm (p=0.40, mean=5, sd=3) = 4.24

qnormGC (area = 0.40, region = "below", mean = 5, sd = 3, graph = TRUE) = 4.24
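A percentile of any normal distribution can be obtained from the standard normal percentile through x_p = mean + sd × z_p. The sketch below applies that to the 40th percentile and confirms the result by plugging it back into the cumulative distribution; variable names are illustrative.

```r
# 40th percentile of N(mean = 5, variance = 9)
mu <- 5
sigma <- sqrt(9)

# Via the standard normal percentile: x_p = mu + sigma * z_p
x_40 <- mu + sigma * qnorm(0.40)

# Directly on the original scale
x_40_direct <- qnorm(0.40, mean = mu, sd = sigma)

# Round trip: the area below x_40 should be 0.40
area_below <- pnorm(x_40, mean = mu, sd = sigma)

print(c(x_40, x_40_direct, area_below))
```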

5.108 What is z0.90?

Z ~ N (0,1)

P (Z < z0.90) = 0.90, so z0.90 = qnorm (p=0.90, mean=0, sd=1) = 1.28

qnormGC (area = 0.90, region = "below", mean = 0, sd = 1, graph = TRUE) = 1.28
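As a small check on the meaning of z0.90, the sketch below confirms that 90% of the standard normal area lies below it, and that by symmetry the 10th percentile is its negative. Variable names are illustrative.

```r
# 90th percentile of the standard normal distribution
z_90 <- qnorm(0.90)

# Round trip: area below z_90
area_below <- pnorm(z_90)

# Symmetry of the standard normal: z_0.10 = -z_0.90
z_10 <- qnorm(0.10)

print(c(z_90 = z_90, area_below = area_below, z_10 = z_10))
```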
Hypertension

Blood pressure readings are known to be highly variable.
Suppose we have mean SBP for one individual over n visits
with k readings per visit (X_{n,k}). The variability of X_{n,k}
depends on n and k and is given by the formula σ_w^2 =
σ_A^2/n + σ^2/(nk), where σ_A^2 = between-visit variability
and σ^2 = within-visit variability. For 30- to 49-year-old
Caucasian females, σ_A^2 = 42.9 and σ^2 = 12.8. For one
individual, we also assume that X_{n,k} is normally
distributed about their true long-term mean = μ with
variance = σ_w^2.

n = visits, k = readings per visit, σ_A^2 = 42.9, σ^2 = 12.8

σ_w^2 = σ_A^2/n + σ^2/(nk)

mean = μ with variance = σ_w^2
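To see how the variance of the observed mean SBP depends on the number of visits and readings, the formula can be wrapped in a small helper. The function name sbp_mean_variance and its argument names are hypothetical, chosen only for this sketch; the default values are the 42.9 and 12.8 given above.

```r
# Variance of the mean SBP over n visits with k readings per visit:
# between-visit variance / n + within-visit variance / (n * k)
sbp_mean_variance <- function(n, k, var_between = 42.9, var_within = 12.8) {
  var_between / n + var_within / (n * k)
}

# Tabulate a few visit/reading designs
designs <- expand.grid(n = 1:3, k = 1:3)
designs$var_w <- sbp_mean_variance(designs$n, designs$k)
designs$sd_w <- sqrt(designs$var_w)
print(designs)
```

Adding visits shrinks both terms, while adding readings per visit only shrinks the within-visit term, so extra visits reduce the variance faster than extra readings.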

5.123 Suppose a woman is measured at two visits with
two readings per visit. If her true long-term SBP = 130 mm
Hg, then what is the probability that her observed mean
SBP is ≥140 mm Hg? (Ignore any continuity correction.)
(Note: By true mean SBP we mean the average SBP over a
large number of visits for that subject.)

σ_w^2 = σ_A^2/n + σ^2/(nk) = (42.9/2) + (12.8/(2*2)) = 24.65

mean = μ = 130, with variance = σ_w^2 = 24.65

P (SBP is ≥140) = P (X ≥140) = 1- P (X < 140)

= 1- pnorm (q=140, mean=130, sd=sqrt (24.65)) = 0.02199696
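The variance 24.65 can be sanity-checked by simulation. The sketch below assumes an additive model that is consistent with the formula but is not stated in the problem: each visit adds an independent normal effect with variance 42.9, and each reading adds an independent normal error with variance 12.8. The seed, the number of simulations and all variable names are arbitrary choices for illustration.

```r
set.seed(2024)
n_sims <- 100000   # simulated women, each with n visits of k readings
n <- 2
k <- 2
true_mean <- 130

# Assumed between-visit effects: one per visit, variance 42.9
visit_effect <- matrix(rnorm(n_sims * n, sd = sqrt(42.9)), nrow = n_sims)

# Assumed within-visit errors: one per reading, variance 12.8
reading_error <- array(rnorm(n_sims * n * k, sd = sqrt(12.8)),
                       dim = c(n_sims, n, k))

# Average the k readings within each visit, then average over the n visits
visit_means <- visit_effect + rowMeans(reading_error, dims = 2)
observed_mean <- true_mean + rowMeans(visit_means)

sim_var <- var(observed_mean)            # should be near 24.65
sim_prob <- mean(observed_mean >= 140)   # should be near 0.022
print(c(sim_var = sim_var, sim_prob = sim_prob))
```

The simulated values vary slightly with the seed, so they should be read as approximate agreement with the closed-form answer rather than an exact match.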

It is also known that over a large number of 30- to 49-
year-old Caucasian women, their true mean SBP is
normally distributed with mean = 120 mm Hg and
standard deviation = 14 mm Hg. Also, over a large number
of African American 30- to 49-year-old women, their true
mean SBP is normal with mean = 130 mm Hg and standard
deviation = 20 mm Hg.

5.125 Suppose we select a random 30- to 49-year-old
Caucasian woman and a random 30- to 49-year-old African
American woman. What is the probability that the African
American woman has a higher true SBP?

Hint: Use Equation 5.10 (on page 133).

30- to 49-year-old Caucasian woman = X1…

mean = 120 mm Hg, sd = 14 mm Hg

African American 30- to 49-year-old women = X2…

mean = 130 mm Hg, sd = 20 mm Hg

Ld = X2 – X1

E(Ld) = E(X2) - E(X1) = 130 – 120 = 10

Var (Ld) = ∑ ci^2 Var (Xi) = (1^2)(20^2) + ((-1)^2)(14^2) = 596, since X1 and X2 are independent

P (Ld > 0) = 1- P (Ld < 0)

= 1- pnorm (q=0, mean=10, sd=sqrt (596)) = 0.6589562
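The same probability can be reached by standardizing the difference: P (Ld > 0) equals the standard normal area below E(Ld)/sd(Ld). The sketch below computes it both ways, assuming the two women are sampled independently; variable names are illustrative.

```r
# Difference in true SBP: African American woman minus Caucasian woman
mean_diff <- 130 - 120
sd_diff <- sqrt(20^2 + 14^2)   # variances add for independent variables

# P(difference > 0) on the original scale
p_higher <- pnorm(0, mean = mean_diff, sd = sd_diff, lower.tail = FALSE)

# Same probability after standardizing: P(Z < mean_diff / sd_diff)
p_higher_z <- pnorm(mean_diff / sd_diff)

print(c(p_higher, p_higher_z))
```

Although the group means differ by 10 mm Hg, the large combined standard deviation (about 24.4) keeps the probability only moderately above one half.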
