
Chapter 1 Solutions

1.1. Most students will prefer to work in seconds, to avoid having to work with decimals or
fractions.

1.2. Who? The individuals in the data set are students in a statistics class. What? There are
eight variables: ID (a label, with no units); Exam1, Exam2, Homework, Final, and Project
(in units of points, scaled from 0 to 100); TotalPoints (in points, computed from the other
scores, on a scale of 0 to 900); and Grade (A, B, C, D, and E). Why? The primary purpose
of the data is to assign grades to the students in this class, and (presumably) the variables
are appropriate for this purpose. (The data might also be useful for other purposes.)

1.3. Exam1 = 79, Exam2 = 88, Final = 88.

1.4. For this student, TotalPoints = 2 · 86 + 2 · 82 + 3 · 77 + 2 · 90 + 80 = 827, so the grade is B.
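
As a check on this sort of weighted-total arithmetic, here is a minimal Python sketch; the weights (2, 2, 3, 2, 1) are read off the formula above, and the letter-grade cutoffs, which the exercise does not state, are omitted.

    # Weighted total of the five component scores, using the weights
    # from the formula above (assumed from this solution, not stated in the exercise).
    scores  = {"Exam1": 86, "Exam2": 82, "Homework": 77, "Final": 90, "Project": 80}
    weights = {"Exam1": 2,  "Exam2": 2,  "Homework": 3,  "Final": 2,  "Project": 1}
    total_points = sum(weights[k] * scores[k] for k in scores)
    print(total_points)  # 827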

1.5. The cases are apartments. There are five variables: rent (quantitative), cable (categorical),
pets (categorical), bedrooms (quantitative), distance to campus (quantitative).

1.6. (a) To find injuries per worker, divide the rates in Example 1.6 by 100,000 (or, redo the
computations without multiplying by 100,000). For wage and salary workers, there are
0.000034 fatal injuries per worker. For self-employed workers, there are 0.000099 fatal
injuries per worker. (b) These rates are 1/10 the size of those in Example 1.6, or 10,000
times larger than those in part (a): 0.34 fatal injuries per 10,000 wage/salary workers, and
0.99 fatal injuries per 10,000 self-employed workers. (c) The rates in Example 1.6 would
probably be more easily understood by most people, because numbers like 3.4 and 9.9 feel
more “familiar.” (It might be even better to give rates per million workers: 34 and 99.)

1.7. Shown are two possible stemplots; the first uses split stems (described on page 11 of the text). The scores are slightly left-skewed; most range from 70 to the low 90s.

Split stems:
    5 | 58
    6 | 0
    6 | 58
    7 | 0023
    7 | 5558
    8 | 00003
    8 | 5557
    9 | 0002233
    9 | 8

Single stems:
    5 | 58
    6 | 058
    7 | 00235558
    8 | 000035557
    9 | 00022338

1.8. Preferences will vary. However, the stemplot in Figure 1.8 shows a bit more detail, which
is useful for comparing the two distributions.

1.9. (a) The stemplot of the altered data is shown below. (b) Blank stems should always be retained (except at the beginning or end of the stemplot), because the gap in the distribution is an important piece of information about the data.

    1 | 6
    2 |
    2 | 5568
    3 | 34
    3 | 55678
    4 | 012233
    4 | 8
    5 | 1

1.10. Student preferences will vary. The stemplot has the advantage of showing each individual score. Note that this histogram has the same shape as the second stemplot in the solution to Exercise 1.7.

[Figure: histogram of first-exam scores (frequency versus score, classes of width 10 from 50 to 100).]

1.11. Student preferences may vary, but the larger classes in this histogram hide a lot of detail.

[Figure: histogram of first-exam scores with classes of width 20, from 40 to 100.]

1.12. This histogram shows more details about the distribution (perhaps more detail than is useful). Note that it has the same shape as the first stemplot in the solution to Exercise 1.7.

[Figure: histogram of first-exam scores with classes of width 5, from 55 to 100.]

1.13. Using either a stemplot or histogram, we see that the distribution is left-skewed, centered
near 80, and spread from 55 to 98. (Of course, a histogram would not show the exact values
of the maximum and minimum.)

1.14. (a) The cases are the individual employees. (b) The first four (employee identification
number, last name, first name, and middle initial) are labels. Department and education level
are categorical variables; number of years with the company, salary, and age are quantitative
variables. (c) Column headings in student spreadsheets will vary, as will sample cases.

1.15. A Web search for “city rankings” or “best cities” will yield lots of ideas, such as crime
rates, income, cost of living, entertainment and cultural activities, taxes, climate, and school
system quality. (Students should be encouraged to think carefully about how some of these
might be quantitatively measured.)

1.16. Recall that categorical variables place individuals into groups or categories, while
quantitative variables “take numerical values for which arithmetic operations. . . make sense.”
Variables (a), (d), and (e)—age, amount spent on food, and height—are quantitative. The
answers to the other three questions—about dancing, musical instruments, and broccoli—are
categorical variables.

1.18. Student answers will vary. A Web search for “college ranking methodology” gives
some ideas; in recent years, U.S. News and World Report used “16 measures of academic
excellence,” including academic reputation (measured by surveying college and university
administrators), retention rate, graduation rate, class sizes, faculty salaries, student-faculty
ratio, percentage of faculty with highest degree in their fields, quality of entering students
(ACT/SAT scores, high school class rank, enrollment-to-admission ratio), financial resources,
and the percentage of alumni who give to the school.

1.19. For example, blue is by far the most popular choice; 70% of respondents chose 3 of the 10 options (blue, green, and purple).

[Figure: bar graph of percent of respondents versus favorite color (blue, green, purple, red, black, orange, yellow, brown, gray, white).]

1.20. For example, opinions about least-favorite color are somewhat more varied than favorite colors. Interestingly, purple is liked and disliked by about the same fractions of people.

[Figure: bar graph of percent of respondents versus least-favorite color.]

1.21. (a) There were 232 total respondents. The table below gives the percents; for example, 10/232 ≈ 4.31%. (b) The bar graph is shown below. (c) For example, 87.5% of the group were between 19 and 50. (d) The age-group classes do not have equal width: The first is 18 years wide, the second is 6 years wide, the third is 11 years wide, etc.
Note: In order to produce a histogram from the given data, the bar for the first age group would have to be three times as wide as the second bar, the third bar would have to be wider than the second bar by a factor of 11/6, etc. Additionally, if we change a bar’s width by a factor of x, we would need to change that bar’s height by a factor of 1/x.

    Age group (years)   Percent
    1 to 18               4.31%
    19 to 24             41.81%
    25 to 35             30.17%
    36 to 50             15.52%
    51 to 69              6.03%
    70 and over           2.16%

[Figure: bar graph of percent versus age group.]

1.22. (a) & (b) The bar graph and pie chart are shown below. (c) A clear majority (76%) agree or strongly agree that they browse more with the iPhone than with their previous phone. (d) Student preferences will vary. Some might prefer the pie chart because it is more familiar.

[Figure: bar graph and pie chart of response percents for the categories Strongly agree, Mildly agree, Mildly disagree, Strongly disagree.]

1.23. Ordering bars by decreasing height shows the models most affected by iPhone sales. However, because “other phone” and “replaced nothing” are different than the other categories, it makes sense to place those two bars last (in any order).

[Figure: bar graph of replacement percent versus previous phone model.]

1.24. (a) The weights add to 254.2 million tons, and the percents add to 99.9.
(b) & (c) The bar graph and pie chart are shown below.

[Figure: bar graph and pie chart of percent of total waste by material (paper/paperboard, yard trimmings, food scraps, plastics, metals, rubber/leather/textiles, wood, glass, other).]

1.25. (a) & (b) Both bar graphs are shown below. (c) The ordered bars in the graph from (b)
make it easier to identify those materials that are frequently recycled and those that are not.
(d) Each percent represents part of a different whole. (For example, 2.6% of food scraps are
recycled; 23.7% of glass is recycled, etc.)

[Figure: two bar graphs of percent recycled by material—one with the bars in the original order, one with the bars in decreasing order.]

1.26. (a) The bar graph is shown below. (b) The graph clearly illustrates the dominance of Google; its bar dwarfs those of the other search engines.

[Figure: bar graph of market share (%) for Google, Yahoo, MSN, AOL, Microsoft Live, Ask, and Other.]

1.27. The two bar graphs are shown below.

[Figure: two bar graphs of percent of all spam by type—one in alphabetical order (Adult, Financial, Health, Leisure, Products, Scams) and one in decreasing order (Products, Financial, Adult, Scams, Leisure, Health).]

1.28. (a) The bar graph is below. (b) The number of Facebook users trails off rapidly after the
top seven or so. (Of course, this is due in part to the variation in the populations of these
countries. For example, that Norway has nearly half as many Facebook users as France is
remarkable, because the 2008 populations of France and Norway were about 62.3 million
and 4.8 million, respectively.)
[Figure: bar graph of Facebook users (millions) by country.]

1.29. (a) Most countries had moderate (single- or double-digit) increases in Facebook usage. Chile (2197%) is an extreme outlier, as are (maybe) Venezuela (683%) and Colombia (246%). (b) In the stemplot below, Chile and Venezuela have been omitted, and stems are split five ways. (c) One observation is that, even without the outliers, the distribution is right-skewed. (d) The stemplot can show some of the detail of the low part of the distribution, if the outliers are omitted.

    0 | 000
    0 | 2333
    0 | 4444
    0 | 6
    0 | 99
    1 |
    1 | 33
    1 |
    1 |
    1 |
    2 |
    2 |
    2 | 4
1.30. (a) The given percentages refer to nine distinct groups (all M.B.A. degrees, all M.Ed. degrees, and so on) rather than one single group. (b) Bar graph shown below. Bars are ordered by height, as suggested by the text; students may forget to do this or might arrange in the opposite order (smallest to largest).

[Figure: bar graph of percent of degrees earned by women for the nine graduate-degree types (M.B.A., M.Ed., Theology, Ed.D., Law, M.D., Other M.S., Other M.A., Other Ph.D.).]

1.31. (a) The luxury car bar graph is shown first below; bars are in decreasing order of size (the order given in the table). (b) The intermediate car bar graph is shown second. For this stand-alone graph, it seemed appropriate to re-order the bars by decreasing size. Students may leave the bars in the order given in the table; this (admittedly) might make comparison of the two graphs simpler. (c) The third graph is one possible choice for comparing the two types of cars: for each color, we have one bar for each car type.

[Figures: three bar graphs of percent versus color (the colors include white, silver, gray, blue, black, red, white pearl, yellow/gold, and other)—one for luxury cars, one for intermediate cars, and one showing both types side by side.]

1.32. This distribution is skewed to the right, meaning that Shakespeare’s plays contain many
short words (up to six letters) and fewer very long words. We would probably expect most
authors to have skewed distributions, although the exact shape and spread will vary.
1.33. Shown below is the stemplot; as the text suggests, we have trimmed numbers (dropped the last digit) and split stems. 359 mg/dl appears to be an outlier. Overall, glucose levels are not under control: Only 4 of the 18 had levels in the desired range.

    0 | 799
    1 | 0134444
    1 | 5577
    2 | 0
    2 | 57
    3 |
    3 | 5

1.34. The back-to-back stemplot below suggests that the individual-instruction group was more consistent (their numbers have less spread) but not more successful (only two had numbers in the desired range).

    Individual |   | Class
               | 0 | 799
            22 | 1 | 0134444
      99866655 | 1 | 5577
         22222 | 2 | 0
             8 | 2 | 57
               | 3 |
               | 3 | 5

1.35. The distribution is roughly symmetric, centered near 7 (or “between 6 and 7”), and
spread from 2 to 13.

1.36. (a) Total emissions would almost certainly be higher for very large countries; for example, we would expect that even with great attempts to control emissions, China (with over 1 billion people) would have higher total emissions than the smallest countries in the data set. (b) A stemplot is shown below; a histogram would also be appropriate. We see a strong right skew with a peak from 0 to 2 metric tons per person and a smaller peak from 8 to 10. The three highest countries (the United States, Canada, and Australia) appear to be outliers; apart from those countries, the distribution is spread from 0 to 11 metric tons per person.

    0 | 00000000000000011111
    0 | 222233333
    0 | 445
    0 | 6677
    0 | 888999
    1 | 001
    1 |
    1 |
    1 | 67
    1 | 9

1.37. To display the distribution, use either a stemplot or a histogram. DT scores are skewed to the right, centered near 5 or 6, spread from 0 to 18. There are no outliers. We might also note that only 11 of these 264 women (about 4%) scored 15 or higher.

    0 | 000000000000000000000000000000000000011111111111111111111
    0 | 2222222222222222233333333333333333333333
    0 | 444444444444444444445555555555555555555
    0 | 666666666666666666667777777777777
    0 | 888888888888888999999999999999999
    1 | 000000000000111111111
    1 | 22222222222233333333333
    1 | 444444455
    1 | 66666777
    1 | 8

1.38. (a) The first histogram shows two modes: 5–5.2 and 5.6–5.8. (b) The second histogram has peaks in locations close to those of the first, but these peaks are much less pronounced, so they would usually not be viewed as distinct modes. (c) The results will vary with the software used.

[Figure: two histograms of rainwater pH with different class boundaries (one from 4.2 to 7, one from 4.14 to 6.94).]

1.39. Graph (a) is studying time (Question 4); it is reasonable to expect this to be right-skewed
(many students study little or not at all; a few study longer).
Graph (d) is the histogram of student heights (Question 3): One would expect a fair
amount of variation but no particular skewness to such a distribution.
The other two graphs are (b) handedness and (c) gender—unless this was a particularly
unusual class! We would expect that right-handed students should outnumber lefties
substantially. (Roughly 10 to 15% of the population as a whole is left-handed.)

1.40. Sketches will vary. The distribution of coin years would be left-skewed because newer
coins are more common than older coins.

1.41. (a) Not only are most responses multiples of 10; many are multiples of 30 and 60. Most people will “round” their answers when asked to give an estimate like this; in fact, the most striking answers are ones such as 115, 170, or 230. The students who claimed 360 minutes (6 hours) and 300 minutes (5 hours) may have been exaggerating. (Some students might also “consider suspicious” the student who claimed to study 0 minutes per night. As a teacher, I can easily believe that such students exist, and I suspect that some of your students might easily accept that claim as well.) (b) The stemplots suggest that women (claim to) study more than men. The approximate centers are 175 minutes for women and 120 minutes for men.

              Women |   | Men
                    | 0 | 033334
                 96 | 0 | 66679999
           22222221 | 1 | 2222222
    888888888875555 | 1 | 558
               4440 | 2 | 00344
                    | 2 |
                    | 3 | 0
                  6 | 3 |

1.42. The stemplot gives more information than a histogram (since all the original numbers can be read off the stemplot), but both give the same impression. The distribution is roughly symmetric with one value (4.88) that is somewhat low. The center of the distribution is between 5.4 and 5.5 (the median is 5.46, the mean is 5.448); if asked to give a single estimate for the “true” density of the earth, something in that range would be the best answer.

    48 | 8
    49 |
    50 | 7
    51 | 0
    52 | 6799
    53 | 04469
    54 | 2467
    55 | 03578
    56 | 12358
    57 | 59
    58 | 5

1.43. (a) There are four variables: GPA, IQ, and self-concept are quantitative, while gender is categorical. (OBS is not a variable, since it is not really a “characteristic” of a student.) (b) Below. (c) The distribution is skewed to the left, with center (median) around 7.8. GPAs are spread from 0.5 to 10.8, with only 15 below 6. (d) There is more variability among the boys; in fact, there seems to be a subset of boys with GPAs from 0.5 to 4.9. Ignoring that group, the two distributions have similar shapes.

     0 | 5
     1 | 8
     2 | 4
     3 | 4689
     4 | 0679
     5 | 1259
     6 | 0112249
     7 | 22333556666666788899
     8 | 0000222223347899
     9 | 002223344556668
    10 | 01678

      Female |    | Male
             |  0 | 5
             |  1 | 8
             |  2 | 4
           4 |  3 | 689
           7 |  4 | 069
         952 |  5 | 1
        4210 |  6 | 129
    98866533 |  7 | 223566666789
      997320 |  8 | 0002222348
       65300 |  9 | 2223445668
         710 | 10 | 68

1.44. Stemplot below, with split stems. The distribution is fairly symmetric—perhaps slightly left-skewed—with center around 110 (clearly above 100). IQs range from the low 70s to the high 130s, with a “gap” in the low 80s.

     7 | 24
     7 | 79
     8 |
     8 | 69
     9 | 0133
     9 | 6778
    10 | 0022333344
    10 | 555666777789
    11 | 0000111122223334444
    11 | 55688999
    12 | 003344
    12 | 677888
    13 | 02
    13 | 6

1.45. Stemplot below, with split stems. The distribution is skewed to the left, with center around 59.5. Most self-concept scores are between 35 and 73, with a few below that, and one high score of 80 (but not really high enough to be an outlier).

    2 | 01
    2 | 8
    3 | 0
    3 | 5679
    4 | 02344
    4 | 6799
    5 | 1111223344444
    5 | 556668899
    6 | 00001233344444
    6 | 55666677777899
    7 | 0000111223
    7 |
    8 | 0

1.46. The time plot below shows that women’s times decreased quite rapidly from 1972 until the mid-1980s. Since that time, they have been fairly consistent: Almost all times since 1986 are between 141 and 147 minutes.

[Figure: time plot of winning time (minutes) versus year, 1970–2005.]

1.47. The total for the 24 countries was 897 days, so with Suriname, it is 897 + 694 = 1591 days, and the mean is x̄ = 1591/25 = 63.64 days.

1.48. The mean score is x̄ = 821/10 = 82.1.

1.49. To find the ordered list of times, start with the 24 times in Example 1.23, and add 694 to the end of the list. The ordered times are
    4, 11, 14, 23, 23, 23, 23, 24, 27, 29, 31, 33, 40,
    42, 44, 44, 44, 46, 47, 60, 61, 62, 65, 77, 694
(the median is the 13th value, 40). The outlier increases the median from 36.5 to 40 days, but the change is much less than the outlier’s effect on the mean.
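
A quick way to verify the outlier’s effect is with Python’s statistics module; this sketch uses the times listed above.

    from statistics import mean, median

    times = [4, 11, 14, 23, 23, 23, 23, 24, 27, 29, 31, 33, 40,
             42, 44, 44, 44, 46, 47, 60, 61, 62, 65, 77]

    print(median(times))           # 36.5 (without Suriname)
    print(median(times + [694]))   # 40   (with Suriname)
    # The mean moves much more: 37.375 without vs. 63.64 with.
    print(mean(times), mean(times + [694]))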

1.50. The median of the service times is 103.5 seconds. (This is the average of the 40th and
41st numbers in the sorted list, but for a set of 80 numbers, we assume that most students
will compute the median using software, which does not require that the data be sorted.)

1.51. In order, the scores are:
    55, 73, 75, 80, 80, 85, 90, 92, 93, 98
The middle two scores are 80 and 85, so the median is M = (80 + 85)/2 = 82.5.

1.52. See the ordered list given in the previous solution. The first quartile is Q1 = 75, the median of the first five numbers: 55, 73, 75, 80, 80. Similarly, Q3 = 92, the median of the last five numbers: 85, 90, 92, 93, 98.

1.53. The maximum and minimum can be found by inspecting the list. The sorted list is
1 2 2 3 4 9 9 9 11 19
19 25 30 35 40 44 48 51 52 54
55 56 57 59 64 67 68 73 73 75
75 76 76 77 80 88 89 90 102 103
104 106 115 116 118 121 126 128 137 138
140 141 143 148 148 157 178 179 182 199
201 203 211 225 274 277 289 290 325 367
372 386 438 465 479 700 700 951 1148 2631
This confirms the five-number summary (1, 54.5, 103.5, 200, and 2631 seconds)
given in Example 1.26. The sum of the 80 numbers is 15,726 seconds, so the mean is x̄ = 15,726/80 = 196.575 seconds (the value 197 in the text was rounded).
Note: The most tedious part of this process is sorting the numbers and adding them
all up. Unless you really want to confirm that your students can sort a list of 80 numbers,
consider giving the students the sorted list of times, and checking their ability to identify the
locations of the quartiles.

1.54. The median and quartiles were found earlier; the minimum and maximum are easy to locate in the ordered list of scores (see the solutions to Exercises 1.51 and 1.52), so the five-number summary is Min = 55, Q1 = 75, M = 82.5, Q3 = 92, Max = 98.

1.55. Use the five-number summary from the solution to Exercise 1.54:
    Min = 55, Q1 = 75, M = 82.5, Q3 = 92, Max = 98

[Figure: boxplot of scores on the first exam.]

1.56. The interquartile range is IQR = Q3 − Q1 = 92 − 75 = 17, so the 1.5 × IQR rule would consider as outliers scores outside the range Q1 − 25.5 = 49.5 to Q3 + 25.5 = 117.5. According to this rule, there are no outliers.
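
The quartile convention used throughout these solutions (the median of each half, excluding the overall median) is easy to code directly; here is a sketch. Note that numpy’s default percentile rule would give slightly different quartiles.

    from statistics import median

    scores = sorted([55, 73, 75, 80, 80, 85, 90, 92, 93, 98])
    n = len(scores)
    q1 = median(scores[: n // 2])        # median of the lower half: 75
    q3 = median(scores[(n + 1) // 2:])   # median of the upper half: 92
    iqr = q3 - q1                        # 17
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr         # 49.5 and 117.5
    print([x for x in scores if x < low or x > high])  # [] -- no outliers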

1.57. The variance can be computed from the formula s² = (1/(n − 1)) Σ(xᵢ − x̄)²; for example, the first term in the sum would be (80 − 82.1)² = 4.41. However, in practice, software or a calculator is the preferred approach; this yields s² = 1416.9/9 = 157.43 and s = √s² ≈ 12.5472.
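
The same computation in Python, as a sketch showing the definitional sum of squares next to the library calls:

    from statistics import mean, stdev, variance

    scores = [55, 73, 75, 80, 80, 85, 90, 92, 93, 98]
    xbar = mean(scores)                              # 82.1
    ss = sum((x - xbar) ** 2 for x in scores)        # 1416.9
    print(ss / (len(scores) - 1), variance(scores))  # both 157.43...
    print(stdev(scores))                             # 12.547...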

1.58. In order to have s = 0, all 5 cases must be equal; for example, 1, 1, 1, 1, 1, or 12.5, 12.5, 12.5, 12.5, 12.5. (If any two numbers are different, then xᵢ − x̄ would be nonzero for some i, so the sum of squared differences would be positive, so s² > 0, so s > 0.)

1.59. Without Suriname, the quartiles are 23 and 46.5 days; with Suriname included, they are
23 and 53.5 days. Therefore, the IQR increases from 23.5 to 30.5 days—a much less drastic
change than the change in s (18.6 to 132.6 days).

1.60. Divide total score by 4: 950/4 = 237.5 points.

1.61. (a) Use a stemplot or histogram; a stemplot is shown below. (b) Because the distribution is skewed, the five-number summary is the best choice; in millions of dollars, it is
    Min    Q1     M       Q3      Max
    3338   4589   7558.5  13,416  66,667
Some students might choose the less-appropriate summary: x̄ ≈ 12,144 and s ≈ 12,421 million dollars. (c) For example, the distribution is sharply right-skewed. (This is not surprising given that we are looking at the top 100 companies; the top fraction of most distributions will tend to be skewed to the right.)

    0 | 333333333333333333444444444444
    0 | 55555555566666677777777778888889
    1 | 00001112223333333
    1 | 79
    2 | 01111233
    2 | 559
    3 | 114
    3 | 5
    4 |
    4 |
    5 | 3
    5 | 99
    6 |
    6 | 6

1.62. (a) Either a stemplot or histogram can be used to display the distribution. Two stemplots are shown below: one with all points, and one with the outlier mentioned in part (b) excluded. In the table are the mean and standard deviation, as well as the five-number summary, both with and without the outlier (all values are percents). The latter is preferable because of the outlier; in particular, note the outlier’s effect on the standard deviation. (See also the solution to the next exercise.) (b) O’Doul’s is marketed as “non-alcoholic” beer.

                   x̄       s       Min  Q1    M    Q3  Max
    All points     4.7593  0.7523  0.4  4.30  4.7  5   6.5
    No O’Doul’s    4.8106  0.5864  3.8  4.35  4.7  5   6.5

Note: In federal regulations, part of the definition of beer is that it has at least 0.5% alcohol. By that standard, O’Doul’s is a low-alcohol beverage, but it is not beer.

All points:
    0 | 4
    0 |
    1 |
    1 |
    2 |
    2 |
    3 |
    3 | 88
    4 | 11111122222222223334444
    4 | 555555666667777777778889999999999
    5 | 000000011224
    5 | 5666688999999
    6 | 1
    6 | 5

Without O’Doul’s:
    3 | 88
    4 | 111111
    4 | 2222222222333
    4 | 4444555555
    4 | 66666777777777
    4 | 8889999999999
    5 | 000000011
    5 | 22
    5 | 45
    5 | 6666
    5 | 88999999
    6 | 1
    6 |
    6 | 5

1.63. All of these numbers are given in the table in the solution to the previous exercise. (a) x̄ changes from 4.76% (with) to 4.81% (without); the median (4.7%) does not change. (b) s changes from 0.7523% to 0.5864%; Q1 changes from 4.3% to 4.35%, while Q3 = 5% does not change. (c) A low outlier decreases x̄; any kind of outlier increases s. Outliers have little or no effect on the median and quartiles.

1.64. (a) A stemplot (shown below) or histogram can be used to display the distribution. Students may report either mean/standard deviation or the five-number summary (in units of calories):
    x̄       s      Min  Q1   M      Q3   Max
    141.06  27.79  70   113  145.5  157  210
(b) O’Doul’s has the fewest calories (70) of these 86 beers. (c) Nearly all the beers with fewer than 120 calories are marketed as light beers (and most have “light” in their names). Of the other beers, only one (Weinhard’s Amber Light) is called “light.”

     7 | 0
     8 |
     9 | 4556889
    10 | 2458
    11 | 00000000334
    12 | 08
    13 | 0235558
    14 | 22333444555666788899
    15 | 0012233356777
    16 | 00012336669
    17 | 01459
    18 | 8
    19 | 5
    20 | 00
    21 | 0

Note: If we apply the 1.5 × IQR rule to all 86 beers, O’Doul’s does not qualify as an outlier (the cutoff is 47). However, if we restrict our attention to the light beers (fewer than 120 calories), any beer below 80 calories is an outlier.

1.65. Use a small data set with an odd number of points, so that the median is the middle number. After deleting the lowest observation, the median will be the average of that middle number and the next number after it; if that latter number is much larger, the median will change substantially. For example, start with 0, 1, 2, 998, 1000; after removing 0, the median changes from 2 to 500.

1.66. Salary distributions (especially in professional sports) tend to be skewed to the right. This
skew makes the mean higher than the median.

1.67. (a) The distribution is left-skewed. While the skew makes the five-number summary preferable, some students might give the mean/standard deviation. In ounces, these statistics are:
    x̄      s      Min  Q1    M    Q3    Max
    6.456  1.425  3.7  4.95  6.7  7.85  8.2
(b) The numerical summary does not reveal the two weight clusters (visible in the stemplot below). (c) For small potatoes (less than 6 oz), n = 8, x̄ = 4.662 oz, and s = 0.501 oz. For large potatoes, n = 17, x̄ = 7.300 oz, and s = 0.755 oz. Because there are clearly two groups, it seems appropriate to treat them separately.

    3 | 7
    4 | 3
    4 | 7777
    5 | 23
    5 |
    6 | 0033
    6 | 7
    7 | 03
    7 | 668899999
    8 | 2

1.68. (a) The five-number summary is Min = 2.2 cm, Q1 = 10.95 cm, M = 28.5 cm, Q3 = 41.9 cm, Max = 69.3 cm. (b) & (c) The boxplot and histogram are shown below. (Students might choose different interval widths for the histogram.) (d) Preferences will vary. Both plots reveal the right-skew of this distribution, but the boxplot does not show the two peaks visible in the histogram.

[Figure: boxplot and histogram of diameter at breast height (cm).]

1.69. (a) The five-number summary is Min = 0 mg/l, Q1 = 0 mg/l, M = 5.085 mg/l, Q3 = 9.47 mg/l, Max = 73.2 mg/l. (b) & (c) The boxplot and histogram are shown below. (Students might choose different interval widths for the histogram.) (d) Preferences will vary. Both plots reveal the sharp right-skew of this distribution, but because Min = Q1, the boxplot looks somewhat strange. The histogram seems to convey the distribution better.

[Figure: boxplot and histogram of CRP (mg/l).]

1.70. Answers depend on whether natural (base-e) or common (base-10) logarithms are used. Both sets of answers are shown here. If this exercise is assigned, it would probably be best for the sanity of both instructor and students to specify which logarithm to use. (a) The five-number summary is:
    Logarithm  Min  Q1  M       Q3      Max
    Natural    0    0   1.8048  2.3485  4.3068
    Common     0    0   0.7838  1.0199  1.8704
(The ratio between these answers is roughly ln 10 ≈ 2.3.)
(b) & (c) The boxplots and histograms are shown below. (Students might choose different interval widths for the histograms.) (d) As for Exercise 1.69, preferences will vary.

[Figure: boxplots and histograms of log(1 + CRP), for both natural and base-10 logarithms.]

1.71. (a) The five-number summary (in units of µmol/l) is Min = 0.24, Q1 = 0.355, M = 0.76, Q3 = 1.03, Max = 1.9. (b) & (c) The boxplot and histogram are shown below. (Students might choose different interval widths for the histogram.) (d) The distribution is right-skewed. A histogram (or stemplot) is preferable because it reveals an important feature not evident from a boxplot: This distribution has two peaks.

[Figure: boxplot and histogram of retinol level (µmol/l).]

1.72. The mean and standard deviation for these ratings are x̄ = 5.9 and s ≈ 3.7719; the five-number summary is Min = Q1 = 1, M = 6.5, Q3 = Max = 10. For a graphical presentation, a stemplot (or histogram) is better than a boxplot because the latter obscures details about the distribution. (With a little thought, one might realize that Min = Q1 = 1 and Q3 = Max = 10 means that there are lots of 1’s and lots of 10’s, but this is much more evident in a stemplot or histogram.)

     1 | 0000000000000000
     2 | 0000
     3 | 0
     4 | 0
     5 | 00000
     6 | 000
     7 | 0
     8 | 000000
     9 | 00000
    10 | 000000000000000000

1.73. The distribution of household net worth would almost surely be strongly skewed to the
right: Most families would generally have accumulated little or modest wealth, but a few
would have become rich. This strong skew pulls the mean to be higher than the median.

1.74. See also the solution to Exercise 1.36, where the stemplot is shown. (a) The five-number summary (in units of metric tons per person) is
    Min = 0, Q1 = 0.75, M = 3.2, Q3 = 7.8, Max = 19.9
The evidence for the skew is in the large gaps between the higher numbers; that is, the differences Q3 − M and Max − Q3 are large compared to Q1 − Min and M − Q1. (b) The IQR is Q3 − Q1 = 7.05, so outliers would be less than −9.825 or greater than 18.375. According to this rule, only the United States qualifies as an outlier, but Canada and Australia seem high enough to also include them.

1.75. The total salary is $690,000, so the mean is x̄ = $690,000/9 ≈ $76,667. Six of the nine employees earn less than the mean. The median is M = $35,000.

1.76. If three individuals earn $0, $0, and $20,000, the reported median is $20,000. If the two
individuals with no income take jobs at $14,000 each, the median decreases to $14,000.
The same thing can happen to the mean: In this example, the mean drops from $20,000 to
$16,000.

1.77. The total salary is now $825,000, so the new mean is x̄ = $825,000/9 ≈ $91,667. The median is unchanged.

1.78. Details in the table below; the last row gives column totals.
    x̄ = 11,200/7 = 1600
    s² = 214,870/6 ≈ 35,811.67
    s = √35,811.67 ≈ 189.24

    xᵢ       xᵢ − x̄   (xᵢ − x̄)²
    1792      192      36,864
    1666       66       4,356
    1362     −238      56,644
    1614       14         196
    1460     −140      19,600
    1867      267      71,289
    1439     −161      25,921
    11,200      0     214,870

1.79. The quote describes a distribution with a strong right skew: Lots of years with no losses
to hurricane ($0), but very high numbers when they do occur. For example, if there is one
hurricane in a 10-year period causing $1 million in damages, the “average annual loss” for
that period would be $100,000, but that does not adequately represent the cost for the year
of the hurricane. Means are not the appropriate measure of center for skewed distributions.

1.80. (a) x̄ and s are appropriate for symmetric distributions with no outliers. (b) Both high numbers are flagged as outliers. For women, IQR = 60, so the upper 1.5 × IQR limit is 300 minutes. For men, IQR = 90, so the upper 1.5 × IQR limit is 285 minutes. The table below shows the effect of removing these outliers.

              Women          Men
              x̄      s       x̄      s
    Before    165.2  56.5    117.2  74.2
    After     158.4  43.7    110.9  66.9

1.81. (a) & (b) See the table below. In both cases, the mean and median are quite similar.

               x̄       s       M
    pH         5.4256  0.5379  5.44
    Density    5.4479  0.2209  5.46

1.82. See also the solution to Exercise 1.43. (a) The mean of this distribution appears to be higher than 100. (There is no substantial difference between the standard deviations.) (b) The mean and median are quite similar; the mean is slightly smaller due to the slight left skew of the data. (c) In addition to the mean and median, the standard deviation is shown for reference (the exercise did not ask for it).

           x̄      s      M
    IQ     108.9  13.17  110
    GPA    7.447  (2.1)  7.829

Note: Students may be somewhat puzzled by the statement in (b) that the median is “close to the mean” (when they differ by 1.1), followed by (c), where they “differ a bit” (when M − x̄ = 0.382). It may be useful to emphasize that we judge the size of such differences relative to the spread of the distribution. For example, we can note that 1.1/13.17 ≈ 0.08 for (b), and 0.382/2.1 ≈ 0.18 for (c).

1.83. With only two observations, the mean and median are always equal because the median
is halfway between the middle two (in this case, the only two) numbers.

1.84. (a) The mean (green arrow) moves along with the moving point (in fact, it moves in
the same direction as the moving point, at one-third the speed). At the same time, as long
as the moving point remains to the right of the other two, the median (red arrow) points to
the middle point (the rightmost nonmoving point). (b) The mean follows the moving point
as before. When the moving point passes the rightmost fixed point, the median slides along
with it until the moving point passes the leftmost fixed point, then the median stays there.

1.85. (a) There are several different answers, depending on the configuration of the first five
points. Most students will likely assume that the first five points should be distinct (no
repeats), in which case the sixth point must be placed at the median. This is because the
median of 5 (sorted) points is the third, while the median of 6 points is the average of the
third and fourth. If these are to be the same, the third and fourth points of the set of six
must both equal the third point of the set of five.
The diagram below illustrates all of the possibilities; in each case, the arrow shows the location of the median of the initial five points, and the shaded region (or dot) on the line indicates where the sixth point can be placed without changing the median. Notice that there are four cases where the median does not change, regardless of the location of the sixth point. (The points need not be equally spaced; these diagrams were drawn that way for convenience.)

[Figure: diagrams of the possible configurations.]

(b) Regardless of the configuration of the first five points, if the sixth point is added so as to
leave the median unchanged, then in that (sorted) set of six, the third and fourth points must
be equal. One of these two points will be the middle (fourth) point of the (sorted) set of
seven, no matter where the seventh point is placed.
Note: If you have a student who illustrates all possible cases above, then it is likely that
the student either (1) obtained a copy of this solutions manual, (2) should consider a career
in writing solutions manuals, (3) has too much time on his or her hands, or (4) both 2 and
3 (and perhaps 1) are true.

1.86. The five-number summaries (all in millimeters) are:
              Min    Q1     M      Q3      Max
    bihai     46.34  46.71  47.12  48.245  50.26
    red       37.40  38.07  39.16  41.69   43.09
    yellow    34.57  35.45  36.11  36.82   38.13
H. bihai is clearly the tallest variety—the shortest bihai was over 3 mm taller than the tallest red. Red is generally taller than yellow, with a few exceptions. Another noteworthy fact: The red variety is more variable than either of the other varieties.

[Figure: side-by-side boxplots of flower length (mm) for the bihai, red, and yellow varieties.]

1.87. (a) The means and standard deviations (all in millimeters) are:
    Variety   x̄        s
    bihai     47.5975  1.2129
    red       39.7113  1.7988
    yellow    36.1800  0.9753
(b) Stemplots are shown below. Bihai and red appear to be right-skewed (although it is difficult to tell with such small samples). Skewness would make these distributions unsuitable for x̄ and s.

    bihai:
    46 | 3466789
    47 | 114
    48 | 0133
    49 |
    50 | 12

    red:
    37 | 4789
    38 | 0012278
    39 | 167
    40 | 56
    41 | 4699
    42 | 01
    43 | 0

    yellow:
    34 | 56
    35 | 146
    36 | 0015678
    37 | 01
    38 | 1

1.88. (a) The mean is x̄ = 15, and the standard deviation is s ≈ 5.4365. (b) The mean is still 15; the new standard deviation is 3.7417. (c) Using the mean as a substitute for missing data will not change the mean, but it decreases the standard deviation.

1.89. The minimum and maximum are easily determined to be 1 and 12 letters, and the quartiles and median can be found by adding up the bar heights. For example, the first two bars have total height 22.3% (less than 25%), and adding the third bar brings the total to 45%, so Q1 must equal 3 letters. Continuing this way, we find that the five-number summary, in units of letters, is:

    Min = 1, Q1 = 3, M = 4, Q3 = 5, Max = 12

Note that even without the frequency table given in the data file, we could draw the same conclusion by estimating the heights of the bars in the histogram.

1.90. Because the mean is to be 7, the five numbers must add up to 35. Also, the third number
(in order from smallest to largest) must be 10 because that is the median. Beyond that, there
is some freedom in how the numbers are chosen.
Note: It is likely that many students will interpret “positive numbers” as meaning
positive integers only, which leads to eight possible solutions, shown below.
    1 1 10 10 13     1 1 10 11 12     1 2 10 10 12     1 2 10 11 11
    1 3 10 10 11     1 4 10 10 10     2 2 10 10 11     2 3 10 10 10

1.91. The simplest approach is to take (at least) six numbers—say, a, b, c, d, e, f in increasing order. For this set, Q3 = e; we can cause the mean to be larger than e by simply choosing f to be much larger than e. For example, if all numbers are nonnegative, f > 5e would accomplish the goal because then
    x̄ = (a + b + c + d + e + f)/6 > (e + f)/6 > (e + 5e)/6 = e.

1.92. The algebra might be a bit of a stretch for some students:
    (x_1 − x̄) + (x_2 − x̄) + (x_3 − x̄) + · · · + (x_n − x̄)
      = x_1 − x̄ + x_2 − x̄ + x_3 − x̄ + · · · + x_n − x̄      (drop all the parentheses)
      = x_1 + x_2 + x_3 + · · · + x_n − x̄ − x̄ − x̄ − · · · − x̄   (rearrange the terms)
      = x_1 + x_2 + x_3 + · · · + x_n − n · x̄
Next, simply observe that n · x̄ = x_1 + x_2 + x_3 + · · · + x_n.

1.93. (a) One possible answer is 1, 1, 1, 1. (b) 0, 0, 20, 20. (c) For (a), any set of four identical numbers will have s = 0. For (b), the answer is unique; here is a rough description of why. We want to maximize the “spread-out”-ness of the numbers (which is what standard deviation measures), so 0 and 20 seem to be reasonable choices based on that idea. We also want to make each individual squared deviation—(x_1 − x̄)², (x_2 − x̄)², (x_3 − x̄)², and (x_4 − x̄)²—as large as possible. If we choose 0, 20, 20, 20—or 20, 0, 0, 0—we make the first squared deviation 15², but the other three are only 5². Our best choice is two at each extreme, which makes all four squared deviations equal to 10².

1.94. Answers will vary. Typical calculators will carry only about 12 to 15 digits; for example,
a TI-83 fails (gives s = 0) for 14-digit numbers. Excel (at least the version I checked) also
fails for 14-digit numbers, but it gives s = 262,144 rather than 0. The (very old) version of
Minitab used to prepare these answers fails at 20,000,001 (eight digits), giving s = 2.

1.95. The table below reproduces the means and standard deviations from the solution to Exercise 1.87 and shows those values expressed in inches. For each conversion, multiply by 39.37/1000 = 0.03937 (or divide by 25.4—an inch is defined as 25.4 millimeters). For example, for the bihai variety, x̄ = (47.5975 mm)(0.03937 in/mm) = (47.5975 mm) ÷ (25.4 mm/in) ≈ 1.874 in.

               (in mm)            (in inches)
    Variety    x̄        s         x̄      s
    bihai      47.5975  1.2129    1.874  0.04775
    red        39.7113  1.7988    1.563  0.07082
    yellow     36.1800  0.9753    1.424  0.03840

1.96. (a) x̄ = 5.4479 and s = 0.2209. (b) The first measurement corresponds to 5.50 × 62.43 = 343.365 pounds per cubic foot. To find x̄_new and s_new, we similarly multiply by 62.43: x̄_new ≈ 340.11 and s_new ≈ 13.79.
Note: The conversion from cm³ to ft³ is included in the multiplication by 62.43; the step-by-step process of this conversion looks like this:
    (1 g/cm³)(0.001 kg/g)(2.2046 lb/kg)(30.48³ cm³/ft³) = 62.43 lb/ft³
1.97. Convert from kilograms to pounds by multiplying by 2.2: x̄ = (2.42 kg)(2.2 lb/kg) ≈ 5.32 lb and s = (1.18 kg)(2.2 lb/kg) ≈ 2.60 lb.

1.98. Variance is changed by a factor of 2.54² = 6.4516; generally, for a transformation x_new = a + bx, the new variance is b² times the old variance.
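
This rule is easy to confirm numerically; in the sketch below the data values are made up for illustration, and b = 2.2 echoes the kilograms-to-pounds conversion of Exercise 1.97.

    from statistics import mean, stdev

    x = [1.2, 2.42, 3.1, 0.9, 4.4]     # hypothetical values, in kg
    a, b = 0, 2.2                      # x_new = a + b*x (here, kg -> lb)
    y = [a + b * xi for xi in x]

    print(mean(y), a + b * mean(x))    # equal: the mean transforms the same way
    print(stdev(y), abs(b) * stdev(x)) # equal: the SD picks up |b|, the variance b^2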

1.99. There are 80 service times, so to find the 10% trimmed mean, remove the highest and lowest eight values (leaving 64). Remove the highest and lowest 16 values (leaving 48) for the 20% trimmed mean.
The mean and median for the full data set are x̄ = 196.575 and M = 103.5 seconds. The 10% trimmed mean is x̄* ≈ 127.734, and the 20% trimmed mean is x̄** ≈ 111.917 seconds. Because the distribution is right-skewed, removing the extremes lowers the mean.
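
A sketch of the trimming procedure (the 80 service times themselves are not reproduced here):

    from statistics import mean

    def trimmed_mean(data, fraction):
        """Drop the given fraction of observations from each end, then average."""
        k = int(len(data) * fraction)          # e.g., 80 * 0.10 = 8
        return mean(sorted(data)[k: len(data) - k])

    # Usage: trimmed_mean(times, 0.10) and trimmed_mean(times, 0.20)
    # give about 127.7 and 111.9 for the service-time data.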

1.100. After changing the scale from centimeters to inches, the five-number summary values change by the same ratio (that is, they are multiplied by 0.39). The shape of the histogram might change slightly because of the change in class intervals. (a) The five-number summary (in inches) is Min = 0.858, Q1 = 4.2705, M = 11.115, Q3 = 16.341, Max = 27.027. (b) & (c) The boxplot and histogram are shown below. (Students might choose different interval widths for the histogram.) (d) As in Exercise 1.68, the histogram reveals more detail about the shape of the distribution.

[Figure: boxplot and histogram of diameter at breast height (in).]

1.101. Take the mean plus or minus two standard deviations: 572 ± 2(51) = 470 to 674.

1.102. Take the mean plus or minus three standard deviations: 572 ± 3(51) = 419 to 725.

1.103. The z-score is z = (620 − 572)/51 ≈ 0.94.

1.104. The z-score is z = (510 − 572)/51 ≈ −1.22. This is negative because an ISTEP score of 510 is below average; specifically, it is 1.22 standard deviations below the mean.

.
1.105. Using Table A, the proportion below 620 (z = 0.94) 620
is 0.8264 and the proportion at or above is 0.1736; these
0.8264
two proportions add to 1. The graph on the right illus- 0.1736
trates this with a single curve; it conveys essentially the
same idea as the “graphical subtraction” picture shown in
419 470 521 572 623 674 725
Example 1.36.
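
With software, the cumulative proportion comes from the Normal cumulative distribution function rather than Table A; here is a sketch using scipy (the small difference from 0.8264 reflects Table A’s rounding of z to 0.94):

    from scipy.stats import norm

    below = norm.cdf(620, loc=572, scale=51)   # P(X < 620) for N(572, 51)
    print(below, 1 - below)                    # about 0.827 and 0.173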

.
1.106. Using Table A, the proportion below 620 (z = 0.94)
. 620 660
is 0.8264, and the proportion below 660 (z = 1.73) is
0.8264
0.9582. Therefore:
0.9582
area between area left area left
= −
620 and 660 of 660 of 620 419 470 521 572 623 674 725

0.1318 = 0.9582 − 0.8264


The graph on the right illustrates this with a single curve; it conveys essentially the same
idea as the “graphical subtraction” picture shown in Example 1.37.
Solutions 75

1.107. Using Table A, this ISTEP score should correspond to a standard score of z ≈ 0.67 (software gives 0.6745), so the ISTEP score (unstandardized) is 572 + 0.67(51) ≈ 606.2 (software: 606.4).

1.108. Using Table A, x should correspond to a standard score of z ≈ −0.84 (software gives −0.8416), so the ISTEP score (unstandardized) is x = 572 − 0.84(51) ≈ 529.2 (software: 529.1).
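
Software inverts Table A with a percentile (quantile) function; a sketch covering Exercises 1.107 and 1.108:

    from scipy.stats import norm

    print(norm.ppf(0.75, loc=572, scale=51))   # about 606.4 (Exercise 1.107)
    print(norm.ppf(0.20, loc=572, scale=51))   # about 529.1 (Exercise 1.108)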

1.109. Of course, student sketches will not be as neat as printed curves, but they should have roughly the correct shape. (a) It is easiest to draw the curve first, and then mark the scale on the axis. (b) Draw a copy of the first curve, with the peak over 20. (c) The curve has the same shape, but is translated left or right.

[Figure: two Normal curves with the same shape and different centers.]

1.110. (a) As in the previous exercise, draw the curve first, and then mark the scale on the axis. (b) In order to have a standard deviation of 1, the curve should be 1/3 as wide, and three times taller. (c) The curve is centered at the same place (the mean), but its height and width change. Specifically, increasing the standard deviation makes the curve wider and shorter; decreasing the standard deviation makes the curve narrower and taller.

[Figure: two Normal curves with the same mean and different standard deviations.]

1.111. Sketches will vary.

1.112. (a) The table below gives the ranges for women; for example, about 68% of women speak between 7856 and 20,738 words per day. (b) Negative numbers do not make sense for this situation. The 68–95–99.7 rule is reasonable for a distribution that is close to Normal, but by constructing a stemplot or histogram, it is easily confirmed that this distribution is slightly right-skewed. (c) These ranges are also in the table; the men’s distribution is more skewed than the women’s distribution, so the 68–95–99.7 rule is even less appropriate. (d) This does not support the conventional wisdom: The ranges from parts (a) and (c) overlap quite a bit. Additionally, the difference in the means is quite small relative to the large standard deviations.

            Women                 Men
    68%     7856 to 20,738        4995 to 23,125
    95%     1415 to 27,179        −4070 to 32,190
    99.7%   −5026 to 33,620       −13,135 to 41,255

1.113. (a) Ranges are given in the table below. In both cases, some of the lower limits are negative, which does not make sense; this happens because the women’s distribution is skewed, and the men’s distribution has an outlier. Contrary to the conventional wisdom, the men’s mean is slightly higher, although the outlier is at least partly responsible for that. (b) The means suggest that Mexican men and women tend to speak more than people of the same gender from the United States.

            Women                 Men
    68%     8489 to 20,919        7158 to 22,886
    95%     2274 to 27,134        −706 to 30,750
    99.7%   −3941 to 33,349       −8,570 to 38,614

1.114. (a) For example, (68 − 70)/10 = −0.2. The complete list is given below. (b) The cut-off for an A is the 85th percentile for the N(0, 1) distribution. From Table A, this is approximately 1.04; software gives 1.0364. (c) The top two students (with scores of 92 and 98) received A’s.

    Score   z
    68      −0.2
    54      −1.6
    92       2.2
    75       0.5
    73       0.3
    98       2.8
    64      −0.6
    55      −1.5
    80       1
    70       0

1.115. (a) We need the 5th, 15th, 55th, and 85th percentiles for a N(0, 1) distribution. These are given in the table below. (b) To convert to actual scores, take the standard-score cut-off z and compute 10z + 70. (c) Opinions will vary.

         Table A              Software
         Standard  Actual     Standard  Actual
    F    −1.64     53.6       −1.6449   53.55
    D    −1.04     59.6       −1.0364   59.64
    C     0.13     71.3        0.1257   71.26
    B     1.04     80.4        1.0364   80.36

Note: The cut-off for an A given in the previous solution is the lowest score that gets an A—that is, the point where one’s grade drops from an A to a B. These cut-offs are the points where one’s grade jumps up. In practice, this is only an issue for a score that falls exactly on the border between two grades.

1.116. (a) The curve forms a 1 × 1 square, which has area 1. (b) P(X < 0.35) = 0.35. (c) P(0.35 < X < 0.65) = 0.3.

1.117. (a) The height should be 1/4 since the area under the curve must be 1; the density curve is a horizontal line at height 1/4 over the interval 0 to 4. (b) P(X ≤ 1) = 1/4 = 0.25. (c) P(0.5 < X < 2.5) = 0.5.

1.118. The mean and median both equal 0.5; the quartiles are Q 1 = 0.25 and Q 3 = 0.75.

1.119. (a) Mean is C, median is B (the right skew pulls the mean to the right). (b) Mean A,
median A. (c) Mean A, median B (the left skew pulls the mean to the left).
1.120. Hint: It is best to draw the curve first, then place the numbers below it. Students may at first make mistakes like drawing a half-circle instead of the correct “bell-shaped” curve, or being careless about locating the standard deviation.

[Figure: Normal curve with axis marks at 218, 234, 250, 266, 282, 298, 314.]

1.121. (a) The applet shows an area of 0.6826 between −1.000 and 1.000, while the
68–95–99.7 rule rounds this to 0.68. (b) Between −2.000 and 2.000, the applet reports
0.9544 (compared to the rounded 0.95 from the 68–95–99.7 rule). Between −3.000 and
3.000, the applet reports 0.9974 (compared to the rounded 0.997).

1.122. See the sketch of the curve in the solution to Exercise 1.120. (a) The middle 95% fall
within two standard deviations of the mean: 266 ± 2(16), or 234 to 298 days. (b) The
shortest 2.5% of pregnancies are shorter than 234 days (more than two standard deviations
below the mean).

1.123. (a) 99.7% of horse pregnancies fall within three standard deviations of the mean: 336 ± 3(3), or 327 to 345 days. (b) About 16% are longer than 339 days since 339 days or more corresponds to at least one standard deviation above the mean.

[Figure: Normal curve with axis marks at 327, 330, 333, 336, 339, 342, 345.]

Note: This exercise did not ask for a sketch of the Normal curve, but students should be encouraged to make such sketches anyway.

1.124. Because the quartiles of any distribution have 50% of


observations between them, we seek to place the flags so
that the reported area is 0.5. The closest the applet gets
is an area of 0.5034, between −0.680 and 0.680. Thus,
the quartiles of any Normal distribution are about 0.68
standard deviations above and below the mean.
Note: Table A places the quartiles at about ±0.67;
other statistical software gives ±0.6745.

1.125. The mean and standard deviation are x̄ = 5.4256 and s = 0.5379. About 67.62% (71/105 ≈ 0.6762) of the pH measurements are in the range x̄ ± s = 4.89 to 5.96. About 95.24% (100/105) are in the range x̄ ± 2s = 4.35 to 6.50. All (100%) are in the range x̄ ± 3s = 3.81 to 7.04.
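
A sketch of this check for any data set (the 105 pH values themselves are not reproduced here):

    from statistics import mean, stdev

    def empirical_rule_check(data):
        """Proportion of observations within 1, 2, and 3 SDs of the mean."""
        xbar, s = mean(data), stdev(data)
        for k in (1, 2, 3):
            inside = sum(abs(x - xbar) <= k * s for x in data)
            print(k, inside / len(data))   # compare with 0.68, 0.95, 0.997

    # Usage: empirical_rule_check(ph_values) gives about 0.676, 0.952, 1.0 here.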

1.126. Using values from Table A: (a) Z > 1.65: 0.0495. (b) Z < 1.65: 0.9505. (c) Z > −0.76: 0.7764. (d) −0.76 < Z < 1.65: 0.9505 − 0.2236 = 0.7269.

1.127. Using values from Table A: (a) Z ≤ −1.8: 0.0359. (b) Z ≥ −1.8: 0.9641. (c) Z > 1.6: 0.0548. (d) −1.8 < Z < 1.6: 0.9452 − 0.0359 = 0.9093.

1.128. (a) 22% of the observations fall below −0.7722. (This is the 22nd percentile of the standard Normal distribution.) (b) 40% of the observations fall above 0.2533 (the 60th percentile of the standard Normal distribution).

1.129. (a) z = 0.3853 has cumulative proportion 0.65 (that is, 0.3853 is the 65th percentile of the standard Normal distribution). (b) If z = 0.1257, then Z > z has proportion 0.45 (0.1257 is the 55th percentile).

1.130. 70 is two standard deviations below the mean (that is, it has standard score z = −2), so
about 2.5% (half of the outer 5%) of adults would have WAIS scores below 70.

1.131. 130 is two standard deviations above the mean (that is, it has standard score z = 2), so
about 2.5% of adults would score at least 130.

1.132. Tonya’s score standardizes to z = (1820 − 1509)/321 ≈ 0.9688, while Jermaine’s score corresponds to z = (29 − 21.5)/5.4 ≈ 1.3889. Jermaine’s score is higher.

1.133. Jacob’s score standardizes to z = (16 − 21.5)/5.4 ≈ −1.0185, while Emily’s score corresponds to z = (1020 − 1509)/321 ≈ −1.5234. Jacob’s score is higher.

1.134. Jose’s score standardizes to z = (2080 − 1509)/321 ≈ 1.7788, so an equivalent ACT score is 21.5 + 1.7788 × 5.4 ≈ 31.1. (Of course, ACT scores are reported as whole numbers, so this would presumably be a score of 31.)

1.135. Maria’s score standardizes to z = (30 − 21.5)/5.4 ≈ 1.5741, so an equivalent SAT score is 1509 + 1.5741 × 321 ≈ 2014.
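
Both conversions follow the same pattern—standardize on one scale, then unstandardize on the other—which a short helper makes explicit; a sketch:

    def convert(score, from_mean, from_sd, to_mean, to_sd):
        z = (score - from_mean) / from_sd      # standardize on the old scale
        return to_mean + z * to_sd             # unstandardize on the new scale

    print(convert(2080, 1509, 321, 21.5, 5.4))   # about 31.1 (Exercise 1.134)
    print(convert(30, 21.5, 5.4, 1509, 321))     # about 2014 (Exercise 1.135)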

1.136. Maria’s score standardizes to z = (2090 − 1509)/321 ≈ 1.81, for which Table A gives 0.9649. Her score is at the 96.5 percentile.

1.137. Jacob’s score standardizes to z = (19 − 21.5)/5.4 ≈ −0.4630, for which Table A gives 0.3228. His score is at the 32.3 percentile.

1.138. 1920 and above: The top 10% corresponds to a standard score of z = 1.2816, which in turn corresponds to a score of 1509 + 1.2816 × 321 ≈ 1920 on the SAT.

1.139. 1239 and below: The bottom 20% corresponds to a standard score of z = −0.8416, which in turn corresponds to a score of 1509 − 0.8416 × 321 ≈ 1239 on the SAT.

1.140. The quartiles of a Normal distribution are ±0.6745 standard deviations from the mean, so for ACT scores, they are 21.5 ± 0.6745 × 5.4 ≈ 17.9 to 25.1.

1.141. The quintiles of the SAT score distribution are 1509 − 0.8416 × 321 ≈ 1239, 1509 − 0.2533 × 321 ≈ 1428, 1509 + 0.2533 × 321 ≈ 1590, and 1509 + 0.8416 × 321 ≈ 1779.
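
Equivalently, with software the quintiles come straight from the percentile function; a sketch:

    from scipy.stats import norm

    quintiles = [norm.ppf(p, loc=1509, scale=321) for p in (0.2, 0.4, 0.6, 0.8)]
    print([round(q) for q in quintiles])   # [1239, 1428, 1590, 1779]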

1.142. For a Normal distribution with mean 55 mg/dl and standard deviation 15.5 mg/dl: (a) 40 mg/dl standardizes to z = (40 − 55)/15.5 ≈ −0.9677. Using Table A, 16.60% of women fall below this level (software: 16.66%). (b) 60 mg/dl standardizes to z = (60 − 55)/15.5 ≈ 0.3226. Using Table A, 37.45% of women fall above this level (software: 37.35%). (c) Subtract the answers from (a) and (b) from 100%: Table A gives 45.95% (software: 45.99%), so about 46% of women fall in the intermediate range.

1.143. For a Normal distribution with mean 46 mg/dl and standard deviation 13.6 mg/dl: (a) 40 mg/dl standardizes to z = (40 − 46)/13.6 ≈ −0.4412. Using Table A, 33% of men fall below this level (software: 32.95%). (b) 60 mg/dl standardizes to z = (60 − 46)/13.6 ≈ 1.0294. Using Table A, 15.15% of men fall above this level (software: 15.16%). (c) Subtract the answers from (a) and (b) from 100%: Table A gives 51.85% (software: 51.88%), so about 52% of men fall in the intermediate range.

1.144. (a) About 0.6% of healthy young adults have osteoporosis (the cumulative probability
below a standard score of −2.5 is 0.0062). (b) About 31% of this population of older
women has osteoporosis: The BMD level which is 2.5 standard deviations below the young
adult mean would standardize to −0.5 for these older women, and the cumulative probability
for this standard score is 0.3085.

1.145. (a) About 5.2%: x < 240 corresponds to z < −1.625. Table A gives 5.16% for
−1.63 and 5.26% for −1.62. Software (or averaging the two table values) gives 5.21%.
(b) About 54.7%: 240 < x < 270 corresponds to −1.625 < z < 0.25. The area to the
left of 0.25 is 0.5987; subtracting the answer from part (a) leaves about 54.7%. (c) About
279 days or longer: Searching Table A for 0.80 leads to z > 0.84, which corresponds to
x > 266 + 0.84(16) = 279.44. (Using the software value z > 0.8416 gives x > 279.47.)
1.146. (a) The quartiles for a standard Normal distribution are ±0.6745. (b) For a N(µ, σ) distribution, Q1 = µ − 0.6745σ and Q3 = µ + 0.6745σ. (c) For human pregnancies, Q1 = 266 − 0.6745 × 16 ≈ 255.2 and Q3 = 266 + 0.6745 × 16 ≈ 276.8 days.

1.147. (a) As the quartiles for a standard Normal distribution are ±0.6745, we have IQR = 1.3490. (b) c = 1.3490: For a N(µ, σ) distribution, the quartiles are Q1 = µ − 0.6745σ and Q3 = µ + 0.6745σ.

1.148. In the previous two exercises, we found that for a N(µ, σ) distribution,
Q1 = µ − 0.6745σ, Q3 = µ + 0.6745σ, and IQR = 1.3490σ. Therefore,
1.5 × IQR = 2.0235σ, and the suspected outliers are below Q1 − 1.5 × IQR = µ − 2.698σ,
and above Q3 + 1.5 × IQR = µ + 2.698σ. The percentage outside of this range is
2 × 0.0035 = 0.70%.
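
Note: A one-line software check (our addition) of the 0.70% figure:

```python
# For Normal data, points beyond mu +/- 2.698*sigma are suspected outliers.
from scipy.stats import norm

print(2 * norm.cdf(-2.698))  # both tails, by symmetry: about 0.0070, i.e., 0.70%
```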

1.149. (a) The first and last deciles for a standard Normal distribution are ±1.2816. (b) For
a N(9.12, 0.15) distribution, the first and last deciles are µ − 1.2816σ ≈ 8.93 and
µ + 1.2816σ ≈ 9.31 ounces.

1.150. The shape of the quantile plot suggests that the data are right-skewed (as was observed
in Exercises 1.36 and 1.74). This can be seen in the flat section in the lower left—these
numbers were less spread out than they should be for Normal data—and the three apparent
outliers (the United States, Canada, and Australia) that deviate from the line in the upper
right; these were much larger than they would be for a Normal distribution.

1.151. (a) The plot is reasonably linear except for the point in the upper right, so this
distribution is roughly Normal, but with a high outlier. (b) The plot is fairly linear, so
the distribution is roughly Normal. (c) The plot curves up to the right—that is, the large
values of this distribution are larger than they would be in a Normal distribution—so the
distribution is skewed to the right.

1.152. See also the solution to Exercise 1.42. The plot suggests no major deviations from
Normality, although the three lowest measurements do not quite fall in line with the other
points.
[Figure: Normal quantile plot of density versus Normal score.]

1.153. (a) All three quantile plots are below; the yellow variety is the nearest to a straight line.
(b) The other two distributions are slightly right-skewed (the lower-left portion of the graph
is somewhat flat); additionally, the bihai variety appears to have a couple of high outliers.

[Figure: Normal quantile plots of flower length (mm) versus Normal score for H. bihai,
H. caribaea red, and H. caribaea yellow.]

1.154. Shown are a histogram and quantile plot for one sample of 200 simulated N (0, 1)
points. Histograms will vary slightly but should suggest a bell curve. The Normal quantile
plot shows something fairly close to a line but illustrates that, even for actual Normal data,
the tails may deviate slightly from a line.

[Figure: histogram and Normal quantile plot of the 200 simulated Normal values.]

1.155. Shown are a histogram and quantile plot for one sample of 200 simulated uniform data
points. Histograms will vary slightly but should suggest the density curve of Figure 1.34
(but with more variation than students might expect). The Normal quantile plot shows that,
compared to a Normal distribution, the uniform distribution does not extend as low or as
high (not surprising, since all observations are between 0 and 1).

[Figure: histogram and Normal quantile plot of the 200 simulated uniform values.]
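
Note: A sketch (our addition, not in the original) of one way to run the simulations in Exercises 1.154–1.155; the seed is arbitrary, so plots and values vary from run to run.

```python
# Simulate 200 N(0,1) and 200 uniform(0,1) observations, then assess Normality.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)      # arbitrary seed
normal_sample = rng.normal(0, 1, 200)    # Exercise 1.154
uniform_sample = rng.uniform(0, 1, 200)  # Exercise 1.155

# probplot computes the (Normal score, ordered data) pairs that a Normal
# quantile plot displays; the fitted r near 1 indicates approximate Normality.
(_, _), (slope, intercept, r) = stats.probplot(normal_sample, dist="norm")
print(r)
```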

1.156. Shown is a back-to-back stemplot; the distributions could also be compared with
histograms or boxplots. Either mean/standard deviation or the five-number summary could
be used; both are given below. Both the graphical and numerical descriptions reveal that
hatchbacks generally have higher fuel efficiency (and also are more variable).

    Hatchback      Large sedan
             13 | 00
             14 |
             15 | 00
          00 16 | 00000000
             17 | 0000000000
           0 18 | 0000
         000 19 | 00
    00000000 20 |
     0000000 21 |
          00 22 |
          00 23 |
       00000 24 |
         000 25 |
           0 26 |
       00000 27 |
           0 28 |
           0 29 |
           0 30 |

                  x̄       s     Min  Q1   M     Q3  Max
  Hatchback    22.548  3.423   16   20  21.5   25   30
  Large sedan  16.571  1.425   13   16  17.0   17   19

1.157. (a) The distribution appears to be roughly Normal. (b) One could justify using either
the mean and standard deviation or the five-number summary:

    x̄        s      Min   Q1    M      Q3     Max
  15.27%   3.118%  8.2%  13%  15.5%  17.6%  22.8%

(c) For example, binge drinking rates are typically 10% to 20%. Which states are high, and
which are low? One might also note the geographical distribution of states with high
binge-drinking rates: The top six states (Wisconsin, North Dakota, Iowa, Minnesota,
Illinois, and Nebraska) are all adjacent to one another.

   8 | 28
   9 |
  10 | 58
  11 | 34
  12 | 023689
  13 | 015788
  14 | 0077
  15 | 13466889
  16 | 01567
  17 | 45677789
  18 | 8
  19 | 148
  20 | 2
  21 | 6
  22 | 8

1.158. (a) The stemplot below suggests that there are two groups of states: the under-23%
and over-23% groups. Additionally, while they do not qualify as outliers, Oklahoma (16.3%)
and Vermont (30%) stand out as notably low and high. (b) One could justify using either the
mean and standard deviation or the five-number summary:

    x̄        s      Min    Q1     M      Q3    Max
  23.71%   3.517%  16.3%  20.8%  24.3%  26.4%  30%

Neither summary reveals the two groups of states visible in the stemplot. (c) One could
explore the connections (geographical, socioeconomic, etc.) between the states in the two
groups; for example, the top group includes many northeastern states, while the bottom
group includes quite a few southern states.

  16 | 3
  17 |
  18 | 14678
  19 | 4679
  20 | 268
  21 | 346899
  22 | 3488
  23 |
  24 | 12446
  25 | 023468
  26 | 02346
  27 | 0455
  28 | 355679
  29 |
  30 | 0

1.159. Students might compare color preferences using a stacked bar graph like that shown
below, or side-by-side bars. (They could also make six pie charts, but comparing slices
across pies is difficult.) Possible observations: white is considerably less popular in
Europe, and gray is less common in China.
[Figure: stacked bar graph of color preference percents (Silver, White, Gray, Black, Blue,
Red, Brown) for North America, South America, Europe, China, South Korea, Japan, and Other.]
Note: The order of countries and colors is as given in the text, which is more-or-less
arbitrary. (Colors are ordered by decreasing popularity in North America.)

[Figure: side-by-side bar graph of color preference percents by region.]

1.162. Using either a histogram or stemplot, we see that this distribution is sharply
right-skewed. For this reason, the five-number summary is preferred:

  Min  Q1   M     Q3  Max
   0    3  12.5   34   86

Some students might report the less-appropriate x̄ ≈ 21.62 and s ≈ 22.76. From the
histogram and five-number summary, we can observe, for example, that many countries have
fewer than 10 Internet users per 100 people. In 75% of countries, less than 1/3 of the
population uses the Internet.
[Figure: histogram of Internet users per hundred people.]

1.163. The distribution is somewhat right-skewed (although considerably less than the
distribution with all countries) with only one country (Bosnia and Herzegovina) in the 20s.
Because of the irregular shape, students might choose either the mean/standard deviation or
the five-number summary:

    x̄      s     Min    Q1      M       Q3     Max
  39.85  22.05  1.32  18.68  43.185  54.94  85.65

  0 | 145789
  1 | 23488889
  2 | 5
  3 | 0134467
  4 | 124666669
  5 | 022345688
  6 | 223
  7 | 026
  8 | 15

1.164. (a) & (b) The graphs are below. Bars are shown in alphabetical order by city name
(as the data were given in the table). (c) For Baltimore, for example, this rate is
5091/651 ≈ 7.82. The complete table is shown below. (d) & (e) Graphs below. Note that the
text does not specify whether the bars should be ordered by increasing or decreasing rate.
(f) Preferences may vary, but the ordered bars make comparisons easier.

  Baltimore          7.82
  Boston             8.26
  Chicago            4.02
  Long Beach         6.25
  Los Angeles        8.07
  Miami              3.67
  Minneapolis       14.87
  New York           6.23
  Oakland            9.30
  Philadelphia       7.04
  San Francisco      7.61
  Washington, D.C.  13.12

[Figures: bar graphs of population (thousands) and open space (acres) by city, and bar
graphs of acres of open space per 1000 people, both in alphabetical order and ordered by
rate.]

1.165. The given description is true on the average, but the curves (and a few calculations)
give a more complete picture. For example, a score of about 675 is about the 97.5th
percentile for both genders, so the top boys and girls have very similar scores.

1.166. (a) & (b) Answers will vary. Definitions might be as simple as “free time,” or “time
spent doing something other than studying.” For part (b), it might be good to encourage
students to discuss practical difficulties; for example, if we ask Sally to keep a log of her
activities, the time she spends filling it out presumably reduces her available “leisure time.”

1.167. Shown is a stemplot; a histogram should look similar to this. This distribution is
relatively symmetric apart from one high outlier. Because of the outlier, the five-number
summary (in hours) is preferred: 22, 23.735, 24.31, 24.845, 28.55. Alternatively, the mean
and standard deviation are x̄ = 24.339 and s = 0.9239 hours.

  22 | 013
  22 | 7899
  23 | 000011222233344444
  23 | 55566666667777778888888999
  24 | 00000011111112222222223333333333444444
  24 | 555555666666666777777888888999999
  25 | 00001111233344
  25 | 56666889
  26 | 2
  26 | 56
  27 | 2
  27 |
  28 |
  28 | 5

1.168. Gender and automobile preference are categorical; age and household income are
quantitative.

1.169. Either a bar graph or a pie chart could be used. The given numbers sum to 66.7, so
the "Other" category presumably includes the remaining 29.3 million subscribers.
[Figure: bar graph of subscribers (millions) by Internet provider.]

1.170. Women's weights are skewed to the right: This makes the mean higher than the median,
and it is also revealed in the differences M − Q1 = 14.9 lb and Q3 − M = 24.1 lb.

1.171. (a) For car makes (a categorical variable), use either a bar graph or pie chart. For
car age (a quantitative variable), use a histogram, stemplot, or boxplot. (b) Study time is
quantitative, so use a histogram, stemplot, or boxplot. To show change over time, use a time
plot (average hours studied against time). (c) Use a bar graph or pie chart to show radio
station preferences. (d) Use a Normal quantile plot to see whether the measurements follow
a Normal distribution.

1.172. The counts given add to 6067, so the others received 626 spam messages. Either a bar
graph or a pie chart would be appropriate. What students learn from this graph will vary;
one observation might be that AA and BB (and perhaps some others) might need some advice on
how to reduce the amount of spam they receive.
[Figure: bar graph of spam counts by account ID (AA through LL, plus "other").]

1.173. No, and no: It is easy to imagine examples of many different data sets with mean 0 and
standard deviation 1—for example, {−1, 0, 1} and {−2, 0, 0, 0, 0, 0, 0, 0, 2}.
Likewise, for any given five numbers a ≤ b ≤ c ≤ d ≤ e (not all the same), we can
create many data sets with that five-number summary, simply by taking those five numbers
and adding some additional numbers in between them, for example (in increasing order):
10, __, 20, __, __, 30, __, __, 40, __, 50. As long as the number in the first blank is
between 10 and 20, and so on, the five-number summary will be 10, 20, 30, 40, 50.
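
Note: A quick software check (our addition) that both example data sets above have the claimed mean and standard deviation:

```python
# Both sets have mean 0 and sample standard deviation 1.
import numpy as np

for data in ([-1, 0, 1], [-2, 0, 0, 0, 0, 0, 0, 0, 2]):
    a = np.array(data, dtype=float)
    print(a.mean(), a.std(ddof=1))  # prints 0.0 and 1.0 in both cases
```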

1.174. The time plot is shown below; because of the great detail in this plot, it is larger than
other plots. Ruth’s and McGwire’s league-leading years are marked with different symbols.
(a) During World War II (when many baseball players joined the military), the best home
run numbers decline sharply and steadily. (b) Ruth seemed to set a new standard for other
players; after his first league-leading year, he had 10 seasons much higher than anything that
had come before, and home run production has remained near that same level ever since
(even the worst post-Ruth year—1945—had more home runs than the best pre-Ruth season).
While some might argue that McGwire’s numbers also raised the standard, the change is
not nearly as striking, nor did McGwire maintain it for as long as Ruth did. (This is not
necessarily a criticism of McGwire; it instead reflects that in baseball, as in many other
endeavors, rates of improvement tend to decrease over time as we reach the limits of human
ability.)

[Figure: time plot of league-leading home runs per season, 1880–2000, with Ruth's and
McGwire's league-leading years marked with different symbols.]

1.175. Bonds's mean changes from 36.56 to 34.41 home runs (a drop of 2.15), while his
median changes from 35.5 to 34 home runs (a drop of 1.5). This illustrates that outliers
affect the mean more than the median.

  1 | 69
  2 | 4
  2 | 55
  3 | 3344
  3 | 77
  4 | 02
  4 | 5669
  5 |
  5 |
  6 |
  6 |
  7 | 3

1.176. Recall the text's description of the effects of a linear transformation xnew = a + bx: The
mean and standard deviation are each multiplied by b (technically, the standard deviation
is multiplied by |b|, but this problem specifies that b > 0). Additionally, we add a to the
(new) mean, but a does not affect the standard deviation. (a) The desired transformation
is xnew = −40 + 2x; that is, a = −40 and b = 2. (We need b = 2 to double the standard
deviation; as this also doubles the mean, we then subtract 40 to make the new mean 100.)
(b) xnew = −45.4545 + 1.8182x; that is, a = −500/11 ≈ −45.4545 and b = 20/11 ≈ 1.8182.
(This choice of b makes the new standard deviation 20 and the new mean 145 5/11; we then
subtract 45.4545 to make the new mean 100.) (c) David's score—2 · 72 − 40 = 104—is
higher within his class than Nancy's score—1.8182 · 78 − 45.4545 ≈ 96.4—is
within her class. (d) A third-grade score of 75 corresponds to a score of 110 from the
N(100, 20) distribution, which has a standard score of z = (110 − 100)/20 = 0.5. (Alternatively,
z = (75 − 70)/10 = 0.5.) A sixth-grade score of 75 corresponds to about 90.9 on the transformed
scale, which has standard score z = (90.9 − 100)/20 = (75 − 80)/11 ≈ −0.45. Therefore, about 69% of
third graders and 32% of sixth graders score below 75.
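
Note: A sketch (our addition; the helper name rescale is ours) of the computation of a and b: choose b to fix the standard deviation, then a to fix the mean.

```python
# Find (a, b) so that x_new = a + b*x has a desired mean and sd.
def rescale(old_mean, old_sd, new_mean, new_sd):
    b = new_sd / old_sd          # b rescales the standard deviation
    a = new_mean - b * old_mean  # a shifts the (rescaled) mean into place
    return a, b

print(rescale(70, 10, 100, 20))  # third grade: (-40.0, 2.0)
print(rescale(80, 11, 100, 20))  # sixth grade: about (-45.45, 1.818)
```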

1.177. Results will vary. One set of 20 samples gave the results below (Normal quantile
plots are not shown).
Theoretically, x̄ will have a Normal distribution with mean 25 and standard deviation
8/√30 ≈ 1.46, so that about 99.7% of the time, one should find x̄ between 20.6 and 29.4.
Meanwhile, the theoretical distribution of s is nearly Normal (slightly skewed) with
mean ≈ 7.9313 and standard deviation ≈ 1.0458; about 99.7% of the time, s will be between
4.8 and 11.1.

   Means           Standard deviations
  22 | 568          5 | 6
  23 |              6 |
  23 | 89           6 | 66899
  24 | 02           7 | 3
  24 | 89           7 |
  25 | 3            8 | 113
  25 | 6799         8 | 789
  26 | 124          9 | 000
  26 | 59           9 | 556
  27 | 4           10 | 2

Note: If we take a sample of size n from a Normal distribution and compute the sample
standard deviation S, then (S/σ)√(n − 1) has a "chi" distribution with n − 1 degrees of
freedom (which looks like a Normal distribution when n is reasonably large). You can learn
all you would want to know—and more—about this distribution on the Web (for example, at
Wikipedia). One implication of this is that "on the average," s underestimates σ; specifically,
the mean of S is σ · √(2/(n − 1)) · Γ(n/2)/Γ((n − 1)/2). The factor multiplying σ is always
less than 1, but approaches 1 as n approaches infinity. The proof of this fact is left as an
exercise—for the instructor, not for the average student!
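
Note: A sketch (our addition, not in the original) of the simulation itself; the seed is arbitrary, so results will vary.

```python
# Twenty samples of size 30 from N(25, 8), recording each sample's x-bar and s.
import numpy as np

rng = np.random.default_rng(seed=7)  # arbitrary seed
samples = rng.normal(25, 8, size=(20, 30))
means = samples.mean(axis=1)
sds = samples.std(axis=1, ddof=1)    # ddof=1 gives the sample standard deviation
print(means.round(2))
print(sds.round(2))
```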
Chapter 2 Solutions

2.1. The cases are students.

2.2. When students are classified like this, PSQI is being used as a categorical variable,
because each student is categorized by the group he/she falls in.
One advantage is that it might simplify the analysis, or at least it might simplify the
process of describing the results. (Saying that someone fell into the “poor” category is easier
to interpret than saying that person had a PSQI score of 12.) A more subtle issue is that it is
not clear whether finding an average is appropriate for these numbers; technically, averages
are not appropriate for a quantitative measurement unless the variable is measured on an
“interval” scale, meaning (for example) that the difference between PSQI scores of 1 and 2
is the same as the difference between PSQI scores of 10 and 11.

2.3. With this change, the cases are cups of Mocha Frappuccino (as before). The variables
(both quantitative) are size and price.

2.4. One could make the argument that being subjected to stress makes it more difficult to
sleep, so that SUDS (stress level) is explanatory and PSQI (sleep quality) is the response.

2.5. (a) The spreadsheet should look like the image on the right (especially if students
use the data file from the companion CD). (b) There are 10 cases. (c) The image on the
right shows the column headings used on the companion CD; some students may create their
own spreadsheets and use slightly different headings. (The values of the variables should
be the same.) (d) The variables in the second and third columns ("Bots" and "SpamsPerDay")
are quantitative.

2.6. Stemplots are shown; histograms would be equivalent. Students may choose different
ways to summarize the data, such as bar graphs (one bar for each botnet). Note that
summarizing each variable separately does not reveal the relationship between the two
variables; that is done using a scatterplot in the next exercise. Because both
distributions are skewed, we prefer five-number summaries to the mean and standard
deviation.

   Bots             Spams/day
  0 | 1223         0 | 002359
  0 | 58           1 | 06
  1 | 2            2 |
  1 | 58           3 | 0
  2 |              4 |
  2 |              5 |
  3 | 1            6 | 0

                         x̄      s     Min   Q1    M     Q3   Max
  Bots (thousands)      99.7   96.6   12    20   67.5  150   315
  Spams/day (billions)  13.6   18.6   0.35   2    7.0   16    60


2.7. (a) The scatterplot is on the right. (b) Bobax is the second point from the right.
(Bobax has the second-highest bot count with 185 thousand, but is relatively low in spam
messages at 9 billion per day.)
[Figure: scatterplot of spams per day (billions) versus bots (thousands), with Bobax marked.]

2.8. (a) The resulting spreadsheet is not shown. (b) Scatterplot on the right. (c) The
points are arranged exactly as before, but the large numbers on the axes are distracting.
[Figure: scatterplot of spams per day versus bots, in raw units.]

2.9. Size seems to be the most reasonable choice for explanatory variable because it seems
nearly certain that Starbucks first decided which sizes to offer, then determined the
appropriate price for each size (rather than vice versa). The scatterplot shows a positive
association between size and price.
[Figure: scatterplot of cost ($) versus size (ounces).]

2.10. Two good choices are the change in debt from 2006 to 2007 (subtract the two numbers
for each country) or the ratio of the two debts (divide one number by the other). Students
may think of other new variables, but these have the most direct bearing on the question.
Shown are stemplots of the increase (2007 debt minus 2006 debt, measured in US$ billions),
and the debt ratio (2007 debt divided by 2006 debt; these numbers have no units). From
either variable, we can see that debt increased for all but two of the 24 countries. This
can be summarized using either the mean and standard deviation or the five-number summary
(the latter is preferred for increase, because of the skew).

   Increase          Ratio
  −0 | 21           9 | 88
   0 | 0044        10 | 4
   0 | 56667       10 | 668
   1 | 034         11 | 0223334
   1 |             11 | 66778
   2 | 13          12 | 013
   2 | 7           12 | 589
   3 |
   3 | 77
   4 | 122
   4 |
   5 | 34

             x̄      s     Min     Q1      M       Q3     Max
  Increase  19.07  18.38  −2.89   5.25  12.205  37.66  54.87
  Ratio     1.145  0.082  0.984  1.098  1.143   1.193  1.298

Note: In looking at increases, one notes that the size of the debt and the size of the change
are related (countries with smaller debts typically changed less than countries with large
debts). Debt ratio does not have this relationship with debt size (or at least it is less appar-
ent); for this reason, it might be considered a better choice for answering this question.

2.11. The new points (marked with a different symbol) are far away from the others, but
fall roughly in the same line, so the relationship is essentially unchanged: It is still
strong, linear, and positive.
[Figure: scatterplot of debt 2007 versus debt 2006, with the new points marked.]

2.12. Student choices of symbols will vary; the plot below uses +, o, −, rather than the
more obvious H, M, L. (The latter symbols are harder to distinguish when overlapping.)
This graph reinforces the observation in Example 2.14 that GDP ties in closely with
rankings; generally, high GDP goes with high rank, middle GDP with middle rank, and low
GDP with low rank. As before, the relationship (if any) between unemployment and rankings
is not so clear.
[Figure: scatterplot of LGDP per cap versus L unemployment, with symbols marking the top,
middle, and bottom thirds of the Forbes rankings.]

2.13. (a) A boxplot summarizes the distribution of one variable. (Two [or more] boxplots
can be used to compare two [or more] distributions, but that does not allow us to
examine the relationship between those variables.) (b) This is only correct if there is an
explanatory/response relationship. Otherwise, the choice of which variable goes on which
axis might be somewhat arbitrary. (c) High values go with high values, and low values go
with low values. (Of course, those statements are generalizations; there can be exceptions.)

2.14. (a) The points should all fall close to a negatively sloped line. (b) Look for a “cloud”
of points with no discernible pattern. Watch for students who mistakenly consider “no
relationship” as meaning “no linear relationship.” For example, points that suggest a
circle, triangle, or curve may indicate a non-linear relationship. (c) The points should be
widely scattered around a positively sloped line. (d) Sketches might be curved, angular, or
something more imaginative.

2.15. (a) Below, left. (b) Adding up the numbers in the first column of the table gives 46,994
thousand (that is, about 47 million) uninsured; divide each number in the second column
by this amount. (Express answers as percents; that is, multiply by 100.) (c) Below, right.
(The title on the vertical axis is somewhat wordy, but that is needed to distinguish between
this graph and the one in the solution to the next exercise.) (d) The plots differ only in the
vertical scale. (e) The uninsured are found in similar numbers for the five lowest age groups
(with slightly more in those aged 25–34 and 45–64), and fewer among those over 65.
[Figures: bar graphs of number uninsured (thousands) and percent of all uninsured, by age
group.]

2.16. (a) For example, in the under-18 age group, 8661/74,101 ≈ 0.1169 = 11.69%; in
the 18–24 age group, 8323/28,405 ≈ 0.2930 = 29.30%; etc. (b) Plot below. The title on the
vertical axis is rather long to distinguish between this graph and the second one in the
previous solution. (c) The youngest (often covered by their parents' insurance) and oldest
(covered by Medicare) are least likely to be uninsured. For the age groups in between, the
percent uninsured gradually decreases.
[Figure: bar graph of percent uninsured within each age group.]

2.17. The percents in Exercise 2.15 show what fraction of the uninsured fall in each age group.
The percents in Exercise 2.16 show what fraction of each age group is uninsured.
Note: When looking at fractions and percents, encourage students to focus on
the “whole”—that is, what does the denominator represent? For all the fractions in
Exercise 2.15, the “whole” is the group of all uninsured people. For Exercise 2.16, the
“whole” for each fraction is the total number of people in the corresponding age group.

2.18. See also the solutions to Exercises 1.62 and 1.64. (a) Apart from the outlier, the
scatterplot suggests a weak positive relationship—that is, beers with high alcohol content
generally have more carbohydrates, and those with low alcohol content generally have fewer
carbohydrates. (b) The outlier is O'Doul's, which is marketed as "non-alcoholic" beer.
(c) The scatterplot is on the right. (d) Without the outlier, the scatterplot suggests a
slightly curved relationship. (That relationship was also somewhat visible in Figure 2.10,
but it is easier to see when the points are not crowded together in half of the graph.)
[Figure: scatterplot of carbohydrates versus percent alcohol, outlier removed.]

2.19. (a) The scatterplot is on the right. (b) There is a moderate positive linear
relationship. (There is some suggestion of a curve, but a line seems to be a reasonable
approximation given the amount of scatter.)
[Figure: scatterplot of calories versus percent alcohol.]

2.20. (a) Figure 2.11 shows a positive curved relationship. More specifically, in countries with
fewer than 10 Internet users per 100 people, life expectancy ranges between 40 and about
75 years. For countries with more than 10 Internet users per 100 people, life expectancy
increases (slowly) with increasing Internet usage. (b) A more likely explanation for the
association is that countries with higher Internet usage are typically more developed and
affluent, which comes with benefits such as better access to medical care, etc.

2.21. There is a moderate positive linear relationship; the relationship for all countries is less
linear because of the wide range in life expectancy among countries with low Internet use.
Solutions 93

2.22. Students might choose different ways of creating a single plot; the most obvious
choice is to use different symbols for European and other countries. (This requires
students either to separate the European countries from the others, or to superimpose one
graph on another.)
[Figure: scatterplot of life expectancy (years) versus Internet users (per 100 people),
with separate symbols for European and other countries.]

2.23. (a) "Month" (the passage of time) explains changes in temperature (not vice versa).
(b) Temperature increases linearly with time (about 10 degrees per month); the relationship
is strong.
[Figure: plot of temperature (degrees F) versus month, February through May.]

2.24. (a) First test score should be explanatory because it comes first chronologically.
(b) The scatterplot shows no clear association; however, the removal of one point (the
sixth student, in the upper left corner of the scatterplot) leaves a weak-to-moderate
positive association. (c) A few students can disrupt the pattern quite a bit; for example,
perhaps the sixth student studied very hard after scoring so low on the first test, while
some of those who did extremely well on the first exam became overconfident and did not
study hard enough for the final (the points in the lower right corner of the scatterplot).
[Figure: scatterplot of final exam score versus first test score.]

2.25. (a) The second test happens before the final exam, so that score should be viewed as
explanatory. (b) The scatterplot shows a weak positive association. (c) Students' study
habits are more established by the middle of the term.
[Figure: scatterplot of final exam score versus second test score.]

2.26. To be considered an outlier, the point for the ninth student should be in either the upper
left or lower right portion of the scatterplot. The former would correspond to a student who
had a below-average second-test score but an above-average final-exam score. The latter
would be a student who did well on the second test but poorly on the final.

2.27. (a) Age is explanatory; weight is the response variable. (b) Explore the relationship;
there is no reason to view one or the other as explanatory. (c) Number of bedrooms is
explanatory; price is the response variable. (d) Amount of sugar is explanatory; sweetness is
the response variable. (e) Explore the relationship.

2.28. Parents’ income is explanatory, and college debt is the response. Both variables are
quantitative. We would expect a negative association: Low income goes with high debt, high
income with low debt.

2.29. (a) In general, we expect more intelligent children to be better readers and less intelligent
children to be weaker. The plot does show this positive association. (b) The four points are
for children who have moderate IQs but poor reading scores. (c) The rest of the scatterplot
is roughly linear but quite weak (there would be a lot of variation about any line we draw
through the scatterplot).

2.30. (a) The response variable (estimated level) can only take on the values 1, 2, 3, 4, 5, so
the points in the scatterplot must fall on one of those five levels. (b) The association is
(weakly) positive. (c) The estimate is 4, which is an overestimate; that child had the lowest
score on the test.

2.31. (a) If we used the number of males returning, then we might not see the relationship
because areas with many breeding pairs would correspondingly have more males that might
potentially return. (In the given numbers, the number of breeding pairs varies only from
28 to 38, but considering hypothetical data with 10 and 100 breeding pairs makes more
apparent the reason for using percents rather than counts.) (b) Scatterplot below. Mean
responses are shown as crosses; the mean responses with 29 and 38 breeding pairs are
(respectively) 71.3333% and 48.5% males returning. (c) The scatterplot does show the
negative association we would expect if the theory were correct.
[Figure: scatterplot of percent of males returning versus number of breeding pairs, with
mean responses marked as crosses.]

2.32. There appears to be a positive association between cycle length and day length, but
it is quite weak: The points of the scatterplot are generally located along a positively
sloped line but with a lot of spread around that line. (Ideally, both axes should have the
same scale.)
[Figure: scatterplot of cycle length (hours) versus hours of daylight on June 21.]

2.33. The scatterplot shows a fairly strong, positive, linear association. There are no
particular outliers; each variable has low and high values, but those points do not deviate
from the pattern of the rest. Social exclusion does appear to trigger a pain response.
[Figure: scatterplot of brain activity versus social distress score.]

2.34. (a) Value and revenue (scatterplot below) have a weak negative relationship. The form
of the relationship is unclear; one would hesitate to call it linear. (b) Value and debt
(scatterplot following page, left) have a very strong, positive, linear association.
(c) Value and income (scatterplot following page, right) have a positive, roughly linear
relationship. The relationship is quite weak, partly because the two teams that lost the
most money (Denver and Dallas) had higher-than-expected values, and the team with the most
income (Chicago) had a lower-than-expected value. Without those three points, we might call
the relationship moderately strong. (d) Debt has the strongest association with value;
other observations will vary.
[Figure: scatterplot of team value ($millions) versus revenue ($millions).]

[Figures: scatterplots of team value ($millions) versus debt ($millions) and versus income
($millions), with Chicago, Dallas, and Denver marked.]

2.35. (a) Women are marked with filled circles, men with open circles. (b) The association
is linear and positive. The women's points show a stronger association. As a group, males
typically have larger values for both variables.
[Figure: scatterplot of metabolic rate (cal/day) versus lean body mass (kg), with separate
symbols for women and men.]

2.36. (a) In the scatterplot, the open circles represent run 8905, the higher flow rate.
(b) Icicles seem to grow faster when the water runs more slowly. (Note that there is no
guarantee that the pattern we observe with these two flow rates applies to rates a lot
faster than 29.6 mg/s, or slower than 11.9 mg/s.)
[Figure: scatterplot of icicle length (cm) versus time (minutes) for runs 8903 and 8905.]

2.37. (a) Both men (filled circles) and women (open circles) show fairly steady
improvement. Women have made more rapid progress, but their progress seems to have slowed
(the record has not changed since 1993), while men's records may be dropping more rapidly
in recent years. (b) The data support the first claim but do not seem to support the
second.
[Figure: time plot of record 10K times (seconds) for men and women.]

2.38. The correlation is r ≈ 0.8839. (This can be computed by hand, but software makes it
much easier.)

2.39. (a) The correlation is r ≈ 0.8839 (again). (b) They are equal. (c) Units do not
affect correlation.
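
Note: A sketch (our addition, with made-up placeholder data) illustrating part (c): correlation is unchanged by linear changes of units.

```python
# Changing minutes to seconds and cm to inches leaves r unchanged.
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])  # hypothetical times, minutes
y = np.array([1.2, 2.9, 4.1, 6.0])      # hypothetical lengths, cm

r1 = np.corrcoef(x, y)[0, 1]
r2 = np.corrcoef(x * 60, y / 2.54)[0, 1]  # seconds and inches instead
print(r1, r2)  # identical values
```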

2.40. The correlation is near 1, because the scatterplot shows a very strong positive
linear association. (This can be confirmed with the data; we find that r ≈ 0.9971.)

2.41. In both these cases, the points in a scatterplot would fall exactly on a positively sloped
line, so both have correlation r = 1. (a) With x = the price of a brand-name product, and
y = the store-brand price, the prices satisfy y = 0.9x. (b) The prices satisfy y = x − 1.

2.42. (a) Scatterplot below. (b) The relationship is very strong (assuming these five
points are truly representative of the overall pattern), but it is not linear. (c) The
correlation is r = 0. (d) Correlation only measures the strength of linear relationships.
[Figure: scatterplot of y versus x for the five points.]

2.43. Software gives r ≈ 0.2873. (This is consistent with the not-very-strong association
visible in the plot.)

2.44. (a) With the outlier (O'Doul's) removed, the correlation is r ≈ 0.4185. (b) Outliers
that are not consistent with the pattern of the other points tend to decrease the size of r
(that is, they make r closer to 0), because they weaken the association between the
variables. (Points that lie far from the other points, but roughly on the same line, are
typically not referred to as outliers because they actually strengthen the association.)

2.45. The correlation is r ≈ 0.6701, but we note that because this relationship is not
linear, r is not really appropriate for this situation.
[Figure: scatterplot of life expectancy (years) versus Internet users (per 100 people).]

2.46. (a) Scatterplot below. (b) The correlation is r ≈ 0.6862. (c) The relationship (as
measured by r) is stronger for European countries than for the larger data set—and more
importantly, r is more appropriate for the European data set, because the relationship is
at least roughly linear.
[Figure: scatterplot of life expectancy (years) versus Internet users (per 100 people) for
European countries.]

2.47. (a) r ≈ 0.5194. (b) The first-test/final-exam correlation will be lower, because the
relationship is weaker. (See the next solution for confirmation.)

2.48. (a) r ≈ −0.2013. (b) The small correlation (that is, close to 0) is consistent with
a weak association. (c) This correlation is much smaller (in absolute value) than the
second-test/final-exam correlation 0.5194.

2.49. Such a point should be at the lower left part of the scatterplot. Because it tends to
strengthen the relationship, the correlation increases.
Note: In this case, r was positive, so strengthening the relationship means r gets larger.
If r had been negative, strengthening the relationship would have decreased r (toward −1).

2.50. Any outlier should make r closer to 0, because it weakens the relationship. To be
considered an outlier, the point for the ninth student should be in either the upper left or
lower right portion of the scatterplot. The former would correspond to a student who had a
below-average second-test score but an above-average final-exam score. The latter would be
a student who did well on the second test but poorly on the final.
Note: In this case, because r > 0, this means r gets smaller. If r had been negative,
getting closer to 0 would mean that r gets larger (but gets smaller in absolute value).

.
2.51. The correlations are listed on the right; these sup- Value and revenue r1 = −0.3228
.
port the observation from the solution to Exercise 2.34 Value and debt r2 = 0.9858
.
that the value/debt relationship is by far the strongest. Value and income r3 = 0.7177

2.52. For Exercise 2.32, r1 ≈ 0.2797; for Exercise 2.36, r2 ≈ 0.9958 (run 8903) and
r3 ≈ 0.9982 (run 8905).

2.53. (a) The scatterplot shows a moderate positive association, so r should be positive,
but not close to 1. (b) The correlation is r ≈ 0.5653. (c) r would not change if all the
men were six inches shorter. A positive correlation does not tell us that the men were
generally taller than the women; instead it indicates that women who are taller (shorter)
than the average woman tend to date men who are also taller (shorter) than the average man.
(d) r would not change because it is unaffected by units. (e) r would be 1 because the
points of the scatterplot would fall exactly on a positively sloped line (with no scatter).
[Figure: scatterplot of man's height (inches) versus woman's height (inches).]
sloped line (with no scatter).

2.54. The correlation is r ≈ 0.481. The correlation is greatly lowered by the one outlier.
Outliers tend to have fairly strong effects on correlation; it is even stronger here
because there are so few observations.
[Figure: scatterplot of y versus x, showing the outlier.]

2.55. (a) As two points determine a line, the correlation is always either −1 or 1. (b) Sketches
will vary; an example is shown as the first graph below. Note that the scatterplot must be
positively sloped, but r is affected only by the scatter about a line drawn through the data
points, not by the steepness of the slope. (c) The first nine points cannot be spread from the
top to the bottom of the graph because in such a case the correlation cannot exceed about
0.66 (based on empirical evidence—that is, from a reasonable amount of playing around
with the applet). One possibility is shown as the second graph below. (d) To have r ≈ 0.8,
the curve must be higher at the right than at the left. One possibility is shown as the third
graph below.

2.56. (a) The correlation will be closer to −1. One possible answer is shown below, left.
(b) Answers will vary, but the correlation will increase and can be made positive by
dragging the point down far enough (below, right).

2.57. (Scatterplot not shown.) If the husband’s age is y and the wife’s x, the linear relationship
y = x + 2 would hold, and hence r = 1 (because the slope is positive).

2.58. Explanations and sketches will vary, but should note that correlation measures
the strength of the association, not the slope of the line (except for the sign of the
slope—positive or negative). The hypothetical Funds A and B mentioned in the report, for
example, might be related by a linear formula with slope 2 (or 1/2).

2.59. The person who wrote the article interpreted a correlation close to 0 as if it were a
correlation close to −1 (implying a negative association between teaching ability and
research productivity). Professor McDaniel’s findings mean there is little linear association
between research and teaching—for example, knowing that a professor is a good researcher
gives little information about whether she is a good or bad teacher.
Note: Students often think that “negative association” and “no association” mean the
same thing. This exercise provides a good illustration of the difference between these terms.

2.60. (a) Because occupation has a categorical (nominal) scale, we cannot compute the
correlation between occupation and anything. (There may be a strong association between
these variables; some writers and speakers use “correlation” as a synonym for “association.”
It is much better to retain the more specific meaning.) (b) A correlation r = 1.19 is
impossible because −1 ≤ r ≤ 1 always. (c) Neither variable (gender and color) is
quantitative.

2.61. Both relationships (scatterplots follow) are somewhat linear. The GPA/IQ scatterplot
(r ≈ 0.6337) shows a stronger association than GPA/self-concept (r ≈ 0.5418). The two
students with the lowest GPAs stand out in both plots; a few others stand out in at least
one plot. Generally speaking, removing these points raises r (because the remaining points
look more linear). An exception: Removing the lower-left point in the self-concept plot
decreases r because the relative scatter of the remaining points is greater.

[Figures: scatterplots of GPA versus IQ and GPA versus self-concept score.]

2.62. The line lies almost entirely above the points in the scatterplot. (The slope
−0.00344 of this line is the same as the regression equation given in Example 2.19, but the
intercept 4.505 is one more than the regression intercept.)
[Figure: scatterplot of fat gain (kg) versus NEA increase (cal), with the line
ŷ = 4.505 − 0.00344x.]

2.63. The estimated fat gain is 3.505 − 0.00344 × 600 ≈ 1.441 kg.
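
Note: A sketch (our addition; the function name is ours) of the prediction, using the regression line ŷ = 3.505 − 0.00344x from Example 2.19.

```python
# Prediction from the fitted line in Example 2.19.
def predicted_fat_gain(nea_increase_cal):
    """Predicted fat gain (kg) for a given NEA increase (calories)."""
    return 3.505 - 0.00344 * nea_increase_cal

print(predicted_fat_gain(600))  # about 1.44 kg
```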

2.64. The data used to determine the regression line had NEA increase values ranging from
−94 to 690 calories, so estimates for values inside that range (like 200 and 500) should be
relatively safe. For values far outside this range (like −400 and 1000), the predictions would
not be trustworthy.

2.65. The table below shows the values of r² (expressed as percents). We observe that
(i) the fraction of variation explained depends only on the magnitude (absolute value) of
r, not its sign, and (ii) the fraction of explained variation drops off drastically as |r|
moves away from 1.

  r    −0.9  −0.5  −0.3   0   0.3  0.5  0.9
  r²    81%   25%    9%  0%    9%  25%  81%

2.66. (a) Scatterplot below. (b) The association appears to be roughly linear (although
note that the slope of the line is almost completely determined by the largest cities).
(c) The regression equation is ŷ = 1248 + 6.1050x. (d) Regression on population explains
r² ≈ 95.2% of the variation in open space.
[Figure: scatterplot of open space (acres) versus population (thousands).]

2.67. Residuals (found with software) are given in the table below. Los Angeles is the
best; it has nearly 6000 acres more than the regression line predicts. Chicago, which falls
almost 7300 acres short of the regression prediction, is the worst of this group.

  Los Angeles        5994.85
  Washington, D.C.   2763.75
  Minneapolis        2107.59
  Philadelphia        169.42
  Oakland              27.91
  Boston               20.96
  San Francisco       −75.78
  Baltimore          −131.55
  New York           −282.99
  Long Beach        −1181.70
  Miami             −2129.21
  Chicago           −7283.26
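
Note: A sketch (our addition; the function name is ours) of the residual computation, using the rounded coefficients from Exercise 2.66, so results differ slightly from the software residuals above.

```python
# Residual = observed - predicted, with y-hat = 1248 + 6.1050x.
def residual(population_thousands, open_space_acres):
    predicted = 1248 + 6.1050 * population_thousands
    return open_space_acres - predicted

print(residual(651, 5091))  # Baltimore: about -131 acres
```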

2.68. Because New York's data point is consistent with the pattern of the other cities, we
don't consider it an outlier. It does have some impact on the regression line; with New
York removed, the equation is ŷ = 1105 + 6.2557x. However, in the plot, we note that the
original regression line (solid) and the new line (dashed) are very similar, and the
residuals are likewise very similar.
[Figure: scatterplot of open space versus population, with the original (solid) and new
(dashed) regression lines.]

2.69. For Baltimore, for example, this rate is 5091/651 ≈ 7.82. The complete table is shown
below. Note that population is in thousands, so these are in units of acres per 1000
people. (a) Scatterplot below. (b) The association is much less linear than in the
scatterplot for Exercise 2.66. (c) The regression equation is ŷ = 8.739 − 0.000424x.
(d) Regression on population explains only r² ≈ 8.7% of the variation in open space per
person.

  Baltimore          7.82
  Boston             8.26
  Chicago            4.02
  Long Beach         6.25
  Los Angeles        8.07
  Miami              3.67
  Minneapolis       14.87
  New York           6.23
  Oakland            9.30
  Philadelphia       7.04
  San Francisco      7.61
  Washington, D.C.  13.12

[Figure: scatterplot of open space (acres per 1000 people) versus population (thousands).]

2.70. As in Exercise 2.67, we compute residuals to assess whether a city has more or less
open space than we would expect. These are given below, in descending order. This time,
Minneapolis is best, with about 6.3 acres per 1000 people above what we predict. Miami is
worst by this measure, falling short of the prediction by almost 5 acres per 1000 people.
Preferences will vary. One reason to prefer the first approach—apart from the stronger,
more linear association—is the negative relationship in the second approach. Why would an
individual in a large city need less open space than an individual in a smaller city?

  Minneapolis        6.290
  Washington, D.C.   4.622
  Los Angeles        0.893
  New York           0.883
  Oakland            0.733
  Boston            −0.230
  Baltimore         −0.643
  San Francisco     −0.796
  Philadelphia      −1.056
  Long Beach        −2.294
  Chicago           −3.490
  Miami             −4.914

2.71. The regression equation for predicting carbohydrates from alcohol content is
ŷ = 3.379 + 1.6155x.
Note: As we would guess from the scatterplot, and from the correlation r ≈ 0.2873 found
in Exercise 2.43, this is not a very reliable prediction; it only explains r² ≈ 8.3% of the
variation in carbohydrates.

2.72. (a) With the outlier (O’Doul’s) removed, the regression equation changes to
ŷ = −3.544 + 3.0319x. (b) An outlier tends to weaken the association between the variables,
and (as in this case) can drastically change the linear regression equation.

2.73. (a) To three decimal places, the correlations are all approximately 0.816 (for set D, r
actually rounds to 0.817), and the regression lines are all approximately ŷ = 3.000 + 0.500x.
For all four sets, we predict ŷ ≈ 8 when x = 10. (b) Scatterplots below. (c) For Set A, the
use of the regression line seems to be reasonable—the data do seem to have a moderate
linear association (albeit with a fair amount of scatter). For Set B, there is an obvious
non-linear relationship; we should fit a parabola or other curve. For Set C, the point
(13, 12.74) deviates from the (highly linear) pattern of the other points; if we can exclude
it, the (new) regression formula would be very useful for prediction. For Set D, the data
point with x = 19 is a very influential point—the other points alone give no indication
of slope for the line. Seeing how widely scattered the y coordinates of the other points
are, we cannot place too much faith in the y coordinate of the influential point; thus, we
cannot depend on the slope of the line, so we cannot depend on the estimate when x = 10.
(We also have no evidence as to whether or not a line is an appropriate model for this
relationship.)

[Figure: scatterplots of Sets A, B, C, and D, each with the regression line
ŷ = 3.000 + 0.500x.]

2.74. (a) The scatterplot (below, left) suggests a moderate positive linear relationship.
(b) The regression equation is ŷ = 17.38 + 0.6233x. (c) The residual plot is below, right.
(d) The regression explains r² ≈ 27.4% of the variation in y. (e) Student summaries will
vary.
[Figure: scatterplot of y versus x with regression line, and the residual plot.]

2.75. (a) The scatterplot (below, left) suggests a fairly strong positive linear
relationship. (b) The regression equation is ŷ = 1.470 + 1.4431x. (c) The residual plot is
below, right. The new point's residual is positive; the other residuals decrease as x
increases. (d) The regression explains r² ≈ 71.1% of the variation in y. (e) The new point
makes the relationship stronger, but its location has a large impact on the regression
equation—both the slope and intercept changed substantially.
[Figure: scatterplot of y versus x with regression line, and the residual plot.]

2.76. (a) The scatterplot (below, left) gives little indication of a relationship between
x and y. The regression equation is ŷ = 29.163 + 0.02278x; it explains only r² ≈ 0.5% of
the variation in y. The residual plot (below, right) tells a similar story to the first
scatterplot—little evidence of a relationship. This new point does not fall along the same
line as the other points, so it drastically weakens the relationship. (b) A point that does
not follow the same pattern as the others can drastically change an association, and in
extreme cases, can essentially make it disappear.

[Figure: scatterplot of y versus x, and the residual plot.]

2.77. (a) When x = 5, y = 12 + 6 × 5 = 42. (b) y increases by 6. (The change in y
corresponding to a unit increase in x is the slope of this line.) (c) The intercept of this
equation is 12.

2.78. (a) Time plot shown below, along with the regression line. (b) The means and standard
deviations are x̄ ≈ 1999.14, ȳ ≈ 273.43, sx ≈ 6.7436, and sy ≈ 6.4513. With the correlation
r ≈ 0.9791, the slope and intercept are b1 = r·sy/sx ≈ 0.9366 and b0 = ȳ − b1·x̄ ≈ −1599.
The equation is therefore ŷ = −1599 + 0.9366x; this line explains about r² ≈ 95.9% of the
variation in score. (c) Obviously, the software regression line should be the same.
[Figure: time plot of national mean NAEP score versus year, with regression line.]
of the variation in score. (c) Obviously, the software regression line should be the same.
Note: In examining student time plots, make sure they have a consistent scale on the
horizontal axis; a common mistake is to ignore the variation in the gaps between years and
leave an equal amount of horizontal space between each data point. Also, students may need
to be reminded to carry extra digits in their computations of the slope and intercept.

2.79. See also the solution to Exercise 2.33. (a) The regression equation is
ŷ = 0.06078x − 0.1261. (b) Based on the "up-and-over" method, most students will probably
estimate that ŷ ≈ 0; the regression formula gives ŷ = −0.0045. (c) The correlation is
r ≈ 0.8782, so the line explains r² ≈ 77% of the variation in brain activity.
[Figure: scatterplot of brain activity versus social distress score, with regression line.]

2.80. The regression equations are ŷ = −2.39 + 0.158x (Run 8903, 11.9 mg/s) and
ŷ = −1.45 + 0.0911x (Run 8905, 29.6 mg/s). Therefore, the growth rates are (respectively)
0.158 cm/minute and 0.0911 cm/minute; this suggests that the faster the water flows, the
more slowly the icicles grow.

2.81. The means and standard deviations are x̄ = 95 min, ȳ ≈ 12.6611 cm, sx ≈ 53.3854 min,
and sy ≈ 8.4967 cm; the correlation is r ≈ 0.9958.
For predicting length from time, the slope and intercept are b1 = r·sy/sx ≈ 0.158 cm/min
and a1 = ȳ − b1·x̄ ≈ −2.39 cm, giving the equation ŷ = −2.39 + 0.158x (as in
Exercise 2.80).
For predicting time from length, the slope and intercept are b2 = r·sx/sy ≈ 6.26 min/cm
and a2 = x̄ − b2·ȳ ≈ 15.79 min, giving the equation x̂ = 15.79 + 6.26y.
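
Note: A sketch (our addition) of both computations; swapping the roles of x and y changes both coefficients, not just their order.

```python
# The two regressions of Exercise 2.81 from summary statistics.
r, sx, sy, xbar, ybar = 0.9958, 53.3854, 8.4967, 95, 12.6611

b1 = r * sy / sx; a1 = ybar - b1 * xbar  # length on time
b2 = r * sx / sy; a2 = xbar - b2 * ybar  # time on length
print(a1, b1)  # about -2.39 and 0.158
print(a2, b2)  # about 15.79 and 6.26
```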

2.82. The means and standard deviations are: For lean body mass, m̄ = 46.74 and
sm = 8.28 kg, and for metabolic rate, r̄ = 1369.5 and sr = 257.5 cal/day. The correlation is
r = 0.8647. For predicting metabolic rate from body mass, the slope is
b1 = r · sr/sm ≈ 26.9 cal/day per kg. For predicting body mass from metabolic rate, the
slope is b2 = r · sm/sr ≈ 0.0278 kg per cal/day.

2.83. The correlation of IQ with GPA is r1 ≈ 0.634; for self-concept and GPA, r2 ≈ 0.542.
IQ does a slightly better job; it explains about r1² ≈ 40.2% of the variation in GPA, while
self-concept explains about r2² ≈ 29.4% of the variation.

2.84. Women's heights are the x-values; men's are the y-values. The slope is
b1 = (0.5)(2.7)/2.5 = 0.54 and the intercept is b0 = 68.5 − (0.54)(64.5) = 33.67.
The regression equation is ŷ = 33.67 + 0.54x. Ideally, the scales should be the same on
both axes. For a 67-inch tall wife, we predict the husband's height will be about
69.85 inches.
[Figure: scatterplot of husband's height versus wife's height, with regression line.]

2.85. We have slope b1 = r·sy/sx and intercept b0 = ȳ − b1·x̄, and ŷ = b0 + b1x, so when
x = x̄, ŷ = b0 + b1·x̄ = (ȳ − b1·x̄) + b1·x̄ = ȳ. (Note that the value of the slope does not
actually matter.)

2.86. (a) x̄ = 95 min, sx ≈ 53.3854 min, ȳ ≈ 12.6611 cm, and sy ≈ 8.4967 cm. The
correlation r ≈ 0.9958 has no units. (b) Multiply the old values of ȳ and sy by 2.54:
ȳ ≈ 32.1591 and sy ≈ 21.5816 inches. The correlation r is unchanged. (c) The slope is
r·sy/sx; with sy from part (b), this gives b1 ≈ 0.4025 in/min. (Or multiply by 2.54 the
appropriate slope from the solution to Exercise 2.80.)

2.87. r = √0.16 = 0.40 (high attendance goes with high grades, so r must be positive).

2.88. (a) The scatterplot (below, left) includes the regression line ŷ = 21581 − 473.73x.
(b) The scatterplot does not suggest a linear association, so a regression line is not an
appropriate summary of the relationship. (c) The residual plot (below, right) reveals—in a
manner similar to the original scatterplot—that a line is not appropriate for this
relationship. More specifically, the wide range of GDP-per-cap values for low unemployment
rates suggests that there may be no useful relationship unless unemployment is sufficiently
high.
[Figure: scatterplot of GDP per cap versus unemployment with regression line, and the
residual plot.]

2.89. (a) The scatterplot (below, left) includes the regression line ŷ = 10.27 − 0.5382x.
(b) The scatterplot looks more linear than Figure 2.5, but a line may not be appropriate
for all values of log unemployment. (c) In the residual plot (below, right), we see that
there are more negative residuals on the left and right, with more positive residuals in
the middle.
[Figure: scatterplot of LGDP per cap versus L unemployment with regression line, and the
residual plot.]

2.90. After removing countries with low unemployment rates, there are 70 countries left.
(a) The scatterplot (below, left) includes the regression line ŷ = 10.8614 − 0.7902x. (b) A
line seems appropriate for this set of countries. (c) The residual plot (below, right) does
not seem to show any patterns that might suggest any causes for concern.
[Figure: scatterplot of LGDP per cap versus L unemployment with regression line, and the
residual plot.]

2.91. See the solutions to Exercise 2.34 (scatterplots) and Exercise 2.51 (correlations).
Shown below are the three scatterplots, with regression lines; the equations for those
lines are given in the table. The best regression line is clearly the one based on debt.

  Explanatory variable   Equation            r²
  Income                 341.54 + 3.5760x   51.5%
  Debt                    16.02 + 2.8960x   97.2%
  Revenue                425.17 − 1.6822x   10.4%

[Figure: scatterplots of team value ($millions) versus income, debt, and revenue
($millions), each with its regression line.]

2.92. For an NEA increase of 143 calories, the predicted fat gain is
ŷ = 3.505 − 0.00344 × 143 ≈ 3.013 kg, so the residual is y − ŷ = 3.2 − 3.013 = 0.187 kg.
This residual is positive because the actual fat gain was greater than the prediction.

2.93. The sum of the residuals is 0.01.

2.94. (a) It is impossible for all the residuals to be positive; some must be negative, because
they will always sum to 0. (b) The direction of the relationship (positive or negative) has no
connection to whether or not it is due to causation. (c) Lurking variables can be any kind of
variables (and may be more likely to be explanatory rather than response).

2.95. (a) A high correlation means strong association, not causation. (b) Outliers in the y
direction (and some other data points) will have large residuals. (c) It is not extrapolation if
1 ≤ x ≤ 5.

2.96. Both variables are indicators of the level of technological and economic development in a
country; typically, they will both be low in a poorer, underdeveloped country, and they will
both be high in a more affluent country.

2.97. A reasonable explanation is that the cause-and-effect relationship goes in the other
direction: Doing well makes students or workers feel good about themselves, rather than
vice versa.

2.98. Patients suffering from more serious illnesses are more likely to go to larger hospitals
(which may have more or better facilities) for treatment. They are also likely to require
more time to recuperate afterwards.
Solutions 109

2.99. The explanatory and response variables were “consumption of herbal tea” and
“cheerfulness/health.” The most important lurking variable is social interaction; many of the
nursing home residents may have been lonely before the students started visiting.

2.100. See also the solutions to Exercises 2.3 and 2.9. (a) Size should be on the horizontal
axis because it is the explanatory variable. (b) The regression line is ŷ = 2.6071 + 0.08036x.
(c) See the plot (next page, left). (d) Rounded to four decimal places, the residuals (as
computed by software) are −0.0714, 0.1071, and −0.0357. It turns out that these three
residuals add up to 0, no matter how much they are rounded. However, if they are computed
by hand, and the slope and intercept in the regression equation have been rounded, there
might be some roundoff error. (e) The middle residual is positive and the other two are
negative, meaning that the 16-ounce drink costs more than the predicted value and the other
two sizes cost less than predicted. Note that the residuals show the same pattern (relative to
a horizontal line at 0) as the original points around the regression line.

[Figure: cost ($) vs. size (ounces) with the regression line (left); residual plot vs. size (right).]
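A minimal Python sketch for parts (b) and (d); the three (size, cost) pairs below—12, 16, and 24 ounces at $3.50, $4.00, and $4.50—are inferred from the quoted equation and residuals, so they should be checked against the exercise's actual data:

  import numpy as np

  size = np.array([12.0, 16.0, 24.0])   # ounces
  cost = np.array([3.50, 4.00, 4.50])   # dollars

  slope, intercept = np.polyfit(size, cost, 1)   # about 0.08036 and 2.6071
  residuals = cost - (intercept + slope * size)
  print(residuals.round(4))          # [-0.0714  0.1071 -0.0357]
  print(round(residuals.sum(), 10))  # 0.0 -- least-squares residuals sum to 0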

2.101. (a) The plot (below, left) is curved (low at the beginning and end of the year, high in
the middle). (b) The regression line is ŷ = 39.392 + 1.4832x. It does not fit well because
a line is a poor summary of this relationship. (c) Residuals are negative for January through
March and October through December (when actual temperature is less than predicted
temperature), and positive from April to September (when it is warmer than predicted).
(d) A similar pattern would be expected in any city that is subject to seasonal temperature
variation. (e) Seasons in the Southern Hemisphere are reversed, so temperature would be
cooler in the middle of the year.

[Figure: temperature (°F) vs. month with the regression line (left); residual plot vs. month (right).]

2.102. (a) Below, left. (b) This line is not a good summary of the pattern; the scatterplot is
curved rather than linear. (c) The sum is 0.01. The first two and last four residuals are
negative, and those in the middle are positive. Plot below, right.
[Figure: weight (kg) vs. age (months) with the regression line (left); residual plot vs. age (right).]

2.103. With individual children, the correlation would be smaller (closer to 0) because the
additional variation of data from individuals would increase the “scatter” on the scatterplot,
thus decreasing the strength of the relationship.

2.104. Presumably, those applicants who were hired would generally have been those who
scored well on the test. As a result, we have little or no information on the job performance
of those who scored poorly (and were therefore not hired). Those with higher test scores
(who were hired) will likely have a range of performance ratings, so we will only see
the various ratings for those with high scores, which will almost certainly show a weaker
relationship than if we had performance ratings for all applicants.

2.105. For example, a student who in the past might have received a grade of B now
receives an A, but has a lower SAT score than a past A student. While this is a bit of an
oversimplification, this means that today's A students are yesterday's A and B students,
today's B students are yesterday's C students, and so on. Because of the grade inflation, we
are not comparing students with equal abilities in the past and today.

2.106. A simple example illustrates this nicely: Suppose that everyone’s current salary is their
age (in thousands of dollars); for example, a 52-year-old worker makes $52,000 per year.
Everyone receives a $500 raise each year. That means that in two years, every worker’s
income has increased by $1000, but their age has increased by 2, so each worker’s salary is
now their age minus 1 (thousand dollars).

2.107. The correlation between BMR and fat gain is r ≈ 0.08795; the slope of the regression
line is b = 0.000811 kg/cal. These both show that BMR is less useful for predicting fat
gain. The small correlation suggests a very weak linear relationship (explaining less than 1%
of the variation in fat gain). The small slope means that changes in BMR have very little
impact on fat gain; for example, increasing BMR by 100 calories changes fat gain by only
0.08 kg.

2.108. (a) The scatterplot of the data is below on the left. (It is difficult to tell that there
are 20 data points, because many of the points overlap.) (b) The regression equation is
ŷ = −14.4 + 46.6x. (c) Residual plot below, right. The residuals for the extreme x-values
(x = 0.25 and x = 20.0) are almost all positive; all those for the middle two x-values are
negative.

[Figure: response vs. mass (ng) with the regression line (left); residual plot vs. mass (right).]

2.109. (a) Scatterplot below. (b) The plot shows a strong positive linear relationship.
(c) The regression equation is ŷ = 20.40 + 0.7194x. (d) Hernandez's point is in the lower
left—a logical place for the eventual champion.

[Scatterplot: Round 2 score vs. Round 1 score, with the regression line.]

2.110. (a) Apart from the outlier—circled for part (b)—the scatterplot shows a moderate
linear negative association. (b) With the outlier, r = −0.3387; without it, r* = −0.7866.
(c) The two regression formulas are ŷ = −492.6 − 33.79x (the solid line, with all points)
and ŷ = −1371.6 − 75.52x (the dashed line, with the outlier omitted). The omitted point is
also influential, as it has a noticeable impact on the line.

[Scatterplot: silicon (mg/g) vs. isotope (%), with both regression lines and the outlier circled.]

2.111. (a) Drawing the “best line” by eye is a very inaccurate process; few people choose
the best line (although you can get better at it with practice). (b) Most people tend to
overestimate the slope for a scatterplot with r ≈ 0.7; that is, most students will find that the
least-squares line is less steep than the one they draw.

2.112. (a) Any point that falls exactly on the regression line will not increase the sum of
squared vertical distances (which the regression line minimizes). Any other line—even if it
passes through this new point—will necessarily have a higher total sum of squares. Thus,
the regression line does not change. Possible output below, left. (b) Influential points are
those whose x coordinates are outliers; this point is on the right side, while all others are on
the left. Possible output below, right.

2.113. The plot shown is a very simplified (and not very realistic) example. Filled circles
are economists in business; open circles are teaching economists. The plot should show
positive association when either set of circles is viewed separately and should show a large
number of bachelor's degree economists in business and graduate degree economists in
academia.

[Figure: annual income ($1000) vs. years of post-HS education for the two groups.]

2.114. See also the solution to Exercise 2.73. (a) Fits and residuals are listed below. (Students
should find these using software.) (b) Plots below. (c) The residual plots confirm the
observations made in Exercise 2.73: Regression is only appropriate for Set A.

Set A Set B Set C Set D


x y Fit Resid. x y Fit Resid. x y Fit Resid. x y Fit Resid.
10 8.04 8.00 0.04 10 9.14 8.00 1.14 10 7.46 8.00 −0.54 8 6.58 7.00 −0.42
8 6.95 7.00 −0.05 8 8.14 7.00 1.14 8 6.77 7.00 −0.23 8 5.76 7.00 −1.24
13 7.58 9.50 −1.92 13 8.74 9.50 −0.76 13 12.74 9.50 3.24 8 7.71 7.00 0.71
9 8.81 7.50 1.31 9 8.77 7.50 1.27 9 7.11 7.50 −0.39 8 8.84 7.00 1.84
11 8.33 8.50 −0.17 11 9.26 8.50 0.76 11 7.81 8.50 −0.69 8 8.47 7.00 1.47
14 9.96 10.00 −0.04 14 8.10 10.00 −1.90 14 8.84 10.00 −1.16 8 7.04 7.00 0.04
6 7.24 6.00 1.24 6 6.13 6.00 0.13 6 6.08 6.00 0.08 8 5.25 7.00 −1.75
4 4.26 5.00 −0.74 4 3.10 5.00 −1.90 4 5.39 5.00 0.39 8 5.56 7.00 −1.44
12 10.84 9.00 1.84 12 9.13 9.00 0.13 12 8.15 9.00 −0.85 8 7.91 7.00 0.91
7 4.82 6.50 −1.68 7 7.26 6.50 0.76 7 6.42 6.50 −0.08 8 6.89 7.00 −0.11
5 5.68 5.50 0.18 5 4.74 5.50 −0.76 5 5.73 5.50 0.23 19 12.50 12.50 0.00

[Figure: residual plots for Sets A, B, C, and D.]
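For part (a), software reproduces the fits and residuals; a minimal Python sketch using the Set A columns from the table above:

  import numpy as np

  x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
  y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                7.24, 4.26, 10.84, 4.82, 5.68])

  slope, intercept = np.polyfit(x, y, 1)   # about 0.5 and 3.0
  fits = intercept + slope * x
  resid = y - fits
  for row in zip(x, y, fits.round(2), resid.round(2)):
      print(row)   # matches the Set A "Fit" and "Resid." columns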

2.115. There are 1684 female binge drinkers in the table; 8232 female students are not binge
drinkers.

2.116. There are 1684 + 8232 = 9916 women in the study. The number of students who are
not binge drinkers is 5550 + 8232 = 13,782.

2.117. Divide the number of non-bingeing females by the total number of students:
8232/17,096 ≈ 0.482.

2.118. Use the numbers in the right-hand column of the table in Example 2.28. Divide the
counts of bingeing and non-bingeing students by the total number of students:
3314/17,096 ≈ 0.194 and 13,782/17,096 ≈ 0.806

2.119. This is a conditional distribution; take the number of bingeing males divided by the
total number of males: 1630/7180 ≈ 0.227.

2.120. The first computation was performed in the previous solution; for the second, take the
number of non-bingeing males divided by the total number of males: 5550/7180 ≈ 0.773.
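All of the proportions in Exercises 2.115–2.120 come from one two-way table; a minimal Python sketch of the computations:

  import numpy as np

  #                  men,  women
  counts = np.array([[1630, 1684],    # binge drinkers
                     [5550, 8232]])   # not binge drinkers

  total = counts.sum()                      # 17,096 students
  print(counts.sum(axis=1) / total)         # marginal: binge vs. not (0.194, 0.806)
  print(counts[:, 0] / counts[:, 0].sum())  # conditional on men (0.227, 0.773)
  print(counts[1, 1] / total)               # non-bingeing women (about 0.482)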

2.121. (a) There are 151 + 148 = 299 “high exercisers,” of which 151/299 ≈ 50.5% get enough
sleep and 49.5% (the rest) do not. (b) There are 115 + 242 = 357 “low exercisers,” of which
115/357 ≈ 32.2% get enough sleep and 67.8% (the rest) do not. (c) Those who exercise more
than the median are more likely to get enough sleep.
Note: This question is asking for the conditional distribution of sleep within each
exercise group. The next question asks for the conditional distribution of exercise within
each sleep group.

2.122. (a) There are 151 + 115 = 266 students who get enough sleep, of which 151/266 ≈ 56.8%
are high exercisers and 43.2% (the rest) are low exercisers. (b) There are 148 + 242 = 390
students who do not get enough sleep, of which 148/390 ≈ 37.9% are high exercisers and 62.1%
(the rest) are low exercisers. (c) Students who get enough sleep are more likely to be high
exercisers. (d) Preferences will vary. In particular, note that one can make the case for a
cause-and-effect relationship in either direction between these variables.

2.123. 63/2100 = 3.0% of Hospital A's patients died, compared with 16/800 = 2.0% at Hospital B.

2.124. (a) For patients in poor condition, Hospital A lost 57/1500 = 3.8%, while Hospital B lost
8/200 = 4.0%. Hospital A did slightly better with patients in poor condition. (b) For patients
in good condition, Hospital A lost 6/600 = 1.0% and Hospital B lost 8/600 ≈ 1.33%. Again,
Hospital A did slightly better. (c) Hospital A appears to be the safer choice. (d) More than
70% of Hospital A's patients arrive in poor condition, compared to 25% at Hospital B,
so A's survival rate is lower overall because these patients are more likely to die in the
hospital.
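A minimal Python sketch of this reversal (Simpson's paradox), using the death and patient counts from Exercises 2.123–2.124:

  #            (deaths, patients) by patient condition
  A = {"poor": (57, 1500), "good": (6, 600)}
  B = {"poor": (8, 200), "good": (8, 600)}

  for name, hosp in (("A", A), ("B", B)):
      deaths = sum(d for d, n in hosp.values())
      patients = sum(n for d, n in hosp.values())
      by_cond = {c: round(100 * d / n, 2) for c, (d, n) in hosp.items()}
      print(name, round(100 * deaths / patients, 2), by_cond)
  # A is worse overall (3.0% vs. 2.0%) yet better within each condition.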

2.125. Bar graphs are on the following page. (a) There are about 3,388,000 full-time college
students aged 15 to 19. (Note that numbers are in thousands.) (b) The joint distribution is
found by dividing each number in the table by 16,388 (the total of all the numbers). These
proportions are given in italics in the table below. For example, 3388/16,388 ≈ 0.2067,
meaning that about 20.7% of all college students are full-time and aged 15 to 19. (c) The
marginal distribution of age is found by dividing the row totals by 16,388; they are in the
right margin of the table and the graph on the left following. For example, 3777/16,388 ≈
0.2305, meaning that about 23% of all college students are aged 15 to 19. (d) The marginal
distribution of status is found by dividing the column totals by 16,388; they are in the
bottom margin of the table and the graph on the right following. For example,
11,091/16,388 ≈ 0.6768, meaning that about 67.7% of all college students are full-time.

             FT               PT               Total
  15–19      3388  0.2067      389  0.0237     3777  0.2305
  20–24      5238  0.3196     1164  0.0710     6402  0.3907
  25–34      1703  0.1039     1699  0.1037     3402  0.2076
  35+         762  0.0465     2045  0.1248     2807  0.1713
  Total     11091  0.6768     5297  0.3232    16388

[Bar graphs: marginal distribution of age (left) and of status, full-time vs. part-time (right).]
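A minimal Python sketch of the joint and marginal computations, with the counts (in thousands) from the table above:

  import numpy as np

  #                    FT,   PT
  counts = np.array([[3388,  389],    # 15-19
                     [5238, 1164],    # 20-24
                     [1703, 1699],    # 25-34
                     [ 762, 2045]])   # 35+

  total = counts.sum()                          # 16,388
  print((counts / total).round(4))              # joint distribution
  print((counts.sum(axis=1) / total).round(4))  # marginal distribution of age
  print((counts.sum(axis=0) / total).round(4))  # marginal distribution of status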

2.126. Refer to the counts in the solution to Exercise 2.125. For each age category, the
conditional distribution of status is found by dividing the counts in that row by that row
total. For example, 3388/3777 ≈ 0.8970 and 389/3777 ≈ 0.1030, meaning that of all college
students in the 15–19 age range, about 89.7% are full-time, and the rest (10.3%) are part-time.
Note that each pair of numbers should add up to 1 (except for rounding error, but with only
two numbers, that rarely happens). The complete table is shown below, along with one possible
graphical presentation. We see that the older the students are, the more likely they are to be
part-time.

            FT      PT
  15–19   0.8970  0.1030
  20–24   0.8182  0.1818
  25–34   0.5006  0.4994
  35+     0.2715  0.7285

[Stacked bar graph: proportion of full-time vs. part-time students within each age group.]

2.127. Refer to the counts in the solution to Exercise 2.125. For each status category, the
conditional distribution of age is found by dividing the counts in that column by that column
total. For example, 3388/11,091 ≈ 0.3055, 5238/11,091 ≈ 0.4723, etc., meaning that of all
full-time college students, about 30.55% are aged 15 to 19, 47.23% are 20 to 24, and so on.
Note that each set of four numbers should add up to 1 (except for rounding error). Graphical
presentations may vary; one possibility is shown below. We see that full-time students are
dominated by younger ages, while part-time students are more likely to be older. (This
is essentially the same observation made in the previous exercise, seen from a different
viewpoint.)

            FT      PT
  15–19   0.3055  0.0734
  20–24   0.4723  0.2197
  25–34   0.1535  0.3207
  35+     0.0687  0.3861

[Stacked bar graph: age distribution within full-time and part-time students.]

2.128. Two examples are shown below. In general, choose a to be any number from 0 to 200,
and then all the other entries can be determined.

    50  150        175   25
   150   50         25  175

Note: This is why we say that such a table has “one degree of freedom”: We can make
one (nearly) arbitrary choice for the first number, and then have no more decisions to make.

2.129. To construct such a table, we can start by choosing values for the row and column
sums r1, r2, r3, c1, c2, c3, as well as the grand total N. Note that N = r1 + r2 + r3 =
c1 + c2 + c3, so we only have five choices to make. Then, find each count a, b, c, d, e, f,
g, h, i by taking the corresponding row total, times the corresponding column total, divided
by the grand total. For example, a = r1 × c1/N and f = r2 × c3/N. Of course, these counts
should be whole numbers, so it may be necessary to make adjustments in the row and column
totals to meet this requirement.

   a   b   c   r1
   d   e   f   r2
   g   h   i   r3
   c1  c2  c3  N

The simplest such table would have all nine counts (a, b, c, d, e, f, g, h, i) equal to one
another.
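A minimal Python sketch of this construction; the row and column totals below are hypothetical choices:

  import numpy as np

  rows = np.array([60, 90, 50])    # hypothetical row totals r1, r2, r3
  cols = np.array([40, 100, 60])   # hypothetical column totals c1, c2, c3
  N = rows.sum()                   # grand total; must equal cols.sum()

  table = np.outer(rows, cols) / N   # entry (i, j) = r_i * c_j / N
  print(table)   # every row has the same conditional distribution: no association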

2.130. (a) Overall, (125 + 155 + 180)/900 ≈ 51.1% did not respond. (b) Generally, the larger
the business, the less likely it was to respond: 125/300 ≈ 41.7% of small businesses,
155/300 ≈ 51.7% of medium-sized businesses, and 180/300 = 60.0% of large businesses did not
respond. (c) See the bar graph below. (d) Of the 440 total responses, 175/440 ≈ 39.8% came
from small businesses, 145/440 ≈ 33.0% from medium-sized businesses, and 120/440 ≈ 27.3%
from large businesses. (e) No: Almost 40% of respondents were small businesses, while just
over a quarter of all responses come from large businesses.

[Bar graph: nonresponse rate (%) for small, medium, and large businesses.]

2.131. (a) Use column percents, e.g., 68/225 ≈ 30.22% of females are in accounting. See table
and graph below. The biggest difference between women and men is in Administration:
A higher percentage of women chose this major. Meanwhile, a greater proportion of men
chose other fields, especially Finance. (b) There were 386 responses; 336/722 ≈ 46.5% did not
respond.

             Female    Male     Overall
  Accting.   30.22%    34.78%   32.12%
  Admin.     40.44%    24.84%   33.94%
  Econ.       2.22%     3.70%    2.85%
  Fin.       27.11%    36.65%   31.09%

[Bar graph: percent of students choosing each major, for females, males, and overall.]

2.132. 14/24 ≈ 58.33% of desipramine users did not have a relapse, while 6/24 = 25% of
lithium users and 4/24 ≈ 16.67% of those who received placebos succeeded in breaking
their addictions. Desipramine seems to be effective. Note that use of percentages is not
as crucial here as in other cases because each drug was given to 24 addicts.

[Bar graph: percent without relapses for desipramine, lithium, and placebo.]

2.133. This is a case of confounding: The association between dietary iron and anemia is
difficult to detect because malaria and helminths also affect iron levels in the body.

[Diagram: inadequate dietary iron, malaria, and helminths all linked (with question marks) to anemia.]

2.134. Opinions will vary; one can argue for causation both ways, and the truth is probably
that both conditions exacerbate one another.

2.135. Responses will vary. For example, students who choose the online course might have
more self-motivation or better computer skills. A diagram is shown below; the generic
“Student characteristics” might be replaced with something more specific.

[Diagram: course type and student characteristics both linked (with question marks) to course grade.]

2.136. Age is one lurking variable: Married men would generally be older than single men,
so they would have been in the work force longer and therefore had more time to advance in
their careers. The diagram shown below illustrates this lurking variable; other variables
could also be shown in place of “age.”

[Diagram: marital status and age both linked (with question marks) to salary.]

2.137. No; self-confidence and improving fitness could be a common response to some other
personality trait, or high self-confidence could make a person more likely to join the
exercise program.

2.138. Students with music experience may have other advantages (wealthier parents, better
school systems, and so forth). That is, experience with music may have been a “symptom”
(common response) of some other factor that also tends to cause high grades.

[Diagram: other factors linked to both music experience and academic performance.]

2.139. Two possibilities are that they might perform better simply because this is their second
attempt or because they feel better prepared as a result of taking the course (whether or not
they really are better prepared).

2.140. The diagram below illustrates the confounding between exposure to chemicals and
standing up.

[Diagrams: for 2.140, chemical exposure and time standing up both linked to miscarriages; for 2.141, hospital size and seriousness of illness both linked to length of stay.]

2.141. Patients suffering from more serious illnesses are more likely to go to larger hospitals
(which may have more or better facilities) for treatment. They are also likely to require
more time to recuperate afterwards.

2.142. Spending more time watching TV means that less time is spent on other activities; this
may suggest lurking variables. For example, perhaps the parents of heavy TV watchers do
not spend as much time at home as other parents. Also, heavy TV watchers would typically
not get as much exercise.

2.143. In this case, there may be a causative effect, but in the direction opposite to the one
suggested: People who are overweight are more likely to be on diets and so choose artificial
sweeteners over sugar. (Also, heavier people are at a higher risk to develop diabetes; if they
do, they are likely to switch to artificial sweeteners.)

2.144. (a) Statements such as this typically mean that the risk of dying at a given age is half
as great; that is, given two groups of the same age, where one group walks and the other
does not, the walkers are half as likely to die in (say) the next year. (b) Men who choose
to walk might also choose (or have chosen, earlier in life) other habits and behaviors that
reduce mortality.

2.145. This is an observational study—students choose their “treatment” (to take or not take
the refresher sessions).

2.146. (a) Time plot below. (b) The regression equation is ŷ = 116779 − 57.83x. (c) The
plot shows a clear negative association, and the slope of the regression line says that the
rank is decreasing at an average rate of about 58 per year. Because a lower rank means
higher popularity, this means that “Atticus” is getting more popular.

[Time plot: rank of “Atticus” vs. year (2003–2009), with the regression line.]

2.148. (a) The scatterplot shows a positive, curved relationship. (b) The regression explains
about r² = 98.3% of the variation in salary. While this indicates that the relationship is
strong, and close to linear, we can see from the scatterplot that the actual relationship is
curved.

2.149. (a) The residuals are positive at the beginning and end, and negative in the middle.
(b) The behavior of the residuals agrees with the curved relationship seen in Figure 2.30.

2.150. (a) Both plots show a positive association, but the log-salary plot is linear rather than
curved. (b) While the residuals in Figure 2.31 were positive at the beginning and end, and
negative in the middle, the log-salary residuals in Figure 2.33 show no particular pattern.

2.151. (a) The regression equation for predicting salary from year is ŷ = 41.253 + 3.9331x;
for x = 25, the predicted salary is ŷ ≈ 139.58 thousand dollars, or about $139,600.
(b) The log-salary regression equation is ŷ = 3.8675 + 0.04832x. With x = 25, we have
ŷ ≈ 5.0754, so the predicted salary is e^ŷ ≈ 160.036, or about $160,040. (c) Although both
predictions involve extrapolation, the second is more reliable, because it is based on a linear
fit to a linear relationship. (d) Interpreting relationships without a plot is risky. (e) Student
summaries will vary, but should include comments about the importance of looking at plots,
and the risks of extrapolation.
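A minimal Python sketch of the two predictions at x = 25, using the fitted equations quoted above (salaries in thousands of dollars):

  import math

  linear = 41.253 + 3.9331 * 25      # about 139.58, i.e., roughly $139,600
  log_pred = 3.8675 + 0.04832 * 25   # about 5.0755 (log scale)
  from_log = math.exp(log_pred)      # about 160 with these rounded coefficients
  print(round(linear, 2), round(from_log, 2))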

2.152. (a) The scatterplot below includes the regression line given in the next exercise.
(b) The plot shows a very strong positive linear relationship. (c) The correlation between
the two variables is r ≈ 0.9993, so regression explains r² ≈ 99.9% of the variation in
2008–09 salaries.

[Scatterplot: 2008–09 salary ($1000) vs. 2007–08 salary ($1000), with the regression line.]

2.153. (a) The regression equation is ŷ = 5403 + 0.9816x. (b) The residual plot below
reveals no causes for concern; in particular, there are no clear outliers or influential
observations.

[Residual plot: residual ($) vs. 2007–08 salary ($1000).]

2.154. (a) A spreadsheet or other software is the best way to do these computations. As a
check, the first two raises are (142,900 − 141,800)/141,800 ≈ 0.78% and
(113,600 − 109,800)/109,800 ≈ 3.46%. The scatterplot (following page, left) shows a
moderately strong negative linear relationship. (b) The regression equation is
ŷ = 8.8536 − 0.00005038x. (c) A plot of residuals versus 2007–08 salaries reveals no
outliers or other causes for concern. (d) The data do show that, in general, those with lower
salaries are given greater percentage raises. Students might observe, for example, that the
regression explains 71.1% of the variation in raise. In addition, the regression slope tells us
that on the average, an individual's raise decreases by about 0.5% for each additional
$10,000 earned in 2007–08, because 0.00005038 × $10,000 ≈ 0.5%. (Note that the slope is
in units of percent per dollar.)

[Figure: raise (%) vs. 2007–08 salary ($1000) with the regression line (left); residual plot (right).]
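A minimal Python sketch of the percent-raise check in part (a), using the two salary pairs quoted above:

  # 2007-08 and 2008-09 salaries for the first two individuals (from the check above)
  old = [141800, 109800]
  new = [142900, 113600]

  raises = [100 * (n - o) / o for o, n in zip(old, new)]
  print([round(r, 2) for r in raises])   # [0.78, 3.46] percent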

2.155. A school that accepts weaker students but graduates a higher-than-expected number of
them would have a positive residual, while a school with a stronger incoming class but a
lower-than-expected graduation rate would have a negative residual. It seems reasonable to
measure school quality by how much benefit students receive from attending the school.

2.156. (a) The association is negative and roughly linear. This seems reasonable because a
low number of smokers suggests that the state’s population is health-conscious, so we
might expect more people in that state to have healthy eating habits. (b) The correlation is
r ≈ −0.5503. (c) Utah is the farthest point to the left (that is, it has the lowest smoking
rate) and lies well below the line (i.e., the proportion of adults who eat fruits and vegetables
is lower than we would expect). (d) California has the second-lowest smoking rate and
one of the highest fruit/vegetable rates. This point lies above the line, meaning that the
proportion of California adults who eat fruits and vegetables is higher than we would expect.

2.157. (a) The scatterplot shows a moderate positive association. (b) The regression line
ŷ = 11.00 + 0.9344x fits the overall trend. (However, there is some hint of a curve in
the plot, because most of the points in the middle lie below the line.) (c) For example,
a state whose point falls above the line has a higher percent of college graduates than we
would expect based on the percent who eat 5 servings of fruits and vegetables. (d) No;
association is not evidence of causation.

[Scatterplot: EdCollege vs. FruitVeg5, with the regression line.]

2.158. (a) The plot shows a fairly strong positive linear association. (b) The regression
equation is ŷ = −6.202 + 1.2081x. (c) If x = 62 pages, we predict ŷ ≈ 68.7 pages.

[Scatterplot: text pages vs. LaTeX pages, with the regression line.]

2.161. These results support the idea (the slope is negative, so variation decreases with
increasing diversity), but the relationship is only moderately strong (r² = 0.34, so diversity
only explains 34% of the variation in population variation).
Note: That last parenthetical comment is awkward and perhaps confusing, but is
consistent with similar statements interpreting r².

2.162. (a) One possible measure of the difference is the mean response: 106.2 spikes/second
for pure tones and 176.6 spikes/second for monkey calls—an average of an additional
70.4 spikes/second. (b) The regression equation is ŷ = 93.9 + 0.778x. The third point
(pure tone 241, call 485 spikes/second) has the largest residual; it is circled. The first point
(474 and 500 spikes/second) is an outlier in the x direction; it is marked with a square.
(c) The correlation drops only slightly (from 0.6386 to 0.6101) when the third point is
removed; it drops more drastically (to 0.4793) without the first point. (d) Without the first
point, the line is ŷ = 101 + 0.693x; without the third point, it is ŷ = 98.4 + 0.679x.

[Scatterplot: monkey call response vs. pure tone response (spikes/second), with the regression line; the two unusual points are marked.]

2.163. Below is a scatterplot of MOR against MOE, showing a moderate linear positive
association. The regression equation is ŷ = 2653 + 0.004742x; this regression explains
r² ≈ 0.6217 = 62% of the variation in MOR. So, we can use MOE to get fairly good
(though not perfect) predictions of MOR.

[Scatterplot: modulus of rupture ×10⁴ (psi) vs. modulus of elasticity ×10⁶ (psi), with the regression line.]

2.164. The quantile plot (below) is reasonably close to a straight line, so we have little
reason to doubt that the residuals come from a Normal distribution.

[Normal quantile plot: residuals vs. Normal scores.]

2.165. (a) The scatterplot is below. (b) The regression equation is ŷ = 1.2027 + 0.3275x.
As we see from the scatterplot, the relationship is not too strong; the correlation (r = 0.4916,
r² = 0.2417) confirms this.

[Scatterplot: drive for thinness vs. body dissatisfaction, with the regression line.]

2.166. (a) Yes: The two lines appear to fit the data well. There do not appear to
be any outliers or influential points. (b) Compare the slopes: before—0.189;
after—0.157. (The units for these slopes are 100 ft³/day per degree-day/day; for
students who are comfortable with units, 18.9 ft³ vs. 15.7 ft³ would be a better
answer.) (c) Before: ŷ = 1.089 + 0.189(35) = 7.704 hundred ft³ = 770.4 ft³. After:
ŷ = 0.853 + 0.157(35) = 6.348 hundred ft³ = 634.8 ft³. (d) This amounts to an additional
($1.20)(7.704 − 6.348) ≈ $1.63 per day, or about $50.44 for the month.
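A minimal Python sketch of the arithmetic in parts (c) and (d), using the two fitted equations quoted above:

  # Predicted gas use (hundreds of cubic feet per day) at 35 degree-days/day
  before = 1.089 + 0.189 * 35    # 7.704, i.e., 770.4 cubic feet
  after = 0.853 + 0.157 * 35     # 6.348, i.e., 634.8 cubic feet

  extra_per_day = 1.20 * (before - after)   # dollars, at $1.20 per 100 cubic feet
  print(round(extra_per_day, 2), round(31 * extra_per_day, 2))   # 1.63 and 50.44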

2.167. (a) Shown below are plots of count against time, and residuals against time for the
regression, which gives the formula ŷ = 259.58 − 19.464x. Both plots suggest a curved
relationship rather than a linear one. (b) With natural logarithms, the regression equation is
ŷ = 5.9732 − 0.2184x; with common logarithms, ŷ = 2.5941 − 0.09486x. The second pair
of plots below show the (natural) logarithm of the counts against time, suggesting a fairly
linear relationship, and the residuals against time, which shows no systematic pattern. (If
common logarithms are used instead of natural logs, the plots will look the same, except the
vertical scales will be different.) The correlations confirm the increased linearity of the log
plot: r 2 = 0.8234 for the original data, r 2 = 0.9884 for the log-data.

[Figures: bacteria count (hundreds) vs. time with the regression line, and its residual plot (top row); log bacteria count vs. time with the regression line, and its residual plot (bottom row).]
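A minimal Python sketch of the log-transform comparison; the counts below are hypothetical stand-ins generated from an exponential decay, since the exercise's data are not reproduced here:

  import numpy as np

  rng = np.random.default_rng(1)
  t = np.arange(1.0, 15.0)
  count = 260 * np.exp(-0.22 * t) * rng.uniform(0.9, 1.1, size=t.size)

  for label, y in (("raw", count), ("log", np.log(count))):
      slope, intercept = np.polyfit(t, y, 1)
      r2 = np.corrcoef(t, y)[0, 1] ** 2
      print(label, round(intercept, 4), round(slope, 4), round(r2, 4))
  # r^2 is much closer to 1 for the log counts, as in the solution above.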

2.168. Note that ȳ = 46.6 + 0.41x̄, because the least-squares line passes through the point of
means (x̄, ȳ). We predict that Octavio will score 4.1 points above the mean on the final
exam: ŷ = 46.6 + 0.41(x̄ + 10) = 46.6 + 0.41x̄ + 4.1 = ȳ + 4.1.
(Alternatively, because the slope is 0.41, we can observe that an increase of 10 points on the
midterm yields an increase of 4.1 on the predicted final-exam score.)

2.169. Number of firefighters and amount of damage both increase with the seriousness of the
fire (i.e., they are common responses to the fire’s seriousness).

2.170. Compute column percents; for example, 61,941/355,265 ≈ 17.44% of those U.S.
degrees considered in this table are in engineering. See table and graph below. We observe
that there are considerably more social science degrees and fewer engineering degrees in the
United States. The Western Europe and Asia distributions are similar.

             United    Western
  Field      States    Europe    Asia      Overall
  Eng.       17.44%    38.26%    36.96%    32.78%
  Nat. sci.  31.29%    33.73%    31.97%    32.29%
  Soc. sci.  51.28%    28.01%    31.07%    34.93%

[Bar graph: percent of degrees in each field, for the US, Western Europe, Asia, and overall.]

2.171. Different graphical presentations are possible; one is shown below. More women
perform volunteer work; the notably higher percentage of women who are “strictly
voluntary” participants accounts for the difference. (The “court-ordered” and “other”
percentages are similar for men and women.)

[Stacked bar graph: percent of men and women who are strictly voluntary, court-ordered, other, or non-volunteers.]

2.172. Table shown below; for example, 31.9%/40.3% ≈ 79.16%. The percents in each row
sum to 100%, with no rounding error for up to four places after the decimal. Both this graph
and the graph in the previous exercise show that women are more likely to volunteer, but in
this view, we cannot see the difference in the rate of non-participation.

            Strictly    Court-
  Gender    voluntary   ordered   Other
  Men        79.16%      5.21%    15.63%
  Women      85.19%      2.14%    12.67%

[Stacked bar graph: conditional distribution of type of volunteer work for men and women.]

2.173. (a) Table below. (b) 490/800 = 61.25% of male applicants are admitted, while only
400/700 ≈ 57.14% of females are admitted. (c) 400/600 ≈ 66.67% of male business school
applicants are admitted; for females, this rate is the same: 200/300 ≈ 66.67%. In the law
school, 90/200 = 45% of males are admitted, compared to 200/400 = 50% of females.
(d) A majority (3/4) of male applicants apply to the business school, which admits
(400 + 200)/(600 + 300) ≈ 66.67% of all applicants. Meanwhile, a majority (4/7) of women
apply to the law school, which admits only (90 + 200)/(200 + 400) ≈ 48.33% of its applicants.

           Admit   Deny
  Male      490     310
  Female    400     300

2.174. Tables will vary, of course. The key idea is that one gender should be more likely to
apply to the schools that are easier to get into. For example, if the four schools admit 50%,
60%, 70%, and 80% of applicants, and men are more likely to apply to the first two, while
women apply to the latter two, women will be admitted more often.
A nice variation on this exercise is to describe two basketball teams practicing. You
observe that one team makes 50% of their shots, while the other makes only 40%. Does that
mean the first team is more accurate? Not necessarily; perhaps they attempted more lay-ups
while the other team spent more time shooting three-pointers. (Some students will latch onto
this kind of example much more quickly than discussions of male/female admission rates.)

2.175. If we ignore the “year” classification, we see that Department A teaches 32 small
classes out of 52, or about 61.54%, while Department B teaches 42 small classes out of 106,
or about 39.62%. (These agree with the dean's numbers.)
For the report to the dean, students may analyze the numbers in a variety of ways,
some valid and some not. The key observations are: (i) When considering only first- and
second-year classes, A has fewer small classes (1/12 ≈ 8.33%) than B (12/70 ≈ 17.14%).
Likewise, when considering only upper-level classes, A has 31/40 = 77.5% and B has
30/36 ≈ 83.33% small classes. The graph below illustrates this. These numbers are given in
the back of the text, so most students should include this in their analysis! (ii) 40/52 ≈ 76.92%
of A's classes are upper-level courses, compared to 36/106 ≈ 33.96% of B's classes.

[Bar graph: percent of small classes in lower- and upper-level courses, for Departments A and B.]

2.176. (a) Subtract the “agreed” counts from the sample sizes to get the “disagreed” counts.
The table is in the Minitab output below. (The output has been slightly altered to have more
descriptive row and column headings.) We find X² = 2.67, df = 1, and P = 0.103, so we
cannot conclude that students and non-students differ in the response to this question. (b) For
testing H0: p1 = p2 versus Ha: p1 ≠ p2, we have p̂1 ≈ 0.3607, p̂2 ≈ 0.5085, p̂ ≈ 0.4333,
SE_Dp ≈ 0.09048, and z = −1.63. Up to rounding, z² = X², and the P-values are the same.
(c) The statistical tests in (a) and (b) assume that we have two SRSs, which we clearly do not
have here. Furthermore, the two groups differed in geography (northeast/West Coast) in
addition to student/non-student classification. These issues mean we should not place too
much confidence in the conclusions of our significance test—or, at least, we should not
generalize our conclusions too far beyond the populations “upper level northeastern college
students taking a course in Internet marketing” and “West Coast residents willing to
participate in commercial focus groups.”

  Minitab output
           Students   Non-st   Total
  Agr            22       30      52
              26.43    25.57
  Dis            39       29      68
              34.57    33.43
  Total          61       59     120
  ChiSq = 0.744 + 0.769 + 0.569 + 0.588 = 2.669
  df = 1, p = 0.103
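The same test is easy to run with software; a minimal Python sketch using scipy (the continuity correction is turned off so the statistic matches the X² above):

  import numpy as np
  from scipy.stats import chi2_contingency

  #                  students, non-students
  table = np.array([[22, 30],    # agree
                    [39, 29]])   # disagree

  chi2, p, df, expected = chi2_contingency(table, correction=False)
  print(round(chi2, 3), df, round(p, 3))   # about 2.669, 1, 0.103
  print(expected.round(2))                 # matches the expected counts above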

2.177. (a) First we must find the counts in each cell of the two-way table. For example,
there were about (0.172)(5619) ≈ 966 Division I athletes who admitted to wagering. These
counts are shown in the Minitab output below, where we see that X² = 76.7, df = 2, and
P < 0.0001. There is very strong evidence that the percentage of athletes who admit to
wagering differs by division. (b) Even with much smaller numbers of students (say, 1000
from each division), P is still very small. Presumably, the estimated numbers are reliable
enough that we would not expect the true counts to be less than 1000, so we need not be
concerned about the fact that we had to estimate the sample sizes. (c) If the reported
proportions are wrong, then our conclusions may be suspect—especially if it is the case that
athletes in some division were more likely to say they had not wagered when they had.
(d) It is difficult to predict exactly how this might affect the results: Lack of independence
could cause the estimated percents to be too large, or too small, if our sample included
several athletes from teams which have (or do not have) a “gambling culture.”

  Minitab output
            Div1      Div2      Div3    Total
  1          966       621       998     2585
         1146.87    603.54    834.59
  2         4653      2336      3091    10080
         4472.13   2353.46   3254.41
  Total     5619      2957      4089    12665
  ChiSq = 28.525 + 0.505 + 31.996 + 7.315 + 0.130 + 8.205 = 76.675
  df = 2, p = 0.000
Chapter 3 Solutions

3.1. Any group of friends is unlikely to include a representative cross section of all students.

3.2. Political speeches provide a good source of examples.

3.3. A hard-core runner (and her friends) are not representative of all young people.

3.4. The performance of one car is anecdotal evidence—a “haphazardly selected individual
case.” People tend to remember—and re-tell—stories about extraordinary performance, while
cases of average or below-average reliability are typically forgotten.

3.5. For example, who owns the Web site? Do they have data to back up this statement, and if
so, what was the source of that data?

3.7. This is an experiment: Each subject is assigned to a treatment group (presumably at
random, although the description does not tell us this is the case). The explanatory variable
is the drug received; the response variables are adverse events, as well as some reaction.
(The nature of that reaction is not specified in the exercise, but they apparently collected
some information to indicate which subjects had an “effective response” to the vaccine.)

3.8. (a) No treatment is imposed on the subjects (children); they (or their parents) choose how
much TV they watch. The explanatory variable is hours watching TV, and the response
variable is “later aggressive behavior.” (b) An adolescent who watches a lot of television
probably is more likely to spend less time doing homework, playing sports, or having social
interactions with peers. He or she may also have less contact with or guidance from his/her
parents.

3.9. This is an experiment: Each subject is (presumably randomly) assigned to a group, each
with its own treatment (computer animation or reading the textbook). The explanatory
variable is the teaching method, and the response variable is the change in each student’s
test score.

3.10. This is an experiment, assuming the order of treatment given to each subject was
randomly determined. The explanatory variable is the form of the apple (whole, or juice),
and the response variable is how full the subjects felt.
Note: This is a matched pairs experiment, described on page 181 of the text.

3.11. The experimental units are food samples, the treatment is exposure to different levels of
radiation, and the response variable is the amount of lipid oxidation. Note that in a study
with only one factor—like this one—the treatments and factor levels are essentially the same
thing: The factor is varying radiation exposure, with nine levels.
It is hard to say how much this will generalize; it seems likely that different lipids react
to radiation differently.


3.12. This is an experiment because the experimental units (students) are randomly assigned to
a treatment group. Note that in a study with only one factor—like this one—the treatments
and factor levels are essentially the same thing: There are two treatments/levels of the factor
“instruction method.” The response variable is the change in score on the standardized test.
The results of this experiment should generalize to other classes (on the same topic)
taught by the same instructor, but might not apply to other subject matter, or to classes
taught by other instructors.

3.13. Those who volunteer to use the software may be better students (or worse). Even if
we cannot decide the direction of the bias (better or worse), the lack of random allocation
means that the conclusions we can draw from this study are limited at best.

3.14. Diagram: Random assignment → Group 1 (25 students) → Treatment 1: paper-based
instruction; Group 2 (25 students) → Treatment 2: Web-based instruction; then compare the
change in test scores.

3.15. Because there are nine levels, this diagram is rather large (and repetitive), so only the top
three branches are shown.
Diagram: Random assignment → Group 1 (3 samples) → Treatment 1: radiation level 1;
Group 2 (3 samples) → Treatment 2: radiation level 2; Group 3 (3 samples) → Treatment 3:
radiation level 3; then compare lipid oxidation.

3.16. The results will depend on the software used.

3.17. (a) Shopping patterns may differ on Friday and Saturday, which would make it hard to
determine the true effect of each promotion. (That is, the effect of the promotion would be
confounded with the effect of the day.) To correct this, we could offer one promotion on a
Friday, and the other on the following Friday. (Or, we could do as described in the exercise,
and then on the next weekend, swap the order of the offers.) (b) Responses may vary in
different states. To control for this, we could launch both campaigns in (separate) parts of
the same state or states. (c) A control is needed for comparison; if we simply compare
this year’s yield to last year’s yield, we will not know how much of the difference can be
attributed to changes in the economy. We should compare the new strategy’s yield with
another investment portfolio using the old strategy.
Note: For part (c), this comparison might be done without actually buying or selling
anything; we could simply compute how much money would have been made if we had
followed the new strategy; that is, we keep a “virtual portfolio.” This assumes that our
buying and selling is on a small enough scale that it does not affect market prices.

3.18. (a) Assigning subjects by gender is not random. It would be better to treat gender
as a blocking variable, assigning five men and five women to each treatment. (b) This
randomization will not necessarily divide the subjects into two groups of five. (Note that it
would be a valid randomization to use this method until one group had four subjects, and
then assign any remaining subjects to the other group.) (c) The 10 rats in a batch might be
similar to one another in some way. For example, they might be siblings, or they might have
been exposed to unusual conditions during shipping. (The safest approach in this situation
would be to treat each batch as a block, and randomly assign two or three rats from each
batch to each treatment.)

3.19. The experiment can be single-blind (those evaluating the exams should not know which
teaching approach was used), but not double-blind, because the students will know which
treatment (teaching method) was assigned to them.

3.20. For example, we might block by gender, by year in school, or by housing type
(dorm/off-campus/Greek).

3.21. For example, new employees should be randomly assigned to either the current
program or the new one. There are many possible choices for outcome variables, such as
performance on a test of the information covered in the program, or a satisfaction survey or
other evaluation of the program by those who went through it.

3.22. Subjects—perhaps recruited from people suffering from chronic pain, or those recovering
from surgery or an injury—should be randomly assigned to be treated with magnets, or a
placebo (an object similar to the magnets, except that it is not magnetic). Students should
address some of the practical difficulties of such an experiment, such as: How does one
measure pain relief? How can we prevent subjects from determining whether they are
being treated with a magnet? (For the latter question, we might apply the treatments in a
controlled setting, making sure that there is nothing metal with which the subjects could test
their treatment object.)

3.23. (a) The factors are calcium dose and vitamin D dose. There are 9 treatments (each
calcium/vitamin D combination). (b) Assign 20 students to each group, with 10 of each
gender. The complete diagram (including the blocking step) would have a total of 18
branches; below is an outline of that diagram, showing only three of the nine branches for
each gender. (c) Randomization results will vary.
Diagram: Subjects are first separated by gender (men, women); within each gender block:
random assignment → Group 1 → Treatment 1; Group 2 → Treatment 2; Group 3 →
Treatment 3; then measure TBBMC.

3.24. Students may need guidance as to how to illustrate these interactions. Figure 3.7 shows
one such illustration (as part of Exercise 3.40). Shown below are two possible illustrations,
based on made-up data (note there is no scale on the vertical axis). In the first, we see
that the effect of vitamin D on TBBMC depends on the calcium dose. In the second, we
see little variation in men’s TBBMC across all nine treatments, while women’s TBBMC
appears to depend on the treatment group. (In particular, women’s TBBMC is greatest for
treatment 9, with the highest doses of both calcium and vitamin D.)

[Illustrations: TBBMC level vs. calcium dose for 0, 50, and 100 IU vitamin D (left); TBBMC level across the nine treatments for men and women (right).]

3.25. (a) For example, flip a coin for each customer to choose which variety (s)he will taste.
To evaluate preferences, we would need to design some scale for customers to rate the
coffee they tasted, and then compare the ratings. (b) For example, flip a coin for each
customer to choose which variety (s)he will taste first. Ask which of the two coffees (s)he
preferred. (c) The matched-pairs version described in part (b) is the stronger design; if each
customer tastes both varieties, we only need to ask which was preferred. In part (a), we
might ask customers to rate the coffee they tasted on a scale of (say) 1 to 10, but such
ratings can be wildly variable (one person’s “5” might be another person’s “8”), which
makes comparison of the two varieties more difficult.

3.26. An experiment would be nearly impossible in this case, unless the publishers of Sports
Illustrated would allow you to choose the cover photo for several issues. (The ideal design
would involve taking a collection of currently-successful teams or individuals, and randomly
assigning half to be on the cover, then observing their fortunes over the next few months.)

3.27. Experimental units: pine tree seedlings. Factor: amount of light. Treatments (two): full
light, or shaded to 5% of normal. Response variable: dry weight at end of study.

3.28. Experimental units: Middle schools. Factors: Physical activity program, and nutrition
program. Treatments (four): Activity intervention, nutrition intervention, both interventions,
and neither intervention. Response variables: Physical activity and lunchtime consumption
of fat.

3.29. Subjects: adults (or registered voters) from selected households. Factors: level of
identification, and offer of survey results. Treatments (six): interviewer’s name with results,
interviewer’s name without results, university name with results, university name without
results, both names with results, both names without results. Response variable: whether or
not the interview is completed.

3.30. (a) The subjects are the physicians, the factor is medication (aspirin or placebo), and the
response variable is health, specifically whether the subjects have heart attacks. (b) Below.
(c) The difference in the number of heart attacks between the two groups was so great that
it would rarely occur by chance if aspirin had no effect.

Diagram: Random assignment → Group 1 (11,000 physicians) → Treatment 1: aspirin;
Group 2 (11,000 physicians) → Treatment 2: placebo; then observe heart attacks.

3.31. Assign nine subjects to treatment 1, then nine more to treatment 2, etc. A diagram is on
the next page; if we assign labels 01 through 36, then line 151 gives:

Group 1: 03 Bezawada, 12 Hatfield, 22 Mi, 11 Guha, 29 Shu, 31 Towers, 26 Paul, 21 Mehta, 01 Anderson
Group 2: 32 Tyner, 27 Rau, 30 Tang, 20 Martin, 09 Daye, 06 Chronopoulou, 23 Nolan, 33 Vassilev, 07 Codrington
Group 3: 05 Cheng, 13 Hua, 16 Leaf, 25 Park, 28 Saygin, 19 Lu, 10 Engelbrecht, 04 Cetin, 18 Lipka
The other nine subjects are in Group 4. The names listed here assume that labels are
assigned alphabetically (across the rows). See note on page 50 about using Table B.

Diagram: Random assignment → Group 1 (9 subjects) → antidepressant; Group 2
(9 subjects) → antidepressant plus stress management; Group 3 (9 subjects) → placebo;
Group 4 (9 subjects) → placebo plus stress management; then observe the number and
severity of headaches.

3.32. (a) A diagram is shown below. (b) Label the subjects from 01 through 20. From
line 101, we choose:
19, 05, 13, 17, 09, 07, 02, 01, 18, and 14
Assuming that labels are assigned alphabetically, that is Wayman, Cunningham, Mitchell,
Seele, Knapp, Fein, Brifcani, Becker, Truong, and Ponder for one group, and the rest for the
other. See note on page 50 about using Table B.

Diagram: Random assignment → Group 1 (10 subjects) → Treatment 1: strong marijuana;
Group 2 (10 subjects) → Treatment 2: weak marijuana; then observe work output and
earnings.
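Software is an alternative to Table B for this randomization; a minimal Python sketch (the seed is an arbitrary choice, so the labels it selects will differ from those chosen with line 101):

  import random

  subjects = list(range(1, 21))      # labels 01-20
  rng = random.Random(20)            # hypothetical seed, for reproducibility
  group1 = sorted(rng.sample(subjects, 10))     # strong-marijuana group
  group2 = sorted(set(subjects) - set(group1))  # weak-marijuana group
  print(group1)
  print(group2)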

3.33. Students might envision different treatments; one possibility is to have some volunteers
go through a training session, while others are given a written set of instructions, or watch a
video. For the response variable(s), we need some measure of training effectiveness; perhaps
we could have the volunteers analyze a sample of lake water and compare their results to
some standard.

3.34. (a) Diagram below. (b) Using line 153 from Table B, the first four subjects are 07, 88,
65, and 68. See note on page 50 about using Table B.
Diagram: Random assignment → Group 1 (30 subjects) → accomplice fired because he/she
did poorly; Group 2 (30 subjects) → accomplice randomly fired; Group 3 (30 subjects) →
both continue to work; then observe performance after the break.

3.35. Diagram below. Starting at line 160, we choose:


16, 21, 06, 12, 02, 04 for Group 1
14, 15, 23, 11, 09, 03 for Group 2
07, 24, 17, 22, 01, 13 for Group 3
The rest are assigned to Group 4. See note on page 50 about using Table B.

Diagram: Random assignment → Group 1 (6 schools) → physical activity intervention;
Group 2 (6 schools) → nutrition intervention; Group 3 (6 schools) → both interventions;
Group 4 (6 schools) → neither intervention; then observe physical activity and lunchtime fat
consumption.

3.36. (a) The table below shows the 16 treatments—four levels for each of the two factors.
(b) A diagram is not shown here (with 16 treatments, it is quite large). Six subjects are
randomly assigned to each treatment; they read the ad for that treatment, and we record their
attractiveness ratings for the ad. Using line 111, the first six subjects are 81, 48, 66, 94, 87,
and 60.

                               Factor B: fraction of shoes on sale
                               25%   50%   75%   100%
  Factor A:            20%      1     2     3     4
  discount level       40%      5     6     7     8
                       60%      9    10    11    12
                       80%     13    14    15    16

3.37. (a) Population = 1 to 150 , Select a sample of size 25 , click Reset and Sample .
(b) Without resetting, click Sample again. (c) Click Sample three more times.

3.38. Population = 1 to 40 , Select a sample of size 20 , then click Reset and Sample .

3.39. Design (a) is an experiment. Because the treatment is randomly assigned, the effect of
other habits would be “diluted” because they would be more-or-less equally split between
the two groups. Therefore, any difference in colon health between the two groups could be
attributed to the treatment (bee pollen or not).
Design (b) is an observational study. It is flawed because the women observed chose
whether or not to take bee pollen; one might reasonably expect that people who choose to
take bee pollen have other dietary or health habits that would differ from those who do not.

3.40. For a range of discounts, the attractiveness of the sale decreases slightly as the
percentage of goods on sale increases. (The decrease is so small that it might not be
significant.) With precise discounts, on the other hand, mean attractiveness increases with
the percentage on sale. Range discounts are more attractive when only 25% of goods are
marked down, while the precise discount is more attractive if 75% or 100% of goods are
discounted.

3.41. As described, there are two factors: ZIP code (three levels: none, five-digit, nine-digit)
and the day on which the letter is mailed (three levels: Monday, Thursday, or Saturday) for
a total of nine treatments. To control lurking variables, aside from mailing all letters to the
same address, all letters should be the same size, and either printed in the same handwriting
or typed. The design should also specify how many letters will be in each treatment group.
Also, the letters should be sent randomly over many weeks.

3.42. Results will vary, but probability computations reveal that more than 97% of samples
will have 7 to 13 fast-reacting subjects (and 99.6% of samples have 8 to 14 fast-reacting
subjects). Additionally, if students average their 10 samples, nearly all students (more than
99%) should find that the average number of fast-reacting subjects is between 8.5 and 11.5.
Note: X, the number of fast-reacting subjects in the sample, has a hypergeometric
distribution with parameters N = 40, r = 20, n = 20, so that P(7 ≤ X ≤ 13) ≈ 0.974. The
theoretical average number of fast-reacting subjects is 10.
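A minimal Python sketch of this probability, using scipy's hypergeometric distribution:

  from scipy.stats import hypergeom

  # Population of 40 subjects, 20 of them fast-reacting; sample 20 of the 40.
  X = hypergeom(M=40, n=20, N=20)
  prob = X.cdf(13) - X.cdf(6)      # P(7 <= X <= 13)
  print(round(prob, 3), X.mean())  # about 0.974 and 10.0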

3.43. Each player will be put through the sequence (100 yards, four times) twice—once with
oxygen and once without, and we will observe the difference in their times on the final run.
(If oxygen speeds recovery, we would expect that the oxygen-boosted time will be lower.)
Randomly assign half of the players to use oxygen on the first trial, while the rest use it on
the second trial. Trials should be on different days to allow ample time for full recovery.
If we label the players 01 through 20 and begin on line 140, we choose 12, 13, 04, 18,
19, 16, 02, 08, 17, 10 to be in the oxygen-first group. See note on page 50 about using
Table B.

3.44. The sketches requested in the problem are not shown here; random assignments will vary
among students. (a) Label the circles 1 to 6, then randomly select three (using Table B, or
simply by rolling a die) to receive the extra CO2 . Observe the growth in all six regions,
and compare the mean growth within the three treated circles with the mean growth in the
other three (control) circles. (b) Select pairs of circles in each of three different areas of the
forest. For each pair, randomly select one circle to receive the extra CO2 (using Table B or
by flipping a coin). For each pair, compute the difference in growth (treated minus control).

3.45. (a) Randomly assign half the girls to get high-calcium punch; the other half will get
low-calcium punch. The response variable is not clearly described in this exercise; the best
we can say is “observe how the calcium is processed.” (b) Randomly select half of the
girls to receive high-calcium punch first (and low-calcium punch later), while the other half
gets low-calcium punch first (followed by high-calcium punch). For each subject, compute
the difference in the response variable for each level. This is a better design because it
deals with person-to-person variation; the differences in responses for 60 individuals gives
more precise results than the difference in the average responses for two groups of 30

subjects. (c) The first five subjects are 38, 44, 18, 33, and 46. In the CR design, the first
group receives high-calcium punch all summer; in the matched pairs design, they receive
high-calcium punch for the first part of the summer, and then low-calcium punch in the
second half.

3.46. (a) False. Such regularity holds only in the long run. If it were true, you could look at
the first 39 digits and know whether or not the 40th was a 0. (b) True. All pairs of digits
(there are 100, from 00 to 99) are equally likely. (c) False. Four random digits have chance
1/10000 to be 0000, so this sequence will occasionally occur. 0000 is no more or less
random than 1234 or 2718 or any other four-digit sequence.

3.47. (a) This is a block design. (b) The diagram might be similar to the one below (which
assumes equal numbers of subjects in each group). (c) The results observed in this study
would rarely have occurred by chance if vitamin C were ineffective.

[Diagram: the n runners and n nonrunners form separate blocks; within each block, subjects are randomly assigned to Group 1 (Treatment 1: vitamin C) or Group 2 (Treatment 2: placebo), and all groups are watched for infections.]

3.48. The population is faculty members at Mongolian public universities, and the sample is
the 300 faculty members to whom the survey was sent. Because we do not know how many
responses were received, we cannot determine the response rate.
Note: We might consider the population to be either the 2500 faculty members on the
list, or the larger group of “all current and future faculty members.” In the latter case,
those on the list constitute the sampling frame—the subset of the population from which our
sample will be selected.
The sample might be considered to be only those faculty who actually responded to the
survey (rather than the 300 selected), because that is the actual group from which we “draw
conclusions about the whole.”

3.49. The population is all forest owners in the region. The sample is the 772 forest owners
contacted. The response rate is 348/772 ≈ 45%. Aside from the given information, we would
like to know the sample design (and perhaps some other things).
Note: It would also be reasonable to consider the sample to be the 348 forest owners
who returned the survey, because that is the actual group from which we “draw conclusions
about the whole.”

3.50. To use Table B, number the list from 0 to 9 and choose two single digits. (One can
also assign labels 01–10, but that would require two-digit numbers, and we would almost
certainly end up skipping over many pairs of digits before we found two in the desired
range.)
It is worth noting that choosing an SRS is often described as “pulling names out of a
hat.” For long lists, it is often impractical to do this literally, but with such a small list, one
really could write each ringtone on a slip of paper and choose two slips at random.

3.51. See the solution to the previous exercise; for this problem, we need to choose three items
instead of two, but it is otherwise the same.

3.52. (a) This is a multistage sample: We first sample three of the seven course sections, then
eight from each chosen section. (b) This is an SRS: Each student has the same chance
(5/55) of being selected. (c) This is a voluntary response sample: Only those who visit the
site can participate (if they choose to). (d) This is a stratified random sample: Males and
females (the strata) are sampled separately.

3.53. (a) This statement confuses the ideas of population and sample. (If the entire population
is found in our sample, we have a census rather than a sample.) (b) “Dihydrogen monoxide”
is H2 O. Any concern about the dangers posed by water most likely means that the
respondent did not know what dihydrogen monoxide was, and was too embarrassed to admit
it. (Conceivably, the respondent knew the question was about water and had concerns arising
from a bad experience of flood damage or near-drowning. But misunderstanding seems
to be more likely.) (c) Honest answers to such questions are difficult to obtain even in an
anonymous survey; in a public setting like this, it would be surprising if there were any
raised hands (even though there are likely to be at least a few cheaters in the room).

3.54. (a) The content of a single chapter is not random; choose random words from random
pages. (b) Students who are registered for an early-morning class might have different
characteristics from those who avoid such classes. (c) Alphabetic order is not random. One
problem is that the sample might include people with the same last name (siblings, spouses,
etc.). Additionally, some last names tend to be more common in some ethnic groups.

3.55. The population is (all) local businesses. The sample is the 73 businesses that return the
questionnaire, or the 150 businesses selected. The nonresponse rate is 77/150 ≈ 51.3%.
Note: The definition of “sample” makes it somewhat unclear whether the sample
includes all the businesses selected or only those that responded. My inclination is toward
the latter (the smaller group), which is consistent with the idea that the sample is “a part of
the population that we actually examine.”

3.56. (a) Shown is a possible display (as a bar graph). [Bar graph: average spending ($), from 0 to about 70, by day of week, Monday through Sunday.] A plot similar to Figure 3.7 would tell the same story—namely, that spending is highest on the weekends, and drops during the middle of the week. Ideally, bars should be in chronological order, although students might choose to start with Sunday rather than Monday. Some students might arrange bars in increasing or decreasing order of height. (b) The exclusion of the Christmas shopping season might impact the numbers. In addition, much of this period fell during an economic recession.

3.57. Note that the numbers add to 100% down the columns; that is, 39% is the percent of Fox
viewers who are Republicans, not the percent of Republicans who watch Fox.
Students might display the data using a stacked bar graph like the one below, or
side-by-side bars. (They could also make four pie charts, but comparing slices across pies is
difficult.) The most obvious observation is that the party identification of Fox’s audience is
noticeably different from the other three sources.
[Stacked bar graph: percent of viewers identifying as Republican, Democrat, Independent, or Other/Don't know, for Fox, CNN, MSNBC, and network news.]

3.58. Exact descriptions of the populations may vary. (a) All current students—or perhaps all
current students who were enrolled during the year prior to the change. (The latter would be
appropriate if we want opinions based on a comparison between the new and old curricula.)
(b) All U.S. households. (c) Adult residents of the United States.

3.59. Numbering from 01 to 33 alphabetically (down the columns), we enter Table B at line 137 and choose:
12 = Country View, 14 = Crestview, 11 = Country Squire, 16 = Fairington, 08 = Burberry
See note on page 50 about using Table B.

3.60. Assign labels 001 to 200. To use Table B, take three digits at a time beginning on
line 112; the first five pixels are 089, 064, 032, 117, and 003.

3.61. Population = 1 to 200 , Select a sample of size 25 , then click Reset and Sample .

3.62. With the applet: Population = 1 to 371 , Select a sample of size 25 , then click
Reset and Sample . With Table B, line 120 gives the codes labeled 354, 239, 193, 099,
and 262.

3.63. One could use the labels already assigned to the blocks, but that would mean skipping
a lot of four-digit combinations that do not correspond to any block. An alternative would
be to drop the second digit and use labels 100–105, 200–211, and 300–325. But by far the
simplest approach is to assign labels 01–44 (in numerical order by the four-digit numbers
already assigned), enter the table at line 135, and select:
39 (block 3020), 10 (2003), 07 (2000), 11 (2004), and 20 (3001)
See note on page 50 about using Table B.

3.64. If one always begins at the same place, then the results could not really be called
random.

3.65. The sample will vary with the starting line in Table B. The simplest method is to use
the last digit of the numbers assigned to the blocks in Group 1 (that is, assign the labels
0–5), then choose one of those blocks; use the last two digits of the blocks in Group 2
(00–11) and choose two of those, and finally use the last two digits of the blocks in Group
3 (00–25) and choose three of them.

3.66. (a) If we choose one of the first 45 students and then every 45th name after that, we
will have a total of 9000/45 = 200 names. (b) Label the first 45 names 01–45. Beginning at
line 125, the first number we find is 21, so we choose names 21, 66, 111, . . . .
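Note: A minimal sketch of this systematic sample in Python (assuming the names are numbered 1 to 9000):

    # Choose one of the first 45 names at random, then every 45th name after it.
    import random

    start = random.randint(1, 45)         # Table B, line 125, gave start = 21
    sample = list(range(start, 9001, 45))
    print(len(sample))                    # always 200 names, e.g., 21, 66, 111, ...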

3.67. Considering the 9000 students of Exercise 3.66, each student is equally likely;
specifically, each name has chance 1/45 of being selected. To see this, note that each of the
first 45 has chance 1/45 because one is chosen at random. But each student in the second
45 is chosen exactly when the corresponding student in the first 45 is, so each of the second
45 also has chance 1/45. And so on.
This is not an SRS because the only possible samples have exactly one name from
the first 45, one name from the second 45, and so on; that is, there are only 45 possible
samples. An SRS could contain any 200 of the 9000 students in the population.

3.68. (a) This is a stratified random sample. (b) Label from 01 through 27; beginning at
line 122, we choose:
13 (805), 15 (760), 05 (916), 09 (510), 08 (925),
27 (619), 07 (415), 10 (650), 25 (909), and 23 (310)
Note: The area codes are in north-south order if we read across the rows; that is how they
were labeled for this solution. Students might label down rather than across; the sample
should include the same set of labels but a different list of area codes.

3.69. Assign labels 01–36 for the Climax 1 group, 01–72 for the Climax 2 group, and so on.
Then beginning at line 140, choose:
12, 32, 13, 04 from the Climax 1 group and (continuing on in Table B)
51, 44, 72, 32, 18, 19, 40 from the Climax 2 group
24, 28, 23 from the Climax 3 group and
29, 12, 16, 25 from the mature secondary group
See note on page 50 about using Table B.

3.70. Label the students 01, . . . , 30 and use Table B. Then label the faculty 0, . . . , 9 and use
the table again. (You could also label the faculty from 01 to 10, but that would needlessly
require two-digit labels.)
Note: Students often try some fallacious method of choosing both samples
simultaneously. We simply want to choose two separate SRSs: one from the students and one
from the faculty. See note on page 50 about using Table B.

3.71. Each student has a 10% chance: 3 out of 30 over-21 students, and 2 of 20 under-21
students. This is not an SRS because not every group of 5 students can be chosen; the only
possible samples are those with 3 older and 2 younger students.

3.72. Label the 500 midsize accounts from 001 to 500, and the 4400 small accounts from
0001 to 4400. On line 115, we first encounter numbers 417, 494, 322, 247, and 097 for the
midsize group, then 3698, 1452, 2605, 2480, and 3716 for the small group. See note on
page 50 about using Table B.

3.73. (a) This design would omit households without telephones or with unlisted numbers.
Such households would likely be made up of poor individuals (who cannot afford a phone),
those who choose not to have phones, and those who do not wish to have their phone
numbers published. (b) Those with unlisted numbers would be included in the sampling
frame when a random-digit dialer is used.

3.74. (a) This will almost certainly produce a positive response because it draws the dubious
conclusion that cell phones cause brain cancer. Some people who drive cars, or eat carrots,
or vote Republican develop brain cancer, too. Do we conclude that these activities should
come with warning labels, also? (b) The phrasing of this question will tend to make
people respond in favor of national health insurance: It lists two benefits of such a system,
and no arguments from the other side of the issue. (c) This sentence is so convoluted
and complicated that it is almost unreadable; it is also vague (what sort of “economic
incentives”? How much would this cost?). A better phrasing might be, “Would you be
willing to pay more for the products you buy if the extra cost were used to conserve
resources by encouraging recycling?” That is still vague, but less so, and is written in plain
English.

3.75. The first wording brought the higher numbers in favor of a tax cut; “new government
programs” has considerably less appeal than the list of specific programs given in the second
wording.

3.76. Children from larger families will be overrepresented in such a sample. Student
explanations of why will vary; a simple illustration can aid in understanding this effect.
Suppose that there are 100 families with children; 60 families have one child and the other
40 have three. Then there are a total of 180 children (an average of 1.8 children per family),
and two-thirds (120) of those children come from families with three children. Therefore, if
we had a class (a sample) chosen from these 180 children, only one-third of the class would
answer “one” to the teacher’s question, and the rest would say “three.” This would give an
average of about 2.3 children per family.

3.78. Responses to public opinion polls can be affected by things like the wording of the
question, as was the case here: Both statements address the question of how to distribute
wealth in a society, but subtle (and not-so-subtle) slants in the wording suggest that the
public holds conflicting opinions on the subjects.

3.79. The population is undergraduate college students. The sample is the 2036 students. (We
assume they were randomly selected.)

3.80. No; this is a voluntary response sample. The procedures described in the text apply to
data gathered from an SRS.

3.81. The larger sample would have less sampling variability. (That is, the results would have a
higher probability of being closer to the “truth.”)

3.82. (a) Parameters are associated with the population; statistics describe samples. (b) Bias
means that the center of sampling distribution is not equal to the true value of the
parameter; that is, bias is systematic under- or over-estimation. Variability refers to the
spread (not the center) of the sampling distribution. (c) Large samples generally have lower
variability, but if the samples are biased, that lower variability is of little use. (In addition,
larger samples generally come at a cost; the added cost might not justify the decrease in
variability.) (d) A sampling distribution might be visualized (or even simulated) with a
computer, but it arises from the process of sampling, not from computation.

3.83. (a) Population: college students. Sample: 17,096 students. (b) Population: restaurant
workers. Sample: 100 workers. (c) Population: longleaf pine trees. Sample: 40 trees.

3.84. (a) High bias, high variability (many are low, wide scatter). (b) Low bias, low variability
(close to parameter, little scatter). (c) Low bias, high variability (neither too low nor too
high, wide scatter). (d) High bias, low variability (too high, little scatter).
Note: Make sure that students understand that “high bias” means that the values are far
from the parameter, not that they are too high.

3.85. (a) The scores will vary depending on the starting row. Note that the smallest possible
mean is 5.25 (from the sample 3, 5, 6, 7) and the largest is 11.5 (from 9, 10, 12, 15).
(b) Answers will vary. [Histogram: the exact sampling distribution, showing the frequency of
every possible value of the sample mean (the first rectangle is for 5.25, the next for 5.5,
etc.).] Note that it looks roughly Normal; if we had taken a larger sample from a larger
population, it would appear even more Normal.
Note: This histogram was found by considering all (10 choose 4) = 210 of the possible samples. A
collection of only 10 random samples will, of course, be considerably less detailed.
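Note: The exact sampling distribution can be enumerated directly. A minimal sketch in Python; the score list here is only a stand-in, since the exercise's ten values are not reproduced in this solution:

    # Tally the means of all C(10,4) = 210 samples of size 4.
    from itertools import combinations
    from collections import Counter

    scores = [3, 5, 6, 7, 8, 8, 9, 10, 12, 15]    # stand-in for the ten scores
    means = Counter(sum(s) / 4 for s in combinations(scores, 4))
    for m in sorted(means):
        print(m, means[m])                        # frequency of each sample mean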

3.86. No: With sufficiently large populations (“at least 100 times larger than the sample”), the
variability (and margin of error) depends on the sample size.

3.87. (a) This is a multistage sample. (b) Attitudes in smaller countries (many of which were
not surveyed) might be different. (c) An individual country’s reported percent will typically
differ from its true percent by no more than the stated margin of error. (The margins of
error differ among the countries because the sample sizes were not all the same.)
Note: The number of countries in the world is about 195 (the exact number depends on
the criteria of what constitutes a separate country). That means that about 60 countries are
not represented in this survey.

3.88. (a) The population is Ontario residents; the sample is the 61,239 people interviewed.
(b) The sample size is very large, so if there were large numbers of both sexes in the
sample—this is a safe assumption because we are told this is a “random sample”—these two
numbers should be fairly accurate reflections of the values for the whole population.

3.89. (a) The histogram should be centered at about 0.6 (with quite a bit of spread). For
reference, the theoretical histogram is shown below on the left; student results should have a
similar appearance. (b) The histogram should be centered at about 0.2 (with quite a bit of
spread). The theoretical histogram is shown below on the right.

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

3.90. (a) [Histogram: the theoretical sampling distribution, on 0 to 1, shown for reference.]
(b) [Histogram: a second theoretical sampling distribution, also on 0 to 1.] Students should
observe that their two stemplots have clearly different centers (near 0.6 and 0.3,
respectively) but similar spreads. (c) [Histogram: a third theoretical sampling distribution.]
Compared to the distribution of part (a), this has the same center but is about half as wide;
that is, the spread is about half as much when the sample size is multiplied by 4. (The
vertical scale of this graph is not the same as the other two; it should be about twice as
tall, since it is only about half as wide.)

3.91. (a) The scores will vary depending on the starting row. Note that the smallest possible
mean is 61.75 (from the sample 58, 62, 62, 65) and the largest is 77.25 (from 73, 74, 80,
82). (b) Answers will vary. [Two histograms show the exact sampling distribution: the first
has a rectangle for each possible value of the experiment (61.75, 62.00, etc.; tallest
rectangle 8 units), while the second groups values from 61 to 61.75, 62 to 62.75, etc.
(tallest rectangle 28 units), which makes the histogram less bumpy.]
Note: These histograms were found by considering all (10 choose 4) = 210 of the possible
samples. It happens that half (105) of those samples yield a mean smaller than 69.4 and
half yield a greater mean.



3.92. Student results will vary greatly, and ten values of x̄ will give little indication of the
appearance of the sampling distribution. In fact, the sampling distribution of x̄ is
approximately Normal with a mean of 50.5 and a standard deviation of about 8.92. [Figure:
this approximating Normal curve, marked at 23.74, 32.66, 41.58, 50.5, 59.42, 68.34, 77.26.]
Therefore, nearly every sample of size 10 would yield a mean between 23 and 78.
The shape of the sampling distribution becomes more apparent if the results of many
students are pooled. [Figure: a histogram of 300 sample means, as might arise from pooling
all the results in a class of 30.]
Note: Because the values in these samples are not independent (there can be no repeats),
a stronger version of the central limit theorem is needed to determine that the sampling
distribution is approximately Normal. Confirming the standard deviation given above is a
reasonably difficult exercise even for a mathematics major.

3.93. (a) Below is the population stemplot (which gives the same information as a histogram).
The (population) mean GPA is µ ≈ 2.6352, and the standard deviation is σ ≈ 0.7794.
[Technically, we should take σ ≈ 0.7777, which comes from dividing by n rather than
n − 1, but few (if any) students would know this, and it has little effect on the results.]
(b) & (c) Results will vary; these histograms are not shown. Not every sample of size 20
could be viewed as "generally representative of the population," but most should bear at
least some resemblance to the population distribution.
0 134
0 567889
1 0011233444
1 5566667888888888999999
2 000000000111111111222222222333333333444444444
2 5555555555555666666667777777777777788888888888888999999
3 0000000000000011111111112222222223333333333333333444444444
3 556666666677777788889
4 0000

3.94. (a) [Shown for reference: a histogram of the approximate sampling distribution of x̄,
on roughly 2 to 3.1.] This distribution is difficult to find exactly, but based on 1000
simulated samples, it is approximately Normal with mean 2.6352 (the same as µ) and
standard deviation about 0.167. (Therefore, x̄ will almost always be between 2.13 and 3.14.)
(b) Results may vary, but most students should see no strong suggestion of bias. (c) Student
means and standard deviations will vary, but for most (if not all) students, their values
should meet the expectations (close to µ ≈ 2.6352 and less than σ ≈ 0.78).
Note: Observe that the distribution of x̄ is slightly left-skewed, but less skewed than the
population distribution. Also note that the standard deviation of the sampling distribution
is smaller than σ/√20 ≈ 0.174, since we are sampling without replacement.

3.95. (a) Answers will vary. If, for example, eight heads are observed, then
p̂ = 8/20 = 0.4 = 40%. (b) Note that all the leaves in the stemplot should be either 0
or 5, since all possible p̂-values end in 0 or 5. [For comparison: a histogram of the
sampling distribution on 0 to 1, assuming p really is 0.5.] An individual student's stemplot
will probably only roughly approximate this distribution, but pooled efforts should be fairly close.
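Note: The pooled result can be previewed by simulation. A minimal sketch in Python, assuming p really is 0.5 (the seed is an arbitrary choice):

    # Simulate the sampling distribution of p-hat for samples of 20 flips.
    import random
    from collections import Counter

    random.seed(1)                          # arbitrary choice
    phats = Counter()
    for _ in range(1000):                   # many pooled samples
        heads = sum(random.random() < 0.5 for _ in range(20))
        phats[heads / 20] += 1
    for p in sorted(phats):
        print(p, phats[p])                  # every p-hat ends in .00 or .05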

Many of the questions in Section 3.4 (Ethics), Exercises 3.96–3.117, are matters of opinion and
may be better used for class discussion rather than as assigned homework. A few comments
are included here.

3.96. These three proposals are clearly in increasing order of risk. Most students will likely
consider that (a) qualifies as minimal risk, and most will agree that (c) goes beyond minimal
risk.

3.97. (a) A nonscientist might raise different viewpoints and concerns from those considered
by scientists. (b) Answers will vary.

3.98. It is good to plainly state the purpose of the research (“To study how people’s religious
beliefs and their feelings about authority are related”). Stating the research thesis (that
orthodox religious beliefs are associated with authoritarian personalities) would cause bias.

3.102. (a) Ethical issues include informed consent and confidentiality; random assignment
generally is not an ethical consideration. (b) “Once research begins, the board monitors its
progress at least once a year.” (c) Harm need not be physical; psychological harm also needs
to be considered.

3.103. To control for changes in the mass spectrometer over time, we should alternate between
control and cancer samples.

3.105. The articles are “Facebook and academic performance: Reconciling a media sensation
with data” (Josh Pasek, eian more, Eszter Hargittai), a critique of the first article called “A
response to reconciling a media sensation with data” (Aryn C. Karpinski), and a response to
the critique (“Some clarifications on the Facebook-GPA study and Karpinski’s response”) by
the original authors. In case these articles are not available at the address given in the text,
they might be found elsewhere with a Web search.

3.106. (a) The simplest approach is to label from 00001 through 14959 and then take five
digits at a time from the table. A few clever students might think of some ways to make
this process more efficient, such as taking the first random digit chosen as “0” if it is
even and “1” if odd. (This way, fewer numbers need to be ignored.) (b) Using labels
00001–14959, we choose 05995, 06788, and 14293. Students who try an alternate approach
may have a different sample.

3.108. (a) A sample survey: We want to gather information about a population (U.S. residents)
based on a sample. (b) An experiment: We want to establish a cause-and-effect relationship
between teaching method and amount learned. (c) An observational study: There is no
particular population from which we will sample; we simply observe “your teachers,” much
like an animal behavioral specialist might study animals in the wild.

3.111. They cannot be anonymous because the interviews are conducted in person in the
subject’s home. They are certainly kept confidential.
Note: For more information about this survey, see the GSS Web site:
www.norc.org/GSS+Website

3.112. This offers anonymity, since names are never revealed. (However, faces are seen, so
there may be some chance of someone’s identity becoming known.)

3.116. (a) Those being surveyed should be told the kind of questions they will be asked
and the approximate amount of time required. (b) Giving the name and address of the
organization may give the respondents a sense that they have an avenue to complain should
they feel offended or mistreated by the pollster. (c) At the time that the questions are being
asked, knowing who is paying for a poll may introduce bias, perhaps due to nonresponse
(not wanting to give what might be considered a “wrong” answer). When information about
a poll is made public, though, the poll’s sponsor should be announced.

3.120. At norc.org, search for “Consumer Finances” or “SCF,” and from the SCF page,
click on the link to “website for SCF respondents.” At the time this manual was written, the
pledge was found at www.norc.org/scf2010/Confidentiality.html.

3.121. (a) You need information about a random selection of his games, not just the ones he
chooses to talk about. (b) These students may have chosen to sit in the front; all students
should be randomly assigned to their seats.

3.122. (a) A matched pairs design (two halves of the same board would have similar
properties). (b) A sample survey (with a stratified sample: smokers and nonsmokers). (c) A
block design (blocked by gender).

3.123. This is an experiment because each subject is (randomly, we assume) assigned to a
treatment. The explanatory variable is the price history seen by the subject (steady prices or
fluctuating prices), and the response variable is the price the subject expects to pay.

3.124. (a) For example, one could select (or recruit) a sample and assess each person’s calcium
intake (perhaps by having them record what they eat for a week), and measure his/her
TBBMD. (b) For example, measure each subject’s TBBMD, then randomly assign half the
subjects to take a calcium supplement, and the other half to take a placebo. After a suitable
period, measure TBBMD again. (c) The experiment, while more complicated, gives better
information about the relationship between these variables, because it controls for other
factors that may affect bone health.

3.126. Each subject should taste both kinds of fries in a randomly selected order and then
be asked about preference. One question to consider is whether they should have ketchup
available; many people typically eat fries with ketchup, and its presence or absence might
affect their preferences. If ketchup is used, should one use the same ketchup for both, or a
sample of the ketchup from each restaurant?

3.127. The two factors are gear (three levels) and steepness of the course (number of levels
not specified). Assuming there are at least three steepness levels—which seems like the
smallest reasonable choice—that means at least nine treatments. Randomization should be
used to determine the order in which the treatments are applied. Note that we must allow
ample recovery time between trials, and it would be best to have the rider try each treatment
several times.

3.129. (a) One possible population: all full-time undergraduate students in the fall term on a
list provided by the registrar. (b) A stratified sample with 125 students from each year is
one possibility. (c) Mailed (or emailed) questionnaires might have high nonresponse rates.
Telephone interviews exclude those without phones and may mean repeated calling for those
who are not home. Face-to-face interviews might be more costly than your funding will
allow. There might also be some response bias: Some students might be hesitant about
criticizing the faculty (while others might be far too eager to do so).

3.130. (a) For the two factors (administration method, with three levels, and dosage, with
two levels), the treatment combinations are shown in the table below, and the design is
diagrammed below.

             Injection   Skin patch   IV drip
     5 mg        1            2          3
    10 mg        4            5          6
(b) Larger samples give more information; in particular, with large samples, we reduce the
variability in the observed mean concentrations so that we can have more confidence that the
differences we might observe are due to the treatment applied rather than random fluctuation.

[Diagram: subjects are randomly assigned to six groups (Group 1: n subjects, 5 mg injected; Group 2: n subjects, 5 mg patch; Group 3: n subjects, 5 mg IV drip; Group 4: n subjects, 10 mg injected; Group 5: n subjects, 10 mg patch; Group 6: n subjects, 10 mg IV drip), and the concentration in the blood is measured after 30 minutes.]

3.131. Use a block design: Separate men and women, and randomly allocate each gender
among the six treatments.

The remaining exercises relate to the material of Section 3.4 (Ethics). Answers are given
for the first two; the rest call for student opinions, or information specific to the student’s
institution.

3.132. Parents who fail to return the consent form may be more likely to place less priority
on education and therefore may give their children less help with homework, and so forth.
Including those children in the control group is likely to lower that group’s score.
Note: This is a generalization, to be sure: We are not saying that every such parent does
not value education, only that the percentage of this group that highly values education will
almost certainly be lower than that percentage of the parents who return the form.

3.133. The latter method (CASI) will show a higher percentage of drug use because
respondents will generally be more comfortable (and more assured of anonymity) about
revealing embarrassing or illegal behavior to a computer than to a person, so they will be
more likely to be honest.
Chapter 4 Solutions

4.1. Only 6 of the first 20 digits on line 119 correspond to "heads," so the proportion of
heads is 6/20 = 0.3. With such a small sample, random variation can produce results
different from the expected value (0.5).

    95857 07118 87664 92099
    TTTTT HTHHT TTTTH THHTT

4.2. The overall rate (56%) is an average. Graduation rates vary greatly among institutions;
some will have higher rates, and others lower.

4.3. (a) Most answers (99.5% of them) will be between 82% and 98%. (b) Based on 100,000
simulated trials—more than students are expected to do—the longest string of misses will
be quite short (3 or fewer with probability 99%, 5 or fewer with probability 99.99%). The
average (“expected”) longest run of misses is about 1.7. For shots made, the average run is
about 27, but there is lots of variation; 77% of simulations will have a longest run of made
shots between 17 and 37, and about 95% of simulations will fall between 12 and 46.

4.5. If you hear music (or talking) one time, you will almost certainly hear the same thing for
several more checks after that. (For example, if you tune in at the beginning of a 5-minute
song and check back every 5 seconds, you’ll hear that same song over 30 times.)

4.6. To estimate the probability, count the number of times the dice show 7 or 11, then divide
by 25. For “perfectly made” (fair) dice, the number of winning rolls will nearly always
(99.4% of the time) be between 1 and 11 out of 25.

4.7. Out of a very large number of patients taking this medication, the fraction who experience
this bad side effect is about 0.00001.
Note: Student explanations will vary, but should make clear that 0.00001 is a long-run
average rate of occurrence. Because a probability of 0.00001 is often stated as "1 in
100,000," it is tempting to interpret this probability as meaning "exactly 1 out of every
100,000." While we expect about 1 occurrence of side effects out of 100,000 patients, the
actual number of patients with side effects is random; it might be 0, or 1, or 2, . . . .

4.8. (a)–(c) Results will vary, but after n tosses, the distribution of the proportion p̂ is
approximately Normal with mean 0.5 and standard deviation 0.5/√n, while the distribution
of the count of heads is approximately Normal with mean 0.5n and standard deviation
0.5√n. Using the 68–95–99.7 rule, we have the results shown in the table below. For
example, after 300 tosses, nearly all students should have p̂ between 0.4134 and 0.5866,
and a count between 124 and 176. Note that the range for p̂ gets narrower, while the range
for the count gets wider.

      n    99.7% Range for p̂    99.7% Range for count
     50      0.5 ± 0.2121            25 ± 10.6
    150      0.5 ± 0.1225            75 ± 18.4
    300      0.5 ± 0.0866           150 ± 26.0
    600      0.5 ± 0.0612           300 ± 36.7
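Note: The table's entries are "mean ± 3 standard deviations." A minimal sketch of the computation in Python:

    # 99.7% half-widths for the sample proportion and for the count of heads.
    from math import sqrt

    for n in (50, 150, 300, 600):
        sd_phat = 0.5 / sqrt(n)               # sd of p-hat
        sd_count = 0.5 * sqrt(n)              # sd of the count
        print(n, 3 * sd_phat, 3 * sd_count)   # e.g., 0.2121 and 10.6 for n = 50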


4.9. The true probability (assuming perfectly fair dice) is 1 − (5/6)⁴ ≈ 0.5177, so students
should conclude that the probability is "quite close to 0.5."

4.10. Sample spaces will likely include blonde, brunette (or brown), black, red, and gray.
Depending on student imagination (and use of hair dye), other colors may be listed; there
should at least be options to answer "other" and "bald."

4.11. One possibility: whole numbers from 0 up to some maximum number of hours (the
largest number should be big enough to include all possible responses). In addition, some
students might respond with fractional answers (e.g., 3.5 hours).

4.12. P(Black or White) = 0.07 + 0.02 = 0.09.

4.13. P(Blue, Green, Black, Brown, Grey, or White) = 1 − P(Purple, Red, Orange, or
Yellow) = 1 − (0.14 + 0.08 + 0.05 + 0.03) = 1 − 0.3 = 0.7. Using Rule 4 (the complement
rule) is slightly easier, because we only need to add the four probabilities of the colors we
do not want, rather than adding the six probabilities of the colors we want.

4.14. P(not 1) = 1 − 0.301 = 0.699.

4.15. In Example 4.13, P(B) = P(6 or greater) was found to be 0.222, so P(A or B) =
P(A) + P(B) = 0.301 + 0.222 = 0.523.

4.16. For each possible value (1, 2, . . . , 6), the probability is 1/6.

4.17. If Tk is the event "get tails on the kth flip," then T1 and T2 are independent, and
P(two tails) = P(T1 and T2) = P(T1)P(T2) = (1/2)(1/2) = 1/4.

4.18. If Ak is the event "the kth card drawn is an ace," then A1 and A2 are not independent; in
particular, if we know that A1 occurred, then the probability of A2 is only 3/51.

4.19. (a) The probability that both of two disjoint events occur is 0. (Multiplication is
appropriate for independent events.) (b) Probabilities must be no more than 1; P(A and B)
will be no more than 0.5. (We cannot determine this probability exactly from the given
information.) (c) P(Ac ) = 1 − 0.35 = 0.65.

4.20. (a) The two outcomes (say, A and B) in the sample space need not be equally likely.
The only requirements are that P(A) ≥ 0, P(B) ≥ 0, and P(A) + P(B) = 1. (b) In a
table of random digits, each digit has probability 0.1. (c) If A and B were independent,
then P(A and B) would equal P(A)P(B) = 0.06. (That is, probabilities are multiplied, not
added.) In fact, the given probabilities are impossible, because P(A and B) must be less than
the smaller of P(A) and P(B).

4.21. There are six possible outcomes: { link1, link2, link3, link4, link5, leave }.

4.22. There are an infinite number of possible outcomes, and the description of the sample
space will depend on whether the time is measured to any degree of accuracy (S is the set
of all positive numbers) or rounded to (say) the nearest second (S = {0, 1, 2, 3, . . .}), or
nearest tenth of a second (S = {0, 0.1, 0.2, 0.3 . . .}).

4.23. (a) P(“Empire State of Mind” or “I Gotta Feeling”) = 0.180 + 0.068 = 0.248.
(b) P(neither “Empire State of Mind” nor “I Gotta Feeling”) = 1 − 0.248 = 0.752.

4.24. (a) If Rk is the event " 'Party in the USA' is the kth chosen ringtone," then
P(R1 and R2) = P(R1)P(R2) = 0.107² = 0.011449. (b) The complement would be "at most
one ringtone is 'Party in the USA.' " P[(R1 and R2)ᶜ] = 1 − P(R1 and R2) = 0.988551.

4.25. (a) The given probabilities have sum 0.97, so P(type AB) = 0.03.
(b) P(type O or B) = 0.44 + 0.11 = 0.55.

4.26. P(both are type O) = (0.44)(0.52) = 0.2288; P(both are the same type) =
(0.42)(0.35) + (0.11)(0.10) + (0.03)(0.03) + (0.44)(0.52) = 0.3877.

4.27. (a) Not legitimate because the probabilities sum to 2. (b) Legitimate (for a nonstandard
deck). (c) Legitimate (for a nonstandard die).

4.28. (a) The given probabilities have sum 0.77, so P(French) = 0.23.
(b) P(not English) = 1 − 0.59 = 0.41, using Rule 4. (Or, add the other three probabilities.)

4.29. (a) The given probabilities have sum 0.72, so this probability must be 0.28.
(b) P(at least a high school education) = 1 − P(has not finished HS) = 1 − 0.12 = 0.88.
(Or add the other three probabilities.)

4.30. The probabilities of 2, 3, 4, and 5 are unchanged (1/6 each), so P(1 or 6) must still be 1/3.
If P(6) = 0.21, then P(1) = 1/3 − 0.21 = 0.123 (or 37/300). The complete table follows.

    Face           1     2     3     4     5     6
    Probability  0.123  1/6   1/6   1/6   1/6  0.21

4.31. For example, the probability for A-positive blood is (0.42)(0.84) = 0.3528 and for
A-negative (0.42)(0.16) = 0.0672.
Blood type A+ A– B+ B– AB+ AB– O+ O–
Probability 0.3528 0.0672 0.0924 0.0176 0.0252 0.0048 0.3696 0.0704

4.32. (a) All are equally likely; the probability is 1/38. (b) Because 18 slots are red, the
probability of a red is P(red) = 18/38 ≈ 0.474. (c) There are 12 winning slots, so P(win a
column bet) = 12/38 ≈ 0.316.

4.33. (a) There are six arrangements of the digits 4, 9, and 1 (491, 419, 941, 914, 149,
194), so that P(win) = 6/1000 = 0.006. (b) The only winning arrangement is 222, so
P(win) = 1/1000 = 0.001.

4.34. (a) There are 10⁴ = 10,000 possible PINs (0000 through 9999).* (b) The probability that
a PIN has no 0s is 0.9⁴ (because there are 9⁴ PINs that can be made from the nine nonzero
digits), so the probability of at least one 0 is 1 − 0.9⁴ = 0.3439.
*If we assume that PINs cannot have leading 0s, then there are only 9000 possible codes
(1000–9999), and the probability of at least one 0 is 1 − 9⁴/9000 ≈ 0.271.
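Note: Because the sample space is small, the complement-rule answer can be checked by brute force. A minimal sketch in Python:

    # P(at least one 0) among all 10,000 four-digit PINs.
    print(1 - 0.9**4)                                        # 0.3439
    count = sum('0' in f"{pin:04d}" for pin in range(10000))
    print(count / 10000)                                     # also 0.3439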

4.35. P(none are O-negative) = (1 − 0.07)¹⁰ ≈ 0.4840, so P(at least one is
O-negative) ≈ 1 − 0.4840 = 0.5160.

4.36. If we assume that each site is independent of the others (and that they can be considered
as a random sample from the collection of sites referenced in scientific journals), then P(all
seven are still good) = 0.87⁷ ≈ 0.3773.

4.37. This computation would only be correct if the events “a randomly selected person is at
least 75” and “a randomly selected person is a woman” were independent. This is likely
not true; in particular, as women have a greater life expectancy than men, this fraction is
probably greater than 3%.

4.38. As P(R) = 2/6 and P(G) = 4/6, and successive rolls are independent, the respective
probabilities are
(2/6)⁴(4/6) = 2/243 ≈ 0.00823, (2/6)⁴(4/6)² = 4/729 ≈ 0.00549, and (2/6)⁵(4/6) = 2/729 ≈ 0.00274.
4

4.39. (a) (0.65)³ ≈ 0.2746 (under the random walk theory). (b) 0.35 (because performance in
separate years is independent). (c) (0.65)² + (0.35)² = 0.545.

4.40. For any event A, along with its complement Ac , we have P(S) = P(A or Ac ) because
“A or Ac ” includes all possible outcomes (that is, it is the entire sample space S). By Rule 2,
P(S) = 1, and by Rule 3, P(A or Ac ) = P(A) + P(Ac ), because A and Ac are disjoint.
Therefore, P(A) + P(Ac ) = 1, from which Rule 4 follows.

4.41. Note that A = (A and B) or (A and Bᶜ), and the events (A and B) and (A and Bᶜ) are
disjoint, so Rule 3 says that
P(A) = P[(A and B) or (A and Bᶜ)] = P(A and B) + P(A and Bᶜ).
If P(A and B) = P(A)P(B), then we have P(A and Bᶜ) = P(A) − P(A)P(B) =
P(A)(1 − P(B)), which equals P(A)P(Bᶜ) by the complement rule.

4.42. (a) Hannah and Jacob's children can have alleles AA, BB, or AB, so they can have
blood type A, B, or AB. (The table below shows the possible combinations.) (b) Either note
that the four combinations in the table are equally likely, or compute:
P(type A) = P(A from Hannah and A from Jacob) = P(AH)P(AJ) = 0.5² = 0.25
P(type B) = P(B from Hannah and B from Jacob) = P(BH)P(BJ) = 0.5² = 0.25
P(type AB) = P(AH)P(BJ) + P(BH)P(AJ) = 2 · 0.25 = 0.5

         A    B
    A   AA   AB
    B   AB   BB

4.43. (a) Nancy and David's children can have alleles BB, BO, or OO, so they can have blood
type B or O. (The table below shows the possible combinations.) (b) Either note that the
four combinations in the table are equally likely, or compute P(type O) = P(O from Nancy
and O from David) = 0.5² = 0.25 and P(type B) = 1 − P(type O) = 0.75.

         B    O
    B   BB   BO
    O   BO   OO

4.44. Any child of Jennifer and José has a 50% chance of being type A (alleles AA or AO),
and each child inherits alleles independently of other children, so P(both are type
A) = 0.5² = 0.25. For one child, we have P(type A) = 0.5 and P(type AB) = P(type B) = 0.25,
so that P(both have same type) = 0.5² + 0.25² + 0.25² = 0.375 = 3/8.

         A    O
    A   AA   AO
    B   AB   BO

4.45. (a) Any child of Jasmine and Joshua has an equal (1/4) chance of having blood type
AB, A, B, or O (see the allele combinations in the table below). Therefore, P(type O) = 0.25.
(b) P(all three have type O) = 0.25³ = 1/64 = 0.015625. P(first has type O, next two do
not) = 0.25 · 0.75² = 9/64 = 0.140625.

         A    O
    B   AB   BO
    O   AO   OO

4.46. P(grade of D or F) = P(X = 0 or X = 1) = 0.05 + 0.04 = 0.09.

4.47. If H is the number of heads, then the distribution of H is as given below. P(H = 0),
the probability of two tails, was previously computed in Exercise 4.17.

    Value of H    0    1    2
    Probability  1/4  1/2  1/4

4.48. P(0.1 < X < 0.4) = 0.3.

4.49. (a) The probabilities for a discrete random variable always add to one. (b) Continuous
random variables can take values from any interval, not just 0 to 1. (c) A Normal random
variable is continuous. (Also, a distribution is associated with a random variable, but
“distribution” and “random variable” are not the same things.)

4.50. (a) If T is the event that a person uses Twitter, we can write the sample space as
{T, T c }. (b) There are various ways to express this; one would be
{TTT, TTTᶜ, TTᶜT, TᶜTT, TTᶜTᶜ, TᶜTTᶜ, TᶜTᶜT, TᶜTᶜTᶜ}.
(c) For this random variable (call it X ), the sample space is {0, 1, 2, 3}. (d) The sample
space in part (b) reveals which of the three people use Twitter. This may or may not be
important information; it depends on what questions we wish to ask about our sample.

4.51. (a) Based on the information from Exercise 4.50, along with the complement rule,
P(T) = 0.19 and P(Tᶜ) = 0.81. (b) Use the multiplication rule for independent events;
for example, P(TTT) = 0.19³ ≈ 0.0069, P(TTTᶜ) = (0.19²)(0.81) ≈ 0.0292,
P(TTᶜTᶜ) = (0.19)(0.81²) ≈ 0.1247, and P(TᶜTᶜTᶜ) = 0.81³ ≈ 0.5314. (c) Add up the
probabilities from (b) that correspond to each value of X.

    Outcome      TTT     TTTᶜ    TTᶜT    TᶜTT    TᶜTᶜT   TᶜTTᶜ   TTᶜTᶜ   TᶜTᶜTᶜ
    Probability  0.0069  0.0292  0.0292  0.0292  0.1247  0.1247  0.1247  0.5314

    Value of X   0       1       2       3
    Probability  0.0069  0.0877  0.3740  0.5314

4.52. The two histograms are shown below. The most obvious difference is that a “family”
must have at least two people. Otherwise, the family-size distribution has slightly larger
probabilities for 2, 3, or 4, while for large family/household sizes, the differences between
the distributions are small.

[Two probability histograms: household size and family size, values 1 through 7, heights up to about 0.4.]

4.53. (a) See also the solution to Exercise 4.22. If we view this time as being measured to
any degree of accuracy, it is continuous; if it is rounded, it is discrete. (b) A count like this
must be a whole number, so it is discrete. (c) Incomes—whether given in dollars and cents,
or rounded to the nearest dollar—are discrete. (However, it is often useful to treat such
variables as continuous.)

4.54. (a) 0.8507 + 0.1448 + 0.0045 = 1. (b) [Probability histogram: bars at 0, 1, and 2; the
third bar is so short that it blends in with the horizontal axis.] (c) P(at least one
ace) = 0.1493, which can be computed either as 0.1448 + 0.0045 or 1 − 0.8507.

4.55. (a) [Probability histogram: bars at 0 through 4, heights 0.1, 0.3, 0.3, 0.2, 0.1.] (b) "At
least one nonword error" is the event "X ≥ 1" (or "X > 0"). P(X ≥ 1) = 1 − P(X = 0) = 0.9.
(c) "X ≤ 2" is "no more than two nonword errors," or "fewer than three nonword errors."
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.1 + 0.3 + 0.3 = 0.7
P(X < 2) = P(X = 0) + P(X = 1) = 0.1 + 0.3 = 0.4

4.56. (a) [Normal curve, with the horizontal axis marked at 218, 234, 250, 266, 282, 298,
314.] A good procedure is to draw the curve first, locate the points where the curvature
changes, then mark the horizontal axis. Students may at first make mistakes like drawing a
half-circle instead of the correct "bell-shaped" curve or being careless about locating the
standard deviation. (b) About 0.81: P(Y ≤ 280) = P((Y − 266)/16 ≤ (280 − 266)/16) =
P(Z ≤ 0.875). Software gives 0.8092; Table A gives 0.8078 for 0.87 and 0.8106 for 0.88 (so
the average is again 0.8092).

4.57. (a) The pairs are given below. We must assume that we can distinguish between,
for example, "(1,2)" and "(2,1)"; otherwise, the outcomes are not equally likely.
(b) Each pair has probability 1/36. (c) The value of X is given below each pair. For
the distribution (given below), we see (for example) that there are four pairs that add to
5, so P(X = 5) = 4/36. [Probability histogram: bars at 2 through 12.]
(d) P(7 or 11) = 6/36 + 2/36 = 8/36 = 2/9. (e) P(not 7) = 1 − 6/36 = 5/6.

    (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
      2     3     4     5     6     7
    (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
      3     4     5     6     7     8
    (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
      4     5     6     7     8     9
    (4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
      5     6     7     8     9    10
    (5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
      6     7     8     9    10    11
    (6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
      7     8     9    10    11    12

    Value of X    2     3     4     5     6     7     8     9    10    11    12
    Probability  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
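Note: The distribution of X is easy to check by enumeration. A minimal sketch in Python:

    # Tally the sums of the 36 equally likely pairs.
    from itertools import product
    from collections import Counter
    from fractions import Fraction

    sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    dist = {x: Fraction(c, 36) for x, c in sorted(sums.items())}
    print(dist[5])                 # P(X = 5) = 4/36, printed in lowest terms as 1/9
    print(dist[7] + dist[11])      # P(7 or 11) = 2/9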

4.58. The possible values of Y are 1, 2, 3, . . . , 12, each with probability 1/12. Aside from
drawing a diagram showing all the possible combinations, one can reason that the first
(regular) die is equally likely to show any number from 1 through 6. Half of the time, the
second roll shows 0, and the rest of the time it shows 6. Each possible outcome therefore
has probability (1/6)(1/2) = 1/12.

4.59. The table below shows the additional columns to add to the table given in the solution
to Exercise 4.57. There are 48 possible (equally likely) combinations.

    (1,7) (1,8)
      8     9
    (2,7) (2,8)
      9    10
    (3,7) (3,8)
     10    11
    (4,7) (4,8)
     11    12
    (5,7) (5,8)
     12    13
    (6,7) (6,8)
     13    14

    Value of X    2     3     4     5     6     7     8     9    10    11    12    13    14
    Probability  1/48  2/48  3/48  4/48  5/48  6/48  6/48  6/48  5/48  4/48  3/48  2/48  1/48

4.60. (a) W can be 0, 1, 2, or 3. (b) See the top two lines of the table below. (c) The
distribution is given in the bottom two lines of the table. For example, P(W = 0) =
(0.73)(0.73)(0.73) ≈ 0.3890, and in the same way, P(W = 3) = 0.27³ ≈ 0.0197. For
P(W = 1), note that each of the three arrangements that give W = 1 has probability
(0.73)(0.73)(0.27) = 0.143883, so P(W = 1) = 3(0.143883) ≈ 0.4316. Similarly,
P(W = 2) = 3(0.73)(0.27)(0.27) ≈ 0.1597.

    Arrangement  DDD     DDF     DFD     FDD     FFD     FDF     DFF     FFF
    Probability  0.3890  0.1439  0.1439  0.1439  0.0532  0.0532  0.0532  0.0197

    Value of W   0       1       2       3
    Probability  0.3890  0.4316  0.1597  0.0197
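Note: These are binomial probabilities. A minimal sketch in Python that reproduces the table by enumerating the eight arrangements (assuming independent trials with P(F) = 0.27):

    # Distribution of W, the number of F's in three independent trials.
    from itertools import product
    from collections import Counter

    dist = Counter()
    for arrangement in product("DF", repeat=3):
        prob = 1.0
        for outcome in arrangement:
            prob *= 0.27 if outcome == "F" else 0.73
        dist[arrangement.count("F")] += prob
    print({w: round(p, 4) for w, p in sorted(dist.items())})
    # {0: 0.389, 1: 0.4316, 2: 0.1597, 3: 0.0197}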

4.61. (a) P(X < 0.6) = 0.6. (b) P(X ≤ 0.6) = 0.6. (c) For continuous random variables,
“equal to” has no effect on the probability; that is, P(X = c) = 0 for any value of c.

4.62. (a) P(X ≥ 0.30) = 0.7. (b) P(X = 0.30) = 0. (c) P(0.30 < X < 1.30) =
P(0.30 < X < 1) = 0.7. (d) P(0.20 ≤ X ≤ 0.25 or 0.7 ≤ X ≤ 0.9) = 0.05 + 0.2 = 0.25.
(e) P(not [0.4 ≤ X ≤ 0.7]) = 1 − P(0.4 ≤ X ≤ 0.7) = 1 − 0.3 = 0.7.

4.63. (a) The height should be 1/2, since the area under the curve must be 1. [Density curve:
uniform height 1/2 on the interval 0 to 2.] (b) P(Y ≤ 1.6) = 1.6/2 = 0.8.
(c) P(0.5 < Y < 1.7) = 1.2/2 = 0.6. (d) P(Y ≥ 0.95) = 1.05/2 = 0.525.

4.64. (a) The area of a triangle is (1/2)bh = (1/2)(2)(1) = 1. (b) P(Y < 1) = 0.5.
(c) P(Y > 0.6) = 0.82; the easiest way to compute this is to note that the unshaded area is
a triangle with area (1/2)(0.6)(0.6) = 0.18. [Two density curves on 0 to 2 with shaded regions.]

 
4.65. P(8 ≤ x̄ ≤ 10) = P((8 − 9)/0.0724 ≤ (x̄ − 9)/0.0724 ≤ (10 − 9)/0.0724) =
P(−13.8 ≤ Z ≤ 13.8). This probability is essentially 1; x̄ will almost certainly estimate µ
within ±1 (in fact, it will almost certainly be much closer than this).
 
4.66. (a) P(0.52 ≤ p̂ ≤ 0.60) = P((0.52 − 0.56)/0.019 ≤ (p̂ − 0.56)/0.019 ≤ (0.60 − 0.56)/0.019) =
P(−2.11 ≤ Z ≤ 2.11) = 0.9826 − 0.0174 = 0.9652.
(b) P(p̂ ≥ 0.72) = P((p̂ − 0.56)/0.019 ≥ (0.72 − 0.56)/0.019) = P(Z ≥ 8.42); this is basically 0.

4.67. The possible values of X are $0 and $1, each with probability 0.5 (because the coin is
fair). The mean is ($0)(1/2) + ($1)(1/2) = $0.50.

4.69. If Y = 15 + 8X , then µY = 15 + 8µ X = 15 + 8(10) = 95.

4.70. If W = 0.5U + 0.5V , then µW = 0.5µU + 0.5µV = 0.5(20) + 0.5(20) = 20.

4.71. First we note that µX = 0(0.5) + 2(0.5) = 1, so σX² = (0 − 1)²(0.5) + (2 − 1)²(0.5) = 1
and σX = √(σX²) = 1.

4.72. (a) Each toss of the coin is independent (that is, coins have no memory). (b) The
variance is multiplied by 102 = 100. (The mean and standard deviation are multiplied
by 10.) (c) The correlation does not affect the mean of a sum (although it does affect the
variance and standard deviation).

4.73. The mean is


µ X = (0)(0.3) + (1)(0.1) + (2)(0.1) + (3)(0.2) + (4)(0.1) + (5)(0.2) = 2.3 servings.
The variance is
σ X2 = (0 − 2.3)2 (0.3) + (1 − 2.3)2 (0.1) + (2 − 2.3)2 (0.1)
+ (3 − 2.3)2 (0.2) + (4 − 2.3)2 (0.1) + (5 − 2.3)2 (0.2) = 3.61,

so the standard deviation is σ X = 3.61 = 1.9 servings.

4.74. The mean number of aces is µ X = (0)(0.8507) + (1)(0.1448) + (2)(0.0045) = 0.1538.


Note: The exact value of the mean is 2/13, because 1/13 of the cards are aces, and two
cards have been dealt to us.

4.75. The average grade is µ = (0)(0.05) + (1)(0.04) + (2)(0.20) + (3)(0.40) + (4)(0.31) = 2.88.

4.76. The means are


(0)(0.1) + (1)(0.3) + (2)(0.3) + (3)(0.2) + (4)(0.1) = 1.9 nonword errors and
(0)(0.4) + (1)(0.3) + (2)(0.2) + (3)(0.1) = 1 word error

4.77. In the solution to Exercise 4.74, we found µX = 0.1538 aces, so
σX² = (0 − 0.1538)²(0.8507) + (1 − 0.1538)²(0.1448) + (2 − 0.1538)²(0.0045) ≈ 0.1391,
and the standard deviation is σX = √0.1391 ≈ 0.3730 aces.

4.78. In the solution to Exercise 4.75, we found the average grade was µ = 2.88, so
σ² = (0 − 2.88)²(0.05) + (1 − 2.88)²(0.04) + (2 − 2.88)²(0.2) + (3 − 2.88)²(0.4) + (4 − 2.88)²(0.31) = 1.1056,
and the standard deviation is σ = √1.1056 ≈ 1.0515.

4.79. (a) With ρ = 0, the variance of X + Y is σX² + σY² = 75² + 41² = 7306, so the standard
deviation is √7306 ≈ $85.48. (b) This is larger; the negative correlation decreased
the variance.

4.80. (a) The mean of Y is µY = 1—the obvious balance point of the triangle. (b) Both X1
and X2 have mean µX1 = µX2 = 0.5, and µY = µX1 + µX2.

4.81. The situation described in this exercise—“people who have high intakes of calcium
in their diets are more compliant than those who have low intakes”—implies a positive
correlation between calcium intake and compliance. Because of this, the variance of total
calcium intake is greater than the variance we would see if there were no correlation (as the
calculations in Example 4.38 demonstrate).

4.82. Let N and W be nonword and word error counts. In Exercise 4.76, we found
µN = 1.9 errors and µW = 1 error. The variances of these distributions are σN² = 1.29
and σW² = 1, so the standard deviations are σN ≈ 1.1358 errors and σW = 1 error. The
mean total error count is µN + µW = 2.9 errors for both cases. (a) If error counts are
independent (so that ρ = 0), σ²N+W = σN² + σW² = 2.29 and σN+W ≈ 1.5133 errors.
(Note that we add the variances, not the standard deviations.) (b) With ρ = 0.5,
σ²N+W = σN² + σW² + 2ρσNσW ≈ 2.29 + 1.1358 = 3.4258 and σN+W ≈ 1.8509 errors.
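Note: A minimal sketch of the variance-of-a-sum computation for both values of ρ:

    # Standard deviation of N + W for rho = 0 and rho = 0.5.
    from math import sqrt

    sd_n, sd_w = sqrt(1.29), 1.0     # sd of nonword and word error counts
    for rho in (0.0, 0.5):
        var_sum = sd_n**2 + sd_w**2 + 2 * rho * sd_n * sd_w
        print(rho, sqrt(var_sum))    # 1.5133 and 1.8509 errors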
   
4.83. (a) The mean for one coin is µ1 = (0) 1
2 + (1) 1
2 = 0.5 and the variance is
   
σ12 = (0 − 0.5)2 1
2 + (1 − 0.5)2 1
2 = 0.25, so the standard deviation is σ1 = 0.5.
(b) Multiply µ1 and σ12
by 4: µ4 = 4µ1 = 2 and σ42 = 4σ14 = 1, so σ4 = 1. (c) Note that
because of the symmetry of the distribution, we do not need to compute the mean to see
that µ4 = 2; this is the obvious balance point of the probability histogram in Figure 4.7. The
details of the two computations are
µW = (0)(0.0625) + (1)(0.25) + (2)(0.375) + (3)(0.25) + (4)(0.0625) = 2
σW2 = (0 − 2)2 (0.0625) + (1 − 2)2 (0.25)
+ (2 − 2)2 (0.375) + (3 − 2)2 (0.25) + (4 − 2)2 (0.0625) = 1.

 
4.84. If D is the result of rolling a single four-sided die, then µD = (1 + 2 + 3 + 4)(1/4) = 2.5,
and σD² = [(1 − 2.5)² + (2 − 2.5)² + (3 − 2.5)² + (4 − 2.5)²](1/4) = 1.25. Then for the sum
I = D1 + D2 + 1, we have mean intelligence µI = 2µD + 1 = 6. The variance of I is
σI² = 2σD² = 2.5, so σI ≈ 1.5811.

4.85. With R as the rod length and B1 and B2 the bearing lengths, we have
µB1+R+B2 = 12 + 2 · 2 = 16 cm and σB1+R+B2 = √(0.004² + 2 · 0.001²) ≈ 0.004243 mm.

4.86. (a) d1 = 2σR = 0.008 mm = 0.0008 cm and d2 = 2σB = 0.002 mm = 0.0002 cm.
(b) The natural tolerance of the assembled parts is 2σB1+R+B2 ≈ 0.008485 mm =
0.0008485 cm.

4.87. (a) Not independent: Knowing the total X of the first two cards tells us something
about the total Y for three cards. (b) Independent: Separate rolls of the dice should be
independent.

4.88. Divide the given values by 2.54: µ ≈ 69.6063 in and σ ≈ 2.8346 in.

4.89. With ρ = 1, we have
σ²X+Y = σX² + σY² + 2ρσXσY = σX² + σY² + 2σXσY = (σX + σY)².
And of course, σX+Y = √((σX + σY)²) = σX + σY.

4.90. The mean of X is (µ − σ)(0.5) + (µ + σ)(0.5) = µ, and the standard deviation is
√((µ − σ − µ)²(0.5) + (µ + σ − µ)²(0.5)) = √(σ²) = σ.

4.91. Although the probability of having to pay for a total loss for one or more of the 10
policies is very small, if this were to happen, it would be financially disastrous. On the other
hand, for thousands of policies, the law of large numbers says that the average claim on
many policies will be close to the mean, so the insurance company can be assured that the
premiums they collect will (almost certainly) cover the claims.

4.92. The total loss T for 10 fires has mean µ_T = 10 · $300 = $3000 and standard
deviation σ_T = √(10 · $400²) = $400√10 ≈ $1264.91. The average loss is T/10, so
µ_{T/10} = (1/10)µ_T = $300 and σ_{T/10} = (1/10)σ_T ≈ $126.49.
The total loss T for 12 fires has mean µ_T = 12 · $300 = $3600 and standard
deviation σ_T = √(12 · $400²) = $400√12 ≈ $1385.64. The average loss is T/12, so
µ_{T/12} = (1/12)µ_T = $300 and σ_{T/12} = (1/12)σ_T ≈ $115.47.
Note: The mean of the average loss is the same regardless of the number of policies,
but the standard deviation decreases as the number of policies increases. With thousands of
policies, the standard deviation is very small, so the average claim will be close to $300, as
was stated in the solution to the previous problem.

4.93. (a) Add up the given probabilities and subtract from 1; this gives P(man does not die
in the next five years) = 0.99749. (b) The distribution of income (or loss) is given below.
Multiplying each possible value by its probability gives the mean income µ ≈ $623.22.

  Age at death     21        22        23        24        25        Survives
  Loss or income   −$99,825  −$99,650  −$99,475  −$99,300  −$99,125  $875
  Probability      0.00039   0.00044   0.00051   0.00057   0.00060   0.99749

4.94. The mean µ of the company’s “winnings” (premiums) and their “losses” (insurance
claims) is positive. Even though the company will lose a large amount of money on a small
number of policyholders who die, it will gain a small amount on the majority. The law of
large numbers says that the average “winnings” minus “losses” should be close to µ, and
overall the company will almost certainly show a profit.

4.95. The events “roll a 3” and “roll a 5” are disjoint, so P(3 or 5) = P(3) + P(5) =
1/6 + 1/6 = 2/6 = 1/3.

4.96. The events E (roll is even) and G (roll is greater than 4) are not disjoint—specifically,
E and G = {6}—so P(E or G) = P(E) + P(G) − P(E and G) = 3/6 + 2/6 − 1/6 = 4/6 = 2/3.

4.97. Let A be the event “next card is an ace” and B be “two of Slim’s four cards are aces.”
Then P(A | B) = 2/48, because (other than those in Slim’s hand) there are 48 cards, of which
2 are aces.

4.98. Let A₁ = “the next card is a diamond” and A₂ = “the second card is a diamond.”
We wish to find P(A₁ and A₂). There are 27 unseen cards, of which 10 are diamonds, so
P(A₁) = 10/27 and P(A₂ | A₁) = 9/26, so P(A₁ and A₂) = (10/27) × (9/26) = 5/39 ≈ 0.1282.
Note: Technically, we wish to find P(A₁ and A₂ | B), where B is the given event (25 cards
visible, with 3 diamonds in Slim’s hand). We have P(A₁ | B) = 10/27 and P(A₂ | A₁ and B) = 9/26,
and compute P(A₁ and A₂ | B) = P(A₁ | B) × P(A₂ | A₁ and B).
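
The exact fraction 5/39 is easy to verify with Python’s fractions module (a sketch, not part of the original solution):

    from fractions import Fraction
    p_first = Fraction(10, 27)               # 10 diamonds among the 27 unseen cards
    p_second_given_first = Fraction(9, 26)   # 9 diamonds left among 26 unseen cards
    p_both = p_first * p_second_given_first
    print(p_both, float(p_both))             # 5/39, about 0.1282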

4.99. This computation uses the addition rule for disjoint events, which is appropriate for this
setting because B (full-time students) is made up of four disjoint groups (those in each of
the four age groups).

4.100. With A and B as defined in Example 4.44 (respectively, 15- to 19-year-old students, and
full-time students), we want to find
P(B | A) = P(B and A)/P(A) = 0.21/(0.21 + 0.02) ≈ 0.9130.
For these two calculations, we restrict our attention to different subpopulations of students
(that is, different rows of the table given in Example 4.44). For P(A | B), we ask what
fraction of full-time students (the subpopulation) are aged 15 to 19 years. For P(B | A), we
ask what fraction of the subpopulation of 15- to 19-year-old students are full-time.

4.101. The tree diagram shows the probability found in Exercise 4.98 on the top branch. The
middle two branches (added together) give the probability that Slim gets exactly one diamond
from the next two cards, and the bottom branch is the probability that neither card is a diamond.
[Tree diagram: first card diamond (10/27) or non-diamond (17/27); given the first card, the
second card is diamond or non-diamond, giving branch probabilities 5/39 (= 45/351), 85/351,
85/351, and 136/351.]

4.102. (a) The given statement is only true for disjoint events; in general, P(A or B) =
P(A) + P(B) − P(A and B). (b) P(A) plus P(Aᶜ) is always equal to 1. (c) Two events
are independent if P(B | A) = P(B). They are disjoint if P(A and B) = 0.

4.103. For a randomly chosen adult, let S = “(s)he gets enough sleep” and let E = “(s)he
gets enough exercise,” so P(S) = 0.4, P(E) = 0.46, and P(S and E) = 0.24.
(a) P(S and Eᶜ) = 0.4 − 0.24 = 0.16. (b) P(Sᶜ and E) = 0.46 − 0.24 = 0.22.
(c) P(Sᶜ and Eᶜ) = 1 − (0.4 + 0.46 − 0.24) = 0.38. (d) The answers in (a) and
(b) are found by a variation of the addition rule for disjoint events: We note that
P(S) = P(S and E) + P(S and Eᶜ) and P(E) = P(S and E) + P(Sᶜ and E). In each
case, we know the first two probabilities and find the third by subtraction. The answer
for (c) is found by using the general addition rule to find P(S or E) and noting that
Sᶜ and Eᶜ = (S or E)ᶜ.

4.104. With S and E as defined in the previous solution, a Venn diagram illustrates the
probabilities computed above: P(S and Eᶜ) = 0.16, P(S and E) = 0.24, P(Sᶜ and E) = 0.22,
and P(Sᶜ and Eᶜ) = 0.38.

4.105. For a randomly chosen high school student, let L = “student admits to lying” and
M = “student is male,” so P(L) = 0.48, P(M) = 0.5, and P(M and L) = 0.25. Then
P(M or L) = P(M) + P(L) − P(M and L) = 0.73.

4.106. Using the addition rule for disjoint events, note that P(Mᶜ and L) = P(L) −
P(M and L) = 0.23. Then by the definition of conditional probability,
P(Mᶜ | L) = P(Mᶜ and L)/P(L) = 0.23/0.48 ≈ 0.4792.

4.107. Let B = “student is a binge drinker” and M = “student is male.” (a) The four
probabilities sum to 0.11 + 0.12 + 0.32 + 0.45 = 1. (b) P(Bᶜ) = 0.32 + 0.45 = 0.77.
(c) P(Bᶜ | M) = P(Bᶜ and M)/P(M) = 0.32/(0.11 + 0.32) ≈ 0.7442. (d) In the language of this
chapter, the events are not independent. An attempt to phrase this for someone who has not
studied this material might say something like, “Knowing a student’s gender gives some
information about whether or not that student is a binge drinker.”
Note: Specifically, male students are slightly more likely to be binge drinkers. This
statement might surprise students who look at the table and note that the proportion
of binge drinkers in the men’s column is smaller than that proportion in the women’s
column. We cannot compare those proportions directly; we need to compare the conditional
probabilities of binge drinkers within each given gender (see the solution to the next
exercise.)

4.108. Let B = “student is a binge drinker” and M = “student is male.” (a) These two
probabilities are given as entries in the table: P(M and B) = 0.11 and P(Mᶜ and B) = 0.12.
(b) These are conditional probabilities: P(B | M) = P(B and M)/P(M) = 0.11/(0.11 + 0.32) ≈ 0.2558
and P(B | Mᶜ) = P(B and Mᶜ)/P(Mᶜ) = 0.12/(0.12 + 0.45) ≈ 0.2105. (c) The fact that
P(B | M) > P(B | Mᶜ) indicates that male students are more likely to be binge drinkers (see
the comment in the solution to the previous exercise). The other comparison,
P(M and B) < P(Mᶜ and B), is more a reflection of the fact that the survey reported responses
for more women (57%) than men (43%) and does not by itself allow for comparison of binge
drinking between the genders. (To understand this better, imagine a more extreme case,
where, say, 90% of respondents were women . . . .)
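
These conditional probabilities follow mechanically from the four joint probabilities; a minimal Python sketch (table entries as quoted above):

    # joint distribution of (gender, binge-drinking status)
    joint = {("M", "B"): 0.11, ("F", "B"): 0.12,
             ("M", "notB"): 0.32, ("F", "notB"): 0.45}
    p_M = joint[("M", "B")] + joint[("M", "notB")]   # P(M) = 0.43
    p_F = joint[("F", "B")] + joint[("F", "notB")]   # P(M^c) = 0.57
    print(joint[("M", "B")] / p_M)   # P(B | M)   ~ 0.2558
    print(joint[("F", "B")] / p_F)   # P(B | M^c) ~ 0.2105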

4.109. Let M = “male” and C = “attends a 4-year institution.” (C is not an obvious choice,
but it is less confusing than F, which we might mistake for “female.”) We have been given
P(C) = 0.61, P(Cᶜ) = 0.39, P(M | C) = 0.44, and P(M | Cᶜ) = 0.41. (a) To create the
table, observe that
P(M and C) = P(M | C)P(C) = (0.44)(0.61) = 0.2684,
and similarly, P(M and Cᶜ) = P(M | Cᶜ)P(Cᶜ) = (0.41)(0.39) = 0.1599. The other two
entries can be found in a similar fashion, or by observing that, for example, the two numbers
in the first row must sum to P(C) = 0.61.

            Men     Women
  4-year    0.2684  0.3416
  2-year    0.1599  0.2301

(b) P(C | Mᶜ) = P(C and Mᶜ)/P(Mᶜ) = 0.3416/(0.3416 + 0.2301) ≈ 0.5975.

4.110. The branches of this tree diagram have the probabilities given in Exercise 4.109, and
the branches end with the probabilities found in the solution to that exercise.
[Tree diagram: 4-year (0.61) splits into male (0.44 → 0.2684) and female (0.56 → 0.3416);
2-year (0.39) splits into male (0.41 → 0.1599) and female (0.59 → 0.2301).]

4.111. As before, let M = “male” and C = “attends a 4-year institution.” For this tree
diagram, we need to compute P(M) = 0.2684 + 0.1599 = 0.4283 and
P(Mᶜ) = 0.3416 + 0.2301 = 0.5717, as well as P(C | M), P(C | Mᶜ), P(Cᶜ | M), and
P(Cᶜ | Mᶜ). For example,
P(C | M) = P(C and M)/P(M) = 0.2684/0.4283 ≈ 0.6267.
[Tree diagram: male (0.4283) splits into 4-year (0.6267 → 0.2684) and 2-year (0.3733 → 0.1599);
female (0.5717) splits into 4-year (0.5975 → 0.3416) and 2-year (0.4025 → 0.2301).]
All the computations for this diagram are “inconvenient” because they require that we work
backward from the ending probabilities, instead of working forward from the given probabili-
ties (as we did in the previous tree diagram).

4.112. P(A or B) = P(A) + P(B) − P(A and B) = 0.138 + 0.261 − 0.082 = 0.317.

4.113. P(A | B) = P(A and B)/P(B) = 0.082/0.261 ≈ 0.3142. If A and B were independent,
then P(A | B) would equal P(A), and also P(A and B) would equal the product P(A)P(B).

4.114. (a) {A and B} means the selected household is both prosperous and educated;
P(A and B) = 0.082 (given). (b) {A and Bᶜ} means the household is prosperous but not
educated; P(A and Bᶜ) = P(A) − P(A and B) = 0.056. (c) {Aᶜ and B} means the household
is not prosperous but is educated; P(Aᶜ and B) = P(B) − P(A and B) = 0.179.
(d) {Aᶜ and Bᶜ} means the household is neither prosperous nor educated;
P(Aᶜ and Bᶜ) = 0.683 (so that the probabilities add to 1).

4.115. (a) “The vehicle is a light truck” = Aᶜ; P(Aᶜ) = 0.69. (b) “The vehicle is an
imported car” = A and B. To find this probability, note that we have been given
P(Bᶜ) = 0.78 and P(Aᶜ and Bᶜ) = 0.55. From this we can determine that 78% − 55% = 23%
of vehicles sold were domestic cars—that is, P(A and Bᶜ) = 0.23—so
P(A and B) = P(A) − P(A and Bᶜ) = 0.31 − 0.23 = 0.08.
Note: The table below summarizes all that we can determine from the given information
(shown in bold in the original).

                P(A) = 0.31          P(Aᶜ) = 0.69
  P(B) = 0.22   P(A and B) = 0.08    P(Aᶜ and B) = 0.14
  P(Bᶜ) = 0.78  P(A and Bᶜ) = 0.23   P(Aᶜ and Bᶜ) = 0.55

4.116. Let A be the event “income ≥ $1 million” and B be “income ≥ $100,000.” Then
“A and B” is the same as A, so
P(A | B) = P(A)/P(B) = (392,220/142,978,806)/(17,993,498/142,978,806)
= 392,220/17,993,498 ≈ 0.02180.

4.117. See also the solution to Exercise 4.115, especially the table of probabilities given there.
(a) P(Aᶜ | B) = P(Aᶜ and B)/P(B) = 0.14/0.22 ≈ 0.6364. (b) The events A and B are not
independent; if they were, P(Aᶜ | B) would be the same as P(Aᶜ) = 0.69.

4.118. To find the probabilities in this Venn diagram, begin with P(A and B and C) = 0 in
the center of the diagram. Then each of the two-way intersections P(A and B), P(A and C),
and P(B and C) goes in the remainder of the overlapping areas; if P(A and B and C) had
been something other than 0, we would have subtracted it from each of the two-way
intersection probabilities to find, for example, P(A and B and Cᶜ). Next, determine
P(A only) so that the total probability of the regions that make up the event A is 0.7.
Finally, P(none) = P(Aᶜ and Bᶜ and Cᶜ) = 0 because the total probability inside the three
sets A, B, and C is 1.
[Venn diagram: A only = 0.3, B only = 0.1, C only = 0.1, A and B = 0.3, A and C = 0.1,
B and C = 0.1, all three = 0, none = 0.]



4.119. We seek P(at least one offer) = P(A or B or C); we can find this as 1 − P(no
offers) = 1 − P(Aᶜ and Bᶜ and Cᶜ). We see in the Venn diagram of Exercise 4.118 that
P(Aᶜ and Bᶜ and Cᶜ) = 0, so this probability is 1.

4.120. This is P(A and B and Cᶜ). As was noted in Exercise 4.118, because
P(A and B and C) = 0, this is the same as P(A and B) = 0.3.

4.121. P(B | C) = P(B and C)/P(C) = 0.1/0.3 = 1/3. P(C | B) = P(B and C)/P(B) = 0.1/0.5 = 0.2.

4.122. Let W = “the degree was earned by a woman” and P = “the degree was a professional
degree.” (a) To construct the table (below), divide each entry by the grand total of
all entries (2403); for example, 933/2403 ≈ 0.3883 is the fraction of all degrees that were
bachelor’s degrees awarded to women. Some students may also find the row totals (1412
and 991) and the column totals (1594, 662, 95, 52) and divide those by the grand total;
for example, 1594/2403 ≈ 0.6633 is the fraction of all degrees that were bachelor’s degrees.
(b) P(W) = 1412/2403 ≈ 0.5876 (this is one of the optional marginal probabilities from the
table below). (c) P(W | P) = (51/2403)/(95/2403) = 51/95 ≈ 0.5368. (This is the “Female”
entry from the “Professional” column, divided by that column’s total.) (d) W and P are not
independent; if they were, the two probabilities in (b) and (c) would be equal.

            Bachelor’s  Master’s  Professional  Doctorate  Total
  Female    0.3883      0.1673    0.0212        0.0108     0.5876
  Male      0.2751      0.1082    0.0183        0.0108     0.4124
  Total     0.6633      0.2755    0.0395        0.0216     1.0000

4.123. Let M be the event “the person is a man” and B be “the person earned a bachelor’s
degree.” (a) P(M) = 991/2403 ≈ 0.4124. Or take the answer from part (b) of the
previous exercise and subtract from 1. (b) P(B | M) = (661/2403)/(991/2403) = 661/991 ≈ 0.6670.
(This is the “Bachelor’s” entry from the “Male” row, divided by that row’s total.)
(c) P(M and B) = P(M)P(B | M) ≈ (0.4124)(0.6670) ≈ 0.2751. This agrees with the
directly computed probability: P(M and B) = 661/2403 ≈ 0.2751.

4.124. Each unemployment rate is computed as shown below. (Alternatively, subtract the
number employed from the number in the labor force, then divide that difference by the
number in the labor force.) Because these rates (probabilities) are different, education level
and being employed are not independent.

  Did not finish HS    1 − 11,552/12,623 ≈ 0.0848
  HS/no college        1 − 36,249/38,210 ≈ 0.0513
  Some college         1 − 32,429/33,928 ≈ 0.0442
  College graduate     1 − 39,250/40,414 ≈ 0.0288

4.125. (a) Add up the numbers in the first and second columns. We find that there are 186,210
thousand (i.e., over 186 million) people aged 25 or older, of which 125,175 thousand are in
the labor force, so P(L) = 125,175/186,210 ≈ 0.6722.
(b) P(L | C) = P(L and C)/P(C) = 40,414/51,568 ≈ 0.7837.
(c) L and C are not independent; if they were, the two probabilities in (a) and (b) would be
equal.

4.126. For the first probability, add up the numbers in the third column. We find that there
are 119,480 thousand (i.e., over 119 million) employed people aged 25 or older. Therefore,
P(C | E) = P(C and E)/P(E) = 39,250/119,480 ≈ 0.3285.
For the second probability, we use the total number of college graduates in the
population: P(E | C) = P(C and E)/P(C) = 39,250/51,568 ≈ 0.7611.

4.127. The population includes retired people who have left the labor force. Retired persons
are more likely than other adults to have not completed high school; consequently, a
relatively large number of retired persons fall in the “did not finish high school” category.
Note: Details of this lurking variable can be found in the Current Population Survey
annual report on “Educational Attainment in the United States.” For 2006, this report says
that among the 65-and-over population, about 24.8% have not completed high school,
compared to about 19.3% of the under-65 group.

4.128. The tree diagram is below. The numbers at the ends of the branches are found by the
multiplication rule; for example, P(“nonword error” and “caught”) = P(N and C) =
P(N)P(C | N) = (0.25)(0.9) = 0.225. A proofreader should catch about
0.225 + 0.525 = 0.75 = 75% of all errors.
[Tree diagram: nonword error (0.25) splits into caught (0.9 → 0.225) and missed (0.1 → 0.025);
word error (0.75) splits into caught (0.7 → 0.525) and missed (0.3 → 0.225).]

4.129. (a) Her brother has type aa, and he got one allele from each parent. But neither parent
is albino, so neither could be type aa. (b) A Punnett square for an Aa × Aa cross has the four
equally likely combinations AA, Aa, Aa, and aa, so P(aa) = 0.25, P(Aa) = 0.5, and
P(AA) = 0.25. (c) Beth is either AA or Aa, and P(AA | not aa) = 0.25/0.75 = 1/3, while
P(Aa | not aa) = 0.50/0.75 = 2/3.

4.130. (a) If Beth is Aa, then a child of Beth and her husband (type aa) is equally likely to
be Aa or aa, so P(child is non-albino | Beth is Aa) = 1/2. If Beth is AA, then their child will
definitely be type Aa (and non-albino), so P(child is non-albino | Beth is AA) = 1. (b) We have
P(child is non-albino) = P(child Aa and Beth Aa) + P(child Aa and Beth AA)
= P(Beth Aa)P(child Aa | Beth Aa) + P(Beth AA)P(child Aa | Beth AA)
= (2/3)(1/2) + (1/3)(1) = 2/3.
Therefore, P(Beth is Aa | child is Aa) = (1/3)/(2/3) = 1/2.

4.131. Let C be the event “Toni is a carrier,” T be the event “Toni tests positive,” and D
be “her son has DMD.” We have P(C) = 2/3, P(T | C) = 0.7, and P(T | Cᶜ) = 0.1.
Therefore, P(T) = P(T and C) + P(T and Cᶜ) = P(C)P(T | C) + P(Cᶜ)P(T | Cᶜ) =
(2/3)(0.7) + (1/3)(0.1) = 0.5, and
P(C | T) = P(T and C)/P(T) = (2/3)(0.7)/0.5 = 14/15 ≈ 0.9333.
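
A minimal Python sketch of this Bayes’s-rule computation (probabilities as given in the exercise):

    p_C = 2 / 3              # prior probability that Toni is a carrier
    p_T_given_C = 0.7        # P(test positive | carrier)
    p_T_given_notC = 0.1     # P(test positive | not a carrier)
    p_T = p_C * p_T_given_C + (1 - p_C) * p_T_given_notC
    print(p_T)                          # 0.5
    print(p_C * p_T_given_C / p_T)      # P(C | T) = 14/15, about 0.9333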

4.132. The value −1 should occur about 30% of the time (that is, the proportion should be
close to 0.3).

4.133. The mean of X is µ_X = (−1)(0.3) + (2)(0.7) = 1.1, so the mean of many such values
will be close to 1.1.

4.134. (a) µ_X = (1)(0.4) + (2)(0.6) = 1.6 and σ²_X = (1 − 1.6)²(0.4) + (2 − 1.6)²(0.6) = 0.24,
so σ_X = √0.24 ≈ 0.4899. (b) The mean is µ_Y = 3µ_X − 2 = 2.8. The variance is
σ²_Y = 9σ²_X = 2.16, and the standard deviation is σ_Y = √2.16 ≈ 1.4697 (this can also be
found as 3σ_X). (c) The first computation used Rule 1 for means. The second computation
used Rule 1 for variances and standard deviations.

4.135. (a) Because the possible values of X are 1 and 2, the possible values of Y are
3 · 1² − 2 = 1 (with probability 0.4) and 3 · 2² − 2 = 10 (with probability 0.6).
(b) µ_Y = (1)(0.4) + (10)(0.6) = 6.4 and σ²_Y = (1 − 6.4)²(0.4) + (10 − 6.4)²(0.6) = 19.44,
so σ_Y = √19.44 ≈ 4.4091. (c) Those rules are for transformations of the form aX + b, not
aX² + b.

4.136. (a) A and B are disjoint. (If A happens, B did not.) (b) A and B are independent. (A
concerns the first roll, B the second.) (c) A and B are independent. (A concerns the second
roll, B the first.) (d) A and B are neither disjoint nor independent. (If A happens, then so
does B.)

4.137. (a) P(A) = 5/36 and P(B) = 10/36 = 5/18. (b) P(A) = 5/36 and P(B) = 21/36 = 7/12.
(c) P(A) = 15/36 = 5/12 and P(B) = 10/36 = 5/18. (d) P(A) = 15/36 = 5/12 and
P(B) = 10/36 = 5/18.

4.138. (a) The distribution of X is given below; the mean is
µ_X = (2)(0.3) + (3)(0.4) + (4)(0.3) = 3. The variance is
σ²_X = (2 − 3)²(0.3) + (3 − 3)²(0.4) + (4 − 3)²(0.3) = 0.6,
so the standard deviation is σ_X = √0.6 ≈ 0.7746.

  Value of X      2     3        4
  Probability     0.3   0.4      0.3
  (b) & (c)       p     1 − 2p   p

(b) & (c) To achieve a mean of 3 with possible values 2, 3, and 4, the distribution must be
symmetric; that is, the probability at 2 must equal the probability at 4 (so that 3 would be
the balance point of the distribution). Let p be the probability assigned to 2 (and also to 4)
in the new distribution. A larger standard deviation is achieved when p > 0.3, and a smaller
standard deviation arises when p < 0.3. In either case, the new standard deviation is √(2p).

4.139. For each bet, the mean is the winning probability times the winning payout, plus the
losing probability times −$10. These are summarized below; all mean payoffs equal $0.

  Point     Expected payoff
  4 or 10   (1/3)(+$20) + (2/3)(−$10) = 0
  5 or 9    (2/5)(+$15) + (3/5)(−$10) = 0
  6 or 8    (5/11)(+$12) + (6/11)(−$10) = 0

Note: Alternatively, we can find the mean amount of money we have at the end of the
bet. For example, if the point is 4 or 10, we end with either $30 or $0, and our expected
ending amount is (1/3)($30) + (2/3)($0) = $10—equal to the amount of the bet.

4.140. P(A) = P(B) = · · · = P(F) = 0.72/6 = 0.12 and P(1) = · · · = P(8) = (1 − 0.72)/8 = 0.035.

4.141. (a) All probabilities are greater than or equal to 0, and their sum is 1. (b) Let R1
be Taster 1’s rating and R2 be Taster 2’s rating. Add the probabilities on the diagonal
(upper left to lower right): P(R1 = R2 ) = 0.03 + 0.07 + 0.25 + 0.20 + 0.06 = 0.61.
(c) P(R1 > 3) = 0.39 (the sum of the ten numbers in the bottom two rows), and
P(R2 > 3) = 0.39 (the sum of the ten numbers in the right two columns).

4.142. As σ_{a+bX} = bσ_X and σ_{c+dY} = dσ_Y, we need b = 100/106 and d = 100/109.
With these choices for b and d, we have µ_{a+bX} = a + bµ_X ≈ a + 419.8113, so
a ≈ 80.1887, and µ_{c+dY} = c + dµ_Y ≈ c + 519.2661, so c ≈ −19.2661.

4.143. This is the probability of 19 (independent) losses, followed by a win; by the
multiplication rule, this is 0.994¹⁹ · 0.006 ≈ 0.005352.
4.144. (a) P(win the jackpot) = (1/20)(8/20)(1/20) = 0.001. (b) The other symbol can show up
on the middle wheel, with probability (1/20)(12/20)(1/20) = 0.0015, or on either of the outside
wheels, with probability (19/20)(8/20)(1/20) = 0.019. Therefore, combining all three cases, we
have P(exactly two bells) = 0.0015 + 2 · 0.019 = 0.0395.

4.145. With B, M, and D representing the three kinds of degrees, and W meaning the degree
recipient was a woman, we have been given:
P(B) = 0.71, P(M) = 0.23, P(D) = 0.06,
P(W | B) = 0.44, P(W | M) = 0.41, P(W | D) = 0.30.
Therefore, we find
P(W) = P(W and B) + P(W and M) + P(W and D)
= P(B)P(W | B) + P(M)P(W | M) + P(D)P(W | D) = 0.4247,
so:
P(B | W) = P(B and W)/P(W) = P(B)P(W | B)/P(W) = 0.3124/0.4247 ≈ 0.7356.

4.146. The table below shows conditional distributions given T (public or private institution)
and given Y (two-year or four-year institution). The four numbers in the “Given T” group are
found by dividing each number in the original table by its column total (so each column sums
to 1). In the “Given Y” group, we divide each number by its row total (so each row sums
to 1). For example: P(two-year | Public) = 639/(639 + 1061) ≈ 0.3759,
P(two-year | Private) = 1894/(1894 + 622) ≈ 0.7528, and
P(Public | two-year) = 639/(639 + 1894) ≈ 0.2523.

              Given T              Given Y
              Public   Private     Public   Private
  Two-year    0.3759   0.7528      0.2523   0.7477
  4-year      0.6241   0.2472      0.6304   0.3696

4.147. P(no point is established) = 12/36 = 1/3. In Exercise 4.139, the probabilities of winning
each odds bet were given as 1/3 for 4 and 10, 2/5 for 5 and 9, and 5/11 for 6 and 8. The tree
diagram can get a bit large (and crowded); its first stage is the first roll (no point, or a point
of 4, 5, 6, 8, 9, or 10), and its second stage is whether the point is made before a 7.
The probability of winning an odds bet on 4 or 10 (with a net payout of $20) is
(3/36)(1/3) = 1/36. Losing that odds bet costs $10 and has probability (3/36)(2/3) = 2/36
(or 1/18). Similarly, the probability of winning an odds bet on 5 or 9 is (4/36)(2/5) = 2/45,
and the probability of losing that bet is (4/36)(3/5) = 3/45 (or 1/15). For an odds bet on
6 or 8, we win $12 with probability (5/36)(5/11) = 25/396, and lose $10 with probability
(5/36)(6/11) = 30/396 (or 5/66).
[Ends of the tree: no point, $0, 12/36; point 4 or 10, +$20 with probability 1/36 or −$10
with probability 2/36 each; point 5 or 9, +$15 with probability 2/45 or −$10 with
probability 3/45 each; point 6 or 8, +$12 with probability 25/396 or −$10 with
probability 30/396 each.]
To confirm that this game is fair, one can multiply each payoff by its probability and then
add up all of those products. More directly, because each individual odds bet is fair (as was
shown in the solution to Exercise 4.139), one can argue that taking the odds bet whenever it
is available must be fair.
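
The “multiply each payoff by its probability” check is easy to carry out exactly with Python’s fractions module (a sketch; the outcome list matches the branch probabilities above):

    from fractions import Fraction as F
    outcomes = [(0, F(12, 36)),                       # no point established
                (20, F(1, 36)), (-10, F(2, 36)),      # point 4
                (15, F(2, 45)), (-10, F(3, 45)),      # point 5
                (12, F(25, 396)), (-10, F(30, 396)),  # point 6
                (12, F(25, 396)), (-10, F(30, 396)),  # point 8
                (15, F(2, 45)), (-10, F(3, 45)),      # point 9
                (20, F(1, 36)), (-10, F(2, 36))]      # point 10
    print(sum(p for _, p in outcomes))      # 1: the branch probabilities are complete
    print(sum(x * p for x, p in outcomes))  # 0: the game is fair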

4.148. Student findings will depend on how much they explore the Web site. Individual growth
charts include weight-for-age, height-for-age, weight-for-length, head circumference-for-age,
and body mass index-for-age.

4.149. Let R₁ be Taster 1’s rating and R₂ be Taster 2’s rating. P(R₁ = 3) =
0.01 + 0.05 + 0.25 + 0.05 + 0.01 = 0.37, so
P(R₂ > 3 | R₁ = 3) = P(R₂ > 3 and R₁ = 3)/P(R₁ = 3) = (0.05 + 0.01)/0.37 ≈ 0.1622.

4.150. Note first that P(A) = 1/2 and P(B) = 2/4 = 1/2. Now P(B and A) = P(both coins are
heads) = 0.25, so P(B | A) = P(B and A)/P(A) = 0.25/0.5 = 0.5 = P(B).

4.151. The event {Y < 1/2} is the bottom half of the square, while {Y > X} is the upper left
triangle of the square. They overlap in a triangle with area 1/8, so
P(Y < 1/2 | Y > X) = P(Y < 1/2 and Y > X)/P(Y > X) = (1/8)/(1/2) = 1/4.

4.152. The response will be “no” with probability (0.5)(0.7) = 0.35. (That is, of the 70% who
have not plagiarized, half will say “no.”)
[Tree diagram: coin flip tails (0.5) → answer “yes”; coin flip heads (0.5) → answer honestly:
plagiarized (0.3 → “yes,” 0.15) or not (0.7 → “no,” 0.35).]
If the probability of plagiarism were 0.2, then P(student answers “no”) = (0.5)(0.8) = 0.4.
(Of the 80% who have not plagiarized, half say “no.”)
If 39% of students surveyed answered “no,” then we estimate that 2 · 39% = 78% have not
plagiarized, so about 22% have plagiarized.
Chapter 5 Solutions

5.1. The population is iPhone users (or iPhone users who use the AppsFire service). The
statistic is an average of 65 apps per device. Likely values will vary, in part based on how
many apps are on student phones (which they might consider “typical”).

5.2. With µ = 240, σ = 18, and n = 36, we have mean µ_x̄ = µ = 240 and standard deviation
σ_x̄ = σ/√n = 3.

5.3. When n = 144, the mean is µ_x̄ = µ = 240 (unchanged), and the standard deviation is
σ_x̄ = σ/√n = 1.5. Increasing n does not change µ_x̄ but decreases σ_x̄, the variability of
the sampling distribution. (In this case, because n was increased by a factor of 4, σ_x̄ was
halved.)

5.4. When n = 144, σ_x̄ = σ/√144 = 18/12 = 1.5. The sampling distribution of x̄ is
approximately N(240, 1.5), so about 95% of the time, x̄ is between 237 and 243.

5.5. When n = 1296, σ_x̄ = σ/√1296 = 18/36 = 0.5. The sampling distribution of x̄ is
approximately N(240, 0.5), so about 95% of the time, x̄ is between 239 and 241.
√  
5.6. With σ/√50 ≈ 3.54, we have P(x̄ < 28) = P((x̄ − 25)/3.54 < (28 − 25)/3.54) = P(Z < 0.85) ≈ 0.8023.
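
A sketch of the same calculation with scipy.stats (assuming µ = 25 and σ = 25, which give σ/√50 ≈ 3.54 as above):

    import math
    from scipy.stats import norm
    mu, sigma, n = 25, 25, 50
    se = sigma / math.sqrt(n)              # about 3.54
    print(norm.cdf(28, loc=mu, scale=se))  # about 0.80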

5.7. (a) Either change “variance” to “standard deviation” (twice), or change the formula at the
end to 10²/30. (b) Standard deviation decreases with increasing sample size. (c) µ_x̄ always
equals µ, regardless of the sample size.

5.8. (a) The distribution of x̄ is approximately Normal. (The distribution of observed
values—that is, the population distribution—is unaffected by the sample size.) (b) x̄ is
within µ ± 2σ/√n about 95% of the time. (c) The (distribution of the) sample mean x̄ is
approximately Normal. (µ is not random; it is just a number, albeit typically an unknown
one.)

5.9. (a) µ = 3388/10 = 338.8. (b) The scores will vary depending on the starting row. The
smallest and largest possible means are 290 and 370. (c) Answers will vary. A histogram of
the (exact) sampling distribution shows sample means ranging from 290 to 370; with a
sample size of only 3, the distribution is noticeably non-Normal. (d) The center of the exact
sampling distribution is µ, but with only 10 values of x̄, this might not be true for student
histograms.
Note: The histogram was found by considering all 1000 possible samples.


5.10. (a) σ_x̄ = σ/√200 ≈ 0.08132. (b) With n = 200, x̄ will be within ±0.16 (about 10
minutes) of µ = 7.02 hours. (c) P(x̄ ≤ 6.9) = P(Z ≤ (6.9 − 7.02)/0.08132) = P(Z ≤ −1.48) ≈
0.0694.

5.11. (a) With n = 200, the 95% probability range was about ±10 minutes, so we need a
larger sample size. (Specifically, to halve the range, we need to roughly quadruple the
sample size.) (b) We need 2σ_x̄ = 5/60 hour, so σ_x̄ ≈ 0.04167. (c) With σ = 1.15, we have
√n = 1.15/0.04167 ≈ 27.6, so n ≈ 761.76—use 762 students.
5.12. (a) The standard deviation is σ/√10 = 280/√10 ≈ 88.5438 seconds. (b) In order to have
σ/√n = 15 seconds, we need √n = 280/15, so n ≈ 348.4—use n = 349.

5.13. Mean µ = 250 ml and standard deviation σ/√6 = 3/√6 ≈ 1.2247 ml.

5.14. (a) For this exercise, bear in mind that the actual distribution for a single song length
is definitely not Normal; in particular, a Normal distribution with mean 350 seconds and
standard deviation 280 seconds extends well below 0 seconds. The Normal curve for x̄
should be taller by a factor of √10 and skinnier by a factor of 1/√10 (although that
technical detail will likely be lost on most students). (b) Using a N(350, 280) distribution,
1 − P(331 < X < 369) ≈ 1 − P(−0.07 < Z < 0.07) ≈ 0.9442. (c) Using a N(350, 88.5438)
distribution, 1 − P(331 < x̄ < 369) ≈ 1 − P(−0.21 < Z < 0.21) ≈ 0.8336.

5.15. In Exercise 5.13, we found that σ_x̄ ≈ 1.2247 ml, so x̄ has a N(250 ml, 1.2247 ml)
distribution. (a) The Normal curve for x̄ should be taller than the population curve by a
factor of √6 and skinnier by a factor of 1/√6. (b) The probability that a single can’s volume
differs from the target by at least 1 ml—one-third of a standard deviation—is
1 − P(−0.33 < Z < 0.33) = 0.7414. (c) The probability that x̄ is at least 1 ml from the target is
1 − P(249 < x̄ < 251) = 1 − P(−0.82 < Z < 0.82) = 0.4122.

5.16. For the population distribution (the number of friends of a randomly chosen individual),
µ = 130 and σ = 85 friends. (a) For the total number of friends for a sample of n = 30
users, the mean is nµ = 3900 and the standard deviation is σ√n ≈ 465.56 friends.
(b) For the mean number of friends, the mean is µ = 130 and the standard deviation is
σ/√n ≈ 15.519 friends. (c) P(x̄ > 140) = P(Z > (140 − 130)/15.519) = P(Z > 0.64) ≈ 0.2611
(software: 0.2597).

5.17. (a) x̄ is not systematically higher than or lower than µ; that is, it has no particular
tendency to underestimate or overestimate µ. (In other words, it is “just right” on the
average.) (b) With large samples, x̄ is more likely to be close to µ, because with a larger
sample comes more information (and therefore less uncertainty).
 
5.18. (a) P(X ≥ 23) = P(Z ≥ (23 − 19.2)/5.1) = P(Z ≥ 0.75) ≈ 0.2266 (with software:
0.2281). Because ACT scores are reported as whole numbers, we might instead compute
P(X ≥ 22.5) = P(Z ≥ 0.65) ≈ 0.2578 (software: 0.2588). (b) µ_x̄ = 19.2 and
σ_x̄ = σ/√25 = 1.02. (c) P(x̄ ≥ 23) = P(Z ≥ (23 − 19.2)/1.02) = P(Z ≥ 3.73) ≈ 0.0001. (In
this case, it is not appropriate to find P(x̄ ≥ 22.5), unless x̄ is rounded to the nearest
whole number.) (d) Because individual scores are only roughly Normal, the answer to (a) is
approximate. The answer to (c) is also approximate but should be more accurate because x̄
should have a distribution that is closer to Normal.
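
Both tail probabilities can be reproduced with scipy.stats (a sketch, using the N(19.2, 5.1) population quoted above):

    import math
    from scipy.stats import norm
    mu, sigma, n = 19.2, 5.1, 25
    print(norm.sf(23, loc=mu, scale=sigma))                 # (a): about 0.2281
    print(norm.sf(23, loc=mu, scale=sigma / math.sqrt(n)))  # (c): about 0.0001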
5.19. (a) µ_x̄ = 0.5 and σ_x̄ = σ/√50 = 0.7/√50 ≈ 0.09899. (b) Because this distribution
is only approximately Normal, it would be quite reasonable to use the 68–95–99.7 rule
to give a rough estimate: 0.6 is about one standard deviation above the mean, so the
probability should be about 0.16 (half of the 32% that falls outside ±1 standard deviation).
Alternatively, P(x̄ > 0.6) = P(Z > (0.6 − 0.5)/0.09899) = P(Z > 1.01) ≈ 0.1562.

5.20. (a) µ = (4)(0.33) + (3)(0.24) + (2)(0.18) + (1)(0.16) + (0)(0.09) = 2.56 and
σ² = (4 − 2.56)²(0.33) + (3 − 2.56)²(0.24) + (2 − 2.56)²(0.18) + (1 − 2.56)²(0.16) + (0 − 2.56)²(0.09) = 1.7664,
so σ = √1.7664 ≈ 1.3291. (b) µ_x̄ = µ = 2.56 and σ_x̄ = σ/√50 ≈ 0.1880. (c) P(X ≥ 3) =
0.33 + 0.24 = 0.57, and P(x̄ ≥ 3) = P(Z ≥ (3 − 2.56)/0.1880) = P(Z ≥ 2.34) ≈ 0.0096.

5.21. Let X be Sheila’s measured glucose level. (a) P(X > 140) = P(Z > 1.5) = 0.0668.
(b) If x̄ is the mean of three measurements (assumed to be independent), then x̄ has a
N(125, 10/√3), or N(125 mg/dl, 5.7735 mg/dl), distribution, and P(x̄ > 140) = P(Z >
2.60) = 0.0047.
5.22. (a) µ_X = ($500)(0.001) = $0.50 and σ_X = √249.75 ≈ $15.8035. (b) In the long run,
Joe makes about 50 cents for each $1 ticket. (c) If x̄ is Joe’s average payoff over a year,
then µ_x̄ = µ = $0.50 and σ_x̄ = σ_X/√104 ≈ $1.5497. The central limit theorem says that
x̄ is approximately Normally distributed (with this mean and standard deviation). (d) Using
this Normal approximation, P(x̄ > $1) = P(Z > 0.32) ≈ 0.3745 (software: 0.3735).
Note: Joe comes out ahead if he wins at least once during the year. This probability is
easily computed as 1 − (0.999)¹⁰⁴ ≈ 0.0988. The distribution of x̄ is different enough from a
Normal distribution so that answers given by the approximation are not as accurate in this
case as they are in many others.

5.23. The mean of three measurements has a N(125 mg/dl, 5.7735 mg/dl) distribution, and
P(Z > 1.645) = 0.05 if Z is N(0, 1), so L = 125 + 1.645 · 5.7735 ≈ 134.5 mg/dl.

5.24. x̄ is approximately Normal with mean 1.3 and standard deviation 1.5/√200 ≈
0.1061 flaws/yd², so P(x̄ > 2) = P(Z > 6.6) ≈ 0 (essentially).

5.25. If W is total weight, and x̄ = W/25, then
P(W > 5200) = P(x̄ > 208) = P(Z > (208 − 190)/(35/√25)) = P(Z > 2.57) ≈ 0.0051.

5.26. (a) Although the probability of having to pay for a total loss for one or more of the 12
policies is very small, if this were to happen, it would be financially disastrous. On the other
hand, for thousands of policies, the law of large numbers says that the average claim on
many policies will be close to the mean, so the insurance company can be assured that the
premiums they collect will (almost certainly) cover the claims. (b) The central limit theorem
says that, in spite of the skewness of the population distribution, the average loss among
10,000 policies will be approximately Normally distributed with mean $250 and standard
deviation σ/√10,000 = $1000/100 = $10. Since $275 is 2.5 standard deviations above the
mean, the probability of seeing an average loss over $275 is about 0.0062.
√ .
5.27. (a) The mean of six untreatedspecimens has a standard deviation of 2.2/ 6 =
− 57
0.8981 lbs, so P(x u > 50) = P Z > 50 0.8981 = P(Z > −7.79), which is basically 1.
 .
(b) x u − x t has mean 57 − 30 = 27 lbs and standard
 deviation 2.22 /6 + 1.62 /6 =
− 27 .
1.1106 lbs, so P(x u − x t > 25) = P Z > 25 1.1106 = P(Z > −1.80) = 0.9641.

5.28. (a) The central limit theorem says that the sample means will be roughly
Normal. Note that the distribution of individual scores cannot have extreme outliers
because all scores are between 1 and 7. (b) For Journal scores, ȳ has mean 4.8 and
standard deviation 1.5/√28 ≈ 0.2835. For Enquirer scores, x̄ has mean 2.4 and
standard deviation 1.6/√28 ≈ 0.3024. (c) ȳ − x̄ has (approximately) a Normal
distribution with mean 2.4 and standard deviation √(1.5²/28 + 1.6²/28) ≈ 0.4145.
(d) P(ȳ − x̄ ≥ 1) = P(Z ≥ (1 − 2.4)/0.4145) = P(Z ≥ −3.38) ≈ 0.9996.

√ √
5.29. (a) y has a N (µY , σY / m ) distribution and x has a N (µ X , σ X / n ) distribution.
 y − x has a Normal distribution with mean µY − µ X and standard deviation
(b)
σY2 /m + σ X2 /n.

5.30. We have been given µ_X = 9%, σ_X = 19%, µ_Y = 11%, σ_Y = 17%, and ρ = 0.6.
(a) Linda’s return R = 0.7X + 0.3Y has mean µ_R = 0.7µ_X + 0.3µ_Y = 9.6% and standard
deviation σ_R = √((0.7σ_X)² + (0.3σ_Y)² + 2ρ(0.7σ_X)(0.3σ_Y)) ≈ 16.8611%. (b) R̄, the average
return over 20 years, has approximately a Normal distribution with mean 9.6% and standard
deviation σ_R/√20 ≈ 3.7703%, so P(R̄ < 5%) = P(Z < −1.22) ≈ 0.1112. (c) After a
12% gain in the first year, Linda would have $1120; with a 6% gain in the second year, her
portfolio would be worth $1187.20. By contrast, two years with a 9% return would make
her portfolio worth $1188.10.
Note: As the text suggests, the appropriate average for this situation is (a variation
on) the geometric mean, computed as √((1.12)(1.06)) − 1 ≈ 8.9587%. Generally, if the
sequence of annual returns is r₁, r₂, . . . , r_k (expressed as decimals), the mean return is
((1 + r₁)(1 + r₂) · · · (1 + r_k))^(1/k) − 1. It can be shown that the geometric mean is always
smaller than the arithmetic mean, unless all the returns are the same.
smaller than the arithmetic mean, unless all the returns are the same.

5.31. The total height H of the four rows has a Normal distribution with mean
4 × 8 = 32 inches and standard deviation 0.1√4 = 0.2 inch. P(H < 31.5 or H > 32.5) =
1 − P(31.5 < H < 32.5) = 1 − P(−2.50 < Z < 2.50) = 1 − 0.9876 = 0.0124.

5.32. n = 250 (the sample size), p̂ = 45% = 0.45, and X = n p̂ = 112.5. (Because X must be
a whole number, it was either 112 or 113, and the reported value of p̂ was rounded.)

5.33. (a) n = 1500 (the sample size). (b) The “Yes” count seems like the most reasonable
choice, but either count is defensible. (c) X = 825 (or X = 675). (d) p̂ = 825/1500 = 0.55 (or
p̂ = 675/1500 = 0.45).

5.34. Assuming no multiple births (twins, triplets, quadruplets), we have four independent
trials, each with probability of success (type O blood) equal to 0.25, so the number of
children with type O blood has the B(4, 0.25) distribution.

5.35. We have 15 independent trials, each with probability of success (heads) equal to 0.5, so
X has the B(15, 0.5) distribution.

5.36. Assuming each free-throw attempt is an independent trial, X has the B(10, 0.8)
distribution, and P(X ≤ 4) = 0.0064.

5.37. (a) For the B(5, 0.4) distribution, P(X = 0) = 0.0778 and P(X ≥ 3) = 0.3174. (b) For
the B(5, 0.6) distribution, P(X = 5) = 0.0778 and P(X ≤ 2) = 0.3174. (c) The number
of “failures” in the B(5, 0.4) distribution has the B(5, 0.6) distribution. With 5 trials,
0 successes is equivalent to 5 failures, and 3 or more successes is equivalent to 2 or fewer
failures.
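
This success/failure symmetry is easy to check numerically with scipy.stats (a sketch):

    from scipy.stats import binom
    print(binom.pmf(0, 5, 0.4), binom.pmf(5, 5, 0.6))  # both about 0.0778
    # P(X >= 3) under B(5, 0.4) and P(X <= 2) under B(5, 0.6): both about 0.3174
    print(binom.sf(2, 5, 0.4), binom.cdf(2, 5, 0.6))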

5.38. (a) For the B(100, 0.5) distribution, µ_p̂ = p = 0.5 and σ_p̂ = √(p(1 − p)/n) = 1/20 = 0.05.
(b) No; the mean and standard deviation of the sample count are both 100 times bigger.
(That is, p̂ = X/100, so µ_p̂ = µ_X/100 and σ_p̂ = σ_X/100.)

5.39. (a) p̂ has approximately a Normal distribution with mean 0.5 and standard deviation
0.05, so P(0.3 < p̂ < 0.7) = P(−4 < Z < 4) ≈ 1. (b) P(0.35 < p̂ < 0.65) = P(−3 < Z < 3) ≈
0.9974.
Note: For the second, the 68–95–99.7 rule would give 0.997—an acceptable answer,
especially since this is an approximation anyway. For comparison, the exact answers (to
four decimal places) are P(0.3 < p̂ < 0.7) ≈ 0.9999 or P(0.3 ≤ p̂ ≤ 0.7) ≈ 1.0000, and
P(0.35 < p̂ < 0.65) ≈ 0.9965 or P(0.35 ≤ p̂ ≤ 0.65) ≈ 0.9982. (Notice that the “correct”
answer depends on our understanding of “between.”)
    .
5.40. (a) P(X ≥ 3) = C(4, 3)(0.53)³(0.47) + C(4, 4)(0.53)⁴ ≈ 0.3588. (b) If the coin were fair,
P(X ≥ 3) = C(4, 3)(0.5)³(0.5) + C(4, 4)(0.5)⁴ = 0.3125.

5.41. (a) Separate flips are independent (coins have no “memory,” so they do not try to
compensate for a lack of tails). (b) Separate flips are independent (coins have no “memory,”
so they do not get on a “streak” of heads). (c) p̂ can vary from one set of observed data to
another; it is not a parameter.

5.42. (a) X is a count; p̂ is a proportion. (b) The given formula is the standard deviation for a
binomial proportion. The variance for a binomial count is np(1 − p). (c) The rule of thumb
in the text is that np and n(1 − p) should both be at least 10. If p is close to 0 (or close to
1), n = 1000 might not satisfy this rule of thumb. (See also the solution to Exercise 5.22.)

5.43. (a) A B(200, p) distribution seems reasonable for this setting (even though we do not
know what p is). (b) This setting is not binomial; there is no fixed value of n. (c) A
B(500, 1/12) distribution seems appropriate for this setting. (d) This is not binomial,
because separate cards are not independent.

5.44. (a) This is not binomial; X is not a count of successes. (b) A B(20, p) distribution
seems reasonable, where p (unknown) is the probability of a defective pair. (c) This should
be (at least approximately) the B(n, p) distribution, where n is the number of students
in our sample, and p is the probability that a randomly-chosen student eats at least five
servings of fruits and vegetables.

5.45. (a) C, the number caught, is B(10, 0.7). M, the number missed, is B(10, 0.3).
(b) Referring to Table C, we find P(M ≥ 4) = 0.2001 + 0.1029 + 0.0368 + 0.0090 +
0.0014 + 0.0001 = 0.3503 (software: 0.3504).

5.46. (a) The B(20, 0.3) distribution (at least approximately). (b) P(X ≥ 8) = 0.2277.

5.47. (a) The mean of C is (10)(0.7) = 7 errors caught; for M the mean is (10)(0.3) = 3
errors missed. (b) The standard deviation of C (or M) is σ = √((10)(0.7)(0.3)) ≈
1.4491 errors. (c) With p = 0.9, σ = √((10)(0.9)(0.1)) ≈ 0.9487 errors; with p = 0.99,
σ ≈ 0.3146 errors. σ decreases toward 0 as p approaches 1.

5.48. X , the number who listen to streamed music daily, has the B(20, 0.25) distribution.
(a) µ X = np = 5, and µp̂ = 0.25. (b) With n = 200, µ X = 50 and µp̂ = 0.25. With
n = 2000, µ X = 500 and µp̂ = 0.25. µ X increases with n, while µp̂ does not depend on n.

5.49. m = 6: P(X ≥ 6) = 0.0473 and P(X ≥ 5) = 0.1503.

5.50. (a) The population (the 75 members of the fraternity) is only 2.5 times the size of
the sample. Our rule of thumb says that this ratio should be at least 20. (b) Our rule of
thumb for the Normal approximation calls for np and n(1 − p) to be at least 10; we have
np = (1000)(0.002) = 2.

5.51. The count of 5s among n random digits has a binomial distribution with p = 0.1.
(a) P(at least one 5) = 1 − P(no 5) = 1 − (0.9)⁵ ≈ 0.4095. (Or take 0.5905 from Table C
and subtract from 1.) (b) µ = (40)(0.1) = 4.

5.52. One sample of 15 flips is shown on the


right. Results will vary quite a bit; Table C
shows that 99.5% of the time, there will be 4
or fewer bad records in a sample of 15.
Out of 25 samples, most students should
see 2 to 12 samples with no bad records.
That is, N , the number of samples with no
bad records, has the B(25, 0.2683) distribu-
tion, and P(2 ≤ N ≤ 12) = 0.9894.

5.53. (a) n = 4 and p = 1/4 = 0.25. (b) The distribution is below; its probability histogram is
right-skewed, with the mean marked at µ = 1. (c) µ = np = 1.

  x          0      1      2      3      4
  P(X = x)   .3164  .4219  .2109  .0469  .0039

5.54. For p̂, µ = 0.52 and σ = √(p(1 − p)/n) ≈ 0.01574. As p̂ is approximately Normally
distributed with this mean and standard deviation, we find
P(0.49 < p̂ < 0.55) = P(−1.91 < Z < 1.91) ≈ 0.9438.
(Software computation of the Normal probability gives 0.9433. Using a binomial
distribution, we can also find P(493 ≤ X ≤ 554) ≈ 0.9495.)

5.55. Recall that p̂ is approximately Normally distributed with mean µ = p
and standard deviation √(p(1 − p)/n). (a) With p = 0.24, σ ≈ 0.01333, so
P(0.22 < p̂ < 0.26) = P(−1.50 < Z < 1.50) ≈ 0.8664. (Software computation
of the Normal probability gives 0.8666. Using a binomial distribution, we can also
find P(226 ≤ X ≤ 267) ≈ 0.8752.) (b) With p = 0.04, σ ≈ 0.00611, so
P(0.02 < p̂ < 0.06) = P(−3.27 < Z < 3.27) ≈ 0.9990. (Using a binomial distribution, we
can also find P(21 ≤ X ≤ 62) ≈ 0.9992.) (c) P(−0.02 < p̂ − p < 0.02) increases to 1 as p
gets closer to 0. (This is because σ also gets close to 0, so that 0.02/σ grows.)

5.56. When n = 300, the distribution of p̂ is approximately Normal with mean 0.52 and
standard deviation 0.02884 (nearly twice that in Exercise 5.54). When n = 5000, the
standard deviation drops to 0.00707 (less than half as big as in Exercise 5.54). Therefore:
n = 300: P(0.49 < p̂ < 0.55) = P(−1.04 < Z < 1.04) ≈ 0.7016
n = 5000: P(0.49 < p̂ < 0.55) = P(−4.25 < Z < 4.25) ≈ 1
Larger samples give a better probability that p̂ will be close to the true proportion p.
(Software computation of the first Normal probability gives 0.7017; using a binomial
distribution, we can also find P(147 ≤ X ≤ 165) ≈ 0.7278. These more accurate answers do
not change our conclusion.)

5.57. (a) The mean is µ = p = 0.69, and the standard deviation is σ = √(p(1 − p)/n) ≈
0.0008444. (b) µ ± 2σ gives the range 68.83% to 69.17%. (c) This range is considerably
narrower than the historical range. In fact, 67% and 70% correspond to z = −23.7 and
z = 11.8—suggesting that the observed percents do not come from a N(0.69, 0.0008444)
distribution; that is, the population proportion has changed over time.

5.58. (a) p̂ = 294/400 = 0.735. (b) With p = 0.8, σ_p̂ = √((0.8)(0.2)/400) = 0.02. (c) Still
assuming that p = 0.8, we would expect that about 95% of the time, p̂ should fall between
0.76 and 0.84. (d) It appears that these students prefer this type of course less than the
national average. (The observed value of p̂ is quite a bit lower than we would expect from a
N(0.8, 0.02) distribution, which suggests that it came from a distribution with a lower mean.)

5.59. (a) p̂ = 140/200 = 0.7. (b) We want P(X ≥ 140), or equivalently P(p̂ ≥ 0.7). The first
can be found exactly (using a binomial distribution), or we can compute either using a
Normal approximation (with or without the continuity correction). All possible answers are
shown below. (c) The sample results are higher than the national percentage, but the sample
was so small that such a difference could arise by chance even if the true campus proportion
is the same.

  Exact     Normal approximation    With continuity correction
  prob.     Table     Software      Table     Software
  0.2049    0.1841    0.1835        0.2033    0.2041
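
All three kinds of answers can be reproduced with scipy.stats; a sketch, assuming a national proportion of p = 0.67 (this value is consistent with the table entries but is not stated in the excerpt above):

    import math
    from scipy.stats import binom, norm
    n, p = 200, 0.67                       # p = 0.67 is an assumption (see lead-in)
    se = math.sqrt(p * (1 - p) / n)
    print(binom.sf(139, n, p))             # exact P(X >= 140): about 0.2049
    print(norm.sf((140 / n - p) / se))     # Normal approximation: about 0.1835
    print(norm.sf((139.5 / n - p) / se))   # with continuity correction: about 0.2041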


5.60. As σ_p̂ = √(p(1 − p)/n), we have 0.004² = (0.52)(0.48)/n, so n = 15,600.

5.61. (a) p = 1/4 = 0.25. (b) P(X ≥ 10) = 0.0139. (c) µ = np = 5 and
σ = √(np(1 − p)) = √3.75 ≈ 1.9365 successes. (d) No: The trials would not be independent
because the subject may alter his/her guessing strategy based on this information.

5.62. (a) µ = (1200)(0.75) = 900 and σ = √225 = 15 students. (b) P(X ≥ 800) =
P(Z ≥ −6.67) ≈ 1 (essentially). (c) P(X ≥ 951) = P(Z ≥ 3.4) ≈ 0.0003. (d) With n = 1300,
P(X ≥ 951) = P(Z ≥ −1.54) ≈ 0.9382. Other answers are shown below.

  Normal approximation    With continuity correction
  Table     Software      Table     Software
  0.9382    0.9379        0.9418    0.9417

5.63. (a) X, the count of successes, has the B(900, 1/5) distribution, with mean
µ_X = np = (900)(1/5) = 180 and σ_X = √((900)(0.2)(0.8)) = 12 successes.
(b) For p̂, the mean is µ_p̂ = p = 0.2 and σ_p̂ = √((0.2)(0.8)/900) ≈ 0.01333.
(c) P(p̂ > 0.24) = P(Z > 3) = 0.0013. (d) From a standard Normal distribution,
P(Z > 2.326) = 0.01, so the subject must score 2.326 standard deviations above the mean:
µ_p̂ + 2.326σ_p̂ ≈ 0.2310. This corresponds to 208 or more successes.
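
Part (d) is an inverse-Normal calculation; a sketch with scipy.stats:

    import math
    from scipy.stats import norm
    p, n = 0.2, 900
    se = math.sqrt(p * (1 - p) / n)            # about 0.01333
    cutoff = norm.ppf(0.99, loc=p, scale=se)   # 99th percentile of p-hat
    print(cutoff, math.ceil(cutoff * n))       # about 0.2310 -> 208 successes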
5.64. (a) M has the B(30, 0.65) distribution, so P(M = 20) = C(30, 20)(0.65)²⁰(0.35)¹⁰ ≈ 0.1502.
(b) P(1st woman is the 4th call) = (0.65)³(0.35) = 0.0961.

5.65. (a) p = 23,772,494/209,128,094 ≈ 0.1137. (b) If B is the number of blacks, then B has
(approximately) the B(1200, 0.1137) distribution, so the mean is np ≈ 136.4 blacks.
(c) P(B ≤ 100) = P(Z ≤ −3.31) ≈ 0.0005.
Note: In (b), the population is at least 20 times as large as the sample, so our “rule of
thumb” for using a binomial distribution is satisfied. In fact, the mean would be the same
even if we could not use a binomial distribution, but we need to have a binomial distribution
for part (c), so that we can approximate it with a Normal distribution—which we can safely
do, because both np and n(1 − p) are much greater than 10.
 
5.66. (a) C(n, n) = n!/(n! 0!) = 1. The only way to distribute n successes among n observations
is for all observations to be successes. (b) C(n, n − 1) = n!/((n − 1)! 1!) = n · (n − 1)!/(n − 1)! = n.
To distribute n − 1 successes among n observations, the one failure must be either
observation 1, 2, 3, . . . , n − 1, or n. (c) C(n, k) = n!/(k! (n − k)!) = n!/((n − k)! [n − (n − k)]!) =
C(n, n − k). Distributing k successes is equivalent to distributing n − k failures.

5.67. Jodi’s number of correct answers will have the B(n, 0.88) distribution. (a) With n = 100
questions, P(p̂ ≤ 0.85) = P(X ≤ 85); see line 1 of the table below. (b) With n = 250 questions,
P(p̂ ≤ 0.85) = P(X ≤ 212); see line 2. (c) For a test with 400 questions, the standard deviation
of p̂ would be half as big as the standard deviation of p̂ for a test with 100 questions: With
n = 100, σ = √((0.88)(0.12)/100) ≈ 0.03250; and with n = 400, σ = √((0.88)(0.12)/400) ≈
0.01625. (d) Yes: Regardless of p, n must be quadrupled to cut the standard deviation in half.

  Exact     Normal approximation    With continuity correction
  prob.     Table     Software      Table     Software
  0.2160    0.1788    0.1780        0.2206    0.2209
  0.0755    0.0594    0.0597        0.0721    0.0722

  
5.68. (a) P(first ace appears on toss 2) = (5/6)(1/6) = 5/36.
(b) P(first ace appears on toss 3) = (5/6)(5/6)(1/6) = 25/216.
(c) P(first ace appears on toss 4) = (5/6)³(1/6).
P(first ace appears on toss 5) = (5/6)⁴(1/6).

5.69. Y has possible values 1, 2, 3, . . . . P(first ace appears on toss k) = (5/6)^(k−1)(1/6).

5.70. With µ = $430, σ = $140, and n = 500, the distribution of x̄ is approximately Normal
with mean $430 and σ_x̄ = 140/√500 ≈ $6.2610, so P(x̄ > 440) = P(Z > 1.60) ≈ 0.0548
(software: 0.0551).
√ .
5.71. (a) With σ_x̄ = 0.08/√3 ≈ 0.04619, x̄ has (approximately) a N(123 mg, 0.04619 mg)
distribution. (b) P(x̄ ≥ 124) = P(Z ≥ 21.65), which is essentially 0.

5.72. (a) The table of standard deviations is given below. (b) A graph of standard deviation
against sample size is shown as a scatterplot, but in this situation it would be reasonable to
“connect the dots” because the relationship between standard deviation and sample size
holds for all n. (c) As n increases, the standard deviation decreases—at first quite rapidly,
then more slowly (a demonstration of the law of diminishing returns).

  n       σ/√n
  1       100
  4       50
  25      20
  100     10
  250     6.32
  500     4.47
  1000    3.16
  5000    1.41

5.73. (a) Out of 12 independent vehicles, the number X with one person has the B(12, 0.755)
distribution, so P(X ≥ 7) = 0.9503 (using software or a calculator). (b) Y (the number
of one-person cars in a sample of 80) has the B(80, 0.755) distribution. Regardless of
the approach used—Normal approximation, or exact computation using software or a
calculator—P(Y ≥ 41) ≈ 1.

5.74. This would not be surprising: Assuming that all the authors are independent (for
example, none were written by siblings or married couples), we can view the 12
names as being a random sample, so that the number N of occurrences of the ten most
common names would have a binomial distribution with n = 12 and p = 0.056. Then
P(N = 0) = (1 − 0.056)¹² ≈ 0.5008.

5.75. The probability that the first digit is 1, 2, or 3 is 0.301 + 0.176 + 0.125 = 0.602, so the
number of invoices for amounts beginning with these digits should have a binomial distribution
with n = 1000 and p = 0.602. More usefully, the proportion p̂ of such invoices should
have approximately a Normal distribution with mean p = 0.602 and standard deviation
√(p(1 − p)/1000) ≈ 0.01548, so P(p̂ ≤ 560/1000) = P(Z ≤ −2.71) ≈ 0.0034. (Software
Normal: 0.0033; with the continuity correction: 0.0037.)

5.76. (a) If R is the number of red-blossomed plants out of a sample of 12, then P(R = 9) =
0.2581, using a binomial distribution with n = 12 and p = 0.75. (For Table C, use p = 0.25
and find P(X = 3), where X = 12 − R is the number of flowers with nonred blossoms.)
(b) With n = 120, the mean number of red-blossomed plants is np = 90. (c) If R₂ is the
number of red-blossomed plants out of a sample of 120, then P(R₂ ≥ 80) = P(Z ≥ −2.11) ≈
0.9826. (Other possible answers: exact probability 0.9845; software Normal 0.9825; with the
continuity correction, 0.9864 or 0.9866.)


5.77. If x̄ is the average weight of 12 eggs, then x̄ has a N(65 g, 5/√12 g) =
N(65 g, 1.4434 g) distribution, and P(755/12 < x̄ < 830/12) = P(−1.44 < Z < 2.89) ≈ 0.9231
(software: 0.9236).

5.78. (a) The machine that makes the caps and the machine that applies the torque are not
the same. (b) T (torque) is N(7.0, 0.9) and S (cap strength) is N(10.1, 1.2), so T − S is
N(7 − 10.1, √(0.9² + 1.2²)) = N(−3.1 inch·lb, 1.5 inch·lb). The probability that the cap
breaks is P(T > S) = P(T − S > 0) = P(Z > 2.07) ≈ 0.0192 (software: 0.0194).

5.79. The center line is µ_x̄ = µ = 4.25 and the control limits are µ ± 3σ/√5 = 4.0689 to
4.4311.
√ √
5.80. (a) x̄ has a N(32, 6/√25) = N(32, 1.2) distribution, and ȳ has a N(29, 5/√25) =
N(29, 1) distribution. (b) ȳ − x̄ has a N(29 − 32, √(5²/25 + 6²/25)) = N(−3, 1.5620)
distribution. (c) P(ȳ ≥ x̄) = P(ȳ − x̄ ≥ 0) = P(Z ≥ 1.92) ≈ 0.0274.

5.81. (a) p̂_F is approximately N(0.82, 0.01921) and p̂_M is approximately N(0.88, 0.01625).
(b) When we subtract two independent Normal random variables, the difference is Normal.
The new mean is the difference of the two means (0.88 − 0.82 = 0.06), and the new variance
is the sum of the variances (0.01921² + 0.01625² = 0.000633), so p̂_M − p̂_F is approximately
N(0.06, 0.02516). (c) P(p̂_F > p̂_M) = P(p̂_M − p̂_F < 0) = P(Z < −2.38) ≈ 0.0087
(software: 0.0085).

5.82. (a) Yes; this rule works for any random variables X and Y . (b) No; this rule requires that
X and Y be independent. The incomes of two married people are certainly not independent,
as they are likely to be similar in many characteristics that affect income (for example,
educational background).

5.83. For each step of the random walk, the mean is µ = (1)(0.6) + (−1)(0.4) = 0.2, the
variance is σ² = (1 − 0.2)²(0.6) + (−1 − 0.2)²(0.4) = 0.96, and the standard deviation is
σ = √0.96 ≈ 0.9798. Therefore, Y/500 has approximately a N(0.2, 0.04382) distribution,
and P(Y ≥ 200) = P(Y/500 ≥ 0.4) = P(Z ≥ 4.56) ≈ 0.
Note: The number R of right-steps has a binomial distribution with n = 500 and
p = 0.6. Y ≥ 200 is equivalent to taking at least 350 right-steps, so we can also compute
this probability as P(R ≥ 350), for which software gives the exact value 0.00000215. . . .
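
A sketch comparing the note’s exact binomial value with the Normal approximation used above:

    import math
    from scipy.stats import binom, norm
    n, p = 500, 0.6
    print(binom.sf(349, n, p))   # exact P(R >= 350): about 2.15e-6
    # Normal approximation: Y/500 is roughly N(0.2, sqrt(0.96/500))
    print(norm.sf(0.4, loc=0.2, scale=math.sqrt(0.96 / 500)))  # about 2.5e-6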
Chapter 6 Solutions

6.1. σ_x̄ = σ/√25 = $2.50/5 = $0.50.

6.2. Take two standard deviations: 2 · $2.50/5 = 2 · $0.50 = $1.00.

6.3. As in the previous solution, take two standard deviations: $1.00.
Note: This is the whole idea behind a confidence interval: Probability tells us that x̄ is
usually close to µ. That is equivalent to saying that µ is usually close to x̄.

6.4. Shown below are sample output screens for (a) 10 and (b) 1000 SRSs. In 99.4% of all
repetitions of part (a), students should see between 5 and 10 hits (that is, at least 5 of the 10
SRSs capture the true mean µ). Out of 1000 80% confidence intervals, nearly all students
will observe between 76% and 84% capturing the mean.

6.5. The standard error is s_x̄ = s/√100 = 0.3, and the 95% confidence interval for µ is
87.3 ± 1.96 · 3/√100 = 87.3 ± 0.588 = 86.712 to 87.888.

6.6. A 99% confidence interval would have a larger margin of error; a wider interval is needed
in order to be more confident that the interval includes the true mean. The 99% confidence
interval for µ is
87.3 ± 2.576 · 3/√100 = 87.3 ± 0.7728 = 86.527 to 88.073.
6.7. n = ((1.96)(12,000)/1000)² ≈ 553.19—take n = 554.
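
A sketch of the sample-size calculation, rounding up as usual:

    import math
    z, sigma, m = 1.96, 12000, 1000     # desired margin of error m
    n = (z * sigma / m) ** 2
    print(n, math.ceil(n))              # 553.19 -> use n = 554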

6.8. In the previous exercise, we found that n = 554 would give a margin of error of $1000.
The margin of error 1.96 × $12,000/√n would be smaller than $1000 with a larger sample
(n > 554), and larger with a smaller sample (n < 554).
If all 1000 graduates respond, the margin of error would be (1.96)($379.47) ≈ $743.77; if
n = 500, the margin of error would be (1.96)($536.66) ≈ $1051.85.


6.9. The (useful) response rate is 249/5800 ≈ 0.0429, or about 4.3%. The reported margin of error
is probably unreliable because we know nothing about the 95.7% of students that did not
provide (useful) responses; they may be more (or less) likely to charge education-related
expenses.

6.10. (a) The 95% confidence interval is 87 ± 10 = 77 to 97. (The sample size is not needed.)
(b) Greater than 10: A wider margin of error is needed in order to be more confident that
the interval includes the true mean.
Note: If this result is based on a Normal distribution, the margin of error for 99%
confidence would be roughly 13.1, because we multiply by 2.576 rather than 1.96.

6.11. The margins of error are 1.96 × 7/√n; for n = 10, 20, 40, and 80, this yields
4.3386, 3.0679, 2.1693, and 1.5339. (And, of course, all intervals are centered at 50.)
Interval width decreases with increasing sample size.


6.12. The margins of error are z* × 14/√49 = 2z*. With z* equal to 1.282, 1.645, 1.960,
and 2.576 (for 80%, 90%, 95%, and 99% confidence), this yields 2.564, 3.290, 3.920, and
5.152. (And, of course, all intervals are centered at 70.) Increasing confidence makes the
interval wider.


6.13. (a) She did not divide the standard deviation by √n = √20. (b) Confidence intervals
concern the population mean, not the sample mean. (The value of the sample mean is
known to be 8.6; it is the population mean that we do not know.) (c) 95% is a confidence
level, not a probability. Furthermore, it does not make sense to make probability statements
about the population mean µ, which is an unknown constant (rather than a random
quantity). (d) The large sample size does not affect the distribution of individual alumni
ratings (the population distribution). The use of a Normal distribution is justified because the
distribution of the sample mean is approximately Normal when the sample is large.
Note: For part (c), a Bayesian statistician might view the population mean µ as a
random quantity, but the viewpoint taken in the text is non-Bayesian.

6.14. (a) The standard deviation should be divided by √100 = 10, not by 100. (b) The correct
interpretation is that (with 95% confidence) the average time spent at the site is between
3.71 and 4.69 hours. That is, the confidence interval is a statement about the population
mean, not about the individual members. (c) To halve the margin of error, the sample size
needs to be quadrupled, to about 400. (In fact, n = 385 would be enough.)

6.15. (a) To estimate the mean importance of recreation to college satisfaction, the 95%
confidence interval for µ is 7.5 ± 1.96 · (3.9/√2673) = 7.5 ± 0.1478 = 7.3522 to 7.6478.
(b) The 99% confidence interval for µ is
7.5 ± 2.576 · (3.9/√2673) = 7.5 ± 0.1943 = 7.3057 to 7.6943.

6.16. We must assume that the 2673 students were chosen as an SRS (or something like it).
The non-Normality of the population distribution is not a problem; we have a very large
sample, so the central limit theorem applies.

6.17. For mean TRAP level, the margin of error is 2.29 U/l, and the 95% confidence interval
for µ is 13.2 ± 1.96 · (6.5/√31) = 13.2 ± 2.29 = 10.91 to 15.49 U/l.

6.18. For mean OC level, the 95% confidence interval for µ is
33.4 ± 1.96 · (19.6/√31) = 33.4 ± 6.90 = 26.50 to 40.30 ng/ml.

6.19. Scenario B has a smaller margin of error. Both samples would have the same value of z ∗
(1.96), but the value of σ would be smaller for (B) because we would have less variability
in textbook cost for students in a single major.
Note: Of course, at some schools, taking a sample of 100 sophomores in a given major
is not possible. However, even if we sampled students from a number of institutions, we still
might expect less variability within a given major than from a broader cross-section.

6.20. We assume that the confidence interval was constructed using the methods of
this chapter; that is, we assume that it is x ± 1.96σ/√2500. Then the center of the given
confidence interval is x = ½($45,330 + $46,156) = $45,743, the margin of error
is $46,156 − x = $413, and the 99% confidence margin of error is therefore
$413 · (2.576/1.96) ≈ $542.8. Then the desired confidence interval is
($48,633 − x) ± $542.8 = $2347.2 to $3432.8;
that is, the average starting salary at this institution is about $2300 to $3400 less than the
NACE mean.

6.21. (a) “The probability is about 0.95 that x is within 14 kcal/day of . . . µ” (because 14
is two standard deviations). (b) This is simply another way of understanding the statement
from part (a): If |x − µ| is less than 14 kcal/day 95% of the time, then “about 95% of all
samples will capture the true mean . . . in the interval x plus or minus 14 kcal/day.”

6.22. For the mean monthly rent for unfurnished one-bedroom apartments in Dallas, the 95%
confidence interval for µ is

$980 ± 1.96 · ($290/√10) = $980 ± $179.74 = $800.26 to $1159.74

6.23. No; this is a range of values for the mean rent, not for individual rents.
Note: To find a range to include 95% of all rents, we should take µ ± 2σ (or more
precisely, µ ± 1.96σ), where µ is the (unknown) mean rent for all apartments, and σ is
the standard deviation for all apartments (assumed to be $290 in Exercise 6.22). If µ were
equal to $1050, for example, this range would be about $470 to $1630. However, because
we do not actually know µ, we estimate it using x, and to account for the variability in x,
we must widen the margin of error by a factor of √(1 + 1/n). The formula x ± 2σ√(1 + 1/10) is
called a prediction interval for future observations. (Usually, such intervals are constructed
with the t distribution, discussed in Chapter 7, but the idea is the same.)

6.24. If the distribution were roughly Normal, the 68–95–99.7 rule says that 68% of all
measurements should be in the range 13.8 to 53.0 ng/ml, 95% should be between −5.8
and 72.6 ng/ml, and 99.7% should be between −25.4 and 92.2 ng/ml. Because the
measurements cannot be negative, this suggests that the distribution must be skewed to
the right. The Normal confidence interval should be fairly accurate nonetheless because
the central limit theorem says that the distribution of the sample mean x will be roughly
Normal.

6.25. (a) For the mean number of hours spent on the Internet, the 95% confidence interval for
µ is 19 ± 1.96 · (5.5/√1200) = 19 ± 0.3112 = 18.6888 to 19.3112 hours.
(b) No; this is a range of values for the mean time spent, not for individual times. (See also
the comment in the solution to Exercise 6.23.)

6.26. (a) To change from hours to minutes, multiply by 60: xm = 60 · xh = 1140 and
σm = 60σh = 330 minutes. (b) For mean time in minutes, the 95% confidence interval for µ
is 1140 ± 1.96 · (330/√1200) = 1140 ± 18.67 = 1121.33 to 1158.67 minutes.
(c) This interval can be found by multiplying the previous interval (18.6888 to
19.3112 hours) by 60.

6.27. (a) We can be 95% confident, but not certain. (b) We obtained the interval 85% to
95% by a method that gives a correct result (that is, includes the true mean) 95% of the
time. (c) For 95% confidence, the margin of error is about two standard deviations (that
is, z* = 1.96), so σestimate ≈ 2.5%. (d) No; confidence intervals only account for random
sampling error.

6.28. (a) The standard deviation of the mean is σx = 3.5/√20 ≈ 0.7826 mpg. (b) A
stemplot (below) does not suggest any severe deviations from Normality. The
mean of the 20 numbers in the sample is x = 43.17 mpg. (c) If µ is the
population mean fuel efficiency, the 95% confidence interval for µ is
43.17 ± 1.96 · (3.5/√20) = 43.17 ± 1.5339 = 41.6361 to 44.7039 mpg.

3 4
3 677
3 9
4 1
4 23333
4 445
4 667
4 88
5 0

6.29. Multiply by (1.609 km/1 mile) · (1 gallon/3.785 liters) ≈ 0.4251 kpl per mpg. This gives
xkpl = 0.4251 · xmpg ≈ 18.3515 kpl and margin of error 1.96(0.4251σmpg)/√20 ≈ 0.6521 kpl,
so the 95% confidence interval is 17.6994 to 19.0036 kpl.

6.30. One sample screen is shown below, along with a sample stemplot of results. The number
of hits will vary, but the distribution should follow a binomial distribution with n = 50 and
p = 0.95, so we expect the average number of hits to be about 47.5. We also find that about
99.7% of individual counts should be 43 or more, and the mean hit count for 30 samples
should be approximately Normal with mean 47.5 and standard deviation 0.2814—so almost
all sample means should be between 46.66 and 48.34.

44 00
45 0000
46 00
47 00000000
48 000000
49 000
50 00000

6.31. n = ((1.96)(6.5)/1.5)² ≈ 72.14—take n = 73.

6.32. If we start with a sample of size k and lose 20% of the sample, we will end with 0.8k.
Therefore, we need to increase the sample size by 25%—that is, start with a sample of size
k = 1.25n—so that we end with (0.8)(1.25n) = n. With n = 73, that means we should
initially sample k = 91.25 (use 92) subjects.

6.33. No: Because the numbers are based on voluntary response rather than an SRS, the
confidence interval methods of this chapter cannot be used; the interval does not apply to
the whole population.

6.34. (a) For the mean of all repeated measurements, the 98% confidence interval for µ is
10.0023 ± 2.326 · (0.0002/√5) = 10.0023 ± 0.0002 = 10.0021 to 10.0025 g.
(b) n = ((2.326)(0.0002)/0.0001)² ≈ 21.64—take n = 22.

6.35. The number of hits has a binomial distribution with parameters n = 5 and p = 0.95, so
the number of misses is binomial with n = 5 and p = 0.05. We can therefore use Table C
to answer these questions. (a) The probability that all cover their means is 0.95⁵ ≈ 0.7738.
(Or use Table C to find the probability of 0 misses.) (b) The probability that at least four
intervals cover their means is 0.95⁵ + 5(0.05)(0.95⁴) ≈ 0.9774. (Or use Table C to find the
probability of 0 or 1 misses.)
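These binomial values can also be confirmed without Table C; a minimal sketch (scipy assumed, names ours):

Python sketch:
  from scipy.stats import binom
  print(binom.pmf(5, 5, 0.95))   # all five intervals cover: 0.7738
  print(binom.sf(3, 5, 0.95))    # at least four cover: 0.9774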

6.36. The new design can be considered an improvement if the mean response µ to the survey
is greater than 4. The null hypothesis should be H0 : µ = 4; the alternative hypothesis
could be either µ > 4 or µ < 4. The first choice would be appropriate if we want the
default assumption (H0 ) to be that the new design is not an improvement; that is, we will
only conclude that the new design is better if we see compelling evidence to that effect.
Choosing Ha : µ < 4 would mean that the default assumption is that the new design is at
least as good as the old one, and we will stick with that belief unless we see compelling
evidence that it is worse.
Students who are just learning about stating hypotheses might have difficulty choosing the
alternative for this problem. In fact, either one is defensible, although the typical choice in
such cases would be µ > 4—that is, we give the benefit of the doubt to the old design and
need convincing evidence that the new design is better.

6.37. If µ is the mean DXA reading for the phantom, we test H0: µ = 1.4 g/cm² versus
Ha: µ ≠ 1.4 g/cm².

6.38. P(Z > 2.42) = 0.0078, so the two-sided P-value is 2(0.0078) = 0.0156.

6.39. P(Z < −1.63) = 0.0516, so the two-sided P-value is 2(0.0516) = 0.1032.

6.40. (a) For P = 0.05, the value of z is ±1.96. (b) For a two-sided alternative, z is
statistically significant at α = 0.05 if |z| > 1.96.

6.41. (a) For P = 0.05, the value of z is 1.645. (b) For a one-sided alternative (on the
positive side), z is statistically significant at α = 0.05 if z > 1.645.

6.42. For z ∗ = 2 the P-value would be 2P(Z > 2) = 0.0456, and for z ∗ = 3 the P-value
would be 2P(Z > 3) = 0.0026.
Note: In other words, the Supreme Court uses α no bigger than about 0.05.

6.43. (a) z = (27 − 25)/(5/√36) = 2.4. (b) For a one-sided alternative,
P = P(Z > 2.4) = 0.0082. (c) For a two-sided alternative, double the one-sided P-value:
P = 0.0164.
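A sketch of the same computation (Python with scipy assumed; names are ours):

Python sketch:
  from scipy.stats import norm
  z = (27 - 25) / (5 / 36 ** 0.5)   # 2.4
  print(norm.sf(z))                 # one-sided P = 0.0082
  print(2 * norm.sf(abs(z)))        # two-sided P = 0.0164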

6.44. The test statistic is z = (0.514 − 0.5)/(0.314/√100) ≈ 0.45. This gives very little
reason to doubt the null hypothesis (µ = 0.5); in fact, the two-sided P-value is P ≈ 0.6527.

6.45. Recall the statement from the text: “A level α two-sided significance test rejects . . .
H0 : µ = µ0 exactly when the value µ0 falls outside a level 1 − α confidence interval for µ.”
(a) No; 30 is not in the 95% confidence interval because P = 0.033 means that we would
reject H0 at α = 0.05. (b) Yes; 30 is in the 99% confidence interval because we would not
reject H0 at α = 0.01.

6.46. See the quote from the text in the previous solution. (a) No, we would not reject µ = 58
because 58 falls well inside the confidence interval, so the P-value is (much) greater than
0.05. (b) Yes, we would reject µ = 63; the fact that 63 falls outside the 95% confidence
interval means that P < 0.05.
Note: The given confidence interval suggests that x = 57.5, and if the interval
was constructed using the Normal distribution, the standard error of the mean is about
2.25—half the margin of error. (The standard error might be less if it was constructed with
a t distribution rather than the Normal distribution.) Then x is about 0.22 standard errors
below µ = 58—yielding P ≈ 0.82—and x is about 2.44 standard errors below µ = 63, so
that P ≈ 0.015.

6.47. (a) Yes, we reject H0 at α = 0.05. (b) No, we do not reject H0 at α = 0.01.
(c) We have P = 0.039; we reject H0 at significance level α if P < α.

6.48. (a) No, we do not reject H0 at α = 0.05. (b) No, we do not reject H0 at α = 0.01.
(c) We have P = 0.062; we reject H0 at significance level α if P < α.

6.49. (a) One of the one-sided P-values is half as big as the two-sided P-value (0.022); the
other is 1 − 0.022 = 0.978. (b) Suppose the null hypothesis is H0 : µ = µ0 . The smaller
P-value (0.022) goes with the one-sided alternative that is consistent with the observed data;
for example, if x > µ0 , then P = 0.022 for the alternative µ > µ0 .

6.50. (a) The null hypothesis should be a statement about µ, not x. (b) The standard deviation
of the sample mean is 5/√30. (c) x = 45 would not make us inclined to believe that µ > 50
over the (presumed) null hypothesis µ = 50. (d) Even if we fail to reject H0 , we are not
sure that it is true.
Note: That is, “not rejecting H0 ” is different from “knowing that H0 is true.” This is
the same distinction we make about a jury’s verdict in a criminal trial: If the jury finds the
defendant “not guilty,” that does not necessarily mean that they are sure he/she is innocent.
It simply means that they were not sufficiently convinced of his/her guilt.

6.51. (a) Hypotheses should be stated in terms of the population mean, not the sample mean.
(b) The null hypothesis H0 should be that there is no change (µ = 21.2). (c) A small
P-value is needed for significance; P = 0.98 gives no reason to reject H0 . (d) We compare
the P-value, not the z-statistic, to α. (In this case, such a small value of z would have a
very large P-value—close to 0.5 for a one-sided alternative, or close to 1 for a two-sided
alternative.)

6.52. (a) We are checking to see if the proportion p increased, so we test H0 : p = 0.88 versus
Ha : p > 0.88. (b) The professor believes that the mean µ for the morning class will be
higher, so we test H0 : µ = 75 versus Ha : µ > 75. (c) Let µ be the mean response (for the
population of all students who read the newspaper). We are trying to determine if students
are neutral about the change, or if they have an opinion about it, with no preconceived idea
about the direction of that opinion, so we test H0: µ = 0 versus Ha: µ ≠ 0.

6.53. (a) If µ is the mean score for the population of placement-test students, then we
test H0: µ = 77 versus Ha: µ ≠ 77 because we have no prior belief about whether
placement-test students will do better or worse. (b) If µ is the mean time to complete the
maze with rap music playing, then we test H0 : µ = 20 seconds versus Ha : µ > 20 seconds
because we believe rap music will make the mice finish more slowly. (c) If µ is the mean
area of the apartments, we test H0 : µ = 880 ft2 versus Ha : µ < 880 ft2 , because we suspect
the apartments are smaller than advertised.

6.54. (a) If pm and p f are the proportions of (respectively) males and females who like MTV
best, we test H0 : pm = p f versus Ha : pm > p f . (b) If µ A and µ B are the mean test scores
for each group, we test H0 : µ A = µ B versus Ha : µ A > µ B . (c) If ρ is the (population)
correlation between time spent at social network sites and self-esteem, we test H0 : ρ = 0
versus Ha : ρ < 0.
Note: In each case, the parameters identified refer to the respective populations, not the
samples.

6.55. (a) H0 : µ = $42,800 versus Ha : µ > $42,800, where µ is the mean household income
of mall shoppers. (b) H0: µ = 0.4 hr versus Ha: µ ≠ 0.4 hr, where µ is this year's mean
response time.

6.56. (a) For Ha : µ > µ0 , the P-value is P(Z > 1.63) = 0.0516.
(b) For Ha : µ < µ0 , the P-value is P(Z < 1.63) = 0.9484.
(c) For Ha: µ ≠ µ0, the P-value is 2P(Z > 1.63) = 2(0.0516) = 0.1032.

6.57. (a) For Ha : µ > µ0 , the P-value is P(Z > −1.82) = 0.9656.
(b) For Ha : µ < µ0 , the P-value is P(Z < −1.82) = 0.0344.
(c) For Ha: µ ≠ µ0, the P-value is 2P(Z < −1.82) = 2(0.0344) = 0.0688.

6.58. Recall the statement from the text: “A level α two-sided significance test rejects . . .
H0 : µ = µ0 exactly when the value µ0 falls outside a level 1 − α confidence interval for µ.”
(a) No, 30 is not in the 95% confidence interval because P = 0.032 means that we would
reject H0 at α = 0.05. (b) No, 30 is not in the 90% confidence interval because we would
also reject H0 at α = 0.10.

6.59. See the quote from the text in the previous solution. (a) No, we would not reject
H0: µ = 30 because 30 falls inside the confidence interval, so P > 0.10. (b) Yes, we would
reject H0: µ = 24; the fact that 24 falls outside the 90% confidence interval means that
P < 0.10.
Note: The given confidence interval suggests that x = 28.5, and if the interval was
constructed using the Normal distribution, the standard error of the mean is about
1.75—half the margin of error. (The standard error might be less if it was constructed with
a t distribution rather than the Normal distribution.) Then x is about 2.57 standard errors
above µ = 24—yielding P ≈ 0.01—and x is about 0.86 standard errors below µ = 30, so
that P ≈ 0.39.

6.60. The study presumably examined malarial infection rates in two groups of subjects—one
with bed nets and one without. The observed differences between the two groups were so
large that they would be unlikely to occur by chance if bed nets had no effect. Specifically,
if the groups were the same, and we took many samples, the difference in malarial
infections would be so large less than 0.1% of the time.

6.61. P = 0.09 means there is some evidence for the wage decrease, but it is not significant
at the α = 0.05 level. Specifically, the researchers observed that average wages for
peer-driven students were 13% lower than average wages for ability-driven students, but
(when considering overall variation in wages) such a difference might arise by chance 9% of
the time, even if student motivation had no effect on wages.

6.62. If the presence of pig skulls were not an indication of wealth, then differences similar to
those observed in this study would occur less than 1% of the time by chance.

6.63. Even if the two groups (the health and safety class, and the statistics class) had the same
level of alcohol awareness, there might be some difference in our sample due to chance. The
difference observed was large enough that it would rarely arise by chance. The reason for
this difference might be that health issues related to alcohol use are probably discussed in
the health and safety class.

6.64. Even if scores had not changed over time, random fluctuation might cause the mean in
2009 to be different from the 2007 mean. However, in this case the difference was so great
that it is unlikely to have occurred by chance; specifically, such a difference would arise less
than 5% of the time if the actual mean had not changed. We therefore conclude that the
mean did change from 2007 to 2009.

6.65. If µ is the mean difference between the two groups of children, we test H0: µ = 0
versus Ha: µ ≠ 0. The test statistic is z = (4 − 0)/1.2 ≈ 3.33, for which software reports
P ≈ 0.0009—very strong evidence against the null hypothesis.
Note: The exercise reports the standard deviation of the mean, rather than the sample
standard deviation; that is, the reported value has already been divided by √238.

6.66. If µ is the mean north-south location, we wish to test H0: µ = 100 versus Ha: µ ≠ 100.
We find z = (99.74 − 100)/(58/√584) ≈ −0.11; this is not significant—P = 2(0.4562) = 0.9124—so we
have no reason to doubt a uniform distribution based on this test.

6.67. If µ is the mean east-west location, the hypotheses are H0: µ = 100 versus Ha: µ ≠ 100
(as in the previous exercise). For testing these hypotheses, we find
z = (113.8 − 100)/(58/√584) ≈ 5.75. This is highly significant (P < 0.0001), so we conclude
that the trees are not uniformly spread from east to west.

6.68. For testing these hypotheses, we find z = (10.2 − 8.9)/(2.5/√6) ≈ 1.27. This is not
significant (P = 0.1020); there is not enough evidence to conclude that these sonnets were
not written by our poet. (That is, we cannot reject H0.)

6.69. (a) z = (127.8 − 115)/(30/√25) ≈ 2.13, so the P-value is P = P(Z > 2.13) = 0.0166. This is strong
evidence that the older students have a higher SSHA mean. (b) The important assumption
is that this is an SRS from the population of older students. We also assume a Normal
distribution, but this is not crucial provided there are no outliers and little skewness.

6.70. (a) Because we suspect that athletes might be deficient, we use a one-sided alternative:
H0: µ = 2811.5 kcal/day versus Ha: µ < 2811.5 kcal/day. (b) The test statistic
is z = (2403.7 − 2811.5)/(880/√201) ≈ −6.57, for which P < 0.0001. There is strong evidence of
below-recommended caloric consumption among female Canadian high-performance athletes.

6.71. (a) H0: µ = 0 mpg versus Ha: µ ≠ 0 mpg, where µ is the mean difference. (b) The
mean of the 20 differences is x = 2.73, so z = (2.73 − 0)/(3/√20) ≈ 4.07, for which P < 0.0001. We
conclude that µ ≠ 0 mpg; that is, we have strong evidence that the computer's reported fuel
efficiency differs from the driver's computed values.

6.72. A debt of $3817 in the West would be equivalent to a debt of ($3817)(1.06) ≈ $4046
in the Midwest, for a difference of $4046 − $3260 = $786. With the hypotheses given
in Example 6.10, and the standard deviation ($374) from Example 6.11, the test statistic
is z = ($786 − $0)/$374 ≈ 2.10. The P-value is P = 2P(Z ≥ 2.10) ≈ 0.0358; recall that, for
the unadjusted data, the P-value was 0.1362. Adjusting for the differing value of a dollar
strengthens the evidence against H0, enough that it is now significant at the 5% level.

6.73. For (b) and (c), either compare with the critical values in Table D or determine the
P-value (0.0336). (a) H0 : µ = 0.9 mg versus Ha : µ > 0.9 mg. (b) Yes, because z > 1.645
(or because P < 0.05). (c) No, because z < 2.326 (or because P > 0.01).

6.74. A sample screen (for x = 1) is shown below on the left. As one can judge from the
shading under the Normal curve, x = 0.5 is not significant, but 0.6 is. (In fact, the cutoff is
about 0.52, which is approximately 1.645/√10.)

6.75. See the sample screen (for x = 1) above on the right. As one can judge from the shading
under the Normal curve, x = 0.7 is not significant, but 0.8 is. (In fact, the cutoff is about
0.7354, which is approximately 2.326/√10.) Smaller α means that x must be farther away
from µ0 in order to reject H0.

6.76. A sample screen (for x = 0.4) is shown on the right. The P-values given by the applet
are listed in the table below; as x moves farther away from µ0, P decreases.
x P x P
0.1 0.3745 0.6 0.0287
0.2 0.2643 0.7 0.0136
0.3 0.1711 0.8 0.0057
0.4 0.1038 0.9 0.0022
0.5 0.0571 1 0.0008

6.77. When a test is significant at the 5% level, it means that if the null hypothesis were true,
outcomes similar to those seen are expected to occur fewer than 5 times in 100 repetitions
of the experiment or sampling. “Significant at the 10% level” means we have observed
something that occurs in fewer than 10 out of 100 repetitions (when H0 is true). Something
that occurs “fewer than 5 times in 100 repetitions” also occurs “fewer than 10 times in 100
repetitions,” so significance at the 5% level implies significance at the 10% level (or any
higher level).

6.78. Something that occurs “fewer than 5 times in 100 repetitions” is not necessarily as rare
as something that occurs “less than once in 100 repetitions,” so a test that is significant at
5% is not necessarily significant at 1%.

6.79. Using Table D or software, we find that the 0.005 critical value is 2.576, and the 0.0025
critical value is 2.807. Therefore, if 2.576 < |z| < 2.807—that is, either 2.576 < z < 2.807
or −2.807 < z < −2.576—then z would be significant at the 1% level, but not at the 0.5%
level.

6.80. As 2.326 < 2.52 < 2.576, the two-sided P-value is between 2(0.005) = 0.01 and
2(0.01) = 0.02. (Software tells us that P ≈ 0.012, consistent with the observation that z is
close to 2.576.)

6.81. As 0.63 < 0.674, the one-sided P-value is P > 0.25. (Software gives P = 0.2643.)

6.82. Because 1.645 < 1.92 < 1.960, the P-value is between 2(0.025) = 0.05 and
2(0.05) = 0.10. From Table A, P = 2(0.0274) = 0.0548.

6.83. Because the alternative is two-sided, the answer for z = −1.92 is the same as for
z = 1.92: −1.645 > −1.92 > −1.960, so Table D says that 0.05 < P < 0.10, and Table A
gives P = 2(0.0274) = 0.0548.

6.84. (a) z = (541.4 − 525)/(100/√100) = 1.64. This is not significant at α = 0.05 because
z < 1.645 (or P = 0.0505). (b) z = (541.5 − 525)/(100/√100) = 1.65. This is significant at
α = 0.05 because z > 1.645 (or P = 0.0495). (c) Fixed-level significance tests require that
we draw a line between “significant” and “not significant”; in this example, we see evidence
on each side of that line. The 5% significance level is a guideline, not a sacred edict.
P-values are more informative ways to convey the strength of the evidence.

6.85. In order to determine the effectiveness of alarm systems, we need to know the percent of
all homes with alarm systems, and the percent of burglarized homes with alarm systems.
For example, if only 10% of all homes have alarm systems, then we should compare the
proportion of burglarized homes with alarm systems to 10%, not 50%.
An alternate (but rather impractical) method would be to sample homes and classify
them according to whether or not they had an alarm system, and also by whether or not
they had experienced a break-in at some point in the recent past. This would likely require
a very large sample in order to get a sufficiently large count of homes that had experienced
break-ins.

6.86. Finding something to be “statistically significant” is not really useful unless the
significance level is sufficiently small. While there is some freedom to decide what
“sufficiently small” means, α = 0.20 would lead the student to incorrectly reject H0 one-fifth
of the time, so it is clearly a bad choice.

6.87. The first test was barely significant at α = 0.05, while the second was significant at any
reasonable α.

6.88. One can learn something from negative results; for example, a study that finds no benefit
from a particular treatment is at least useful in terms of what will not work. Furthermore,
reviewing such results might point researchers to possible future areas of study.

6.89. A significance test answers only Question b. The P-value states how likely the observed
effect (or a stronger one) is if H0 is true, and chance alone accounts for deviations from
what we expect. The observed effect may be significant (very unlikely to be due to chance)
and yet not be of practical importance. And the calculation leading to significance assumes a
properly designed study.

6.90. Based on the description, this seems to have been an experiment (not just an
observational study), so a statistically significant outcome suggests that vitamin C is
effective in preventing colds.

6.91. (a) If SES had no effect on LSAT results, there would still be some difference in scores
due to chance variation. “Statistically insignificant” means that the observed difference was
no more than we might expect from that chance variation. (b) If the results are based on a
small sample, then even if the null hypothesis were not true, the test might not be sensitive
enough to detect the effect. Knowing the effects were small tells us that the statistically
insignificant test result did not occur merely because of a small sample size.

6.92. These questions are addressed in the summary for Section 6.3. (a) Failing to reject
H0 does not mean that H0 is true. (b) This is correct; a difference that is statistically
significant might not be practically important. (This does not mean that these are opposites;
a difference could be both statistically and practically significant.) (c) This might be
technically true, but in order for the analysis to be meaningful, the data must satisfy the
assumptions of the analysis. (d) Searching for patterns and then testing their significance can
lead to false positives (that is, we might reject the null hypothesis incorrectly). If a pattern is
observed, we should collect new data to test if it is present.

6.93. In each case, we find the test statistic z by dividing the observed difference
(2453.7 − 2403.7 = 50 kcal/day) by 880/√n. (a) For n = 100, z ≈ 0.57, so
P = P(Z > 0.57) = 0.2843. (b) For n = 500, z ≈ 1.27, so P = P(Z > 1.27) = 0.1020.
(c) For n = 2500, z ≈ 2.84, so P = P(Z > 2.84) = 0.0023.
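The three cases can be generated in one loop; a sketch (scipy assumed, names ours):

Python sketch:
  from scipy.stats import norm
  for n in (100, 500, 2500):
      z = 50 / (880 / n ** 0.5)
      print(n, round(z, 2), norm.sf(z))   # P shrinks as n grows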

6.94. The study may have rejected µ = µ0 (or some other null hypothesis), but with such a
large sample size, such a rejection might occur even if the actual mean (or other parameter)
differs only slightly from µ0 . For example, there might be no practical importance to the
difference between µ = 10 and µ = 10.5.

6.95. We expect more variation with small sample sizes, so even a large difference between
x and µ0 (or whatever measures are appropriate in our hypothesis test) might not turn out
to be significant. If we were to repeat the test with a larger sample, the decrease in the
standard error might give us a small enough P-value to reject H0 .

6.98. When many variables are examined, “significant” results will show up by chance, so
we should not take it for granted that the variables identified are really indicative of future
success. In order to decide if they are appropriate, we should track this year’s trainees and
compare the success of those from urban/suburban backgrounds with the rest, and likewise
compare those with a degree in a technical field with the rest.

6.100. We expect 50 tests to be statistically significant: Each of the 1000 tests has a 5%
chance of being significant, so the number of significant tests has a binomial distribution
with n = 1000 and p = 0.05, for which the mean is np = 50.

6.101. P = 0.00001 = 1/100,000, so we would need n = 100,000 tests in order to expect one
P-value of this size (assuming that all null hypotheses are true). That is why we reject H0
when we see P-values such as this: It indicates that our results would rarely happen if H0
were true.

6.102. Using α/6 ≈ 0.008333 as the cutoff, the fourth (P = 0.003) and sixth (P < 0.001) tests
are significant.

6.103. Using α/12 ≈ 0.004167 as the cutoff, we reject the fifth (P = 0.002) and eleventh
(P < 0.002) tests.

6.104. The power of this study is far lower than what is generally desired—for example, it is
well below the “80% standard” mentioned in the text. For the specified effect, 35% power
means that, if the effect is present, we will only detect it 35% of the time. With such a
small chance of detecting an important difference, the study should probably not be run
(unless the sample size is increased to give sufficiently high power).

6.105. A larger sample gives more information and therefore gives a better chance of detecting
a given alternative; that is, larger samples give more power.

6.106. The power for µ = −3 is 0.82—the same as the power for µ = 3—because both
alternatives are an equal distance from the null value of µ. (The symmetry of two-sided
tests with the Normal distribution means that we only need to consider the size of the
difference, not the direction.)

6.107. The power for µ = 40 will be higher than 0.6, because larger differences are
easier to detect. One way to illustrate this with a picture (assuming Normal distributions):
Draw a solid curve (centered at 20) for the distribution under the null hypothesis, and two
dashed curves for the alternatives µ = 30 and µ = 40. The shaded region under the middle
curve is the power against µ = 30; that is, that shaded region is 60% of the area under that
curve. The power against µ = 40 would be the corresponding area under the rightmost
curve, which would clearly be greater than 0.6.

6.108. The applet finds that the power is approximately 0.069.

6.109. The applet reports the power as 0.986.

6.110. (a) For the alternative Ha: µ > 168, we reject H0 at the 5% significance level if
z > 1.645. (b) (x − 168)/(27/√70) > 1.645 when x > 168 + 1.645 · 27/√70 ≈ 173.31.
(c) When µ = 173, the probability of rejecting H0 is
P(x > 173.31) = P((x − 173)/(27/√70) > (173.31 − 173)/(27/√70)) = P(Z > 0.10) = 0.4602.
(d) The power of this test is not up to the 80% standard suggested in the text; he should
collect a larger sample.
Note: Software gives a slightly different answer for the power in part (c), but the
conclusion in part (d) is the same. To achieve 80% power against µ = 173, we need
n = 180.
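Parts (b) and (c) can be reproduced directly; a sketch (scipy assumed, names ours):

Python sketch:
  from scipy.stats import norm
  se = 27 / 70 ** 0.5                 # SD of the sample mean
  cutoff = 168 + norm.ppf(0.95) * se  # reject H0 when xbar > 173.31
  print(cutoff, norm.sf(cutoff, loc=173, scale=se))  # power, about 0.46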

6.111. We reject H0 when z > 2.326, which is equivalent to x > 450 + 2.326 · 100/√500 ≈ 460.4,
so the power against µ = 460 is
P(reject H0 when µ = 460) = P(x > 460.4 when µ = 460) = P(Z > (460.4 − 460)/(100/√500)) = P(Z > 0.09) = 0.4641.
This is quite a bit less than the “80% power” standard.

6.112. (a) P(Type I error) = P(X ≤ 2 when the distribution is p0) = 0.4.
(b) P(Type II error) = P(X > 2 when the distribution is p1) = 0.4.

6.113. (a) The hypotheses are “subject should go to college” and “subject should join work
force.” The two types of errors are recommending someone go to college when (s)he is
better suited for the work force, and recommending the work force for someone who should
go to college. (b) In significance testing, we typically wish to decrease the probability of
wrongly rejecting H0 (that is, we want α to be small); the answer to this question depends
on which hypothesis is viewed as H0 .

Note: For part (a), there is no clear choice for which should be the null hypothesis.
In the past, when fewer people went to college, one might have chosen “work force” as
H0 —that is, one might have said, “we’ll assume this student will join the work force unless
we are convinced otherwise.” Presently, roughly two-thirds of graduates attend college,
which might suggest H0 should be “college.”

6.114. This is probably not a confidence interval; it is not intended to give an estimate of the
mean income, but rather it gives the range of incomes earned by all (or most) telemarketers
working for this company.

6.115. From the description, we might surmise that we had two (or more) groups of
students—say, an exercise group and a control (or no-exercise) group. (a) For example, if µ
is the mean difference in scores between the two groups, we might test H0 : µ = 0 versus
Ha: µ ≠ 0. (Assuming we had no prior suspicion about the effect of exercise, the alternative
should be two-sided.) (b) With P = 0.38, we would not reject H0 . In plain language: The
results observed do not differ greatly from what we would expect if exercise had no effect
on exam scores. (c) For example: Was this an experiment? What was the design? How big
were the samples?

6.116. (a) For each proportion, the estimated standard deviation (in the column labeled
“SD”) is √(p̂(1 − p̂)/n); the margin of error (“m.e.”) is 1.96 × SD.

Occupation       p̂     SD       m.e.     Conf. interval
Professional     0.23  0.00851  0.01667  0.2133 to 0.2467
Managerial       0.22  0.00820  0.01607  0.2039 to 0.2361
Administrative   0.17  0.00782  0.01532  0.1547 to 0.1853
Sales            0.15  0.00839  0.01645  0.1336 to 0.1664
Mechanical       0.12  0.00730  0.01432  0.1057 to 0.1343
Service          0.13  0.00661  0.01295  0.1171 to 0.1429
Operator         0.12  0.00616  0.01208  0.1079 to 0.1321
Farm             0.08  0.01135  0.02225  0.0577 to 0.1023

(b) The first two groups (professional and managerial) had the highest stress levels, followed
by the next two (administrative and sales), then the next three (mechanical, service, and
operator). Stress levels were lowest for farm workers. (c) Because data was compiled over
two years, some individuals might have been included more than once, which violates the
independence assumption of the binomial distribution.

6.117. (a) Because all standard deviations and sample sizes are the same, the margin of error
for all intervals is 1.96 × 19/√180 ≈ 2.7757. The confidence intervals are listed in the table
below. (b) The plot (omitted here) shows the error bars for the confidence intervals of
part (a), and also for part (c); the limits for (a) are the thicker lines, which do not extend as
far above and below the mean. (c) With z* = 2.40, the margin of error for all intervals is
2.40 × 19/√180 ≈ 3.3988. The confidence intervals are listed in the table below. (d) When
we use z* = 2.40 to adjust for the fact that we are making three “simultaneous” confidence
intervals, the margin of error is larger, so the intervals overlap more.

     Workplace size   Mean SCI
(a)  < 50             64.45 to 70.01
     50–200           67.59 to 73.15
     > 200            72.05 to 77.61
(c)  < 50             63.83 to 70.63
     50–200           66.97 to 73.77
     > 200            71.43 to 78.23

6.118. Shown below is a sample screenshot from the applet and an example of what the
resulting plot might look like. Most students (99.7% of them) should find that their final
proportion is between 0.90 and 1; 90% will have a proportion between 0.925 and 0.975.
Note: For each n (number of intervals), the number of “hits” would have a binomial
distribution with p = 0.95, but these counts would not be independent; for example, if we
knew there were 28 hits after 30 tries, we would know that there could be no more than 38
after 40 tries.

[Plot: percent of intervals hitting the mean, versus number of intervals (0 to 200).]

6.119. A sample screenshot and example plot are not shown but would be similar to those
shown above for the previous exercise. Most students (99.4% of them) should find that their
final proportion is between 0.84 and 0.96; 85% will have a proportion between 0.87 and
0.93.

6.120. For n = 10, z = (4 − 0)/(14/√10) ≈ 0.90, for which P = 0.1841. For the other sample
sizes, the computations are similar; the resulting table is shown below. We see that
increasing the sample size increases the value of the test statistic (assuming the mean is the
same), which in turn decreases the size of the P-value.

 n    z     P
10   0.90  0.1841
20   1.28  0.1003
30   1.56  0.0594
40   1.81  0.0351
50   2.02  0.0217

[Graphs: test statistic and P-value, each plotted against sample size.]
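The z and P columns can be regenerated in a loop. A sketch, assuming (as in the reconstruction above) x = 4 and σ = 14; the unrounded z values give P-values agreeing with the table up to rounding:

Python sketch:
  from scipy.stats import norm
  for n in (10, 20, 30, 40, 50):
      z = (4 - 0) / (14 / n ** 0.5)
      print(n, round(z, 2), round(norm.sf(z), 4))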


6.121. (a) x = 5.3 mg/dl, so x ± 1.960 · σ/√6 is 4.6132 to 6.0534 mg/dl. (b) To test
H0: µ = 4.8 mg/dl versus Ha: µ > 4.8 mg/dl, we compute z = (x − 4.8)/(0.9/√6) ≈ 1.45 and
P ≈ 0.0735. This is not strong enough to reject H0.
Note: The confidence interval in (a) would allow us to say without further computation
that, against a two-sided alternative, we would have P > 0.05. Because we have a one-sided
alternative, we could conclude from the confidence interval that P > 0.025, but that is not
enough information to draw a conclusion.

6.122. (a) The 90% confidence interval for µ is
145 ± 1.645 · (8/√15) = 145 ± 3.3979 = 141.6021 to 148.3979 mg/g.
(b) Our hypotheses are H0: µ = 140 mg/g versus Ha: µ > 140 mg/g. The test statistic is
z = (145 − 140)/(8/√15) ≈ 2.42, so the P-value is P = P(Z > 2.42) = 0.0078. This is strong evidence
against H0; we conclude that the mean cellulose content is higher than 140 mg/g. (c) We
must assume that the 15 cuttings in our sample are an SRS. Because our sample is not too
large, the population should be Normally distributed, or at least not extremely nonnormal.

6.123. (a) The stemplot (below) is reasonably symmetric for such a small sample.
(b) x = 30.4 µg/l; 30.4 ± (1.96)(7/√10) gives 26.0614 to 34.7386 µg/l.
(c) We test H0: µ = 25 µg/l versus Ha: µ > 25 µg/l. z = (30.4 − 25)/(7/√10) ≈ 2.44, so
P = 0.0073. (We knew from (b) that it had to be smaller than 0.025.) This is
fairly strong evidence against H0; the beginners' mean threshold is higher than 25 µg/l.

2 034
2
3 01124
3 6
4 3

6.124. (a) The intended population is probably “the American public”; the population that
was actually sampled was “citizens of Indianapolis (with listed phone numbers).” (b) Take
x ± 1.96s/√201; these intervals are listed below. (c) The confidence intervals do not overlap
at all; in particular, the lower confidence limit of the rating for pharmacies is higher than the
upper confidence limit for the other stores. This indicates that the pharmacies are really
rated higher.

Food stores          15.22 to 22.12
Mass merchandisers   27.77 to 36.99
Pharmacies           43.68 to 53.52

6.125. (a) Under H0, x has a N(0%, 55%/√104) ≈ N(0%, 5.3932%) distribution.
(b) z = (6.9 − 0)/(55/√104) ≈ 1.28, so P = P(Z > 1.28) = 0.1003. (c) This is not
significant at α = 0.05. The study gives some evidence of increased compensation, but it is
not very strong; similar results would happen about 10% of the time just by chance.

6.126. No: “Significant at α = 0.01” does mean that the null hypothesis is unlikely, but only in
the sense that the evidence (from the sample) would not occur very often if H0 were true.
There is no probability associated with H0 ; it is either true or it is not.
Note: Bayesian statistics views the parameter we wish to estimate as having a
probability distribution; with that viewpoint, it would make sense to speak of “the
probability that H0 is true.” This textbook does not take the Bayesian approach.

6.127. Yes. That’s the heart of why we care about statistical significance. Significance tests
allow us to discriminate between random differences (“chance variation”) that might occur
when the null hypothesis is true, and differences that are unlikely to occur when H0 is true.

6.128. If p is the probability that red occurs, we test H0: p = 18/38 versus Ha: p ≠ 18/38.


6.129. For each sample, find x, then take x ± 1.96(4/ 12 ) = x ± 2.2632.
We “expect” to see that 95 of the 100 intervals will include 25 (the true value of µ);
binomial computations show that (about 99% of the time) 90 or more of the 100 intervals
will include 20.

6.130. For each sample, find x, then compute z = (x − 25)/(4/√12). Choose a significance
level α and the appropriate cutoff point—for example, with α = 0.10, reject H0 if
|z| > 1.645; with α = 0.05, reject H0 if |z| > 1.96.
If, for example, α = 0.05, then we “expect” to reject H0 (that is, make the wrong
decision) only 5 of the 100 times.

6.131. For each sample, find x, then compute z = (x − 23)/(4/√12). Choose a significance
level α and the appropriate cutoff point (z*)—for example, with α = 0.10, reject H0 if
|z| > 1.645; with α = 0.05, reject H0 if |z| > 1.96.
Because the true mean is 25, Z = (x − 25)/(4/√12) has a N(0, 1) distribution, so the
probability that we will accept H0 is P(−z* < (x − 23)/(4/√12) < z*) =
P(−z* < Z + 1.7321 < z*) = P(−1.7321 − z* < Z < −1.7321 + z*). If α = 0.10
(z* = 1.645), this probability is P(−3.38 < Z < −0.09) = 0.4637; if α = 0.05 (z* = 1.96),
this probability is P(−3.69 < Z < 0.23) = 0.5909. For smaller α, the probability will be
larger. Thus we “expect” to (wrongly) accept H0 about half the time (or more), and
correctly reject H0 about half the time or less. (The probability of rejecting H0 is essentially
the power of the test against the alternative µ = 25.)

6.132. The test statistics and P-values are in the table below, computed as described in
the text; for example, for conscientiousness, z = (3.80 − 3.88)/0.10 = −0.8. Because we are
performing 13 tests, we should use the Bonferroni procedure (see Exercise 6.102) or some
other multiple-test method. For an overall significance level of α, Bonferroni requires
individual-test significance at α/13; for α = 0.05, this means we need P < 0.0038.
Therefore, the only significant differences (for α = 0.05 or α = 0.01) are for handshake
strength and grip.
Characteristic z P
Conscientiousness −0.8 0.4238
Extraversion 0.9 0.3682
Agreeableness −2.2 0.0278
Emotional stability 1.5 0.1336
Openness to experience 0.5 0.6170
Overall handshake 2.3 0.0214
Handshake strength 5.3 < 0.0001*
Handshake vigor 1.7 0.0892
Handshake grip 3.8 0.0002*
Handshake duration 1.5 0.1336
Eye contact −0.6 0.5486
Professional dress −2.0 0.0456
Interviewer assessment −0.58 0.5626
Chapter 7 Solutions

7.1. (a) The standard error of the mean is s/√n = $96/√16 = $24. (b) The degrees of freedom
are df = n − 1 = 15.

7.2. In each case, use df = n − 1; if that number is not in Table D, drop to the lower degrees
of freedom. (a) For 95% confidence and df = 10, use t ∗ = 2.228. (b) For 99% confidence
and df = 32, we drop to df = 30 and use t ∗ = 2.750. (Software gives t ∗ = 2.7385 for
df = 32.) (c) For 90% confidence and df = 249, we drop to df = 100 and use t ∗ = 1.660.
(Software gives t ∗ = 1.6510 for df = 249.)
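The software values quoted here come from the inverse t distribution; a sketch (scipy assumed, names ours):

Python sketch:
  from scipy.stats import t
  for conf, df in ((0.95, 10), (0.99, 32), (0.90, 249)):
      print(df, t.ppf(1 - (1 - conf) / 2, df))  # 2.2281, 2.7385, 1.6510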

7.3. For the mean monthly rent, the 95% confidence interval for µ is
$613 ± 2.131 · ($96/√16) = $613 ± $51.14 = $561.86 to $664.14.

7.4. The margin of error for 90% confidence would be smaller (so the interval would be
narrower) because we are taking a greater risk—specifically, a 10% risk—that the interval
does not include the true mean µ.

7.5. (a) Yes, t = 2.18 is significant when n = 18. This can be determined either by comparing
to the df = 17 line in Table D (where we see that t > 2.110, the 2.5% critical value) or by
computing the two-sided P-value (which is P = 0.0436). (b) No, t = 2.18 is not significant
when n = 10, as can be seen by comparing to the df = 9 line in Table D (where we see that
t < 2.262, the 2.5% critical value) or by computing the two-sided P-value (which is
P = 0.0572). (c) Student sketches will likely be indistinguishable from Normal distributions;
careful students may try to show that the t(9) distribution is shorter in the center and
heavier to the left and right (“in the tails”) than the t(17) distribution (as is the case here),
but in reality, the difference is nearly imperceptible.

7.6. For the hypotheses H0: µ = $550 versus Ha: µ > $550, we find
t = (613 − 550)/(96/√16) = 2.625 with df = 15, for which P ≈ 0.0096. We have strong
evidence against H0, and conclude that the mean rent is greater than $550.

7.7. Software will typically give a more accurate value for t* than that given in Table D, and
will not round off intermediate values such as the standard deviation. Otherwise, the details
of this computation are the same as what is shown in the textbook: df = 7, t* = 2.3646,
6.75 ± t*(3.8822/√8) = 6.75 ± 3.2456 = 3.5044 to 9.9956, or about 3.5 to 10.0 hours per
month.


7.8. We wish to test H0: µ = 0 versus Ha: µ ≠ 0, where µ is the mean of (drink A rating)
minus (drink B rating). (We could also subtract in the other direction.) Compute the
differences for each subject (−2, 5, 2, 10, and 7), then find the mean and standard deviation
of these differences: x = 4.4 and s ≈ 4.6152. Therefore, t = (4.4 − 0)/(4.6152/√5) ≈ 2.1318
with df = 4, for which P ≈ 0.1000. We do not have enough evidence to conclude that there
is a difference in preference.

7.9. About −1.33 to 10.13: Using the mean and standard deviation from the previous exercise,
the 95% confidence interval for µ is
4.4 ± 2.7765 · (4.6152/√5) = 4.4 ± 5.7305 = −1.3305 to 10.1305.
(This is the interval produced by software; using the critical value t* = 2.776 from Table D
gives −1.3296 to 10.1296.)
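Both the test of Exercise 7.8 and this interval follow from the five differences; a sketch (Python with numpy and scipy assumed, names ours):

Python sketch:
  import numpy as np
  from scipy import stats
  diffs = np.array([-2, 5, 2, 10, 7])     # (drink A) - (drink B)
  res = stats.ttest_1samp(diffs, 0)
  print(res.statistic, res.pvalue)        # t = 2.1318, P = 0.1000
  m, se = diffs.mean(), stats.sem(diffs)
  tstar = stats.t.ppf(0.975, df=4)
  print(m - tstar * se, m + tstar * se)   # -1.3305 to 10.1305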

7.10. See also the solutions to Exercises 1.36, 1.74, and 1.150. The CO2 data are sharply
right-skewed (clearly non-Normal). However, the robustness of the t procedures should make
them safe for this situation because the sample size is large (n = 48). The bigger question
is whether we can treat the data as an SRS; we have recorded CO2 emissions for every
country with a population over 20 million, rather than a random sample.

7.11. The distribution is clearly non-Normal, but the sample size (n = 63) should be sufficient
to overcome this, especially in the absence of strong skewness. One might question the
independence of the observations; it seems likely that, after 40 or so tickets had been posted
for sale, someone listing a ticket would look at those already posted for an idea of what
price to charge.
If we were to use t procedures, we would presumably take the viewpoint that these 63
observations come from a larger population of hypothetical tickets for this game, and we are
trying to estimate the mean µ of that population. However, because (based on the histogram
in Figure 1.33) the population distribution is likely bimodal, the mean µ might not be the
most useful summary of a bimodal distribution.

7.12. The power would be greater because larger differences (like µ > 1) are easier to detect.

7.13. As was found in Example 7.9, we reject H0 if t = x/(1.5/√20) ≥ 1.729, which is
equivalent to x ≥ 0.580. The power we seek is P(x ≥ 0.580 when µ = 1.1), which is:
P((x − 1.1)/(1.5/√20) ≥ (0.580 − 1.1)/(1.5/√20)) = P(Z ≥ −1.55) = 0.9394
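A sketch of this power computation (scipy assumed, names ours):

Python sketch:
  from scipy.stats import norm, t
  se = 1.5 / 20 ** 0.5
  cutoff = t.ppf(0.95, df=19) * se      # reject when xbar >= 0.580
  print(norm.sf((cutoff - 1.1) / se))   # P(Z >= -1.55) = 0.9394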

7.14. We test H0: median = 0 versus Ha: median ≠ 0—or equivalently, H0: p = 1/2
versus Ha: p ≠ 1/2, where p is the probability that the rating for drink A is higher.
We note that four of the five differences are positive (and none are 0). The P-value is
2P(X ≥ 4) = 0.375 from a B(5, 0.5) distribution; there is not enough evidence to conclude
that the median ratings are different.

Minitab output: Sign test of median = 0 versus median ≠ 0
         N  BELOW  EQUAL  ABOVE  P-VALUE  MEDIAN
Diff     5      1      0      4   0.3750   5.000
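The same P-value can be reproduced outside Minitab; a sketch (scipy assumed, names ours; binomtest requires scipy 1.7 or later):

Python sketch:
  from scipy.stats import binom, binomtest
  print(2 * binom.sf(3, 5, 0.5))      # 2P(X >= 4) = 0.375
  print(binomtest(4, 5, 0.5).pvalue)  # same two-sided P-value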

7.15. (a) df = 10, t ∗ = 2.228. (b) df = 21, t ∗ = 2.080. (c) df = 21, t ∗ = 1.721. (d) For
a given confidence level, t ∗ (and therefore the margin of error) decreases with increasing
sample size. For a given sample size, t ∗ increases with increasing confidence.

7.16. This t distribution has df = 17. The 2.5% critical value is 2.110, so we reject H0 when
t < −2.110 or t > 2.110.

7.17. The 5% critical value for a t distribution with df = 17 is 1.740. Only one of the
one-sided options (reject H0 when t > 1.740) is shown; the other is simply the mirror image
of this sketch (shade the area to the left of −1.740, and reject when t < −1.740).

7.18. Because the value of x is positive, which supports the direction of the alternative
(µ > 0), the P-value for the one-sided test is half as big as that for the two-sided test:
P = 0.037.

7.19. x = −15.3 would support the alternative µ < 0, and for that alternative, the P-value
would still be 0.037. For the alternative µ > 0 given in Exercise 7.18, the P-value is 0.963.
Note that in the sketch shown, no scale has been given, because in the absence of a sample
size, we do not know the degrees of freedom. Nevertheless, the P-value for the alternative
µ > 0 is the area above the computed value of the test statistic t, which will be the opposite
of that found when x = 15.3. As the area below t is 0.037, the area above this point must
be 0.963.

7.20. (a) df = 15. (b) 1.753 < t < 2.131. (c) 0.025 < P < 0.05. (d) t = 2.10 is significant at
5%, but not at 1%. (e) From software, P ≈ 0.0265.

7.21. (a) df = 27. (b) 1.703 < t < 2.052. (c) Because the alternative is two-sided, we double
the upper-tail probabilities to find the P-value: 0.05 < P < 0.10. (d) t = 2.01 is not
significant at either level (5% or 1%). (e) From software, P ≈ 0.0546.

7.22. (a) df = 13. (b) Because 2.282 < |t| < 2.650, the P-value is between 0.01 and 0.02.
(c) From software, P ≈ 0.0121.

7.23. Let P be the given (two-sided) P-value, and suppose that the alternative is µ > µ0 . If x
is greater than µ0 , this supports the alternative over H0 . However, if x < µ0 , we would not
take this as evidence against H0 because x is on the “wrong” side of µ0 . So, if the value of
x is on the “correct” side of µ0 , the one-sided P-value is simply P/2. However, if the value
of x is on the “wrong” side of µ0 , the one-sided P-value is 1 − P/2 (which will always be
at least 0.5, so it will never indicate significant evidence against H0 ).

7.24. (a) The distribution is slightly right-skewed, and the largest observation stands out from
the rest (although it does not quite qualify as an outlier using the 1.5 × IQR rule); see the
stemplot below. (b) The reasonably large sample should be sufficient to overcome the mild
non-Normality of the data, and because it was based on a random sample from a large
population, t procedures should be appropriate. (c) x ≈ 119.0667 and s ≈ 29.5669 friends,
so the standard error is s/√30 ≈ 5.3982. The critical value for 95% confidence is
t* = 2.045, so the margin of error is 11.04. (d) With 95% confidence, the mean number of
Facebook friends at this university is between 108.03 and 130.11.

 7 24
 8 355
 9 679
10 3456
11 012899
12 0678
13 7
14 8
15 248
16 0
17 1
18
19 3

7.25. (a) If µ is the mean number of uses a person can produce in 5 minutes after witnessing
rudeness, we wish to test H0: µ = 10 versus Ha: µ < 10. (b) t = (7.88 − 10)/(2.35/√34) ≈ −5.2603,
with df = 33, for which P < 0.0001. This is very strong evidence that witnessing rudeness
decreases performance.

7.26. (a) A stemplot (below) or a histogram shows no outliers and no particular skewness.
(In fact, for such a small sample, it suggests no striking deviations from Normality.) The
use of t methods seems to be safe. (b) The mean is x ≈ 43.17 mpg, the standard deviation
is s ≈ 4.4149 mpg, and the standard error is s/√20 ≈ 0.9872 mpg. For df = 19, the 2.5%
critical value is t* = 2.093, so the margin of error is t* · s/√20 ≈ 2.0662 mpg. (c) The 95%
confidence interval is 41.1038 to 45.2362 mpg.

3 4
3 677
3 9
4 1
4 23333
4 445
4 667
4 88
5 0

7.27. (a) A stemplot (below) reveals that the distribution has two peaks and a high value (not
quite an outlier). Both the stemplot and quantile plot show that the distribution is not
Normal. The five-number summary is 2.2, 10.95, 28.5, 41.9, 69.3 (all in cm); a boxplot is
not shown, but the long “whisker” between Q3 and the maximum is an indication of the
skewness.

0 222244
0 579
1 0113
1 678
2 2
2 6679
3 112
3 5789
4 0033444
4 7
5 112
5
6
6 9

(b) Maybe: We have a large enough sample to overcome the non-Normal distribution, but
we are sampling from a small population. (c) The mean is x ≈ 27.29 cm, s ≈ 17.7058 cm,
and the margin of error is t* · s/√40:

            df   t*      Interval
Table D     30   2.042   27.29 ± 5.7167 = 21.57 to 33.01 cm
Software    39   2.0227  27.29 ± 5.6626 = 21.63 to 32.95 cm

(d) One could argue for either answer. We chose a random sample from this tract, so the
main question is, can we view trees in this tract as being representative of trees elsewhere?

7.28. (a) We wish to test H0: µ = 3421.7 kcal/day versus Ha: µ < 3421.7 kcal/day. (b) The
test statistic is t = (3077 − 3421.7)/(987/√114) ≈ −3.73, with df = 113, for which P ≈ 0.0002.
(c) Starting with the average shortfall 3421.7 − 3077.0 = 344.7 kcal/day, the mean deficiency
is (with 95% confidence) between about 160 and 530 kcal/day.

            df   t*      Interval
Table D    100   1.984   344.7 ± 183.4030 = 161.2970 to 528.1030
Software   113   1.9812  344.7 ± 183.1423 = 161.5577 to 527.8423

7.29. (a) The distribution is not Normal—there were lots of 1s and 10s—but the nature of the
scale means that there can be no extreme outliers, so with a sample of size 60, the t
methods should be acceptable. (b) The mean is x = 5.9, s ≈ 3.7719, and the margin of error
is t* · s/√60:

            df   t*      Interval
Table D     50   2.009   5.9 ± 0.9783 = 4.9217 to 6.8783
Software    59   2.0010  5.9 ± 0.9744 = 4.9256 to 6.8744

 1 0000000000000000
 2 0000
 3 0
 4 0
 5 00000
 6 000
 7 0
 8 000000
 9 00000
10 000000000000000000

(c) Because this is not a random sample, it may not represent other children well.

7.30. (a) The distribution cannot be Normal because all values must be (presumably) integers
between 0 and 4. (b) The sample size (282) should make the t methods appropriate because
the distribution of ratings can have no outliers. (c) The margin of error is t* · s/√282, which
is either 0.1611 (Table D) or 0.1591 (software):

            df   t*      Interval
Table D    100   2.626   2.22 ± 0.1611 = 2.0589 to 2.3811
Software   281   2.5934  2.22 ± 0.1591 = 2.0609 to 2.3791

(d) The sample might not represent children from other locations well (or perhaps more
accurately, it might not represent well the opinions of parents of children from other
locations).

7.31. These intervals are constructed as in the previous exercise, except for the choice of t*.
We see that the width of the interval increases with confidence level.

                           df   t*      Interval
90% confidence  Table D   100   1.660   2.22 ± 0.1018 = 2.1182 to 2.3218
                Software  281   1.6503  2.22 ± 0.1012 = 2.1188 to 2.3212
95% confidence  Table D   100   1.984   2.22 ± 0.1217 = 2.0983 to 2.3417
                Software  281   1.9684  2.22 ± 0.1207 = 2.0993 to 2.3407

7.32. (a) For example, Subject 1's weight change is 61.7 − 55.7 = 6 kg. (b) The mean
change is x̄ ≈ 4.73125 kg and the standard deviation is s ≈ 1.7457 kg.
(c) SEx̄ = s/√16 ≈ 0.4364 kg; for df = 15, t* = 2.131, so the margin of error for 95%
confidence is ±0.9300 (software: ±0.9302). Based on a method that gives correct results
95% of the time, the mean weight change is 3.8012 to 5.6613 kg (software: 3.8010 to
5.6615 kg). (d) x̄ ≈ 10.40875 lb, s ≈ 3.8406 lb, and the 95% confidence interval is 8.3626
to 12.4549 lb (software: 8.3622 to 12.4553 lb). (e) H0 is µ = 16 lb. The test statistic is
t ≈ −5.823 with df = 15, which is highly significant evidence (P < 0.0001) against H0
(unless Ha is µ > 16 lb). (f) The data suggest that the excess calories were not converted
into weight; the subjects must have used this energy some other way. (See the next exercise
for more information.)

7.33. (a) t = (328 − 0)/(256/√16) ≈ 5.1250 with df = 15, for which P ≈ 0.0012. There is
strong evidence of a change in NEAT. (b) With t* = 2.131, the 95% confidence interval is
191.6 to 464.4 kcal/day. This tells us how much of the additional calories might have been
burned by the increase in NEAT: It consumed 19% to 46% of the extra 1000 kcal/day.

7.34. (a) For the differences, x̄ = $112.5 and s ≈ $123.7437. (b) We wish to test
H0: µ = 0 versus Ha: µ > 0, where µ is the mean difference between Jocko's estimates
and those of the other garage. (The alternative hypothesis is one-sided because the
insurance adjusters suspect that Jocko's estimates may be too high.) For this test, we find
t = (112.5 − 0)/(123.7437/√10) ≈ 2.87 with df = 9, for which P = 0.0092 (Minitab output
below). This is significant evidence against H0—that is, we have good reason to believe
that Jocko's estimates are higher. (c) The 95% confidence interval with df = 9 is
x̄ ± 2.262·s/√10 = $112.5 ± $88.5148 = $23.99 to $201.01. (The software interval is $23.98
to $201.02.) (d) Student answers may vary; based on the confidence interval, one could
justify any answer in the range $25,000 to $200,000.

Minitab output: Test of µ = 0 vs µ > 0


Variable N Mean StDev SE Mean T P-Value
Diff 10 112.5 123.7 39.1 2.87 0.0092

7.35. (a) We wish to test H0: µc = µd versus Ha: µc ≠ µd, where µc is the mean
computer-calculated mpg and µd is the mean mpg computed by the driver. Equivalently,
we can state the hypotheses in terms of µ, the mean difference between computer- and
driver-calculated mpgs, testing H0: µ = 0 versus Ha: µ ≠ 0. (b) With mean difference
x̄ = 2.73 and standard deviation s ≈ 2.8015, the test statistic is
t = (2.73 − 0)/(2.8015/√20) ≈ 4.3580 with df = 19, for which P ≈ 0.0003. We have strong
evidence that the results of the two computations are different.

7.36. (a) A stemplot is shown below; the five-number summary (in units of "picks") is
      Min    Q1      M       Q3    Max
      886   919.5   936.5    958   986
There are no outliers or particular skewness, but the stemplot reveals two peaks. (The
boxplot gives no evidence of the two peaks; they are visible in the quantile plot, but it
takes a fair amount of thought—or practice—to observe this in a quantile plot.)
      88  6
      89  5
      90  9
      91  236679
      92  0133445
      93  567
      94  05
      95  0226779
      96  12579
      97  07
      98  6
[Boxplot and Normal quantile plot of pick count against Normal score omitted]
(b) While the distribution is non-Normal, there are no outliers or strong skewness, so the
sample size n = 36 should make the t procedures reasonably safe. (c) The mean is
x̄ ≈ 938.2, the standard deviation is s ≈ 24.2971, and the standard error of the mean is
s/√36 ≈ 4.0495. (All are in units of picks.) (d) The 90% confidence interval for the mean
number of picks in a 1-pound bag is:
                df    t*      Interval
      Table D   30    1.697   938.2 ± 6.8720 = 931.3502 to 945.0943
      Software  35    1.6896  938.2 ± 6.8420 = 931.3803 to 945.0642

7.37. (a) To test H0: µ = 925 picks versus Ha: µ > 925 picks, we have
t = (938.2 − 925)/(24.2971/√36) ≈ 3.27 with df = 35, for which P ≈ 0.0012. (b) For
H0: µ = 935 picks versus Ha: µ > 935 picks, we have
t = (938.2 − 935)/(24.2971/√36) ≈ 0.80, again with df = 35, for which P ≈ 0.2158. (c) The
90% confidence interval from the previous exercise was 931.4 to 945.1 picks, which includes
935 but not 925. For a test of H0: µ = µ0 versus Ha: µ ≠ µ0, we know that P < 0.10 for
values of µ0 outside the interval, and P > 0.10 if µ0 is inside the interval. The one-sided
P-value would be half of the two-sided P-value.

7.38. The 90% confidence interval is 3.8 ± t*(1.02/√1783). With Table D, take df = 1000
and t* = 1.646; with software, take df = 1782 and t* = 1.6457. Either way, the confidence
interval is 3.7602 to 3.8398.

7.39. (a) The differences are spread from −0.018 to 0.020 g, with mean x̄ = −0.0015 and
standard deviation s ≈ 0.0122 g. A stemplot is shown below; the sample is too small to
make judgments about skewness or symmetry.
      −1  85
      −1
      −0  65
      −0
       0  2
       0  55
       1
       1
       2  0
(b) For H0: µ = 0 versus Ha: µ ≠ 0, we find t = (−0.0015 − 0)/(s/√8) ≈ −0.347 with
df = 7, for which P ≈ 0.7388. We cannot reject H0 based on this sample. (c) The 95%
confidence interval for µ is
      −0.0015 ± 2.365(0.0122/√8) = −0.0015 ± 0.0102 = −0.0117 to 0.0087 g
(d) The subjects from this sample may be representative of future subjects, but the test
results and confidence interval are suspect because this is not a random sample.

7.40. (a) The differences are spread from −31 to 45 g/cm², with mean x̄ = 4.625 and
standard deviation s ≈ 26.8485 g/cm². A stemplot is shown below; the sample is too small
to make judgments about skewness or symmetry.
      −3  1
      −2  8
      −1
      −0  4
       0  16
       1  3
       2
       3  5
       4  5
(b) For H0: µ = 0 versus Ha: µ ≠ 0, we find t = (4.625 − 0)/(s/√8) ≈ 0.487 with df = 7,
for which P ≈ 0.6410. We cannot reject H0 based on this sample. (c) The 95% confidence
interval for µ is
      4.625 ± 2.365(26.8485/√8) = 4.625 ± 22.4494 = −17.8244 to 27.0744 g/cm²
(d) See the answer to part (d) of the previous exercise.

7.41. (a) We test H0: µ = 0 versus Ha: µ > 0, where µ is the mean change in score (that
is, the mean improvement). (b) The distribution is slightly left-skewed (stemplot below),
with mean x̄ = 2.5 and s ≈ 2.8928.
      −0  6
      −0
      −0
      −0  0
       0  0011
       0  222333333
       0
       0  66666
(c) t = (2.5 − 0)/(s/√20) ≈ 3.8649, df = 19, and P ≈ 0.0005; there is strong evidence of
improvement in listening test scores. (d) With df = 19, we have t* = 2.093, so the 95%
confidence interval is 1.1461 to 3.8539.

7.42. Graphical summaries may vary; shown below is a stemplot, after omitting the clear
outlier (2631 seconds). The distribution is sharply right-skewed; in fact, based on the
1.5 × IQR criterion, the top eight call lengths qualify as outliers. Given the sharp
skewness of the data and the large number of outliers, the five-number summary is
preferred, and we should be cautious about relying on the confidence interval. All the
numerical summaries below are in units of seconds:
       0  00000000111233444555555566677777778889
       1  0000111222334444457789
       2  00127789
       3  2678
       4  367
       5
       6
       7  00
       8
       9  5
      10
      11  4
        x̄          s       Min   Q1     M       Q3    Max
      196.575   342.0215    1    54.5   103.5   200   2631
The 95% confidence interval for the mean is roughly 120 to 273 seconds:
                df    t*      Interval
      Table D   70    1.994   196.575 ± 76.2489 = 120.3261 to 272.8239 sec
      Software  79    1.9905  196.575 ± 76.1132 = 120.4618 to 272.6882 sec

7.43. The distribution is fairly symmetrical with no outliers (stemplot below). The mean IQ
is x̄ ≈ 114.983 and the standard deviation is s ≈ 14.8009. The 95% confidence interval is
x̄ ± t*(s/√60), which is about 111.1 to 118.8:
       8  12
       8  9
       9  04
       9  67
      10  01112223
      10  568999
      11  0002233444
      11  5677788
      12  223444
      12  56778
      13  01344
      13  6799
      14  2
      14  5
                df    t*      Interval
      Table D   50    2.009   111.1446 to 118.8221
      Software  59    2.0010  111.1598 to 118.8068
Because all students in the sample came from the same school, this might adequately
describe the mean IQ at this school, but the sample could not be considered representative
of all fifth graders.
14 5

7.44. We have data for all countries with population at least 20 million, so this cannot be
considered a random sample of (say) all countries.

7.45. We test H0 : median = 0 versus Ha : median > 0—or equivalently, H0 : p = 1/2 versus
Ha : p > 1/2, where p is the probability that Jocko’s estimate is higher. One difference is
0; of the nine non-zero differences, seven are positive. The P-value is P(X ≥ 7) = 0.0898
from a B(9, 0.5) distribution; there is not quite enough evidence to conclude that Jocko’s
estimates are higher. In Exercise 7.34 we were able to reject H0 ; here we cannot.
Note: The failure to reject H0 in this case is because with the sign test, we pay attention
only to the sign of each difference, not the size. In particular, the negative differences are
each given the same “weight” as each positive difference, in spite of the fact that the
negative differences are only −$50 and −$75, while most of the positive differences are
larger. See the “Caution” about the sign test on page 425 of the text.
210 Chapter 7 Inference for Distributions

Minitab output: Sign test of median = 0 versus median > 0


N BELOW EQUAL ABOVE P-VALUE MEDIAN
Diff 10 2 1 7 0.0898 125.0
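The sign-test P-value is just a binomial tail probability, so it is easy to check without
Minitab; a minimal Python sketch (ours), assuming scipy:
    from scipy import stats

    # 9 nonzero differences, 7 of them positive; P-value = P(X >= 7), X ~ B(9, 0.5)
    p = stats.binom.sf(6, 9, 0.5)     # sf(6) = P(X > 6) = P(X >= 7)
    print(round(p, 4))                # 0.0898, matching the Minitab output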

7.46. We test H0: median = 0 versus Ha: median ≠ 0. The Minitab output below gives P = 1
because there were four positive and four negative differences, giving us no reason to doubt
H0. (This is the same conclusion we reached with the t test, for which P ≈ 0.7388.)

Minitab output: Sign test of median = 0 versus median ≠ 0


N BELOW EQUAL ABOVE P-VALUE MEDIAN
opdiff 8 4 0 4 1.0000 -0.00150

7.47. We test H0: median = 0 versus Ha: median ≠ 0. There were three negative and
five positive differences, so the P-value is 2P(X ≥ 5) for a binomial distribution with
parameters n = 8 and p = 0.5. From Table C or software (Minitab output below), we have
P = 0.7266, which gives no reason to doubt H0. The t test P-value was 0.6410.

Minitab output: Sign test of median = 0 versus median ≠ 0


N BELOW EQUAL ABOVE P-VALUE MEDIAN
opdiff 8 3 0 5 0.7266 3.500

7.48. We test H0 : median = 0 versus Ha : median > 0, or H0 : p = 1/2 versus Ha : p > 1/2.
Three of the 20 differences are zero; of the other 17, 16 are positive. The P-value
is P(X ≥ 16) for a B(17, 0.5) distribution. While Table C cannot give us the exact
value of this probability, if we weaken the evidence by pretending that the three zero
differences were negative and look at the B(20, 0.5) distribution, we can estimate that
P < 0.0059—enough information to reject the null hypothesis. In fact, software reports the
P-value as 0.0001. (For the t test, we found P = 0.0005.)

Minitab output: Sign test of median = 0 versus median > 0


N BELOW EQUAL ABOVE P-VALUE MEDIAN
gain 20 1 3 16 0.0001 3.000

7.49. We test H0 : median = 0 versus Ha : median > 0, or H0 : p = 1/2 versus Ha : p > 1/2.
Out of the 20 differences, 17 are positive (and none equal 0). The P-value is P(X ≥ 17)
for a B(20, 0.5) distribution. From Table C or software (Minitab output below), we have
P = 0.0013, so we reject H0 and conclude that the results of the two computations are
different. (Using a t test, we found P ≈ 0.0003, which led to the same conclusion.)

Minitab output: Sign test of median = 0 versus median > 0


N BELOW EQUAL ABOVE P-VALUE MEDIAN
diff 20 3 0 17 0.0013 3.000

7.50. After taking logarithms, the 90% confidence interval is x̄ ± t*(s/√5). For df = 4,
t* = 2.132, and the confidence intervals are as shown in the table. (As we would expect,
after exponentiating to undo the logarithms, both intervals are equivalent except for
rounding differences: 311.2 to 414.5 hours.)
      Log        x̄        s       Confidence interval
      Common   2.5552   0.0653    2.4930 to 2.6175
      Natural  5.8836   0.1504    5.7403 to 6.0270

7.51. The standard deviation for the given data was s ≈ 0.012224. With α = 0.05,
t = x̄/(s/√15), and df = 14, we reject H0 if |t| ≥ 2.145, which means
|x̄| ≥ (2.145)(s/√15), or |x̄| ≥ 0.00677. Assuming µ = 0.002:
      P(|x̄| ≥ 0.00677) = 1 − P(−0.00677 ≤ x̄ ≤ 0.00677)
         = 1 − P((−0.00677 − 0.002)/(s/√15) ≤ (x̄ − 0.002)/(s/√15) ≤ (0.00677 − 0.002)/(s/√15))
         = 1 − P(−2.78 ≤ Z ≤ 1.51)
         ≈ 1 − (0.9345 − 0.0027) = 0.07
The power is about 7% against this alternative—not surprising, given the small sample size,
and the fact that the difference (0.002) is small relative to the standard deviation.
Note: Power calculations are often done with software. This may give answers that
differ slightly from those found by the method described in the text. Most software does
these computations with a "noncentral t distribution" (used in the text for two-sample
power problems) rather than a Normal distribution, resulting in more accurate answers. In
most situations, the practical conclusions drawn from the power computations are the same
regardless of the method used.
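The Normal-approximation arithmetic above can be coded directly; a short Python sketch
(ours) of the same steps—find the t* cutoff, then take two Normal tail areas:
    from scipy import stats
    import math

    s, n, mu = 0.012224, 15, 0.002
    se = s / math.sqrt(n)
    tstar = stats.t.ppf(0.975, df=n - 1)          # 2.145
    cutoff = tstar * se                           # reject H0 when |xbar| >= 0.00677
    power = 1 - (stats.norm.cdf((cutoff - mu) / se)
                 - stats.norm.cdf((-cutoff - mu) / se))
    print(round(power, 4))                        # about 0.07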

7.52. We will reject H0 when t = x̄/(s/√n) ≥ t*, where t* is the appropriate critical value
for the chosen sample size. This corresponds to x̄ ≥ 15t*/√n, so the power against µ = 2
is:
      P(x̄ ≥ 15t*/√n) = P((x̄ − 2)/(15/√n) ≥ (15t*/√n − 2)/(15/√n))
                      = P(Z ≥ t* − (2/15)√n)
For α = 0.05, the table below shows the power for a variety of sample sizes, and we see
that n ≥ 349 achieves the desired 80% power.
        n     t*       t* − (2/15)√n    Power
       100   1.6604       0.3271        0.3718
       200   1.6525      −0.2331        0.5921
       300   1.6500      −0.6594        0.7452
       340   1.6494      −0.8092        0.7908
       345   1.6493      −0.8273        0.7960
       346   1.6493      −0.8309        0.7970
       347   1.6493      −0.8345        0.7980
       348   1.6493      −0.8380        0.7990
       349   1.6492      −0.8416        0.8000
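The trial-and-error search in the table is easy to automate; a short Python loop (our own
sketch of the same computation) recovers n = 349:
    from scipy import stats
    import math

    sigma, mu, n = 15, 2, 2
    while True:
        tstar = stats.t.ppf(0.95, df=n - 1)               # one-sided, alpha = 0.05
        power = stats.norm.sf(tstar - (mu / sigma) * math.sqrt(n))
        if power >= 0.80:
            break
        n += 1
    print(n, round(power, 4))                             # 349, 0.8000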

7.53. Taking s = 1.5 as in Example 7.9, the power for the alternative µ = 0.75 is:
      P(x̄ ≥ t*s/√n when µ = 0.75) = P((x̄ − 0.75)/(s/√n) ≥ (t*s/√n − 0.75)/(s/√n))
                                   = P(Z ≥ t* − 0.5√n)
Using trial and error, we find that with n = 26, power ≈ 0.7999, and with n = 27,
power ≈ 0.8139. Therefore, we need n > 26.

7.54. (a) Use a two-sided alternative (Ha: µA ≠ µB) because we (presumably) have no prior
suspicion that one design will be better than the other. (b) Both sample sizes are the same
(n1 = n2 = 15), so the appropriate degrees of freedom would be df = 15 − 1 = 14. (c) For a
two-sided test at α = 0.05, we need |t| > t*, where t* = 2.145 is the 0.025 critical value
for a t distribution with df = 14.

7.55. Because 2.264 < t < 2.624 and the alternative is two-sided, Table D tells us that the
P-value is 0.02 < P < 0.04. (Software gives P = 0.0280.) That is sufficient to reject H0 at
α = 0.05.

7.56. We find SE_D ≈ 2.0396. The options for the 95% confidence interval for µ1 − µ2 are
shown below. This interval includes fewer values than a 99% confidence interval would
(that is, a 99% confidence interval would be wider) because increasing our confidence level
means that we need a larger margin of error.
      df      t*      Confidence interval
      85.4   1.9881   −14.0550 to −5.9450
      49     2.0096   −14.0987 to −5.9013
      40     2.021    −14.1220 to −5.8780

7.57. We find SE_D ≈ 4.5607. The options for the 95% confidence interval for µ1 − µ2 are
shown below. The instructions for this exercise say to use the second approximation
(df = 9), in which case we do not reject H0, because 0 falls in the 95% confidence
interval. Using the first approximation (df = 15.7, typically given by software), the
interval is narrower, and we would reject H0 at α = 0.05 against a two-sided alternative.
(In fact, t ≈ −2.193, for which 0.05 < P < 0.1 [Table D, df = 9], or P ≈ 0.0438 [software,
df = 15.7].)
      df      t*      Confidence interval
      15.7   2.1236   −19.6851 to −0.3149
      9      2.262    −20.3163 to 0.3163

7.58. df = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1)]
         = (12²/30 + 9²/25)² / [(12²/30)²/29 + (9²/25)²/24] ≈ 52.4738.

7.59. SPSS and SAS give both results (the SAS output refers to the unpooled result as the
Satterthwaite method), while JMP and Excel show only the unpooled procedures. The
pooled t statistic is 1.998, for which P = 0.0808.
Note: When the sample sizes are equal—as in this case—the pooled and unpooled t
statistics are equal. (See the next exercise.)
Both Excel and JMP refer to the unpooled test with the slightly-misleading phrase
“assuming unequal variances.” The SAS output also implies that the variances are unequal
for this method. In fact, unpooled procedures make no assumptions about the variances.
Finally, note that both Excel and JMP can do pooled procedures as well as the unpooled
procedures that are shown.

7.60. If n1 = n2 = n, the pooled estimate of the variance is:
      sp² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(n − 1)s1² + (n − 1)s2²]/[2(n − 1)]
          = (s1² + s2²)/2
The pooled standard error is therefore sp√(1/n + 1/n) = sp√(2/n) = √((s1² + s2²)/n)
= √(s1²/n + s2²/n), which is the same as the unpooled standard error.
Note: The text refers to this as "simple algebra." Bear in mind that some students might
consider that phrase to be an oxymoron.
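A quick numerical spot-check of the identity (the sample values below are hypothetical,
chosen only for illustration):
    import math

    s1, s2, n = 3.7, 5.2, 12                     # hypothetical values, n1 = n2 = n
    sp2 = ((n - 1) * s1**2 + (n - 1) * s2**2) / (2 * n - 2)
    pooled_se = math.sqrt(sp2 * (2 / n))
    unpooled_se = math.sqrt(s1**2 / n + s2**2 / n)
    print(pooled_se, unpooled_se)                # identical, as the algebra shows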

7.61. (a) Hypotheses should involve µ1 and µ2 (population means) rather than x̄1 and x̄2
(sample means). (b) The samples are not independent; we would need to compare the 56
males to the 44 females. (c) We need P to be small (for example, less than 0.10) to reject
H0. A large P-value like this gives no reason to doubt H0. (d) Assuming the researcher
computed the t statistic using x̄1 − x̄2, a positive value of t does not support Ha. (The
one-sided P-value would be 0.982, not 0.018.)

7.62. (a) Because 0 is not in the confidence interval, we would reject H0 at the 5% level.
(b) Larger samples generally give smaller margins of error (at the same confidence level,
and assuming that the standard deviations for the large and small samples are about the
same). One conceptual explanation for this is that larger samples give more information and
therefore offer more precise results. A more mathematical explanation: In looking at the
formula for a two-sample confidence interval, we see that SE_D = √(s1²/n1 + s2²/n2), so
that if n1 and n2 are increased, the standard error decreases.
Note: For (a), we can even make some specific statements about t and its P-value: The
confidence interval tells us that x̄1 − x̄2 = 1.55 (halfway between 0.8 and 2.3) and the
margin of error is 0.75 (half the width of the interval). As t* for a 95% confidence
interval is at least 1.96, SE_D = 0.75/t* is less than about 0.383, and the t statistic
t = 1.55/SE_D is at least 4.05. (The largest possible value of t, for df = 1, is about
26.3.) A little experimentation with different df reveals that for all df, P < 0.024; if
df ≥ 10, then P < 0.001.

7.63. (a) We cannot reject H0: µ1 = µ2 in favor of the two-sided alternative at the 5%
level because 0.05 < P < 0.10 (Table D) or P ≈ 0.0542 (software). (b) We could reject H0
in favor of Ha: µ1 < µ2. A negative t statistic means that x̄1 < x̄2, which supports the
claim that µ1 < µ2, and the one-sided P-value would be half of its value from part (a):
0.025 < P < 0.05 (Table D) or P ≈ 0.0271 (software).

7.64. We find SE_D ≈ 3.4792. The options for the 95% confidence interval for µ1 − µ2 are
shown below. A 99% confidence interval would include more values (it would be wider)
because increasing our confidence level means that we need a larger margin of error.
      df      t*      Confidence interval
      87.8   1.9873   −16.9144 to −3.0856
      39     2.0227   −17.0374 to −2.9626
      30     2.042    −17.1046 to −2.8954

7.65. (a) Stemplots (below) do not look particularly Normal, but they have no extreme
outliers or skewness, so t procedures should be reasonably safe. (b) The table of summary
statistics is the first table below. (c) We wish to test H0: µN = µS versus Ha: µN < µS.
(d) We find SE_D ≈ 0.3593 and t ≈ −4.303, so P ≈ 0.0001 (df = 26.5) or P < 0.0005
(df = 13). Either way, we reject H0. (e) The 95% confidence interval for the difference is
one of the two options in the second table below.
      Neutral        Sad
      0  0000000     0  0
      0  55          0  5
      1  000         1  000
      1              1  555
      2  00          2  0
                     2  55
                     3  00
                     3  55
                     4  00
      Group     n      x̄        s
      Neutral   14   $0.5714  $0.7300
      Sad       17   $2.1176  $1.2441
      df      t*      Confidence interval
      26.5   2.0538   −2.2842 to −0.8082
      13     2.160    −2.3224 to −0.7701

7.66. (a) The scores can be examined with either histograms or stemplots (below). Neither
distribution reveals any extreme skewness or outliers, so t procedures should be safe.
(b) Summary statistics for the two distributions are given in the first table below. We
find SE_D ≈ 0.2891 and t ≈ 3.632, so P ≈ 0.0008 (df = 39.5) or 0.001 < P < 0.002
(df = 19). Either way, we reject H0. (c) The 95% confidence interval for the difference is
one of the two options in the second table below. (d) The hypothesis test and confidence
interval suggest that primed individuals had a more positive response to the shampoo label,
with an average rating between 0.4 and 1.6 points higher than the unprimed group. (However,
priming can only do so much, as the average score in the primed group was only 4 on a scale
of 1 to 7.)
      Primed            Non-primed
      1                 1  00
      2  00             2  00
      3  000            3  000000000000
      4  0000000000     4  000
      5  0000000        5  0
      Group         n     x̄      s
      Primed        22   4.00   0.9258
      Non-primed    20   2.95   0.9445
      df      t*      Confidence interval
      39.5   2.0220   0.4655 to 1.6345
      19     2.093    0.4450 to 1.6550

7.67. (a) The female mean and standard deviation are x̄F ≈ 4.0791 and sF ≈ 0.9861; for
males, they are x̄M ≈ 3.8326 and sM ≈ 1.0677. (b) Both distributions are somewhat skewed
to the left. This can be seen by constructing a histogram, but is also evident in the data
table in the text by noting the large numbers of "4" and "5" ratings for both genders.
However, because the ratings range from 1 to 5, there are no outliers, so the t procedures
should be safe. (c) We find SE_D ≈ 0.0851 and t ≈ 2.898, for which P ≈ 0.0040
(df = 402.2) or 0.002 < P < 0.005 (df = 220). Either way, there is strong evidence of a
difference in satisfaction. (d) The 95% confidence interval for the difference is one of
the three options in the table below—roughly 0.08 to 0.41. (e) While we have evidence of a
difference in mean ratings, it might not be as large as 0.25.
      df       t*      Confidence interval
      402.2   1.9659   0.0793 to 0.4137
      220     1.9708   0.0788 to 0.4141
      100     1.984    0.0777 to 0.4153
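Parts (c) and (d) can be sketched in Python from the summary statistics above, together
with the group sizes nF = 468 and nM = 221 (the sizes that appear in the pooled analysis
of Exercise 7.88; they are an assumption here, taken from that solution):
    from scipy import stats
    import math

    xf, sf, nf = 4.0791, 0.9861, 468      # females (n's from Exercise 7.88)
    xm, sm, nm = 3.8326, 1.0677, 221      # males
    v1, v2 = sf**2 / nf, sm**2 / nm
    se = math.sqrt(v1 + v2)                                     # SE_D, ~0.0851
    df = (v1 + v2)**2 / (v1**2 / (nf - 1) + v2**2 / (nm - 1))   # ~402.2
    t = (xf - xm) / se                                          # ~2.898
    p = 2 * stats.t.sf(abs(t), df)                              # ~0.0040
    tstar = stats.t.ppf(0.975, df)
    print(t, df, p)
    print(xf - xm - tstar * se, xf - xm + tstar * se)           # ~0.079 to 0.414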

7.68. (a) For testing H0: µLC = µLF versus Ha: µLC ≠ µLF, we have SE_D ≈ 6.7230 and
t ≈ 4.165, so P < 0.0001 (df = 62.1) or P < 0.001 (df = 31). Either way, we clearly reject
H0. (b) It might be that the moods of subjects who dropped out differed from the moods of
those who stayed; in particular, it seems reasonable to suspect that those who dropped out
had higher TMDS scores.

7.69. (a) Assuming we have SRSs from each population, use of two-sample t procedures seems
reasonable. (We cannot assess Normality, but the large sample sizes would overcome most
problems.) (b) We wish to test H0: µf = µm versus Ha: µf ≠ µm. (c) We find
SE_D ≈ 6.8490 mg/dl. The test statistic is t ≈ 0.276, with df = 76.1 (or 36—use 30 for
Table D), for which P ≈ 0.78. We have no reason to believe that male and female cholesterol
levels are different. (d) The options for the 95% confidence interval for µf − µm are shown
below. (e) It might not be appropriate to treat these students as SRSs from larger
populations.
      df      t*      Confidence interval
      76.1   1.9916   −11.7508 to 15.5308
      36     2.0281   −12.0005 to 15.7805
      30     2.042    −12.0957 to 15.8757
Note: Because t distributions are more spread out than Normal distributions, a t-value
that would not be significant for a Normal distribution (such as 0.276) cannot possibly be
significant when compared to a t distribution.

7.70. Considering the LDL cholesterol levels, we test H0: µf = µm versus Ha: µf < µm. We
find SE_D ≈ 6.2087 mg/dl, so the test statistic is t ≈ −2.104, with df = 70.5 (or 36—use
30 for Table D), for which P ≈ 0.0195. We have enough evidence to conclude LDL cholesterol
levels are higher for males than for females.
Note: This t statistic was computed by subtracting the male mean from the female mean.
Reversing the order of subtraction would make t positive, but would not change the P-value
or the conclusion.

7.71. (a) The distribution cannot be Normal because all numbers are integers. (b) The t
procedures should be appropriate because we have two large samples with no outliers.
(c) We will test H0: µI = µC versus Ha: µI > µC (or µI ≠ µC). The one-sided alternative
reflects the researchers' (presumed) belief that the intervention would increase scores on
the test. The two-sided alternative allows for the possibility that the intervention might
have had a negative effect. (d) SE_D = √(sI²/nI + sC²/nC) ≈ 0.1198 and
t = (x̄I − x̄C)/SE_D ≈ 6.258. Regardless of how we compute degrees of freedom (df = 354
or 164), the P-value is very small: P < 0.0001. We reject H0 and conclude that the
intervention increased test scores. (e) The interval is x̄I − x̄C ± t*·SE_D; the value of
t* depends on the df (see the table below), but note that in every case the interval rounds
to 0.51 to 0.99. (f) The results for this sample may not generalize well to other areas of
the country.
      df       t*      Confidence interval
      354.0   1.9667   0.5143 to 0.9857
      164     1.9745   0.5134 to 0.9866
      100     1.984    0.5122 to 0.9878

7.72. (a) The distribution cannot be Normal, because all numbers are integers. (b) The t
procedures should be appropriate because we have two large samples with no outliers.
(c) Again, we test H0: µI = µC versus Ha: µI > µC (or µI ≠ µC). The one-sided alternative
reflects the researchers' (presumed) belief that the intervention would increase
self-efficacy scores. The two-sided alternative allows for the possibility that the
intervention might have had a negative effect. (d) SE_D = √(sI²/nI + sC²/nC) ≈ 0.1204 and
t = (x̄I − x̄C)/SE_D ≈ 3.571. Regardless of how we compute degrees of freedom (df = 341.8
or 164), the (one-sided) P-value is about 0.0002. We reject H0 and conclude that the
intervention increased self-efficacy scores. (e) The interval is x̄I − x̄C ± t*·SE_D; the
value of t* depends on the df (see the table below), but in every case the interval rounds
to 0.19 to 0.67. (f) As in the previous exercise, the results for this sample may not
generalize well to other areas of the country.
      df       t*      Confidence interval
      341.8   1.9669   0.1932 to 0.6668
      164     1.9745   0.1922 to 0.6678
      100     1.984    0.1911 to 0.6689

7.73. (a) This may be near enough to an SRS, if this company's working conditions were
similar to those of other workers. (b) SE_D ≈ 0.7626; regardless of how we choose df, the
interval rounds to 9.99 to 13.01 mg·y/m³ (see the table below). (c) A one-sided alternative
would seem to be reasonable here; specifically, we would likely expect that the mean
exposure for outdoor workers would be lower. For testing H0, we find t = 15.08, for which
P < 0.0001 with either df = 137 or 114 (and for either a one- or a two-sided alternative).
We have strong evidence that outdoor concrete workers have lower dust exposure than the
indoor workers. (d) The sample sizes are large enough that skewness should not matter.
      df       t*      Confidence interval
      137.1   1.9774   9.9920 to 13.0080
      114     1.9810   9.9893 to 13.0107
      100     1.984    9.9870 to 13.0130

7.74. With the given standard deviations, SE_D ≈ 0.2653; regardless of how we choose df, a
95% confidence interval for the difference in means rounds to 4.37 to 5.43 mg·y/m³ (see the
table below). With the null hypothesis H0: µi = µo (and either a one- or two-sided
alternative, as in the previous exercise), we find t = 18.47, for which P < 0.0001
regardless of df and the chosen alternative. We have strong evidence that outdoor concrete
workers have lower respirable dust exposure than the indoor workers.
      df       t*      Confidence interval
      121.5   1.9797   4.3747 to 5.4253
      114     1.9810   4.3744 to 5.4256
      100     1.984    4.3736 to 5.4264

7.75. To find a confidence interval (x̄1 − x̄2) ± t*·SE_D, we need one of the following:
• Sample sizes and standard deviations—in which case we could find the interval in the
  usual way
• t and df—because t = (x̄1 − x̄2)/SE_D, so we could compute SE_D = (x̄1 − x̄2)/t and use
  df to find t*
• df and a more accurate P-value—from which we could determine t, and then proceed as
  above
The confidence interval could give us useful information about the magnitude of the
difference (although with such a small P-value, we do know that a 95% confidence interval
would not include 0).

7.76. (a) The 68–95–99.7 rule suggests that the distributions are not Normal: If they were
Normal, then (for example) 95% of 7-to-10-year-olds drink between −13.2 and 29.6 oz of
sweetened drinks per day. As negative numbers do not make sense (unless some children are
regurgitating sweetened drinks), the distributions must be right-skewed. (b) We find
SE_D ≈ 4.3786 and t ≈ −1.439, with either df = 7.8 (P ≈ 0.1890) or df = 4 (P ≈ 0.2236).
We do not have enough evidence to reject H0. (c) The possible 95% confidence intervals are
given in the table below. (d) Because the distributions are not Normal and the samples are
small, the t procedures are questionable for these data. (e) Because this group is not an
SRS—and indeed might not be random in any way—we would have to be very cautious about
extending these results to other children.
      df     t*      Confidence interval
      7.8   2.3159   −16.4404 to 3.8404
      4     2.776    −18.4551 to 5.8551

7.77. This is a matched pairs design; for example, Monday hits are (at least potentially) not
independent of one another. The correct approach would be to use one-sample t methods on
the seven differences (Monday hits for design 1 minus Monday hits for design 2, Tuesday/1
minus Tuesday/2, and so on).

7.78. (a) Results for this randomization will depend on the technique used.
(b) SE_D ≈ 0.5235, and the options for the 95% confidence interval are given below.
(c) Because 0 falls outside the 95% confidence interval, the P-value is less than 0.05, so
we would reject H0. (For reference, t ≈ 3.439 and the actual P-value is either 0.0045 or
0.0074, depending on which df we use.)
      df      t*      Confidence interval
      12.7   2.1651   0.6667 to 2.9333
      9      2.262    0.6160 to 2.9840

7.79. The next 10 employees who need screens might not be an independent group—perhaps
they all come from the same department, for example. Randomization reduces the chance
that we end up with such unwanted groupings.

7.80. (a) The null hypothesis is µ1 = µ2; the alternative can be either two- or one-sided.
(It might be a reasonable expectation that µ1 > µ2.) We find SE_D ≈ 0.2796 and t ≈ 8.369.
Regardless of df and Ha, the conclusion is the same: P is very small, and we conclude that
WSJ ads are more trustworthy. (b) Possible 95% confidence intervals are given in the table
below; both place the difference in trustworthiness at between about 1.8 and 2.9 points.
(c) Advertising in WSJ is seen as more reliable than advertising in the National
Enquirer—a conclusion that probably comes as a surprise to no one.
      df       t*      Confidence interval
      121.5   1.9797   1.7865 to 2.8935
      60      2.000    1.7808 to 2.8992

7.81. (a) Stemplots, boxplots, and five-number summaries (in cm) are shown below. The
north distribution is right-skewed, while the south distribution is left-skewed. (b) The
methods of this section seem to be appropriate in spite of the skewness because the sample
sizes are relatively large, and there are no outliers in either distribution. (c) We test
H0: µn = µs versus Ha: µn ≠ µs; we should use a two-sided alternative because we have no
reason (before looking at the data) to expect a difference in a particular direction.
(d) The means and standard deviations are x̄n = 23.7, sn ≈ 17.5001, x̄s ≈ 34.53, and
ss ≈ 14.2583 cm. Then SE_D ≈ 4.1213, so t ≈ −2.629 with df = 55.7 (P = 0.011) or df = 29
(P = 0.014). We conclude that the means are different (specifically, the south mean is
greater than the north mean). (e) See the last table below for possible 95% confidence
intervals.
       North        South
       43322    0   2
          65    0   57
      443310    1   2
         955    1   8
                2   13
        8755    2   689
           0    3   2
         996    3   566789
          43    4   003444
           6    4   578
           4    5   0112
          85    5
[Side-by-side boxplots of tree diameter (cm) for the north and south samples omitted]
              Min   Q1     M       Q3     Max
      North   2.2   10.2   17.05   39.1   58.8
      South   2.6   26.1   37.70   44.6   52.9
      df      t*      Confidence interval
      55.7   2.0035   −19.0902 to −2.5765
      29     2.045    −19.2614 to −2.4053

7.82. (a) Stemplots, boxplots, and five-number summaries (in cm) are shown below. The east
distribution is right-skewed, while the west distribution is left-skewed. (b) The methods
of this section seem to be appropriate in spite of the skewness because the sample sizes
are relatively large, and there are no outliers in either distribution. (c) We test
H0: µe = µw versus Ha: µe ≠ µw; we should use a two-sided alternative because we have no
reason (before looking at the data) to expect a difference in a particular direction.
(d) The means and standard deviations are x̄e ≈ 21.716, se ≈ 16.0743, x̄w ≈ 30.283, and
sw ≈ 15.3314 cm. Then SE_D ≈ 4.0556, so t ≈ −2.112 with df = 57.8 (P = 0.0390) or df = 29
(P = 0.0434). We conclude that the means are different at α = 0.05 (specifically, the west
mean is greater than the east mean). (e) See the last table below for possible 95%
confidence intervals.
        East        West
         222    0   233
     9566655    0
        3100    1   11
           7    1   78
       33222    2   0011
                2   55
          11    3   00
          98    3   555669
         333    4   023444
          86    4
           1    5
                5   78
[Side-by-side boxplots of tree diameter (cm) for the east and west samples omitted]
             Min   Q1     M       Q3     Max
      East   2.3   6.7    19.65   38.7   51.1
      West   2.9   20.4   33.20   42.1   58.0
      df      t*      Confidence interval
      57.8   2.0018   −16.6852 to −0.4481
      29     2.045    −16.8604 to −0.2730

7.83. (a) SE_D ≈ 1.9686. Answers will vary with the df used (see the table below), but the
interval is roughly −1 to 7 units. (b) Because of random fluctuations between stores, we
might (just by chance) have seen a rise in the average number of units sold even if actual
mean sales had remained unchanged. (Based on the confidence interval, mean sales might have
even dropped slightly.)
      df       t*      Confidence interval
      122.5   1.9795   −0.8968 to 6.8968
      54      2.0049   −0.9468 to 6.9468
      50      2.009    −0.9549 to 6.9549

7.84. (a) Good statistical practice dictates that the alternative hypothesis should be chosen
without looking at the data; we should only choose a one-sided alternative if we have some
reason to expect it before looking at the sample results. (b) The correct P-value is twice
that reported for the one-tailed test: P = 0.12.

7.85. (a) We test H0: µb = µf versus Ha: µb > µf. SE_D ≈ 0.5442 and t ≈ 1.654, for which
P ≈ 0.0532 (df = 37.6) or 0.0577 (df = 18); there is not quite enough evidence to reject
H0 at α = 0.05. (b) The confidence interval depends on the degrees of freedom used; see the
table below. (c) We need two independent SRSs from Normal populations.
      df      t*      Confidence interval
      37.6   2.0251   −0.2021 to 2.0021
      18     2.101    −0.2434 to 2.0434

7.86. See the solution to Exercise 7.65 for a table of means and standard deviations. The
pooled standard deviation is sp ≈ 1.0454, so the pooled standard error is
sp·√(1/14 + 1/17) ≈ 0.3773. The test statistic is t ≈ −4.098 with df = 29, for which
P ≈ 0.0002, and the 95% confidence interval (with t* = 2.045) is −2.3178 to −0.7747.
In the solution to Exercise 7.65, we reached the same conclusion on the significance test
(t ≈ −4.303 and P ≈ 0.0001), and the confidence interval was quite similar (roughly −2.3
to −0.8).
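A sketch of the pooled computation in Python, working from the Exercise 7.65 summary
statistics (the variable names are ours):
    from scipy import stats
    import math

    x1, s1, n1 = 0.5714, 0.7300, 14          # neutral group (Exercise 7.65)
    x2, s2, n2 = 2.1176, 1.2441, 17          # sad group
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    se = sp * math.sqrt(1 / n1 + 1 / n2)     # pooled standard error, ~0.3773
    t = (x1 - x2) / se                       # ~ -4.098, with df = n1 + n2 - 2 = 29
    p = 2 * stats.t.sf(abs(t), n1 + n2 - 2)  # two-sided P-value
    print(sp, t, p)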

7.87. See the solution to Exercise 7.66 for a table of means and standard deviations. The
pooled standard deviation is sp ≈ 0.9347, so the pooled standard error is
sp·√(1/22 + 1/20) ≈ 0.2888. The test statistic is t ≈ 3.636 with df = 40, for which
P ≈ 0.0008, and the 95% confidence interval (with t* = 2.021) is 0.4663 to 1.6337.
In the solution to Exercise 7.66, we reached the same conclusion on the significance test
(t ≈ 3.632 and P ≈ 0.0008), and the confidence interval (using the more accurate
df = 39.5) was quite similar: 0.4655 to 1.6345.

7.88. See the solution to Exercise 7.67 for means and standard deviations. The pooled
standard deviation is sp ≈ 1.0129, so the pooled standard error is
sp·√(1/468 + 1/221) ≈ 0.0827. The test statistic is t ≈ 2.981 with df = 687, for which
P = 0.0030 (or, using Table D, 0.002 < P < 0.005). The 95% confidence interval is one of
the two entries in the table below.
In the solution to Exercise 7.67, we reached the same conclusion on the significance test
(t ≈ 2.898 and P ≈ 0.0040). The confidence intervals were slightly wider, but similar;
using the more accurate df = 402.2, the interval was 0.0793 to 0.4137. (The other intervals
were wider than this.)
      df    t*      Confidence interval
      687   1.9634   0.0842 to 0.4088
      100   1.984    0.0825 to 0.4105

7.89. See the solution to Exercise 7.81 for means and standard deviations. The pooled
standard deviation is sp ≈ 15.9617, and the standard error is SE_D ≈ 4.1213. For the
significance test, t ≈ −2.629, df = 58, and P ≈ 0.0110, so we have fairly strong evidence
(though not quite significant at α = 0.01) that the south mean is greater than the north
mean. Possible answers for the confidence interval (with software, and with Table D) are
given in the table below. All results are similar to those found in Exercise 7.81.
      df    t*      Confidence interval
      58    2.0017   −19.0830 to −2.5837
      50    2.009    −19.1130 to −2.5536
Note: If n1 = n2 (as in this case), the standard error and t statistic are the same for the
usual and pooled procedures. The degrees of freedom will usually be different (specifically,
df is larger for the pooled procedure, unless s1 = s2 and n1 = n2).

7.90. Testing the same hypotheses as in that example (H0: µ1 = µ2 versus Ha: µ1 ≠ µ2), we
have pooled standard deviation sp ≈ 8.1772, so that SE_D ≈ 1.2141 and t ≈ 4.777. With
either df = 236 or 100, we find that P < 0.001, so we have very strong evidence that mean
systolic blood pressure is higher in low sleep efficiency children. This is nearly
identical to the results of the unpooled analysis (t = 4.18, P < 0.001).

7.91. With sn ≈ 17.5001, ss ≈ 14.2583, and nn = ns = 30, we have sn²/nn ≈ 10.2085 and
ss²/ns ≈ 6.7767, so:
      df = (sn²/nn + ss²/ns)² / [(sn²/nn)²/(nn − 1) + (ss²/ns)²/(ns − 1)]
         = (10.2085 + 6.7767)² / [(1/29)(10.2085² + 6.7767²)] ≈ 55.7251

7.92. With se ≈ 16.0743, sw ≈ 15.3314, and ne = nw = 30, we have se²/ne ≈ 8.6128 and
sw²/nw ≈ 7.8351, so:
      df = (se²/ne + sw²/nw)² / [(se²/ne)²/(ne − 1) + (sw²/nw)²/(nw − 1)]
         = (8.6128 + 7.8351)² / [(1/29)(8.6128² + 7.8351²)] ≈ 57.8706

7.93. (a) With si = 7.8, ni = 115, so = 3.4, and no = 220, we have si²/ni ≈ 0.5290 and
so²/no ≈ 0.05455, so:
      df = (0.5290 + 0.05455)² / (0.5290²/114 + 0.05455²/219) ≈ 137.0661
(b) sp = √[((ni − 1)si² + (no − 1)so²)/(ni + no − 2)] ≈ 5.3320, which is slightly closer
to so (the standard deviation from the larger sample). (c) With no assumption of equality,
SE1 = √(si²/ni + so²/no) ≈ 0.7626. With the pooled method,
SE2 = sp·√(1/ni + 1/no) ≈ 0.6136. (d) With the pooled standard deviation, t ≈ 18.74 and
df = 333, for which P < 0.0001, and the 95% confidence interval is as shown in the table
below. With the smaller standard error, the t value is larger (it had been 15.08), and the
confidence interval is narrower. The P-value is also smaller (although both are less than
0.0001). (e) With si = 2.8, ni = 115, so = 0.7, and no = 220, we have si²/ni ≈ 0.06817 and
so²/no ≈ 0.002227, so:
      df = (0.06817 + 0.002227)² / (0.06817²/114 + 0.002227²/219) ≈ 121.5030
The pooled standard deviation is sp ≈ 1.7338; the standard errors are SE1 ≈ 0.2653 (with
no assumptions) and SE2 ≈ 0.1995 (assuming equal standard deviations). The pooled t is
24.56 (df = 333, P < 0.0001), and the 95% confidence intervals are shown in the table
below. The pooled and usual t procedures compare similarly to the results for part (d):
With the pooled procedure, t is larger, and the interval is narrower.
                  df    t*      Confidence interval
      Part (d)    333   1.9671   10.2931 to 12.7069
                  100   1.984    10.2827 to 12.7173
      Part (e)    333   1.9671   4.5075 to 5.2925
                  100   1.984    4.5042 to 5.2958

7.94. We have n1 = n2 = 5. (a) For a two-sided test with df = 4, the critical value is
t* = 2.776. (b) With the pooled procedures, df = 8 and the critical value is t* = 2.306.
(c) The smaller critical value with the pooled approach means that a smaller t-value (that
is, weaker evidence) is needed to reject H0.
Note: When software is available, we use the more accurate degrees of freedom for the
standard approach. In this case, pooling typically is less beneficial; for this example,
the software output shown in Figure 7.14 shows that df ≈ 7.98 for the unpooled approach.

7.95. (a) From an F(15, 22) distribution with α = 0.05, F ∗ = 2.15. (b) Because F = 2.45
is greater than the 5% critical value, but less than the 2.5% critical value (F ∗ = 2.50),
we know that P is between 2(0.025) = 0.05 and 2(0.05) = 0.10. (Software tells us that
P = 0.055.) F = 2.45 is significant at the 10% level but not at the 5% level.
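Both the critical values and the two-sided P-value come straight from the F distribution;
a minimal scipy sketch (ours):
    from scipy import stats

    fstar_05 = stats.f.ppf(0.95, 15, 22)      # 5% upper critical value, ~2.15
    fstar_025 = stats.f.ppf(0.975, 15, 22)    # 2.5% upper critical value, ~2.50
    p = 2 * stats.f.sf(2.45, 15, 22)          # two-sided P-value, ~0.055
    print(fstar_05, fstar_025, p)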

7.96. The power would be higher. Larger differences are easier to detect; that is, when
µ1 − µ2 is more than 5, there is a greater chance that the test statistic will be
significant.
Note: In fact, as the table below shows, if we repeat the computations of Example 7.23
with larger values of µ1 − µ2, the power increases rapidly.
      µ1 − µ2    Power
         5       0.7965
         6       0.9279
         7       0.9817
         8       0.9968
         9       0.9996

7.97. The power would be smaller. A larger value of σ means that large differences between
the sample means would arise more often by chance so that, if we observe such a difference,
it gives less evidence of a difference in the population means.
Note: The table below shows the decrease in the power as σ increases.
      σ      Power
      7.4    0.7965
      7.5    0.7844
      7.6    0.7722
      7.7    0.7601
      7.8    0.7477

7.98. (a) F = 9.1/3.5 = 2.6. (b) For 17 and 8 degrees of freedom (using 15 and 8 in
Table E), we need F > 4.10 to reject H0 at the 5% level with a two-sided alternative.
(c) We cannot conclude that the standard deviations are different. (Software gives
P = 0.1716.)
7.99. The test statistic is F = (9.9/7.5)² ≈ 1.742, with df 60 and 176. The two-sided
P-value is 0.0057, so we can reject H0 and conclude that the standard deviations are
different. We do not know if the distributions are Normal, so this test may not be reliable.
7.100. The test statistic is F = (13.75/7.94)² ≈ 2.9989, with df 70 and 36. The two-sided
P-value is 0.0005, so we have strong evidence that the standard deviations are different.
However, this test assumes that the underlying distributions are Normal; if this is not
true, then the conclusion may not be reliable.

7.101. The test statistic is F = 1.16²/1.15² ≈ 1.0175 with df 211 and 164. Table E tells
us that P > 0.20, while software gives P = 0.9114. The distributions are not Normal ("total
score was an integer between 0 and 6"), so the test may not be reliable (although with s1
and s2 so close, the conclusion is probably correct). To reject at the 5% level, we would
need F > F*, where F* = 1.46 (using df 120 and 100 from Table E) or F* ≈ 1.3392 (using
software). As F = s2²/s1², we would need s2² > s1²·F*, or s2 > 1.15√F*, which is about
1.3896 (Table E) or 1.3308 (software).
7.102. The test statistic is F = 1.19²/1.12² ≈ 1.1289 with df 164 and 211. Table E tells
us that P > 0.2, while software gives P = 0.4063. We cannot conclude that the standard
deviations are different. The distributions are not Normal (because all responses are
integers from 1 to 5), so the test may not be reliable.

7.103. The test statistic is F = 7.8²/3.4² ≈ 5.2630 with df 114 and 219. Table E tells us
that P < 0.002, while software gives P < 0.0001; we have strong evidence that the standard
deviations differ. The authors described the distributions as somewhat skewed, so the
Normality assumption may be violated.

7.104. The test statistic is F = 2.8²/0.7² = 16 with df 114 and 219. Table E tells us that
P < 0.002, while software gives P < 0.0001; we have strong evidence that the standard
deviations differ. We have no information about the Normality of the distributions, so it
is difficult to determine how reliable these conclusions are. (We can observe that for
Exercise 7.73, x̄1 − 3s1 and x̄2 − 3s2 were both negative, hinting at the skewness of those
distributions. For Exercise 7.74, this is not the case, suggesting that these distributions
might not be as skewed.)

7.105. The test statistic is F = 17.5001²/14.2583² ≈ 1.5064 with df 29 and 29. Table E
tells us that P > 0.2, while software gives P = 0.2757; we cannot conclude that the
standard deviations differ. The stemplots and boxplots of the north/south distributions in
Exercise 7.81 do not appear to be Normal (both distributions were skewed), so the results
may not be reliable.

7.106. The test statistic is F = 16.0743²/15.3314² ≈ 1.0993 with df 29 and 29. Table E
tells us that P > 0.2, while software gives P = 0.8006; we cannot conclude that the
standard deviations differ. The stemplots and boxplots of the east/west distributions in
Exercise 7.82 do not appear to be Normal (both distributions were skewed), so the results
may not be reliable.

7.107. (a) To test H0: σ1 = σ2 versus Ha: σ1 ≠ σ2, we find F = 7.1554²/6.7676² ≈ 1.1179.
We do not reject H0. (b) With an F(4, 4) distribution and a two-sided alternative, we need
the critical value for p = 0.025: F* = 9.60. The table below gives the critical values for
other sample sizes. With such small samples, this is a very low-power test; large
differences between σ1 and σ2 would rarely be detected.
      n    F*
      5    9.60
      4    15.44
      3    39.00
      2    647.79

7.108. (a) For two samples of size 20, we have noncentrality parameter
      δ = 10/(20·√(2/20)) ≈ 1.5811
The power is about 0.33 (using the Normal approximation) or 0.34 (software); see the table
below. (b) With n1 = n2 = 60, we have δ ≈ 2.7386, df = 118, and t* = 1.9803 (or 1.984 for
df = 100 from Table D). The approximate power is about 0.78 (details in the table below).
(c) Samples of size 60 would give a reasonably good chance of detecting a difference of
20 cm.
                            Power       Power
      n     δ        t*     (Normal)  (software)
      20   1.5811   2.0244   0.3288     0.3379
      60   2.7386   1.9803   0.7759     0.7753
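Both power columns of the table can be reproduced in Python; a minimal sketch (ours),
using scipy's noncentral t distribution (nct) for the "software" values:
    from scipy import stats
    import math

    for n in (20, 60):
        delta = 10 / (20 * math.sqrt(2 / n))        # noncentrality parameter
        df = 2 * n - 2
        tstar = stats.t.ppf(0.975, df)              # two-sided 5% critical value
        approx = stats.norm.sf(tstar - delta)       # Normal approximation
        exact = stats.nct.sf(tstar, df, delta)      # noncentral t ("software")
        print(n, round(approx, 4), round(exact, 4))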

7.109. The four standard deviations from Exercises 7.81 and 7.82 are sn ≈ 17.5001,
ss ≈ 14.2583, se ≈ 16.0743, and sw ≈ 15.3314 cm. Using a larger σ for planning the study
is advisable because it provides a conservative (safe) estimate of the power. For example,
if we choose a sample size to provide 80% power and the true σ is smaller than that used
for planning, the actual power of the test is greater than the desired 80%.
Results of additional power computations depend on what students consider to be "other
reasonable values of σ." Shown in the table are some possible answers using the Normal
approximation. (Powers computed using the noncentral t distribution are slightly greater.)
             Power with n =
      σ       20        60
      15     0.5334   0.9527
      16     0.4809   0.9255
      17     0.4348   0.8928
      18     0.3945   0.8560

7.110. (a) The noncentrality parameter is δ = 1.5/(1.6√(2/65)) ≈ 5.3446. With such a large
value of δ, the value of t* (1.9787 for df = 128, or 1.984 for df = 100 from Table D) does
not matter very much. The Normal approximation for the power is P(Z > t* − δ) ≈ 0.9996 for
either choice of t*. Software gives the same result. (b) For samples of size 100,
δ ≈ 6.6291, and once again the value of t* makes little difference; the power is very
close to 1 (using the Normal approximation or software). (c) Because the effect is large
relative to the standard deviation, small samples are sufficient. (Even samples of size 20
will detect this difference with probability 0.8236.)

7.111. The mean is x̄ = 140.5, the standard deviation is s ≈ 13.58, and the standard error
of the mean is s/√4 ≈ 6.79. It would not be appropriate to construct a confidence interval
because we cannot consider these four scores to be an SRS.

7.112. To support the alternative µ1 > µ2, we need to see x̄1 > x̄2, so that
t = (x̄1 − x̄2)/SE_D must be positive. (a) If t = 2.08, the one-sided P-value is half of
the reported two-sided value (P = 0.04), so we reject H0 at α = 0.05. (b) t = −2.08 does
not support Ha; the one-sided P-value is 0.96. We do not reject H0 at α = 0.05 (or any
reasonable choice of α).

7.113. The plot (see below) shows that t* approaches 1.96 as df increases.

7.114. The plot (see below) shows that t* approaches 1.645 as df increases.

[Plots for 7.113 and 7.114: t* for 95% confidence and t* for 90% confidence against
degrees of freedom (0 to 100) omitted]


7.115. The margin of error is t*/√n, using t* for df = n − 1 and 95% confidence. For
example, when n = 5, the margin of error is 1.2417; when n = 10, it is 0.7154; and for
n = 100, it is 0.1984. As we see in the plot (below), as sample size increases, the margin
of error decreases (toward 0, although it gets there very slowly).

7.116. The margin of error is t*/√n, using t* for df = n − 1 and 99% confidence. For
example, when n = 5, the margin of error is 2.0590; when n = 10, it is 1.0277; and for
n = 100, it is 0.2626. As we see in the plot (below), as sample size increases, the margin
of error decreases (toward 0, although it gets there very slowly).

[Plots for 7.115 and 7.116: margin of error at 95% and 99% confidence against sample size
(0 to 100) omitted]

7.117. (a) Use two independent samples (students that live in the dorms, and those that live
elsewhere). (b) Use a matched pairs design: Take a sample of college students, and have
each subject rate the appeal of each label design. (c) Take a single sample of college
students, and ask them to rate the appeal of the product.

7.118. (a) Take a single sample of customers, and record the age of each subject. (b) Use two
independent samples (this year’s sample, and last year’s sample). (c) Use a matched pairs
design: Ask each customer in the sample to rate each floor plan.

7.119. (a) To test H0: µ = 1.5 versus Ha: µ < 1.5, we have
t = (1.20 − 1.5)/(1.81/√200) ≈ −2.344 with df = 199, for which P ≈ 0.0100. We can reject
H0 at the 5% significance level. (b) From Table D, use df = 100 and t* = 1.984, so the 95%
confidence interval for µ is
      1.20 ± 1.984(1.81/√200) = 1.20 ± 0.2539 = 0.9461 to 1.4539 violations
(With software, the interval is 0.9476 to 1.4524.) (c) While the significance test lets us
conclude that there were fewer than 1.5 violations (on the average), the confidence
interval gives us a range of values for the mean number of violations. (d) We have a large
sample (n = 200), and the limited range means that there are no extreme outliers, so t
procedures should be safe.

7.120. (a) The prior studies give us reason to expect greater improvement from those who
used the computer, so we test H0: µC = µD versus Ha: µC > µD. (b) With SE_D ≈ 0.7527, the
test statistic is t ≈ 2.79, so P ≈ 0.0027 (df = 484.98) or 0.0025 < P < 0.005 (df = 241;
use df = 100 in Table D). Either way, we reject H0 at the α = 0.05 level. (c) While we
have strong evidence that the mean improvement is greater using computerized training, we
cannot draw this conclusion about an individual's response.

7.121. (a) The mean difference in body weight change (with wine minus without wine) was
x̄1 = 0.4 − 1.1 = −0.7 kg, with standard error SE1 = 8.6/√14 ≈ 2.2984 kg. The mean
difference in caloric intake was x̄2 = 2589 − 2575 = 14 cal, with
SE2 = 210/√14 ≈ 56.1249 cal. (b) The t statistics ti = x̄i/SEi, both with df = 13, are
t1 ≈ −0.3046 (P1 = 0.7655) and t2 ≈ 0.2494 (P2 = 0.8069). (c) For df = 13, t* = 2.160, so
the 95% confidence intervals x̄i ± t*·SEi are −5.6646 to 4.2646 kg (−5.6655 to 4.2655 with
software) and −107.2297 to 135.2297 cal (−107.2504 to 135.2504 with software).

(d) Students might note a number of factors in their discussions; for example, all subjects
were males, weighing 68 to 91 kg (about 150 to 200 lb), which may limit how widely we
can extend these conclusions.

7.122. For both entrée and wine, we test H0: µC = µN versus Ha: µC > µN, because the
question asked suggests an expectation that consumption would be greater when the wine was
identified as coming from California. For the entrée, SE_D ≈ 29.1079 and t ≈ 2.089, for
which P ≈ 0.0227 (df = 29.3) or 0.025 < P < 0.05 (df = 14). For wine consumption, we will
not reject H0, because the sample means are not consistent with Ha: SE_D ≈ 5.2934,
t ≈ −1.814, and P > 0.5. Having rejected H0 for the entrée, a good second step is to give
a confidence interval for the size of the difference; the two options for this interval are
in the table below. Note that with the conservative degrees of freedom (df = 14), the
interval includes 0, because the one-sided P-value was greater than 0.025.
      df      t*      Confidence interval
      29.3   2.0442   1.2975 to 120.3025
      14     2.145    −1.6364 to 123.2364

7.123. How much a person eats or drinks may depend on how many people he or she is sitting
with. This means that the individual customers within each wine-label group probably
cannot be considered to be independent of one another, which is a fundamental assumption
of the t procedures.

7.124. The mean is x̄ ≈ 26.8437 cm, s ≈ 18.3311 cm, and the margin of error is t*·s/√584:
                df    t*      Interval
      Table D   100   1.984   26.8437 ± 1.5050 = 25.3387 to 28.3486 cm
      Software  583   1.9640  26.8437 ± 1.4898 = 25.3538 to 28.3335 cm
The confidence interval is much narrower with the whole data set, largely because the
standard error is about one-fourth what it was with a sample of size 40. The distribution
of the 584 measurements is right-skewed (although not as much as the smaller sample). If we
can view these trees as an SRS of similar stands—a fairly questionable assumption—the t
procedures should be fairly reliable because of the large sample size. See the solution to
Exercise 7.126 for an examination of the distribution.

7.125. The tables below contain summary statistics and 95% confidence intervals for the
differences. For north/south differences, the test of H0: µn = µs gives t = −7.15 with
df = 575.4 or 283; either way, P < 0.0001, so we reject H0. For east/west differences,
t ≈ −3.69 with df = 472.7 or 230; either way, P ≈ 0.0003, so we reject H0. The larger data
set results in smaller standard errors (both are near 1.5, compared to about 4 in
Exercises 7.81 and 7.82), meaning that t is larger and the margin of error is smaller.
                x̄          s        n
      North   21.7990   18.9230    300
      South   32.1725   16.0763    284
      East    24.5785   17.7315    353
      West    30.3052   18.7264    231
             df       t*      Confidence interval
      N–S    575.4   1.9641   −13.2222 to −7.5248
             283     1.9684   −13.2285 to −7.5186
             100     1.984    −13.2511 to −7.4960
      E–W    472.7   1.9650   −8.7764 to −2.6770
             230     1.9703   −8.7847 to −2.6687
             100     1.984    −8.8059 to −2.6475

7.126. The means and medians are given in the table below; they were marked on the
histograms (omitted here, along with the quantile plots). (The plots were created using
natural logarithms; for common logs, the appearance would be roughly the same except for
scale.) The transformed data does not look notably more Normal; it is left-skewed instead
of right-skewed. The t procedures should be fairly dependable anyway because of the large
sample size, but only if we can view the data as an SRS from some population.
                      x̄         M
      Original      26.8437   26.15
      Natural log    2.9138    3.2638
      Common log     1.2654    1.4175
[Histograms of DBH (cm) and log(DBH), and Normal quantile plots of DBH and log(DBH)
against Normal score, omitted]

7.127. (a) This is a matched pairs design because at each of the 24 nests, the same mockingbird responded on each day. (b) The variance of the difference is approximately s₁² + s₄² − 2ρs₁s₄ = 48.684, so the standard deviation is 6.9774 m. (c) To test H0: µ1 = µ4 versus Ha: µ1 ≠ µ4, we have t = (15.1 − 6.1)/(6.9774/√24) ≈ 6.319 with df = 23, for which P is very small. (d) Assuming the correlation is the same (ρ = 0.4), the variance of the difference is approximately s₁² + s₅² − 2ρs₁s₅ = 31.324, so the standard deviation is 5.5968 m. To test H0: µ1 = µ5 versus Ha: µ1 ≠ µ5, we have t = (4.9 − 6.1)/(5.5968/√24) ≈ −1.050 with df = 23,

for which P ≈ 0.3045. (e) The significant difference between day 1 and day 4 suggests
that the mockingbirds altered their behavior when approached by the same person for four
consecutive days; seemingly, the birds perceived an escalating threat. When approached by a
new person on day 5, the response was not significantly different from day 1; this suggests
that the birds saw the new person as less threatening than a return visit from the first person.
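A short sketch of the part (d) computation, using only the quantities given above:

    import math
    from scipy import stats

    n = 24
    sd_diff = math.sqrt(31.324)               # from s1^2 + s5^2 - 2*rho*s1*s5
    t = (4.9 - 6.1) / (sd_diff / math.sqrt(n))
    print(t, 2 * stats.t.sf(abs(t), df=n - 1))   # t ≈ −1.050, P ≈ 0.3045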

7.128. (a) In each case, we test H0: µa = µs versus Ha: µa ≠ µs. The table below summarizes the values of SE_D = √(sa²/na + ss²/ns) and t = (x̄a − x̄s)/SE_D, and the two options for df and P. For the second option, P-values from Table D are obtained using df = 70.
(b) We conclude that sedentary female high school students had significantly higher body
fat, BMI, and calcium deficit than female athletes. The difference in milk consumption was
not significant.
                                           Option 1               Option 2
Group                  SE_D       t        df     P               df    P
Body fat              1.0926    −6.315   140.1   P < 0.0001      79   P < 0.001
Body mass index       0.4109   −11.707   156.3   P < 0.0001      79   P < 0.001
Calcium deficit      71.2271    −3.979   143.7   P ≈ 0.0001      79   P < 0.001
Glasses of milk/day   0.2142     1.821   154.0   P ≈ 0.0705      79   0.05 < P < 0.1

7.129. The mean and standard deviation of the 25 numbers are x̄ = 78.32% and s ≈ 33.3563%, so the standard error is SE_x̄ ≈ 6.6713%. For df = 24, Table D gives t∗ = 2.064, so the 95% confidence interval is x̄ ± 13.7695% = 64.5505% to 92.0895% (with software, t∗ = 2.0639 and the interval is x̄ ± 13.7688% = 64.5512% to 92.0888%). This seems to support the retailer's claim: The original supplier's price was higher between 65% and 93% of the time.

7.130. (a) We are interested in weight change; the pairs are the “before” and “after”
measurements. (b) The mean weight change was a loss. The exact amount lost is not
specified, but it was large enough so that it would rarely happen by chance for an ineffective
weight-loss program. (c) Comparing to a t (40) distribution in Table D, we find P < 0.0005
for a one-sided alternative (P < 0.0010 for a two-sided alternative). Software reveals that it
is even smaller than that: about 0.000013 (or 0.000026 for a two-sided alternative).

7.131. Back-to-back stemplots below; summary statistics in the first table. The distributions appear similar; the most striking difference is the relatively large number of boys with low GPAs. Testing the difference in GPAs (H0: µb = µg; Ha: µb < µg), we obtain SE_D ≈ 0.4582 and t = −0.91, which is not significant, regardless of whether we use df = 74.9 (P = 0.1839) or 30 (0.15 < P < 0.20). The 95% confidence interval for the difference µb − µg in GPAs is shown in the second table.
For the difference in IQs, we find SE_D ≈ 3.1138. With the same hypotheses as before, we find t = 1.64—fairly strong evidence, but not quite significant at the 5% level: P = 0.0528 (df = 56.9) or 0.05 < P < 0.10 (df = 30). The 95% confidence interval for the difference µb − µg in IQs is shown in the second table.

              GPA                 IQ
        n     x̄       s        x̄       s
Boys   47   7.2816  2.3190  110.96  12.121
Girls  31   7.6966  1.7208  105.84  14.271

        df     t∗      Confidence interval
GPA    74.9  1.9922   −1.3277 to 0.4979
       30    2.042    −1.3505 to 0.5207
IQ     56.9  2.0025   −1.1167 to 11.3542
       30    2.042    −1.2397 to 11.4772

GPA: Girls Boys IQ: Girls Boys


0 5 42 7
1 7 7 79
2 4 8
4 3 689 96 8
7 4 068 31 9 03
952 5 0 86 9 77
4200 6 019 433320 10 0234
988855432 7 1124556666899 875 10 556667779
998731 8 001112238 44422211 11 00001123334
95530 9 1113445567 98 11 556899
17 10 57 0 12 03344
8 12 67788
20 13
13 6

7.132. The median self-concept score is 59.5. A back-to-back stemplot (below) suggests that high self-concept students have a higher mean GPA and IQ. Testing H0: µlow = µhigh versus Ha: µlow ≠ µhigh for GPAs leads to SE_D ≈ 0.4167 and t = 4.92, which is quite significant (P < 0.0005 regardless of df). The confidence interval for the difference µhigh − µlow in GPAs is shown in the second table.
For the difference in IQs, we find SE_D ≈ 2.7442. With the same hypotheses as before, we find t = 3.87, which is quite significant (P < 0.0006 regardless of df). The confidence interval for the difference µhigh − µlow in IQs is shown in the second table.
In summary, both differences are significant; with 95% confidence, high self-concept students have a mean GPA that is 1.2 to 2.9 points higher, and their mean IQ is 5 to 16 points higher.

                 GPA                 IQ
          n     x̄       s        x̄       s
High SC  39   8.4723  1.3576  114.23  10.381
Low SC   39   6.4208  2.2203  103.62  13.636

        df     t∗      Confidence interval
GPA    62.9  1.9984   1.2188 to 2.8843
       38    2.0244   1.2079 to 2.8951
       30    2.042    1.2006 to 2.9025
IQ     70.9  1.9940   5.1436 to 16.0871
       38    2.0244   5.0601 to 16.1707
       30    2.042    5.0118 to 16.2190

GPA: Low SC High SC IQ: Low SC High SC


5 0 42 7
7 1 97 7
4 2 8
864 3 9 96 8
8760 4 310 9 3
9520 5 776 9 8
94000 6 12 44300 10 22333
986655421 7 1234556688899 9877766 10 55567
887321100 8 112399 44432100 11 00111222334
97 9 0111334455556 99985 11 568
10 1577 4 12 00334
7 12 67888
13 02
13 6

7.133. It is reasonable to have a prior belief that people who evacuated their pets would score higher, so we test H0: µ1 = µ2 versus Ha: µ1 > µ2. We find SE_D ≈ 0.4630 and t = 3.65, which gives P < 0.0005 no matter how we choose degrees of freedom (115 or 237.0). As one might suspect, people who evacuated their pets have a higher mean score.
One might also compute a 95% confidence interval for the difference; options are given in the table.

   df      t∗      Confidence interval
 237.0   1.9700   0.7779 to 2.6021
 115     1.9808   0.7729 to 2.6071
 100     1.984    0.7714 to 2.6086

7.134. (a) "se" is standard error (of the mean). To find s, multiply the standard error by √n. (b) No: We test H0: µd = µc versus Ha: µd < µc, for which SE_D ≈ 65.1153 and t ≈ −0.3532, so P = 0.3623 (df = 173.9) or 0.3625 (df = 82)—in either case, there is little evidence against H0. (c) The evidence is not very significant: To test H0: µd = µc versus Ha: µd ≠ µc, SE_D ≈ 0.1253 and t ≈ −1.1971, for which P = 0.2335 (df = 128.4) or 0.2348 (df = 82). (d) The 95% confidence interval is 0.39 ± t∗(0.11). With Table D, t∗ = 1.990 (df = 80) and the interval is 0.1711 to 0.6089 g; with software, t∗ = 1.9893 (df = 82) and the interval is 0.1712 to 0.6088 g. (e) The 99% confidence interval is (0.24 − 0.39) ± t∗√(0.06² + 0.11²); see the second table.

               Calories            Alcohol
             n     x̄      s       x̄      s
Drivers     98   2821   435.58  0.24  0.5940
Conductors  83   2844   437.30  0.39  1.0021

   df      t∗      Confidence interval
 128.4   2.6146   −0.4776 to 0.1776
  82     2.6371   −0.4804 to 0.1804
  80     2.639    −0.4807 to 0.1807

7.135. The similarity of the sample standard deviations suggests that the population standard deviations are likely to be similar. The pooled standard deviation is sp ≈ 436.368 and t ≈ −0.3533, so P ≈ 0.3621 (df = 179)—still not significant.
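A sketch of the pooled computation, using the driver/conductor calorie summaries from Exercise 7.134 (names are ours):

    import math
    from scipy import stats

    n1, x1, s1 = 98, 2821, 435.58            # drivers
    n2, x2, s2 = 83, 2844, 437.30            # conductors
    sp2 = ((n1 - 1)*s1**2 + (n2 - 1)*s2**2) / (n1 + n2 - 2)
    t = (x1 - x2) / math.sqrt(sp2 * (1/n1 + 1/n2))
    p = stats.t.cdf(t, df=n1 + n2 - 2)       # one-sided Ha: mu_d < mu_c
    print(math.sqrt(sp2), t, p)              # sp ≈ 436.37, t ≈ −0.353, P ≈ 0.362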

7.136. (a) The sample sizes (98 and 83) are quite large, so the t test should be reasonably safe
(provided there are no extreme outliers). (b) Large samples do not make the F test more
reliable when the underlying distributions are skewed, so it should not be used.

7.137. No: What we have is nothing like an SRS of the population of school corporations.

7.138. (a) Back-to-back stemplots and summary statistics below. With a pooled standard deviation, sp ≈ 83.6388, t = 4.00 with df = 222, so P < 0.0001. Without pooling, SE_D ≈ 11.6508, t = 4.01 with df = 162.2, and again P < 0.0001 (or, with df = 78, we conclude that P < 0.0005). The test for equality of standard deviations gives F ≈ 1.03 with df 144 and 78; the P-value is 0.9114, so the pooled procedure should be appropriate. In either case, we conclude that male mean SATM scores are higher than female mean SATM scores. Options for a 95% confidence interval for the male − female difference are given in the table below. (b) With a pooled standard deviation, sp ≈ 92.6348, t = 0.94 with df = 222, so P ≈ 0.3485. Without pooling, SE_D ≈ 12.7485, t = 0.95 with df = 162.2, so P ≈ 0.3410 (or, with df = 78, P ≈ 0.3426). The test for equality of standard deviations gives F ≈ 1.11 with df 144 and 78; the P-value is 0.6033, so the pooled procedure should be appropriate. In either case, we cannot see a difference between male and female mean SATV scores. Options for a 95% confidence interval for the male − female difference are given in the table below. (c) The results may generalize fairly well to students in different years, less well to students at other schools, and probably not very well to college students in general.

          n    x̄ (SATM)  s (SATM)   x̄ (SATV)  s (SATV)
Men     145   611.772    84.0206   508.841   94.3485
Women    79   565.025    82.9294   496.671   89.3849

   df      t∗      CI for SATM µM − µF        df      t∗      CI for SATV µM − µF
 162.3   1.9747   23.7402 to 69.7538        167.9   1.9742   −12.9981 to 37.3381
  78     1.9908   23.5521 to 69.9419         78     1.9908   −13.2104 to 37.5504
  70     1.994    23.5154 to 69.9786         70     1.994    −13.2506 to 37.5906
 222     1.9707   23.6978 to 69.7962        222     1.9707   −13.3584 to 37.6984
 100     1.984    23.5423 to 69.9517        100     1.984    −13.5306 to 37.8706

Men’s SATM Women’s SATM


3 0
3 5
400 4 1334
99999888776 4 56777888999
44444333322222111000 5 0111123334
99999988888877766555555555 5 55555556667777788899999
444444444433333332222222211100000000 6 00011222233334444
999999987777776666655555 6 55555789
3222211100000 7 1124
77766655555 7
0 8

Men’s SATV Women’s SATV


98 2 9
4322 3 33
9999988766 3 55669
444444444332111100000 4 0122223333444
999998888888888877777666666555 4 5666666777777888888899999
44443333322221110000000000000 5 01111122334
998888777777666666555 5 566777777889
43333332111100000 6 0000
9987775 6 668
420 7 00
6 7 5
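The F statistics for comparing standard deviations in parts (a) and (b) can be checked from the summary statistics; a sketch (with the larger variance in the numerator and a two-sided P-value):

    from scipy import stats

    s_m, n_m = 84.0206, 145                  # SATM, men
    s_f, n_f = 82.9294, 79                   # SATM, women
    F = (s_m / s_f)**2                       # about 1.03
    print(F, 2 * stats.f.sf(F, n_m - 1, n_f - 1))   # P ≈ 0.91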

7.139. (a) We test H0: µB = µD versus Ha: µB < µD. Pooling might be appropriate for this problem, in which case sp ≈ 6.5707. Whether or not we pool, SE_D ≈ 1.9811 and t = 2.87 with df = 42 (pooled), 39.3, or 21, so P = 0.0032, or 0.0033, or 0.0046. We conclude that the mean score using DRTA is higher than the mean score with the Basal method. The difference in the average scores is 5.68; options for a 95% confidence interval for the difference µD − µB are given in the table below. (b) We test H0: µB = µS versus Ha: µB < µS. If we pool, sp ≈ 5.7015. Whether or not we pool, SE_D ≈ 1.7191 and t = 1.88 with df = 42, 42.0, or 21, so P = 0.0337, or 0.0337, or 0.0372. We conclude that the mean score using Strat is higher than the Basal mean score. The difference in the average scores is 3.23; options for a 95% confidence interval for the difference µS − µB are given in the table below.

         n     x̄        s
Basal   22   41.0455  5.6356
DRTA    22   46.7273  7.3884
Strat   22   44.2727  5.7668

   df      t∗      CI for µD − µB         df      t∗      CI for µS − µB
  39.3   2.0223   1.6754 to 9.6882       42.0   2.0181   −0.2420 to 6.6966
  21     2.0796   1.5618 to 9.8018       21     2.0796   −0.3477 to 6.8023
  21     2.080    1.5610 to 9.8026       21     2.080    −0.3484 to 6.8030
  42     2.0181   1.6837 to 9.6799       42     2.0181   −0.2420 to 6.6965
  40     2.021    1.6779 to 9.6857       40     2.021    −0.2470 to 6.7015

7.140. For testing µ1 = µ2 against a two-sided alternative, we would reject H0 if t = (x̄1 − x̄2)/SE_D is greater (in absolute value) than t∗, where SE_D = 2.5√(2/n). (Rather than determining t∗ for each considered sample size, we might use t∗ ≈ 2.) We therefore want to choose n so that

P( |(x̄1 − x̄2)/SE_D| > t∗ ) = 1 − P( −t∗ < (x̄1 − x̄2)/SE_D < t∗ ) = 0.90

when µ1 − µ2 = 0.4 inch. With δ = 0.4/SE_D ≈ 0.1131√n, this means that

P( −t∗ − δ < Z < t∗ − δ ) = 0.10

where Z has a N(0, 1) distribution. By trial and error, we find that two samples of size 822 will do this. (This answer will vary slightly depending on what students do with t∗ in the formula above.)
Note: Determining the necessary sample size can be made a bit easier with some
software. The output of the G•Power program below gives the total sample size as 1644
(i.e., two samples of size 822).

G•Power output
A priori analysis for "t-Test (means)", two-tailed:
Alpha: 0.0500
Power (1-beta): 0.9000
Effect size "d": 0.1600
Total sample size: 1644
Actual power: 0.9001
Critical value: t(1642) = 1.9614
Delta: 3.2437
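The same trial-and-error search is easy to script. A sketch using the Normal approximation described above (the loop is ours; σ = 2.5 and a 0.4-inch difference, as in the exercise):

    import math
    from scipy import stats

    sigma, diff = 2.5, 0.4
    n = 2
    while True:
        tstar = stats.t.ppf(0.975, df=2*n - 2)
        delta = diff / (sigma * math.sqrt(2 / n))
        miss = stats.norm.cdf(tstar - delta) - stats.norm.cdf(-tstar - delta)
        if miss <= 0.10:                     # power has reached 0.90
            break
        n += 1
    print(n)                                 # 822 per sample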

7.141. (a) The distributions can be compared using a back-to-back stemplot (below), or two histograms, or side-by-side boxplots. Three-bedroom homes are right-skewed; four-bedroom homes are generally more expensive. The top two prices from the three-bedroom distribution qualify as outliers using the 1.5 × IQR criterion. Boxplots are probably a poor choice for displaying the distributions because they leave out so much detail, but five-number summaries do illustrate that four-bedroom prices are higher at every level. Summary statistics (in units of $1000) are given in the table below. (b) For testing H0: µ3 = µ4 versus Ha: µ3 ≠ µ4, we have t ≈ −4.475 with either df = 20.98 (P ≈ 0.0002) or df = 13 (P < 0.001). We reject H0 and conclude that the mean prices are different (specifically, that 4BR houses are more expensive). (c) The one-sided alternative µ3 < µ4 could have been justified because it would be reasonable to expect that four-bedroom homes would be more expensive. (d) The 95% confidence interval for the difference µ4 − µ3 is about $63,823 to $174,642 (df = 20.97) or $61,685 to $176,779 (df = 13). (e) While the data were not gathered from an SRS, it seems that they should be a fair representation of three- and four-bedroom houses in West Lafayette. (Even so, the small sample sizes, together with the skewness and the outliers in the three-bedroom data, should make us cautious about the t procedures. Additionally, we might question independence in these data: When setting the asking price for a home, sellers are almost certainly influenced by the asking prices for similar homes on the market in the area.)

        3BR     4BR
      99987  0
 4432211100  1  4
       8655  1  678
          1  2  024
        976  2  68
             3  223
             3  9
             4  2

        n     x̄        s       Min    Q1     M      Q3     Max
3BR    23  147.561   61.741   79.5  100.0  129.9  164.9  295.0
4BR    14  266.793   87.275  149.9  189.0  259.9  320.0  429.9
Chapter 8 Solutions

8.1. (a) n = 760 banks. (b) X = 283 banks expected to acquire another bank. (c) p̂ = 283/760 ≈ 0.3724.

8.2. (a) n = 1063 adults played video games. (b) X = 223 of those adults play daily or almost daily. (c) p̂ = 223/1063 ≈ 0.2098.

8.3. (a) With p̂ ≈ 0.3724, SE_p̂ = √(p̂(1 − p̂)/760) ≈ 0.01754. (b) The 95% confidence interval is p̂ ± 1.96 SE_p̂ = 0.3724 ± 0.0344. (c) The interval is 33.8% to 40.7%.

8.4. (a) With p̂ ≈ 0.2098, SE_p̂ = √(p̂(1 − p̂)/1063) ≈ 0.01249. (b) The 95% confidence interval is p̂ ± 1.96 SE_p̂ = 0.2098 ± 0.0245. (c) The interval is 18.5% to 23.4%.
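A minimal sketch of the large-sample interval used in Exercises 8.3 and 8.4 (the helper function is ours):

    import math

    def prop_ci(x, n, zstar=1.96):
        phat = x / n
        se = math.sqrt(phat * (1 - phat) / n)
        return phat - zstar*se, phat + zstar*se

    print(prop_ci(283, 760))                 # about 0.338 to 0.407
    print(prop_ci(223, 1063))                # about 0.185 to 0.234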

8.5. For z = 1.34, the two-sided P-value is the area under a standard Normal curve above 1.34 and below −1.34. [Figure: standard Normal curve with the two tails beyond ±1.34 shaded.]

8.6. The 95% confidence interval is 44% to 86%. (Student opinions of what qualifies as
“appropriate” rounding might vary.) This is given directly in the Minitab output; the SAS
output gives a confidence interval for the complementary proportion.
The confidence interval is consistent with the result of the significance test, but is more
informative in that it gives a range of values for the true proportion.

8.7. The sample proportion is p̂ = 15/20 = 0.75. To test H0: p = 0.5 versus Ha: p ≠ 0.5, the appropriate standard error is σ_p̂ = √(p0(1 − p0)/20) ≈ 0.1118, and the test statistic is z = (p̂ − p0)/σ_p̂ = 0.25/0.1118 ≈ 2.24. The two-sided P-value is 0.0250 (Table A) or 0.0253 (software), so this result is significant at the 5% level.

8.8. With n = 40 and p̂ = 0.65, the standard error for the significance test is σ_p̂ = √(p0(1 − p0)/40) ≈ 0.0791, and the test statistic is z = (p̂ − p0)/σ_p̂ = 0.15/0.0791 ≈ 1.90. The two-sided P-value is 0.0574 (Table A) or 0.0578 (software)—not quite significant at the 5% level, but stronger evidence than the result with n = 20.
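A sketch of the test computation used in Exercises 8.7 and 8.8 (helper name ours):

    import math
    from scipy import stats

    def prop_ztest(phat, p0, n):
        z = (phat - p0) / math.sqrt(p0 * (1 - p0) / n)
        return z, 2 * stats.norm.sf(abs(z))  # two-sided P-value

    print(prop_ztest(0.75, 0.5, 20))         # z ≈ 2.24, P ≈ 0.0253
    print(prop_ztest(0.65, 0.5, 40))         # z ≈ 1.90, P ≈ 0.0578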

8.9. (a) To test H0: p = 0.5 versus Ha: p ≠ 0.5 with p̂ = 0.35, the test statistic is

z = (p̂ − p0)/√(p0(1 − p0)/n) ≈ −0.15/0.1118 ≈ −1.34

This is the opposite of the value of z given in Example 8.4, and the two-sided P-value is the same: 0.1802 (or 0.1797 with software). (b) The standard error for a confidence interval is SE_p̂ = √(p̂(1 − p̂)/20) ≈ 0.1067, so the 95% confidence interval is 0.35 ± 0.2090 = 0.1410 to 0.5590. This is the complement of the interval shown in the Minitab output in Figure 8.2.


8.10. We can achieve that margin of error with 90% confidence with a smaller sample. With p∗ = 0.5 (as in Example 8.5), we compute n = (1.645/(2(0.03)))² ≈ 751.67, so we need a sample of 752 students.
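The same computation in code form, with the worst-case guess p∗ = 0.5 built in as a default:

    import math

    def sample_size(m, zstar, pstar=0.5):
        return math.ceil((zstar / m)**2 * pstar * (1 - pstar))

    print(sample_size(0.03, 1.645))          # 752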

8.11. The plot is symmetric about 0.5, where it has its maximum. (The maximum margin of error always occurs at p̂ = 0.5, but the size of the maximum error depends on the sample size.) [Figure: margin of error m versus sample proportion p̂, rising from 0 at p̂ = 0 to about 0.07 at p̂ = 0.5 and falling back to 0 at p̂ = 1.]
Note: The first printing of the text asked students to plot sample proportion (p̂) versus the margin of error (m), rather than m versus p̂. Because p̂ is the explanatory variable, the latter is more natural.

8.12. (a) H0 should refer to p (the population proportion), not p̂ (the sample proportion).
(b) Use Normal distributions (and a z test statistic) for significance tests involving
proportions. (c) The margin of error equals z ∗ times standard error; for 95% confidence, we
would have z ∗ = 1.96.

8.13. (a) Margin of error only accounts for random sampling error. (b) P-values measure the
strength of the evidence against H0 , not the probability of it being true. (c) The confidence
level cannot exceed 100%. (In practical terms, the confidence level must be less than 100%.)

8.14. (a) The mean is µ = p = 0.4 and the standard deviation is σ = √(p(1 − p)/n) = √0.0048 ≈ 0.06928. (b) [Figure: Normal curve centered at 0.4, with tick marks σ (about 0.07) units apart.] (c) p∗ should be either 1.96σ ≈ 0.1358 or 2σ ≈ 0.1386, so the points marked on the curve should be either 0.2642 and 0.5358 or 0.2614 and 0.5386.

8.15. The sample proportion is p̂ = 3274/5000 ≈ 0.6548, the standard error is SE_p̂ ≈ 0.00672, and the 95% confidence interval is 0.6548 ± 0.0132 = 0.6416 to 0.6680.

8.16. (a) With p̂ = 0.16, we have SE_p̂ ≈ 0.00518, so the 95% confidence interval is p̂ ± 1.96 SE_p̂ = 0.16 ± 0.01016 = 0.1498 to 0.1702. (b) With 95% confidence, the percent of this group who would prefer a health care professional as a marriage partner is about 16 ± 1%, or between 15% and 17%.

8.17. (a) SE_p̂ depends on n, which is some number less than 7061. Without that number, we do not know the margin of error z∗ SE_p̂. (b) The number who expect to begin playing an instrument is 67% of half of 7061, or (0.67)(0.5)(7061) ≈ 2365 players. (c) Taking n = (0.5)(7061) = 3530.5, the 99% confidence interval is p̂ ± 2.576 SE_p̂ = 0.67 ± 0.02038 = 0.6496 to 0.6904. (d) We do not know the sampling methods used, which might make these methods unreliable.
Note: Even though n must be an integer in reality, it is not necessary to round n in part (c); the confidence interval formula works fine when n is not a whole number.

8.18. (a) With n = (0.25)(7061) = 1765.25, the 99% confidence interval is p̂ ± 2.576 SE_p̂ = 0.67 ± 0.02883 = 0.6412 to 0.6988. (b) With n = (0.75)(7061) = 5295.75, the 99% confidence interval is 0.67 ± 0.01664 = 0.6534 to 0.6866. (c) The margin of error depends on the (unknown) sample size, and varies quite a bit among the three scenarios.

8.19. If p̂ has (approximately) a N(p0, σ) distribution under H0, we reject H0 (at the α = 0.05 level) if p̂ falls outside the range p0 ± 1.96σ. (a) If H0 is true and n = 60, then σ = √((0.4)(0.6)/60) ≈ 0.06325, so we reject H0 when p̂ is outside the range 0.2760 to 0.5240. Because p̂ = x/60, this corresponds to: Reject H0 if p̂ ≤ 0.2667 (x ≤ 15) or p̂ ≥ 0.5333 (x ≥ 32). (b) If H0 is true and n = 100, then σ = √((0.4)(0.6)/100) ≈ 0.04899, so we reject H0 when p̂ is outside the range 0.3040 to 0.4960. This corresponds to: Reject H0 if p̂ ≤ 0.30 (x ≤ 30) or p̂ ≥ 0.50 (x ≥ 50). (c) [Figure: one possible sketch, with the two Normal curves centered at 0.4 drawn on the same scale; the dashed curve and rejection cutoffs are for n = 100.] (Most students will likely not realize that when σ is smaller, the curve must be taller to compensate for the decreased width.) With a larger sample size, smaller values of |p̂ − 0.4| lead to the rejection of H0.

8.20. (a) For 99% confidence, the margin of error is (2.576)(0.00122) ≈ 0.00314. (b) All
of these facts suggest possible sources of error; for example, students at two-year colleges
are not represented, nor are students at institutions that do not wish to pay the fee. (Even
though the fee is scaled for institution size, larger institutions can more easily absorb it.)
None of these potential errors are covered by the margin of error found in part (a), which
only accounts for random sampling error.

8.21. (a) About (0.42)(159,949) ≈ 67,179 students plan to study abroad. (b) SE_p̂ ≈ 0.00123, the margin of error is 2.576 SE_p̂ ≈ 0.00318, and the 99% confidence interval is 0.4168 to 0.4232.

8.22. With p̂ = 1087/1430 ≈ 0.7601, we have SE_p̂ ≈ 0.0113, and the 95% confidence interval is p̂ ± 1.96 SE_p̂ = 0.7601 ± 0.0221 = 0.7380 to 0.7823.

8.23. With p̂ = 0.43, we have SE_p̂ ≈ 0.0131, and the 95% confidence interval is p̂ ± 1.96 SE_p̂ = 0.43 ± 0.0257 = 0.4043 to 0.4557.

8.24. (a) A 99% confidence interval would be wider: We need a larger margin of error (by
a factor of 2.576/1.96) in order to be more confident that we have included p. The 99%
confidence interval is 0.3963 to 0.4637. (b) A 90% confidence interval would be narrower
(by a factor of 1.645/1.96). The 90% confidence interval is 0.4085 to 0.4515.
8.25. (a) SE_p̂ = √((0.87)(0.13)/430,000) ≈ 0.0005129. For 99% confidence, the margin of error is 2.576 SE_p̂ ≈ 0.001321. (b) One source of error is indicated by the wide variation in response rates: We cannot assume that the statements of respondents represent the opinions of nonrespondents. The effect of the participation fee is harder to predict, but one possible impact is on the types of institutions that participate in the survey: Even though the fee is scaled for institution size, larger institutions can more easily absorb it. These other sources of error are much more significant than sampling error, which is the only error accounted for in the margin of error from part (a).
8.26. (a) The standard error is SE_p̂ = √((0.69)(0.31)/1048) ≈ 0.01429, so the margin of error for 95% confidence is 1.96 SE_p̂ ≈ 0.02800 and the interval is 0.6620 to 0.7180. (b) To test H0: p = 0.79 versus Ha: p < 0.79, the standard error is σ_p̂ = √((0.79)(0.21)/1048) ≈ 0.01258 and the test statistic is z = (0.69 − 0.79)/0.01258 ≈ −7.95. This is very strong evidence against H0 (P < 0.00005).
8.27. (a) The standard error is SE_p̂ = √((0.38)(0.62)/1048) ≈ 0.01499, so the margin of error for 95% confidence is 1.96 SE_p̂ ≈ 0.02939 and the interval is 0.3506 to 0.4094. (b) Yes; some respondents might not admit to such behavior. The true frequency of such actions might be higher than this survey suggests.

8.28. (a) p̂ = 9054/24,142 ≈ 0.3750. (b) The standard error is SE_p̂ = √(p̂(1 − p̂)/24,142) ≈ 0.003116, so the margin of error for 95% confidence is 1.96 SE_p̂ ≈ 0.00611 and the interval is 0.3689 to 0.3811. (c) The nonresponse rate was (37,328 − 24,142)/37,328 ≈ 0.3532—about 35%. We have no way of knowing if cheating is more or less prevalent among nonrespondents; this weakens the conclusions we can draw from this survey.

8.29. (a) p̂ = 390/1191 ≈ 0.3275. The standard error is SE_p̂ = √(p̂(1 − p̂)/1191) ≈ 0.01360, so the margin of error for 95% confidence is 1.96 SE_p̂ ≈ 0.02665 and the interval is 0.3008 to 0.3541. (b) Speakers and listeners probably perceive sermon length differently (just as, say, students and lecturers have different perceptions of the length of a class period).

8.30. A 90% confidence interval would be narrower: The margin of error will be smaller (by a
factor of 1.645/2.576) if we are willing to be less confident that we have included p. The
90% confidence interval is 0.6570 to 0.6830—narrower than the 99% confidence interval
(0.6496 to 0.6904) from Exercise 8.17.

8.31. Recall the rule of thumb from Chapter 5: Use the Normal approximation if np ≥ 10 and
n(1 − p) ≥ 10. We use p0 (the value specified in H0 ) to make our decision.
(a) No: np0 = 6. (b) Yes: np0 = 18 and n(1 − p0 ) = 12. (c) Yes: np0 = n(1 − p0 ) = 50.
(d) No: np0 = 2.

8.32. (a) Because we have defined p as the proportion who prefer fresh-brewed coffee, we should compute p̂ = 28/40 = 0.7. To test H0: p = 0.5 versus Ha: p > 0.5, the standard error is σ_p̂ = √((0.5)(0.5)/40) ≈ 0.07906, and the test statistic is z = (0.7 − 0.5)/0.07906 ≈ 2.53. The P-value is 0.0057. (b) [Figure: standard Normal curve with the area to the right of z = 2.53 shaded.] (c) The result is significant at the 5% level, so we reject H0 and conclude that a majority of people prefer fresh-brewed coffee.

8.33. With p̂ = 0.69, SE_p̂ ≈ 0.02830 and the 95% confidence interval is 0.6345 to 0.7455.

8.34. With p̂ = 0.583, SE_p̂ ≈ 0.03023 and the 95% confidence interval is 0.5237 to 0.6423.

8.35. We estimate p̂ = 594/2533 ≈ 0.2345, SE_p̂ ≈ 0.00842, and the 95% confidence interval is 0.2180 to 0.2510.

8.36. (a) We estimate p̂ = 1434/2533 ≈ 0.5661, SE_p̂ ≈ 0.00985, and the 95% confidence interval is 0.5468 to 0.5854. (b) Pride or embarrassment might lead respondents to claim that their
is 0.5468 to 0.5854. (b) Pride or embarrassment might lead respondents to claim that their
income was above $25,000 even if it were not. Consequently, it would not be surprising
if the true proportion p were lower than the estimate p̂. (There may also be some who
would understate their income, out of humility or mistrust of the interviewer. While this
would seem to have less of an impact, it makes it difficult to anticipate the overall effect of
untruthful responses.) (c) Respondents would have little reason to lie about pet ownership;
the few that might lie about it would have little impact on our conclusions. The number of
untruthful responses about income is likely to be much larger and have a greater impact.

8.37. We estimate p̂ = 110/125 = 0.88, SE_p̂ ≈ 0.02907, and the 95% confidence interval is 0.8230 to 0.9370.

8.38. (a) p̂ = 542/1711 ≈ 0.3168; about 31.7% of bicyclists aged 15 or older killed between 1987 and 1991 had alcohol in their systems at the time of the accident. (b) SE_p̂ = √(p̂(1 − p̂)/1711) ≈ 0.01125; the 99% confidence interval is p̂ ± 2.576 SE_p̂ = 0.2878 to 0.3457. (c) No: We do not know, for example, what percent of cyclists who were not involved in fatal accidents had alcohol in their systems. (d) p̂ = 386/1711 ≈ 0.2256, SE_p̂ ≈ 0.01010, and the 99% confidence interval is 0.1996 to 0.2516.

8.39. (a) For testing H0: p = 0.5 versus Ha: p ≠ 0.5, we have p̂ = 5067/10000 = 0.5067 and σ_p̂ = √((0.5)(0.5)/10000) = 0.005, so z = 0.0067/0.005 = 1.34, for which P = 0.1802. This is not significant at α = 0.05 (or even α = 0.10). [Figure: standard Normal curve with the two tails beyond ±1.34 shaded.] (b) SE_p̂ = √(p̂(1 − p̂)/10000) ≈ 0.005, so the 95% confidence interval is 0.5067 ± (1.96)(0.005), or 0.4969 to 0.5165.

8.40. With no prior knowledge of the value of p (the proportion of "Yes" responses), take p∗ = 0.5: n = (1.96/(2(0.15)))² ≈ 42.7—use n = 43.

8.41. As a quick estimate, we can observe that to cut the margin of error in half, we must quadruple the sample size, from 43 to 172. Using the sample-size formula, we find n = (1.96/(2(0.075)))² ≈ 170.7—use n = 171. (The difference in the two answers is due to rounding.)
8.42. Using p∗ = 0.25 (based on previous surveys), we compute n = (1.96/0.1)²(0.25)(0.75) ≈ 72.03, so we need a sample of 73 students.
8.43. The required sample sizes are found by computing (1.96/0.1)² p∗(1 − p∗) = 384.16 p∗(1 − p∗). To be sure that we meet our target margin of error, we should take the largest sample indicated: n = 97 or larger.

  p∗      n      Rounded up
 0.1    34.57       35
 0.2    61.47       62
 0.3    80.67       81
 0.4    92.20       93
 0.5    96.04       97
 0.6    92.20       93
 0.7    80.67       81
 0.8    61.47       62
 0.9    34.57       35

[Figure: required sample size plotted against the actual population proportion p∗, peaking at p∗ = 0.5.]
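The table is easy to regenerate; a sketch:

    import math

    for pstar in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
        n = (1.96 / 0.1)**2 * pstar * (1 - pstar)
        print(pstar, round(n, 2), math.ceil(n))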

8.44. n = (1.96/0.02)²(0.15)(0.85) ≈ 1224.51—use n = 1225.

8.45. With p1 = 0.4, n1 = 25, p2 = 0.5, and n2 = 30, the mean and standard deviation of the sampling distribution of D = p̂1 − p̂2 are

µ_D = p1 − p2 = −0.1  and  σ_D = √(p1(1 − p1)/n1 + p2(1 − p2)/n2) ≈ 0.1339

8.46. (a) With p1 = 0.4, n1 = 100, p2 = 0.5, and n2 = 120, the mean and standard deviation of the sampling distribution of D = p̂1 − p̂2 are

µ_D = p1 − p2 = −0.1  and  σ_D = √(p1(1 − p1)/n1 + p2(1 − p2)/n2) ≈ 0.0670

(b) µ_D is unchanged, while σ_D is halved.

8.47. (a) The means are µ_p̂1 = p1 and µ_p̂2 = p2. The standard deviations are

σ_p̂1 = √(p1(1 − p1)/n1)  and  σ_p̂2 = √(p2(1 − p2)/n2)

(b) µ_D = µ_p̂1 − µ_p̂2 = p1 − p2. (c) σ_D² = σ_p̂1² + σ_p̂2² = p1(1 − p1)/n1 + p2(1 − p2)/n2.

8.48. With p̂w = 44/100 = 0.44 and p̂m = 79/140 ≈ 0.5643, we estimate the difference to be p̂m − p̂w ≈ 0.1243. The standard error of the difference is

SE_D = √(p̂w(1 − p̂w)/100 + p̂m(1 − p̂m)/140) ≈ 0.06496

so the 95% confidence interval for pm − pw is 0.1243 ± (1.96)(0.06496) = −0.0030 to 0.2516.
Note: We followed the text's practice of subtracting the smaller proportion from the larger one, as described at the top of page 494.
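A sketch of the unpooled interval computation (helper name ours):

    import math

    def two_prop_ci(x1, n1, x2, n2, zstar=1.96):
        p1, p2 = x1/n1, x2/n2
        se = math.sqrt(p1*(1 - p1)/n1 + p2*(1 - p2)/n2)
        return (p1 - p2) - zstar*se, (p1 - p2) + zstar*se

    print(two_prop_ci(79, 140, 44, 100))     # about −0.0030 to 0.2516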

8.49. Let us call the proportions favoring Commercial B qw and qm. Our estimates of these proportions are the complements of those found in Exercise 8.48; for example, q̂w = 56/100 = 0.56 = 1 − p̂w. Consequently, the standard error of the difference q̂w − q̂m is the same as that for p̂m − p̂w: SE_D = √(q̂w(1 − q̂w)/100 + q̂m(1 − q̂m)/140) ≈ 0.06496. The margin of error is therefore also the same, and the 95% confidence interval for qw − qm is (q̂w − q̂m) ± (1.96)(0.06496) = −0.0030 to 0.2516.
Note: As in the previous exercise, we followed the text's practice of subtracting the smaller proportion from the larger one.

8.50. The pooled estimate of the proportion is p̂ = (44 + 79)/(100 + 140) = 0.5125. For testing H0: pm = pw versus Ha: pm ≠ pw, we have SE_Dp = √(p̂(1 − p̂)(1/100 + 1/140)) ≈ 0.06544 and the test statistic is z = (p̂m − p̂w)/SE_Dp ≈ 1.90, for which the two-sided P-value is 0.0576. This is not quite enough evidence to reject H0 at the 5% level.
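A sketch of the pooled test (helper name ours):

    import math
    from scipy import stats

    def two_prop_ztest(x1, n1, x2, n2):
        p1, p2 = x1/n1, x2/n2
        pool = (x1 + x2) / (n1 + n2)
        se = math.sqrt(pool * (1 - pool) * (1/n1 + 1/n2))
        z = (p1 - p2) / se
        return z, 2 * stats.norm.sf(abs(z))

    print(two_prop_ztest(79, 140, 44, 100))  # z ≈ 1.90, P ≈ 0.0576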

8.51. Because the sample proportions would tend to support the alternative hypothesis
( pm > pw ), the P-value is half as large (P = 0.0288), which would be enough to reject H0
at the 5% level.

8.52. (a) The filled-in table is below. (b) The estimated difference is p̂2 − p̂1 ≈ 0.1198. (c) Large-sample methods should be appropriate, because we have large, independent samples from two populations. (d) With SE_D ≈ 0.01105, the 95% confidence interval is 0.1198 ± 0.02167 = 0.0981 to 0.1415. (e) The estimated difference is about 12.0%, and the interval is about 9.8% to 14.1%. (f) It is hard to imagine why the months for each survey would affect the interpretation. (Of course, just because we cannot guess what the impact would be does not mean there is no impact.)

              Population   Sample   Count of    Sample
Population    proportion    size    successes  proportion
    1             p1        2822       198       0.0702
    2             p2        1553       295       0.1900

8.53. For H0: p1 = p2, the pooled estimate of the proportion is p̂ = (198 + 295)/(2822 + 1553) ≈ 0.1127. The standard error is SE_Dp ≈ 0.00999, and the test statistic is z = 0.1198/0.00999 ≈ 11.99. The alternative hypothesis was not specified in this exercise; for either p1 ≠ p2 or p1 < p2, the P-value associated with z = 11.99 would be tiny. (For the alternative p1 > p2, P would be

nearly 1, and we would not reject H0 ; however, it is hard to imagine why we would suspect
that podcast use had decreased from 2006 to 2008.)

8.54. (a) No; this is a ratio of proportions, not people. In addition, these are sample
proportions, so they are only estimates of the population proportions. If the size of the
population (Internet users) remained roughly constant from 2006 to 2008, we can say that
about 2.7 times as many people are downloading podcasts. (b) We are quite confident
that the 2008 proportion exceeds the 2006 proportion by at least 0.098, so making the
(extremely reasonable) assumption that the number of Internet users did not decrease, we are
nearly certain that there are more people downloading podcasts.

8.55. (a) The filled-in table is below. The values of X1 and X2 are estimated as (0.54)(1063) and (0.89)(1064). (b) The estimated difference is p̂2 − p̂1 = 0.35. (c) Large-sample methods should be appropriate, because we have large, independent samples from two populations. (d) With SE_D ≈ 0.01805, the 95% confidence interval is 0.35 ± 0.03537 = 0.3146 to 0.3854. (e) The estimated difference is about 35%, and the interval is about 31.5% to 38.5%. (f) A possible concern is that adults were surveyed before Christmas, while teens were surveyed before and after Christmas. It might be that some of those teens may have received game consoles as gifts, but eventually grew tired of them.

              Population   Sample   Count of    Sample
Population    proportion    size    successes  proportion
    1             p1        1063       574       0.54
    2             p2        1064       947       0.89

8.56. The pooled estimate of the proportion is p̂ = ((0.54)(1063) + (0.89)(1064))/(1063 + 1064) ≈ 0.7151. For testing H0: p1 = p2 versus Ha: p1 ≠ p2, we have SE_Dp = √(p̂(1 − p̂)(1/1063 + 1/1064)) ≈ 0.01957 and the test statistic is z = (p̂2 − p̂1)/SE_Dp ≈ 17.88. The P-value is essentially 0; we have no doubt that the two proportions are different (specifically, that the teen proportion is higher).

8.57. (a) The filled-in table is below. The values of X1 and X2 are estimated as (0.73)(1063) and (0.76)(1064). (b) The estimated difference is p̂2 − p̂1 = 0.03. (c) Large-sample methods should be appropriate, because we have large, independent samples from two populations. (d) With SE_D ≈ 0.01889, the 95% confidence interval is 0.03 ± 0.03702 = −0.0070 to 0.0670. (e) The estimated difference is about 3%, and the interval is about −0.7% to 6.7%. (f) As in the solution to Exercise 8.55, a possible concern is that adults were surveyed before Christmas.

              Population   Sample   Count of    Sample
Population    proportion    size    successes  proportion
    1             p1        1063       776       0.73
    2             p2        1064       809       0.76

8.58. The pooled estimate of the proportion is p̂ = ((0.73)(1063) + (0.76)(1064))/(1063 + 1064) ≈ 0.7450. For testing H0: p1 = p2 versus Ha: p1 ≠ p2, we have SE_Dp = √(p̂(1 − p̂)(1/1063 + 1/1064)) ≈ 0.01890 and the test statistic is z = (p̂2 − p̂1)/SE_Dp ≈ 1.59. The P-value is 0.1118 (Table A) or 0.1125 (software); either way, there is not enough evidence to conclude that the proportions are different.

8.59. No; this procedure requires independent samples from different populations. We have one
sample (of teens).

8.60. (a) The mean is µ_D = p1 − p2 = 0.4 − 0.5 = −0.1 and the standard deviation is

σ_D = √(p1(1 − p1)/50 + p2(1 − p2)/60) ≈ 0.09469

(b) [Figure: Normal curve centered at −0.1.] (c) d should be either 1.96σ_D ≈ 0.1856 or 2σ_D ≈ 0.1894, so the points marked on the curve should be either −0.2856 and 0.0856 or −0.2894 and 0.0894.
Note: Because this problem told us which population was "first" and which was "second," we did not follow the suggestion in the text to arrange them so that population 1 had the larger proportion. Where necessary, we have done so in the other exercises.

8.61. (a) H0 should refer to p1 and p2 (population proportions) rather than p̂1 and p̂2 (sample
proportions). (b) Knowing p̂1 = p̂2 does not tell us that the success counts are equal
(X 1 = X 2 ) unless the sample sizes are equal (n 1 = n 2 ). (c) Confidence intervals only
account for random sampling error.

8.62. (a) The mean of D = p̂1 − p̂2 is µ_D = p1 − p2 = 0.4 − 0.5 = −0.1 (as before). The standard deviation is

σ_D = √(p1(1 − p1)/50 + p2(1 − p2)/1000) ≈ 0.07106

(b) The mean of p̂1 − 0.5 is also −0.1; the standard deviation is the same as that of p̂1: √(p1(1 − p1)/50) ≈ 0.06928. (c) The standard deviation of p̂2 is only 0.01581, so it will typically (95% of the time) differ from its mean (0.5) by no more than about 0.032. (d) If one sample is very large, that estimated proportion will be more accurate, and most of the variation in the difference will come from the variation in the other proportion.

8.63. Pet owners had the lower proportion of women, so we call them "population 2": p̂2 = 285/595 ≈ 0.4790. For non-pet owners, p̂1 = 1024/1939 ≈ 0.5281. SE_D ≈ 0.02341, so the 95% confidence interval is 0.0032 to 0.0950.

8.64. (a) Arranging the proportions so that population 1 has the larger proportion, we have p̂1 = 35/165 ≈ 0.2121 and p̂2 = 17/283 ≈ 0.0601. (b) p̂1 − p̂2 ≈ 0.1521 and the standard error (for constructing a confidence interval) is SE_D ≈ 0.03482. (c) The hypotheses are H0: p1 = p2 versus Ha: p1 > p2. The alternative reflects the reasonable expectation that reducing pollution might decrease wheezing. (d) The pooled estimate of the proportion is p̂ = (17 + 35)/(283 + 165) ≈ 0.1161 and SE_Dp ≈ 0.03137, so z = (p̂1 − p̂2)/SE_Dp ≈ 4.85. The P-value is very small (P < 0.0001). (e) The 95% confidence interval, using the standard error from part (b), has margin of error 1.96 SE_D ≈ 0.06824: 0.0838 to 0.2203. The percent reporting improvement was between 8% and 22% higher for bypass residents. (f) There may be geographic factors (e.g., weather) or cultural factors (e.g., diet) that limit how much we can generalize the conclusions.

8.65. With equal sample sizes, the pooled estimate of the proportion is p̂ = 0.255, the average of p̂1 = 0.29 and p̂2 = 0.22. This can also be computed by taking X1 = (0.29)(1421) = 412.09 and X2 = (0.22)(1421) = 312.62, so p̂ = (X1 + X2)/(1421 + 1421). The standard error for a significance test is SE_Dp ≈ 0.01635, and the test statistic is z ≈ 4.28 (P < 0.0001); we conclude that the proportions are different. The standard error for a confidence interval is SE_D ≈ 0.01630, and the 95% confidence interval is 0.0381 to 0.1019. The interval gives us an idea of how large the difference is: Music downloads dropped 4% to 10%.

8.66. The table below shows the results from the previous exercise, and those with different
sample sizes. For part (iii), two answers are given, corresponding to the two ways one could
interpret which is the “first sample size.”

          n1     n2      p̂       SE_Dp     z     SE_D     Confidence interval
8.65     1421   1421   0.255    0.01635  4.28  0.01630   0.0381 to 0.1019
(i)      1000   1000   0.255    0.01949  3.59  0.01943   0.0319 to 0.1081
(ii)     1600   1600   0.255    0.01541  4.54  0.01536   0.0399 to 0.1001
(iii)    1000   1600   0.2469   0.01738  4.03  0.01770   0.0353 to 0.1047
         1600   1000   0.2631   0.01775  3.94  0.01733   0.0360 to 0.1040
As one would expect, we see in (i) and (ii) that smaller samples result in smaller z (weaker
evidence) and wider intervals, while larger samples have the reverse effect. The results of
(iii) show that the effect of varying unequal sample sizes is more complicated.

8.67. (a) We find p̂1 = 73/91 ≈ 0.8022 and p̂2 = 75/109 ≈ 0.6881. For a confidence interval, SE_D ≈ 0.06093, so the 95% confidence interval for p1 − p2 is (0.8022 − 0.6881) ± (1.96)(0.06093) = −0.0053 to 0.2335. (b) The question posed was, "Do high-tech companies tend to offer stock options more often than other companies?" Therefore, we test H0: p1 = p2 versus Ha: p1 > p2. With p̂1 ≈ 0.8022, p̂2 ≈ 0.6881, and p̂ = (73 + 75)/(91 + 109) = 0.74, we find SE_Dp ≈ 0.06229, so z = (p̂1 − p̂2)/SE_Dp ≈ 1.83. This gives P ≈ 0.0336. (c) We have fairly strong evidence that high-tech companies are more likely to offer stock options. However, the confidence interval tells us that the difference in proportions could be very small, or as large as 23%.

8.68. With p̂2002 ≈ 0.4780 and p̂2004 ≈ 0.3750, the standard error for a confidence interval is SE_D ≈ 0.00550. The 90% confidence interval for the difference p2002 − p2004 is (0.4780 − 0.3750) ± 1.645 SE_D = 0.0939 to 0.1120.

8.69. (a) p̂f = 48/60 = 0.8, so SE_p̂ ≈ 0.05164 for females. p̂m = 52/132 ≈ 0.39, so SE_p̂ ≈ 0.04253 for males. (b) SE_D = √(0.05164² + 0.04253²) ≈ 0.06690, so the interval is (p̂f − p̂m) ± 1.645 SE_D, or 0.2960 to 0.5161. There is (with high confidence) a considerably higher percent of juvenile references to females than to males.

8.70. (a) p̂1 = 515/1520 ≈ 0.3388 for men, and p̂2 = 27/191 ≈ 0.1414 for women. SE_D ≈ 0.02798, so the 95% confidence interval for p1 − p2 is 0.1426 to 0.2523. (b) The female contribution is larger because the sample size for women is much smaller. (Specifically, p̂1(1 − p̂1)/n1 ≈ 0.0001474, while p̂2(1 − p̂2)/n2 ≈ 0.0006355.) Note that if the sample sizes had been similar, the male contribution would have been larger (assuming the proportions remained the same) because the numerator term pi(1 − pi) is greater for men than women.

8.71. We test H0: p1 = p2 versus Ha: p1 ≠ p2. With p̂1 ≈ 0.5281, p̂2 ≈ 0.4790, and p̂ = (1024 + 285)/(1939 + 595) ≈ 0.5166, we find SE_Dp ≈ 0.02342, so z = (p̂1 − p̂2)/SE_Dp ≈ 2.10. This gives P = 0.0360—significant evidence (at the 5% level) that a higher proportion of non-pet owners are women.

8.72. For each confidence interval, the standard error is SE_p̂ = √(p̂(1 − p̂)/1102) and the margin of error is 1.96 SE_p̂.

Genre        p̂     SE       m.e.     Interval
Racing      0.74  0.01321  0.02590  0.7141 to 0.7659
Puzzle      0.72  0.01353  0.02651  0.6935 to 0.7465
Sports      0.68  0.01405  0.02754  0.6525 to 0.7075
Action      0.67  0.01416  0.02776  0.6422 to 0.6978
Adventure   0.66  0.01427  0.02797  0.6320 to 0.6880
Rhythm      0.61  0.01469  0.02880  0.5812 to 0.6388

8.73. (a) While there is only a 5% chance of any interval being wrong, we have six (roughly independent) chances to make that mistake. (b) For 99.2% confidence, use z∗ = 2.65. (Using software, z∗ ≈ 2.6521, or 2.6383 using the exact value of 0.05/6 = 0.0083.) (c) The margin of error for each interval is z∗ SE_p̂, so each interval is about 1.35 times wider than in the previous exercise. (If intervals are rounded to three decimal places, as below, the results are the same regardless of the value of z∗ used.)

Genre       Interval
Racing      0.705 to 0.775
Puzzle      0.684 to 0.756
Sports      0.643 to 0.717
Action      0.632 to 0.708
Adventure   0.622 to 0.698
Rhythm      0.571 to 0.649

8.74. (a) The proportion is p̂ = 0.042, so X = (0.042)(15,000) = 630 of the households in the sample are wireless only. (b) SE_p̂ ≈ 0.00164, so the 95% confidence interval is 0.042 ± 0.00321 = 0.0388 to 0.0452.

8.75. (a) The proportion is p̂ = 0.164, so X = (0.164)(15,000) = 2460 of the households in the sample are wireless only. (b) SE_p̂ ≈ 0.00302, so the 95% confidence interval is 0.164 ± 0.00593 = 0.1581 to 0.1699. (c) The estimate is 16.4%, and the interval is 15.8% to 17.0%. (d) The difference in the sample proportions is D = 0.164 − 0.042 = 0.122. (e) SE_D ≈ 0.00344, so the margin of error is 1.96 SE_D ≈ 0.00674. (The confidence interval is therefore 0.1153 to 0.1287.)

8.76. (a) The "relative risk" is 0.164/0.042 ≈ 3.90. A better term might be "relative rate" (of being wireless only). (b) A possible summary: From 2003 to 2007, the proportion of wireless-only households increased by nearly four times. (With software, and/or the methods of Chapter 14, one could also determine a confidence interval for this ratio, as in Example 8.11.) (c) Both illustrate (in different ways) a change in the proportion. (d) Preferences will vary. Students might note that ratios are often used in news reports; for example, "the risk of complications is twice as high for people taking drug A compared to those taking drug B."

8.77. With p̂1 = 0.43, p̂2 = 0.32, and n1 = n2 = 1430, we have SE_D ≈ 0.01799, so the 95% confidence interval is (0.43 − 0.32) ± 0.03526 = 0.0747 to 0.1453.

8.78. The pooled estimate of the proportion is p̂ = 0.375 (the average of p̂1 and p̂2, because the sample sizes were equal). Then SE_Dp ≈ 0.01811, so z = (0.43 − 0.32)/SE_Dp ≈ 6.08, for which P < 0.0001.

8.79. (a) and (b) The revised confidence intervals and z statistics are in the table below.
(c) While the interval and z statistic change slightly, the conclusions are roughly the same.
Note: Even if the second sample size were as low as 100, the two proportions would be
significantly different, albeit less so (z = 2.15, P = 0.0313).
  n2     SE_D     m.e.     Interval            p̂       SE_Dp    z      P
 1000   0.01972  0.03866  0.0713 to 0.1487   0.3847   0.02006  5.48  < 0.0001
 2000   0.01674  0.03281  0.0772 to 0.1428   0.3659   0.01668  6.59  < 0.0001

8.80. With p̂ = 994/1430 ≈ 0.6951, we have SE_p̂ ≈ 0.01217, so the 95% confidence interval is p̂ ± 1.96 SE_p̂ = 0.6951 ± 0.0239 = 0.6712 to 0.7190.

8.81. Student answers will vary. Shown below is the margin of error arising for sample sizes ranging from 500 to 2300; a graphical summary is not shown, but a good choice would be a plot of margin of error versus sample size.

   n      m.e.
  500   0.04035
  800   0.03190
 1100   0.02721
 1430   0.02386
 1700   0.02188
 2000   0.02018
 2300   0.01881

8.82. With p̂ = 0.58, the standard error is SE_p̂ = √((0.58)(0.42)/1048) ≈ 0.01525, so the margin of error for 90% confidence is 1.645 SE_p̂ ≈ 0.02508, and the interval is 0.5549 to 0.6051.

8.83. With p̂m = 0.59 and p̂w = 0.56, the standard error is SE_D ≈ 0.03053, the margin of error for 95% confidence is 1.96 SE_D ≈ 0.05983, and the confidence interval for pm − pw is −0.0298 to 0.0898.

8.84. (a) The table below summarizes the margins of error m.e. = 1.96√(p̂(1 − p̂)/n):

                                             p̂       m.e.    95% confidence interval
Current       Downloading less              38%     6.05%   31.95% to 44.05%
downloaders   Use P2P networks              33.33%  5.88%   27.45% to 39.21%
(n = 247)     Use e-mail/IM                 24%     5.33%   18.67% to 29.33%
              Use music-related sites       20%     4.99%   15.01% to 24.99%
              Use paid services             17%     4.68%   12.32% to 21.68%
All users     Have used paid services        7%     1.35%    5.65% to 8.35%
(n = 1371)    Currently use paid services    3%     0.90%    2.10% to 3.90%

(b) Obviously, students’ renditions of the above information in a paragraph will vary.
(c) Student opinions may vary on this. Personally, I lean toward B, although I would be
inclined to report two margins of error: “no more than 6%” for the current downloaders and
“no more than 1.4%” for all users.

8.85. (a) People have different symptoms; for example, not all who wheeze consult a doctor. (b) In the table below, we find for "sleep" that p̂1 = 45/282 ≈ 0.1596 and p̂2 = 12/164 ≈ 0.0732, so the difference is p̂1 − p̂2 ≈ 0.0864. Therefore, SE_D ≈ 0.02982 and the margin of error for 95% confidence is 0.05844. Other computations are performed in like manner. (c) It is reasonable to expect that the bypass proportions would be higher—that is, we expect more improvement where the pollution decreased—so we could use the alternative p1 > p2. (d) For "sleep," we find p̂ = (45 + 12)/(282 + 164) ≈ 0.1278 and SE_Dp ≈ 0.03279. Therefore, z = (0.1596 − 0.0732)/SE_Dp ≈ 2.64. Other computations are similar. Only the "sleep" difference is significant. (e) 95% confidence intervals are shown below. (f) Part (b) showed improvement relative to the control group, which is a better measure of the effect of the bypass, because it allows us to account for the improvement reported over time even when no change was made.

               Bypass minus congested                       Bypass
Complaint    p̂1 − p̂2   95% CI              z      P       p̂      95% CI
Sleep         0.0864   0.0280 to 0.1448    2.64  0.0042  0.1596  0.1168 to 0.2023
Number        0.0307  −0.0361 to 0.0976    0.88  0.1897  0.1596  0.1168 to 0.2023
Speech        0.0182  −0.0152 to 0.0515    0.99  0.1600  0.0426  0.0190 to 0.0661
Activities    0.0137  −0.0395 to 0.0670    0.50  0.3100  0.0925  0.0586 to 0.1264
Doctor       −0.0112  −0.0796 to 0.0573   −0.32  0.6267  0.1174  0.0773 to 0.1576
Phlegm       −0.0220  −0.0711 to 0.0271   −0.92  0.8217  0.0474  0.0212 to 0.0736
Cough        −0.0323  −0.0853 to 0.0207   −1.25  0.8950  0.0575  0.0292 to 0.0857

8.86. (a) The number of orders completed in five days or less before the changes was X1 = (0.16)(200) = 32. With p̂1 = 0.16, SE_p̂ ≈ 0.02592, and the 95% confidence interval for p1 is 0.1092 to 0.2108. (b) After the changes, X2 = (0.9)(200) = 180. With p̂2 = 0.9, SE_p̂ ≈ 0.02121, and the 95% confidence interval for p2 is 0.8584 to 0.9416. (c) SE_D ≈ 0.03350 and the 95% confidence interval for p2 − p1 is 0.6743 to 0.8057, or about 67.4% to 80.6%.

8.87. With p̂ = 0.56, SE_p̂ ≈ 0.01433, so the margin of error for 95% confidence is 1.96 SE_p̂ ≈ 0.02809.

8.88. (a) X1 = 121 ≈ (0.903)(134) die-hard fans and X2 = 161 ≈ (0.679)(237) less loyal fans watched or listened as children. (b) p̂ = (121 + 161)/(134 + 237) ≈ 0.7601 and SE_Dp ≈ 0.04615, so we find z ≈ 4.85 (P < 0.0001)—strong evidence of a difference in childhood experience. (c) For a 95% confidence interval, SE_D ≈ 0.03966 and the interval is 0.1459 to 0.3014. If students work with the rounded proportions (0.903 and 0.679), the 95% confidence interval is 0.1463 to 0.3017.

8.89. With p̂1 = 2/3 and p̂2 = 0.2, we have p̂ = (134p̂1 + 237p̂2)/(134 + 237) ≈ 0.3686, SE_Dp ≈ 0.05214, and z ≈ 8.95—very strong evidence of a difference. (If we assume that "two-thirds of the die-hard fans" and "20% of the less loyal fans" mean 89 and 47 fans, respectively, then p̂ ≈ 0.3666 and z ≈ 8.94; the conclusion is the same.) For a 95% confidence interval, SE_D ≈ 0.04831 and the interval is 0.3720 to 0.5613. (With X1 = 89 and X2 = 47, the interval is 0.3712 to 0.5606.)

8.90. We test H0: pf = pm versus Ha: pf ≠ pm for each text, where, for example, pf is the proportion of juvenile female references. We can reject H0 for texts 2, 3, 6, and 10. The last three texts do not stand out as different from the first seven. Texts 7 and 9 are notable as the only two with a majority of juvenile male references, while six of the ten texts had juvenile female references a majority of the time.

Text    p̂f     p̂m     p̂       z       P
  1   .4000  .2059  .2308    0.96   .3361
  2   .7143  .2857  .3286    2.29   .0220
  3   .4464  .2154  .3223    2.71   .0067
  4   .1447  .1210  .1288    0.51   .6123
  5   .6667  .2791  .3043    1.41   .1584
  6   .8000  .3939  .5208    5.22   .0000
  7   .9500  .9722  .9643   −0.61   .5437
  8   .2778  .1818  .2157    0.80   .4259
  9   .6667  .7273  .7097   −0.95   .3399
 10   .7222  .2520  .3103    4.04   .0001

8.91. The proportions, z-values, and P-values are:


Text 1 2 3 4 5 6 7 8 9 10
p̂ .8718 .9000 .5372 .6738 .9348 .6875 .6429 .6471 .7097 .8759
z 4.64 6.69 0.82 5.31 5.90 5.20 3.02 2.10 6.60 9.05
P ≈0 ≈ 0 .4133 ≈ 0 ≈0 ≈0 .0025 .0357 ≈0 ≈0
We reject H0 : p = 0.5 for all texts except Text 3 and (perhaps) Text 8. If we are using a
“multiple comparisons” procedure such as Bonferroni (see Chapter 6), we also might fail to
reject H0 for Text 7.
The last three texts do not seem to be any different from the first seven; the gender of the
author does not seem to affect the proportion.

8.92. (a) p̂ = 463/975 ≈ 0.4749, SE_p̂ ≈ 0.01599, and the 95% confidence interval is 0.4435 to 0.5062. (b) Expressed as percents, the confidence interval is 44.35% to 50.62%. (c) Multiply the upper and lower limits of the confidence interval by 37,500: about 16,632 to 18,983 students.

8.94. With sample sizes of nw = (0.52)(1200) = 624 women and nm = 576 men, we test H0: pm = pw versus Ha: pm ≠ pw. Assuming there were Xm = 0.62nm = 357.12 men and Xw = 0.51nw = 318.24 women who thought that parents put too little pressure on students, the pooled estimate is p̂ ≈ 0.5628, SE_Dp ≈ 0.02866, and the test statistic is z ≈ 3.84. This is strong evidence (P < 0.0001) that a higher proportion of men have this opinion.
To construct a 95% confidence interval for pm − pw, we have SE_D ≈ 0.02845, yielding the interval 0.0542 to 0.1658.

8.95. The difference becomes more significant (i.e., the P-value decreases) as the sample size increases. For small sample sizes, the difference between p̂1 = 0.5 and p̂2 = 0.4 is not significant, but with larger sample sizes, we expect that the sample proportions should be better estimates of their respective population proportions, so p̂1 − p̂2 = 0.1 suggests that p1 ≠ p2.

    n      z      P
   40    0.90   0.3681
   50    1.01   0.3125
   80    1.27   0.2041
  100    1.42   0.1556
  400    2.84   0.0045
  500    3.18   0.0015
 1000    4.49   0.0000
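A sketch that regenerates the table (pooled p̂ = 0.45 because the sample sizes are equal; the P-values differ in the fourth decimal place from the table, which appears to round z to two places first):

    import math
    from scipy import stats

    for n in (40, 50, 80, 100, 400, 500, 1000):
        se = math.sqrt(0.45 * 0.55 * (2 / n))
        z = (0.5 - 0.4) / se
        print(n, round(z, 2), round(2 * stats.norm.sf(z), 4))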

8.96. The table below shows the large-sample margins of error. The margin of error decreases as sample size increases, but the rate of decrease is noticeably less for large n.

    n      m.e.
   40    0.2079
   50    0.1859
   80    0.1470
  100    0.1315
  400    0.0657
  500    0.0588
 1000    0.0416

[Figure: margin of error (95% confidence) plotted against sample size, decreasing from about 0.21 at n = 40 toward 0.04 at n = 1000.]

8.97. (a) Using either trial and error, or the formula derived in part (b), we find that at least n = 342 is needed. (b) Generally, the margin of error is m = z∗√(p̂1(1 − p̂1)/n + p̂2(1 − p̂2)/n); with p̂1 = p̂2 = 0.5, this is m = z∗√(0.5/n). Solving for n, we find n = (z∗/m)²/2.
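A sketch of part (a), taking the target margin of error to be m = 0.075; that value comes from the exercise statement, not this solution, so treat it as an assumption here:

    import math

    def n_per_group(m, zstar=1.96):
        # worst case p1 = p2 = 0.5 in each group
        return math.ceil((zstar / m)**2 / 2)

    print(n_per_group(0.075))                # 342, consistent with part (a)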

8.98. We must assume that we can treat the births recorded during these two times as
independent SRSs. Note that the rules of thumb for the Normal approximation are not
satisfied here; specifically, three birth defects are less than ten. Additionally, one might call
into question the assumption of independence, because there may have been multiple births
to the same set of parents included in these counts (either twins/triplets/etc., or “ordinary”
siblings).
If we carry out the analysis in spite of these issues, we find p̂1 = 16/414 ≈ 0.03865 and p̂2 = 3/228 ≈ 0.01316. We might then find a 95% confidence interval: SE_D ≈ 0.01211, so the interval is p̂1 − p̂2 ± (1.96)(0.01211) = 0.00175 to 0.04923. Note that this does not take into account the presumed direction of the difference. (This setting does meet our requirements for the plus four method, for which p̃1 = 0.04086 and p̃2 = 0.01739, SE_D = 0.01298, and the 95% confidence interval is −0.0020 to 0.0489.)

We could also perform a significance test of H0: p1 = p2 versus Ha: p1 > p2: p̂ = 19/642 ≈ 0.02960, SE_Dp ≈ 0.01398, z ≈ 1.82, P ≈ 0.0344.
Both the large-sample interval and the significance test suggest that the two proportions
are different (but not much); the plus four interval does not establish that p1 = p2 . Also, we
must recognize that the issues noted above make this conclusion questionable.

8.99. (a) p0 = 143,611/181,535 ≈ 0.7911. (b) p̂ = 339/870 ≈ 0.3897, σ_p̂ ≈ 0.0138, and z = (p̂ − p0)/σ_p̂ ≈ −29.1, so P ≈ 0 (regardless of whether Ha is p < p0 or p ≠ p0). This is very strong evidence against H0; we conclude that Mexican Americans are underrepresented on juries. (c) p̂1 = 339/870 ≈ 0.3897, while p̂2 = (143,611 − 339)/(181,535 − 870) ≈ 0.7930. Then p̂ ≈ 0.7911 (the value of p0 from part (a)), SE_Dp ≈ 0.01382, and z ≈ −29.2—and again, we have a tiny P-value and reject H0.

8.101. In each case, the standard error is √(p̂(1 − p̂)/1280). One observation is that, while many feel that loans are a burden and wish they had borrowed less, a majority are satisfied with the benefits they receive from their education.

                       p̂      SE_p̂    95% confidence interval
Burdened by debt      0.555  0.01389  0.5278 to 0.5822
Would borrow less     0.544  0.01392  0.5167 to 0.5713
More hardship         0.343  0.01327  0.3170 to 0.3690
Loans worth it        0.589  0.01375  0.5620 to 0.6160
Career opportunities  0.589  0.01375  0.5620 to 0.6160
Personal growth       0.715  0.01262  0.6903 to 0.7397
Chapter 9 Solutions

9.1. (a) The conditional distributions are given in the table below. For example, given that Explanatory = 1, the distribution of the response variable is 70/200 = 35% "Yes" and 130/200 = 65% "No." (b) The graphical display might take the form of a bar graph, but other presentations are possible. (c) One notable feature is that when Explanatory = 1, "No" is more common, but "Yes" and "No" are closer to being evenly split when Explanatory = 2.

                    Explanatory variable
Response variable        1        2
Yes                     35%      45%
No                      65%      55%

[Figure: bar graph comparing the percents of "Yes" and "No" responses for Explanatory = 1 and Explanatory = 2.]

9.2. (a) The expected count for the first cell is (160)(200)/400 = 80. (b) This X² statistic has df = (2 − 1)(2 − 1) = 1. (c) Because 3.84 < X² < 5.02, the P-value is between 0.025 and 0.05.
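A sketch verifying part (a) and the Table F bracketing in part (c) with software:

    from scipy import stats

    print(160 * 200 / 400)                   # part (a): expected count 80
    print(stats.chi2.sf(3.84, df=1))         # about 0.050
    print(stats.chi2.sf(5.02, df=1))         # about 0.025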

9.3. The relative risk is 0.00753/0.00899 ≈ 0.838. Example 9.7 gave the 95% confidence interval (1.02, 1.32), so with the ratio reversed, the interval would be approximately (0.758, 0.980). For this relative risk, the statement made in Example 9.7 would be (changes underlined): "Since this interval does not include the value 1, corresponding to equal proportions in the two groups, we conclude that the lower CVD rate is statistically significant with P < 0.05. The low salt diet is associated with a 16% lower rate of CVD than the high salt diet."

9.4. The nine terms are shown in the table below. For example, the first term is (69 − 51.90)²/51.90 ≈ 5.6341. These terms add up to about 14.1558; the slight difference is due to the rounding of the expected values reported in Example 9.10.

Fruit              Physical activity
consumption      Low      Moderate   Vigorous
Low             5.6341     0.2230     0.3420
Medium          0.6256     0.2898     0.0153
High            6.1280     0.0091     0.8889

9.5. The table below summarizes the bounds for the P-values, and also gives the exact P-values (given by software). In each case, df = (r − 1)(c − 1).

        X²     Size of table   df   Crit. values (Table F)   Bounds for P         Actual P
(a)    5.32     2 by 2          1   5.02 < X² < 5.41         0.02 < P < 0.025     0.0211
(b)    2.7      2 by 2          1   2.07 < X² < 2.71         0.10 < P < 0.15      0.1003
(c)   25.2      4 by 5         12   24.05 < X² < 26.22       0.01 < P < 0.02      0.0139
(d)   25.2      5 by 4         12   24.05 < X² < 26.22       0.01 < P < 0.02      0.0139
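The exact P-values come from the chi-square upper-tail (survival) function; a minimal Python sketch (SciPy assumed):

from scipy.stats import chi2

# Exact P-values for Exercise 9.5; df = (r - 1)(c - 1) for each table.
for x2, r, c in [(5.32, 2, 2), (2.7, 2, 2), (25.2, 4, 5), (25.2, 5, 4)]:
    df = (r - 1) * (c - 1)
    print(f"X2 = {x2}, df = {df}, P = {chi2.sf(x2, df):.4f}")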


9.6. The Minitab output below gives X² ≈ 54.307, df = 1, and P < 0.0005, indicating significant evidence of an association.

Minitab output
          Men       Women     Total
Yes       1392      1748       3140
        1215.19   1924.81
No        3956      6723      10679
        4132.81   6546.19
Total     5348      8471      13819
ChiSq = 25.726 + 16.241 +
         7.564 +  4.776 = 54.307
df = 1, p = 0.000
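The same test in Python (NumPy/SciPy assumed):

import numpy as np
from scipy.stats import chi2_contingency

# Chi-square test for the 2x2 table of Exercise 9.6.
observed = np.array([[1392, 1748],    # Yes: men, women
                     [3956, 6723]])   # No:  men, women
x2, p, df, expected = chi2_contingency(observed, correction=False)
print(f"X2 = {x2:.3f}, df = {df}, P = {p:.2e}")   # X2 = 54.307, df = 1
print(expected)   # matches Minitab's expected counts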

9.7. The expected counts were rounded to the nearest hundredth.

9.8. The table below lists the observed counts, the population proportions, the expected counts,
and the chi-square contributions (for the next exercise). Each expected count is the product
of the proportion and the sample size 1567; for example, (0.172)(1567) = 269.524 for
California.
State AZ CA HI IN NV OH
Observed count 167 257 257 297 107 482
Proportion 0.105 0.172 0.164 0.188 0.070 0.301
Expected count 164.535 269.524 256.988 294.596 109.690 471.667
Chi-square contribution 0.0369 0.5820 0.0000 0.0196 0.0660 0.2264

9.9. The expected counts are in the table above, rounded to four decimal places as in
Example 9.15; for example, for California, we have
(257 − 269.524)²/269.524 ≈ 0.5820
The six values add up to 0.93 (rounded to two decimal places).

9.10. The chi-square goodness of fit statistic is X² ≈ 15.2 with df = 5, for which 0.005 < P < 0.01 (software gives 0.0096). The details of the computation are given in the table below; note that there were 475 M&M’s in the bag.

          Expected    Expected   Observed
          frequency   count      count       O − E      (O − E)²/E
Brown       0.13       61.75       61        −0.75        0.0091
Yellow      0.14       66.5        59        −7.5         0.8459
Red         0.13       61.75       49       −12.75        2.6326
Orange      0.20       95          77       −18           3.4105
Blue        0.24      114         141        27           6.3947
Green       0.16       76          88        12           1.8947
                                  475                    15.1876
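A software check of this goodness-of-fit computation (NumPy/SciPy assumed):

import numpy as np
from scipy.stats import chisquare

# Goodness of fit for the M&M color counts (Exercise 9.10).
observed = np.array([61, 59, 49, 77, 141, 88])           # n = 475
freqs = np.array([0.13, 0.14, 0.13, 0.20, 0.24, 0.16])   # claimed mix
x2, p = chisquare(observed, f_exp=freqs * observed.sum())
print(f"X2 = {x2:.4f}, P = {p:.4f}")   # X2 = 15.1876, P = 0.0096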

9.11. (a) The two-way table is below; for example, for April 2001, (0.05)(2250) = 112.5 and (0.95)(2250) = 2137.5.

                         Date of Survey
              April     April     March     April
Broadband?     2001      2004      2007      2008
Yes           112.5      540      1080     1237.5
No           2137.5     1710      1170     1012.5

(b) Under the null hypothesis that the proportions have not changed, the expected counts are (0.33)(2250) = 742.5 (across the top row) and (0.67)(2250) = 1507.5 (across the bottom row), because the average of the four broadband percents is (5% + 24% + 48% + 55%)/4 = 33%. (We take the unweighted average because we have assumed that the sample sizes were equal.) The test statistic is X² ≈ 1601.8 with df = 3, for which P < 0.0001. Not surprisingly, we reject H0. (c) The average of the last two broadband percents is (48% + 55%)/2 = 51.5%, so if the proportions are equal, the expected counts are (0.515)(2250) = 1158.75 (top row) and (0.485)(2250) = 1091.25 (bottom row). The test statistic is X² ≈ 22.07 with df = 1, for which P < 0.0001.
Note: This test is equivalent to testing H0: p1 = p2 versus Ha: p1 ≠ p2 using the methods of Chapter 8. We find pooled estimate p̂ = 0.515, SEDp ≈ 0.01490, and z = (0.48 − 0.55)/SEDp ≈ −4.70. (Note that z² = X².)
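The z² = X² relationship is easy to see numerically; a Python sketch for part (c) (NumPy/SciPy assumed):

import numpy as np
from scipy.stats import chi2_contingency

# Part (c) two ways: 2x2 chi-square vs. pooled two-proportion z.
observed = np.array([[1080.0, 1237.5],    # broadband Yes: 2007, 2008
                     [1170.0, 1012.5]])   # broadband No
x2, p, df, _ = chi2_contingency(observed, correction=False)
se = np.sqrt(0.515 * 0.485 * (1/2250 + 1/2250))
z = (0.48 - 0.55) / se
print(f"X2 = {x2:.2f}, z = {z:.2f}, z^2 = {z*z:.2f}")   # both about 22.07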

9.12. (a) The two-way table is below; for example, for April 2001, (0.41)(2250) = 922.5 and (0.59)(2250) = 1327.5.

                      Date of Survey
            April     April     March     April
Dialup?      2001      2004      2007      2008
Yes         922.5      675       360       270
No         1327.5     1575      1890      1980

(b) Under the null hypothesis that the proportions have not changed, the expected counts are (0.2475)(2250) = 556.875 (across the top row) and (0.7525)(2250) = 1693.125 (across the bottom row), because the average of the four dialup percents is (41% + 30% + 16% + 12%)/4 = 24.75%. The test statistic is X² ≈ 641.2 with df = 3, for which P < 0.0001. Again, we reject H0. (c) The average of the last two dialup percents is (16% + 12%)/2 = 14%, so if the proportions are equal, the expected counts are (0.14)(2250) = 315 (top row) and (0.86)(2250) = 1935 (bottom row). The test statistic is X² ≈ 14.95 with df = 1, for which P < 0.0001. (d) The data show that the rise of broadband access has been accompanied by a decline in dialup access.

[Line graph: percent of households with broadband and with dialup, 2001–2008; broadband rises while dialup falls.]

Note: As in the previous exercise, the test in part (c) is equivalent to testing H0: p1 = p2 versus Ha: p1 ≠ p2, for which the pooled estimate is p̂ = 0.14, SEDp ≈ 0.01035, and z = (0.16 − 0.12)/SEDp ≈ 3.87. (Again, note that z² = X².)

9.13. Students may experiment with a variety of scenarios, but they should find that regardless of what they try, the conclusion is the same.

9.14. (a) Student approaches to estimating the dialup counts will vary. The bottom row of the table below shows a reasonable set of estimates, found by fitting a regression line to the counts in the solution to Exercise 9.13.

                        Date of Survey
           October    February    December     May
Switch?      2002       2004        2005       2008
Yes         301.10     266.29      191.71      94.32
No          491.28     399.43      299.86     167.68
Total       792.38     665.72      491.57     262.00
(Even students who use a similar approach might get slightly different answers depending
on how they represent the survey dates as x values.) (b) For example, for October 2002,
(0.38)(792.38) = 301.10 and (0.62)(792.38) = 491.28. (c) For the data shown, the test
statistic is X² ≈ 1.45 (df = 3, P ≈ 0.6934). Student results will vary, but unless their
dialup count estimates are drastically different, they should not reject H0 ; that is, there is not
enough evidence to conclude that the proportion of dialup users intending to switch to broad-
band has changed. (d) Answers will vary depending on the approach used, but should be
close to 45%. One explanation is that the number of (surveyed) dialup users who were not
interested in switching dropped from about 300 to 168 from December 2005 to May 2008—a
44% reduction. Alternatively, in that time period, the number of dialup users dropped by
47%, from about 492 to 262. In order for the percent not planning to switch to remain at
60%, that group must decrease by a similar amount.

9.15. (a) The 3 × 2 table is below. (b) The percents of disallowed small, medium, and large claims are (respectively) 6/57 ≈ 10.5%, 5/17 ≈ 29.4%, and 1/5 = 20%. (c) In the 3 × 2 table, the expected count for large/not allowed is too small (5 · 12/79 ≈ 0.76). (d) The null hypothesis is “There is no relationship between claim size and whether a claim is allowed.” (e) As a 2 × 2 table (with the second row 16 “yes” and 6 “no”), we find X² = 3.456, df = 1, P = 0.063. The evidence is not quite strong enough to reject H0.

                Allowed?
Stratum      Yes     No     Total
Small         51      6       57
Medium        12      5       17
Large          4      1        5
Total         67     12       79

9.16. (a) In the table below, the estimated numbers of disallowed claims in the populations are found by multiplying the sample proportion by the population size; for example, (6/57) · 3342 ≈ 351.8 claims. (b) For each stratum, let p̂ be the sample proportion, n be the sample size, and N be the population size. The standard error for the sample is SE p̂ = √(p̂(1 − p̂)/n), and the standard error for the population estimate is N · SE p̂. The margin of error depends on the desired confidence level; for 95% confidence, we should double the population standard errors.

                   Sample                Population              Standard error
Stratum    Not allowed   Total    Not allowed   Total        Sample      Population
Small            6         57        351.8       3342        0.0406      135.8485
Medium           5         17         72.4        246        0.1105       27.1855
Large            1          5         11.6         58        0.1789       10.3754

9.17. The table below shows the given information translated into a 3 × 2 table. For example, in Year 1, about (0.423)(2408) = 1018.584 students received DFW grades, and the rest—(0.577)(2408) = 1389.416 students—passed. To test H0: the DFW rate has not changed, we have X² ≈ 307.8, df = 2, P < 0.0001—very strong evidence of a change.

Year      DFW         Pass
1       1018.584    1389.416
2        578.925    1746.075
3        423.074    1702.926

9.18. (a) The table of approximate counts is below. Because the reported percents were rounded to the nearest whole percent, the total sample size is not 719.

Attendance         ABC      DFW      Total
Less than 50%     10.78      9        19.78
51% to 74%        43.12     25.2      68.32
75% to 94%       134.75     54       188.75
95% or more      355.74     91.8     447.54
Total            544.39    180       724.39

(b) With the counts as in the table, X² ≈ 15.75, df = 3, and P ≈ 0.0013. If students round the counts, or attempt to adjust the numbers in the first column so the numbers add up to 719, the value of X² will change
slightly, but the P-value remains small, and the conclusion is the same. (c) We have strong
enough evidence to conclude that there is an association between class attendance and DFW
rates. (d) Association is not proof of causation. However, by comparing the observed counts
with the expected counts, we can see that the data are consistent with that scenario; for ex-
ample, among students with the highest attendance rates, more passed than expected (355.74
observed, 336.33 expected), and fewer failed (91.8 observed, 111.2 expected).

9.19. (a) The approximate counts are shown below; for example, among those students in trades, (0.34)(942) = 320.28 enrolled right after high school, and (0.66)(942) = 621.72 enrolled later.

                      Time of entry
Field of      Right after
study         high school      Later       Total
Trades           320.28        621.72        942
Design           274.48        309.52        584
Health          2034          3051          5085
Media/IT         975.88       2172.12       3148
Service          486           864          1350
Other           1172.60       1082.40       2255
Total           5263.24       8100.76     13,364

(b) In addition to a chi-square test in part (c), students might note other things, such as: Overall, 39.4% of these students enrolled right after high school. Health is the most popular field, with about 38% of these students. (c) We have strong enough evidence to conclude that there is an association between field of study and when students enter college; the test statistic is X² = 275.9 (with unrounded counts) or 276.1 (with rounded counts), with df = 5, for which P is very small. A graphical summary is not shown; a bar chart would be appropriate.
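The test statistic can be reproduced in Python (NumPy/SciPy assumed):

import numpy as np
from scipy.stats import chi2_contingency

# Field of study vs. time of entry (Exercise 9.19), unrounded counts.
counts = np.array([[320.28,  621.72],
                   [274.48,  309.52],
                   [2034.0, 3051.0],
                   [975.88, 2172.12],
                   [486.0,   864.0],
                   [1172.6, 1082.4]])
x2, p, df, _ = chi2_contingency(counts)
print(f"X2 = {x2:.1f}, df = {df}, P = {p:.2e}")   # X2 = 275.9, df = 5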

9.20. (a) The approximate counts are shown below; for example, among those students in trades, (0.45)(942) = 423.9 took government loans and (0.55)(942) = 518.1 did not.

               Government loans
Field          Yes         No        Total
Trades         423.9       518.1        942
Design         317.47      281.53       599
Health        2878.7      2355.3       5234
Media/IT      1780.9      1457.1       3238
Service        826.8       551.2       1378
Other         1081        1219         2300
Total         7308.77     6382.23    13,691

(b) We have strong enough evidence to conclude that there is an association between field of study and taking government loans; the test statistic is X² = 97.44 (with unrounded counts) or 97.55 (with rounded counts), with df = 5, for which P is very small. (c) Overall, 53.3% of these students took government loans; students in trades and “other” fields of study were slightly less likely, and those in the service field were slightly more likely. A bar graph would be a good choice for a graphical summary.

9.21. (a) The approximate counts are shown below; for example, among those students in trades, (0.2)(942) = 188.4 relied on parents, family, or spouse, and (0.8)(942) = 753.6 did not.

               Parents, family, spouse
Field           Yes          No        Total
Trades          188.4        753.6        942
Design          221.63       377.37       599
Health         1360.84      3873.16      5234
Media/IT        518.08      2719.92      3238
Service         248.04      1129.96      1378
Other           943         1357         2300
Total          3479.99    10,211.01    13,691

(b) We have strong enough evidence to conclude that there is an association between field of study and getting money from parents, family, or spouse; the test statistic is X² = 544.0 (with unrounded counts) or 544.8 (with rounded counts), with df = 5, for which P is very small. (c) Overall, 25.4% of these students relied on family support; students in media/IT and service fields were slightly less likely, and those in the design and “other” fields were slightly more likely. A bar graph would be a good choice for a graphical summary.

9.22. (a) For example, 63/(63 + 309) ≈ 16.94% of the smallest banks offer RDC. A bar graph like the one described below is one possible graphical summary. (b) To test H0: no association between bank size and offering RDC, we have X² ≈ 96.3 with df = 2, for which P is tiny. We have very strong evidence of an association.

[Bar graph: percent offering RDC by bank assets—16.94% (under $100 million), 30.89% ($101–200 million), 56.85% ($201 million or more).]

9.23. (a) Of the high exercisers, 151/(151 + 148) ≈ 50.5% get enough sleep, and the rest (49.5%) do not. (b) Of the low exercisers, 115/(115 + 242) ≈ 32.2% get enough sleep, and the rest (67.8%) do not. (c) Those who exercise more than the median are more likely to get enough sleep. (d) To test H0: exercise and sleep are not associated, we have X² ≈ 22.58 with df = 1, for which P is very small. We have very strong evidence of an association.

[Bar graph: percent getting enough sleep—50.5% for the high exercise group, 32.2% for the low group.]

9.24. (a) The marginal totals are given in the table below. (b) The most appropriate description is the conditional distribution by gender (the explanatory variable): 25.05% of males, and 69.02% of females, admitted to lying. (c) Females are much more likely to have lied (or at least, to admit to lying). (d) Not surprisingly, this is highly significant: X² ≈ 5352, df = 1, P is tiny. This test statistic is too extreme to bother creating a P-value sketch.

Lied?      Male      Female      Total
Yes        3,228     10,295     13,523
No         9,659      4,620     14,279
Total     12,887     14,915     27,802

[Bar graph: percent who have lied to a teacher—25.05% of males, 69.02% of females.]

Note: To get an idea of how extreme this test statistic value is: Observing X² = 5352 from a χ²(1) distribution is equivalent to z = √5352 ≈ 73 from the standard Normal distribution.

9.25. (a) The marginal totals are given in the table below. (b) The most appropriate description is the conditional distribution by gender (the explanatory variable): 91% of males, and 95% of females, agreed that trust and honesty are essential. (c) Females are slightly more likely to view trust and honesty as essential. (d) While the percents in the conditional distribution are similar, the large sample sizes make this highly significant: X² ≈ 175.0, df = 1, P is tiny. Once again, a P-value sketch is not shown.

Agreed?     Male      Female      Total
Yes        11,724     14,169     25,893
No          1,163        746      1,909
Total      12,887     14,915     27,802

[Bar graph: percent who say that trust is essential—91% of males, 95% of females.]

Note: X² = 175 coming from a χ²(1) distribution is equivalent to z = √175 ≈ 13 coming from the standard Normal distribution.

9.26. The main problem is that this is not a two-way table. Specifically, each of the 119
students might fall into several categories: They could appear on more than one row if
they saw more than one of the movies and might even appear more than once on a given
row (for example, if they have both bedtime and waking symptoms arising from the same
movie).
Another potential problem is that this is a table of percents rather than counts. However,
because we were given the value of n for each movie title, we could use that information to
determine the counts for each category; for example, it appears that 20 of the 29 students
who watched Poltergeist had short-term bedtime problems because 20/29 ≈ 68.96% (perhaps
the reported value of 68% was rounded incorrectly). If we determine all of these counts in
this way (and note several more apparent rounding errors in the process), those counts add
up to 200, so we see that students really were counted more than once.
If the values of n had not been given for each movie, then we could not do a chi-squared
analysis even if this were a two-way table.

9.27. (a) The joint distribution is found by dividing each number in the table by 17,380 (the total of all the numbers). These proportions are given in italics in the table below. For example, 3553/17,380 ≈ 0.2044, meaning that about 20.4% of all college students are full-time and aged 15 to 19. (b) The marginal distribution of age is found by dividing the row totals by 17,380; they are in the right margin of the table. For example, 3882/17,380 ≈ 0.2234, meaning that about 22.3% of all college students are aged 15 to 19. (c) The marginal distribution of status is found by dividing the column totals by 17,380; they are in the bottom margin of the table. For example, 11,989/17,380 ≈ 0.6898, meaning that about 69% of all college students are full-time. (d) The conditional distributions are given in the second table below. For each status category, the conditional distribution of age is found by dividing the counts in that column by that column total. For example, 3553/11,989 ≈ 0.2964, 5710/11,989 ≈ 0.4763, etc., meaning that of all full-time college students, about 29.64% are aged 15 to 19, 47.63% are 20 to 24, and so on. Note that each set of four numbers should add to 1 (except for rounding error). Graphical presentations may vary. (e) We see that full-time students are dominated by younger ages, while part-time students are more likely to be older.

              FT                 PT                Total
15–19     3553  0.2044       329  0.0189       3882  0.2234
20–24     5710  0.3285      1215  0.0699       6925  0.3984
25–34     1825  0.1050      1864  0.1072       3689  0.2123
35+        901  0.0518      1983  0.1141       2884  0.1659
Total    11989  0.6898      5391  0.3102      17380

[Bar graphs: marginal distribution of age (15–19, 20–24, 25–34, 35 and over) and of status (full-time, part-time).]

          Full-time    Part-time
15–19      0.2964       0.0610
20–24      0.4763       0.2254
25–34      0.1522       0.3458
35+        0.0752       0.3678

[Stacked bar graph: conditional distribution of age within each status (full-time, part-time).]

9.28. (a) Of all students aged 20 to 24 years, 3254/6925 ≈ 46.99% are men and the rest (3671/6925 ≈ 53.01%) are women. Two graphical displays are possible (described below). In a bar graph, the bars represent the proportion of all students (in this age range) in each gender. Alternatively, because the two percents represent parts of a single whole, we can display the distribution as a pie chart. (b) Among male students, 2719/3254 ≈ 83.56% are full-time and the rest (535/3254 ≈ 16.44%) are part-time. Among female students, those numbers are 2991/3671 ≈ 81.48% and 680/3671 ≈ 18.52%. Men in this age range are (very slightly) more likely to be full-time students. A bar graph can show the proportions of full-time students side by side; note that a pie graph would not be appropriate for this display because the two proportions represent parts of two different wholes. (c) For the full-time row, the expected counts are (5710)(3254)/6925 ≈ 2683.08 and (5710)(3671)/6925 ≈ 3026.92. (d) Using df = 1, we see that X² = 5.17 falls between 5.02 and 5.41, so 0.02 < P < 0.025 (software gives 0.023). This is significant evidence (at the 5% level) that there is a difference in the conditional distributions.

[Graphs: bar graph and pie chart of the gender distribution among students aged 20 to 24; bar graph comparing proportions of full-time students for men and women.]

9.29. (a) The percent who have lasting waking symptoms is the total of the first column divided by the grand total: 69/119 ≈ 57.98%. (b) The percent who have both waking and bedtime symptoms is the count in the upper left divided by the grand total: 36/119 ≈ 30.25%. (c) To test H0: There is no relationship between waking and bedtime symptoms versus Ha: There is a relationship, we find X² ≈ 2.275 (df = 1) and P ≈ 0.132. We do not have enough evidence to conclude that there is a relationship.

Minitab output
          WakeYes    WakeNo    Total
BedYes       36         33        69
           40.01      28.99
BedNo        33         17        50
           28.99      21.01
Total        69         50       119
ChiSq = 0.402 + 0.554 +
        0.554 + 0.765 = 2.275
df = 1, p = 0.132

9.30. The table below gives df = (r − 1)(c − 1), bounds for P, and software P-values.

        X²     Size of table   df   Crit. values (Table F)   Bounds for P          Software P
(a)    1.25     2 by 2          1   X² < 1.32                P > 0.25              0.2636
(b)   18.34     4 by 4          9   16.92 < X² < 19.02       0.025 < P < 0.05      0.0314
(c)   24.21     2 by 8          7   22.04 < X² < 24.32       0.001 < P < 0.0025    0.0010
(d)   12.17     5 by 3          8   12.03 < X² < 13.36       0.10 < P < 0.15       0.1438

9.31. Two examples are shown below. In general, choose a to be any number from 0 to 50, and then all the other entries can be determined.

    30   20          10   40
    70   80          90   60
Note: This is why we say that such a table has “one degree of freedom”: We can make
one (nearly) arbitrary choice for the first number, and then have no more decisions to make.

9.32. To construct such a table, we can start by choosing values for the row and column sums r1, r2, r3, c1, c2, as well as the grand total N. Note that N = r1 + r2 + r3 = c1 + c2, so we only have four choices to make. Then find each count a, b, c, d, e, f by taking the corresponding row total, times the corresponding column total, divided by the grand total. For example, a = r1 × c1/N and d = r2 × c2/N. Of course, these counts should be whole numbers, so it may be necessary to make adjustments in the row and column totals to meet this requirement.

     a    b    r1
     c    d    r2
     e    f    r3
    c1   c2     N
(which would arise if we start with r1 = r2 = r3 and c1 = c2 ).

9.33. (a) Different graphical presentations are possible; one is described below. More women perform volunteer work; the notably higher percent of women who are “strictly voluntary” participants accounts for the difference. (The “court-ordered” and “other” percents are similar for men and women.) (b) Either by adding the three “participant” categories or by subtracting from 100% the non-participant percentage, we find that 40.3% of men and 51.3% of women are participants. The relative risk of being a volunteer is therefore 51.3%/40.3% ≈ 1.27.

[Stacked bar graph: percent of men and women who are strictly voluntary, court-ordered, other, or non-volunteers.]

9.34. Table shown below; for example, 31.9%/40.3% ≈ 79.16%. The percents in each row sum to 100%, with no rounding error for up to four places after the decimal. Both this graph and the graph in the previous exercise show that women are more likely to volunteer, but in this view we cannot see the difference in the rate of non-participation.

           Strictly      Court-
Gender     voluntary     ordered     Other
Men         79.16%        5.21%      15.63%
Women       85.19%        2.14%      12.67%

[Stacked bar graph: conditional distribution of volunteer type (strictly voluntary, other, court-ordered) for men and women.]

9.35. (a) The missing entries in the table below are found by subtracting the number who have tried low-fat diets from the given totals. (b) Viewing gender as explanatory, compute the conditional distributions of low-fat diet for each gender: 35/181 ≈ 19.34% of women and 8/105 ≈ 7.62% of men have tried low-fat diets. (c) The test statistic is X² = 7.143 (df = 1), for which P = 0.008. We have strong evidence of an association; specifically, women are more likely to try low-fat diets.

                       Gender
Low-fat diet?     Women      Men
Yes                  35        8
No                  146       97
Total               181      105

Minitab output
          Women      Men     Total
Yes          35        8        43
           27.21    15.79
No          146       97       243
          153.79    89.21
Total       181      105       286
ChiSq = 2.228 + 3.841 +
        0.394 + 0.680 = 7.143
df = 1, p = 0.008

[Bar graph: percent who have tried low-fat diets—19.34% of women, 7.62% of men.]

9.36. (a) The best numerical summary would note that we view target audience (“magazine readership”) as explanatory, so we should compute the conditional distribution of model dress for each audience; this table is shown below. (b) Minitab output is also shown below: X² ≈ 80.9, df = 2, and P is very small. We have very strong evidence that target audience affects model dress. (c) The sample is not an SRS: A set of magazines were chosen, and then all ads in three issues of those magazines were examined. It is not clear how this sampling approach might invalidate our conclusions, but it does make them suspect.

Minitab output
         Women      Men      Genl     Total
1          351       514      248      1113
         424.84    456.56   231.60
2          225       105       66       396
         151.16    162.44    82.40
Total      576       619      314      1509
ChiSq = 12.835 +  7.227 + 1.162 +
        36.074 + 20.312 + 3.265 = 80.874
df = 2, p = 0.000

                   Magazine readership
Model dress      Women       Men       General
Not sexual       60.94%     83.04%     78.98%
Sexual           39.06%     16.96%     21.02%

[Bar graph: percent of sexual ads for each readership group.]

9.37. (a) As the conditional distribution of model dress for each age group has been given to us, it only remains to display this distribution graphically; one such presentation is described below. (b) In order to perform the significance test, we must first recover the counts from the percents. For example, there were (0.723)(1006) ≈ 727 non-sexual ads in young adult magazines. The remainder of these counts can be seen in the Minitab output below, where we see X² ≈ 2.59, df = 1, and P ≈ 0.108—not enough evidence to conclude that age group affects model dress.

Minitab output
         Young     Mature     Total
1          727       383       1110
         740.00    370.00
2          279       120        399
         266.00    133.00
Total     1006       503       1509
ChiSq = 0.228 + 0.457 +
        0.635 + 1.271 = 2.591
df = 1, p = 0.108

[Bar graph: percent of sexual ads for young adult and mature adult magazines.]

9.38. (a) Subtract the “agreed” counts from the sample sizes to get the “disagreed” counts. The table is in the Minitab output below. (The output has been slightly altered to have more descriptive row and column headings.) We find X² ≈ 2.67, df = 1, and P ≈ 0.103, so we cannot conclude that students and non-students differ in the response to this question. (b) For testing H0: p1 = p2 versus Ha: p1 ≠ p2, we have p̂1 ≈ 0.3607, p̂2 ≈ 0.5085, p̂ ≈ 0.4333, SE Dp ≈ 0.09048, and z ≈ −1.63. Up to rounding, z² = X² and the P-values are the same.

Minitab output
        Students    Non-st    Total
Agr        22         30        52
         26.43      25.57
Dis        39         29        68
         34.57      33.43
Total      61         59       120
ChiSq = 0.744 + 0.769 +
        0.569 + 0.588 = 2.669
df = 1, p = 0.103
(c) The statistical tests in (a) and (b) assume that we have two SRSs, which we clearly do
not have here. Furthermore, the two groups differed in geography (northeast/West Coast)
in addition to student/non-student classification. These issues mean we should not place
too much confidence in the conclusions of our significance test—or, at least, we should not
generalize our conclusions too far beyond the populations “upper level northeastern college
students taking a course in Internet marketing” and “West Coast residents willing to partici-
pate in commercial focus groups.”

9.39. (a) First we must find the counts in each cell of the two-way table. For example, there were about (0.172)(5619) ≈ 966 Division I athletes who admitted to wagering. These counts are shown in the Minitab output below, where we see that X² ≈ 76.7, df = 2, and P < 0.0001. There is very strong evidence that the percent of athletes who admit to wagering differs by division.

Minitab output
          Div1       Div2       Div3      Total
1          966        621        998       2585
        1146.87     603.54     834.59
2         4653       2336       3091      10080
        4472.13    2353.46    3254.41
Total     5619       2957       4089      12665
ChiSq = 28.525 + 0.505 + 31.996 +
         7.315 + 0.130 +  8.205 = 76.675
df = 2, p = 0.000

(b) Even with much smaller numbers of students (say, 1000 from
each division), P is still very small. Presumably the estimated numbers are reliable enough
that we would not expect the true counts to be less than 1000, so we need not be concerned
about the fact that we had to estimate the sample sizes. (c) If the reported proportions are
wrong, then our conclusions may be suspect—especially if it is the case that athletes in some
division were more likely to say they had not wagered when they had. (d) It is difficult to
predict exactly how this might affect the results: Lack of independence could cause the es-
timated percents to be too large, or too small, if our sample included several athletes from
teams which have (or do not have) a “gambling culture.”

9.40. In Exercise 9.15, we are comparing three populations (model 1): small, medium, and
large claims. In Exercise 9.23, we test for independence (model 2) between amount of sleep
and level of exercise. In Exercise 9.24, we test for independence between gender and lying
to teachers. In Exercise 9.39, one could argue for either answer. If we chose three separate
random samples from each division, then we are comparing three populations (model 1). If
a single random sample of student athletes was chosen, and then we classified each student
by division and by gambling response, this is a test for independence (model 2).

Note: For some of these problems, either answer may be acceptable, provided a
reasonable explanation is given. The distinctions between the models can be quite difficult to
make since the difference between several populations might, in fact, involve classification
by a categorical variable. In many ways, it comes down to how the data were collected.
For example, in Exercise 9.15, we were told that the data came from a stratified random
sample—which means that the three groups were treated as separate populations. Of course,
the difficulty is that the method of collecting data may not always be apparent, in which
case we have to make an educated guess. One question we can ask to educate our guess is
whether we have data that can be used to estimate the (population) marginal distributions.

9.41. The Minitab output below shows both the two-way table (column and row headings have been changed to be more descriptive) and the results for the significance test: X² = 12.0, df = 1, and P = 0.001, so we conclude that gender and flower choice are related. The count of 0 does not invalidate the test: Our smallest expected count is 6, while the text says that “for 2 × 2 tables, we require that all four expected cell counts be 5 or more.”

Minitab output
          Female    Male    Total
bihai        20       0       20
           14.00    6.00
no           29      21       50
           35.00   15.00
Total        49      21       70
ChiSq = 2.571 + 6.000 +
        1.029 + 2.400 = 12.000
df = 1, p = 0.001

9.42. The graph described below depicts the conditional distribution of domain type for each journal; for example, in NEJM, 41/97 ≈ 42.27% of Internet references were to .gov domains, 37/97 ≈ 38.14% were to .org domains, and so on. The Minitab output shows the expected counts, which tell a story similar to the bar graph, and show that the relationship between journal and domain type is significant (X² ≈ 56.12, df = 8, P < 0.0005).

Minitab output
         NEJM     JAMA     Science    Total
.gov       41      103       111        255
         36.81    71.72    146.47
.org       37       46       162        245
         35.36    68.91    140.73
.com        6       17        14         37
          5.34    10.41     21.25
.edu        4        8        47         59
          8.52    16.59     33.89
other       9       15        52         76
         10.97    21.37     43.65
Total      97      189       386        672
ChiSq = 0.477 + 13.644 + 8.591 +
        0.076 +  7.615 + 3.215 +
        0.081 +  4.178 + 2.475 +
        2.395 +  4.451 + 5.072 +
        0.354 +  1.901 + 1.595 = 56.12
df = 8, p = 0.000

[Bar graph: percent of Internet references by domain type (.gov, .org, .com, .edu, other) for each journal.]

9.43. The graph described below depicts the conditional distribution of pet ownership for each education level; for example, among those who did not finish high school, 421/542 ≈ 77.68% owned no pets, 93/542 ≈ 17.16% owned dogs, and 28/542 ≈ 5.17% (the rest) owned cats. (One could instead compute column percents—the conditional distribution of education for each pet-ownership group—but education level makes more sense as the explanatory variable here.) The (slightly altered) Minitab output shows that the relationship between education level and pet ownership is significant (X² ≈ 23.15, df = 4, P < 0.0005). Specifically, dog owners have less education, and cat owners more, than we would expect if there were no relationship between pet ownership and educational level.

Minitab output
         None     Dogs     Cats    Total
<HS       421       93       28      542
        431.46    73.25    37.29
HS        666      100       40      806
        641.61   108.93    55.46
>HS       845      135       99     1079
        858.93   145.82    74.25
Total    1932      328      167     2427
ChiSq = 0.253 + 5.326 + 2.316 +
        0.927 + 0.732 + 4.310 +
        0.226 + 0.803 + 8.254 = 23.147
df = 4, p = 0.000

[Bar graph: pet ownership (no pets, dogs, cats) by education level.]

9.44. The graph described below depicts the conditional distribution of pet ownership for each gender; for example, among females, 1024/1266 ≈ 80.88% owned no pets, 157/1266 ≈ 12.40% owned dogs, and 85/1266 ≈ 6.71% (the rest) owned cats. (One could instead compute column percents—the conditional distribution of gender for each pet-ownership group—but gender makes more sense as the explanatory variable here.) The (slightly altered) Minitab output shows that the relationship between gender and pet ownership is not significant (X² ≈ 2.838, df = 2, P ≈ 0.242).

Minitab output
           None     Dogs     Cats    Total
Female     1024      157       85     1266
         1008.53   170.60    86.86
Male        915      171       82     1168
          930.47   157.40    80.14
Total      1939      328      167     2434
ChiSq = 0.237 + 1.085 + 0.040 +
        0.257 + 1.176 + 0.043 = 2.838
df = 2, p = 0.242

[Bar graph: pet ownership (no pets, dogs, cats) by gender.]

9.45. The missing entries can be seen in the “Other” column of the Minitab output below; they are found by subtracting the engineering, management, and liberal arts counts from each row total. A bar graph can show the conditional distribution of transfer area for each initial major; for example, of those initially majoring in biology, 13/398 ≈ 3.27% transferred to engineering, 25/398 ≈ 6.28% transferred to management, and so on. The relationship is significant (X² ≈ 50.53, df = 9, P < 0.0005). The largest contributions to X² come from chemistry or physics to engineering and biology to liberal arts (more transfers than expected) and biology to engineering and chemistry to liberal arts (fewer transfers than expected).

[Bar graph: transfer area (engineering, management, liberal arts, other) by initial major.]

Minitab output
Eng Mgmt LA Other Total
Bio 13 25 158 202 398
25.30 34.56 130.20 207.95
Chem 16 15 19 64 114
7.25 9.90 37.29 59.56
Math 3 11 20 38 72
4.58 6.25 23.55 37.62
Phys 9 5 14 33 61
3.88 5.30 19.96 31.87
Total 41 56 211 337 645
ChiSq = 5.979 + 2.642 + 5.937 + 0.170 +
10.574 + 2.630 + 8.973 + 0.331 +
0.543 + 3.608 + 0.536 + 0.004 +
6.767 + 0.017 + 1.777 + 0.040 = 50.527
df = 9, p = 0.000

9.46. Note that the given counts actually form a three-way table (classified by adhesive, side, and checks). Therefore, this analysis should not be done as if the counts come from a 2 × 4 two-way table; for one thing, no conditional distribution will answer the question of interest (how to avoid face checks). Nonetheless, many students may do this analysis, for which they will find X² = 6.798, df = 3, and P = 0.079.
A better approach is to rearrange the table as shown below. The conditional distributions across the rows will then give us information about avoiding face checks; the graph described below illustrates this. We find X² ≈ 45.08, df = 3, and P < 0.0005, so we conclude that the appearance of face checks is related to the adhesive/side combination—specifically, we recommend the PVA/tight combination.

                Face checks
               No      Yes
PVA/loose      10       54
PVA/tight      44       20
UF/loose       21       43
UF/tight       37       27
Another approach (not quite as good as the previous one) is to perform two separate analyses—say, one for loose side, and one for tight side. These computations show that UF is better than PVA for loose side (X² ≈ 5.151, df = 1, P ≈ 0.023), but there is no significant difference for tight side (X² ≈ 1.647, df = 1, P ≈ 0.200). We could also do separate analyses for PVA (X² ≈ 37.029, df = 1, P < 0.0005) and UF (X² ≈ 8.071, df = 1, P ≈ 0.005), from which we conclude that for either adhesive, the tight side has fewer face checks. (Minitab output below.)

Minitab output
         NoChk     Chk    Total
PVA-L      10       54       64
          28.00    36.00
PVA-T      44       20       64
          28.00    36.00
UF-L       21       43       64
          28.00    36.00
UF-T       37       27       64
          28.00    36.00
Total     112      144      256
ChiSq = 11.571 + 9.000 +
         9.143 + 7.111 +
         1.750 + 1.361 +
         2.893 + 2.250 = 45.079
df = 3, p = 0.000

[Bar graph: percent with face checks for each adhesive/side combination.]

Minitab output
––––––– Loose side –––––––
        NoChk     Chk    Total
PVA       10       54       64
         15.50    48.50
UF        21       43       64
         15.50    48.50
Total     31       97      128
ChiSq = 1.952 + 0.624 +
        1.952 + 0.624 = 5.151
df = 1, p = 0.023

––––––– Tight side –––––––
        NoChk     Chk    Total
PVA       44       20       64
         40.50    23.50
UF        37       27       64
         40.50    23.50
Total     81       47      128
ChiSq = 0.302 + 0.521 +
        0.302 + 0.521 = 1.647
df = 1, p = 0.200

–––––––– PVA ––––––––
        NoChk     Chk    Total
Loose     10       54       64
         27.00    37.00
Tight     44       20       64
         27.00    37.00
Total     54       74      128
ChiSq = 10.704 + 7.811 +
        10.704 + 7.811 = 37.029
df = 1, p = 0.000

–––––––– UF ––––––––
        NoChk     Chk    Total
Loose     21       43       64
         29.00    35.00
Tight     37       27       64
         29.00    35.00
Total     58       70      128
ChiSq = 2.207 + 1.829 +
        2.207 + 1.829 = 8.071
df = 1, p = 0.005
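The rearranged 4 × 2 analysis is straightforward in Python (NumPy/SciPy assumed):

import numpy as np
from scipy.stats import chi2_contingency

# Adhesive/side combination vs. face checks (Exercise 9.46).
table = np.array([[10, 54],   # PVA/loose: no check, check
                  [44, 20],   # PVA/tight
                  [21, 43],   # UF/loose
                  [37, 27]])  # UF/tight
x2, p, df, _ = chi2_contingency(table)
print(f"X2 = {x2:.3f}, df = {df}, P = {p:.2e}")   # X2 = 45.079, df = 3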

9.47. The Minitab output below shows the 2 × 2 table and significance test details: X² = 852.433, df = 1, P < 0.0005. Using z = −29.2, computed in the solution to Exercise 8.99(c), this equals z² (up to rounding).

Minitab output
          Mex-Am       Other      Total
Juror        339         531        870
           688.25      181.75
Not       143272       37393     180665
        142922.75    37742.25
Total     143611       37924     181535
ChiSq = 177.226 + 671.122 +
          0.853 +   3.232 = 852.433
df = 1, p = 0.000

9.48. (a) The bar graph described below shows how parental assessment of URIs compares for the two treatments. Note that parental assessment data were apparently not available for all URIs: We have assessments for 329 echinacea URIs and 367 placebo URIs. Minitab output gives X² = 2.506, df = 2, P = 0.286, so treatment is not significantly associated with parental assessment. (b) If we divide each echinacea count by 337 and each placebo count by 370, we obtain the table of proportions below, illustrated in the second bar graph described below. (c) The only significant results are for rash (z = 2.74, P = 0.0061), drowsiness (z = 2.09, P = 0.0366), and other (z = 2.09, P = 0.0366). A 10 × 2 table would not be appropriate, because each URI could have multiple adverse events. (d) All results are unfavorable to echinacea, so in this situation we are not concerned that we have falsely concluded that there are differences. In general, when we perform a large number of significance tests and find a few to be significant, we should be concerned that the significant results may simply be due to chance.

Minitab output
         Echin    Placebo    Total
Mild      153       170        323
        152.68    170.32
Mod       128       157        285
        134.72    150.28
Sev        48        40         88
         41.60     46.40
Total     329       367        696
ChiSq = 0.001 + 0.001 +
        0.335 + 0.300 +
        0.985 + 0.883 = 2.506
df = 2, p = 0.286

[Bar graph: parental assessment (mild, moderate, severe) of URIs for each treatment.]

Event            p̂1        p̂2        z       P
Itchiness       0.0386    0.0189    1.57   0.1154
Rash            0.0712    0.0270    2.74   0.0061
“Hyper”         0.0890    0.0622    1.35   0.1756
Diarrhea        0.1128    0.0919    0.92   0.3595
Vomiting        0.0653    0.0568    0.47   0.6357
Headache        0.0979    0.0649    1.61   0.1068
Stomachache     0.1543    0.1108    1.71   0.0875
Drowsiness      0.1869    0.1297    2.09   0.0367
Other           0.1869    0.1297    2.09   0.0367
Any event       0.4510    0.3946    1.52   0.1290

[Bar graph: percent reporting each adverse event, echinacea versus placebo.]
(e) We would expect multiple observations on the same child to be dependent, so the
assumptions for our analysis are not satisfied. Examination of the data reveals that the
results for both groups are quite similar, so we are inclined to agree with the authors
that there are no statistically significant differences. (f) Student opinions about the
criticisms of this study will vary. The third criticism might be dismissed as sounding
like conspiracy-theory paranoia, but the other three address the way that echinacea was
administered; certainly we cannot place too much faith in a clinical trial if it turns out that
the treatments were not given properly!

9.49. The chi-square goodness of fit statistic is X² ≈ 3.7807 with df = 3, for which P > 0.25 (software gives 0.2861), so there is not enough evidence to conclude that this university’s distribution is different. The details of the computation are given in the table below; note that there were 210 students in the sample.

             Expected    Expected   Observed
             frequency   count      count       O − E      (O − E)²/E
Never          0.43        90.3        79       −11.3        1.4141
Sometimes      0.35        73.5        83         9.5        1.2279
Often          0.15        31.5        36         4.5        0.6429
Very often     0.07        14.7        12        −2.7        0.4959
                                      210                    3.7807

9.50. The chi-square goodness of fit statistic is X² ≈ 3.4061 with df = 4, for which P > 0.25 (software gives 0.4923), so we have no reason to doubt that the numbers follow a Normal distribution. The details of the computation are given in the table below. The table entries from Table A for −0.6, −0.1, 0.1, and 0.6 are (respectively) 0.2743, 0.4602, 0.5398, and 0.7257. Then, for example, the expected frequency in the interval −0.6 to −0.1 is 0.4602 − 0.2743 = 0.1859.

                    Expected    Expected   Observed
                    frequency   count      count       O − E      (O − E)²/E
z ≤ −0.6             0.2743      137.2       139        1.85        0.0250
−0.6 < z ≤ −0.1      0.1859       93.0       102        9.05        0.8811
−0.1 < z ≤ 0.1       0.0796       39.8        41        1.20        0.0362
0.1 < z ≤ 0.6        0.1859       93.0        78      −14.95        2.4045
z > 0.6              0.2743      137.2       140        2.85        0.0592
                                                                    3.4061

9.52. The chi-square goodness of fit statistic is X² = 5.50 with df = 4, for which 0.20 < P < 0.25 (software gives 0.2397), so we have no reason to doubt that the numbers follow this uniform distribution. The details of the computation are given in the table below.

                Expected    Expected   Observed
                frequency   count      count      O − E    (O − E)²/E
0 < x ≤ 0.2       0.2         100        114       14        1.96
0.2 < x ≤ 0.4     0.2         100         92       −8        0.64
0.4 < x ≤ 0.6     0.2         100        108        8        0.64
0.6 < x ≤ 0.8     0.2         100        101        1        0.01
0.8 < x < 1       0.2         100         85      −15        2.25
                                                              5.50

9.54. A P-value of 0.999 is suspicious because it means that there was an almost-perfect match between the observed and expected counts. (The table below shows how small X² must be in order to have a P-value of 0.999; recall that X² is small when the observed and expected counts are close.) We expect a certain amount of difference between these counts due to chance, and become suspicious if the difference is too small. In particular, when H0 is true, a match like this would occur only once in 1000 attempts; if there were 1000 students in the class, that might not be too surprising.

df       X²
 1    2 × 10⁻⁶
 2    0.0020
 3    0.0243
 4    0.0908
 5    0.2102
 6    0.3810
 7    0.5985
 8    0.8571
 9    1.1519
10    1.4787

9.55. (a) Each quadrant accounts for one-fourth of the area, so we expect it to contain one-fourth of the 100 trees. (b) Some random variation would not surprise us; we no more expect exactly 25 trees per quadrant than we would expect to see exactly 50 heads when flipping a fair coin 100 times. (c) The table below shows the individual computations, from which we obtain X² = 10.8, df = 3, and P = 0.0129. We conclude that the distribution is not random.

Observed   Expected   (o − e)²/e
   18         25         1.96
   22         25         0.36
   39         25         7.84
   21         25         0.64
  100                   10.8
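A one-line software check (NumPy/SciPy assumed):

import numpy as np
from scipy.stats import chisquare

# Are the 100 trees spread uniformly over the four quadrants?
x2, p = chisquare(np.array([18, 22, 39, 21]), f_exp=[25, 25, 25, 25])
print(f"X2 = {x2:.1f}, P = {p:.4f}")   # X2 = 10.8, P = 0.0129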
Chapter 10 Solutions

10.1. The given model was µ y = 43.4+2.8x, with standard deviation σ = 4.3. (a) The slope is
2.8. (b) When x increases by 1, µ y increases by 2.8. (Or equivalently, if x increases by 2,
µ y increases by 5.6, etc.) (c) When x = 7, µ y = 43.4+2.8(7) = 63. (d) Approximately 95%
of observed responses would fall in the interval µ y ± 2σ = 63 ± 2(4.3) = 63 ± 8.6 = 54.4 to
71.6.

10.2. The regression equation given in Example 10.3 was B̂MI = 29.578 − 0.655 PA. (a) With physical activity x = 9.5 thousand steps/day, the estimated average BMI is 23.36 kg/m². (b) The residual is 22.8 − 23.36 = −0.56. (c) Based on the scatterplot (Figure 10.3 in the text), the regression equation is appropriate for 7000 or 12,000 steps/day, but using it for 2000 or 17,000 would be extrapolation, made additionally risky by the suggestion of a curved relationship (Figure 10.5).

10.3. Example 10.5 gives the confidence interval −0.969 to −0.341 for the slope β1. Recall that slope is the change in y (i.e., B̂MI) when x (i.e., PA) changes by +1. (a) If PA increases by 1, we expect B̂MI to change by β1, so the 95% confidence interval for the change is −0.969 to −0.341—that is, a decrease of 0.341 to 0.969 kg/m². (b) If PA decreases by 1, we expect B̂MI to change by −β1, so the 95% confidence interval for the change is an increase of 0.341 to 0.969 kg/m². (c) If PA increases by 0.5, we expect B̂MI to change by 0.5β1, so the 95% confidence interval for the change is a decrease of 0.1705 to 0.4845 kg/m².

10.4. The given prediction interval is 16.4 to 31.0 kg/m². This interval is 2t* SE ŷ units wide, where t* ≈ 1.9845 for df = 98. Therefore, SE ŷ ≈ 3.68. By examining Figure 10.8, we can judge that the prediction intervals for x = 9 and x = 10 are roughly the same width, so the standard errors should be roughly the same.
Note: For the first question, students might take t* = 2, in which case SE ŷ ≈ 3.65. For the second question, when x changes from 9 to 10, SE ŷ increases by about 0.006, so it is quite reasonable to say they are about the same.

10.5. (a) The plot (described below) suggests a linear increase. (b) The regression equation is ŷ = −4566.24 + 2.3x. (c) The fitted values and residuals are given in the table below. Squaring the residuals and summing gives 0.952, so the standard error is:

s = √(0.952/(n − 2)) = √0.3173 ≈ 0.5633

(d) Given x (the year), spending comes from a N(µy, σ) distribution, where µy = β0 + β1x. The estimates of β0, β1, and σ are b0 = −4566.24, b1 = 2.3, and s ≈ 0.5633. (e) We first note that x̄ = 2005 and Σ(xi − x̄)² = 10, so SEb1 = s/√10 ≈ 0.1781. We have df = n − 2 = 3, so t* = 3.182, and the 95% confidence interval for β1 is b1 ± t* SEb1 = 2.3 ± 0.5667 = 1.733 to 2.867. This gives the rate of increase of R&D spending: between 1.733 and 2.867 billion dollars per year.

         Spending      Fitted
Year    ($billions)    values     Residuals
2003       40.1         40.66       −0.56
2004       43.3         42.96        0.34
2005       45.8         45.26        0.54
2006       47.7         47.56        0.14
2007       49.4         49.86       −0.46

[Scatterplot: spending (billions of dollars) versus year, 2003–2007, with the fitted line.]
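The fit in part (b) and the slope standard error in part (e) can be reproduced with scipy.stats.linregress; a minimal Python sketch (NumPy/SciPy assumed):

import numpy as np
from scipy.stats import linregress

# Least-squares fit of R&D spending on year (Exercise 10.5).
year = np.array([2003, 2004, 2005, 2006, 2007])
spending = np.array([40.1, 43.3, 45.8, 47.7, 49.4])
fit = linregress(year, spending)
resid = spending - (fit.intercept + fit.slope * year)
s = np.sqrt(np.sum(resid**2) / (len(year) - 2))   # regression std. error
print(fit.slope, fit.intercept, round(s, 4), round(fit.stderr, 4))
# 2.3, -4566.24, 0.5633, 0.1781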

10.6. (a) The variables x and y are reversed: Slope gives the change in y for a change in x. (b) The population regression line has intercept β0 and slope β1 (not b0 and b1). (c) The estimate µ̂y = b0 + b1x* is more accurate when x* is close to x̄, so the width of the confidence interval grows with |x* − x̄|.

10.7. (a) The parameters are β0 , β1 , and σ ; b0 , b1 , and s are the estimates of those
parameters. (b) H0 should refer to β1 (the population slope) rather than b1 (the estimated
slope). (c) The confidence interval will be narrower than the prediction interval because the
confidence interval accounts only for the uncertainty in our estimate of the mean response,
while the prediction interval must also account for the random error of an individual
response.

10.8. The table below gives two sets of answers: those found with critical values from
Table D, and those found with software. The approach taken only makes a noticeable
difference in part (c), where (with Table D) we take df = 80 rather than df = 98. In each
case, the margin of error is t ∗ SEb1 = 0.58t ∗ , with df = n − 2.

                   Table D                            Software
      df    b1     t*       Interval                  t*       Interval
(a)   23    1.1   2.069    −0.1000 to 2.3000         2.0687   −0.0998 to 2.2998
(b)   23    2.1   2.069     0.9000 to 3.3000         2.0687    0.9002 to 3.2998
(c)   98    1.1   1.990    −0.0542 to 2.2542         1.9845   −0.0510 to 2.2510

10.9. The test statistic is t = b1 /SEb1 = b1 /0.58, with df = n − 2. The tests for parts (a) and
(c) are not quite significant at the 5% level, while the test for part (b) is highly significant.
This is consistent with the confidence intervals from the previous exercise.

df b1 t P (Table D) P (software)
(a) 23 1.1 1.90 0.05 < P < 0.10 0.0705
(b) 23 2.1 3.62 0.001 < P < 0.002 0.0014
(c) 98 1.1 1.90 0.05 < P < 0.10* 0.0608
*Note that for (c), if we use Table D, we take df = 80.
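These P-values follow directly from the t survival function; a minimal Python sketch (SciPy assumed):

from scipy.stats import t

# Two-sided P-values for t = b1/SE(b1), with SE = 0.58 (Exercise 10.9).
for df, b1 in [(23, 1.1), (23, 2.1), (98, 1.1)]:
    t_stat = b1 / 0.58
    print(f"df = {df}: t = {t_stat:.2f}, P = {2 * t.sf(t_stat, df):.4f}")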

10.10. (a) The plot (described below) shows a strong linear relationship with no striking outliers. (b) The regression line (shown on the plot) is ŷ = 1133 + 1.6924x. (c) The residual plot (also described below) shows no clear cause for concern. (d) A stemplot (shown below) or histogram shows two large residuals (one positive, one negative). (e) To test for a relationship, we test H0: β1 = 0 versus Ha: β1 ≠ 0 (or equivalently, use ρ in place of β1). (f) The test statistic and P-value are given in the Minitab output below: t = 10.55, P < 0.0005. We have strong evidence of a non-zero slope.

−2 7
−2
−1 876
−1 11
−0 966
−0 433330
 0 0011344
 0 66789
 1 02234
 1
 2
 2 5
Minitab output: Regression of 2008 tuition on 2000 tuition
The regression equation is y2008 = 1133 + 1.69 y2000
Predictor Coef Stdev t-ratio p
Constant 1132.8 701.4 1.61 0.116
y2000 1.6924 0.1604 10.55 0.000
s = 1134 R-sq = 78.2% R-sq(adj) = 77.5%
[Scatterplot: 2008 tuition and fees vs. 2000 tuition and fees ($1000), with regression line; residual plot: residual ($1000) vs. 2000 tuition and fees.]

10.11. (a) From the Minitab output above, we have SEb1 ≈ 0.1604. With df = 30, t* = 2.042, so the 95% confidence interval for β1 is 1.6924 ± t* SEb1 = 1.3649 to 2.0199. This slope means that a $1 difference in tuition in 2000 changes 2008 tuition by between $1.36 and $2.02. (It might be easier to understand expressed like this: If the costs of two schools differed by $1000 in the year 2000, then in 2008, they would differ by between $1365 and $2020.) (b) Regression explains r² = 78.2% of the variation in 2008 tuition. (c) When x = 5100, the estimated 2008 tuition is ŷ = 1133 + 1.6924(5100) ≈ $9764. (d) When x = 8700, the estimated 2008 tuition is ŷ = 1133 + 1.6924(8700) ≈ $15,857. (Software reports $15,856; the difference is due to rounding.) (e) The 2000 tuition at Stat U is similar to others in the data set, while Moneypit U was considerably more expensive in 2000, so that prediction requires extrapolation.

Minitab output: Estimates for Stat U and Moneypit U


Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
9764 245 ( 9264, 10263) ( 7397, 12130)
15856 749 ( 14329, 17384) ( 13084, 18628) X
X denotes a row with very extreme X values

10.12. (a) β0 is the population intercept, 0.8. This says that the mean overseas return is 0.8%
when the U.S. return is 0%. (b) β1 is the population slope, 0.46. This says that when the
U.S. return changes by 1%, the mean overseas return changes by 0.46%. (c) The full model
is yi = 0.8 + 0.46xi + εi, where yi and xi are observed overseas and U.S. returns in a given year, and εi are independent N(0, σ) variables. The residual terms εi allow for variation in overseas returns when U.S. returns remain the same.

10.13. (a) The regression equation is ŷ = −0.0127 + 0.0180x, and r² = 80.0%. Not surprisingly, we find that BAC increases as beer consumption increases; the relationship is quite strong, with beer consumption explaining 80% of the variation in BAC. (b) To test H0: β1 = 0 versus Ha: β1 > 0, we find t = 7.48 and P < 0.0001. There is very strong evidence that drinking more beers increases BAC. (c) The predicted mean BAC for x = 5 beers is 0.07712; the 90% prediction interval is 0.040 to 0.114. Steve might be safe, but cannot be sure that his BAC will be below 0.08.

[Scatterplot: blood alcohol content vs. number of beers, with regression line.]

Note: We use a prediction interval (rather than a confidence interval) because we want a range of values for an individual BAC after 5 beers, rather than the mean BAC.

Minitab output: Regression of BAC on beer consumption


The regression equation is BAC = - 0.0127 + 0.0180 Beers
Predictor Coef Stdev t-ratio p
Constant -0.01270 0.01264 -1.00 0.332
Beers 0.017964 0.002402 7.48 0.000
s = 0.02044 R-sq = 80.0% R-sq(adj) = 78.6%
Fit Stdev.Fit 90.0% C.I. 90.0% P.I.
0.07712 0.00513 ( 0.06808, 0.08616) ( 0.03999, 0.11425)
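The prediction interval can be rebuilt from the reported quantities; a Python sketch (SciPy assumed; that the original data set had n = 16 observations, hence df = 14, is my assumption):

import numpy as np
from scipy.stats import t

# 90% prediction interval for BAC after 5 beers, from Minitab's
# fitted value, SE of the fit, and regression standard error s.
fit, se_fit, s, df = 0.07712, 0.00513, 0.02044, 14
se_pred = np.sqrt(s**2 + se_fit**2)     # prediction standard error
t_star = t.ppf(0.95, df)                # ~1.761 for a 90% interval
print(fit - t_star * se_pred, fit + t_star * se_pred)   # ~0.040, 0.114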

10.14. (a) One issue is that correlation coefficients assume a linear association between two variables; if two variables x and y are whole numbers, the only linear relationships between them would have correlation 1 or −1. (b) To test H0: ρ = 0 versus Ha: ρ ≠ 0, we compare t = r√(n − 2)/√(1 − r²) (with n = 98) to a t(96) distribution. The table below shows t and P; entries marked * or ** are significant at the α = 0.05 level. (c) Entries marked ** are significant at 0.05/15 ≈ 0.0033; that eliminates three significant associations, although two (ASSESS/XTRA and ASSESS/HAND) were barely eliminated, and perhaps should not be completely dismissed. (d) We might hesitate to apply the results more broadly because of the artificial setting of these interviews, and the fact that the subjects were undergraduate students.

          XTRA              AGREE             HAND               EYE                DRESS
AGREE   4.53 (0.0000)**
HAND    2.32 (0.0227)*    0.49 (0.6249)
EYE     1.79 (0.0761)     1.39 (0.1692)    14.04 (0.0000)**
DRESS   1.69 (0.0942)     1.08 (0.2809)     4.53 (0.0000)**    4.67 (0.0000)**
ASSESS  2.86 (0.0052)*    1.28 (0.2020)     2.97 (0.0038)*     3.19 (0.0019)**    1.49 (0.1404)
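The t statistics in this table come from the formula above; a small Python helper (SciPy assumed; the r value in the example call is hypothetical, just to show usage):

import numpy as np
from scipy.stats import t

def corr_test(r, n=98):
    # t statistic and two-sided P-value for H0: rho = 0
    t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
    return t_stat, 2 * t.sf(abs(t_stat), n - 2)

print(corr_test(0.42))   # hypothetical r, illustration only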

10.15. (a) Both distributions are sharply right-skewed. Five-number summaries and descriptions of the histograms are below. (b) Regression does not require that the variables x and y be Normal; it is the errors (the deviation from the line) that should be Normal. (c) The scatterplot (described below) uses incentive pay as the explanatory variable; there is a weak positive linear relationship. (d) The regression equation is ŷ = 6.247 + 0.1063x. (Not surprisingly, regression only explains 15.3% of the variation in rating.) (e) Residual analysis might include a histogram or stemplot, a plot of residuals versus incentive pay, and perhaps a Normal quantile plot; the first two of these items are described below. The residuals are slightly right-skewed. In addition, we note that for incentive pay less than about 30%, most residuals are greater than about −7, but extend up to +15.

[Scatterplot: overall rating vs. incentives as percent of salary.]

[Histograms: incentives as percent of salary and overall rating, both sharply right-skewed.]

                                   Min     Q1       M       Q3      Max
Incentives as percent of salary     0     0.306    1.43    17.65   85.01
Overall player rating               0     2.25     6.31    12.69   27.88

Minitab output: Regression of rating on salary incentive percentage


The regression equation is Rating = 6.25 + 0.106 Percent
Predictor Coef Stdev t-ratio p
Constant 6.2469 0.4816 12.97 0.000
Percent 0.10634 0.01767 6.02 0.000
s = 5.854 R-sq = 15.3% R-sq(adj) = 14.8%

[Histogram of residuals; plot of residuals vs. incentives as percent of salary.]

10.16. (a) The regression equation is ŷ = 2.2043 + 0.02084x (Minitab output below). (b) A histogram of the residuals and a plot of residuals versus incentive pay (described below) show no clear reasons for concern; in particular, the distribution of residuals is considerably less skewed than those in Exercise 10.15. The residual spread is also more consistent for low incentive pay. (c) The estimated slope is b1 ≈ 0.02084 with SEb1 ≈ 0.003419. Whether we use df = 201 (and software) or df = 100 (and Table D), the 95% confidence interval is b1 ± t* SEb1 = 0.0141 to 0.0276. Therefore, if the incentive portion of salary increases by 1%, sqrt(rating) increases between 0.0141 and 0.0276. (d) The predicted ratings are shown in the table below. (e) Comparing the two models’ predictions, we see that the models give similar estimates for high incentive pay, but the square-root model is lower for low incentive pay. (f) Based on residuals, this model seems to be better. (There is no clear reason to form a preference based on predicted values.)

[Histogram of residuals (left) and plot of residuals vs. incentives as percent of salary
(right)]
Incentive       Predicted rating
pay        Model #1 (10.15)   Model #2 (10.16)
 0%             6.25               4.86
20%             8.37               6.87
40%            10.50               9.23
60%            12.63              11.94
80%            14.75              14.99

[Plot: predicted rating vs. incentives as percent of salary for Model #1 and Model #2]

Minitab output: Regression of sqrt(rating) on incentive pay


The regression equation is SqrtRate = 2.20 + 0.0208 Percent
Predictor Coef Stdev t-ratio p
Constant 2.20428 0.09321 23.65 0.000
Percent 0.020842 0.003419 6.10 0.000
s = 1.133 R-sq = 15.6% R-sq(adj) = 15.2%
Predicting sqrt(rating) for 0%, 20%, 40%, 60%, 80%
Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
2.2043 0.0932 ( 2.0205, 2.3881) ( -0.0377, 4.4463)
2.6211 0.0819 ( 2.4595, 2.7827) ( 0.3808, 4.8614)
3.0380 0.1187 ( 2.8038, 3.2721) ( 0.7913, 5.2847)
3.4548 0.1756 ( 3.1085, 3.8011) ( 1.1937, 5.7159)
3.8716 0.2386 ( 3.4011, 4.3422) ( 1.5882, 6.1551)
Predicting rating for 0%, 20%, 40%, 60%, 80%
Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
6.247 0.482 ( 5.297, 7.197) ( -5.337, 17.831)
8.374 0.423 ( 7.539, 9.209) ( -3.201, 19.949)
10.500 0.613 ( 9.291, 11.710) ( -1.108, 22.108)
12.627 0.907 ( 10.838, 14.416) ( 0.945, 24.310)
14.754 1.233 ( 12.323, 17.185) ( 2.956, 26.552)
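
The slope interval in part (c) and the back-transformed predictions in part (d) can be
reproduced without Minitab; a minimal sketch, assuming only the estimates quoted above:

Python sketch:
from scipy import stats

b1, se_b1, df = 0.02084, 0.003419, 201
t_star = stats.t.ppf(0.975, df)              # about 1.9716
print(f"95% CI for slope: {b1 - t_star * se_b1:.4f} to {b1 + t_star * se_b1:.4f}")

# Predictions on the sqrt scale square back to the rating scale:
for pct in (0, 20, 40, 60, 80):
    fit_sqrt = 2.2043 + 0.02084 * pct
    print(f"{pct:2d}%: predicted rating {fit_sqrt ** 2:5.2f}")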

10.17. (a) 22 of these 30 homes sold for more than their assessed values. This was “an SRS
of 30 properties,” so it should be reasonably representative, so the larger population should
be similar (or at least it was at the time of the sample). (b) The scatterplot (below, left)
shows a moderately strong linear association. (c) The regression line ŷ = 21.50 + 0.9468x
is included on the scatterplot. (d) The plot of residuals versus assessed value (below, right)
shows no obvious unusual features. The house with the highest assessed value (which also
stands out in the original scatterplot) may be influential. (e) A stemplot (shown here) or
histogram looks reasonably Normal, although there are two high residuals that stand apart
from the rest. (f) There are no clear violations of the assumptions—at least, none severe
enough to cause too much concern.

Stemplot of residuals:
−3 3
−2 6543
−1 9422
−0 984430
 0 1134578
 1 1236
 2 26
 3
 4 17

[Plots: sales price ($1000) vs. assessed value ($1000) with regression line (left);
residuals vs. assessed value (right)]

Minitab output: Regression of sales price on assessed value


The regression equation is SalesPrc = 21.5 + 0.947 Assessed
Predictor Coef Stdev t-ratio p
Constant 21.50 15.28 1.41 0.170
Assessed 0.94682 0.08064 11.74 0.000
s = 19.73 R-sq = 83.1% R-sq(adj) = 82.5%

10.18. (a) and (b) The table below gives the predicted selling prices and residuals for these
three houses. (c) From the Minitab output in the previous solution, we have b0 = 21.50,
SEb0 = 15.28, b1 = 0.94682, and SEb1 = 0.08064. For 95% confidence with df = 28, we
have t* = 2.048, so the intervals are −9.79 to 52.79 (intercept) and 0.78 to 1.11 (slope).
(d) The two confidence intervals in part (c) include the values 0 (for the intercept) and 1
(for the slope), so we cannot reject y = x as unreasonable.

Assessed value   Selling price   Predicted price   Residual
     155             142.9           168.26         −25.36
     220             224.0           229.80          −5.80
     285             286.0           291.34          −5.34

Note: In Exercise 10.17(a), we noted that 22 of the 30 homes in the sample sold for
more than their assessed value, and some students might fall back on that to answer part (d)
of this question. However, the confidence intervals in part (c) suggest that we do not have
enough evidence to reject the null hypothesis that the model is y = x.
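
All of the numbers above follow from the quoted Minitab estimates; a sketch of the
computation (Python/SciPy):

Python sketch:
from scipy import stats

b0, se_b0 = 21.50, 15.28
b1, se_b1 = 0.94682, 0.08064
t_star = stats.t.ppf(0.975, df=28)       # about 2.048

for name, est, se in [("intercept", b0, se_b0), ("slope", b1, se_b1)]:
    print(f"{name}: {est - t_star * se:.2f} to {est + t_star * se:.2f}")

for assessed, price in [(155, 142.9), (220, 224.0), (285, 286.0)]:
    pred = b0 + b1 * assessed
    print(f"assessed {assessed}: predicted {pred:.2f}, residual {price - pred:.2f}")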

10.19. (a) The plot (below, left) is roughly linear and increasing. The number of tornadoes
in 2004 (1819) is noticeably high, as is the 2008 count (1691) to a lesser extent. (b) The
regression equation is ŷ ≈ −28,438 + 14.82x; both the slope and intercept are significantly
different from 0. In the Minitab output below, we see SEb1 ≈ 1.463. With t* = 2.0049 for
df = 54, the confidence interval for β1 is b1 ± t* SEb1 = 14.82 ± 2.93 = 11.89 to 17.76
tornadoes per year. (c) In the plot (below, right), we see that the scatter might be greater in
recent years, and the 2004 residual is particularly high. (d) Based on a stemplot (shown
here), the 2004 residual is an outlier; the other residuals appear to be roughly Normal.
(e) Without the 2004 count, the regression equation is ŷ ≈ −26,584 + 13.88x. The
estimated slope decreases by almost one tornado per year.

Stemplot of residuals:
−3 520
−2 931
−1 99843310
−0 9887654443211110
 0 001223556778
 1 001224
 2 011789
 3 6
 4
 5 5

[Plots: tornado count vs. year with regression line (left); residuals vs. year (right)]

Minitab output: Regression of tornado count on year


The regression equation is Count = - 28438 + 14.8 Year
Predictor Coef Stdev t-ratio p
Constant -28438 2897 -9.82 0.000
Year 14.822 1.463 10.13 0.000
s = 176.9 R-sq = 65.5% R-sq(adj) = 64.9%
Regression with 2004 count removed
The regression equation is Count2 = - 26584 + 13.9 Year2
Predictor Coef Stdev t-ratio p
Constant -26584 2680 -9.92 0.000
Year2 13.881 1.354 10.26 0.000
s = 160.5 R-sq = 66.5% R-sq(adj) = 65.9%

10.20. (a) The scatterplot shows a fairly strong positive linear association, with no extreme
outliers, so regression seems to be appropriate. (b) The regression equation (shown on the
scatterplot) is ŷ = 11.81 + 0.7754x. (c) Student summaries will vary. The mpg values are
certainly similar, but one notable difference is that all but three of the computer values are
higher than the driver’s values, and the mean computer mpg is about 2.7 mpg higher than
the mean driver mpg. Additionally, the slope of the regression line is about 0.78, meaning
that (on average) a 1 mpg change in the driver’s value corresponds to a 0.78 mpg change
for the computer. The intercept, however, is about 11.8 mpg, suggesting that the computer’s
value is generally higher when the driver’s value is small.

[Scatterplot: computer’s mpg vs. driver’s mpg with regression line]

Minitab output: Regression of computer mpg on driver mpg


The regression equation is Computer = 11.8 + 0.775 Driver
Predictor Coef Stdev t-ratio p
Constant 11.812 5.432 2.17 0.043
Driver 0.7754 0.1335 5.81 0.000
s = 2.676 R-sq = 65.2% R-sq(adj) = 63.3%

10.21. (a) About r² = 8.41% of the variability in AUDIT score is explained by (a linear
regression on) gambling frequency. (b) With r = 0.29 and n = 908, the test statistic for
H0 : ρ = 0 versus Ha : ρ ≠ 0 is t = r√(n − 2)/√(1 − r²) ≈ 9.12 (df = 906), for which P
is very small. (c) Nonresponse is a problem because the students who did not answer might
have different characteristics from those who did. Because of this, we should be cautious
about considering these results to be representative of all first-year students at this
university, and even more cautious about extending these results to the broader population
of all first-year students.

10.22. (a) Stemplots are shown below. x (watershed area) is right-skewed; x̄ ≈ 28.2857 km²,
sx ≈ 17.7142 km². y (IBI) is left-skewed; ȳ ≈ 65.9388, sy ≈ 18.2796. (b) The scatterplot
(below, left) shows a weak positive association, with more scatter in y for small x.
(c) yi = β0 + β1 xi + εi , i = 1, 2, . . . , 49; the εi are independent N(0, σ) variables.
(d) The hypotheses are H0 : β1 = 0 versus Ha : β1 ≠ 0. (e) See the Minitab output below.
The regression equation is IBI = 52.92 + 0.4602 Area, and the estimated standard deviation
is s ≈ 16.53. For testing the hypotheses in part (d), t = 3.42 and P = 0.001. (f) The
residual plot (below, right) again shows that there is more variation for small x. (g) As we
can see from a stemplot and/or a Normal quantile plot (both below), the residuals are
somewhat left-skewed but otherwise seem reasonably close to Normal. (h) Student opinions
may vary. The two apparent deviations from the model are (i) a possible change in standard
deviation as x changes and (ii) possible non-Normality of error terms.

Area          IBI
0 2           2 99
0 5688999     3 233
1 0024        3 9
1 66889       4 13
2 111133      4 67
2 66667889    5 34
3 112244      5 556899
3 9           6 0124
4             6 7
4 799         7 11124
5 244         7 56889
5 789         8 001222344
6 8           8 556899
6             9 1
7 0

[Plots: IBI vs. watershed area (km²) with regression line (left); residuals vs. watershed
area (right)]

Minitab output: Regression of IBI on watershed area


The regression equation is IBI = 52.9 + 0.460 Area
Predictor Coef Stdev t-ratio p
Constant 52.923 4.484 11.80 0.000
Area 0.4602 0.1347 3.42 0.001
s = 16.53 R-sq = 19.9% R-sq(adj) = 18.2%

Stemplot of residuals:
−3 2200
−2 8
−2 42
−1 9665
−1 3
−0 8885
−0 433100
 0 223334
 0 666789
 1 022334
 1 6799
 2 0024
 2 5

[Normal quantile plot of residuals vs. Normal score]
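
The full analysis can be reproduced with any regression routine. A minimal sketch
(Python/SciPy) run on simulated data generated from the fitted model (the actual 49
observations are not reproduced here) shows the mechanics of obtaining b1, SEb1, t, and P:

Python sketch:
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
area = rng.uniform(2, 70, size=49)                       # stand-in watershed areas
ibi = 52.92 + 0.4602 * area + rng.normal(0, 16.53, 49)   # simulated under the fitted model

res = stats.linregress(area, ibi)
t = res.slope / res.stderr
print(f"b1 = {res.slope:.4f}, SE = {res.stderr:.4f}, t = {t:.2f}, P = {res.pvalue:.4f}")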

10.23. (a) The stemplot of percent forested is shown below; see the solution to the previous
exercise for the stemplot of IBI. x (percent forested) is right-skewed; x̄ = 39.3878%,
sx = 32.2043%. y (IBI) is left-skewed; ȳ = 65.9388, sy = 18.2796. (b) The scatterplot
(below, left) shows a weak positive association, with more scatter in y for small x.
(c) yi = β0 + β1 xi + εi , i = 1, 2, . . . , 49; the εi are independent N(0, σ) variables.
(d) The hypotheses are H0 : β1 = 0 versus Ha : β1 ≠ 0. (e) See the Minitab output below.
The regression equation is IBI = 59.91 + 0.1531 Forest, and the estimated standard
deviation is s ≈ 17.79. For testing the hypotheses in (d), t = 1.92 and P = 0.061. (f) The
residual plot (below, right) shows a slight curve—the residuals seem to be (very) slightly
lower in the middle and higher on the ends. (g) As we can see from a stemplot and/or a
Normal quantile plot (both below), the residuals are left-skewed. (h) Student opinions may
vary. The three apparent deviations from the model are (i) a possible change in standard
deviation as x changes, (ii) possible curvature of residuals, and (iii) possible
non-Normality of error terms.

Percent forested
 0 00000033789
 1 0014778
 2 125
 3 123339
 4 133799
 5 229
 6 38
 7 599
 8 069
 9 055
10 00

[Plots: IBI vs. percent forested with regression line (left); residuals vs. percent forested
(right)]

Minitab output: Regression of IBI on percent forested


The regression equation is IBI = 59.9 + 0.153 Forest
Predictor Coef Stdev t-ratio p
Constant 59.907 4.040 14.83 0.000
Forest 0.15313 0.07972 1.92 0.061
s = 17.79 R-sq = 7.3% R-sq(adj) = 5.3%

Stemplot of residuals:
−3 55
−3 4
−2 988
−2 0
−1 985
−1 2110
−0 99887
−0 410
 0 134
 0 557899
 1 01122333
 1 55678
 2 044
 2 78

[Normal quantile plot of residuals vs. Normal score]

10.24. The first model (using watershed area to predict IBI) is preferable because the
regression was significant (P = 0.001 versus P = 0.061) and explained a higher proportion
of the variation in IBI (19.9% versus 7.3%).

10.25. The precise results of these changes depend on which observation is changed. (There
are six observations which had 0% forest and two which had 100% forest.) Specifically, if
we change IBI to 0 for one of the first six observations, the resulting P-value is between
0.019 (observation 6) and 0.041 (observation 3). Changing one of the last two observations
changes the P-value to 0.592 (observation 48) or 0.645 (observation 49).
In general, the first change decreases P (that is, the relationship is more significant)
because it accentuates the positive association. The second change weakens the association,
so P increases (the relationship is less significant).

10.26. With the regression equation IBI = 52.92 + 0.4602 Area, the predicted mean response
when x = Area = 40 km² is µ̂y = IBI ≈ 71.33. While it is possible to find SEµ̂ and SEŷ
using the formulas from Section 10.2, we rely on the software output shown below.
(SEµ̂ ≈ 2.84, reported by Minitab as “Stdev.fit,” and SEŷ = √(s² + SEµ̂²) ≈ 16.77, where
s ≈ 16.53 was given in the Minitab output shown with the solution to Exercise 10.22. For
df = 47, the appropriate critical value is t* = 2.0117.) (a) The 95% confidence interval
for µy is 65.61 to 77.05. (b) The 95% prediction interval for a future response is 37.57 to
105.09. (c) Among many streams with watershed area 40 km², we estimate the mean IBI to
be between about 65.61 and 77.05. For an individual stream with watershed area 40 km²,
we expect its IBI to be between about 37.57 and 105.09. (d) We probably cannot reliably
apply these results elsewhere; it is likely that the particular characteristics of the Ozark
Highland region play some role in determining the regression coefficients.

Minitab output: Predicting IBI for watershed area = 40 km2


Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
71.33 2.84 ( 65.61, 77.05) ( 37.57, 105.09)
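
Given s, SEµ̂, and df, both intervals can also be computed by hand; a sketch under the
values quoted above:

Python sketch:
import math
from scipy import stats

fit, se_mu, s, df = 71.33, 2.84, 16.53, 47
t_star = stats.t.ppf(0.975, df)            # about 2.0117
se_pred = math.sqrt(s ** 2 + se_mu ** 2)   # about 16.77

print(f"95% CI for mean IBI:     {fit - t_star * se_mu:.2f} to {fit + t_star * se_mu:.2f}")
print(f"95% prediction interval: {fit - t_star * se_pred:.2f} to {fit + t_star * se_pred:.2f}")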

10.27. Using Area = 10 in the model IBI = 52.92 + 0.4602 Area from Exercise 10.22,
IBI ≈ 57.52. Using Forest = 63 in the model IBI = 59.91 + 0.1531 Forest from
Exercise 10.23, IBI ≈ 69.55. Both predictions have a lot of uncertainty; recall that r² was
fairly small for both models. Also note that the prediction intervals (shown below) are both
about 70 units wide.

Minitab output: Predicting IBI for watershed area = 10


Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
57.52 3.41 ( 50.66, 64.39) ( 23.55, 91.50)
Predicting IBI for percent forest = 63
Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
69.55 3.16 ( 63.19, 75.92) ( 33.20, 105.91)

10.28. (a) With all 60 points, the regression equation is ŷ = −34.55 + 0.8605x, s ≈ 20.17.
(This is the solid line in the scatterplot below.) The slope is significantly different from 0:
t = 4.82, P < 0.0005. (b) Without the four points from the bottom of the scatterplot, the
regression equation is ŷ = −33.40 + 0.8818x, s ≈ 15.18. (This is the dashed line in the
scatterplot.) The slope is again significantly different from 0: t = 6.57, P < 0.0005. With
the outliers removed, the line changes slightly; the most significant change is the decrease
in the estimated standard deviation s. This correspondingly makes t larger (i.e., b1 is more
significantly different from 0) and makes the regression line more useful for prediction
(r² increases from 28.9% to 44.4%). Of course, we should not arbitrarily remove data
points; more investigation is needed to determine why these students’ reading scores were
so much lower than we would expect based on their IQs.

[Scatterplot: reading score vs. IQ, with regression lines fit with (solid) and without
(dashed) the four outliers]

10.29. (a) Stemplots are shown below; both variables are right-skewed. For pure tones,
x̄ ≈ 106.20 and sx ≈ 91.76 spikes/second, and for monkey calls, ȳ ≈ 176.57 and
sy ≈ 111.85 spikes/second. (b) There is a moderate positive association; the third point
(circled in the scatterplot) has the largest residual; the first point (marked with a square) is
an outlier for tone response. (c) With all 37 points, CALL = 93.9 + 0.778 TONE and
s = 87.30; the test of β1 = 0 gives t = 4.91, P < 0.0001. (d) Without the first point,
ŷ = 101 + 0.693x, s = 88.14, t = 3.18. Without the third point, ŷ = 98.4 + 0.679x,
s = 80.69, t = 4.49. With neither, ŷ = 116 + 0.466x, s = 79.46, t = 2.21. The line
changes a bit, but always has a slope significantly different from 0.

Tone              Call           Residual
0 122222233444    0 4            −1 65
0 55556777        0 566667889    −1 3
1 0011244         1 011223334    −0 8876555
1 566778          1 5889999      −0 44444331100
2 24              2 0004          0 012334
2 5               2 7             0 667888
3                 3 0134          1 14
3                 3               1 7
4                 4 2             2 0
4 7               4 8
5 0

[Scatterplot: monkey call response vs. pure tone response (spikes/second)]

10.30. The model is yi = β0 + β1 xi + εi ; the εi are independent N(0, σ) random variables.
(a) β0 represents the fixed costs. (b) β1 represents how costs change as the number of
students changes. This should be positive because more students mean more expenses.
(c) The error term (εi ) allows for variation among equal-sized schools.

10.31. (a) The stemplots (below, left) are fairly symmetric. For x (MOE), x̄ ≈ 1,799,180
and sx ≈ 329,253; for y (MOR), ȳ ≈ 11,185 and sy ≈ 1980. (b) The plot (below, right)
shows a moderately strong, positive, linear relationship. Because we would like to predict
MOR from MOE, we should put MOE on the x axis. (c) The model is yi = β0 + β1 xi + εi ,
i = 1, 2, . . . , 32; the εi are independent N(0, σ) variables. The regression equation is
MOR = 2653 + 0.004742 MOE, s ≈ 1238. The slope is significantly different from 0:
t = 7.02 (df = 30), P < 0.0001. (d) Assumptions appear to be met: A stemplot of the
residuals shows one slightly low (not quite an outlier), but acceptable, and the plot of
residuals against MOE (not shown) does not suggest any particular pattern.

MOE        MOR            Residuals
11 6        6 3           −3 3
12 7        7             −2 15
13 55       8 3588        −2 14
14 1578     9 222         −1 6
15 5589    10 22356       −1 31110
16 14      11 223455799   −0 76555
17 2479    12 00777       −0 43221
18 447     13 469          0 00223
19 358     14 5            0 78
20 0348    15 3            1 1334
21 8                       1 599
22 1                       2 1
23 47
24
25 3

[Scatterplot: MOR (thousands) vs. MOE (millions)]

10.32. (a) The 95% confidence interval gives a range of values for the mean MOR of many
pieces of wood with MOE equal to 2,400,000. The prediction interval gives a range of
values for the MOR of one piece of wood with MOE equal to 2,400,000. (b) The prediction
interval will include more values because the confidence interval accounts only for the
uncertainty in our estimate of the mean response, while the prediction interval must also
account for the random error of an individual response. (c) With the regression equation
MOR = 2653 + 0.004742 MOE, the predicted mean response when x = MOE = 2,400,000
is µ̂y = MOR ≈ 14,034. The Minitab output below gives the two intervals, along with SEµ̂
(“Stdev.fit”).

Minitab output: Predicting MOR with MOE = 2,400,000


Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
14034 461 ( 13092, 14976) ( 11335, 16733)

10.33. (a) The scatterplot shows a weak negative association; the regression equation is
Bonds = 55.58 − 0.1769 Stocks, with s ≈ 54.55. (This is the solid line in the plot.)
(b) For testing H0 : β1 = 0 versus Ha : β1 ≠ 0, we have t = −1.66 (df = 22) and
P = 0.111. The slope is not significantly different from 0. (c) With the 2008 data removed,
the (dashed) regression line is Bonds = 69.46 − 0.2814 Stocks, with s ≈ 53.12. The slope
is now significantly different from 0 (t = −2.24, P = 0.036). (d) We should explore
whether something happened in 2008 that might explain why that point strayed from the
line. (The economy would seem to be a likely cause.)

[Scatterplot: cash flow into bonds vs. cash flow into stocks ($billions), with both
regression lines]

Minitab output: Regression of bond cash flow on stock cash flow


The regression equation is Bonds = 55.6 - 0.177 Stocks
Predictor Coef Stdev t-ratio p
Constant 55.58 14.98 3.71 0.001
Stocks -0.1769 0.1066 -1.66 0.111
s = 54.55 R-sq = 11.1% R-sq(adj) = 7.1%
Regression without 2008 data
The regression equation is Bonds = 69.5 - 0.281 Stocks
Predictor Coef Stdev t-ratio p
Constant 69.46 17.33 4.01 0.001
Stocks -0.2814 0.1254 -2.24 0.036
s = 53.12 R-sq = 19.3% R-sq(adj) = 15.5%

10.34. (a) The t statistic for testing H0 : β1 = 0 versus Ha : β1 ≠ 0 is t = b1/SEb1 =
0.72/0.38 ≈ 1.89 with df = 80. This has P ≈ 0.0617, so we do not reject H0 . (b) For the
one-sided alternative β1 > 0, we would have P ≈ 0.0309, so we could reject H0 at the 5%
significance level.
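
A two-line check of these P-values (Python/SciPy):

Python sketch:
from scipy import stats

t, df = 0.72 / 0.38, 80
print(f"t = {t:.3f}")
print(f"two-sided P = {2 * stats.t.sf(t, df):.4f}")   # about 0.0617
print(f"one-sided P = {stats.t.sf(t, df):.4f}")       # about 0.0309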

10.35. (a) Aside from the one high point (70 months of service, and wages 97.6801), there
is a moderate positive association—fairly clear but with quite a bit of scatter. (b) The
regression equation is WAGES = 43.383 + 0.07325 LOS, with s ≈ 10.21 (Minitab output
follows). The slope is significantly different from 0: t = 2.85 (df = 57), P = 0.006.
(c) Wages rise an average of 0.07325 wage units per month of service. (d) We have
b1 ≈ 0.07325 and SEb1 ≈ 0.02571. For a t distribution with df = 57, t* = 2.0025 for a
95% confidence interval, so the interval is 0.0218 to 0.1247.

[Scatterplot: wages (rescaled) vs. length of service (months)]

Minitab output: Regression of wages on length of service (outlier excluded)


The regression equation is wages = 43.4 + 0.0733 los
Predictor Coef Stdev t-ratio p
Constant 43.383 2.248 19.30 0.000
los 0.07325 0.02571 2.85 0.006
s = 10.21 R-sq = 12.5% R-sq(adj) = 10.9%
Regression of wages on length of service (outlier included)
The regression equation is wages = 44.2 + 0.0731 los
Predictor Coef Stdev t-ratio p
Constant 44.213 2.628 16.82 0.000
los 0.07310 0.03015 2.42 0.018
s = 11.98 R-sq = 9.2% R-sq(adj) = 7.6%

10.36. The table below summarizes the regression results with the outlier excluded, and those
with all points. Minitab output for both regressions is shown above. (a) The intercept and
slope estimates change very little, but the estimate of σ increases from 10.21 to 11.98.
(b) With the outlier, the t statistic decreases (because s has increased), and the P-value
increases slightly—although it is still significant at the 5% level. (c) The interval width
2t* SEb1 increases from 0.1030 to 0.1207—roughly the same factor by which s increased.
(Because the degrees of freedom change from 57 to 58, t* decreases from 2.0025 to 2.0017,
but the change in s has a much greater impact.)

b0 b1 s t P Interval width
Outlier excluded 43.383 0.07325 10.21 2.85 0.006 0.1030
All points 44.213 0.07310 11.98 2.42 0.018 0.1207

10.37. (a) The trend appears to be quite linear. (b) The regression equation is
Lean = −61.12 + 9.3187 Year with s ≈ 4.181. The regression explains r² = 98.8% of the
variation in lean. (c) The rate we seek is the slope. For df = 11 and 99% confidence,
t* = 3.1058, so the interval is 9.3187 ± (3.1058)(0.3099) = 8.3562 to 10.2812 tenths of a
millimeter per year.

[Scatterplot: lean (0.1 mm over 2.9 m) vs. year, with regression line]

Minitab output: Regression of lean on year
The regression equation is Lean = -61.1 + 9.32 Year
Predictor Coef Stdev t-ratio p
Constant -61.12 25.13 -2.43 0.033
Year 9.3187 0.3099 30.07 0.000
s = 4.181 R-sq = 98.8% R-sq(adj) = 98.7%

10.38. (a) ŷ = −61.12 + 9.3187(18) ≈ 107, for a prediction of 2.9107 m. (b) This is an
example of extrapolation—trying to make a prediction outside the range of given x-values.
Minitab reports that a 95% prediction interval for ŷ when x* = 18 is about 62.6 to 150.7.
The width of the interval is an indication of how unreliable the prediction is.
Note: Minitab’s “Stdev.Fit” value of 19.56 is SEµ̂, so SEŷ = √(s² + SEµ̂²) ≈ 20.00, which
agrees with the margin for the prediction interval: t* SEŷ = (2.201)(20.00) ≈ 44.02.

Minitab output: Predicting lean in 1918 (year = 18)


Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
106.62 19.56 ( 63.56, 149.68) ( 62.58, 150.65) XX
XX denotes a row with very extreme X values
10.39. (a) Use x = 112 (the number of years after 1900). (b) ŷ = −61.12 + 9.3187(112) ≈
983, for a prediction of 2.9983 m. (c) A prediction interval is appropriate because we are
interested in one future observation, not the mean of all future observations; in this
situation, it does not make sense to talk of more than one future observation. In the output
below, note that Minitab warns us of the risk of extrapolation.

Minitab output: Predicting lean in 2012 (year = 112)


Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
982.57 9.68 ( 961.27, 1003.88) ( 959.36, 1005.78) XX
XX denotes a row with very extreme X values

10.40. A negative association makes sense here: If the price of beer is above average, fewer
students can afford to drink, while more drinking happens when beer is cheaper.
Note: The fact that the correlation is relatively small indicates that the price of beer
is not a crucial factor in determining the prevalence of binge-drinking. In particular, a
straight-line relationship with the cost of beer only explains about r² ≈ 13% of the variation
in binge-drinking rates.

10.41. To test H0 : ρ = 0 versus Ha : ρ ≠ 0, we compute t = r√(n − 2)/√(1 − r²) ≈ −4.16.
Comparing this to a t distribution with df = 116, we find P < 0.0001, so we conclude the
correlation is different from 0.

10.42. (a) Scatterplot below, left. (b) Scatterplot below, right. (c) The regression equation is
ŷ = −827.66 + 0.4200x with s ≈ 0.2016. For 95% confidence with df = 6, t* = 2.4469, so
with b1 ≈ 0.4200 and SEb1 ≈ 0.006524, the confidence interval is 0.4041 to 0.4360.
Note: If students use a common logarithm (rather than a natural logarithm, as we have
done), everything would be multiplied by about 0.4343: The vertical scale on the graph
would be from 0 to about 6, the regression line would be ŷ = −359.45 + 0.1824x, and the
interval would be 0.1755 to 0.1893.

[Scatterplots: DRAM capacity (kbits × 10⁶) vs. year (left); log(DRAM capacity) vs. year
(right)]

Minitab output: Regression of log(kilobits) on year


The regression equation is logKbits = - 828 + 0.420 Year
Predictor Coef Stdev t-ratio p
Constant -827.66 12.99 -63.69 0.000
Year 0.420024 0.006524 64.38 0.000
s = 0.2016 R-sq = 99.9% R-sq(adj) = 99.8%
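
One useful way to read the slope of a natural-log regression is as an exponential growth
rate. The conversion below is an added interpretation (not part of the original solution)
that turns the fitted slope and its confidence interval into an annual growth factor and a
doubling time:

Python sketch:
import math

b1, lo, hi = 0.4200, 0.4041, 0.4360        # slope of ln(capacity) per year, with 95% CI
growth = math.exp(b1)                      # multiplicative growth per year, about 1.52
doubling = math.log(2) / b1                # years for capacity to double, about 1.65
ci = (math.log(2) / hi, math.log(2) / lo)  # CI for the doubling time
print(f"x{growth:.2f} per year; doubles every {doubling:.2f} years "
      f"(95% CI {ci[0]:.2f} to {ci[1]:.2f})")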

10.43. Recall that testing H0 : ρ = 0 versus Ha : ρ ≠ 0 is the same as testing H0 : β1 = 0
versus Ha : β1 ≠ 0. In the solution to Exercise 10.33, we had t = −1.66 (df = 22) and
P = 0.111, so we cannot reject H0 .

10.44. (a) With r = −0.19 and n = 713, we have t = r√(n − 2)/√(1 − r²) ≈ −5.16.
(b) Comparing to a t distribution with df = 711 (or anything reasonably close), the P-value
is less than 0.0001, so we conclude that ρ ≠ 0.

10.45. For linear regression, DFM = 1. Because DFT = DFM + DFE and
SST = SSM + SSE, we can find the missing degrees of freedom (DF) and sum of squares
(SS) entries on the Residual row by subtraction: DFE = 18 and SSE = 3995.4. The entries
in the mean square (MS) column are MSM = SSM/DFM = 4560.6 and
MSE = SSE/DFE ≈ 221.97. Finally, F = MSM/MSE ≈ 20.55.

Source       DF     SS       MS        F
Regression    1   4560.6   4560.6    20.55
Residual     18   3995.4    221.97
Total        19   8556.0

10.46. s = √MSE ≈ 14.8985 and r² = SSM/SST = 4560.6/8556.0 ≈ 0.5330.

10.47. As sx = √(Σ(xi − x̄)²/19) = 19.99%, we have √(Σ(xi − x̄)²) = sx√19 ≈ 87.1344%, so

    SEb1 = s/√(Σ(xi − x̄)²) ≈ 14.8985/87.1344 ≈ 0.1710

Alternatively, note that we have F = 20.55 and b1 = 0.775. Because t² = F, we know
that t ≈ 4.5332 (take the positive square root, because t and b1 have the same sign).
Then SEb1 = b1/t ≈ 0.1710. (Note that with this approach, we do not need to know that
sx = 19.99%.)
Finally, with df = 18, t* = 2.1009 for 95% confidence, so the 95% confidence interval is
0.775 ± 0.3592 = 0.4158 to 1.1342.
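
Exercises 10.45–10.47 chain together; a single sketch covering the arithmetic (Python/SciPy):

Python sketch:
import math
from scipy import stats

dft, dfm, sst, ssm = 19, 1, 8556.0, 4560.6
dfe, sse = dft - dfm, sst - ssm              # 18 and 3995.4
F = (ssm / dfm) / (sse / dfe)                # about 20.55

s, r2 = math.sqrt(sse / dfe), ssm / sst      # about 14.8985 and 0.5330
b1 = 0.775
se_b1 = b1 / math.sqrt(F)                    # since t^2 = F, SE = b1/t, about 0.1710
t_star = stats.t.ppf(0.975, dfe)             # about 2.1009
print(f"F = {F:.2f}, s = {s:.4f}, r2 = {r2:.4f}")
print(f"95% CI for slope: {b1 - t_star * se_b1:.4f} to {b1 + t_star * se_b1:.4f}")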

10.48. (a) With x̄ ≈ 80.9, sx ≈ 17.2, ȳ ≈ 43.5, sy ≈ 20.3, and r = 0.68, we find:

    b1 = r(sy/sx) = (0.68)(20.3/17.2) ≈ 0.8026
    b0 = ȳ − b1x̄ = 43.5 − (0.8026)(80.9) ≈ −21.4270

(Answers may vary slightly due to rounding.) The regression equation is therefore
GHP = −21.4270 + 0.8026 FVC. (b) Testing β1 = 0 is equivalent to testing ρ = 0, so the
test statistic is t = r√(n − 2)/√(1 − r²) ≈ 6.43 (df = 48), for which P < 0.0005. The
slope (correlation) is significantly different from 0.
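
The same computation in code form (the summary statistics are those quoted above):

Python sketch:
import math
from scipy import stats

xbar, sx = 80.9, 17.2        # FVC
ybar, sy = 43.5, 20.3        # GHP
r, n = 0.68, 50

b1 = r * sy / sx                                   # about 0.8026
b0 = ybar - b1 * xbar                              # about -21.43
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)   # about 6.43
p = 2 * stats.t.sf(t, n - 2)
print(f"GHP = {b0:.4f} + {b1:.4f} FVC;  t = {t:.2f}, P = {p:.2g}")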

10.49. Use the formula t = r√(n − 2)/√(1 − r²) with r = 0.6. For n = 20, t = 3.18 with
df = 18, for which the two-sided P-value is P = 0.0052. For n = 10, t = 2.12 with
df = 8, for which the two-sided P-value is P = 0.0667. With the larger sample size, r
should be a better estimate of ρ, so we are less likely to get r = 0.6 unless ρ is really
not 0.

10.50. Most of the small banks have negative residuals, while most large-bank residuals are
positive. This means that, generally, wages at large banks are higher, and small bank wages
are lower, than we would predict from the regression.

[Plot: residuals vs. length of service (months), with small and large banks marked
separately]

10.51. (a) Not surprisingly, there is a positive association between scores. The 47th pair of
scores (circled in the scatterplot) is an outlier—the ACT score (21) is higher than one
would expect for the SAT score (420). Since this SAT score is so low, this point may be
influential. No other points fall outside the pattern. (b) The regression equation is
ŷ = 1.626 + 0.02137x. The slope is significantly different from 0: t = 10.78 (df = 58),
for which P < 0.0005. (c) r = 0.8167.

[Scatterplot: ACT score vs. SAT score, with regression line]

Minitab output: Regression of ACT score on SAT score


The regression equation is ACT = 1.63 + 0.0214 SAT
Predictor Coef Stdev t-ratio p
Constant 1.626 1.844 0.88 0.382
SAT 0.021374 0.001983 10.78 0.000
s = 2.744 R-sq = 66.7% R-sq(adj) = 66.1%

10.52. (a) The means are identical (21.133). (b) For the observed ACT scores, sy = 4.714;
for the fitted values, sŷ = 3.850. (c) For z = 1, the SAT score is
x̄ + sx = 912.7 + 180.1 = 1092.8. The predicted ACT score is ŷ ≈ 25 (Minitab reports
24.983), which gives a standard score of about 1 (using the standard deviation of the
predicted ACT scores). (d) For z = −1, the SAT score is x̄ − sx = 912.7 − 180.1 = 732.6.
The predicted ACT score is ŷ ≈ 17.3 (Minitab reports 17.285), which gives a standard
score of about −1. (e) It appears that the standard score of the predicted value is the same
as the explanatory variable’s standard score. (See note below.)
Note: (a) This will always be true because Σŷi = Σ(b0 + b1 xi ) = n b0 + b1 Σxi =
n(ȳ − b1 x̄) + b1 n x̄ = n ȳ. (b) The standard deviation of the predicted values will be
sŷ = |r| sy ; in this case, sŷ = (0.8167)(4.714). To see this, observe that the variance of the
predicted values is (1/(n − 1)) Σ(ŷi − ȳ)² = (1/(n − 1)) Σ(b1 xi − b1 x̄)² = b1² sx² = r² sy².
(e) For a given standard score z, note that ŷ = b0 + b1 (x̄ + z sx ) = ȳ − b1 x̄ + b1 x̄ +
b1 z sx = ȳ + z r sy . If r > 0, the standard score for ŷ equals z; if r < 0,
the standard score is −z.
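
The identities in the note are easy to verify numerically. The sketch below uses simulated
SAT/ACT-like data (not the actual 60 observations) purely to check that the fitted values
have mean ȳ and standard deviation |r| sy:

Python sketch:
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(913, 180, size=60)                 # simulated SAT-like scores
y = 1.6 + 0.021 * x + rng.normal(0, 2.7, 60)      # simulated ACT-like scores

b1, b0 = np.polyfit(x, y, 1)
yhat = b0 + b1 * x
r = np.corrcoef(x, y)[0, 1]

print(np.isclose(yhat.mean(), y.mean()))                       # True: means agree
print(np.isclose(yhat.std(ddof=1), abs(r) * y.std(ddof=1)))    # True: s_yhat = |r| s_y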

10.53. (a) For SAT: x̄ = 912.6 and sx = 180.1117. For ACT: ȳ = 21.13 and sy = 4.7137.
Therefore, the slope is a1 ≈ 0.02617 and the intercept is a0 ≈ −2.7522. (b) The new line
is the dashed line in the scatterplot below. (c) For example, the first prediction is
−2.7522 + (0.02617)(1000) ≈ 23.42. Up to rounding error, the mean and standard deviation
of the predicted scores are the same as those of the ACT scores: ȳ = 21.13 and
sy = 4.7137.

[Scatterplot: ACT vs. SAT, with the least-squares line (solid) and the new line (dashed)]

Note: The usual least-squares line minimizes the total squared vertical distance from the
points to the line. If instead we seek to minimize the total of Σ|hi vi |, where hi is the
horizontal distance and vi is the vertical distance, we obtain the line ŷ = a0 + a1 x—except
that we must choose the sign of a1 to be the same as the sign of r. (It would hardly be the
“best line” if we had a positive slope with a negative association.) If r = 0, either sign
will do.

10.54. (a) The regression equations are:

    WEIGHT = −468.91 + 28.462 LENGTH with s ≈ 109.4 and r² ≈ 0.902
    WEIGHT = −449.44 + 174.63 WIDTH with s ≈ 107.9 and r² ≈ 0.905

(b) Both scatterplots suggest that the relationships are curved rather than linear. (Points to
the left and right lie above the line; those in the middle are generally below the line.)

[Scatterplots: weight (100 g) vs. length (cm) and weight vs. width (cm), with fitted lines]

Minitab output: Regression of weight on length (Model 1)


The regression equation is weight = -469 + 28.5 length
Predictor Coef Stdev t-ratio p
Constant -468.91 92.55 -5.07 0.000
length 28.462 2.967 9.59 0.000
s = 109.4 R-sq = 90.2% R-sq(adj) = 89.2%
Regression of weight on width (Model 2)
The regression equation is weight = -449 + 175 width
Predictor Coef Stdev t-ratio p
Constant -449.44 89.27 -5.03 0.000
width 174.63 17.93 9.74 0.000
s = 107.9 R-sq = 90.5% R-sq(adj) = 89.5%

10.55. (a) For squared length: WEIGHT = −117.99 + 0.4970 SQLEN, s ≈ 52.76, r² = 0.977.
(b) For squared width: WEIGHT = −98.99 + 18.732 SQWID, s ≈ 65.24, r² = 0.965.
Both scatterplots look more linear.

[Scatterplots: weight (100 g) vs. squared length (cm²) and vs. squared width (cm²)]

Minitab output: Regression of weight on squared length (Model 1)


The regression equation is weight = -118 + 0.497 sqlen
Predictor Coef Stdev t-ratio p
Constant -117.99 27.88 -4.23 0.002
sqlen 0.49701 0.02400 20.71 0.000
s = 52.76 R-sq = 97.7% R-sq(adj) = 97.5%
Regression of weight on squared width (Model 2)
The regression equation is weight = -99.0 + 18.7 sqwid
Predictor Coef Stdev t-ratio p
Constant -98.99 33.67 -2.94 0.015
sqwid 18.732 1.126 16.64 0.000
s = 65.24 R-sq = 96.5% R-sq(adj) = 96.2%


10.56. (a) The regression line is WEIGHT = −115.10 + 3.1019 (LENGTH)(WIDTH),
s ≈ 41.69, r² = 0.986. (b) As measured by r², this last model is (by a slim margin) the
best. (However, this scatterplot again gives some suggestion of curvature, indicating that
some other model might do better still.)

[Scatterplot: weight (100 g) vs. length times width (cm²), with fitted line]
Minitab output: Regression of weight on length*width
The regression equation is weight = -115 + 3.10 lenwid
Predictor Coef Stdev t-ratio p
Constant -115.10 21.87 -5.26 0.000
lenwid 3.1019 0.1179 26.32 0.000
s = 41.69 R-sq = 98.6% R-sq(adj) = 98.4%

10.57. The table below shows the correlations and the corresponding test statistics. The
first two results agree with the results of (respectively) Exercises 10.22 and 10.23.

                 r         t       P
IBI/area        0.4459    3.42   0.0013
IBI/forest      0.2698    1.92   0.0608
area/forest    −0.2571   −1.82   0.0745

10.58. The correlation was significant for vegetables, fruit, and meat, and nearly significant
for eggs. All the significant correlations are negative, meaning (for example) that children
with high neophobia tend to eat these foods less frequently.

                        r        t       P
Vegetables            −0.27   −6.65   0.0000
Fruit                 −0.16   −3.84   0.0001
Meat                  −0.15   −3.60   0.0004
Eggs                  −0.08   −1.90   0.0576
Sweet/fatty snacks     0.04    0.95   0.3430
Starchy staples       −0.02   −0.47   0.6355

10.59. For each correlation, we compute t = r√(n − 2)/√(1 − r²). For the whole group, t
ranges from 2.245 (P = 0.0266) to 3.208 (P = 0.0017). For Caucasians only, t ranges
from 1.572 (P = 0.1193) to 2.397 (P = 0.0185). The three smallest correlations (0.16
and 0.19) are the only ones that are not significant.

Rule-Breaking Measure                Popularity    Gene Expression
Sample 1 (n = 123)
  RB.composite                       0.28**        0.26**
  RB.questionnaire                   0.22*         0.23*
  RB.video                           0.24**        0.20*
Sample 1, Caucasians only (n = 96)
  RB.composite                       0.22*         0.23*
  RB.questionnaire                   0.16          0.24*
  RB.video                           0.19          0.16

10.60. See also the solution to Exercise 2.35. (a) The association is linear and positive; the
women’s points show a stronger association. As a group, males typically have larger values
for both variables. (b) The women’s regression line (the solid line in the graph) is
ŷ = 201.2 + 24.026x, with s ≈ 95.08 and r² = 0.768. The men’s line (the dashed line) is
ŷ = 710.5 + 16.75x, with s ≈ 167.1 and r² = 0.351. The women’s slope is significantly
different from 0 (t = 5.76, df = 10, P < 0.0005), but the men’s is not (t = 1.64, df = 5,
P = 0.161). These test results, and the values of s and r², confirm the observation that the
women’s association is stronger—however, see the solution to the next exercise.

[Scatterplot: metabolic rate (cal/day) vs. lean body mass (kg), women and men plotted
separately with their regression lines]

10.61. (a) These intervals (in the table below) overlap quite a bit. (b) These quantities can
be computed from the data, but it is somewhat simpler to recall that they can be found
from the sample standard deviations sx,w and sx,m:

    sx,w√11 ≈ (6.8684)√11 ≈ 22.78 and sx,m√6 ≈ (6.6885)√6 ≈ 16.38

The women’s SEb1 is smaller in part because it is divided by a larger number. (c) In order
to reduce SEb1 for men, we should choose our new sample to include men with a wider
variety of lean body masses. (Note that just taking a larger sample will reduce SEb1 ; it is
reduced even more if we choose subjects who will increase sx,m .)

           b1        SEb1    df    t*        Interval
Women    24.026      4.174   10   2.2281    14.7257 to 33.3263
Men      16.75      10.20     5   2.5706    −9.4699 to 42.9699

10.62. Scatterplots, and portions of the Minitab outputs, are shown below. The equations are:

    For all points,                MPG = −7.796 + 7.8742 LOGMPH
    For speed ≤ 30 mph,            MPG = −9.786 + 8.5343 LOGMPH
    For fuel efficiency ≤ 20 mpg,  MPG = −4.282 + 6.6854 LOGMPH

Students might make a number of observations about the effects of the restrictions; for
example, the estimated coefficients (and their standard errors) change quite a bit.

[Scatterplots: fuel efficiency (MPG) vs. log(speed in mph), restricted to speed ≤ 30 mph
(left) and to fuel efficiency ≤ 20 mpg (right)]

Minitab output: Regression of fuel efficiency on log(speed)— with all points


Predictor Coef Stdev t-ratio p
Constant -7.796 1.155 -6.75 0.000
logMPH 7.8742 0.3541 22.24 0.000
s = 0.9995 R-sq = 89.5% R-sq(adj) = 89.3%
. . . with speed 30 mph or less
Predictor Coef Stdev t-ratio p
Constant -9.786 1.862 -5.26 0.000
logMPH 8.5343 0.6154 13.87 0.000
s = 0.7600 R-sq = 83.5% R-sq(adj) = 83.1%
. . . with fuel efficiency 20 mpg or less
Predictor Coef Stdev t-ratio p
Constant -4.282 1.647 -2.60 0.013
logMPH 6.6854 0.5323 12.56 0.000
s = 0.9462 R-sq = 78.6% R-sq(adj) = 78.1%
Chapter 11 Solutions

11.1. (a) The response variable is math GPA. (b) The number of cases is n = 106. (c) There
were p = 4 explanatory variables. (d) The explanatory variables were SAT Math, SAT
Verbal, class rank, and mathematics placement score.

11.2. (a) ŷ = −3.8 + 7.3(3) − 2.1(1) = 16. (b) No: We can compute predicted values for any
values of x1 and x2 . (Of course, it helps if they are close to those in the data set.) (c) This
is determined by the coefficient of x1 : An increase of two units in x1 results in an increase
of (7.3)(2) = 14.6 units in ŷ.
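
As a concrete check of (a) and (c), a two-line sketch of the fitted equation:

Python sketch:
def y_hat(x1, x2):
    """Fitted equation from this exercise."""
    return -3.8 + 7.3 * x1 - 2.1 * x2

print(f"{y_hat(3, 1):.1f}")                 # 16.0, as in part (a)
print(f"{y_hat(5, 1) - y_hat(3, 1):.1f}")   # 14.6: effect of a 2-unit rise in x1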

11.3. (a) The fact that the coefficients are all positive indicates that math GPA should increase
when any explanatory variable increases (as we would expect). (b) With n = 86 cases and
p = 4 variables, DFM = p = 4 and DFE = n − p − 1 = 81. (c) In the following table, each
t statistic is the estimate divided by the standard error; the P-values are computed from a t
distribution with df = 81. (The t statistic for the intercept was not required for this exercise,
but is included for completeness.)
Variable Estimate SE t P
Intercept −0.764 0.651 −1.1736 0.2440
SAT Math 0.00156 0.00074 2.1081 0.0381
SAT Verbal 0.00164 0.00076 2.1579 0.0339
HS rank 1.470 0.430 3.4186 0.0010
Bryant placement 0.889 0.402 2.2114 0.0298
All four coefficients are significantly different from 0 (although the intercept is not).
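
The t statistics and P-values in this table come straight from the estimates and standard
errors; a sketch of the computation:

Python sketch:
from scipy import stats

df = 81   # n - p - 1 = 86 - 4 - 1
estimates = [("Intercept", -0.764, 0.651), ("SAT Math", 0.00156, 0.00074),
             ("SAT Verbal", 0.00164, 0.00076), ("HS rank", 1.470, 0.430),
             ("Bryant placement", 0.889, 0.402)]
for name, est, se in estimates:
    t = est / se
    p = 2 * stats.t.sf(abs(t), df)
    print(f"{name:16s} t = {t:7.4f}  P = {p:.4f}")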

11.4. The missing entries in the DF and SS columns can be found by noting that
DFE + DFM = DFT and SSE + SSM = SST. The MS (mean square) entries are computed
as SS divided by DF, and F = MSM/MSE. Comparison of F = 2.84 to an F distribution
with df 3 and 50 gives P ≈ 0.0471, so we conclude the regression is significant at the 5%
level. Finally, R² = SSM/SST = 75/515 ≈ 0.1456.

Source   DF    SS    MS    F
Model     3    75   25    2.84
Error    50   440    8.8
Total    53   515
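
A sketch of the same bookkeeping in code:

Python sketch:
from scipy import stats

dfm, dfe = 3, 50
ssm, sse = 75.0, 440.0
F = (ssm / dfm) / (sse / dfe)       # 25 / 8.8, about 2.84
P = stats.f.sf(F, dfm, dfe)         # about 0.0471
R2 = ssm / (ssm + sse)              # about 0.1456
print(f"F = {F:.2f}, P = {P:.4f}, R2 = {R2:.4f}")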


11.5. The correlations are found in Figure 11.4 and are summarized in the table below. Of
the 15 possible scatterplots to be made from these six variables, three are shown below as
examples. The pairs with the largest correlations are generally easy to pick out. The
whole-number scale for high school grades causes point clusters in those scatterplots and
makes it difficult to determine the strength of the association. For example, in the plot of
HSS versus HSE below, the circled point represents 9 of the 224 students. One might guess
that these three scatterplots show relationships of roughly equal strength, but because of
the overlapping points, the correlations are quite different; from left to right, they are
0.2517, 0.4365, and 0.5794.

        SATM    SATV    HSM     HSS     HSE
GPA    0.2517  0.1145  0.4365  0.3294  0.2890
SATM           0.4639  0.4535  0.2405  0.1083
SATV                   0.2211  0.2617  0.2437
HSM                            0.5757  0.4469
HSS                                    0.5794

[Scatterplots: SATM vs. GPA, HSM vs. GPA, and HSS vs. HSE]

11.6. The regression equation is given in the Minitab output below. The whole-number
scale for high school grades means that the predicted values also come in clusters. All but
21 students had both HSM and HSE above 5, so for all three plots, there are few residuals
on the left half.

[Plots: residuals vs. HSM, residuals vs. HSE, and residuals vs. predicted values]

Minitab output: Regression of GPA on HSM and HSE


The regression equation is GPA = 0.624 + 0.183 HSM + 0.0607 HSE
Predictor Coef Stdev t-ratio p
Constant 0.6242 0.2917 2.14 0.033
HSM 0.18265 0.03196 5.72 0.000
HSE 0.06067 0.03473 1.75 0.082
s = 0.6996 R-sq = 20.2% R-sq(adj) = 19.4%

11.7. The table below gives two sets of answers: those found with critical values from
Table D and those found with software. In each case, the estimated coefficient is b1 = 6.4
with standard error SEb1 = 3.1, and the margin of error is t* SEb1 , with df = n − 3 for
parts (a) and (b), and df = n − 4 for parts (c) and (d). (The Table D interval for part (d)
uses df = 100.)

                  Table D                        Software
      n    df    t*      Interval                t*       Interval
(a)   27   24   2.064    0.0016 to 12.7984      2.0639    0.0019 to 12.7981
(b)   53   50   2.009    0.1721 to 12.6279      2.0086    0.1735 to 12.6265
(c)   27   23   2.069   −0.0139 to 12.8139      2.0687   −0.0128 to 12.8128
(d)  124  120   1.984    0.2496 to 12.5504      1.9799    0.2622 to 12.5378
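
The software column can be reproduced as follows (df = 120 gives the exact interval for
part (d); Table D’s df = 100 row is only an approximation):

Python sketch:
from scipy import stats

b1, se = 6.4, 3.1
for label, df in [("(a)", 24), ("(b)", 50), ("(c)", 23), ("(d)", 120)]:
    t_star = stats.t.ppf(0.975, df)
    print(f"{label} t* = {t_star:.4f}: {b1 - t_star * se:.4f} to {b1 + t_star * se:.4f}")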

11.8. For all four settings, the test statistic is t = b1/SEb1 = 6.4/3.1 ≈ 2.065, with
df = n − 3 for parts (a) and (b) and df = n − 4 for parts (c) and (d). The P-values are
0.0499, 0.0442, 0.0504, and 0.0411. At the 5% significance level, we would reject the null
hypothesis for each test except (c); the test is barely significant for (a), and barely not
significant for (c). (This is consistent with the confidence intervals from the previous
exercise.)

11.9. (a) H0 should refer to β2 (the population coefficient) rather than b2 (the estimated
coefficient). (b) This sentence should refer to the squared multiple correlation. (c) A small
P implies that at least one coefficient is different from 0.

11.10. (a) Multiple regression only assumes Normality of the error terms (residuals), not
the explanatory variables. (The explanatory variables do not even need to be random
variables.) (b) A small P-value tells us that the model is significant (useful for prediction)
but does not measure its explanatory power (the accuracy of those predictions). The squared
multiple correlation R 2 is a measure of explanatory power. (c) For example, if x1 and x2 are
significantly correlated with each other and with the response variable, it might turn out that
the coefficient of x1 is statistically significant and the coefficient of x2 is not. (d) R is not
the average correlation; if it were, adding additional variables might make R closer
to 0. R 2 tells us the total explanatory power of the entire model.
Note: The statement for part (c) is a paraphrase of the “Caution” on page 602 of the
text. As a simple illustration of how this might happen, suppose that the response variable
y = ax1 + b (with little or no error term), where all observed values of x1 are positive, and
the second explanatory variable is x2 = x12 . The correlation between y and x2 might be very
large, but in a multiple regression model with x1 , the coefficient of x2 will almost certainly
not be significant.

11.11. (a) yi = β0 + β1 xi1 + β2 xi2 + · · · + β8 xi8 + εi , where i = 1, 2, . . . , 135, and
the εi are independent N(0, σ) random variables. (b) The sources of variation are model
(DFM = p = 8), error (DFE = n − p − 1 = 126), and total (DFT = n − 1 = 134).

11.12. (a) With n = 82 and p = 6, the degrees of freedom in the ANOVA table are
DFM = p = 6, DFE = n − p − 1 = 75, and DFT = n − 1 = 81. With the first two degrees
of freedom, we can find MSM = SSM/DFM ≈ 3.7667 and MSE = SSE/DFE = 1.34, and
then compute F = MSM/MSE ≈ 2.81. (b) This F statistic has df 6 and 75. (c) Comparing
to the F(6, 60) critical values in Table E, we note that 2.63 < F < 3.12, so
0.01 < P < 0.025. (Software gives 0.016.) (d) This regression explains
R² = SSM/SST ≈ 18.4% of the variation in the response variable.

Source   DF     SS       MS        F
Model     6    22.6    3.7667   2.8109
Error    75   100.5    1.3400
Total    81   123.1

11.13. We have p = 8 explanatory variables and n = 795 observations. (a) The ANOVA F
test has degrees of freedom DFM = p = 8 and DFE = n − p − 1 = 786. (b) This model
explains only R² ≈ 7.84% of the variation in energy-drink consumption; it is not very
predictive. (c) A positive (negative) coefficient means that large values of that variable
correspond to higher (lower) energy-drink consumption. Therefore, males and Hispanics
consume energy drinks more frequently, and consumption increases with risk-taking scores.
(d) Within a group of students with identical (or similar) values of those other variables,
energy-drink consumption increases with increasing jock identity and increasing risk taking.

11.14. No (or at least, not necessarily). It is possible that, although no individual coefficient
is significant, the whole group (or some subset) is. Recall that the t tests “assess the
significance of each predictor variable assuming that all other predictors are included in the
regression equation.” If one variable is removed from the model (because its t statistic is
not significant), we can no longer use the other t statistics to draw conclusions about the
remaining coefficients.

11.15. We have n = 202, and p = 1 (for Model 1) or p = 2 (for Model 2). (a) For
Model 1, DFE = 200. For Model 2, DFE = 199. (b) and (c) The test statistics
t = bi /SEbi and P-values are in the table below. (d) The relationship is still positive after
adjusting for RB. When gene expression increases by 1, popularity increases by 0.204 in
Model 1, and by 0.161 in Model 2 (with RB fixed).

Model   Variable          t                      P
1       Gene expression   0.204/0.066 ≈ 3.09   0.0023
2       Gene expression   0.161/0.066 ≈ 2.44   0.0153
        RB                0.100/0.030 ≈ 3.33   0.0010

11.16. (a) All three correlations are quite high: year and tornado count (0.8095), population
and tornado count (0.8180), and year and population (0.9981). The solution to
Exercise 10.19 shows a scatterplot of tornadoes versus year; the other two scatterplots are
below. Because of the near-perfect linear relationship between year and population, the
plot of tornadoes versus population looks nearly identical to the plot of tornadoes versus
year (apart from horizontal scale). (b) The regression equation is
COUNT = 63677 − 33.91 YEAR + 0.0191 CENSUS. (Minitab output below.) (c) The only
cause for concern in the analysis is the extremely high count from 2004, which is visible in
all the plots. The plots versus year and versus population are nearly identical, apart from
scale; neither plot shows any striking patterns. The Normal quantile plot (along with a
stemplot of the residuals) suggests no serious deviations from Normality. (d) To look for a
linear increase over time, we test H0 : β1 = 0 versus Ha : β1 ≠ 0, where β1 is the
coefficient of YEAR in our model. The test statistic is t = −1.46 (P = 0.149), so we cannot
reject H0 . With population included, the predictive information in year is made redundant.
(That is, once we know the population, the additional information from year does not
appreciably improve our estimate of tornado count.)

Minitab output: Regression of tornadoes on year and population


Count = 63677 - 33.9 Year + 0.0191 Census
Predictor Coef Stdev t-ratio p
Constant 63677 43769 1.45 0.152
Year -33.91 23.15 -1.46 0.149
Census 0.019124 0.009068 2.11 0.040
s = 171.5 R-sq = 68.2% R-sq(adj) = 67.0%
[Plots: population (millions) vs. year; tornado count vs. population; residuals vs. year;
residuals vs. population]
Stemplot of residuals:
−3 6
−2 876432
−1 7654410
−0 9966655433222100
 0 00123345699
 1 111335667
 2 56799
 3
 4 9

[Normal quantile plot of residuals vs. Normal score]

11.17. (a) The regression equation is BMI = 23.4 − 0.682x1 + 0.102x2 . (Minitab output
below.) (b) The quadratic regression explains R² ≈ 17.7% of the variation in BMI.
(c) Analysis of residuals might include a stemplot, plots of residuals versus x1 and x2 , and
a Normal quantile plot. All of these appear below; none suggest any obvious causes for
concern. (d) From the Minitab output, t = 1.83 with df = 97, for which P = 0.070—not
significant.

Minitab output: Quadratic regression for predicting BMI from PA


The regression equation is BMI = 23.4 - 0.682 X1 + 0.102 X2
Predictor Coef Stdev t-ratio p
Constant 23.3956 0.4670 50.10 0.000
X1 -0.6818 0.1572 -4.34 0.000
X2 0.10195 0.05556 1.83 0.070
s = 3.612 R-sq = 17.7% R-sq(adj) = 16.0%
[Plots: residuals vs. (PA − 8.614) and residuals vs. (PA − 8.614)²]
Stemplot of residuals:
−8 1
−7 3
−6 840
−5 9832
−4 553210
−3 8875421
−2 9943321
−1 7776311
−0 9776666642100
 0 0256667778889
 1 012334467899
 2 233447889
 3 67
 4 012277
 5 348
 6 68
 7 0007

[Normal quantile plot of residuals vs. Normal score]
7 0007

11.18. (a) All distributions are skewed to the right (stemplots below). Student choices of
summary statistics may vary; five-number summaries are a good choice because of the
skewness, but some may also give means and standard deviations. Notice especially how
the skewness is apparent in the five-number summaries.

Variable                 x̄         s        Min    Q1      M     Q3      Max
Total billing           8.2524    6.7441    1.6    2.85    6.7  11.35    29.5
Number of architects   10.5714    8.9026    1.0    5.00    7.0  16.50    39.0
Number of engineers     6.8095   10.7964    0.0    0.00    2.0   9.00    36.0
Number of staff        59.9048   55.8891    9.0   15.50   58.0  71.00   240.0

(b) Correlation coefficients are summarized with the scatterplots (below). All pairs of
variables are positively correlated. (c) The regression equation is
Billing = 0.7799 + 0.0143 Arch − 0.1364 Eng + 0.1377 Staff, and the standard error is
s ≈ 1.935. (Minitab output below.) (d) The plots of residuals versus the explanatory
variables (not shown) reveal no particular causes for concern. A stemplot of the residuals
(below) is somewhat right-skewed; this can also be seen in a Normal quantile plot (not
shown). (e) The predicted billing for HCO is 3.028 (million dollars).

Total billing   Architects     Engineers        Staff          Residuals
0 1112234       0 1223         0 000000000123   0 011111124    −2 5
0 5566779       0 556667789    0 5567           0 5566667789   −1 99921
1 0022          1 23           1 1              1              −0 886544
1 58            1 6779         1 57             1 6             0 006
2               2 2            2                2 4             1 245
2 9             2              2                                2 1
                3              3                                3 3
                3 9            3 56                             4 2
[Scatterplots with correlations: billing/architects r = 0.7841, billing/engineers r = 0.8194,
billing/staff r = 0.9587; architects/engineers r = 0.4569, architects/staff r = 0.7579,
engineers/staff r = 0.9018]

Minitab output: Regression of total billing on numbers of architects, engineers, and staff
TotalBil = 0.780 + 0.014 N_Arch - 0.136 N_Eng + 0.138 N_Staff
Predictor Coef Stdev t-ratio p
Constant 0.7799 0.7126 1.09 0.289
N_Arch 0.0143 0.1252 0.11 0.910
N_Eng -0.1364 0.1558 -0.88 0.393
N_Staff 0.13773 0.04104 3.36 0.004
s = 1.935 R-sq = 93.0% R-sq(adj) = 91.8%
Prediction for 3 architects, 1 engineer, 17 staff members
Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
3.028 0.566 ( 1.833, 4.223) ( -1.227, 7.283)

11.19. (a) In the two scatterplots (below), we see a moderate positive linear relationship
for small banks. For large banks, the relationship is very weak. (b) For small banks,
Wages = 35.9 + 0.1042 LOS, with R² ≈ 46.6% and s ≈ 7.026. (c) For large banks,
Wages = 49.5 + 0.0560 LOS, with R² ≈ 3.5% and s ≈ 13.02. (d) The large-bank
regression is not significant (nor is it useful for prediction).

[Scatterplots: wages vs. length of service for small banks (left) and large banks (right)]

Minitab output: Regression of wages on length of service (small banks)


The regression equation is Wages-S = 35.9 + 0.104 LOS-S
Predictor Coef Stdev t-ratio p
Constant 35.914 2.284 15.73 0.000
LOS-S 0.10424 0.02328 4.48 0.000
s = 7.026 R-sq = 46.6% R-sq(adj) = 44.3%
Regression of wages on length of service (large banks)
The regression equation is Wages-L = 49.5 + 0.0560 LOS-L
Predictor Coef Stdev t-ratio p
Constant 49.545 4.013 12.35 0.000
LOS-L 0.05595 0.05116 1.09 0.282
s = 13.02 R-sq = 3.5% R-sq(adj) = 0.6%

11.20. (a) Note that most statistical software provides a way to define new variables like
this. (b) The regression equation is
Wages = 35.9 + 0.1042 LOS + 13.6 SIZE1 − 0.0483 LOSSIZE1.
(c) The intercept and coefficient of LOS in this equation are the same as in the small-bank
regression from the previous exercise. (d) Up to rounding error, these two sums equal the
intercept and coefficient of LOS in the large-bank regression: Adding the intercept and
SIZE1 coefficient gives 35.9 + 13.6 = 49.5, and adding the LOS and LOSSIZE1 coefficients
gives 0.1042 − 0.0483 = 0.0559. (e) Large banks (SIZE1 = 1) have a larger intercept,
suggesting that, on the average, they offer higher starting wages (for employees with
LOS = 0). However, they also have a smaller slope, meaning that (on the average) wages
increase faster at smaller banks.

Minitab output: Regression of wages on LOS, SIZE1, and LOSSIZE1


Wages = 35.9 + 0.104 LOS + 13.6 size1 - 0.0483 LOSsize1
Predictor Coef Stdev t-ratio p
Constant 35.914 3.562 10.08 0.000
LOS 0.10424 0.03632 2.87 0.006
SIZE1 13.631 4.910 2.78 0.007
LOSSIZE1 -0.04828 0.05634 -0.86 0.395
s = 10.96 R-sq = 26.6% R-sq(adj) = 22.7%
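
Writing the combined model as a function makes the interpretation in parts (c)–(e) concrete:

Python sketch:
def wages(los, large):
    """Fitted model with indicator SIZE1 and interaction LOSSIZE1."""
    size1 = 1 if large else 0
    return 35.9 + 0.1042 * los + 13.6 * size1 - 0.0483 * los * size1

print(f"{wages(0, False):.1f} vs {wages(0, True):.1f}")   # intercepts: 35.9 and 49.5
print(f"{wages(100, False) - wages(0, False):.2f}")       # small-bank rise, about 10.42
print(f"{wages(100, True) - wages(0, True):.2f}")         # large-bank rise, about 5.59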

11.21. (a) Budget and Opening are right-skewed; Theaters and Opinion are roughly
symmetric (slightly left-skewed). Five-number summaries are best for skewed distributions,
but all possible numerical summaries are given here.

Variable      x̄       s       Min     Q1       M       Q3       Max
Budget       61.81   52.47     6.5    20.0     45.0    85.0     185.0
Opening      28.59   31.89     1.1    10.0     18.6    32.1     158.4
Theaters   2785     921      808.0  2123.0   2808.0  3510.0    4366.0
Opinion       6.440   1.064    3.6     5.9      6.6     7.0       8.9

A worthwhile observation is that for all four variables, the maximum observation comes
from The Dark Knight. (b) Correlation coefficients are summarized with the scatterplots
(below). All pairs of variables are positively correlated. The Budget/Theaters and
Opening/Theaters relationships appear to be curved; the others are reasonably linear.

Budget        Opening                  Theaters       Opinion
0 0001111     0 0000000011111111111    0 8            3 6
0 222222233   0 22223333               1 123          4 3
0 4445        0 455                    1 568          4
0 6677        0 666                    2 01444        5 1234
0 888         0                        2 556778899    5 5899
1 1           0                        3 01244        6 00133
1 23          1 3                      3 55679        6 5566667899
1 44555       1 5                      4 0113         7 00112
1 7                                                   7
1 8                                                   8 000
                                                      8 9
[Scatterplots with correlations: Budget/Opening r = 0.7690, Budget/Theaters r = 0.7732,
Budget/Opinion r = 0.4240; Opening/Theaters r = 0.7365, Opening/Opinion r = 0.4017,
Theaters/Opinion r = 0.1561]

11.22. (a) The distribution is sharply right-skewed. In millions of dollars, the mean and standard deviation are x̄ ≈ 86.8 and s ≈ 106.2, and the five-number summary is Min = 2.3, Q1 = 25.5, M = 52.5, Q3 = 102.2, Max = 533.3. (As in the previous exercise, the maximum revenue is for The Dark Knight.) (b) It is the deviations (errors) that need to be Normally distributed, not the response variable. (c) All four scatterplots (below) suggest positive associations, but only one (revenue versus opening) looks convincingly linear. Revenue versus theaters appears to be curved, and the other two are indeterminate.

Stemplot of U.S. revenue (stems are hundreds of millions, split; leaves are tens of millions):
0 00000111233333344
0 55666789
1 0023
1 58
2 12
2
3 1
3
4
4
5 3

[Scatterplots of U.S. revenue ($millions) against Budget, Opening, Theaters, and Opinion are not reproduced here.]

11.23. (a) The model is

USRevenuei = β0 + β1 Budgeti + β2 Openingi + β3 Theatersi + β4 Opinioni + εi

where i = 1, 2, . . . , 35, and the εi are independent N(0, σ) random variables. (b) The regression equation is

USRevenue = −67.72 + 0.1351 Budget + 3.0165 Opening − 0.00223 Theaters + 10.262 Opinion
(Minitab output below.) (c) Below is a stemplot of the residuals. The distribution is somewhat irregular, but a Normal quantile plot (not shown) does not suggest severe deviations from Normality. The residual analysis should also include a plot of residuals versus the explanatory variables; three of those plots are unremarkable (and not shown). The plot of residuals versus theaters suggests that the spread of the residuals increases with Theaters. The Dark Knight—noted as unusual in the previous two exercises—may be influential. (d) This regression explains R² ≈ 98.1% of the variation in revenue.

Minitab output: Regression of U.S. Revenue on budget, opening, theaters, and opinion
USRevenue = - 67.7 + 0.135 Budget + 3.02 Opening - 0.00223 Theaters
+ 10.3 Opinion
Predictor Coef Stdev t-ratio p
Constant -67.72 24.14 -2.81 0.009
Budget 0.13511 0.09776 1.38 0.177
Opening 3.0165 0.1461 20.65 0.000
Theaters -0.002229 0.005299 -0.42 0.677
Opinion 10.262 3.032 3.38 0.002
s = 15.69 R-sq = 98.1% R-sq(adj) = 97.8%

[Stemplot of the residuals and plot of residuals versus Theaters are not reproduced here.]

11.24. (a) The regression equation is USRevenue = −76.6 + 3.12 Opening + 11.5 Opinion. (b) This regression explains R² ≈ 97.9% of the variation in U.S. revenue. (c) With n = 35, p = 4, q = 2, R₁² = 0.981, and R₂² = 0.979, we have

F = [(n − p − 1)/q] · [(R₁² − R₂²)/(1 − R₁²)] = (30/2) · [(0.981 − 0.979)/(1 − 0.981)] ≈ 1.579

Comparing to an F(2, 30) distribution, we have P ≈ 0.2229. We do not reject H0, meaning that the variables removed from the regression do not add “significant predictive information” to the model.
Note: The first printing of the text mistakenly said n = 40 movies; this changes F to 1.842 and P to 0.1735, but the conclusion is the same.
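
The arithmetic above is easy to verify with software; for example, with scipy (which the text does not use):

from scipy import stats

n, p, q = 35, 4, 2                 # observations; full-model terms; dropped terms
R2_full, R2_reduced = 0.981, 0.979
F = ((n - p - 1) / q) * (R2_full - R2_reduced) / (1 - R2_full)
P = stats.f.sf(F, q, n - p - 1)    # upper tail of F(2, 30)
print(F, P)                        # about 1.579 and 0.223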

Minitab output: Regression of U.S. revenue on opening and opinion


USRevenue = - 76.6 + 3.12 Opening + 11.5 Opinion
Predictor Coef Stdev t-ratio p
Constant -76.60 17.12 -4.47 0.000
Opening 3.12355 0.09224 33.86 0.000
Opinion 11.497 2.764 4.16 0.000
s = 15.71 R-sq = 97.9% R-sq(adj) = 97.8%

11.25. (a) Using the full model, the 95% prediction interval is $86.86 to $154.91 million.
(b) With the reduced model, the interval is $89.93 to $155.00 million. (c) The intervals
are very similar; as we saw in the previous exercise, there is little additional predictive
information from the two variables we removed.
Note: According to http://www.imdb.com/title/tt0425061/business, the actual U.S. revenue for Get Smart was $130.3 million.

Minitab output: Predicting U.S. revenue for Get Smart (full model)
Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
120.89 5.58 ( 109.48, 132.29) ( 86.86, 154.91)
Predicting U.S. revenue for Get Smart (reduced model)
Fit Stdev.Fit 95.0% C.I. 95.0% P.I.
122.46 2.86 ( 116.64, 128.29) ( 89.93, 155.00)
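
For readers without Minitab, a prediction interval can be computed directly from the design matrix. The following numpy/scipy sketch is a generic helper in our own notation, not the text's; X must carry a leading column of ones, and x0 is the predictor vector for the new movie:

import numpy as np
from scipy import stats

def prediction_interval(X, y, x0, level=0.95):
    # Least-squares fit, then the standard error for a new observation at x0.
    n, k = X.shape
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)                  # MSE, with n - k df
    se_pred = np.sqrt(s2 * (1 + x0 @ np.linalg.inv(X.T @ X) @ x0))
    tstar = stats.t.ppf((1 + level) / 2, n - k)
    fit = float(x0 @ beta)
    return fit - tstar * se_pred, fit + tstar * se_pred

Applied to the reduced model (columns for Opening and Opinion, with Get Smart's values in x0), this should reproduce the second interval above up to rounding.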

11.26. (a) The two films are Yes Man and Hancock, which made (respectively) $36.7 and
$34.2 million more than predicted. (The easiest way to find these two movies is to find the
two largest residuals of the reduced-model regression.) (b) With those movies removed, the
regression equation is

USRevenue = −75.72 + 3.1038 Opening + 11.112 Opinion
The coefficients are significant (t = 39.15 and t = 4.75, both with P < 0.0005). (c) Both coefficients decreased slightly, meaning that a change in either variable makes a slightly smaller change in the predicted revenue. Another observation: R² is slightly larger for this regression (98.6% versus 97.9%). This does not mean that this regression is better; rather, removing the outliers means that there is less variation to explain. (d) A stemplot and quantile plot (below) suggest no reasons for concern.

Minitab output: Regression of U.S. revenue on opening and opinion (outliers removed)
USRevenu = - 75.7 + 3.10 Opening + 11.1 Opinion
Predictor Coef Stdev t-ratio p
Constant -75.72 14.44 -5.24 0.000
Opening 3.10379 0.07928 39.15 0.000
Opinion 11.112 2.341 4.75 0.000
s = 13.17 R-sq = 98.6% R-sq(adj) = 98.5%
Stemplot of the residuals (the Normal quantile plot is not reproduced here):
−2 0
−1 6
−1 420
−0 98765
−0 433210
0 0012334
0 56789
1 024
1 6
2 0

11.27. (a) The PEER distribution is left-skewed; the other two distributions are irregular
(stemplots below). Student choices of summary statistics may vary; both five-number
summaries and means/standard deviations are given below. (b) Correlation coefficients are
given below the scatterplots. PEER and FtoS are negatively correlated, FtoS and CtoF are
positively correlated, and the other correlation is very small.
Variable                 x̄      s      Min  Q1   M    Q3   Max
Peer review score 79.60 18.37 39 61 85 97 100
Faculty/student ratio 61.88 28.23 18 29 67 89 100
Citations/faculty ratio 63.84 25.23 17 40 66 86 100

[Stemplots of peer review score, faculty/student ratio, and citations/faculty ratio are not reproduced here.]

[Scatterplots are not reproduced here. Correlations: faculty/student vs. peer review, −0.1143; citations/faculty vs. peer review, 0.0045; citations/faculty vs. faculty/student, 0.5801.]

11.28. (a) All three scatterplots are below. The plot versus peer review score is
much more linear than the other two. (b) The correlations are given below the scatterplots.
Not surprisingly, the PEER correlation is greatest.
Note: The fact that the scatterplots do not all suggest linear associations does not mean
that a multiple regression is inappropriate. Even if the data exactly fit a multiple regression
model, the pairwise scatterplots will not necessarily appear to be linear.

[Scatterplots of overall score against the three explanatory variables are not reproduced here. Correlations: peer review score, 0.8073; faculty/student ratio, 0.0637; citations/faculty ratio, 0.2691.]

11.29. (a) The model is OVERALLi = β0 + β1 PEERi + β2 FtoSi + β3 CtoFi + εi, where the εi are independent N(0, σ) random variables. (b) The regression equation is:

OVERALL = 18.85 + 0.5746 PEER + 0.0013 FtoS + 0.1369 CtoF

(c) For the confidence intervals, take bi ± t*SEbi, with t* = 1.9939 (for df = 71). These intervals have been added to the Minitab output below. The second interval contains 0, because that coefficient is not significantly different from 0. (d) The regression explains R² ≈ 72.2% of the variation in overall score. The estimate of σ is s ≈ 7.043.

Minitab output: Regression of overall score on all three variables


OVERALL = 18.8 + 0.575 PEER + 0.0013 FtoS + 0.137 CtoF
Predictor Coef Stdev t-ratio p 95% confidence interval
Constant 18.846 4.363 4.32 0.000
PEER 0.57462 0.04504 12.76 0.000 0.4848 to 0.6644
FtoS 0.00130 0.03597 0.04 0.971 -0.0704 to 0.0730
CtoF 0.13690 0.03999 3.42 0.001 0.0572 to 0.2166
s = 7.043 R-sq = 72.2% R-sq(adj) = 71.0%
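
These intervals are simple to reproduce; for example, with scipy (coefficients and standard errors copied from the output above):

from scipy import stats

tstar = stats.t.ppf(0.975, 71)     # error df from the output; about 1.9939
for name, b, se in [("PEER", 0.57462, 0.04504),
                    ("FtoS", 0.00130, 0.03597),
                    ("CtoF", 0.13690, 0.03999)]:
    print(name, b - tstar * se, b + tstar * se)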

11.30. (a) Between GPA and IQ, r ≈ 0.634 (so straight-line regression explains r² ≈ 40.2% of the variation in GPA). Between GPA and self-concept, r ≈ 0.542 (so regression explains r² ≈ 29.4% of the variation in GPA). Since gender is categorical, the correlation between GPA and gender is not meaningful. (b) The model is GPA = β0 + β1 IQ + β2 SC + ε, where the deviations ε are independent N(0, σ) random variables. (c) Regression gives the equation GPA = −3.88 + 0.0772 IQ + 0.0513 SC. Based on the reported value of R², the regression explains 47.1% of the variation in GPA. (So adding self-concept to the model only adds about 6.9% to the variation explained by the regression.) (d) We test H0: β2 = 0 versus Ha: β2 ≠ 0. The test statistic t = 3.14 (df = 75) has P = 0.002; we conclude that the coefficient of self-concept is not 0.

Minitab output: Regression of GPA on IQ and self-concept score


GPA = -3.88 + 0.0772 IQ + 0.0513 SelfCcpt
Predictor Coef Stdev t-ratio p
Constant -3.882 1.472 -2.64 0.010
IQ 0.07720 0.01539 5.02 0.000
SelfCcpt 0.05125 0.01633 3.14 0.002
s = 1.547 R-sq = 47.1% R-sq(adj) = 45.7%

11.31. (a) All distributions are skewed to varying degrees—GINI and CORRUPT to the right,
the other three to the left. CORRUPT, DEMOCRACY, and LIFE have the most skewness.
Student choices of summary statistics may vary; five-number summaries are a good choice
because of the skewness, but some may also give means and standard deviations.
Variable     x̄       s       Min    Q1     M      Q3      Max
LSI 6.2597 1.2773 2.5 5.5 6.1 7.35 8.2
GINI 37.9399 8.8397 24.70 32.65 35.95 42.750 60.10
CORRUPT 4.8861 2.4976 1.7 2.85 4.0 7.3 9.7
DEMOCRACY 4.2917 1.6799 0.5 3.0 5.0 5.5 6.0
LIFE 71.9450 9.0252 44.28 70.39 73.16 78.765 82.07
Notice especially how the skewness is apparent in the five-number summaries.
(b) Correlation coefficients are given below the scatterplots. GINI is negatively (and weakly)
correlated to the other four variables, while all other correlations are positive and more
substantial (0.533 or more).
[Stemplots of LSI, GINI, CORRUPT, DEMOCRACY, and LIFE are not reproduced here, nor are the scatterplots. Correlations: LSI/GINI −0.1394; LSI/CORRUPT 0.7218; LSI/DEMOCRACY 0.6084; LSI/LIFE 0.8073; GINI/CORRUPT −0.3845; GINI/DEMOCRACY −0.1875; GINI/LIFE −0.3828; CORRUPT/DEMOCRACY 0.7271; CORRUPT/LIFE 0.6559; DEMOCRACY/LIFE 0.5331.]

11.32. The four regression equations are:

(a) LSI = 7.02 − 0.0201 GINI
(b) LSI = −3.83 + 0.0287 GINI + 0.1250 LIFE
(c) LSI = −3.25 + 0.0280 GINI + 0.1063 LIFE + 0.1857 DEMOCRACY
(d) LSI = −2.72 + 0.0368 GINI + 0.0905 LIFE + 0.0392 DEMOCRACY + 0.1855 CORRUPT
Minitab output (below) gives the values of R² for each regression and highlights non-significant P-values. We note that GINI does not contribute significantly to the first model but is significant in every other model, and DEMOCRACY is not significant in the last model, even though it was significant in the second-to-last model. (Roughly speaking, this means that whatever information DEMOCRACY contributed to the model, CORRUPT contains that same information but contributes it more efficiently than does DEMOCRACY. Recall from the previous solution that DEMOCRACY and CORRUPT had correlation 0.7271.)
Stemplots of the residuals for all four regressions (not reproduced here) show that the first distribution is clearly skewed, while the other three show no severe deviations from Normality. A full analysis of the residuals for each regression would require a total of 10 scatterplots; six of those plots suggest possible problems with the assumptions: five show signs of non-constant standard deviations, and one shows a hint of curvature.

Minitab output: Regression of LSI on GINI (Model 1)


LSI = 7.02 - 0.0201 GINI
Predictor Coef Stdev t-ratio p
Constant 7.0238 0.6660 10.55 0.000
GINI -0.02014 0.01710 -1.18 0.243
s = 1.274 R-sq = 1.9% R-sq(adj) = 0.5%
Regression of LSI on GINI and LIFE (Model 2)
LSI = - 3.83 + 0.0287 GINI + 0.125 LIFE
Predictor Coef Stdev t-ratio p
Constant -3.8257 0.9746 -3.93 0.000
GINI 0.02873 0.01056 2.72 0.008
LIFE 0.12503 0.01034 12.09 0.000
s = 0.7266 R-sq = 68.6% R-sq(adj) = 67.6%
Regression of LSI on GINI, LIFE, and DEMOCRACY (Model 3)
LSI = - 3.25 + 0.0280 GINI + 0.106 LIFE + 0.186 DEMOCRACY
Predictor Coef Stdev t-ratio p
Constant -3.2524 0.9293 -3.50 0.001
GINI 0.028049 0.009891 2.84 0.006
LIFE 0.10634 0.01125 9.46 0.000
DEMOCRACY 0.18575 0.05682 3.27 0.002
s = 0.6804 R-sq = 72.8% R-sq(adj) = 71.6%
Regression of LSI on all four variables (Model 4)
LSI = - 2.72 + 0.0368 GINI + 0.0905 LIFE + 0.0392 DEMOCRACY + 0.186 CORRUPT
Predictor Coef Stdev t-ratio p
Constant -2.7201 0.8661 -3.14 0.003
GINI 0.036782 0.009393 3.92 0.000
LIFE 0.09048 0.01120 8.08 0.000
DEMOCRACY 0.03925 0.06566 0.60 0.552
CORRUPT 0.18554 0.05042 3.68 0.000
s = 0.6252 R-sq = 77.4% R-sq(adj) = 76.0%

[Stemplots of the residuals for Models 1 through 4, and six residual plots (Model 2 residuals vs. LIFE; Model 3 residuals vs. LIFE and vs. DEMOCRACY; Model 4 residuals vs. LIFE, DEMOCRACY, and CORRUPT), are not reproduced here.]

11.33. (a) The coefficients, standard errors, t statistics, and P-values are given in the Minitab output shown with the solution to the previous exercise. (b) Student observations will vary. For example, the t statistic for the GINI coefficient grows from t = −1.18 (P = 0.243) to t = 3.92 (P < 0.0005). The DEMOCRACY t is 3.27 in the third model (P = 0.002) but drops to 0.60 (P = 0.552) in the fourth model. (c) A good choice is to use GINI, LIFE, and CORRUPT (Minitab output below). All three coefficients are significant, and R² = 77.3% is nearly the same as the fourth model from the previous exercise. However, a scatterplot of the residuals versus CORRUPT (not shown) still looks quite a bit like the final residual plot described in the previous solution, suggesting a slightly curved relationship, which would violate the assumptions of our model.

Minitab output: Regression of LSI on GINI, LIFE, and CORRUPT


LSI = - 2.74 + 0.0377 GINI + 0.0914 LIFE + 0.204 CORRUPT
Predictor Coef Stdev t-ratio p
Constant -2.7442 0.8610 -3.19 0.002
GINI 0.037734 0.009213 4.10 0.000
LIFE 0.09141 0.01104 8.28 0.000
CORRUPT 0.20382 0.03991 5.11 0.000
s = 0.6222 R-sq = 77.3% R-sq(adj) = 76.3%

11.34. (a) Stemplots (below) show that all four variables are right-skewed to some degree.
Variable   x̄        s        Min    Q1     M      Q3      Max
VO+ 985.806 579.858 285.0 513.0 870.0 1251.0 2545.0
VO– 889.194 427.616 254.0 536.0 903.0 1028.0 2236.0
OC 33.416 19.610 8.1 17.9 30.2 47.7 77.9
TRAP 13.248 6.528 3.3 8.8 10.3 19.0 28.8

(b) Correlations and scatterplots (below) show that all six pairs of variables are positively
associated. The strongest association is between VO+ and VO– and the weakest is between
OC and VO–.

[Stemplots of VO+, VO–, OC, and TRAP are not reproduced here, nor are the scatterplots. Correlations: VO+/VO– 0.8958; VO+/OC 0.6596; VO+/TRAP 0.7649; VO–/OC 0.4548; VO–/TRAP 0.6779; TRAP/OC 0.7299.]

11.35. (a) See the previous solution for the scatterplot, which suggests greater variation in VO+ for large OC. The regression equation is

VO+ = 334.0 + 19.505 OC

with s ≈ 443.3 and R² ≈ 0.435; the test statistic for the slope is t = 4.73 (P < 0.0005), so we conclude the slope is not zero. The plot of residuals against OC (not reproduced here) suggests a slight downward curve on the right end, as well as increasing scatter as OC increases. The residuals are also somewhat right-skewed. A stemplot and Normal quantile plot of the residuals are not shown here but could be included as part of the analysis. (b) The regression equation is

VO+ = 57.7 + 6.415 OC + 53.87 TRAP

with s ≈ 376.3 and R² ≈ 0.607. The coefficient of OC is not significantly different from 0 (t = 1.25, P = 0.221), but the coefficient of TRAP is (t = 3.50, P = 0.002). This is consistent with the correlations found in the solution to Exercise 11.34: TRAP is more highly correlated with VO+, and is also highly correlated with OC, so it is reasonable that, if TRAP is present in the model, little additional information is gained from OC.

Minitab output: Regression of VO+ on OC (Model 1)


The regression equation is VOplus = 334 + 19.5 OC
Predictor Coef Stdev t-ratio p
Constant 334.0 159.2 2.10 0.045
OC 19.505 4.127 4.73 0.000
s = 443.3 R-sq = 43.5% R-sq(adj) = 41.6%
Regression of VO+ on OC and TRAP (Model 2)
The regression equation is VOplus = 58 + 6.41 OC + 53.9 TRAP
Predictor Coef Stdev t-ratio p
Constant 57.7 156.5 0.37 0.715
OC 6.415 5.125 1.25 0.221
TRAP 53.87 15.39 3.50 0.002
s = 376.3 R-sq = 60.7% R-sq(adj) = 57.9%

11.36. (a) The model is

VO+ = β0 + β1 OC + β2 TRAP + β3 VOminus + εi

where the εi are independent N(0, σ) random variables. (b)–(d) The table below summarizes the results for all the regressions called for in this exercise. The estimated coefficients and P-values can change rather drastically from one model to the next. Generally, R² increases (sometimes only slightly) as we add more explanatory variables to the model. (e) The results of the regression in part (b) suggest that we remove TRAP from the model. This regression equation and associated results are also in the table below. Because R² drops only slightly for this simpler model, this is probably the best of all models we have considered to this point.

Model 1: VO+ = 334.0 + 19.505 OC, with R² = 0.435 and s = 443.3.
    OC: SE = 4.127, t = 4.73, P < 0.0005.
Model 2: VO+ = 57.7 + 6.415 OC + 53.87 TRAP, with R² = 0.607 and s = 376.3.
    OC: SE = 5.125, t = 1.25, P = 0.221. TRAP: SE = 15.39, t = 3.50, P = 0.002.
Model 3: VO+ = −243.5 + 8.235 OC + 6.61 TRAP + 0.975 VOminus, with R² = 0.884 and s = 207.8.
    OC: SE = 2.840, t = 2.90, P = 0.007. TRAP: SE = 10.33, t = 0.64, P = 0.528. VOminus: SE = 0.1211, t = 8.05, P < 0.0005.
Model 4: VO+ = −234.1 + 9.404 OC + 1.019 VOminus, with R² = 0.883 and s = 205.6.
    OC: SE = 2.150, t = 4.37, P < 0.0005. VOminus: SE = 0.0986, t = 10.33, P < 0.0005.

Minitab output: Regression of VO+ on OC, TRAP and VO– (Model 3)


The regression equation is VOplus = -243 + 8.23 OC + 6.6 TRAP + 0.975 VOminus
Predictor Coef Stdev t-ratio p
Constant -243.49 94.22 -2.58 0.015
OC 8.235 2.840 2.90 0.007
TRAP 6.61 10.33 0.64 0.528
VOminus 0.9746 0.1211 8.05 0.000
s = 207.8 R-sq = 88.4% R-sq(adj) = 87.2%
Regression of VO+ on OC and VO– (Model 4)
The regression equation is VOplus = -234 + 9.40 OC + 1.02 VOminus
Predictor Coef Stdev t-ratio p
Constant -234.14 92.09 -2.54 0.017
OC 9.404 2.150 4.37 0.000
VOminus 1.01857 0.09858 10.33 0.000
s = 205.6 R-sq = 88.3% R-sq(adj) = 87.4%

11.37. Stemplots (below) show that all four variables are noticeably less skewed.
Variable   x̄       s       Min    Q1     M      Q3     Max
LVO+ 6.7418 0.5555 5.652 6.240 6.768 7.132 7.842
LVO– 6.6816 0.4832 5.537 6.284 6.806 6.935 7.712
LOC 3.3380 0.6085 2.092 2.885 3.408 3.865 4.355
LTRAP 2.4674 0.4978 1.194 2.175 2.332 2.944 3.360
Correlations and scatterplots (below) show that all six pairs of variables are positively associated. The strongest association is between LVO+ and LVO– and the weakest is between LOC and LVO–. The regression equations for these transformed variables are given in the table below, along with significance test results. Residual analysis for these regressions is not shown.
The final conclusion is the same as for the untransformed data: When we use all three explanatory variables to predict LVO+, the coefficient of LTRAP is not significantly different from 0, and we then find that the model that uses LOC and LVO– to predict LVO+ is nearly as good (in terms of R²), making it the best of the bunch.

[Stemplots of LVO+, LVO–, LOC, and LTRAP are not reproduced here.]
Model 1: LVO+ = 4.3841 + 0.7063 LOC, with R² = 0.599 and s = 0.3580.
    LOC: SE = 0.1074, t = 6.58, P < 0.0005.
Model 2: LVO+ = 4.2590 + 0.4304 LOC + 0.4240 LTRAP, with R² = 0.652 and s = 0.3394.
    LOC: SE = 0.1680, t = 2.56, P = 0.016. LTRAP: SE = 0.2054, t = 2.06, P = 0.048.
Model 3: LVO+ = 0.8716 + 0.3922 LOC + 0.0275 LTRAP + 0.6725 LVOminus, with R² = 0.842 and s = 0.2326.
    LOC: SE = 0.1154, t = 3.40, P = 0.002. LTRAP: SE = 0.1570, t = 0.18, P = 0.842. LVOminus: SE = 0.1178, t = 5.71, P < 0.0005.
Model 4: LVO+ = 0.8321 + 0.4061 LOC + 0.6816 LVOminus, with R² = 0.842 and s = 0.2286.
    LOC: SE = 0.0824, t = 4.93, P < 0.0005. LVOminus: SE = 0.1038, t = 6.57, P < 0.0005.

[Scatterplots are not reproduced here. Correlations: log(VO+)/log(VO–) 0.8397; log(VO+)/log(OC) 0.7737; log(VO+)/log(TRAP) 0.7550; log(VO–)/log(OC) 0.5547; log(VO–)/log(TRAP) 0.6643; log(TRAP)/log(OC) 0.7954.]

11.38. Refer to the solution to Exercise 11.34 for the scatterplots. Note that, in this case, it
really makes the most sense to use TRAP (rather than OC) to predict VO– (because it is the
appropriate biomarker), but many students might miss that detail. Both single-explanatory
variable models are given in the first table on the following page. Residual analysis plots are
not included. Our conclusion here is similar to the conclusion in Exercises 11.36 and 11.37:
The best model is to use OC and VO+ to predict VO–.

11.39. Refer to the solution to Exercise 11.37 for the scatterplots. As in the previous exercise,
the more logical single-variable model would be to use LTRAP to predict LVO–, but many
students might miss that detail. Both single-explanatory variable models are given in the
second table on the following page. Residual analysis plots are not included. This time, we
might conclude that the best model is to predict LVO– from LVO+ alone; neither biomarker
variable makes an indispensable contribution to the prediction.

For Exercise 11.38:
Model 1: VO– = 557.8 + 9.917 OC, with R² = 0.207 and s = 387.4.
    OC: SE = 3.606, t = 2.75, P = 0.010.
Model 2: VO– = 300.9 + 44.41 TRAP, with R² = 0.460 and s = 319.7.
    TRAP: SE = 8.942, t = 4.97, P < 0.0005.
Model 3: VO– = 309.1 − 1.868 OC + 48.50 TRAP, with R² = 0.463 and s = 324.4.
    OC: SE = 4.418, t = −0.42, P = 0.676. TRAP: SE = 13.27, t = 3.66, P = 0.001.
Model 4: VO– = 267.3 − 6.513 OC + 9.485 TRAP + 0.724 VOplus, with R² = 0.842 and s = 179.2.
    OC: SE = 2.507, t = −2.60, P = 0.015. TRAP: SE = 8.788, t = 1.08, P = 0.29. VOplus: SE = 0.090, t = 8.05, P < 0.0005.
Model 5: VO– = 298.0 − 5.254 OC + 0.778 VOplus, with R² = 0.835 and s = 179.7.
    OC: SE = 2.226, t = −2.36, P = 0.025. VOplus: SE = 0.0753, t = 10.33, P < 0.0005.

For Exercise 11.39:
Model 1: LVO– = 5.2110 + 0.4406 LOC, with R² = 0.308 and s = 0.4089.
    LOC: SE = 0.1227, t = 3.59, P = 0.001.
Model 2: LVO– = 5.0905 + 0.6449 LTRAP, with R² = 0.441 and s = 0.3674.
    LTRAP: SE = 0.1347, t = 4.79, P < 0.0005.
Model 3: LVO– = 5.0370 + 0.0569 LOC + 0.5896 LTRAP, with R² = 0.443 and s = 0.3732.
    LOC: SE = 0.1848, t = 0.31, P = 0.761. LTRAP: SE = 0.2259, t = 2.61, P = 0.014.
Model 4: LVO– = 1.5729 − 0.2932 LOC + 0.2447 LTRAP + 0.8134 LVOplus, with R² = 0.748 and s = 0.2558.
    LOC: SE = 0.1407, t = −2.08, P = 0.047. LTRAP: SE = 0.1662, t = 1.47, P = 0.152. LVOplus: SE = 0.1425, t = 5.71, P < 0.0005.
Model 5: LVO– = 1.3109 − 0.1878 LOC + 0.8896 LVOplus, with R² = 0.728 and s = 0.2611.
    LOC: SE = 0.1237, t = −1.52, P = 0.140. LVOplus: SE = 0.1355, t = 6.57, P < 0.0005.
Model 6: LVO– = 1.7570 + 0.7304 LVOplus, with R² = 0.705 and s = 0.2669.
    LVOplus: SE = 0.0877, t = 8.33, P < 0.0005.

11.40. (a) Histograms are below; all distributions are sharply right-skewed.
Variable   x̄        s        Min      Q1       M        Q3       Max
PCB 68.4674 59.3906 6.0996 29.8305 47.9596 91.7140 318.746
PCB52 0.9580 1.5983 0.0200 0.2180 0.4770 0.8925 9.060
PCB118 3.2563 3.0191 0.2360 1.4800 2.4200 3.8950 18.900
PCB138 6.8268 5.8627 0.6400 2.9700 4.9200 8.7150 32.300
PCB180 4.1584 4.9864 0.3950 1.1950 2.6900 4.5900 31.500

(b) Scatterplots and correlations are below. All pairs of variables are positively associated, although some only weakly. In general, even when the association is strong, the plots show more variation for large values of the two variables. If we test H0: ρ = 0 versus Ha: ρ ≠ 0 for these correlations, we find that P < 0.0005 for eight of them. The PCB52/PCB138 correlation is less significant (r ≈ 0.3009, t = 2.58, P = 0.0120), and the PCB52/PCB180 correlation is not significantly different from 0 (r ≈ 0.0869, t = 0.71, P = 0.4775).

[Histograms of PCB, PCB52, PCB118, PCB138, and PCB180 are not reproduced here, nor are the scatterplots. Correlations: PCB/PCB52 0.5964; PCB/PCB118 0.8433; PCB/PCB138 0.9288; PCB/PCB180 0.8008; PCB52/PCB118 0.6849; PCB52/PCB138 0.3009; PCB52/PCB180 0.0869; PCB118/PCB138 0.7294; PCB118/PCB180 0.4374; PCB138/PCB180 0.8823.]

11.41. (a) The model is

PCBi = β0 + β1 PCB52 + β2 PCB118 + β3 PCB138 + β4 PCB180 + εi

where i = 1, 2, . . . , 69, and the εi are independent N(0, σ) random variables. (b) The regression equation is

PCB = 0.937 + 11.8727 PCB52 + 3.7611 PCB118 + 3.8842 PCB138 + 4.1823 PCB180

with s ≈ 6.382 and R² ≈ 0.989. All coefficients are significantly different from 0, although the constant 0.937 is not (t = 0.76, P = 0.449). That makes some sense—if none of these four congeners are present, it might be somewhat reasonable to predict that the total amount of PCB is 0. (c) The residuals (stemplot below) appear to be roughly Normal, but with two outliers. There are no clear patterns when plotted against the explanatory variables (these plots are not shown).

Stemplot of the residuals:
−2 2
−1
−1 31
−0 8776655
−0 44433332222111111000000
0 000000000001111222223333444
0 677778
1 12
1
2 2

Minitab output: Regression of PCB on PCB52, PCB118, PCB138, and PCB180


PCB = 0.94 + 11.9 PCB52 + 3.76 PCB118 + 3.88 PCB138 + 4.18 PCB180
Predictor Coef Stdev t-ratio p
Constant 0.937 1.229 0.76 0.449
PCB52 11.8727 0.7290 16.29 0.000
PCB118 3.7611 0.6424 5.85 0.000
PCB138 3.8842 0.4978 7.80 0.000
PCB180 4.1823 0.4318 9.69 0.000
s = 6.382 R-sq = 98.9% R-sq(adj) = 98.8%

11.42. (a) The outliers are specimen #50 (residual −22.0864) and #65 (22.5487). Because residuals are observed values minus predicted values, the negative residual (#50) is an overestimate. (The estimated PCB for this specimen is PCB ≈ 144.882, and the actual level was 122.796.) (b) The regression equation is

PCB = 1.6277 + 14.4420 PCB52 + 2.5996 PCB118 + 4.0541 PCB138 + 4.1086 PCB180

with s ≈ 4.555 and R² ≈ 0.994. As before, all coefficients are significantly different from 0, although the constant is barely not different (t = 1.84, P = 0.071). The residuals (stemplot below) again appear to be roughly Normal, but two new specimens (#44 and #58) show up as outliers to replace the two we removed. There are no clear patterns when plotted against the explanatory variables (these plots are not shown).

Stemplot of the residuals:
−1 2
−1
−0 98
−0 76
−0 5544
−0 33332222
−0 1111111100000000000
0 0000000000111111
0 22233
0 445
0 677
0 889
1
1
1 4

Minitab output: Regression of PCB on the four congeners (outliers removed)


PCB = 1.63 + 14.4 PCB52 + 2.60 PCB118 + 4.05 PCB138 + 4.11 PCB180
Predictor Coef Stdev t-ratio p
Constant 1.6277 0.8858 1.84 0.071
PCB52 14.4420 0.6960 20.75 0.000
PCB118 2.5996 0.5164 5.03 0.000
PCB138 4.0541 0.3752 10.80 0.000
PCB180 4.1086 0.3175 12.94 0.000
s = 4.555 R-sq = 99.4% R-sq(adj) = 99.4%

11.43. (a) The regression equation is

PCB = −1.018 + 12.644 PCB52 + 0.3131 PCB118 + 8.2546 PCB138

with s ≈ 9.945 and R² ≈ 0.973. Residual analysis (not shown) suggests a few areas of concern: The distribution of residuals has heavier tails than a Normal distribution, and the scatter (that is, prediction error) is greater for larger values of the predicted PCB. (b) The estimated coefficient of PCB118 is b2 ≈ 0.3131; its P-value is 0.708. (Details in Minitab output below.) (c) In Exercise 11.41, b2 ≈ 3.7611 and P < 0.0005. (d) This illustrates how complicated multiple regression can be: When we add PCB180 to the model, it complements PCB118, making it useful for prediction.

Minitab output: Regression of PCB on PCB52, PCB118, and PCB138


PCB = -1.02 + 12.6 PCB52 + 0.313 PCB118 + 8.25 PCB138
Predictor Coef Stdev t-ratio p
Constant -1.018 1.890 -0.54 0.592
PCB52 12.644 1.129 11.20 0.000
PCB118 0.3131 0.8333 0.38 0.708
PCB138 8.2546 0.3279 25.18 0.000
s = 9.945 R-sq = 97.3% R-sq(adj) = 97.2%

11.44. (a) Because TEQ is defined as the sum TEQPCB + TEQDIOXIN + TEQFURAN, we have
β0 = 0 and β1 = β2 = β3 = 1. (b) The error terms are all zero, so they have no scatter;
therefore, σ = 0. (c) Results will vary slightly with software, but except for rounding error,
the regression confirms the values in parts (a) and (b).

Minitab output: Regression of TEQ on TEQPCB, TEQDIOXIN, and TEQFURAN


TEQ =0.000000 + 1.00 TEQPCB + 1.00 TEQDIOXIN + 1.00 TEQFURAN
Predictor Coef Stdev t-ratio p
Constant 0.00000032 0.00000192 0.16 0.870
TEQPCB 1.00000 0.00000 1211707.25 0.000
TEQDIOXIN 1.00000 0.00000 566800.75 0.000
TEQFURAN 1.00000 0.00001 176270.48 0.000
s = 0.000007964 R-sq = 100.0% R-sq(adj) = 100.0%

11.45. The model is

TEQi = β0 + β1 PCB52 + β2 PCB118 + β3 PCB138 + β4 PCB180 + εi

where i = 1, 2, . . . , 69, and the εi are independent N(0, σ) random variables. The regression equation is

TEQ = 1.0600 − 0.0973 PCB52 + 0.3062 PCB118 + 0.1058 PCB138 − 0.0039 PCB180

with s ≈ 0.9576 and R² ≈ 0.677. Only the constant and the PCB118 coefficient are significantly different from 0; see Minitab output below. Residuals (stemplot below) are slightly right-skewed and show no clear patterns when plotted with the explanatory variables (not shown).

Stemplot of the residuals:
−1 66
−1 4200
−0 987666666666555555
−0 44444333221111100
0 0000222224
0 566667788
1 23334
1 9
2 3
2 57

Minitab output: Regression of TEQ on the four PCB congeners


TEQ = 1.06 - 0.097 PCB52 + 0.306 PCB118 + 0.106 PCB138 - 0.0039 PCB180
Predictor Coef Stdev t-ratio p
Constant 1.0600 0.1845 5.75 0.000
PCB52 -0.0973 0.1094 -0.89 0.377
PCB118 0.30618 0.09639 3.18 0.002
PCB138 0.10579 0.07470 1.42 0.162
PCB180 -0.00391 0.06478 -0.06 0.952
s = 0.9576 R-sq = 67.7% R-sq(adj) = 65.7%

11.46. (a) Results will vary with software. (b) Different software may produce different results, but (presumably) all software will ignore those 16 specimens, which is probably not a good approach. (c) The summary statistics and stemplots (below) are based on natural logarithms; for common logarithms, divide each mean and standard deviation by 2.3026. For LPCB126, the zero terms were replaced with 0.0026 (ln 0.0026 ≈ −5.9522), which accounts for the odd appearance of its stemplot.

Variable    x̄        s
LPCB28    −1.3345   1.1338
LPCB52    −0.7719   1.1891
LPCB118    0.8559   0.8272
LPCB126   −4.8457   0.7656
LPCB138    1.6139   0.8046
LPCB153    1.7034   0.9012
LPCB180    0.9752   0.9276
LPCB       3.9170   0.8020
LTEQ       0.8048   0.5966

[Stemplots of LPCB28, LPCB52, LPCB118, LPCB126, LPCB138, LPCB153, LPCB180, LPCB, and LTEQ are not reproduced here; the LPCB126 stemplot shows a tall stack of 16 leaves near −5.95 from the replaced zeros.]
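
The transformation step in part (c) is easy to script; a small pandas sketch (the data frame below is a made-up stand-in, but 0.0026 is the replacement value used above):

import numpy as np
import pandas as pd

pcb = pd.DataFrame({"PCB126": [0.0, 0.019, 0.0, 0.054]})  # contains exact zeros
lpcb126 = np.log(pcb["PCB126"].replace(0, 0.0026))
print(lpcb126)    # the zeros become ln 0.0026 = -5.9522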

11.47. (a) The correlations (all positive) are listed in the table below. The largest correlation
is 0.956 (LPCB and LPCB138); the smallest (0.227, for LPCB28 and LPCB180) is not
quite significantly different from 0 (t = 1.91, P = 0.0607), but with 28 correlations, such a P-value could easily arise by chance, so we would not necessarily conclude that ρ ≠ 0.
Rather than showing all 28 scatterplots—which are all fairly linear and confirm the positive
associations suggested by the correlations—we have included only two of the interesting
ones: LPCB against LPCB28 and LPCB against LPCB126. The former is notable because of
one outlier (specimen 39) in LPCB28; the latter stands out because of the “stack” of values
in the LPCB126 data set that arose from the adjustment of the zero terms. (The outlier in
LPCB28 and the stack in LPCB126 can be seen in other plots involving those variables; the
two plots shown are the most appropriate for using the PCB congeners to predict LPCB, as
the next exercise asks.) (b) All correlations are higher with the transformed data. In part,
this is because these scatterplots do not exhibit the “greater scatter in the upper right” that
was seen in many of the scatterplots of the original data.
LPCB28 LPCB52 LPCB118 LPCB126 LPCB138 LPCB153 LPCB180
LPCB52 0.795
LPCB118 0.533 0.671
LPCB126 0.272 0.331 0.739
LPCB138 0.387 0.540 0.890 0.792
LPCB153 0.326 0.519 0.780 0.647 0.922
LPCB180 0.227 0.301 0.654 0.695 0.896 0.867
LPCB 0.570 0.701 0.906 0.729 0.956 0.905 0.829

[Scatterplots of log(PCB) against log(PCB28) and against log(PCB126) are not reproduced here.]

11.48. Student results will vary with how many different models they try, and what tradeoff they consider between “good” (in terms of large R²) and “simple” (in terms of the number of variables included in the model). The first Minitab output below, produced with the BREG (best regression) command, gives some guidance as to likely answers; it shows the best models with one, two, three, four, five, six, and seven explanatory variables. We can see, for example, that if all variables are used, R² = 0.975, but we can achieve similar values of R² with fewer variables. The best regressions with two, three, and four explanatory variables are shown in the Minitab output below.

Minitab output: Best subsets regression


L L L L L
L L P P P P P
P P C C C C C
C C B B B B B
B B 1 1 1 1 1
Adj. 2 5 1 2 3 5 8
Vars R-sq R-sq C-p s 8 2 8 6 8 3 0
1 91.4 91.3 141.4 0.23689 X
2 96.2 96.1 28.5 0.15892 X X
3 96.8 96.6 16.1 0.14696 X X X
4 97.2 97.0 8.2 0.13826 X X X X
5 97.3 97.0 8.6 0.13776 X X X X X
6 97.5 97.2 6.0 0.13389 X X X X X X
7 97.5 97.2 8.0 0.13497 X X X X X X X
Best regression using two explanatory variables
LPCB = 2.74 + 0.175 LPCB52 + 0.813 LPCB138
Predictor Coef Stdev t-ratio p
Constant 2.74038 0.05860 46.76 0.000
LPCB52 0.17533 0.01926 9.10 0.000
LPCB138 0.81294 0.02846 28.56 0.000
s = 0.1589 R-sq = 96.2% R-sq(adj) = 96.1%
Best regression using three explanatory variables
LPCB = 2.79 + 0.0908 LPCB28 + 0.104 LPCB52 + 0.821 LPCB138
Predictor Coef Stdev t-ratio p
Constant 2.79394 0.05633 49.60 0.000
LPCB28 0.09078 0.02601 3.49 0.001
LPCB52 0.10371 0.02717 3.82 0.000
LPCB138 0.82056 0.02641 31.07 0.000
s = 0.1470 R-sq = 96.8% R-sq(adj) = 96.6%
Best regression using four explanatory variables
The regression equation is
LPCB = 2.79 + 0.107 LPCB28 + 0.0876 LPCB52 + 0.669 LPCB138 + 0.151 LPCB153
Predictor Coef Stdev t-ratio p
Constant 2.79081 0.05300 52.65 0.000
LPCB28 0.10684 0.02503 4.27 0.000
LPCB52 0.08763 0.02610 3.36 0.001
LPCB138 0.66854 0.05538 12.07 0.000
LPCB153 0.15118 0.04921 3.07 0.003
s = 0.1383 R-sq = 97.2% R-sq(adj) = 97.0%
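
Software without a best-subsets command can brute-force the same search. A Python sketch (the helper functions are ours; logpcb is assumed to be a pandas data frame holding the transformed variables):

from itertools import combinations
import numpy as np

def r_squared(X, y):
    # R^2 for the least-squares regression of y on X plus an intercept.
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid @ resid / np.sum((y - y.mean())**2)

def best_subsets(df, response, predictors):
    y = df[response].to_numpy()
    for k in range(1, len(predictors) + 1):
        best = max(combinations(predictors, k),
                   key=lambda c: r_squared(df[list(c)].to_numpy(), y))
        print(k, best, round(r_squared(df[list(best)].to_numpy(), y), 3))

Calling best_subsets(logpcb, "LPCB", ["LPCB28", "LPCB52", "LPCB118", "LPCB126", "LPCB138", "LPCB153", "LPCB180"]) should match Minitab's table, since it examines exactly the same models.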

11.49. Using Minitab’s BREG (best regression) command for guidance, we see that there is little improvement in R² beyond models with four explanatory variables. The best models with two, three, and four variables are given in the Minitab output below.

Minitab output: Best subsets regression


L L L L L
L L P P P P P
P P C C C C C
C C B B B B B
B B 1 1 1 1 1
Adj. 2 5 1 2 3 5 8
Vars R-sq R-sq C-p s 8 2 8 6 8 3 0
1 72.9 72.5 10.8 0.31266 X
2 76.8 76.1 2.0 0.29166 X X
3 77.6 76.6 1.6 0.28859 X X X
4 78.0 76.7 2.5 0.28816 X X X X
5 78.1 76.4 4.2 0.28981 X X X X X
6 78.2 76.1 6.1 0.29188 X X X X X X
7 78.2 75.7 8.0 0.29400 X X X X X X X
Best regression using two explanatory variables
The regression equation is
LTEQ = 3.96 + 0.107 LPCB28 + 0.622 LPCB126
Predictor Coef Stdev t-ratio p
Constant 3.9637 0.2275 17.42 0.000
LPCB28 0.10749 0.03242 3.32 0.001
LPCB126 0.62231 0.04801 12.96 0.000
s = 0.2917 R-sq = 76.8% R-sq(adj) = 76.1%
Best regression using three explanatory variables
The regression equation is
LTEQ = 3.44 + 0.0777 LPCB28 + 0.114 LPCB118 + 0.543 LPCB126
Predictor Coef Stdev t-ratio p
Constant 3.4445 0.4029 8.55 0.000
LPCB28 0.07773 0.03736 2.08 0.041
LPCB118 0.11371 0.07319 1.55 0.125
LPCB126 0.54345 0.06952 7.82 0.000
s = 0.2886 R-sq = 77.6% R-sq(adj) = 76.6%
Best regression using four explanatory variables
The regression equation is
LTEQ = 3.56 + 0.0720 LPCB28 + 0.170 LPCB118 + 0.554 LPCB126 - 0.0693 LPCB153
Predictor Coef Stdev t-ratio p
Constant 3.5568 0.4152 8.57 0.000
LPCB28 0.07199 0.03767 1.91 0.060
LPCB118 0.16973 0.08928 1.90 0.062
LPCB126 0.55374 0.07005 7.90 0.000
LPCB153 -0.06929 0.06344 -1.09 0.279
s = 0.2882 R-sq = 78.0% R-sq(adj) = 76.7%

11.50. The degree of change in these elements of a regression can be readily seen by
comparing the three regression results shown in the solution to Exercise 11.48; they will be
even more visible if students have explored more models in their search for the best model.
Student explanations might include observations of changes in particular coefficients from
one model to another and perhaps might attempt to paraphrase the text’s comments about
why this happens.

11.51. In the table, two IQRs are given; those in parentheses are based on quartiles reported by Minitab, which computes quartiles in a slightly different way from this text’s method.

          x̄       M       s        IQR
Taste    24.53   20.95   16.26    23.9 (or 24.58)
Acetic    5.498   5.425   0.571    0.656 (or 0.713)
H2S       5.942   5.329   2.127    3.689 (or 3.766)
Lactic    1.442   1.450   0.3035   0.430 (or 0.4625)

None of the variables show striking deviations from Normality in the quantile plots (not shown). Taste and H2S are slightly right-skewed, and Acetic has two peaks. There are no outliers.

[Stemplots of Taste, Acetic, H2S, and Lactic are not reproduced here.]

11.52. The plots show positive associations between the variables. The correlations and P-values are in the plots; all correlations are positive (as expected) and significantly different from 0. (Recall that the P-values are correct if the two variables are Normally distributed, in which case t = r√(n − 2) / √(1 − r²) has a t(n − 2) distribution if ρ = 0.)

[Scatterplots are not reproduced here. Correlations (with P-values): Taste/Acetic 0.5495 (P = 0.0017); Taste/H2S 0.7558 (P < 0.0001); Taste/Lactic 0.7042 (P < 0.0001); Acetic/H2S 0.6180 (P = 0.0003); Acetic/Lactic 0.6038 (P = 0.0004); H2S/Lactic 0.6448 (P = 0.0001).]
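
The t statistic in the note above is exactly the test behind scipy.stats.pearsonr; a quick sketch with placeholder data in place of the cheese measurements:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=30)                       # placeholder, n = 30 as in this data set
y = 0.7 * x + rng.normal(size=30)
r, p = stats.pearsonr(x, y)                   # two-sided P-value
t = r * np.sqrt(len(x) - 2) / np.sqrt(1 - r**2)
print(r, t, p, 2 * stats.t.sf(abs(t), len(x) - 2))   # the last two values agree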

11.53. The regression equation is Taste = −61.50 + 15.648 Acetic, with s ≈ 13.82 and R² ≈ 0.302. The slope is significantly different from 0 (t = 3.48, P = 0.002).
Based on a stemplot (below) and quantile plot (not shown), the residuals seem to have a Normal distribution. Scatterplots (below) reveal positive associations between residuals and both H2S and Lactic. The plot of residuals against Acetic suggests greater scatter in the residuals for large Acetic values.

Stemplot of the residuals:
−2 9
−2 11
−1 65
−1 31
−0 7655
−0 21
0 0122224
0 5668
1
1 5679
2 0
2 6

[Plots of Taste vs. Acetic and of the residuals against Acetic, H2S, and Lactic are not reproduced here.]
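
Each of the simple regressions in Exercises 11.53 through 11.55 can be reproduced with scipy.stats.linregress; a sketch with simulated stand-in data in place of the real Taste and Acetic columns:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
acetic = rng.uniform(4.4, 6.5, 30)            # placeholder values on the Acetic scale
taste = -61.5 + 15.6 * acetic + rng.normal(0, 14, 30)
fit = stats.linregress(acetic, taste)
print(fit.slope, fit.intercept, fit.rvalue**2, fit.pvalue)
residuals = taste - (fit.intercept + fit.slope * acetic)   # for residual plots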

11.54. The regression equation is Taste = −9.787 + 5.7761 H2S, with s ≈ 10.83 and R² ≈ 0.571. The slope is significantly different from 0 (t = 6.11, P < 0.0005).
Based on a stemplot (below) and quantile plot (not shown), the residuals may be slightly skewed, but do not differ greatly from a Normal distribution. Scatterplots (below) reveal weak positive associations between residuals and both Acetic and Lactic. The plot of residuals against H2S suggests greater scatter in the residuals for large H2S values.

Stemplot of the residuals:
−1 5
−1 42210
−0 87665
−0 44333
0 1233
0 556779
1 4
1 7
2 1
2 5

[Plots of Taste vs. H2S and of the residuals against H2S, Acetic, and Lactic are not reproduced here.]

11.55. The regression equation is Taste = −29.86 + 37.720 Lactic, with s ≈ 11.75 and R² ≈ 0.496. The slope is significantly different from 0 (t = 5.25, P < 0.0005).
Based on a stemplot (below) and quantile plot (not shown), the residuals appear to be roughly Normal. Scatterplots (below) reveal no striking patterns for residuals vs. Acetic and H2S.

Stemplot of the residuals:
−1 965
−1 331
−0 988665
−0 210
0 0122
0 567999
1 04
1 58
2
2 7

[Plots of Taste vs. Lactic and of the residuals against Lactic, Acetic, and H2S are not reproduced here.]

11.56. All information is in the table below. The intercepts differ from one model to the next because they represent different things—for example, in the first model, the intercept is the predicted value of Taste with Acetic = 0, etc.

x        Fitted equation                  F       P         r²      s
Acetic   Taste = −61.50 + 15.648x       12.11   0.002     30.2%   13.82
H2S      Taste = −9.787 + 5.7761x       37.29   <0.0005   57.1%   10.83
Lactic   Taste = −29.86 + 37.720x       27.55   <0.0005   49.6%   11.75

11.57. The regression equation is Taste = −26.94 + 3.801 Acetic + 5.146 H2S, with s ≈ 10.89 and R² ≈ 0.582. The t-value for the coefficient of Acetic is 0.84 (P = 0.406), indicating that it does not add significantly to the model when H2S is used, because Acetic and H2S are correlated (in fact, r = 0.618 for these two variables). This model does a better job than any of the three simple linear regression models, but it is not much better than the model with H2S alone (which explained 57.1% of the variation in Taste)—as we might expect from the t-test result.

Minitab output: Regression of taste on acetic and h2s


taste = -26.9 + 3.80 acetic + 5.15 h2s
Predictor Coef Stdev t-ratio p
Constant -26.94 21.19 -1.27 0.215
acetic 3.801 4.505 0.84 0.406
h2s 5.146 1.209 4.26 0.000
s = 10.89 R-sq = 58.2% R-sq(adj) = 55.1%

11.58. The regression equation is Taste = −27.592 + 3.946 H2S + 19.887 Lactic, with s ≈ 9.942. The model explains 65.2% of the variation in Taste, which is higher than for the two simple linear regressions. Both coefficients are significantly different from 0 (P = 0.002 for H2S, and P = 0.019 for Lactic).

Minitab output: Regression of taste on h2s and lactic


taste = -27.6 + 3.95 h2s + 19.9 lactic
Predictor Coef Stdev t-ratio p
Constant -27.592 8.982 -3.07 0.005
h2s 3.946 1.136 3.47 0.002
lactic 19.887 7.959 2.50 0.019
s = 9.942 R-sq = 65.2% R-sq(adj) = 62.6%

11.59. The regression equation is Taste = −28.88 + 0.328 Acetic + 3.912 H2S + 19.671 Lactic, with s ≈ 10.13. The model explains 65.2% of the variation in Taste (the same as for the model with only H2S and Lactic). Residuals of this regression appear to be Normally distributed and show no patterns in scatterplots with the explanatory variables. (These plots are not shown.)
The coefficient of Acetic is not significantly different from 0 (P = 0.942); there is no gain in adding Acetic to the model with H2S and Lactic. It appears that the best model is the H2S/Lactic model of Exercise 11.58.

Minitab output: Regression of taste on acetic, h2s, and lactic


taste = -28.9 + 0.33 acetic + 3.91 h2s + 19.7 lactic
Predictor Coef Stdev t-ratio p
Constant -28.88 19.74 -1.46 0.155
acetic 0.328 4.460 0.07 0.942
h2s 3.912 1.248 3.13 0.004
lactic 19.671 8.629 2.28 0.031
s = 10.13 R-sq = 65.2% R-sq(adj) = 61.2%
Chapter 12 Solutions

12.1. (a) H0 says the population means are all equal. (b) Experiments are best for establishing
causation. (c) ANOVA is used to compare means (and assumes that the variances are equal).
(d) Multiple comparisons procedures are used when we wish to determine which means
are significantly different, but have no specific relations in mind before looking at the data.
(Contrasts are used when we have prior expectations about the differences.)

12.2. (a) If we reject H0, we conclude that at least one mean is different from the rest. (b) One-way ANOVA is used to compare two or more means. (When only two means are to be compared, we usually use a two-sample t test.) (c) Two-way ANOVA is used to examine the effect of two explanatory variables (which have two or more values) on a response variable (which is assumed to have a Normal distribution, meaning that it can take any value, at least in theory).

12.3. We were given sample sizes n1 = 23, n2 = 20, and n3 = 28 and standard deviations s1 = 5, s2 = 5, and s3 = 6. (a) Yes: The guidelines for pooling standard deviations say that the ratio of largest to smallest should be less than 2; we have 6/5 = 1.2 < 2. (b) Squaring the three standard deviations gives s1² = 25, s2² = 25, and s3² = 36. (c) sp² = (22s1² + 19s2² + 27s3²)/(22 + 19 + 27) ≈ 29.3676. (d) sp = √(sp²) ≈ 5.4192.
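
The same computation in a short numpy sketch:

import numpy as np

n = np.array([23, 20, 28])
s = np.array([5.0, 5.0, 6.0])
sp2 = np.sum((n - 1) * s**2) / np.sum(n - 1)   # weight each variance by its df
print(sp2, np.sqrt(sp2))                       # 29.3676... and 5.4192...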

12.4. (a)–(c) Sketches are not reproduced here.

12.5. (a) This sentence describes between-group variation. Within-group variation is the
variation that occurs by chance among members of the same group. (b) The sums of squares
(not the mean squares) in an ANOVA table will add. (c) The common population standard
deviation σ (not its estimate sp ) is a parameter. (d) A small P means the means are not all
the same, but the distributions may still overlap quite a bit. (See the “Caution” immediately
preceding this exercise in the text.)

12.6. The answers are found in Table E (or using software) with p = 0.05 and degrees of
freedom I − 1 and N − I . (a) I = 4, N = 16, df 3 and 12: F > 3.49 (software: 3.4903).
(b) I = 4, N = 24, df 3 and 20: F > 3.10 (software: 3.0984). (c) I = 4, N = 32, df 3
and 28: F > 2.95 (software: 2.9467). (d) As the degrees of freedom increase, values from
an F distribution tend to be smaller (closer to 1), so smaller values of F are statistically
significant. In terms of ANOVA conclusions, we have learned that with smaller samples
(fewer observations per group), the F statistic needs to be fairly large in order to reject H0 .
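
The software values quoted here are inverse-F computations; for example, with scipy:

from scipy import stats

for N in (16, 24, 32):
    print(N, stats.f.ppf(0.95, 3, N - 4))   # 3.4903, 3.0984, 2.9467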

12.7. Assuming the t test (equivalently, the ANOVA F test) establishes that the means are different, contrasts and multiple comparisons provide no further useful information. (With two means, there is only one comparison to make, and it has already been made by the t test.)


12.8. (a) The stated hypothesis is µ50% < ½(µ0% + µ100%), so we use the contrast ψ = ½(µ0% + µ100%) − µ50%, with coefficients 0.5, −1, and 0.5. The hypotheses can then be stated H0: ψ = 0 versus Ha: ψ > 0. (b) The estimated contrast is c = ½(50 + 120) − 75 = 10 cm, with standard error

SEc = sp √(0.25/40 + 1/40 + 0.25/40) ≈ 5.8095

so the test statistic is t = 10/5.8095 ≈ 1.7213 with df = 117. The one-sided P-value is P ≈ 0.0439, so this is significant at α = 0.05, but not at α = 0.01.
Note: We wrote the contrast so that it would be positive when Ha is true (in keeping with the text’s advice). We could also test this hypothesis using the contrast ψ′ = µ50% − ½(µ0% + µ100%), or even ψ″ = µ0% + µ100% − 2µ50%. The resulting t statistic is the same (except possibly in sign) regardless of the way the contrast is stated.
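
A sketch of part (b)'s computation in Python; the pooled standard deviation sp = 30 is our inference from the standard error above, not a number stated in this solution:

import numpy as np
from scipy import stats

means = np.array([50.0, 75.0, 120.0])      # 0%, 50%, 100% groups
coeff = np.array([0.5, -1.0, 0.5])
n = np.array([40, 40, 40])
sp = 30.0                                   # assumed; gives SEc = 5.8095
c = coeff @ means                           # estimated contrast: 10
se = sp * np.sqrt(np.sum(coeff**2 / n))
t = c / se                                  # 1.7213
print(t, stats.t.sf(t, n.sum() - len(n)))   # one-sided P = 0.0439, df = 117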

12.9. (a) With I = 5 groups and N = 35, we have df I − 1 = 4 and N − I = 30. In Table E, we see that 2.69 < F < 3.25. (b) The sketch (not reproduced here) shows the observed F value, F = 2.83, together with the critical values from Table E: 2.69 (p = 0.05) and 3.25 (p = 0.025). (c) 0.025 < P < 0.050 (software gives 0.0420). (d) The alternative hypothesis states that at least one mean is different, not that all means are different.

12.10. Compare each F statistic to an F(I − 1, N − I ) distribution:

F      I   N   DFG  DFE  Critical values    P-value (Table E)    P-value (software)
(a) 2.69 7 35 6 28 2.45 < F < 2.90 0.025 < P < 0.050 0.0344
(b) 2.43 5 55 4 50 2.06 < F < 2.56 0.050 < P < 0.100 0.0597
(c) 3.06 6 34 5 28 F = 3.06 P = 0.025 0.0251
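
The software column can be reproduced with the upper-tail F probability; for example:

from scipy import stats

for F, I, N in [(2.69, 7, 35), (2.43, 5, 55), (3.06, 6, 34)]:
    print(stats.f.sf(F, I - 1, N - I))   # 0.0344, 0.0597, 0.0251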

12.11. (a) I = 3 and N = 33, so the degrees of freedom are 2 and 30. F = 127/50 = 2.54. Comparing to the F(2, 30) distribution in Table E, we find 2.49 < F < 3.32, so 0.050 < P < 0.100. (Software gives P ≈ 0.0957.) (b) I = 4 and N = 32, so the degrees of freedom are 3 and 28. F = (58/3)/(182/28) ≈ 2.9744. Comparing to the F(3, 28) distribution in Table E, we find 2.95 < F < 3.63, so 0.025 < P < 0.050. (Software gives P ≈ 0.0486.)
[Sketches of the two F distributions, with F = 2.54 and F = 2.97 marked, are not reproduced here.]

12.12. (a) Yes: The guidelines for pooling standard deviations say that the ratio of largest to smallest should be less than 2; we have 42/28 = 1.5. (b) Squaring the three standard deviations gives s1² = 1369, s2² = 784, and s3² = 1764. (c) sp² = (28s1² + 31s2² + 120s3²)/(28 + 31 + 120) ≈ 1532.49. (d) sp = √(sp²) ≈ 39.15. (e) Because the third sample was nearly twice as large as the other two put together, the pooled standard deviation is closest to s3.

12.13. (a) Response: egg cholesterol level. Populations: chickens with different diets or drugs. I = 3, n1 = n2 = n3 = 25, N = 75. (b) Response: rating on five-point scale. Populations: the three groups of students. I = 3, n1 = 31, n2 = 18, n3 = 45, N = 94. (c) Response: quiz score. Populations: students in each TA group. I = 3, n1 = n2 = n3 = 14, N = 42.

12.14. (a) Response: time to complete VR path. Populations: children using different navigation methods. I = 4, ni = 10 (i = 1, 2, 3, 4), N = 40. (b) Response: calcium content of bone. Populations: chicks eating diets with differing pesticide levels. I = 5, ni = 13 (i = 1, 2, 3, 4, 5), N = 65. (c) Response: total sales between 11:00 a.m. and 2:00 p.m. Populations: customers responding to one of four sample offers. I = 4, ni = 5 (i = 1, 2, 3, 4) and N = 20.

12.15. For all three situations, the hypotheses are H0 : µ1 = µ2 = µ3 versus Ha : at least one
mean is different. The degrees of freedom are DFG = DFM = I − 1 (“model” or “between
groups”), DFE = DFW = N − I (“error” or “within groups”), and DFT = N − 1 (“total”).
The degrees of freedom for the F test are DFG and DFE.

Situation I N DFG DFE DFT df for F statistic


Egg cholesterol level 3 75 2 72 74 F(2, 72)
Student opinions 3 94 2 91 93 F(2, 91)
Teaching assistants 3 42 2 39 41 F(2, 39)

12.16. For all three situations, the hypotheses are H0 : µ1 = µ2 = · · · = µ I versus Ha : at


least one mean is different. The degrees of freedom are DFG = DFM = I − 1 (“model” or
“between groups”), DFE = DFW = N − I (“error” or “within groups”), and DFT = N − 1
(“total”). The degrees of freedom for the F test are DFG and DFE.

Situation I N DFG DFE DFT df for F statistic


VR navigation methods 4 40 3 36 39 F(3, 36)
Effect of pesticide on birds 5 65 4 60 64 F(4, 60)
Effect of free food on sales 4 20 3 16 19 F(3, 16)

12.17. (a) This sounds like a fairly well-designed experiment, so the results should at least
apply to this farmer’s breed of chicken. (b) It would be good to know what proportion of
the total student body falls in each of these groups—that is, is anyone overrepresented in
this sample? (c) How well a TA teaches one topic (power calculations) might not reflect that
TA’s overall effectiveness.

12.18. (a) This sounds like a fairly well-designed experiment, assuming the subjects come
from a group which is representative of the population. (We assume that this teaching tool is
intended for use with children and that the children used in the experiment were themselves
deaf.) (b) This should at least give information about pesticide effect on bone calcium in
chicks. It might not apply to adult chickens, or other species of birds. (c) The results might
extend to similar sandwich shops, and perhaps to other times of day, or to weekend sales.

12.19. (a) With I = 3 and N = 120, we have df 2 and 117. (b) To use Table E, we compare
to df 2 and 100; with F > 5.02, we conclude that P < 0.001. Software gives P = 0.0003.
(c) Haggling and bargaining behavior is probably linked to the local culture, so we should
hesitate to generalize these results beyond similar informal shops in Mexico.

12.20. (a) P-values close to 0.01 occur when F is close to 5.483. (This is the value for df 2 and 27; this applet seems to have three samples with 10 observations each.) How close students can get to this depends on how much they play around with the applet, and the pooled standard error setting. (b) As variation increases, F decreases and P increases.

12.21. (a) F can be made very small (close to 0), and P close to 1. (b) F increases, and
P decreases. Moving the means farther apart means that (even with moderate spread) it is
easier to see that the three groups represent three different populations (that is, populations
having different means). Therefore, the evidence against H0 becomes stronger.

12.22. We have I = 4 groups with N = 620. With the given group means, the overall mean is
   x̄ = (130 · 2.93 + 248 · 3.00 + 174 · 3.01 + 68 · 3.39)/N ≈ 3.0309
(a) DFG = I − 1 = 3 and DFE = N − I = 616. (b) The groups sum of squares is
   SSG = 130(2.93 − x̄)² + 248(3.00 − x̄)² + 174(3.01 − x̄)² + 68(3.39 − x̄)² ≈ 10.4051
(c) F = MSG/MSE = (10.4051/3)/(797.25/616) ≈ 2.68. (d) Software gives P ≈ 0.0461, so we have enough
evidence to reject H0 at the 5% significance level. (e) The mean for the “other” group
appears to be higher than the means of the first three groups (which are similar).
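Note: The arithmetic above can be reproduced with a minimal Python sketch (not part of the original solution), starting from the group sizes, group means, and the given SSE = 797.25:

Python sketch:
import numpy as np
from scipy import stats

n    = np.array([130, 248, 174, 68])        # group sizes
xbar = np.array([2.93, 3.00, 3.01, 3.39])   # group means
SSE  = 797.25                               # error sum of squares (given)

N, I  = n.sum(), len(n)
grand = (n * xbar).sum() / N                # overall mean, about 3.0309
SSG   = (n * (xbar - grand) ** 2).sum()     # about 10.4051
F     = (SSG / (I - 1)) / (SSE / (N - I))   # about 2.68
P     = stats.f.sf(F, I - 1, N - I)         # about 0.046
print(grand, SSG, F, P)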

12.23. (a) Based on the sample means, fiber is cheapest and cable is most expensive. (Note
that the providers are shown in this plot in the order given in the table, but they can be
rearranged in any order.) (b) Yes; the largest-to-smallest standard deviation ratio is
40.39/26.09 ≈ 1.55. (c) The degrees of freedom are I − 1 = 2 and N − I = 44. From Table E (with
df 2 and 40), we have 0.025 < P < 0.050; software gives P = 0.0427. The difference in
means is (barely) significant at the 5% level.
[Means plot: average price by provider (DSL, Cable, Fiber).]

12.24. For the contrast ψ = ½(µD + µC) − µF, we test H0: ψ = 0 versus Ha: ψ > 0. The
pooled estimate of the standard deviation is
   sp = √[(18s1² + 19s2² + 7s3²)/(18 + 19 + 7)] ≈ 33.8170
The estimated contrast is c = ½(104.49 + 119.98) − 83.87 = $28.365, with standard error
   SEc = sp √(0.25/19 + 0.25/20 + 1/8) ≈ 13.1259,
so the test statistic is t = c/SEc ≈ 2.161 with df = 44.
The one-sided P-value is P ≈ 0.0181, so this is significant at α = 0.05, but not at α = 0.01.
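Note: A minimal Python sketch of the same contrast computation (not part of the original solution), starting from the three sample means, the sample sizes, and the pooled sp computed above:

Python sketch:
import numpy as np
from scipy import stats

xbar = np.array([104.49, 119.98, 83.87])  # DSL, cable, fiber means
n    = np.array([19, 20, 8])
w    = np.array([0.5, 0.5, -1.0])         # weights for psi = (muD + muC)/2 - muF
sp   = 33.8170                            # pooled sd computed above

c   = (w * xbar).sum()                    # about 28.365
SEc = sp * np.sqrt((w ** 2 / n).sum())    # about 13.126
t   = c / SEc                             # about 2.161
print(t, stats.t.sf(t, df=44))            # one-sided P, about 0.018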

12.25. (a) Use matched pairs t methods; we examine the change in reaction time for each
subject. (b) No: We cannot use ANOVA methods because we do not have four independent
samples. (The same group of subjects performed each of the four tasks.)

12.26. We have x̄1 = 61.62, s1 = 13.75, n1 = 71, x̄2 = 46.47, s2 = 7.94, and n2 = 37. For the
pooled t procedure, we find sp ≈ 12.09 and t = 6.18 (df = 106, P < 0.0001). The Minitab
output below shows that F = 38.17 (t², up to rounding error).

Minitab output: ANOVA table


Source DF SS MS F p
Factor 1 5583 5583 38.17 0.000
Error 106 15504 146
Total 107 21087
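Note: The equivalence t² = F is easy to verify from the summary statistics; here is a minimal Python sketch (not part of the original solution):

Python sketch:
import math

x1, s1, n1 = 61.62, 13.75, 71
x2, s2, n2 = 46.47, 7.94, 37

sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # = MSE, about 146
t = (x1 - x2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))           # about 6.18
print(t, t**2)    # t squared is about 38.2, the ANOVA F statistic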

12.27. (a) With I = 4 and N = 2290, the degrees of freedom are DFG = I − 1 = 3 and
DFE = N − I = 2286. (b) MSE = sp² = 4.6656, so F = MSG/MSE = 11.806/4.6656 ≈ 2.5304. (c) The
F(3, 1000) entry in Table E gives 0.05 < P < 0.10; software gives P ≈ 0.0555.

12.28. (a) The plot of means suggests that spending is higher for classical music, while pop
and no music appear to have the same effect. (b) Yes: The guidelines for pooling standard
deviations say that the ratio of largest to smallest should be less than 2; we have
3.332/2.243 ≈ 1.49 < 2. (c) The degrees of freedom are DFG = I − 1 = 2 and
DFE = N − I = 138. Comparing to an F(2, 100) distribution in Table E, we see that P < 0.001;
software gives P ≈ 0.00005. We have strong evidence that the means are not all the same.
(d) The higher average bill for classical music led to this conclusion; the difference between
pop music and no background music is negligible. (e) The setting of this experiment (“a single
high-end restaurant in England”) might limit how much this conclusion can be generalized. It
might extend to other high-end restaurants, but perhaps not to “family-style” restaurants, and
almost certainly not to fast-food restaurants.
[Means plot: mean bill (pounds) for classical, pop, and no background music.]

12.29. (a) The plot suggests that both drugs cause an increase in activity level, and
Drug B appears to have a greater effect. (b) Yes: The guidelines for pooling standard
deviations say that the ratio of largest to smallest should be less than 2; we have
√(14.00/6.75) ≈ 1.44 < 2. The pooled variance is
   sp² = 3(s1² + s2² + ··· + s5²)/(3 + 3 + 3 + 3 + 3) = 159/15 = 10.6
and sp = √10.6 ≈ 3.2558. (c) The degrees of freedom are DFG = I − 1 = 4 and DFE = N − I = 15.
(d) Comparing to an F(4, 15) distribution in Table E, we see that 3.80 < F < 4.89, so
0.010 < P < 0.025; software gives P ≈ 0.0165. We have significant evidence that the means
are not all the same.
[Means plot: activity level for placebo, low and high doses of Drug A, and low and high doses of Drug B.]

12.30. (a) It is useful to connect the points on the plot, to make the pattern (or lack thereof)
more evident. There is some suggestion that average grade decreases as the number of
accommodations increases. (b) Having too many decimal points is distracting; in this
situation, no useful information is gained by having more than one or two digits after
the decimal point. For example, the first mean and standard deviation would be more
effectively presented as 2.79 and 0.85. (c) The largest-to-smallest SD ratio is slightly over 2
(about 2.009), so pooling is not advisable. (If we pool in spite of this, we find sp ≈ 0.8589.)
(d) Eliminating data points (without a legitimate reason) is always risky, although we could
run the analysis with and without them. Combining the last three groups would be a bad
idea if the data suggested that grades rebounded after 2 accommodations (i.e., if the average
grades were higher for 3 and 4 accommodations), but as that is not the case, lumping 2, 3,
and 4 accommodations seems reasonable. (e) ANOVA is not appropriate for these data,
chiefly because we do not have 245 independent observations. (f) There may be a number
of local factors (for example, student demographics or teachers’ attitudes toward
accommodations) which affected grades; these effects might not be the same elsewhere.
(g) One weakness is that we do not have a control group for comparison; that is, we cannot
tell what grades these students (or a similar group) would have had without accommodations.
[Plot: mean grade against number of accommodations (0 to 4).]

12.31. (a) The variation in sample size is some cause for concern, but there can be no
extreme outliers in a 1-to-7 scale, so ANOVA is probably reliable. (b) Pooling is
reasonable: 1.26/1.03 ≈ 1.22 < 2. (c) With I = 5 groups and total sample size N = 410,
we use an F(4, 405) distribution. We can compare F = 5.69 to an F(4, 200) distribution
in Table E and conclude that P < 0.001, or with software determine
that P ≈ 0.0002. (d) Hispanic Americans have the highest emotion scores, Japanese are in
the middle, and the other three cultures are the lowest (and very similar).

12.32. (a) The largest-to-smallest SD ratios are 2.84, 1.23, and 1.14, so the text’s guidelines are
satisfied for intensity and recall, but not for frequency. (b) As in the previous exercise, I = 5
and N = 410, so we use an F(4, 405) distribution. From the F(4, 200) distribution in Table E,
we can conclude that P < 0.001 for all three response variables. With software, we find that
the P-values are much smaller; all are less than 0.00002. We conclude that, for each variable, we
have strong evidence that some group mean is different. (This conclusion is cautious in the case
of frequency because of our concern about the standard deviations.) (c) The table below shows
one way of summarizing the means. For each variable, it attempts to identify low (underlined),
medium, and high (boldface) values of that variable. Hispanic Americans were higher than other
groups for all four variables. Asian Americans were low for all variables (the lowest in all
but global score). Japanese were low on all but global score, while European Americans and
Indians were in the middle for all but global score. (d) The results might not generalize to,
for example, subjects who are from different parts of their countries or not in a college or
university community. (e) Create a two-way table with counts of men and women in each
cultural group. The Minitab output below gives X² = 11.353, df = 4, and P = 0.023,
so we have evidence (significant at α = 0.05) that the gender mix was not the same for all
cultures. Specifically, Hispanic Americans and European Americans had higher percentages
of women, which might further affect how much we can generalize the results.

Minitab output: Chi-square test
            Women     Men  Total
    1          38       8     46
            31.64   14.36
    2          22      11     33
            22.70   10.30
    3          57      34     91
            62.59   28.41
    4         102      58    160
           110.05   49.95
    5          63      17     80
            55.02   24.98
Total         282     128    410
ChiSq = 1.279 + 2.817 +
        0.021 + 0.047 +
        0.499 + 1.100 +
        0.589 + 1.297 +
        1.156 + 2.547 = 11.353
df = 4, p = 0.023

Score Frequency Intensity Recall


European Amer. 4.39 82.87 2.79 49.12
Asian Amer. 4.35 72.68 2.37 39.77
Japanese 4.72 73.36 2.53 43.98
Indian 4.34 82.71 2.87 49.86
Hispanic Amer. 5.04 92.25 3.21 59.99
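Note: Part (e) can be reproduced with scipy’s chi-square test for a two-way table; here is a minimal Python sketch (not part of the original solution) using the counts from the Minitab output:

Python sketch:
import numpy as np
from scipy.stats import chi2_contingency

# women, men counts for the five cultural groups (from the output above)
counts = np.array([[38, 8], [22, 11], [57, 34], [102, 58], [63, 17]])
chi2, p, dof, expected = chi2_contingency(counts)
print(chi2, dof, p)    # about 11.35, 4, 0.023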

12.33. Because the descriptions of these contrasts do not specify an expected direction for the
comparison, the subtraction could be done either way (in the order shown, or in the opposite
order). (a) ψ1 = µ2 − ½(µ1 + µ4). (b) ψ2 = ⅓(µ1 + µ2 + µ4) − µ3.

12.34. Neither the descriptions in Exercise 12.33 nor the background information in
Example 12.25 seem to give any indication of an expected direction for the contrasts, so
we have given two-sided alternatives. If students give a one-sided alternative, they should
explain why they did so. (a) With ψ1 = µ2 − ½(µ1 + µ4) and ψ2 = ⅓(µ1 + µ2 + µ4) − µ3,
we test H0: ψi = 0 versus Ha: ψi ≠ 0 (for i = 1 or 2). (b) The estimated contrasts are
c1 ≈ 0.195 and c2 ≈ 0.48. (c) The pooled estimate of the standard deviation sp is either
1.6771 or 1.6802 (see the note at the end of this solution), so SEc1 ≈ 0.3093 or 0.3098,
and SEc2 ≈ 0.2929 or 0.2934. (d) Neither contrast is significantly different from 0 (with a
two-sided alternative). For comparing brown eyes to the other colors, t1 ≈ 0.630 or 0.629,
with df = 218, for which P ≈ 0.5290 or 0.5298. For gaze up versus gaze down, t2 ≈ 1.639
or 1.636, with df = 218, for which P ≈ 0.1026 or 0.1033. (e) The confidence intervals are
ci ± t* SEci, where t* = 1.984 (Table D) or 1.971 (software). This gives roughly −0.41 to
0.80 for ψ1 and −0.10 to 1.06 for ψ2.
Note: The simplest way to find the pooled standard deviation sp is to use the value
1.6771 reported by SAS and Minitab in Figure 12.11 (or take √MSE from the Excel output).
Some students might compute it by hand from the numbers given in Example 12.25, which
gives 1.6802. The difference is due to rounding; note that the reported standard deviation
for brown eyes should be 1.72 rather than 1.73. In the end, our conclusions are the same
either way.

12.35. See the solution to Exercise 1.87 for stemplots. The means, standard deviations, and
standard errors (all in millimeters) are given below. We reject H0 and conclude that at least
one mean is different (F = 259.12, df 2 and 51, P < 0.0005).

Variety   n     x        s       sx
bihai    16  47.5975  1.2129  0.3032
red      23  39.7113  1.7988  0.3751
yellow   15  36.1800  0.9753  0.2518

Minitab output: Analysis of Variance on length
Source   DF       SS      MS       F      p
Factor    2  1082.87  541.44  259.12  0.000
Error    51   106.57    2.09
Total    53  1189.44

[Means plot: flower length (mm) by Heliconia variety.]

12.36. (a) Summary statistics and a plot of means are shown below; side-by-side stemplots
and stemplots of the residuals follow the Minitab output. Students might also use
five-number summaries to describe the data, but with small samples and relatively unskewed
distributions, they give us little additional information. (b) ANOVA gives F = 5.63 (df 2 and 49)
and P ≈ 0.0063—strong evidence of a difference in means. (c) Because preference ratings are
whole numbers, the underlying distributions cannot be Normal, but apart from that, the
stemplots and summary statistics show no particular causes for concern. The stemplots of the
residuals show the expected granularity (due to the ratings being whole numbers). With such
small samples, it is difficult to make any further judgments about Normality. (d) The three
test statistics are
   t12 = (2.95 − 4.00)/(sp √(1/20 + 1/22)) ≈ −3.18
   t13 = (2.95 − 3.10)/(sp √(1/20 + 1/10)) ≈ −0.36
   t23 = (4.00 − 3.10)/(sp √(1/22 + 1/10)) ≈ 2.21
Results will vary with the method used, and the overall significance level. Using the Bonferroni
method with α = 0.05 (and three comparisons), we have t** ≈ 2.479, so only groups 1
and 2 are significantly different.

Group   n     x      s
1      20  2.95  0.945
2      22  4.00  0.926
3      10  3.10  1.524

[Means plot: average preference rating by group.]

Minitab output: Analysis of Variance on Preference


Source DF SS MS F p
Factor 2 12.84 6.42 5.63 0.006
Error 49 55.85 1.14
Total 51 68.69
Group 1 Group 2 Group 3
1 00 1 1 0
2 00 2 00 2 000
3 000000000000 3 000 3 000
4 000 4 0000000000 4 0
5 0 5 0000000 5 0
6 6 6 0
Group 1 residuals Group 2 residuals Group 3 residuals All residuals
−2 −2 00 −2 1 −2 100
−1 99 −1 000 −1 111 −1 99111000
−0 99 −0 00000 −0 111 −0 9911100000
0 000000000000 0 00000 0 9 0 000000000000000009
1 000 1 0000000 1 9 1 00000000009
2 0 2 2 9 2 09
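Note: The Bonferroni computations in part (d) can be checked with a minimal Python sketch (not part of the original solution), using the summary statistics and the MSE from the output above:

Python sketch:
import math
from itertools import combinations
from scipy import stats

n    = [20, 22, 10]
xbar = [2.95, 4.00, 3.10]
sp   = math.sqrt(55.85 / 49)                     # sqrt(MSE), about 1.07

tstar = stats.t.ppf(1 - 0.05 / (2 * 3), df=49)   # Bonferroni, about 2.479
for i, j in combinations(range(3), 2):
    t = (xbar[i] - xbar[j]) / (sp * math.sqrt(1 / n[i] + 1 / n[j]))
    print(i + 1, j + 1, round(t, 2), abs(t) > tstar)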

12.37. Stemplots of the log flower lengths are shown below; means, standard deviations, and
standard errors follow. We reject H0 and conclude that at least one mean is different
(F = 244.27, df 2 and 51, P < 0.0005). These results are essentially the same as in
Exercise 12.35.
Note: All of the numbers in these samples are between 34 and 50; for that range of input values,
the log function closely resembles a line. Because a linear transformation has no effect on
skewness, the effect of this transformation is minimal, as can be confirmed by comparing these
stemplots to those in the solution to Exercise 1.87, and this means plot with the one in the
solution to Exercise 12.35.

bihai          red          yellow
38 33          36 233333    35 44
38 44444555    36 4445      35 667
38 7777        36 667       35 8889
38             36 8         36 00011
39 11          37 00        36 4
               37 23333
               37 4
               37 6

Variety   n     x        s        sx
bihai    16  3.8625  0.02515  0.006286
red      23  3.6807  0.04496  0.009374
yellow   15  3.5882  0.02698  0.006966

Minitab output: Analysis of Variance on log-length
Source   DF       SS       MS       F      p
Factor    2  0.61438  0.30719  244.27  0.000
Error    51  0.06414  0.00126
Total    53  0.67852

[Means plot: log(flower length) by Heliconia variety.]

12.38. (a) Statistics and plots are below. (b) The standard deviations satisfy the text’s
guidelines for pooling. One concern is that all three distributions are slightly left-skewed and
the youngest nonfiction death is an outlier. (c) ANOVA gives F = 6.56 (df 2 and 120) and
P = 0.002, so we conclude that at least one mean is different. (d) The appropriate contrast
is ψ1 = ½(µnov + µnf) − µp. (This is defined so that ψ1 > 0 if poets die younger. This
is not absolutely necessary but is in keeping with the text’s advice.) The null hypothesis is
H0: ψ1 = 0; the Yeats quote hardly seems like an adequate reason to choose a one-sided
alternative, but students may have other opinions. For the test, we compute c ≈ 10.9739,
SEc ≈ 3.0808, and t ≈ 3.56 with df = 120. The P-value is very small regardless of whether
Ha is one- or two-sided, so we conclude that the contrast is positive (and poets die young).
(e) For this comparison, the contrast is ψ2 = µnov − µnf, and the hypotheses are H0: ψ2 = 0
versus Ha: ψ2 ≠ 0. (Because the alternative is two-sided, the subtraction in this contrast can
go either way.) For the test, we compute c ≈ −5.4272, SEc ≈ 3.4397, and t ≈ −1.58 with
df = 120. This gives P ≈ 0.1172; the difference between novelists and nonfiction writers is
not significant. (f) With three comparisons and df = 120, the Bonferroni critical value is
t** ≈ 2.4280. The pooled standard deviation is sp ≈ 14.4592, so the differences, standard
errors, and t values are:
   x̄nov − x̄p = 8.2603,    SEnov−p = sp √(1/67 + 1/32) ≈ 3.1071,    t ≈ 2.66
   x̄nov − x̄nf = −5.4272,  SEnov−nf = sp √(1/67 + 1/24) ≈ 3.4397,   t ≈ −1.58
   x̄p − x̄nf = −13.6875,   SEp−nf = sp √(1/32 + 1/24) ≈ 3.9044,     t ≈ −3.51
The first and last differences are greater (in absolute value) than t**, so those differences
are significant. The second difference is the same one tested in the contrast of part (e); the
standard error and the conclusion are the same.

            n      x        s       sx
Novels     67  71.4478  13.0515  1.5945
Poems      32  63.1875  17.2971  3.0577
Nonfiction 24  76.8750  14.0969  2.8775

Minitab output: Analysis of Variance on age at death
Source   DF     SS    MS     F      p
Writer    2   2744  1372  6.56  0.002
Error   120  25088   209
Total   122  27832

[Means plot: average age at death (years) by type of writing.]

12.39. (a) The means, standard deviations, and standard errors are given below (all in
grams per cm²). (b) All three distributions appear to be reasonably close to Normal, and
the standard deviations are suitable for pooling. (c) ANOVA gives F = 7.72 (df 2
and 42) and P = 0.001, so we conclude that the means are not all the same. (d) With
df = 42, 3 comparisons, and α = 0.05, the Bonferroni critical value is t** = 2.4937.
The pooled standard deviation is sp ≈ 0.01437 and the standard error of each difference is
SE_D = sp √(1/15 + 1/15) ≈ 0.005246, so two means are significantly different if they differ
by t** SE_D ≈ 0.01308. The high-dose mean is significantly different from the other two.
(e) Briefly: High doses of kudzu isoflavones increase BMD.

            n     x        s        sx
Control    15  0.2189  0.01159  0.002992
Low dose   15  0.2159  0.01151  0.002972
High dose  15  0.2351  0.01877  0.004847

Minitab output: Analysis of Variance on BMD
Source  DF        SS        MS     F      p
Factor   2  0.003186  0.001593  7.72  0.001
Error   42  0.008668  0.000206
Total   44  0.011853

[Means plot: BMD (g/cm²) by treatment (control, low dose, high dose).]
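Note: The “minimum significant difference” in part (d) can be computed directly; here is a minimal Python sketch (not part of the original solution) using the MSE from the ANOVA table:

Python sketch:
import math
from scipy import stats

MSE, dfe = 0.000206, 42                        # from the ANOVA table
sp = math.sqrt(MSE)                            # about 0.01435
SE_D = sp * math.sqrt(1 / 15 + 1 / 15)         # about 0.00524
tstar = stats.t.ppf(1 - 0.05 / (2 * 3), dfe)   # Bonferroni, about 2.494
print(tstar * SE_D)                            # MSD, about 0.0131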

12.40. (a) With I = 5 and N = 315, the F tests have df 4 and 310. (b) The response variable
does not need to be Normally distributed; rather, the deviations from the mean within each
group should be Normal. (c) By comparing each F statistic to 2.401—the 5% critical
value for an F(4, 310) distribution—we see that the means are significantly different for
the first three questions. (We can also compute the P-value for each F statistic to reach
this conclusion.) (d) A possible plot is shown below; this could also be split into three
separate plots. For the first question, the leisure group appears to have the most interest in
experiencing Hawaiian culture. For the second question, the sports and leisure groups had
a lower preference for a group tour, while fraternal associations had a higher preference.
For the third question, the leisure group is most interested in ocean sports, and the business
group is least interested.

5.5
Culture
(1=strongly disagree,

5
7=strongly agree)
Mean response

4.5 Group tour


4
3.5 Ocean sports
3
2.5
2
Honeymoon Frat. Sports Leisure Business
assoc.
Group

12.41. (a) The mean responses were not significantly different for the last question. (b) Taking
the square roots of the given values of MSE gives the values of sp. For the Bonferroni
method with α = 0.05, df = 310, and 10 comparisons, t** = 2.827. Only the largest
difference within each set of means is significant:
   t14 = (3.97 − 5.33)/(1.8058 √(1/34 + 1/26)) ≈ −2.891   experience culture—honeymoon and leisure groups
   t23 = (3.38 − 2.39)/(1.6855 √(1/56 + 1/105)) ≈ 3.550   group tour—fraternal association/sports groups
   t45 = (5.33 − 4.02)/(2.0700 √(1/26 + 1/94)) ≈ 2.856    ocean sports—leisure/business groups

12.42. (a) Summary statistics are shown below. (b) We test H0: µ1 = ··· = µ4 versus Ha: not
all µi are equal. ANOVA gives F = 9.24 with df 3 and 74, for which P < 0.0005, so we
reject the null hypothesis. The type of lesson does affect the mean score change; in
particular, it appears that students who take piano lessons had significantly higher scores
than the other students.

Lesson     n      x       s      sx
Piano     34   3.6176  3.0552  0.5240
Singing   10  −0.3000  1.4944  0.4726
Computer  20   0.4500  2.2118  0.4946
None      14   0.7857  3.1908  0.8528

Minitab output: Analysis of Variance on scores
Source  DF      SS     MS     F      p
Lesson   3  207.28  69.09  9.24  0.000
Error   74  553.44   7.48
Total   77  760.72

12.43. We have six comparisons to make, and df = 74, so the Bonferroni critical value
with α = 0.05 is t** = 2.7111. The pooled standard deviation is sp ≈ 2.7348. The table
below shows the differences, their standard errors, and the t statistics.
The Piano mean is significantly higher than the other three, but the other three
means are not significantly different.

D_PS = 3.91765     D_PC = 3.16765     D_PN = 2.83193
SE_PS = 0.98380    SE_PC = 0.77066    SE_PN = 0.86843
t_PS = 3.982       t_PC = 4.110       t_PN = 3.261
D_SC = −0.75000    D_SN = −1.08571
SE_SC = 1.05917    SE_SN = 1.13230
t_SC = −0.708      t_SN = −0.959
D_CN = −0.33571
SE_CN = 0.95297
t_CN = −0.352

[Means plot: mean score change by lesson type (piano, singing, computer, none).]

12.44. We test the hypothesis H0: ψ = µP − ⅓(µS + µC + µN) = 0; the sample contrast
is c = 3.618 − ⅓(−0.300 + 0.450 + 0.786) ≈ 3.306. The pooled standard deviation
estimate is sp ≈ 2.735, so SEc = 2.735 √(1/34 + (1/9)/10 + (1/9)/20 + (1/9)/14) ≈ 0.6356. Then
t = 3.306/0.6356 ≈ 5.20, with df = 74. This is enough evidence (P < 0.001) to reject
H0 in favor of Ha: ψ > 0, so we conclude that mean score changes for piano students are
greater than the average of the means for the other three groups.
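Note: A minimal Python sketch of this contrast test (not part of the original solution), using the group means and sizes from Exercise 12.42 and sp ≈ 2.735:

Python sketch:
import numpy as np
from scipy import stats

xbar = np.array([3.618, -0.300, 0.450, 0.786])  # piano, singing, computer, none
n    = np.array([34, 10, 20, 14])
w    = np.array([1, -1/3, -1/3, -1/3])          # psi = muP - (muS + muC + muN)/3
sp   = 2.735

c   = (w * xbar).sum()                    # about 3.306
SEc = sp * np.sqrt((w ** 2 / n).sum())    # about 0.636
t   = c / SEc                             # about 5.20
print(t, stats.t.sf(t, df=74))            # one-sided P < 0.001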

12.45. (a) Pooling is reasonable: The ratio is 0.824/0.657 ≈ 1.25. For the pooled standard
deviation, we compute
   sp² = (488s1² + 68s2² + 211s3²)/(488 + 68 + 211) ≈ 0.5902
so sp = √0.5902 ≈ 0.7683. (b) Comparing F = 17.66 to an F(2, 767) distribution, we find
P < 0.001. Sketches of this distribution will vary; in the sketch, the three marked points are
the 10%, 5%, and 1% critical values (2.31, 3.01, and 4.63), so we can see that the observed
value lies well above the bulk of this distribution. (c) For the contrast ψ = µ2 − ½(µ1 + µ3),
we test H0: ψ = 0 versus Ha: ψ > 0. We find c ≈ 0.585 with SEc ≈ 0.0977, so
t = c/SEc ≈ 5.99 with df = 767, and P < 0.0001.

12.46. (a) The linear-trend coefficients fall on a line (see the plot note below). If
µ1 = µ2 = ··· = µ5, then the linear-trend contrast ψ1 = −2µ1 − 1µ2 + 0µ3 + 1µ4 + 2µ5 = 0.
(b) The quadratic-trend coefficients fall on a parabola. If all µi are equal, then
ψ2 = 2µ1 − 1µ2 − 2µ3 − 1µ4 + 2µ5 = 0. If µi = 5i, then
   ψ2 = 2 · 5 − 1 · 10 − 2 · 15 − 1 · 20 + 2 · 25 = 0
(c) The sample contrasts are c2 ≈ −3.36 and c3 ≈ 1.12. (d) The standard errors are
SEc2 ≈ 0.8306 and SEc3 ≈ 0.6425, so the test statistics are t2 ≈ −3.364 and t3 ≈ 1.740.
With df = 129, the P-values are 0.001 and 0.0842. Combined with the linear-trend result
from Example 12.20 (t = −0.18, P = 0.861), we see that we have significant evidence for a
quadratic trend, but not for a linear or cubic trend.
[Plots: linear contrast coefficients (left) and quadratic contrast coefficients (right), plotted against i.]

12.47. (a) Pooling is reasonable, as the largest-to-smallest ratio is about 1.65. (b) ANOVA
gives F = 7.98 (df 2 and 27), for which P = 0.002. We reject H0 and conclude that not all
means are equal.

            n      x       s
Control    10  601.1  27.364
Low jump   10  612.5  19.329
High jump  10  638.7  16.594
Minitab output: Analysis of Variance on density
Source DF SS MS F p
Treatment 2 7434 3717 7.98 0.002
Error 27 12580 466
Total 29 20013

12.48. (a) The residuals appear to be reasonably Normal. (b) With df = 27, three comparisons,
and α = 0.05, the Bonferroni critical value is t** = 2.5525. The pooled standard deviation is
sp ≈ 21.5849, and the standard error of each difference is SE_D = sp √(1/10 + 1/10) ≈ 9.6531,
so two means are significantly different if they differ by t** SE_D ≈ 24.6390. The high-jump
mean is significantly different from the other two.

Stemplot of the residuals:
−4 7
−3 2
−2 4
−1 8666322
−0 887751
 0 1449
 1 112899
 2 25
 3 5
 4
 5 1

[Plots: Normal quantile plot of the residuals, and means plot of mean bone density (mg/cm³) by treatment (control, low jump, high jump).]

12.49. (a) Pooling is risky because 0.6283/0.2520 ≈ 2.49 > 2. (b) ANOVA gives F = 31.16
(df 2 and 9), for which P < 0.0005. We reject H0 and conclude that not all means are equal.

           n      x       s
Aluminum   4  2.0575  0.2520
Clay       4  2.1775  0.6213
Iron       4  4.6800  0.6283

Minitab output: Analysis of Variance on iron


Source DF SS MS F p
Pot 2 17.539 8.770 31.16 0.000
Error 9 2.533 0.281
Total 11 20.072

12.50. (a) There are no clear violations of Normality, but the number of residuals is so
small that it is difficult to draw any conclusions. (b) With df = 9, three comparisons, and
α = 0.05, the Bonferroni critical value is t** = 2.9333. The pooled standard deviation is
sp ≈ 0.5305, and the standard error of each difference is SE_D = sp √(1/4 + 1/4) ≈ 0.3751, so
two means are significantly different if they differ by t** SE_D ≈ 1.1003. The iron mean is
significantly higher than the other two.

Stemplot of the residuals (split stems):
−0 8
−0 6
−0 4
−0 2
−0 0
 0 00
 0 33
 0 455

[Plots: Normal quantile plot of the residuals, and means plot of mean iron (mg/100 g) by pot type (aluminum, clay, iron).]

12.51. (a) Pooling is risky because 8.66/2.89 ≈ 3 > 2. (b) ANOVA gives F = 137.94
(df 5 and 12), for which P < 0.0005. We reject H0 and conclude that not all means are equal.

       n      x        s
ECM1   3  65.0%   8.6603%
ECM2   3  63.3%   2.8868%
ECM3   3  73.3%   2.8868%
MAT1   3  23.3%   2.8868%
MAT2   3   6.6%   2.8868%
MAT3   3  11.6%   2.8868%

Minitab output: Analysis of Variance on Gpi
Source     DF       SS      MS       F      p
Treatment   5  13411.1  2682.2  137.94  0.000
Error      12    233.3    19.4
Total      17  13644.4

12.52. (a) The residuals have one low outlier, and a lot of granularity, so Normality is
difficult to assess. (b) With df = 12, 15 comparisons, and α = 0.05, the Bonferroni
critical value is t** = 3.6489. The pooled standard deviation is sp ≈ 4.4096%, and
the standard error of each difference is SE_D = sp √(1/3 + 1/3) ≈ 3.6004%, so two
means are significantly different if they differ by t** SE_D ≈ 13.1375%. The three
ECM means are significantly higher than the three MAT means. (c) The contrast is
ψ = ⅓(µECM1 + µECM2 + µECM3) − ⅓(µMAT1 + µMAT2 + µMAT3), and the hypotheses are
H0: ψ = 0 versus Ha: ψ ≠ 0. For the test, we compute c ≈ 53.33%, SEc ≈ 2.0787%, and
t ≈ 25.66 with df = 12. This has a tiny P-value; the difference between ECM and MAT is
highly significant. This is consistent with the Bonferroni results from part (b).

Stemplot of the residuals (split stems):
−1 0
−0
−0
−0
−0 333
−0 1111
 0 111111
 0 33
 0 55

[Plots: Normal quantile plot of the residuals, and means plot of mean % Gpi cells by treatment (ECM1–ECM3, MAT1–MAT3).]

12.53. Let µ1 be the placebo mean, µ2 and µ3 be the means for low and high doses of
Drug A, and µ4 and µ5 be the means for low and high doses of Drug B. Recall that sp ≈
3.2558. (a) The first contrast is ψ1 = µ1 − ½(µ2 + µ4); the second is ψ2 = µ3 − µ2 − (µ5 − µ4).
(b) The estimated contrasts are c1 = 14.00 − 0.5(15.25) − 0.5(15.75) = −1.5 and
c2 = (18.25 − 15.25) − (22.50 − 15.75) = −3.75. The respective standard errors are:
   SEc1 = sp √(1/4 + 0.25/4 + 0/4 + 0.25/4 + 0/4) ≈ 1.9937 and
   SEc2 = sp √(0/4 + 1/4 + 1/4 + 1/4 + 1/4) = sp ≈ 3.2558
(c) Neither contrast is significant (t1 ≈ −0.752 and t2 ≈ −1.152, for which the one-sided
P-values are 0.2317 and 0.1337). We do not have enough evidence to conclude that low
doses increase activity level over a placebo, nor can we conclude that activity level changes
due to increased dosage are different between the two drugs.

12.54. (a) Below. (b) To test H0: µ1 = ··· = µ4 versus Ha: not all µi are equal, ANOVA
(Minitab output below) gives F = 967.82 (df 3 and 351), which has P < 0.0005. We
conclude that not all means are equal; specifically, the “Placebo” mean is much higher than
the other three means.

Shampoo    n      x       s     sx
PyrI     112  17.393  1.142  0.108
PyrII    109  17.202  1.352  0.130
Keto     106  16.028  0.931  0.090
Placebo   28  29.393  1.595  0.301

Minitab output: Analysis of Variance on Flaking
Source   DF       SS       MS       F      p
Code      3  4151.43  1383.81  967.82  0.000
Error   351   501.87     1.43
Total   354  4653.30

[Means plot: mean scalp flaking rating by shampoo (PyrI, PyrII, Keto, Placebo).]

12.55. (a) The plot of residuals against case number shows granularity (which varies between
groups), but that should not make us question independence; it is due to the fact that the
scores are all integers. (b) The ratio of the largest to the smallest standard deviations is
1.595/0.931 ≈ 1.714—less than 2. (c) Apart from the granularity, the quantile plots are
reasonably straight. (d) Again, apart from the granularity, the residual quantile plot
looks pretty good.
[Plots: residuals vs. case number; Normal quantile plot of all residuals; and Normal quantile plots of the flaking scores for the PyrI, PyrII, Keto, and Placebo groups.]


12.56. We have six comparisons to make, and df = 351, so the Bonferroni critical value with
α = 0.05 is t** = 2.6533. The pooled standard deviation is sp = √MSE ≈ 1.1958; the
differences, standard errors, and t statistics are below. The only nonsignificant difference is
between the two Pyr treatments (meaning the second application of the shampoo is of little
benefit). The Keto shampoo mean is the lowest; the placebo mean is by far the highest.

DPy1−Py2 = 0.19102 DPy1−K = 1.36456 DPy1−P = −12.0000


SEPy1−Py2 = 0.16088 SEPy1−K = 0.16203 SEPy1−P = 0.25265
tPy1−Py2 = 1.187 tPy1−K = 8.421 tPy1−P = −47.497
DPy2−K = 1.17353 DPy2−P = −12.1910
SEPy2−K = 0.16312 SEPy2−P = 0.25334
tPy2−K = 7.195 tPy2−P = −48.121
DK−P = −13.3646
SEK−P = 0.25407
tK−P = −52.601

12.57. (a) The three contrasts are:
   ψ1 = ⅓µPy1 + ⅓µPy2 + ⅓µK − µP
   ψ2 = ½µPy1 + ½µPy2 − µK
   ψ3 = µPy1 − µPy2
(b) The pooled standard deviation is sp = √MSE ≈ 1.1958. The estimated contrasts and
their standard errors are in the table below. For example:
   SEc1 = sp √((1/9)/112 + (1/9)/109 + (1/9)/106 + 1/28) ≈ 0.2355
(c) We test H0: ψi = 0 versus Ha: ψi ≠ 0 for each contrast. The t- and P-values are
given in the table. The Placebo mean is significantly higher than the average of the other
three, while the Keto mean is significantly lower than the average of the two Pyr means. The
difference between the Pyr means is not significant (meaning the second application of the
shampoo is of little benefit)—this agrees with our conclusion from Exercise 12.56.

   c1 = −12.51       c2 = 1.269       c3 = 0.191
   SEc1 ≈ 0.2355     SEc2 ≈ 0.1413    SEc3 ≈ 0.1609
   t1 = −53.17       t2 = 8.98        t3 = 1.19
   P1 < 0.0005       P2 < 0.0005      P3 = 0.2359

12.58. (a) Below. (b) Each new value (except for n) is simply (old value)/100. (Standard errors
were not computed for Exercise 12.51, but for all groups, we simply divide by √3.) (c) The SS and
MS entries differ from those of Exercise 12.51—by a factor of 0.0001 = (1/100)². However,
everything else is the same: F = 137.94 with df 5 and 12; P < 0.0005, so we (again) reject H0 and
conclude that not all means are equal.

       n      x        s        sx
ECM1   3  0.65   0.08660  0.05
ECM2   3  0.633  0.02887  0.01667
ECM3   3  0.733  0.02887  0.01667
MAT1   3  0.233  0.02887  0.01667
MAT2   3  0.066  0.02887  0.01667
MAT3   3  0.116  0.02887  0.01667
Minitab output: Analysis of Variance on GpiPct
Source DF SS MS F p
Treatment 5 1.34111 0.26822 137.94 0.000
Error 12 0.02333 0.00194
Total 17 1.36444

12.59. Because the measurements in Exercise 12.51 are percents, the instructions to
“add 5% to each response” could be interpreted in two ways:
(1) new response = old response + 5
(2) new response = old response × 1.05
The table below gives summary statistics for both interpretations (all numbers in percents).
For (1), the means increase by 5, but everything else remains the same; the ANOVA
table is identical to the one in the solution to Exercise 12.51. For (2), both the means and
standard deviations are multiplied by 1.05, SS and MS are multiplied by 1.05², but F and P
remain the same (ANOVA table below).

        Version (1)         Version (2)
          x        s          x       s
ECM1    70.0   8.6603      68.25  9.0933
ECM2    68.3   2.8868      66.5   3.0311
ECM3    78.3   2.8868      77     3.0311
MAT1    28.3   2.8868      24.5   3.0311
MAT2    11.6   2.8868       7     3.0311
MAT3    16.6   2.8868      12.25  3.0311
Minitab output: Analysis of Variance on GpiVers2
Source DF SS MS F p
Treatment 5 14785.8 2957.1 137.94 0.000
Error 12 257.3 21.4
Total 17 15043.0
352 Chapter 12 One-Way Analysis of Variance

12.60. There is no effect on the test statistic, df, P-value, and conclusion. The degrees of
freedom are not affected, because the number of groups and sample sizes are unchanged;
meanwhile, the SS and MS values change (by a factor of b²), but this change does not
affect F because the factors of b² cancel out in the ratio F = MSG/MSE. With the same
F- and df values, the P-value and conclusion are necessarily unchanged.
Proof of these statements is not too difficult, but it requires careful use of the SS
formulas. For most students, a demonstration with several choices of a and b would
probably be more convincing than a proof; see the sketch below. However, here is the basic
idea: Using results of Chapter 1, we know that the means undergo the same transformation
as the data (x̄i* = a + b x̄i), while the standard deviations are changed by a factor of |b|. Let x̄ be the
average of all the data; note that x̄* = a + b x̄. Now SSG = Σi ni (x̄i − x̄)², so:
   SSG* = Σi ni (x̄i* − x̄*)² = Σi ni (b x̄i − b x̄)² = Σi ni b² (x̄i − x̄)² = b² SSG
Similarly, we can establish that SSE* = b² SSE and SST* = b² SST. Since the MS values are
merely SS values divided by the (unchanged) degrees of freedom, these also change by a
factor of b².
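Here is one such demonstration, a minimal Python sketch (not part of the original solution) with made-up data and one arbitrary choice of a and b; any other choice gives the same F up to floating-point rounding:

Python sketch:
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(1)
groups = [rng.normal(mu, 2.0, size=10) for mu in (5, 6, 8)]  # made-up data

a, b = 100.0, -3.7    # one arbitrary linear transformation x* = a + b*x
F1, p1 = f_oneway(*groups)
F2, p2 = f_oneway(*(a + b * g for g in groups))
print(F1, F2)         # equal up to floating-point rounding; p1 equals p2 too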

12.61. A table of means and standard deviations is below. Quantile plots are not shown, but
apart from the granularity of the scores and a few possible outliers, there are no marked
deviations from Normality. Pooling is reasonable for both PRE1 and PRE2; the ratios are
1.24 and 1.48.
For both PRE1 and PRE2, we test H0 : µ B = µ D = µ S versus Ha : at least one mean is
different. Both tests have df 2 and 63. For PRE1, F = 1.13 and P = 0.329; for PRE2,
F = 0.11 and P = 0.895. There is no reason to believe that the mean pretest scores differ
between methods.
PRE1 PRE2
Method n x s x s
Basal 22 10.5 2.9721 5.27 2.7634
DRTA 22 9.72 2.6936 5.09 1.9978
Strat 22 9.136 3.3423 4.954 1.8639

Minitab output: Analysis of Variance on PRE1


Source DF SS MS F p
Group 2 20.58 10.29 1.13 0.329
Error 63 572.45 9.09
Total 65 593.03
Analysis of Variance on PRE2
Source DF SS MS F p
Group 2 1.12 0.56 0.11 0.895
Error 63 317.14 5.03
Total 65 318.26

12.62. Stemplots and summary statistics are below. Some of the distributions have mild
outliers or skewness, but there are no serious violations of Normality evident. Pooling is
appropriate for all three response variables.

Variable    sp      SE_D    t** SE_D
POST1     3.1885  0.9614   2.3646
POST2     2.3785  0.7171   1.7639
POST3     6.3141  1.9038   4.6825

The three F statistics, all with df 2 and 63, are 5.32 (P = 0.007), 8.41 (P = 0.001), and
4.48 (P = 0.015). We conclude that at least one mean is different for each posttest.
For multiple comparisons, we have three comparisons, df 63, and α = 0.05, so the Bonferroni
critical value is t** = 2.4596. The table above gives the pooled standard
deviations, standard error of each difference, and values of t** SE_D (the “minimum significant
difference,” or MSD) for each response variable. For both POST1 and POST3, DRTA is
significantly greater than Basal, but no other comparisons are significant. For POST2, Strat is
significantly greater than both Basal and DRTA.
We also may examine the contrasts ψ1 = −µB + ½(µD + µS), which is positive if the
average of the new methods is greater than the basal mean, and ψ2 = µD − µS, which
compares the two new methods. Estimated contrasts, standard errors, t, and P are given in
the table below. (The P-values are one-sided for ψ1 and two-sided for ψ2.) We see that c1 is
significantly positive for all three variables. The second contrast was included in our multiple
comparisons, where we found the difference significant only for POST2, but this time, when
we are testing only this difference, rather than all three possible differences, we conclude that
µD > µS for POST1, in addition to the difference for POST2.

Variable c1 SEc1 t1 P1 c2 SEc2 t2 P2


POST1 2.0909 0.8326 2.51 0.0073 2.0000 0.9614 2.08 0.0415
POST2 1.7500 0.6211 2.82 0.0032 −2.1364 0.7171 −2.98 0.0041
POST3 4.4545 1.6487 2.70 0.0044 2.4545 1.9038 1.29 0.2019
[Means plots: mean POST1, POST2, and POST3 scores by teaching method (Basal, DRTA, Strat).]

Test   Method    x        s
POST1  Basal    6.6818  2.7669
       DRTA     9.7727  2.7243
       Strat    7.7727  3.9271
POST2  Basal    5.5455  2.0407
       DRTA     6.2273  2.0915
       Strat    8.3636  2.9040
POST3  Basal   41.0455  5.6356
       DRTA    46.7273  7.3884
       Strat   44.2727  5.7668

Basal/POST1   DRTA/POST1   Strat/POST1
 1             1            1 0
 2 0           2            2
 3 0           3            3 0
 4 000         4            4 0000
 5 00000       5 00         5 00
 6 0           6            6 0
 7 00          7 000        7 000
 8 000         8 0000       8 0
 9 000         9            9 00
10 0          10 0000      10
11            11 00        11 00
12 00         12 000       12 00
13            13 00        13 0
14            14 00        14 0
15            15           15 0

Basal/POST2  DRTA/POST2     Strat/POST2   Basal/POST3  DRTA/POST3  Strat/POST3
 0            0 0            0             3            3 01        3
 1            1              1 0           3 223        3           3 33
 2            2              2             3 5          3           3 4
 3 000        3 0            3             3 66         3 7         3
 4 00000      4              4 0           3 99         3           3 8
 5 000000     5 00           5 00          4 0011       4 01        4 1
 6 0          6 0000000000   6 0           4 23         4 23        4 2223
 7 00         7 0000         7 00          4 4555       4           4 455
 8 000        8 00           8 000         4 66         4 7         4 7
 9 0          9 0            9 0000        4 9          4 8889999   4 888999
10 0         10             10 000         5            5 0         5 01
11           11 0           11 00          5            5 33        5 3
12           12             12 00          5 4          5 455       5
13           13             13 0           5            5 7         5

Minitab output: Analysis of Variance on POST1


Source DF SS MS F p
Method 2 108.1 54.1 5.32 0.007
Error 63 640.5 10.2
Total 65 748.6
Analysis of Variance on POST2
Source DF SS MS F p
Method 2 95.12 47.56 8.41 0.001
Error 63 356.41 5.66
Total 65 451.53
Analysis of Variance on POST3
Source DF SS MS F p
Method 2 357.3 178.7 4.48 0.015
Error 63 2511.7 39.9
Total 65 2869.0

12.63. The scatterplot of attractiveness score against number of friends suggests that a straight
line is not the best choice of a model. Regression gives the formula
   predicted Score = 4.432 − 0.000102 · Friends
Not surprisingly, the slope is not significantly different from 0 (t = −0.28, P = 0.782).
The regression only explains 0.1% of the variation in score. The residual plot
is nearly identical to the scatterplot, and suggests (as that did) that a quadratic model
might be a better choice.
Note: If one fits a quadratic model, it does better (and has significant coefficients), but it
still only explains 8.3% of the variation in attractiveness.
[Plots: attractiveness score vs. number of friends, and residuals vs. number of friends.]

Minitab output: Regression of attractiveness score on number of friends


The regression equation is Score = 4.43 -0.000102 Friends
Predictor Coef Stdev t-ratio p
Constant 4.4321 0.2060 21.51 0.000
Friends -0.0001023 0.0003694 -0.28 0.782
s = 1.150 R-sq = 0.1% R-sq(adj) = 0.0%

12.64. The pooled standard deviation sp is found by looking at the spread of each observation
about its group mean x̄i. The “total” standard deviation s given in Exercise 12.30 is the
spread about the grand mean (the mean of all the data values, ignoring distinctions between
groups). When we ignore group differences, we have more variation (uncertainty) in our
data, so s is almost always larger than sp.
This can be made clearer (to sufficiently mathematical students) by noting that the total
variance s² can be found in the ANOVA table:
   Just as sp² = SSE/DFE = MSE, s² = SST/DFT = MST.
(The total mean square is not included in the ANOVA table but is easily computed from the
values on the bottom line.) Because SSM + SSE = SST, we always have SSE ≤ SST, with
equality only when the model is completely worthless (that is, when all group means equal
the grand mean, so that SSM = 0). Because DFE < DFT, it might be that MSE ≥ MST but
that does not happen very often.

12.66. With σ = 7 and means µ1 = 40, µ2 = 47, and µ3 = 43, we have
µ̄ = (40 + 47 + 43)/3 ≈ 43.33 and noncentrality parameter
   λ = n Σ(µi − µ̄)²/σ² = (10)[(40 − 43.33)² + (47 − 43.33)² + (43 − 43.33)²]/49 ≈ (10)(24.67)/49 ≈ 5.0340
(The value of λ in the G•Power output below is slightly different due to rounding.) The
degrees of freedom and critical value are the same as in Example 12.27: df 2 and 27,
F* = 3.35. Software reports the power as about 46%. Samples of size 10 are not adequate
for this alternative; we should increase the sample size so that we have a better chance of
detecting it. (For example, samples of size 20 give nearly 80% power for this alternative.)

G•Power output
Post-hoc analysis for "F-Test (ANOVA)", Global, Groups: 3:
Alpha: 0.0500
Power (1-beta): 0.4606
Effect size "f": 0.4096
Total sample size: 30
Critical value: F(2,27) = 3.3541
Lambda: 5.0332
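Note: The power computation can be reproduced with scipy’s noncentral F distribution; here is a minimal Python sketch (not part of the original solution):

Python sketch:
import numpy as np
from scipy.stats import f, ncf

mu, sigma, n = np.array([40, 47, 43]), 7, 10
lam = n * ((mu - mu.mean()) ** 2).sum() / sigma**2   # about 5.034

df1, df2 = 2, 27
Fcrit = f.ppf(0.95, df1, df2)              # about 3.35
print(lam, ncf.sf(Fcrit, df1, df2, lam))   # power, about 0.46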

12.67. (a) Sampling plans will vary but should attempt to address how cultural groups will be
determined: Can we obtain such demographic information from the school administration?
Do we simply select a large sample then poll each student to determine if he or she belongs
to one of these groups? (b) Answers will vary with choice of Ha and desired power. For
example, with the alternative µ1 = µ2 = 4.4, µ3 = 5, and standard deviation σ = 1.2,
three samples of size 75 will produce power 0.89. (See G•Power output below.) (c) The
report should make an attempt to explain the statistical issues involved; specifically, it should
convey that sample sizes are sufficient to detect anticipated differences among the groups.

G•Power output
Post-hoc analysis for "F-Test (ANOVA)", Global, Groups: 3:
Alpha: 0.0500
Power (1-beta): 0.8920
Effect size "f": 0.2357
Total sample size: 225
Critical value: F(2,222) = 3.0365
Lambda: 12.4998

12.68. Recommended sample sizes will vary with choice of Ha and desired power. For
example, with the alternative µ1 = µ2 = 0.22, µ3 = 0.24, and standard deviation σ = 0.015,
three samples of size 10 will produce power 0.84, and samples of size 15 increase the power
to 0.96. (See G•Power output below.) The report should make an attempt to explain the
statistical issues involved; specifically, it should convey that sample sizes are sufficient to
detect anticipated differences among the groups.

G•Power output
Post-hoc analysis for "F-Test (ANOVA)", Global, Groups: 3:
Alpha: 0.0500
Power (1-beta): 0.8379
Effect size "f": 0.6285
Total sample size: 30
Critical value: F(2,27) = 3.3541
Lambda: 11.8504
Note: Accuracy mode calculation.

Post-hoc analysis for "F-Test (ANOVA)", Global, Groups: 3:


Alpha: 0.0500
Power (1-beta): 0.9622
Effect size "f": 0.6285
Total sample size: 45
Critical value: F(2,42) = 3.2199
Lambda: 17.7756

12.69. The design can be similar, although the types of music might be different. Bear in
mind that spending at a casual restaurant will likely be less than at the restaurants examined
in Exercise 12.28; this might also mean that the standard deviations could be smaller. A
pilot study might be necessary to get an idea of the size of the standard deviations. Decide
how big a difference in mean spending you would want to detect, then do some power
computations.
Chapter 13 Solutions

13.1. (a) Two-way ANOVA is used when there are two factors (explanatory variables). (The
outcome [response] variable is assumed to have a Normal distribution, meaning that it can
take any value, at least in theory.) (b) Each level of A should occur with all three levels
of B. (Factor A has two levels.) (c) The RESIDUAL part of the model represents the error.
(d) DFAB = (I − 1)(J − 1).

13.2. (a) Parallel profiles imply that there is no interaction. (b) It is not necessary that all
sample sizes be the same. (The standard deviations must all be the same.) (c) sp2 is found
by pooling the sample variances for each SRS. (d) The main effects can give useful
information even in the presence of an interaction.

13.3. (a) A large value of the AB F statistic indicates that we should reject the hypothesis
of no interaction. (b) The relationship is backwards: Mean squares equal sum of squares
divided by degrees of freedom. (c) Under H0 , the ANOVA test statistics have an F
distribution. (d) If the sample sizes are not the same, the sums of squares may not add
for “some methods of analysis.” (See the ‘Caution’ on page 680; for more detail, see
http://afni.nimh.nih.gov/sscc/gangc/SS.html.)

13.4. (a) Yes: The factor-A means change more drastically under B2 than under B1.
(b) No interaction (the lines are perfectly parallel). (c) Yes: The factor-A means increase
under B1, and decrease under B2. (d) Yes: When A changes from level 2 to level 3, the
means increase under B1 and decrease under B2.
[Means plots (a)–(d): group means against level of factor A (A1, A2, A3), with separate lines for B1 and B2.]


13.5. A 3 × 2 ANOVA with 5 observations per cell has I = 3, J = 2, and N = 30. (a) The
degrees of freedom for interaction are DFAB = (I − 1)(J − 1) = 2 and DFE = N − IJ = 24.
The five critical values from Table E are 2.54, 3.40, 4.32, 5.61, and 9.34. (b) The sketch
shows the observed value F = 3.72 from part (c) between the bounding critical values
3.40 (p = 0.05) and 4.32 (p = 0.025) from Table E. (c) In Table E, we see
that 3.40 < F < 4.32, so 0.025 < P < 0.05. (Software gives 0.0392.) (d) The mean profiles
would not look parallel because the interaction term is significantly different from 0.

13.6. The answers are found in Table E (or using software) with P = 0.05. (a) We have I = 2,
J = 4 and N = 24, so DFA = 1 and DFE = 16. We would reject H0 if F > 4.49 (software
gives 4.4940). (b) We have I = J = 4 and N = 32, so DFAB = 9 and DFE = 16. We
would reject H0 if F > 2.54 (software: 2.5377). (c) We have I = J = 2 and N = 204, so
DFAB = 1 and DFE = 200. We would reject H0 if F > 3.89 (software: 3.8884).

[Sketches: F distributions with the rejection cutoffs 4.49, 2.54, and 3.89 marked.]

13.7. (a) The factors are gender (I = 2) and age (J = 3). The response variable is the percent
of pretend play. The total number of observations is N = (2)(3)(11) = 66. (b) The factors
are time after harvest (I = 5) and amount of water (J = 2). The response variable is
the percent of seeds germinating. The total number of observations is N = 30 (3 lots of
seeds in each of the 10 treatment combinations). (c) The factors are mixture (I = 6) and
freezing/thawing cycles (J = 3). The response variable is the strength of the specimen. The
total number of observations is N = 54. (d) The factors are training programs (I = 4) and
the number of days to give the training (J = 2). The response variable is not specified, but
presumably is some measure of the training’s effectiveness. The total sample size is N = 80.

13.8. The table below summarizes the degrees of freedom for each source.

Source                     (a)  (b)  (c)  (d)
        I =                  2    5    6    4
        J =                  3    2    3    2
        N =                 66   30   54   80
A       I − 1 =              1    4    5    3
B       J − 1 =              2    1    2    1
AB      (I − 1)(J − 1) =     2    4   10    3
Error   N − IJ =            60   20   36   72

13.9. (a) There appears to be an interaction: A thank-you note increases repurchase intent by
over 1 point for those with short history, and decreases it (very slightly) for customers
with long history. Note that either variable could be on the horizontal axis in the plot of
means. (b) The marginal means are

Short history  6.245      No thank-you note  6.61
Long history   7.45       Thank-you note     7.085

For example, (5.69 + 6.80)/2 = 6.245. The history marginal means convey the fact that
repurchase intent is higher for customers with long history. The thank-you note marginal
means suggest that a thank-you note increases repurchase intent, but they are harder to
interpret because of the interaction.
[Means plot: repurchase intent by transaction history (short, long), with and without a thank-you note.]

13.10. With I = J = 2 levels for each factor, the three missing entries in the DF column are
all 1. The MS entries are computed as SS/DF, and the F statistics as MS/MSE, where
MSE = SSE/DFE. Comparing each test statistic to an F(1, 160) distribution gives the P-values.
Source DF SS MS F P-value
Transaction history 1 61.445 61.445 12.94 0.0004
Thank-you statement 1 21.810 21.810 4.59 0.0336
Interaction 1 15.404 15.404 3.24 0.0736
Error 160 759.904 4.7494
The interaction is not quite significant, but the two main effects are.
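Note: A minimal Python sketch (not part of the original solution) reproducing the F statistics and P-values in the table:

Python sketch:
from scipy.stats import f

MSE, dfe = 4.7494, 160
for name, SS in [("Transaction history", 61.445),
                 ("Thank-you statement", 21.810),
                 ("Interaction", 15.404)]:
    MS = SS / 1                          # each effect has 1 df here
    F = MS / MSE
    print(name, round(F, 2), round(f.sf(F, 1, dfe), 4))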

13.11. (a) The plot suggests a possible interaction because the means are not parallel. (Note
that we could have chosen to put dish type on the horizontal axis instead of proximity;
either explanatory variable will do.) (b) By subjecting the same individual to all four
treatments, rather than four individuals to one treatment each, we reduce the within-groups
variability (the residual), which makes it easier to detect between-groups variability (the
main effects and interactions).
[Means plot: reported minus actual consumption by proximity (proximate, less proximate), for opaque and clear dishes.]

13.12. (a) The plot suggests a gender effect: Men had higher postexercise blood pressure (BP)
than women. There also appears to be an interaction: BP was higher for endurance-trained
women than for sedentary women (as the researchers had hypothesized), but for men, that
pattern was reversed. (b) The complete ANOVA table is given in the Minitab output below.
The apparent interaction noted in (a) was not significant, but there is a significant effect of
gender. (c) Subjects with a high before-exercise BP are likely to have higher postexercise BP,
as well. By incorporating both measurements, the researchers can focus on the change in BP
after exercising, which should be a better measure of the effect of exercise.
Note: The fact that the interaction was not significant, despite the appearance of the
plot, is due to the large variation in individual BPs, indicated by the sizes of the standard
errors given in the table. (Observe that these are standard errors, not standard deviations.)
One reason for measuring change in BP—as suggested in (c)—is that we might expect this
measurement to have less subject-to-subject variation.
[Means plot: systolic BP (mm Hg) by training level (sedentary, endurance), for men and women.]

Minitab output: Two-way ANOVA for BP on gender and training level


Source DF SS MS F P
Gender 1 677.12 677.12 7.65 0.010
Training 1 0.72 0.72 0.01 0.929
Gender*Training 1 147.92 147.92 1.67 0.207
Error 28 2478.00 88.50
Total 31 3303.76

13.13. (a) There may be an interaction: For a favorable process, a favorable outcome increases
satisfaction quite a bit more than for an unfavorable process (+2.32 versus +0.24). (b) With
humor, the increase in satisfaction from a favorable outcome is less for a favorable process
(+0.49 compared to +1.32). (c) There seems to be a three-factor interaction, because the
interactions in parts (a) and (b) are different.
[Means plots: mean satisfaction rating by outcome (unfavorable, favorable) for favorable and unfavorable processes, without humor (left) and with humor (right).]

13.14. For the pooled standard deviation, we first find
   sp² = [(26)(0.79²) + (28)(0.47²) + ··· + (29)(0.71²)]/(26 + 28 + ··· + 29) = 88.6838/233 ≈ 0.3806
so sp = √0.3806 ≈ 0.6169. There were N = 241 students in the sample, and 8 groups,
so this has df = 241 − 8 = 233. The largest-to-smallest standard deviation ratio is
0.79/0.47 ≈ 1.68 < 2, so it is reasonable to use this pooled estimate.
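Note: Because the exercise table lists eight (df, s) pairs and only a few are quoted above, the Python sketch below (not part of the original solution) just defines the general pooling computation as a hypothetical helper; feeding it the full lists from the exercise reproduces sp ≈ 0.6169.

Python sketch:
import math

def pooled_sd(dfs, sds):
    """Pool group sds: sp^2 = sum(df_i * s_i^2) / sum(df_i)."""
    sp2 = sum(d * s * s for d, s in zip(dfs, sds)) / sum(dfs)
    return math.sqrt(sp2)

# Calling pooled_sd with the eight df values and eight standard
# deviations from the exercise table gives about 0.6169.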

13.15. Marginal means are listed in the table below. In each case, we find the average of the
four means for each level of the characteristic. For example, for Humor, we have
   No humor: (3.04 + 5.36 + 2.84 + 3.08)/4 = 3.58
   Humor: (5.06 + 5.55 + 1.95 + 3.27)/4 = 3.9575
The presence of humor slightly increases mean satisfaction. The process and outcome effects
appear to be greater (that is, the change in mean satisfaction is greater).
Marginal means
Humor Process Outcome
No 3.58 Favorable 4.7525 Favorable 4.315
Yes 3.9575 Unfavorable 2.785 Unfavorable 3.2225

13.16. (a) Within a given culture, females generally have a more positive attitude toward
cooking than males. Attitudes in France are less positive than those in the U.S. and
Canada. (b) While the means plot is not perfectly parallel, it is not clear that it indicates
an interaction. If an interaction is present, it is that the female/male difference is greatest
in Canada, and least in France.
[Means plot: average attitude toward cooking by culture (Canada, United States, France), for females and males.]

13.17. For the pooled standard deviation, we first find
   sp² = [(237)(1.668²) + (124)(1.909²) + ··· + (86)(1.875²)]/(237 + 124 + ··· + 86) ≈ 2535.19/805 ≈ 3.1493
so sp = √3.1493 ≈ 1.7746. The largest-to-smallest standard deviation ratio is
2.024/1.601 ≈ 1.26 < 2, so it is reasonable to use this pooled estimate.

13.18. (a) With a total sample size of N = 811, and six groups, we have df = 805. (b) There
are 6 means, so there are (6 · 5)/2 = 15 comparisons. (c) Using sp ≈ 1.7746 from the previous
exercise, the t statistics are tij = (x̄i − x̄j)/SEij, where SEij = sp √(1/ni + 1/nj). The complete
set of differences, standard errors, and t values is listed below; the eight
largest differences (marked with an asterisk) are significant.
Note: If doing these computations by hand, it is best to start with the largest differences
and work down until one finds a difference that is not significant. (One should then check a
few more, as there might be one or two more significant differences remaining because the
standard errors vary with sample size.) In this set of data, for example, there is little reason
to check for a significant difference between the Canadian male, U.S. male, and French
female means.

Means Difference SE t
Female/Canada – Male/Canada 1.31 0.1960 6.6828 *
Female/Canada – Female/U.S. 0.34 0.1759 1.9334
Female/Canada – Male/U.S. 1.27 0.2107 6.0262 *
Female/Canada – Female/France 1.32 0.2272 5.8088 *
Female/Canada – Male/France 2.01 0.2223 9.0406 *
Male/Canada – Female/U.S. −0.97 0.2071 −4.6839 *
Male/Canada – Male/U.S. −0.04 0.2374 −0.1685
Male/Canada – Female/France 0.01 0.2522 0.0397
Male/Canada – Male/France 0.70 0.2478 2.8251
Female/U.S. – Male/U.S. 0.93 0.2211 4.2067 *
Female/U.S. – Female/France 0.98 0.2369 4.1376 *
Female/U.S. – Male/France 1.67 0.2321 7.1937 *
Male/U.S. – Female/France 0.05 0.2638 0.1895
Male/U.S. – Male/France 0.74 0.2596 2.8508
Female/France – Male/France 0.69 0.2731 2.5262

13.19. Means plots are below. Possible observations: Except for female responses to purchase
intention, means decreased from Canada to the United States to France. Females had
higher means than men in almost every case, except for French responses to credibility
and purchase intention (suggesting a modest interaction). Gender differences in France are
considerably smaller than in either Canada or the United States.
[Means plots: general attitude toward functional foods, product benefits of functional foods, credibility of information about functional foods, and purchase intention for functional foods, by culture (Canada, United States, France), for females and males.]

13.20. Opinions of undergraduate students might be similar to a large segment of the young
adult population, but this sample is probably not an unbiased representation of that group.
Filling out the surveys in class might also affect the usefulness of the responses in some
way (although it is difficult to predict what that effect might be).

13.21. (a) The marginal means (as well as the individual cell means) are in the table below.
The first two means suggest that the intervention group showed more improvement than the
control group. (b) Interaction means that the mean number of actions changes differently
over time for the two groups. We see this in the plot below because the lines connecting the
means are not parallel.

                          Time
Group          Baseline   3 mo.   6 mo.    Mean
Intervention     10.4     12.5    11.9    11.6
Control           9.6      9.9    10.4     9.967
Mean             10.0     11.2    11.15   10.783

[Figure: means plot of mean number of behaviors at baseline, 3 months, and 6 months, with
separate lines for the intervention and control groups.]
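Note: The means plot is easy to reproduce from the cell means in the table; a minimal sketch
(Python with matplotlib; our illustration, not part of the text):

import matplotlib.pyplot as plt

times = ["Baseline", "3 months", "6 months"]
means = {"Intervention": [10.4, 12.5, 11.9], "Control": [9.6, 9.9, 10.4]}

# Non-parallel lines in this plot suggest a group-by-time interaction.
for group, ys in means.items():
    plt.plot(times, ys, marker="o", label=group)
plt.ylabel("Mean number of behaviors")
plt.legend()
plt.show()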

13.22. (a) The data might be displayed in a variety of ways. Because there are so many
numbers (intervention and control groups, at baseline, 3 months, and 6 months), the graph can
very easily become overwhelmingly crowded; to avoid this, the graph below shows the
percentage for each group averaged over the three times. Any reasonable graphical display
will likely be judged more effective than Table 13.1; for the most part, it is easier to
interpret pictures than lists of numbers. (b) The behaviors seemed to fall into two
categories: those that both groups did most of the time and those that were less common. The
biggest differences between the control and intervention groups are in the latter category,
which includes the first five and the 11th behaviors: hide money, hide extra keys, abuse code
to alert family, hide extra clothing, asked neighbors to call police, removed weapons. These
behaviors should receive special attention in future programs. (c) The results of this study
may be less applicable to smaller communities, or to those which are less diverse.

[Figure: bar graph of the percent displaying each of the 15 behaviors, averaged over the
three times, for the intervention and control groups.]

13.23. We have I = 3, J = 2, and N = 30, so the degrees of freedom are DFA = 2, DFB = 1,
DFAB = 2, and DFE = 24. This allows us to determine P-values (or to compare to
Table E), and we find that there are no significant effects (although B is close):
FA = 1.87 has df 2 and 24, so P = 0.1759
FB = 3.49 has df 1 and 24, so P = 0.0740
FAB = 2.14 has df 2 and 24, so P = 0.1396
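Note: These P-values are upper-tail areas of the corresponding F distributions; a one-line
check for each (Python with scipy; our illustration, not part of the text):

from scipy import stats

# Upper-tail F probabilities for the three tests (DFE = N - IJ = 24)
for name, f, df1 in [("A", 1.87, 2), ("B", 3.49, 1), ("AB", 2.14, 2)]:
    print(name, stats.f.sf(f, df1, 24))  # 0.1759, 0.0740, 0.1396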

13.24. (a) Based on the given P-values, the interaction and the main effect of B are significant
at α = 0.05. (b) In order to summarize the results, we would need to know the number of
levels for each factor (I and J) and the sample sizes in each cell (nij). We also would want
to know the sample cell means x̄ij so that we could interpret the significant main effect and
the nature of the interaction.

13.25. (a) The means are nearly parallel, and show little evidence of an interaction.
(b) With equal sample sizes, the pooled variance is simply the unweighted average of the
variances:

    sp² = (1/4)(0.12² + 0.14² + 0.12² + 0.13²) = 0.016325

Therefore, sp = √0.016325 ≈ 0.1278. (c) Note that all of these contrasts have been arranged
so that, if the researchers' suspicions are correct, the contrast will be positive. To
compare new-car testosterone change to old-car change, the appropriate contrast is

    ψ1 = (1/2)(μnew,city + μnew,highway) − (1/2)(μold,city + μold,highway)

To compare city change to highway change for new cars, we take

    ψ2 = μnew,city − μnew,highway

To compare highway change to city change for old cars, we take

    ψ3 = μold,highway − μold,city

(d) By subjecting the same individual to all four treatments, rather than four individuals to
one treatment each, we reduce the within-groups variability (the residual), which makes it
easier to detect between-groups variability (the main effects and interactions).

[Figure: means plot of change in testosterone level for city and highway driving, in an old
sedan versus a new sports car.]
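Note: A contrast estimate and its standard error follow the usual formulas ψ̂ = Σ ai·x̄i and
SE = sp√(Σ ai²/ni). The sketch below (Python; our illustration, not part of the text) uses
placeholder cell means and sample sizes, since this solution quotes only the variances.

import math

SP = 0.1278  # pooled standard deviation from part (b)

def contrast(coefs, means, ns, sp=SP):
    est = sum(a * m for a, m in zip(coefs, means))
    se = sp * math.sqrt(sum(a * a / n for a, n in zip(coefs, ns)))
    return est, se

# psi_1 with cells ordered (new,city), (new,highway), (old,city), (old,highway).
# Placeholder means and n's, for illustration only:
print(contrast([0.5, 0.5, -0.5, -0.5], [0.09, 0.10, 0.01, 0.02], [20] * 4))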

13.26. (a) The significant interaction (P = 0.016) is visible in the plot of means: males and
females in Canada report similar amounts of peer pressure, while males in Germany and Israel
report less peer pressure than females. (Note that we could have chosen to put country on the
horizontal axis instead of gender.) (b) It is not entirely clear how the value placed on
achievement relates to fear of being called a nerd or teacher's pet. One view might be that
in a culture that values achievement, peers should be less likely to "punish" those who do
well. If we take this view, the data would refute the researchers' hypothesis. One could also
argue that placing a high value on achievement might make students more competitive, and
name-calling might be an expression of that competitiveness. Under this view, the low peer-
pressure scores for Germany support the researchers' hypothesis. (c) With both responses, we
could explore the relationship between achievement and fear of peer pressure—for example,
are high-achieving students more or less concerned about peer pressure than low-achieving
students?

[Figure: means plot of the effect of peer pressure for females and males, with separate lines
for Canada, Israel, and Germany.]

13.27. (a) Plot below. (b) There seems to be a fairly large difference between the means
based on how much the rats were allowed to eat, but not very much difference based on the
chromium level. There may be an interaction: the NM mean is lower than the LM mean, while the
NR mean is higher than the LR mean. (c) The marginal means are L: 4.86, N: 4.871, M: 4.485,
R: 5.246. For low chromium level (L), R minus M is 0.63; for normal chromium (N), R minus M
is 0.892. Mean GITH levels are lower for M than for R; there is not much difference for L
versus N. The difference between M and R is greater among rats who had normal chromium levels
in their diets (N).

[Figure: means plot of mean GITH level at low and normal chromium levels, with separate lines
for the Eat-M and Eat-R groups.]

13.28. The "Other" category had the lowest mean HS math grades for both genders; this is
apparent from the graph below and from the marginal means (CS: 8.895, EO: 8.855, O: 7.845).
Females had higher mean grades; the female marginal mean is 8.836 compared to 8.226 for
males. The female − male difference is similar for CS and O (about 0.5) but is about twice as
big for EO (an interaction).

[Figure: means plot of mean HS math grades for males and females, with separate lines for the
CS, EO, and O groups.]

13.29. The "Other" category had the lowest mean SATM score for both genders; this is apparent
from the graph below and from the marginal means (CS: 605, EO: 624.5, O: 566). Males had
higher mean scores in CS and O, while females are slightly higher in EO; this indicates an
interaction. Overall, the marginal means by gender are 611.7 (males) and 585.3 (females).

[Figure: means plot of mean SATM score for males and females, with separate lines for the CS,
EO, and O groups.]

13.30. A study today might include a category for those who declared a major such as
Information Technology (which probably did not exist at the time of the initial study). Some
variables that might be useful to consider: grade in first programming course, high school
physics grades, etc.

13.31. (a) The pooled variance is

    sp² = [(31)(36.4²) + (24)(31.2²) + (24)(41.6²) + (26)(42.4²)] / (31 + 24 + 24 + 26)
        = 152,711.52/105 ≈ 1454.4

so sp = √1454.4 ≈ 38.14. There were N = 109 items in the sample, and four groups, so
df = 105. (b) Pooling is reasonable because the ratio of the largest and smallest standard
deviations is 42.4/31.2 ≈ 1.36 < 2. (c) The marginal means are:

    Sender:     Individual: (1/2)(65.5 + 76.3) = $70.90    Group: (1/2)(54.0 + 43.7) = $48.85
    Responder:  Individual: (1/2)(65.5 + 54.0) = $59.75    Group: (1/2)(76.3 + 43.7) = $60.00

[Figure: means plot of the amount ($) sent by individual and group senders to individual and
group responders.]

(d) There appears to be an interaction: Individuals send more money to groups, while groups
send more money to individuals. (e) Compare the statistics to an F(1, 105) distribution. The
three P-values are 0.0033 (sender), 0.9748 (responder), and 0.1522 (interaction). Only the
main effect of sender is significant.

13.32. (a) The sample size is n = 4 for each pot/food combination; means and standard
deviations are given in the table below. The largest-to-smallest standard deviation ratio is
0.6283/0.0714 ≈ 8.8, which is well above our guideline for pooling. (b) The iron levels
differed among the three food types, and for all food types, aluminum and clay pots produced
similar iron levels, while iron pots resulted in much higher iron levels. There is also
evidence of an interaction: Iron levels in iron pots rose much more for meat than for legumes
or vegetables. (c) The ANOVA table (below) shows that all three effects are quite
significant.

[Figure: means plot of mean iron level (mg/100 g) for aluminum, clay, and iron pots, with
separate lines for meat, legumes, and vegetables.]

                              Iron (mg per 100 g)
               Meat                 Legumes             Vegetables
Pot type     x̄        s          x̄        s           x̄        s
Aluminum   2.0575   0.2520     2.3300   0.1111      1.2325   0.2313
Clay       2.1775   0.6213     2.4725   0.0714      1.4600   0.4601
Iron       4.6800   0.6283     3.6700   0.1726      2.7900   0.2399

Minitab output: Two-way ANOVA for iron on pot and food


Source DF SS MS F P
Pot 2 24.8940 12.4470 92.26 0.000
Food 2 9.2969 4.6484 34.46 0.000
Pot*Food 4 2.6404 0.6601 4.89 0.004
Error 27 3.6425 0.1349
Total 35 40.4738
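Note: For readers working in Python rather than Minitab, a two-way ANOVA with interaction can
be fit as sketched below (statsmodels; our illustration — the file and column names are
hypothetical, not from the text).

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Long-format data: one row per measurement, with columns "iron", "pot", "food"
df = pd.read_csv("iron.csv")  # hypothetical file name
model = smf.ols("iron ~ C(pot) * C(food)", data=df).fit()
print(anova_lm(model, typ=2))  # F and P for pot, food, and the interaction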

13.33. Yes; the iron-pot means are the highest, and the F statistic for testing the effect of
the pot type is very large. (In this case, the interaction does not weaken any evidence that
iron-pot foods contain more iron; it only suggests that while iron pots increase iron levels in
all foods, the effect is strongest for meats.)

13.34. The ANOVA table (below) shows significant evidence that at least one group mean is
different. With df = 27, 36 comparisons, and α = 0.05, the Bonferroni critical value is
t** = 3.5629. The pooled standard deviation is sp ≈ 0.3673, and the standard error of each
difference is SE_D = sp√(1/4 + 1/4) ≈ 0.2597, so two means are significantly different if
they differ by t**·SE_D ≈ 0.9253. The "error bars" in the plot below are drawn with this
length (above and below each mean), so two means are significantly different if the "dot" for
one mean does not fall within the other mean's error bars. For example, we find that
iron/meat is significantly larger than everything else, and iron/legumes is significantly
different from everything except iron/vegetable. These conclusions are consistent with the
results of the two-way ANOVA.

[Figure: plot of the nine pot/food combination means with Bonferroni error bars: Alum/Meat,
Alum/Leg., Alum/Veg., Clay/Meat, Clay/Leg., Clay/Veg., Iron/Meat, Iron/Leg., Iron/Veg.]

Minitab output: One-way ANOVA for iron on pot/food combinations


Source DF SS MS F p
Potfood 8 36.831 4.604 34.13 0.000
Error 27 3.643 0.135
Total 35 40.474
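Note: The Bonferroni quantities quoted above can be verified directly (Python with scipy; our
illustration, not part of the text):

import math
from scipy import stats

sp, df_error, n, m = 0.3673, 27, 4, 36  # pooled SD, error df, cell size, comparisons
t_star = stats.t.ppf(1 - 0.05 / (2 * m), df_error)  # two-sided Bonferroni critical value
se_d = sp * math.sqrt(1 / n + 1 / n)
print(t_star, t_star * se_d)            # about 3.5629 and 0.9253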

13.35. (a) For all tool/time combinations, n = 3. Means and standard deviations are in the
table below. Note that five cells had no variability (s = 0). (b) Plot below. Except for
tool 1, mean diameter is highest at time 2. Tool 1 had the highest mean diameters, followed
by tool 2, tool 4, tool 3, and tool 5. (c) Minitab output below; all F statistics are highly
significant. (d) There is strong evidence of a difference in mean diameter among the tools
and among the times. There is also an interaction (specifically, tool 1's mean diameters
changed differently over time compared to the other tools).

[Figure: means plot of mean diameter (mm) for tools 1 through 5, with separate lines for
times 1, 2, and 3.]

Minitab output: Two-way ANOVA for diameter on tool and time


Source DF SS MS F P
Tool 4 0.00359720 0.00089930 412.94 0.000
Time 2 0.00018991 0.00009496 43.60 0.000
Tool*Time 8 0.00013320 0.00001665 7.65 0.000
Error 30 0.00006533 0.00000218
Total 44 0.00398564

                                Diameter (mm)
       Time 1 (8:00 AM)       Time 2 (11:00 AM)      Time 3 (3:00 PM)
Tool     x̄          s           x̄          s           x̄          s
1      25.0307   0.001155     25.0280   0            25.0260   0
2      25.0167   0.001155     25.0200   0.002000     25.0160   0
3      25.0063   0.001528     25.0127   0.001155     25.0093   0.001155
4      25.0120   0            25.0193   0.001155     25.0140   0.004000
5      24.9973   0.001155     25.0060   0            25.0003   0.001528

13.36. All means and standard deviations will change by a factor of 0.04; the plot is
identical to that in Exercise 13.35, except that the vertical scale is different. All SS and
MS values change by a factor of 0.04² = 0.0016, but the F- (and P-) values are the same.

Minitab output: Two-way ANOVA for diameter on tool and time


Source DF SS MS F P
Tool 4 0.0000058 0.0000014 412.94 0.000
Time 2 0.0000003 0.0000002 43.60 0.000
Tool*Time 8 0.0000002 0.0000000 7.65 0.000
Error 30 0.0000001 0.0000000
Total 44 0.0000064

13.37. (a) All three F values have df 1 and 945, and the P-values are < 0.001, < 0.001, and
0.1477. Gender and handedness both have significant effects on mean lifetime, but there is
no significant interaction. (b) Women live about 6 years longer than men (on the average),
while right-handed people average 9 more years of life than left-handed people. “There is no
interaction” means that handedness affects both genders in the same way, and vice versa.

13.38. (a) Fseries = 7.02 with df 3 and 61; this has P = 0.0004. Fholder = 1.96 with df 1 and
61; this has P = 0.1665. Finteraction = 1.24 with df 3 and 61; this has P = 0.3026. Only
the series had a significant effect; the presence or absence of a holder and series/holder
interaction did not significantly affect the mean radon reading. (b) Because the ANOVA
indicates that these means are significantly different, we conclude that detectors produced in
different production runs give different readings for the same radon level. This inconsistency
may indicate poor quality control in production.
Note: In the initial printing of the text, the total sample size (N = 69) was not given,
without which we cannot determine the denominator degrees of freedom for part (a).

13.39. (a) & (b) The table below lists the means and standard deviations (the latter in
parentheses) of the nitrogen contents of the plants. The two plots below suggest that plant 1
and plant 3 have the highest nitrogen content, plant 2 is in the middle, and plant 4 is the
lowest. (In the second plot, the points are so crowded together that no attempt was made
to differentiate among the different water levels.) There is no consistent effect of water
level on nitrogen content. Standard deviations range from 0.0666 to 0.3437, for a ratio of
5.16—larger than we like. (c) Minitab output below. Both main effects and the interaction
are highly significant.
Amount of water per day
Species 50mm 150mm 250mm 350mm 450mm 550mm 650mm
1 3.2543 2.7636 2.8429 2.9362 3.0519 3.0963 3.3334
(0.2287) (0.0666) (0.2333) (0.0709) (0.0909) (0.0815) (0.2482)
2 2.4216 2.0502 2.0524 1.9673 1.9560 1.9839 2.2184
(0.1654) (0.1454) (0.1481) (0.2203) (0.1571) (0.2895) (0.1238)
3 3.0589 3.1541 3.2003 3.1419 3.3956 3.4961 3.5437
(0.1525) (0.3324) (0.2341) (0.2965) (0.2533) (0.3437) (0.3116)
4 1.4230 1.3037 1.1253 1.0087 1.2584 1.2712 0.9788
(0.1738) (0.2661) (0.1230) (0.1310) (0.2489) (0.0795) (0.2090)

[Figure: two means plots of mean percent nitrogen, one versus water level and one versus
species, with plotting symbols 1–4 identifying the species.]

Minitab output: Two-way ANOVA for Pctnit on species and water


Source DF SS MS F P
Species 3 172.3916 57.4639 1301.32 0.000
Water 6 2.5866 0.4311 9.76 0.000
Species*Water 18 4.7446 0.2636 5.97 0.000
Error 224 9.8914 0.0442
Total 251 189.6143

13.40. The residuals appear to be reasonably Normal, with no apparent outliers and no clear
patterns.

[Figure: Normal quantile plot of the residuals.]

13.41. For each water level, there is highly significant evidence of variation in nitrogen
level among plant species (Minitab output below). For each water level, we have df = 32,
6 comparisons, and α = 0.05, so the Bonferroni critical value is t** = 2.8123. (If we take
into account that there are 7 water levels, so that overall we are performing 6 × 7 = 42
comparisons, we should take t** = 3.5579.) The table below gives the pooled standard
deviations sp, the standard errors of each difference SE_D = sp√(1/9 + 1/9), and the
"minimum significant difference" MSD = t**·SE_D (two means are significantly different if
they differ by at least this amount). MSD1 uses t** = 2.8123, and MSD2 uses t** = 3.5579. As
it happens, for either choice of MSD, the only nonsignificant differences are between
species 1 and 3 for water levels 1, 4, and 7. (These are the three closest pairs of points in
the plot from the solution to Exercise 13.39.) Therefore, for every water level, species 4
has the lowest nitrogen level and species 2 is next. For water levels 1, 4, and 7, species 1
and 3 are statistically tied for the highest level; for the other levels, species 3 is the
highest, with species 1 coming in second.

Water level     sp       SE_D     MSD1     MSD2
    1         0.1824   0.0860   0.2418   0.3059
    2         0.2274   0.1072   0.3015   0.3814
    3         0.1912   0.0902   0.2535   0.3208
    4         0.1991   0.0939   0.2640   0.3340
    5         0.1994   0.0940   0.2643   0.3344
    6         0.2318   0.1093   0.3073   0.3887
    7         0.2333   0.1100   0.3093   0.3913

Minitab output: One-way ANOVA on species for water level 1


Source DF SS MS F p
Species 3 18.3711 6.1237 184.05 0.000
Error 32 1.0647 0.0333
Total 35 19.4358
One-way ANOVA on species for water level 2
Source DF SS MS F p
Species 3 17.9836 5.9945 115.93 0.000
Error 32 1.6546 0.0517
Total 35 19.6382
One-way ANOVA on species for water level 3
Source DF SS MS F p
Species 3 22.9171 7.6390 208.87 0.000
Error 32 1.1704 0.0366
Total 35 24.0875
One-way ANOVA on species for water level 4
Source DF SS MS F p
Species 3 25.9780 8.6593 218.37 0.000
Error 32 1.2689 0.0397
Total 35 27.2469
One-way ANOVA on species for water level 5
Source DF SS MS F p
Species 3 26.2388 8.7463 220.01 0.000
Error 32 1.2721 0.0398
Total 35 27.5109
One-way ANOVA on species for water level 6
Source DF SS MS F p
Species 3 28.0648 9.3549 174.14 0.000
Error 32 1.7191 0.0537
Total 35 29.7838
One-way ANOVA on species for water level 7
Source DF SS MS F p
Species 3 37.5829 12.5276 230.17 0.000
Error 32 1.7417 0.0544
Total 35 39.3246

13.42. The F statistics for all four ANOVAs are significant, and all four regressions are
significant as well. However, the regressions all have low R² (varying from 6.4% to 27.3%),
and plots indicate that a straight line is not really appropriate except perhaps for plant 3
(which had the highest R² value).

Minitab output: One-way ANOVA on water for plant species 1


Source DF SS MS F p
Water 6 2.3527 0.3921 14.25 0.000
Error 56 1.5413 0.0275
Total 62 3.8940
One-way ANOVA on water for plant species 2
Source DF SS MS F p
Water 6 1.5626 0.2604 7.51 0.000
Error 56 1.9420 0.0347
Total 62 3.5046
One-way ANOVA on water for plant species 3
Source DF SS MS F p
Water 6 1.9764 0.3294 4.15 0.002
Error 56 4.4464 0.0794
Total 62 6.4228
One-way ANOVA on water for plant species 4
Source DF SS MS F p
Water 6 1.4396 0.2399 6.85 0.000
Error 56 1.9618 0.0350
Total 62 3.4013
Regression of plant species 1 on water
The regression equation is plant1 = 2.88 + 0.0397 Water
Predictor Coef Stdev t-ratio p
Constant 2.88097 0.06745 42.71 0.000
Water 0.03971 0.01508 2.63 0.011
s = 0.2394 R-sq = 10.2% R-sq(adj) = 8.7%
Regression of plant species 2 on water
The regression equation is plant2 = 2.21 - 0.0299 Water
Predictor Coef Stdev t-ratio p
Constant 2.21262 0.06531 33.88 0.000
Water -0.02994 0.01460 -2.05 0.045
s = 0.2318 R-sq = 6.4% R-sq(adj) = 4.9%
Regression of plant species 3 on water
The regression equation is plant3 = 2.95 + 0.0833 Water
Predictor Coef Stdev t-ratio p
Constant 2.95100 0.07797 37.85 0.000
Water 0.08334 0.01743 4.78 0.000
s = 0.2768 R-sq = 27.3% R-sq(adj) = 26.1%
Regression of plant species 4 on water
The regression equation is plant4 = 1.38 - 0.0452 Water
Predictor Coef Stdev t-ratio p
Constant 1.37622 0.06129 22.45 0.000
Water -0.04516 0.01371 -3.29 0.002
s = 0.2176 R-sq = 15.1% R-sq(adj) = 13.7%

[Figure: four scatterplots of nitrogen content (%) versus water level, one for each of
plants 1 through 4.]

13.43. (a) & (b) The tables below list the means and standard deviations (the latter in
parentheses). The means plots show that biomass (both fresh and dry) increases with water
level for all plants. Generally, plants 1 and 2 have higher biomass for each water level,
while plants 3 and 4 are lower. Standard deviation ratios are quite high for both fresh and
dry biomass: 108.011/6.789 ≈ 15.9 and 35.764/3.122 ≈ 11.5. (c) Minitab output below. For
both fresh and dry biomass, main effects and the interaction are significant. (The
interaction for fresh biomass has P = 0.04; other P-values are smaller.)

Minitab output: Two-way ANOVA for fresh biomass


Source DF SS MS F P
Species 3 458295 152765 81.45 0.000
Water 6 491948 81991 43.71 0.000
Species*Water 18 60334 3352 1.79 0.040
Error 84 157551 1876
Total 111 1168129
Two-way ANOVA for dry biomass
Source DF SS MS F P
Species 3 50523.8 16841.3 79.93 0.000
Water 6 56623.6 9437.3 44.79 0.000
Species*Water 18 8418.8 467.7 2.22 0.008
Error 84 17698.4 210.7
Total 111 133264.6

[Figure: two means plots versus water level, one of mean fresh biomass and one of mean dry
biomass, with plotting symbols 1–4 identifying the species.]

Fresh biomass
Species 50mm 150mm 250mm 350mm 450mm 550mm 650mm
1 109.095 165.138 168.825 215.133 258.900 321.875 300.880
(20.949) (29.084) (18.866) (42.687) (45.292) (46.727) (29.896)
2 116.398 156.750 254.875 265.995 347.628 343.263 397.365
(29.250) (46.922) (13.944) (59.686) (54.416) (98.553) (108.011)
3 55.600 78.858 90.300 166.785 164.425 198.910 188.138
(13.197) (29.458) (28.280) (41.079) (18.646) (33.358) (18.070)
4 35.128 58.325 94.543 96.740 153.648 175.360 158.048
(11.626) (6.789) (13.932) (24.477) (22.028) (32.873) (70.105)
Dry biomass
Species 50mm 150mm 250mm 350mm 450mm 550mm 650mm
1 40.565 63.863 71.003 85.280 103.850 136.615 120.860
(5.581) (7.508) (6.032) (10.868) (15.715) (16.203) (17.137)
2 34.495 57.365 79.603 95.098 106.813 103.180 119.625
(11.612) (6.149) (13.094) (25.198) (18.347) (25.606) (35.764)
3 26.245 31.865 36.238 64.800 64.740 74.285 67.258
(6.430) (11.322) (11.268) (9.010) (3.122) (12.277) (7.076)
4 15.530 23.290 37.050 34.390 48.538 61.195 53.600
(4.887) (3.329) (5.194) (11.667) (5.658) (12.084) (25.290)

13.44. Both sets of residuals have a high outlier (observation #53); observation #52 is a low
outlier for fresh biomass. The other residuals look reasonably Normal.

[Figure: Normal quantile plots of the residuals for fresh biomass and for dry biomass.]

13.45. For each water level, there is highly significant evidence of variation in biomass
level (both fresh and dry) among plant species (Minitab output below). For each water level,
we have df = 12, 6 comparisons, and α = 0.05, so the Bonferroni critical value is
t** = 3.1527. (If we take into account that there are 7 water levels, so that overall we are
performing 6 × 7 = 42 comparisons, we should take t** = 4.2192.) The table below gives the
pooled standard deviations sp, the standard errors of each difference SE_D = sp√(1/4 + 1/4),
and the "minimum significant difference" MSD = t**·SE_D (two means are significantly
different if they differ by at least this amount). MSD1 uses t** = 3.1527, and MSD2 uses
t** = 4.2192. Rather than give a full listing of which differences are significant, we note
that plants 3 and 4 are not significantly different, nor are 1 and 3 (except for one or two
water levels). All other plant combinations are significantly different for at least three
water levels. For fresh biomass, plants 2 and 4 are different for all levels, and for dry
biomass, 1 and 4 differ for all levels.

                        Fresh biomass                          Dry biomass
Water level     sp      SE_D      MSD1      MSD2       sp      SE_D     MSD1     MSD2
1 20.0236 14.1588 44.6382 50.3764 7.6028 5.3760 16.9487 19.1274
2 31.4699 22.2526 70.1552 79.1735 7.6395 5.4019 17.0305 19.2197
3 19.6482 13.8934 43.8012 49.4318 9.5103 6.7248 21.2010 23.9263
4 43.7929 30.9663 97.6265 110.1762 15.5751 11.0133 34.7213 39.1846
5 38.2275 27.0310 85.2197 96.1746 12.5034 8.8412 27.8734 31.4565
6 59.3497 41.9666 132.3068 149.3147 17.4280 12.3235 38.8518 43.8462
7 66.7111 47.1719 148.7174 167.8348 23.7824 16.8167 53.0176 59.8329

Minitab output: One-way ANOVA for fresh biomass — water level 1


Source DF SS MS F p
Species 3 19107 6369 15.88 0.000
Error 12 4811 401
Total 15 23918
One-way ANOVA for fresh biomass — water level 2
Source DF SS MS F p
Species 3 35100 11700 11.81 0.001
Error 12 11884 990
Total 15 46984
One-way ANOVA for fresh biomass — water level 3
Source DF SS MS F p
Species 3 71898 23966 62.08 0.000
Error 12 4633 386
Total 15 76531
One-way ANOVA for fresh biomass — water level 4
Source DF SS MS F p
Species 3 62337 20779 10.83 0.001
Error 12 23014 1918
Total 15 85351
One-way ANOVA for fresh biomass — water level 5
Source DF SS MS F p
Species 3 99184 33061 22.62 0.000
Error 12 17536 1461
Total 15 116720
One-way ANOVA for fresh biomass — water level 6
Source DF SS MS F p
Species 3 86628 28876 8.20 0.003
Error 12 42269 3522
Total 15 128897

One-way ANOVA for fresh biomass — water level 7


Source DF SS MS F p
Species 3 144376 48125 10.81 0.001
Error 12 53404 4450
Total 15 197780
One-way ANOVA for dry biomass — water level 1
Source DF SS MS F p
Species 3 1411.2 470.4 8.14 0.003
Error 12 693.6 57.8
Total 15 2104.8
One-way ANOVA for dry biomass — water level 2
Source DF SS MS F p
Species 3 4597.1 1532.4 26.26 0.000
Error 12 700.3 58.4
Total 15 5297.4
One-way ANOVA for dry biomass — water level 3
Source DF SS MS F p
Species 3 6127.2 2042.4 22.58 0.000
Error 12 1085.3 90.4
Total 15 7212.6
One-way ANOVA for dry biomass — water level 4
Source DF SS MS F p
Species 3 8634 2878 11.86 0.001
Error 12 2911 243
Total 15 11545
One-way ANOVA for dry biomass — water level 5
Source DF SS MS F p
Species 3 10026 3342 21.38 0.000
Error 12 1876 156
Total 15 11902
One-way ANOVA for dry biomass — water level 6
Source DF SS MS F p
Species 3 13460 4487 14.77 0.000
Error 12 3645 304
Total 15 17105
One-way ANOVA for dry biomass — water level 7
Source DF SS MS F p
Species 3 14687 4896 8.66 0.002
Error 12 6787 566
Total 15 21474

13.46. The F statistics for all eight ANOVAs are significant, and all eight regressions are
significant as well. Unlike the nitrogen level (Exercises 13.39 through 13.42), all of these
regressions have reasonably large values of R², and the scatterplots suggest that a straight
line is an appropriate model for the relationship.

Minitab output: One-way ANOVA for fresh biomass — plant species 1


Source DF SS MS F p
Water 6 145543 24257 19.76 0.000
Error 21 25774 1227
Total 27 171317
One-way ANOVA for fresh biomass — plant species 2
Source DF SS MS F p
Water 6 257083 42847 9.63 0.000
Error 21 93463 4451
Total 27 350546

One-way ANOVA for fresh biomass — plant species 3


Source DF SS MS F p
Water 6 80952 13492 17.77 0.000
Error 21 15948 759
Total 27 96901
One-way ANOVA for fresh biomass — plant species 4
Source DF SS MS F p
Water 6 68704 11451 10.75 0.000
Error 21 22365 1065
Total 27 91070
One-way ANOVA for dry biomass — plant species 1
Source DF SS MS F p
Water 6 27273 4545 30.44 0.000
Error 21 3136 149
Total 27 30408
One-way ANOVA for dry biomass — plant species 2
Source DF SS MS F p
Water 6 21802 3634 7.83 0.000
Error 21 9751 464
Total 27 31553
One-way ANOVA for dry biomass — plant species 3
Source DF SS MS F p
Water 6 9489.9 1581.6 18.82 0.000
Error 21 1764.6 84.0
Total 27 11254.5
One-way ANOVA for dry biomass — plant species 4
Source DF SS MS F p
Water 6 6478 1080 7.44 0.000
Error 21 3047 145
Total 27 9525
Regression of fresh biomass+plant species 1 on water
The regression equation is plant1 = 80.1 + 35.0 Water
Predictor Coef Stdev t-ratio p
Constant 80.13 15.38 5.21 0.000
Water 34.961 3.438 10.17 0.000
s = 36.39 R-sq = 79.9% R-sq(adj) = 79.1%
Regression of fresh biomass+plant species 2 on water
The regression equation is plant2 = 81.9 + 46.7 Water
Predictor Coef Stdev t-ratio p
Constant 81.94 26.97 3.04 0.005
Water 46.739 6.030 7.75 0.000
s = 63.82 R-sq = 69.8% R-sq(adj) = 68.6%
Regression of fresh biomass+plant species 3 on water
The regression equation is plant3 = 33.0 + 25.4 Water
Predictor Coef Stdev t-ratio p
Constant 33.02 12.98 2.55 0.017
Water 25.423 2.901 8.76 0.000
s = 30.70 R-sq = 74.7% R-sq(adj) = 73.7%

Regression of fresh biomass+plant species 4 on water


The regression equation is plant4 = 15.7 + 23.6 Water
Predictor Coef Stdev t-ratio p
Constant 15.69 13.98 1.12 0.272
Water 23.641 3.127 7.56 0.000
s = 33.09 R-sq = 68.7% R-sq(adj) = 67.5%
Regression of dry biomass+plant species 1 on water
The regression equation is plant1 = 29.0 + 15.0 Water
Predictor Coef Stdev t-ratio p
Constant 28.971 6.033 4.80 0.000
Water 14.973 1.349 11.10 0.000
s = 14.28 R-sq = 82.6% R-sq(adj) = 81.9%
Regression of dry biomass+plant species 2 on water
The regression equation is plant2 = 31.7 + 13.4 Water
Predictor Coef Stdev t-ratio p
Constant 31.707 8.905 3.56 0.001
Water 13.365 1.991 6.71 0.000
s = 21.07 R-sq = 63.4% R-sq(adj) = 62.0%
Regression of dry biomass+plant species 3 on water
The regression equation is plant3 = 18.4 + 8.44 Water
Predictor Coef Stdev t-ratio p
Constant 18.436 4.741 3.89 0.001
Water 8.442 1.060 7.96 0.000
s = 11.22 R-sq = 70.9% R-sq(adj) = 69.8%
Regression of dry biomass+plant species 4 on water
The regression equation is plant4 = 10.3 + 7.20 Water
Predictor Coef Stdev t-ratio p
Constant 10.298 5.057 2.04 0.052
Water 7.197 1.131 6.36 0.000
s = 11.97 R-sq = 60.9% R-sq(adj) = 59.4%

[Figure: eight scatterplots versus water level: fresh biomass for plants 1 through 4, and dry
biomass for plants 1 through 4.]

13.47. (a) With I = 2, J = 3, and N = 180, the numerator degrees of freedom are I − 1, J − 1,
and (I − 1)(J − 1), respectively, and the denominator degrees of freedom for all three tests
is DFE = N − IJ = 174:

    Source                       df
    Gender                   1 and 174
    Floral characteristic    2 and 174
    Interaction              2 and 174

[Figure: means plot of leaf damage (percent) at floral characteristic levels 1, 2, and 3,
with separate lines for males and females.]
(b) Damage to males was higher for all characteristics. For males, damage was highest under
characteristic level 3, while for females, the highest damage occurred at level 2. (c) Three of
the standard deviations are at least half as large as the means. Because the response variable
(leaf damage) had to be nonnegative, this suggests that these distributions are right-skewed;
taking logarithms reduces the skewness.

13.48. The table and plot of the means below suggest that, within a given gender, students
who stay in the sciences have higher HSS grades than those who end up in the "Other" group.
Males have a slightly higher mean in the CS group, but females have the edge in the other
two. Normal quantile plots show no great deviations from Normality, apart from the
granularity of the grades (most evident among women in EO). In the ANOVA, both main effects
and the interaction are significant. Residual analysis (not shown) shows that the residuals
are left-skewed.

Minitab output: Two-way ANOVA for HSS on sex and major


Source DF SS MS F P
Sex 1 12.927 12.927 5.06 0.025
Maj 2 44.410 22.205 8.69 0.000
Sex*Maj 2 24.855 12.427 4.86 0.009
Error 228 582.923 2.557
Total 233 665.115

                        Major
Gender          CS        EO       Other
Male    n =     39        39        39
        x̄ =  8.6667    7.9231    7.4359
        s =  1.2842    2.0569    1.7136
Female  n =     39        39        39
        x̄ =  8.3846    9.2308    7.8205
        s =  1.6641    0.7057    1.8046

[Figure: means plot of mean HSS grade for the CS, EO, and Other majors, with separate lines
for females and males.]

[Figure: Normal quantile plots of HSS grades for men and women in each of the CS, EO, and
Other groups.]

13.49. The table and plot of the means suggest that females have higher HSE grades than
males. For a given gender, there is not too much difference among majors. Normal quantile
plots show no great deviations from Normality, apart from the granularity of the grades
(most evident among women in EO). In the ANOVA, only the effect of gender is significant.
Residual analysis (not shown) reveals some causes for concern; for example, the variance
does not appear to be constant.

Minitab output: Two-way ANOVA for HSE on sex and major


Source DF SS MS F P
Sex 1 105.338 105.338 50.32 0.000
Maj 2 5.880 2.940 1.40 0.248
Sex*Maj 2 5.573 2.786 1.33 0.266
Error 228 477.282 2.093
Total 233 594.073

                        Major
Gender          CS        EO       Other
Male    n =     39        39        39
        x̄ =  7.7949    7.4872    7.4103
        s =  1.5075    2.1505    1.5681
Female  n =     39        39        39
        x̄ =  8.8462    9.2564    8.6154
        s =  1.1364    0.7511    1.1611

[Figure: means plot of mean HSE grade for the CS, EO, and Other majors, with separate lines
for females and males.]

[Figure: Normal quantile plots of HSE grades for men and women in each of the CS, EO, and
Other groups.]

13.50. The table and plot of the means suggest that students who stay in the sciences have
higher mean GPAs than those who end up in the “Other” group. Both genders have similar
mean GPAs in the EO group, but in the other two groups, females perform better. Normal
quantile plots show no great deviations from Normality, apart from a few low outliers in
the two EO groups. In the ANOVA, sex and major are significant, while there is some (not
quite significant) evidence for the interaction.

Minitab output: Two-way ANOVA for GPA on sex and major


Source DF SS MS F P
Sex 1 3.1131 3.1131 7.31 0.007
Maj 2 26.7591 13.3795 31.42 0.000
Sex*Maj 2 2.3557 1.1779 2.77 0.065
Error 228 97.0986 0.4259
Total 233 129.3265
                        Major
Gender          CS        EO       Other
Male    n =     39        39        39
        x̄ =  2.7474    3.0964    2.0477
        s =  0.6840    0.5130    0.7304
Female  n =     39        39        39
        x̄ =  2.9792    3.0808    2.5236
        s =  0.5335    0.6481    0.7656

[Figure: means plot of mean GPA for the CS, EO, and Other majors, with separate lines for
females and males.]

[Figure: Normal quantile plots of GPA for men and women in each of the CS, EO, and Other
groups.]

13.51. The table and plot of the means below suggest that students who stay in the sciences
have higher mean SATV scores than those who end up in the “Other” group. Female CS
and EO students have higher scores than males in those majors, but males have the higher
mean in the Other group. Normal quantile plots suggest some right-skewness in the “Women
in CS” group and also some non-Normality in the tails of the “Women in EO” group. Other
groups look reasonably Normal. In the ANOVA table, only the effect of major is significant.

Minitab output: Two-way ANOVA for SATV on sex and major


Source DF SS MS F P
Sex 1 3824 3824 0.47 0.492
Maj 2 150723 75362 9.32 0.000
Sex*Maj 2 29321 14661 1.81 0.166
Error 228 1843979 8088
Total 233 2027848
                        Major
Gender           CS         EO        Other
Male    n =      39         39         39
        x̄ =  526.949    507.846    487.564
        s =  100.937     57.213    108.779
Female  n =      39         39         39
        x̄ =  543.385    538.205    465.026
        s =   77.654    102.209     82.184

[Figure: means plot of mean SATV score for the CS, EO, and Other majors, with separate lines
for males and females.]

[Figure: Normal quantile plots of SATV scores for men and women in each of the CS, EO, and
Other groups.]
Chapter 14 Solutions

14.1. If p = 0.5, then odds = p/(1 − p) = 0.5/0.5 = 1, or "1 to 1."

14.2. If odds = 3, then p = odds/(odds + 1) = 3/(3 + 1) = 3/4.
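Note: These two conversions are worth writing down once; a minimal sketch (Python; our
illustration, not part of the text):

def odds_from_p(p):
    return p / (1 - p)

def p_from_odds(odds):
    return odds / (odds + 1)

print(odds_from_p(0.5))  # 1.0, i.e., "1 to 1"
print(p_from_odds(3))    # 0.75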

14.3. We have p̂men = 63/110 ≈ 0.5727 and p̂women = 60/130 = 6/13 ≈ 0.4615. Therefore:

    oddsmen = (63/110)/(47/110) = 63/47 ≈ 1.3404, and
    oddswomen = (6/13)/(7/13) = 6/7 ≈ 0.8571, or "6 to 7"

Note: The odds can also be computed without first finding p̂; for example, 63 men preferred
Commercial A and 47 preferred Commercial B, so oddsmen = 63/47.

14.4. The odds for selecting Commercial B would be the reciprocal of the odds for
Commercial A: odds*men = 47/63 ≈ 0.7460 and odds*women = 7/6 ≈ 1.1667.
63 6

14.5. With oddsmen = 63/47 and oddswomen = 6/7, we have log(oddsmen) ≈ 0.2930 and
log(oddswomen) ≈ −0.1542.
Note: You may wish to remind students to use the natural logarithm, called "ln" by Excel and
most calculators. A student who mistakenly uses the common (base-10) logarithm instead of the
natural logarithm will get 0.1272 and −0.0669 as answers.

14.6. With odds*men = 47/63 and odds*women = 7/6, we have log(odds*men) ≈ −0.2930 and
log(odds*women) ≈ 0.1542.
Note: Because these odds were the reciprocals of those from Exercise 14.3, the log odds are
the opposites (negations) of those found in Exercise 14.5. A student who mistakenly uses the
common (base-10) logarithm instead of the natural logarithm will get −0.1272 and 0.0669 as
answers.

14.7. The model is y = log(odds) = β0 + β1x. If x = 1 for men and 0 for women, we need:

    log(pmen/(1 − pmen)) = β0 + β1   and   log(pwomen/(1 − pwomen)) = β0

We estimate b0 = log(oddswomen) ≈ −0.1542 and b1 = log(oddsmen) − b0 ≈ 0.4471, so the
regression equation is log(odds) = −0.1542 + 0.4471x.
If x = 0 for men and 1 for women, we estimate b0 = log(oddsmen) ≈ 0.2930 and
b1 = log(oddswomen) − b0 ≈ −0.4471, so the regression equation is
log(odds) = 0.2930 − 0.4471x.
The estimated odds ratio is either:

    e^0.4471 = oddsmen/oddswomen ≈ 1.5638 if x = 1 for men, or
    e^−0.4471 = oddswomen/oddsmen ≈ 0.6395 if x = 1 for women
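Note: With a single indicator variable, the fit can be computed directly from the counts; a
sketch (Python; our illustration, not part of the text) using the data from Exercise 14.3:

import math

# 63 of 110 men and 60 of 130 women preferred Commercial A
log_odds_men = math.log(63 / 47)
log_odds_women = math.log(60 / 70)

b0 = log_odds_women          # with x = 1 for men and 0 for women
b1 = log_odds_men - b0
print(b0, b1, math.exp(b1))  # -0.1542, 0.4471, odds ratio 1.5638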


14.8. Because of the relationships between the (log) odds for selecting Commercial A and the
(log) odds for selecting Commercial B, noted in the solutions to Exercises 14.4 and 14.6,
these coefficients are the opposites (negations) of, and the odds ratios are reciprocals of,
those found in the solution to the previous exercise.
The model is y = log(odds) = β0 + β1x. If x = 1 for men and 0 for women, we need:

    log(pmen/(1 − pmen)) = β0 + β1   and   log(pwomen/(1 − pwomen)) = β0

We estimate b0 = log(odds*women) ≈ 0.1542 and b1 = log(odds*men) − b0 ≈ −0.4471, so the
regression equation is log(odds) = 0.1542 − 0.4471x.
If x = 0 for men and 1 for women, we estimate b0 = log(odds*men) ≈ −0.2930 and
b1 = log(odds*women) − b0 ≈ 0.4471, so the regression equation is
log(odds) = −0.2930 + 0.4471x.
The estimated odds ratio is either:

    e^−0.4471 = odds*men/odds*women ≈ 0.6395 if x = 1 for men, or
    e^0.4471 = odds*women/odds*men ≈ 1.5638 if x = 1 for women

14.9. (a) The appropriate test would be a chi-square test with df = 5. (b) The logistic
regression model has no error term. (c) H0 should refer to β1 (the population slope) rather
than b1 (the estimated slope). (d) The interpretation of coefficients is affected by correlations
among explanatory variables.

14.10. (a) β1 = 3 means that log(odds) increases by 3 when x increases by 1. This means the
odds increase by a factor of e³ ≈ 20. (b) β0 is the log-odds of an event. (c) The odds of an
event is the ratio of the event's probability and its complement.
Note: For part (a), it is difficult to make a simple statement about the effect on the
probability when odds increases by a factor of 20. With a little algebra, we can start with
the formula p = odds/(odds + 1) and find that the new probability is
p* = 20·odds/(20·odds + 1) = 20/(19 + 1/p).

14.11. In each case, we compute log(odds) = −3.1658 + 1.3083x and odds = e^log(odds), where
x = LOpening:

        x = LOpening   log(odds)    odds
    (a)     3.219        1.0456    2.8452
    (b)     3.807        1.8149    6.1405
    (c)     4.174        2.2950    9.9249

14.12. Use the formula given in Exercise 14.2: For each estimated odds value, the estimated
probability is p̂ = odds/(odds + 1). (a) 2.8452/3.8452 ≈ 0.7399. (b) 6.1405/7.1405 ≈ 0.8600.
(c) 9.9249/10.9249 ≈ 0.9085.
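Note: Chaining the two steps—log odds to odds to probability—gives a one-function check
(Python; our illustration, not part of the text):

import math

def predicted_p(x, b0=-3.1658, b1=1.3083):
    odds = math.exp(b0 + b1 * x)  # odds = e^(log odds)
    return odds / (odds + 1)

for x in (3.219, 3.807, 4.174):
    print(round(predicted_p(x), 4))  # 0.7399, 0.8600, 0.9085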

14.13. (a) For each column, divide the "yes" entry by the total to find p̂. (b) For each p̂,
compute odds = p̂/(1 − p̂). (c) Finally, take log(odds).

    p̂low = 88/1169 ≈ 0.0753     oddslow ≈ 0.0814     log(oddslow) ≈ −2.5083
    p̂high = 112/1246 ≈ 0.0899   oddshigh ≈ 0.0988    log(oddshigh) ≈ −2.3150

14.14. (a) p̂1 = 108/142 ≈ 0.7606 for exclusive-territory firms. (b) p̂2 = 15/28 ≈ 0.5357 for
other firms. (c) odds1 = p̂1/(1 − p̂1) ≈ 3.1765 and odds2 = p̂2/(1 − p̂2) ≈ 1.1538.
(d) log(odds1) ≈ 1.1558 and log(odds2) ≈ 0.1431. (Be sure to use the natural logarithm for
this computation.)

14.15. (a) b0 = log(oddslow) ≈ −2.5083 and b1 = log(oddshigh) − log(oddslow) ≈ 0.1933.
(b) The fitted model is log(odds) = −2.5083 + 0.1933x. (c) The odds ratio is
oddshigh/oddslow = e^b1 ≈ 1.2132 (or 0.0988/0.0814 ≈ 1.2132). (d) The relative risk from
Example 9.7 was 1.19—very close to this odds ratio.

14.16. (a) b0 = log(odds2) ≈ 0.1431 and b1 = log(odds1) − log(odds2) ≈ 1.0127. (b) The fitted
model is log(odds) = 0.1431 + 1.0127x. (c) The odds ratio is odds1/odds2 = e^b1 ≈ 2.7529.

14.17. Shown below is Minitab output. (a) The slope is significantly different from 0
(z = 2.37, P = 0.018), but the constant is not (z = 0.38, P = 0.706). (b) With b1 = 1.0127,
SEb1 = 0.4269, and z* = 1.96, the 95% confidence interval for β1 is 0.176 to 1.849.
(c) Exponentiating gives the interval e^0.176 ≈ 1.19 to e^1.849 ≈ 6.36.

Minitab output
Predictor Coef SE Coef Z P Ratio Lower Upper
Constant 0.143101 0.378932 0.38 0.706
Exclusive
Yes 1.01267 0.426920 2.37 0.018 2.75 1.19 6.36
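Note: The interval arithmetic in parts (b) and (c) is a two-liner (Python; our illustration,
not part of the text):

import math

b1, se = 1.01267, 0.426920         # from the Minitab output above
lo, hi = b1 - 1.96 * se, b1 + 1.96 * se
print(lo, hi)                      # about 0.176 to 1.849
print(math.exp(lo), math.exp(hi))  # odds-ratio interval, about 1.19 to 6.36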

14.18. Recall that, by properties of exponents, e^a/e^b = e^(a − b). Therefore:

    odds(x+1)/odds(x) = [e^−11.0391 × e^3.1709(x+1)] / [e^−11.0391 × e^3.1709x]
                      = e^[3.1709(x+1) − 3.1709x] = e^[3.1709(x + 1 − x)] = e^3.1709

14.19. With b1 = 3.1088 and SEb1 = 0.3879, the 99% confidence interval is
b1 ± 2.576·SEb1 = b1 ± 0.9992, or 2.1096 to 4.1080.

14.20. To find the confidence interval for the odds ratio, we first make a confidence
interval for the slope b1 and then transform (exponentiate) it:
b1 ± z*·SEb1 = 3.1088 ± (1.96)(0.3879) ≈ 2.3485 to 3.8691, so the odds ratio interval is
e^2.3485 ≈ 10.470 to e^3.8691 ≈ 47.898. Up to rounding error, this agrees with the software
output.

14.21. (a) z = 3.1088/0.3879 ≈ 8.01. (b) z² ≈ 64.23, which agrees with the value of X² given
by SPSS and SAS. (c) The sketches are below. For both the Normal and chi-square
distributions, the test statistics are quite extreme, consistent with the reported P-value.

[Figure: sketches of the standard Normal distribution with z ≈ 8.01 marked, and the
chi-square distribution with X² ≈ 64.23 marked.]

14.22. Shown in the table below are the coefficients for the full model (from
Example 14.11), as well as the three two-variable models and the three one-variable models
(one of which appeared in Example 14.6). P-values for individual coefficients are given in
parentheses below the coefficients. The X² statistics for the one-variable models are not
shown; most software will not produce this, because the P-value for the coefficient measures
the overall significance of the model.

Coefficient of:
Constant LOpening Theaters Opinion X2 df P
−2.0132 2.1467 −0.0010 −0.1095 12.716 3 0.0053
(0.0277) (0.2748) (0.8083)
−2.7164 2.1319 −0.0010 12.656 2 0.0018
(0.0286) (0.2805)
−2.7154 1.3091 −0.0710 11.432 2 0.0033
(0.0066) (0.8672)
−2.1815 0.00096 −0.0065 5.442 2 0.0658
(0.0332) (0.9858)
−3.1658 1.3083
(0.0070)
−2.2212 0.00096
(0.0329)
−0.1230 0.0822
(0.8030)

14.23. An odds ratio greater than 1 means a higher probability of a low tip. Therefore: The
odds favor a low tip from senior adults, those dining on Sunday, those who speak English
as a second language, and French-speaking Canadians. Diners who drink alcohol and lone
males are less likely to leave low tips. For example, for a senior adult, the odds of leaving a
low tip were 1.099 (for a probability of 0.5236).

14.24. (a) For each explanatory variable, we test H0: βi = 0 versus Ha: βi ≠ 0.


(b) Under the null hypotheses, the X 2 statistic has a chi-square distribution with df = 1.
Therefore, we reject H0 at the 5% level if X 2 > 3.84. We do not reject H0 for “Men’s
magazines” but have very strong evidence that all other coefficients (as well as the
constant) are not zero. (c) The probability that the model’s clothing is sexual is higher
for magazines targeted at young adults (as the problem states), when the model is female,
and for magazines aimed at men or at both men and women. (d) The fitted model is
log(odds) = −2.32 + 0.50x1 + 1.31x2 − 0.05x3 + 0.45x4 .

14.25. (a) For men’s magazines, the odds ratio confidence interval includes 1. This indicates
that this explanatory variable has no effect on the probability that a model’s clothing is not
sexual, which is consistent with our failure to reject H0 for men’s magazines in the previous
exercise. For all other explanatory variables, the odds ratio interval does not include 1,
equivalent to the significant evidence against H0 for those variables. (b) The odds that the
model’s clothing is not sexual are 1.27 to 2.16 times higher for magazines targeted at mature
adults, 2.74 to 5.01 times higher when the model is male, and 1.11 to 2.23 times higher
for magazines aimed at women. (These statements can also be made in terms of the odds
that the model’s clothing is sexual; for example, those odds are 1.27 to 2.16 times higher

for magazines targeted at young adults, and so forth.) (c) The summary might note that it
is easier to interpret the odds ratio rather than the regression coefficients because of the
difficulty of thinking in terms of a log-odds scale.

14.26. (a) p̂1 = 463/1000 = 0.463. (b) odds1 = p̂1/(1 − p̂1) ≈ 0.8622.
(c) p̂2 = 537/1000 = 0.537. (d) odds2 = p̂2/(1 − p̂2) ≈ 1.1598. (e) The odds in parts (b) and
(d) are reciprocals—their product is 1. (Likewise, the probabilities in (a) and (c) are
complements—their sum is 1.)

14.27. (a) p̂hi = 73/91 ≈ 0.8022 and oddshi = p̂hi/(1 − p̂hi) ≈ 4.05.
(b) p̂non = 75/109 ≈ 0.6881 and oddsnon = p̂non/(1 − p̂non) ≈ 2.2059. (c) The odds ratio is
oddshi/oddsnon ≈ 1.8385. The odds of a high-tech company offering stock options are about
1.84 times those for a non-high-tech firm.

14.28. (a) log(oddshi) ≈ 1.4001 and log(oddsnon) ≈ 0.7911. (b) log(oddsnon) = β0 and
log(oddshi) = β0 + β1, so we find the estimates of β0 and β1 from the observed log-odds:
b0 = log(oddsnon) ≈ 0.7911 and b1 = log(oddshi) − log(oddsnon) ≈ 0.6090.
(c) e^b1 = e^0.6090 ≈ 1.8385, as we found in 14.27(c).

14.29. (a) With b1 = 0.6090 and SEb1 = 0.3347, the 95% confidence interval is
b1 ± 1.96·SEb1 = b1 ± 0.6560, or −0.0470 to 1.2650. (b) Exponentiating the confidence limits
gives the interval 0.9540 to 3.5430. (c) Because the confidence interval for β1 contains 0,
or equivalently because 1 is in the interval for the odds ratio, we could not reject
H0: β1 = 0 at the 5% level. There does not appear to be a significant difference between the
odds of stock options for high-tech and other firms.
Note: Software reports z = 1.820 and a P-value of 0.0688, which are nearly identical to the
results for a two-proportion z test with the same counts (z = −1.832 and P = 0.0669)—see the
solution to Exercise 8.67. For large samples, these two tests should give similar results.

Minitab output: Logistic regression (high-tech versus non-high-tech companies)


Odds 95% CI
Predictor Coef SE Coef Z P Ratio Lower Upper
Constant 0.791128 0.206749 3.83 0.000
HT
Yes 0.608960 0.334663 1.82 0.069 1.84 0.95 3.54

14.30. Minitab output is below. All proportions, odds, odds ratios, and parameter estimates
(b0 and b1) are unchanged. Because the standard error is smaller, the 95% confidence interval
is narrower: b1 ± 1.96·SEb1 = b1 ± 0.4637, or 0.1452 to 1.0727. The odds-ratio interval is
therefore 1.1563 to 2.9233. Because 0 is not in the confidence interval for β1 and 1 is not
in the odds-ratio interval, we have significant evidence of a difference in the odds between
the two types of companies.
Note: For testing H0 : β1 = 0, software reports z = 2.573 and P = 0.0101. For
comparison, the test of p1 = p2 yields z = 2.591 and P = 0.0096.

Minitab output: Logistic regression with sample sizes doubled


Odds 95% CI
Predictor Coef SE Coef Z P Ratio Lower Upper
Constant 0.791128 0.146194 5.41 0.000
HT
Yes 0.608960 0.236642 2.57 0.010 1.84 1.16 2.92
14.31. (a) For the high blood pressure group, p̂hi = 55/3338 ≈ 0.01648, giving
oddshi = p̂hi/(1 − p̂hi) ≈ 0.01675, or about 1 to 60. (If students give odds in the form
"a to b," their choices of a and b might be different.) (b) For the low blood pressure group,
p̂lo = 21/2676 ≈ 0.00785, giving oddslo = p̂lo/(1 − p̂lo) ≈ 0.00791, or about 1 to 126
(or 125). (c) The odds ratio is oddshi/oddslo ≈ 2.1181. Odds of death from cardiovascular
disease are about 2.1 times greater in the high blood pressure group.

14.32. (a) For female references, p̂w = 48/60 = 0.8, giving oddsw = p̂w/(1 − p̂w) = 4
("4 to 1"). (b) For male references, p̂m = 52/132 ≈ 0.39, giving
oddsm = p̂m/(1 − p̂m) = 0.65 ("13 to 20"). (c) The odds ratio is oddsw/oddsm ≈ 6.1538. (The
odds of a juvenile reference are more than six times greater for females.)
 2
.
14.33. (a) The interval is b1 ± 1.96SEb1 , or 0.2452 to 1.2558. (b) X 2 = 0.7505
0.2578 = 8.47. This
gives a P-value between 0.0025 and 0.005. (c) We have strong evidence that there is a real
(significant) difference in risk between the two groups.
 2
.
14.34. (a) The interval is b1 ± 1.96SEb1 , or 1.0946 to 2.5396. (b) X 2 = 1.8171
0.3686 = 24.3. This
gives P < 0.0005. (c) We have strong evidence that there is a real (significant) difference in
juvenile references between male and female references.
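Note: Both Wald statistics and their P-values can be checked the same way (Python with scipy;
our illustration, not part of the text):

from scipy import stats

for b1, se in [(0.7505, 0.2578), (1.8171, 0.3686)]:
    x2 = (b1 / se) ** 2              # Wald chi-square statistic
    print(x2, stats.chi2.sf(x2, 1))  # 8.47 and 24.3, with their P-values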

14.35. (a) The estimated odds ratio is e^b1 ≈ 2.1181 (as we found in Exercise 14.31).
Exponentiating the interval for β1 in Exercise 14.33(a) gives the odds-ratio interval from
about 1.28 to 3.51. (b) We are 95% confident that the odds of death from cardiovascular
disease are about 1.3 to 3.5 times greater in the high blood pressure group.

Minitab output: Logistic regression on blood pressure


Odds 95% CI
Predictor Coef SE Coef Z P Ratio Lower Upper
Constant -4.83968 0.219078 -22.09 0.000
BP
hi 0.750498 0.257840 2.91 0.004 2.12 1.28 3.51
14.36. (a) The estimated odds ratio is e^b1 ≈ 6.1538 (as we found in Exercise 14.32).
Exponentiating the interval for β1 in Exercise 14.34(a) gives the odds-ratio interval from
about 2.99 to 12.67. (b) We are 95% confident that the odds of a juvenile reference are about
3 to 13 times greater among females.

Minitab output: Logistic regression on gender


Odds 95% CI
Predictor Coef SE Coef Z P Ratio Lower Upper
Constant -0.430783 0.178131 -2.42 0.016
gender
Female 1.81708 0.368641 4.93 0.000 6.15 2.99 12.67

 
14.37. (a) The model is log(pi/(1 − pi)) = β0 + β1xi, where xi = 1 if the ith person is over
40, and 0 if he/she is under 40. (b) pi is the probability that the ith person is terminated;
this model assumes that the probability of termination depends on age (over/under 40). In
this case, that seems to have been the case, but we might expect that other factors were
taken into consideration. (c) The estimated odds ratio is e^b1 ≈ 3.859. (Of course, we can
also get this from (41/765)/(7/504).) We can also find, for example, a 95% confidence
interval for β1: b1 ± 1.96·SEb1 = 0.5409 to 2.1599. Exponentiating this translates to a 95%
confidence interval for the odds ratio: 1.7176 to 8.6701. The odds of being terminated are
1.7 to 8.7 times greater for those over 40. (d) Use a multiple logistic regression model, for
example, log(pi/(1 − pi)) = β0 + β1x1,i + β2x2,i.

14.38. (a) Positive coefficients indicate increasing odds (and increasing probability), and
negative coefficients indicate decreasing odds. Therefore, the traits that make an individual
more likely to use the Internet are those listed in the rightmost column of the table below.
(The increase for having children is not significant.) (b) The odds ratios are given in the
table below; for example, e^−0.063 ≈ 0.9389 for Age. (c) The estimated log(odds) for this
individual would be

    −0.063(23) + 0.013(50) + 0.367(1) − 0.222(1) + 1.080(1) + 0.285(0) + 0.049(0) = 0.426

so the estimated odds would be e^0.426 ≈ 1.5311. (d) The estimated probability is
p = odds/(odds + 1) ≈ 0.6049.

b odds ratio Higher probability of Internet use


Age −0.063 0.9389 younger
Income 0.013 1.0131 higher income
Location 0.367 1.4434 urban location
Sex −0.222 0.8009 female
Education 1.080 2.9447 some post-secondary education
Language 0.285 1.3298 speak English
Children 0.049 1.0502 have children
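Note: The prediction in parts (c) and (d) is just a dot product followed by the
odds-to-probability conversion; a sketch (Python; our illustration — no intercept is
reported, so none is used):

import math

coefs = [-0.063, 0.013, 0.367, -0.222, 1.080, 0.285, 0.049]
x = [23, 50, 1, 1, 1, 0, 0]  # age, income, location, sex, education, language, children

log_odds = sum(b * xi for b, xi in zip(coefs, x))  # no intercept reported
odds = math.exp(log_odds)
print(log_odds, odds, odds / (odds + 1))  # 0.426, about 1.5311, about 0.6049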

14.39. It is difficult to find the needed probabilities from the numbers as given; this is
made easier if we first convert the given information into a two-way table:

                 Eats fruit?
    Active?     Yes     No    Total
    Yes         169    494     663
    No           68    403     471
    Total       237    897    1134

The proportions meeting the activity guidelines are p̂fruit = 169/237 ≈ 0.7131 and
p̂no = 494/897 ≈ 0.5507, so oddsfruit ≈ 2.4853 and oddsno ≈ 1.2258. Then
log(oddsfruit) ≈ 0.9104 and log(oddsno) ≈ 0.2036, so b0 ≈ 0.2036, b1 ≈ 0.7068, and the model
is log(odds) = 0.2036 + 0.7068x. Software reports SEb1 ≈ 0.1585 and z ≈ 4.46 for testing
H0: β1 = 0. A 95% confidence interval for the odds ratio is 1.49 to 2.77.
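Note: The same fit can be obtained from the 2×2 counts with grouped-binomial logistic
regression; a sketch (Python with statsmodels; our illustration, not part of the text):

import numpy as np
import statsmodels.api as sm

# Rows: x = 1 (eats fruit) and x = 0 (does not); response counts are
# [meets guidelines, does not] for each row of the two-way table.
X = sm.add_constant(np.array([1.0, 0.0]))
y = np.array([[169, 68], [494, 403]])

fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
print(fit.params)  # roughly (0.2036, 0.7068)
print(fit.bse)     # SE of b1 roughly 0.1585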

14.40. (a) For females, p̂f = 708/1294 ≈ 0.5471. For males, p̂m = 788/1862 ≈ 0.4232. (b) The
odds for females and males are

    oddsf = p̂f/(1 − p̂f) ≈ 1.2082   and   oddsm = p̂m/(1 − p̂m) ≈ 0.7337

so the odds ratio is 1.2082/0.7337 ≈ 1.6467. (c) The model is log(odds) = β0 + β1x, with
x = 0 for male and x = 1 for female. (These values for x make the slope positive, because the
odds are higher for females.) (d) e^b1 ≈ 1.6467, as we found in part (b). (e) The 95%
confidence interval for β1 is b1 ± 1.96·SEb1 = 0.3559 to 0.6417. Exponentiating gives a 95%
confidence interval for the odds ratio: 1.4275 to 1.8997. Female odds of reducing spending
are 1.4 to 1.9 times those of males.

14.41. For each group, the probability, odds, and log(odds) of being overweight are

    p̂no = 65080/238215 ≈ 0.2732    oddsno = p̂no/(1 − p̂no) ≈ 0.3759    log(oddsno) ≈ −0.9785
    p̂FF = 83143/291152 ≈ 0.2856    oddsFF = p̂FF/(1 − p̂FF) ≈ 0.3997    log(oddsFF) ≈ −0.9170

With x = 0 for no fast food and x = 1 for fast food, the logistic regression equation is
log(odds) = −0.9785 + 0.0614x. Software reports SEb1 ≈ 0.006163, and for testing H0: β1 = 0
we have z ≈ 9.97, leaving little doubt that the slope is not 0. A 95% confidence interval for
the odds ratio is 1.0506 to 1.0763; the odds of being overweight for students at schools
close to fast-food restaurants are about 1.05 to 1.08 times greater than for students at
schools that are not close to fast food.

14.42. (a) The researchers were adjusting for violations of independence: The samples could
have included multiple students from the same school. (b) All of those other variables
could have a connection to being overweight. If the researchers had not controlled for
these variables, then their results might have been weakened (made less significant) if, for
example, they had a slightly higher number of female students from rural schools, or a small
number of non-exercising males in urban schools.

14.43. Portions of SAS and GLMStat output are given below.
(a) The X² statistic for testing this hypothesis is 33.65 (df = 3), which has
P = 0.0001. We conclude that at least one coefficient is not 0. (b) The fitted model is
log(odds) = −6.053 + 0.3710 HSM + 0.2489 HSS + 0.03605 HSE. The standard errors of the
three coefficients are 0.1302, 0.1275, and 0.1253, giving respective 95% confidence intervals
0.1158 to 0.6262, −0.0010 to 0.4988, and −0.2095 to 0.2816. (c) Only the coefficient of
HSM is significantly different from 0, though HSS may also be useful.
Note: In the multiple regression case study of Chapter 11, HSM was also the only
significant explanatory variable among high school grades, and HSS was not even close to
significant. See Figure 11.5 on page 603 of the text.

SAS output
                        Intercept
            Intercept   and
Criterion        Only   Covariates   Chi-Square for Covariates
-2 LOG L      295.340      261.691   33.648 with 3 DF (p=0.0001)

Analysis of Maximum Likelihood Estimates

Parameter Standard Wald Pr > Standardized


Variable DF Estimate Error Chi-Square Chi-Square Estimate

INTERCPT 1 -6.0528 1.1562 27.4050 0.0001 .


HSM 1 0.3710 0.1302 8.1155 0.0044 0.335169
HSS 1 0.2489 0.1275 3.8100 0.0509 0.233265
HSE 1 0.0361 0.1253 0.0828 0.7736 0.029971
GLMStat output
estimate se(est) z ratio Prob>|z|
1 Constant -6.053 1.156 -5.236 <0.0001
2 HSM 0.3710 0.1302 2.849 0.0044
3 HSS 0.2489 0.1275 1.952 0.0509
4 HSE 3.605e-2 0.1253 0.2877 0.7736

14.44. Portions of SAS and GLMStat output are given below. (a) The X² statistic for testing
this hypothesis is 14.2 (df = 2), which has P = 0.0008. We conclude that at least one
coefficient is not 0. (b) The model is log(odds) = −4.543+0.003690 SATM +0.003527 SATV.
The standard errors of the two coefficients are 0.001913 and 0.001751, giving respective
95% confidence intervals −0.000059 to 0.007439, and 0.000095 to 0.006959. (The first
coefficient has a P-value of 0.0537 and the second has P = 0.0440.) (c) We (barely) cannot
reject βSATM = 0—though because 0 is just in the confidence interval, we are reluctant to
discard SATM. Meanwhile, we conclude that βSATV ≠ 0.
Note: By contrast, with multiple regression of GPA on SAT scores, we found SATM
useful but not SATV. See Figure 11.8 on page 607 of the text.

SAS output
                        Intercept
            Intercept   and
Criterion        Only   Covariates   Chi-Square for Covariates
-2 LOG L      295.340      281.119   14.220 with 2 DF (p=0.0008)
Analysis of Maximum Likelihood Estimates
Parameter Standard Wald Pr > Standardized
Variable DF Estimate Error Chi-Square Chi-Square Estimate
INTERCPT 1 -4.5429 1.1618 15.2909 0.0001 .
SATM 1 0.00369 0.00191 3.7183 0.0538 0.175778
SATV 1 0.00353 0.00175 4.0535 0.0441 0.180087
GLMStat output
estimate se(est) z ratio Prob>|z|
1 Constant -4.543 1.161 -3.915 <0.0001
2 SATM 3.690e-3 1.913e-3 1.929 0.0537
3 SATV 3.527e-3 1.751e-3 2.014 0.0440

14.45. The coefficients and standard errors for the fitted model are given below. Note
that the tests requested in parts (a) and (b) are not available with all software packages.
(a) The X² statistic for testing this hypothesis is given by SAS as 19.2256 (df = 3); because
P = 0.0002, we reject H0 and conclude that high school grades add a significant amount
to the model with SAT scores. (b) The X² statistic for testing this hypothesis is 3.4635
(df = 2); because P = 0.1770, we cannot reject H0; SAT scores do not add significantly to
the model with high school grades. (c) For modeling the odds of HIGPA, high school grades
(specifically HSM, and to a lesser extent HSS) are useful, while SAT scores are not.

SAS output
Analysis of Maximum Likelihood Estimates


Parameter Standard Wald Pr > Standardized
Variable DF Estimate Error Chi-Square Chi-Square Estimate
INTERCPT 1 -7.3732 1.4768 24.9257 0.0001 .
HSM 1 0.3427 0.1419 5.8344 0.0157 0.309668
HSS 1 0.2249 0.1286 3.0548 0.0805 0.210704
HSE 1 0.0190 0.1289 0.0217 0.8829 0.015784
SATM 1 0.000717 0.00220 0.1059 0.7448 0.034134
SATV 1 0.00289 0.00191 2.2796 0.1311 0.147566
Linear Hypotheses Testing
Wald Pr >
Label Chi-Square DF Chi-Square
HS 19.2256 3 0.0002
SAT 3.4635 2 0.1770
GLMStat output
estimate se(est) z ratio Prob>|z|
1 Constant -7.373 1.477 -4.994 <0.0001
2 SATM 7.166e-4 2.201e-3 0.3255 0.7448
3 SATV 2.890e-3 1.914e-3 1.510 0.1311
4 HSM 0.3427 0.1419 2.416 0.0157
5 HSS 0.2249 0.1286 1.748 0.0805
6 HSE 1.899e-2 0.1289 0.1473 0.8829

14.46. (a) The fitted model is log(odds) = −0.6124 + 0.0609 Gender; the coefficient
of gender is not significantly different from 0 (z = 0.21, P = 0.8331). (b) Now,
log(odds) = −5.214 + 0.3028 Gender + 0.004191 SATM + 0.003447 SATV. In this model,
gender is still not significant (P = 0.3296). (c) Gender is not useful for modeling the odds
of HIGPA.
GLMStat output: Gender only
estimate se(est) z ratio Prob>|z|
1 Constant -0.6124 0.4156 -1.474 0.1406
2 Gender 6.087e-2 0.2889 0.2107 0.8331
Gender and SAT scores
estimate se(est) z ratio Prob>|z|
1 Constant -5.214 1.362 -3.828 0.0001
2 Gender 0.3028 0.3105 0.9750 0.3296
3 SATM 4.191e-3 1.987e-3 2.109 0.0349
4 SATV 3.447e-3 1.760e-3 1.958 0.0502

14.47. The models reported below are for the odds of death, as requested in the instructions. If
a student models odds of survival, or codes the indicator variables for hospital and condition
differently, his or her answers will be slightly different from these (but the conclusions
should be the same). (a) The fitted model is log(odds) = −3.892 + 0.4157 Hospital,
using 1 for Hospital A and 0 for Hospital B. With b1 ≈ 0.4157 and SEb1 ≈ 0.2831,
we find that z ≈ 1.47 or X² = 2.16 (P = 0.1420), so we do not have evidence to
suggest that β1 is not 0. A 95% confidence interval for β1 is −0.1392 to 0.9706 (this
interval includes 0). We estimate the odds ratio to be e^b1 ≈ 1.515, with confidence
interval 0.87 to 2.64 (this includes 1, since β1 might be 0). (b) The fitted model is
log(odds) = −3.109 − 0.1320 Hospital − 1.266 Condition; as before, use 1 for Hospital A
and 0 for Hospital B, 1 for good condition and 0 for poor. The estimated odds ratio is
e^b1 ≈ 0.8764, with confidence interval 0.48 to 1.60. (c) In neither case is the effect of
Hospital significant. However, we can see the effect of Simpson’s paradox in the coefficient
of Hospital, or equivalently in the odds ratio. In the model with Hospital alone, this
coefficient was positive and the odds ratio was greater than 1, meaning Hospital A patients
have higher odds of death. When condition is added to the model, this coefficient is negative
and the odds ratio is less than 1, meaning Hospital A patients have lower odds of death.
GLMStat output: Hospital only
estimate se(est) z ratio Prob>|z|
1 Constant -3.892 0.2525 -15.41 <0.0001
2 Hosp 0.4157 0.2831 -1.469 0.1420

odds ratio lower 95% ci upper 95% ci


1 Constant 2.041e-2 1.244e-2 3.348e-2
2 Hosp 1.515 0.8701 2.639
Hospital and condition
estimate se(est) z ratio Prob>|z|
1 Constant -3.109 0.2959 -10.51 <0.0001
2 Hosp -0.1320 0.3078 -0.4288 0.6681
3 Cond -1.266 0.3218 -3.935 <0.0001

odds ratio lower 95% ci upper 95% ci


1 Constant 4.463e-2 2.499e-2 7.971e-2
2 Hosp 0.8764 0.4794 1.602
3 Cond 0.2820 0.1501 0.5298
Chapter 15 Solutions

15.1. The rankings are shown below. Group A ranks are 1, 2, 4, 6, and 8; Group B ranks
are 3, 5, 7, 9, and 10.

Group   Rooms   Rank
  A       30      1
  A       68      2
  B      240      3
  A      243      4
  B      329      5
  A      448      6
  B      540      7
  A      552      8
  B      560      9
  B      780     10

15.2. The list of ranks is not shown because it is nearly identical to the one shown in the
previous solution; the only change needed is to change 780 to 4003 in the last line. The
ranks assigned to each group are exactly the same.

15.3. The null hypothesis is H0 : no difference in distribution of number of rooms. The


alternative might be two-sided (“there is a difference”) or one-sided if we had a
prior suspicion that one group had more rooms than the other. The test statistic is
W = 1 + 2 + 4 + 6 + 8 = 21.

15.4. Changing the data does not change the hypotheses, so they are the same as in the
previous solution. Additionally, because the assigned ranks did not change, the test statistic
is still W = 21.

15.5. Under the null hypothesis,

µW = (5)(11)/2 = 27.5  and  σW = √((5)(5)(11)/12) ≈ 4.7871

We found W = 21, so z = (21 − 27.5)/4.7871 ≈ −1.36, for which the two-sided P-value is
2P(Z ≤ −1.36) = 0.1738. With the continuity correction, we find z = (21.5 − 27.5)/4.787 ≈ −1.25,
which gives P = 2P(Z ≤ −1.25) = 0.2112. The Minitab output below gives
a similar P-value to that found with the continuity correction; the difference is due to the
rounding of z. Regardless of the P-value used, we do not reject H0.

Note: If a one-sided alternative was specified in Exercise 15.3, the P-value would be half
as big: P ≈ 0.0869, or 0.1056 with the continuity correction.

In the Minitab output, the medians are referred to as ETA1 and ETA2 ("eta" is the Greek
letter η). Minitab also reports an estimate of −228.0 for the difference η1 − η2; note that this
is not the same as the difference between the two sample medians (243 − 540 = −297). This
estimate, called the Hodges-Lehmann estimate, is not discussed in the text and has been
removed from the Minitab outputs accompanying other solutions for this chapter. Briefly,
this estimate is found by taking every response from the first group and subtracting every
response from the second group, yielding (in this case) a total of 25 differences. The median
of this set of differences is the Hodges-Lehmann estimate.


Minitab output: Wilcoxon rank sum (Mann-Whitney) confidence interval and test
GrpA N = 5 Median = 243.0
GrpB N = 5 Median = 540.0
Point estimate for ETA1-ETA2 is -228.0
96.3 Percent C.I. for ETA1-ETA2 is (-537.0,208.0)
W = 21.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.2101
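
The same test can be run in R; a minimal sketch using the room counts from the table in
the solution to Exercise 15.1 follows. Note that R's wilcox.test reports the Mann-Whitney
form of the statistic, U = W − n1(n1 + 1)/2 = 21 − 15 = 6, rather than the rank-sum W = 21.

R sketch:
A <- c(30, 68, 243, 448, 552)
B <- c(240, 329, 540, 560, 780)
# exact = FALSE forces the Normal approximation; correct = TRUE adds the
# continuity correction, giving a two-sided P of about 0.21, as in the output
wilcox.test(A, B, exact = FALSE, correct = TRUE)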

15.6. Because the ranks and test statistic are unchanged, all answers are the same as those
given in the previous solution.

15.7. (a) For example, 99/254 = 39.0% of self-employed workers are completely satisfied.
The complete table of conditional percents is below. Overall, self-employed workers are
more satisfied than the other group. [A bar graph of these percents (percent versus job
satisfaction, for self-employed and not self-employed workers) accompanies the table.]

                 Completely   Mostly      Mostly         Completely
Self-employed?   satisfied    satisfied   dissatisfied   dissatisfied
Yes              39.0%        55.9%       3.1%           2.0%
No               28.2%        61.2%       8.2%           2.3%

(b) See the Minitab output below: X² = 15.641 with df = 3, for which P = 0.001. We can
reject H0 and conclude that job satisfaction and job type (self-employed or not) are not
independent.

Minitab output: Chi-square test
Expected counts are printed below observed counts

C2 C3 C4 C5 Total
1 99 142 8 5 254
77.83 152.53 18.06 5.58
2 250 542 73 20 885
271.17 531.47 62.94 19.42
Total 349 684 81 25 1139
ChiSq = 5.760 + 0.727 + 5.606 + 0.059 +
1.653 + 0.209 + 1.609 + 0.017 = 15.641
df = 3, p = 0.001
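
A sketch of the same chi-square test in R, entering the observed counts directly as a matrix:

R sketch:
tbl <- matrix(c( 99, 142,  8,  5,    # self-employed
                250, 542, 73, 20),   # not self-employed
              nrow = 2, byrow = TRUE)
chisq.test(tbl)   # X-squared = 15.641, df = 3, P = 0.001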

15.8. (a) Summary statistics for the two distributions are:

         x̄       s      Min    Q1       M          Q3       Max
Men    15,287   7710   5180    9,951   12,423.5   21,791   29,920
Women  17,085   7858   7694   10,592   15,275.5   22,376   32,291

Various graphical summaries are possible; shown below is a back-to-back stemplot (stems:
ten thousands; leaves: thousands). The men's mean and median are lower than the women's,
but the stemplots don't suggest a substantial difference. Neither distribution has extreme
skewness or outliers.

  Men | stem | Women
  995 |  0   | 79
  220 |  1   | 023
    7 |  1   | 6
   31 |  2   | 124
    9 |  2   |
      |  3   | 2

(b) The Wilcoxon test statistic is W = 99 with two-sided P = 0.6776 (Minitab output below).
We do not have enough evidence to conclude that there is a difference between genders in
words spoken.

Minitab output: Wilcoxon rank sum test


MWords N = 10 Median = 12424
WWords N = 10 Median = 15276
W = 99.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.6776

15.9. Back-to-back stemplots below; summary statistics follow. The men's distribution is
skewed, and the women's distribution has a near-outlier. Men and women are not
significantly different (W = 1421, P = 0.6890). The t test assumes Normal distributions;
with small samples (like the previous exercise), this might be risky. In this exercise, the
samples might be large enough to overcome the apparent non-Normality of the distributions.

          Men | stem | Women
        42210 |  0   | 2
  99999876655 |  0   | 5557777888889
       221100 |  1   | 01223444
        77665 |  1   | 66666777789
         4331 |  2   | 0112244
          965 |  2   |
            1 |  3   | 2
            6 |  3   |
Note: Shown below is the Minitab output for a t test; the conclusion is the same as the
Wilcoxon test (t = −0.11, P = 0.92).

x s Min Q1 M Q3 Max
Men 14,060 9065 695 7464.5 11118 22740 36345
Women 14,252 6515 2363 8345.5 14602 18050 32291
Minitab output: Wilcoxon rank sum test
Mwords N = 37 Median = 11118
Wwords N = 41 Median = 14602
W = 1421.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.6890
Two-sample t test
N Mean StDev SE Mean
Mwords 37 14060 9065 1490
Wwords 41 14252 6515 1017
T-Test mu Mwords = mu Wwords (vs not =): T= -0.11 P=0.92 DF= 64
15.10. (a) We find W = 26 and P = 0.0152 (Minitab output below). We have strong
evidence against the hypothesis of identical distributions; we conclude that the weed-free
yield is higher. (b) For testing H0: µ0 = µ9 versus Ha: µ0 > µ9, we find x̄0 = 170.2,
s0 ≈ 5.4216, x̄9 = 157.575, s9 ≈ 10.1181, and t = 2.20, which gives P = 0.0423 (df = 4.6).
We have fairly strong evidence that the mean yield is higher with no weeds—but the
evidence is not quite as strong as in (a). (c) Both tests still reach the same conclusion, so
there is no "practically important impact" on our conclusions. The Wilcoxon evidence is
slightly weaker: W = 22, P = 0.0259. The t-test evidence is slightly stronger: t = 2.79,
df = 3, P = 0.0341. The new statistics for the 9-weeds-per-meter group are x̄9 ≈ 162.633
and s9 ≈ 0.2082; these are substantial changes for each value.

Minitab output: Wilcoxon rank sum test (all points)


Weed0 N = 4 Median = 169.45
Weed9 N = 4 Median = 162.55
W = 26.0
Test of ETA1 = ETA2 vs. ETA1 > ETA2 is significant at 0.0152
Wilcoxon rank sum test (outlier removed)
C2 N = 4 Median = 169.45
C4 N = 3 Median = 162.70
W = 22.0
Test of ETA1 = ETA2 vs. ETA1 > ETA2 is significant at 0.0259

15.11. (a) Normal quantile plots are not shown. The score 0.00 for child 8 seems to be a low
outlier (although with only five observations, such judgments are questionable). (b) For
testing H0: µ1 = µ2 versus Ha: µ1 > µ2, we have x̄1 = 0.676, s1 ≈ 0.1189, x̄2 = 0.406,
and s2 ≈ 0.2675. Then t ≈ 2.062, which gives P = 0.0447 (df = 5.5). We have some
evidence that high-progress readers have higher mean scores. (c) We test:
H0: Scores for both groups are identically distributed
vs. Ha: High-progress children systematically score higher
for which we find W = 36 and P = 0.0473 or 0.0463—significant evidence (at α = 0.05)
against the hypothesis of identical distributions. This is equivalent to the conclusion reached
in part (b).

Minitab output: Wilcoxon rank sum test


HiProg1 N = 5 Median = 0.7000
LoProg1 N = 5 Median = 0.4000
W = 36.0
Test of ETA1 = ETA2 vs. ETA1 > ETA2 is significant at 0.0473
The test is significant at 0.0463 (adjusted for ties)

15.12. (a) Normal quantile plots are not shown. The score 0.54 for child 3 seems to be a low
outlier. (b) For testing H0: µ1 = µ2 versus Ha: µ1 > µ2, we have x̄1 = 0.768, s1 ≈ 0.1333,
x̄2 = 0.516, s2 ≈ 0.2001. Then t ≈ 2.344, which gives P = 0.0259 (df = 6.97). We have
fairly strong evidence that high-progress readers have higher mean scores. (c) We test:
H0: Scores for both groups are identically distributed
vs. Ha: High-progress children systematically score higher
for which we find W = 38 and P = 0.0184. This is evidence against H0, slightly stronger
than that found in part (b).

Minitab output: Wilcoxon rank sum test


HiProg2 N = 5 Median = 0.8000
LoProg2 N = 5 Median = 0.4900
W = 38.0
Test of ETA1 = ETA2 vs. ETA1 > ETA2 is significant at 0.0184

15.13. (a) See the table below. (b) For Story 2, W = 8 + 9 + 4 + 7 + 10 = 38. Under H0:

µW = (5)(11)/2 = 27.5  and  σW = √((5)(5)(11)/12) ≈ 4.7871

(c) z = (38 − 27.5)/4.787 ≈ 2.19; with the continuity correction, we compute
(37.5 − 27.5)/4.787 ≈ 2.09, which gives P = P(Z > 2.09) = 0.0183. (d) See the table below.

                   Story 1          Story 2
Child   Progress   Score   Rank     Score   Rank
  1     high       0.55     4.5     0.80      8
  2     high       0.57     6       0.82      9
  3     high       0.72     8.5     0.54      4
  4     high       0.70     7       0.79      7
  5     high       0.84    10       0.89     10
  6     low        0.40     3       0.77      6
  7     low        0.72     8.5     0.49      3
  8     low        0.00     1       0.66      5
  9     low        0.36     2       0.28      1
 10     low        0.55     4.5     0.38      2
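
The Normal-approximation computation in parts (b) and (c) can be verified in R; a minimal
sketch using the Story 2 scores from the table:

R sketch:
high <- c(0.80, 0.82, 0.54, 0.79, 0.89)   # high-progress children
low  <- c(0.77, 0.49, 0.66, 0.28, 0.38)   # low-progress children
W <- sum(rank(c(high, low))[1:5])         # rank sum for the high group: 38
mu <- 5 * 11 / 2                          # 27.5
sigma <- sqrt(5 * 5 * 11 / 12)            # about 4.7871
pnorm((W - 0.5 - mu) / sigma, lower.tail = FALSE)   # about 0.0183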

15.14. (a) The outline is shown below. (b) We consider score improvements (posttest minus
pretest). The means, medians, and standard deviations are:

Treatment:  x̄ = 11.4   M = 11.5   s ≈ 3.1693
Control:    x̄ = 8.25   M = 7.5    s ≈ 3.6936

A back-to-back stemplot (below) is one way to compare the distributions graphically. Both
of these comparisons support the idea that the positive subliminal message resulted in higher
test scores. (c) We have W = 114, for which P = 0.0501 (or 0.0494, adjusted for ties). This is
just about significant at α = 0.05, and at least warrants further study.

Treatment | stem | Control
          |  0   | 455
       76 |  0   | 7
          |  0   | 8
      110 |  1   | 1
      332 |  1   | 2
        5 |  1   | 4
        6 |  1   |
Minitab output: Wilcoxon rank sum test
Trtmt N = 10 Median = 11.500
Ctrl N = 8 Median = 7.500
W = 114.0
Test of ETA1 = ETA2 vs. ETA1 > ETA2 is significant at 0.0501
The test is significant at 0.0494 (adjusted for ties)

Random        -->  Group 1 (10 subjects)  -->  Treatment 1: subliminal message  --\
assignment                                                                         -->  Observe test score change
              -->  Group 2 (8 subjects)   -->  Treatment 2: neutral message     --/

15.15. (a) Stemplot below. Unlogged plots appear to have a greater number of species.
(b) We test H0: There is no difference in the number of species on logged and unlogged
plots versus Ha: Unlogged plots have a greater variety of species. The Wilcoxon test gives
W = 159 and P ≈ 0.0298 (0.0290, adjusted for ties). We conclude that the observed
difference is significant; unlogged plots really do have a greater number of species.

Unlogged | stem | Logged
         |  0   | 4
         |  0   |
         |  0   |
         |  1   | 0
     333 |  1   | 2
      55 |  1   | 455
         |  1   | 7
     998 |  1   | 88
      10 |  2   |
      22 |  2   |
Minitab output: Wilcoxon rank sum test
Unlogged N = 12 Median = 18.500
Logged N = 9 Median = 15.000
W = 159.0
Test of ETA1 = ETA2 vs. ETA1 > ETA2 is significant at 0.0298
The test is significant at 0.0290 (adjusted for ties)

15.16. For the Wilcoxon test, we have W = 579, for which P = 0.0064 (0.0063, adjusted for
ties). The evidence is slightly stronger with the Wilcoxon test than for the t and permutation
tests.
Minitab output: Wilcoxon rank sum test
Trtmt N = 21 Median = 53.00
Ctrl N = 23 Median = 42.00
W = 579.0
Test of ETA1 = ETA2 vs. ETA1 > ETA2 is significant at 0.0064
The test is significant at 0.0063 (adjusted for ties)

15.17. (a) We find X² = 3.955 with df = (5 − 1)(2 − 1) = 4, giving P = 0.413. There is
little evidence to make us believe that there is a relationship between city and income.
(b) Minitab reports W = 56,370, with P ≈ 0.5; again, there is no evidence that incomes are
systematically higher in one city.
Minitab output: Wilcoxon rank sum test
City1 N = 241 Median = 2.0000
City2 N = 218 Median = 2.0000
W = 56370.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.5080
The test is significant at 0.4949 (adjusted for ties)

15.18. We test:
H0 : Food scores and activities scores have the same distribution
vs. Ha : Food scores are higher
The differences, and their ranks, are:
Food Activities
Spa score score Difference Rank
1 90.9 93.8 −2.9 4
2 92.3 92.3 0.0 –
3 88.6 91.4 −2.8 3
4 81.8 95.0 −13.2 6
5 85.7 89.2 −3.5 5
6 88.9 88.2 0.7 1
7 81.0 81.8 −0.8 2
In fact, it is not necessary to give the complete set of rankings; the only observations we
need to make are (1) there is only one positive difference and (2) it is the smallest (in
absolute value) of all the nonzero differences. Therefore, W + = 1.
Note: In assigning ranks, differences of 0 are ignored; see the comment in the text
toward the bottom of page 735. If a student mistakenly assigns a rank of 1 to 0, they would
find W + = 2 (or perhaps 3 if they erroneously count 0 as a “positive difference”).
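
In R, the signed rank statistic (which R labels V) and a P-value can be found with
wilcox.test; R drops the zero difference automatically, as the text prescribes. The scores
are taken from the table above.

R sketch:
food     <- c(90.9, 92.3, 88.6, 81.8, 85.7, 88.9, 81.0)
activity <- c(93.8, 92.3, 91.4, 95.0, 89.2, 88.2, 81.8)
wilcox.test(food, activity, paired = TRUE, alternative = "greater")
# V = 1 (this is W+); R warns that the zero difference was dropped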

15.19. We test:
H0 : Food scores and activities scores have the same distribution
vs. Ha : Food scores are higher
The differences, and their ranks, are:
Food Activities
Spa score score Difference Rank
1 77.3 95.7 −18.4 6
2 85.7 78.0 7.7 2
3 84.2 87.2 −3.0 1
4 85.3 85.3 0.0 –
5 83.7 93.6 −9.9 5
6 84.6 76.0 8.6 4
7 78.5 86.3 −7.8 3

The two positive differences have ranks 2 and 4, so W + = 6.


Note: In assigning ranks, differences of 0 are ignored; see the comment in the text
toward the bottom of page 735. If a student mistakenly assigns a rank of 1 to 0, they would
find W + = 3 + 5 = 8 (or perhaps 9 if they erroneously count 0 as a “positive difference”).

15.20. Because one difference was 0, we ignore it and take n = 6, so that

µW+ = 6(6 + 1)/4 = 10.5  and  σW+ = √(6(6 + 1)(12 + 1)/24) = √22.75 ≈ 4.7697,

and the approximate P-value is P(W+ ≥ 0.5) = P(Z ≥ −2.10) = 0.9821. This agrees with
the Minitab output below; see the note in the solution to Exercise 15.24 for an explanation
of the estimated median reported by Minitab.

Note: If a student does not see the instruction about discarding differences of 0 at the
bottom of page 735, they might compute the mean and standard deviation using n = 7:
µW+ = 7(7 + 1)/4 = 14 and σW+ = √(7(7 + 1)(14 + 1)/24) = √35 ≈ 5.9161. Such a student would
presumably take W+ = 2 (or 3), so they would compute the approximate P-value as
P(W+ ≥ 1.5) = P(Z ≥ −2.11) ≈ 0.9826 or P(W+ ≥ 2.5) = P(Z ≥ −1.94) ≈ 0.9738.
While these are close to the right answer (and lead to the same conclusion), they are not
quite correct. In other situations, failing to ignore differences of 0 may lead to the wrong
conclusion.
Minitab output: Wilcoxon signed rank test (median = 0 versus median > 0)
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
Diff 7 6 1.0 0.982 -2.000

15.21. Because one difference was 0, we ignore it and take n = 6, so that

µW+ = 6(6 + 1)/4 = 10.5  and  σW+ = √(6(6 + 1)(12 + 1)/24) = √22.75 ≈ 4.7697,

and the approximate P-value is P(W+ ≥ 5.5) = P(Z ≥ −1.05) = 0.8531. This agrees with
the Minitab output below; see the note in the solution to Exercise 15.24 for an explanation
of the estimated median reported by Minitab.

Note: If a student does not see the instruction about discarding differences of 0 at the
bottom of page 735, they might compute the mean and standard deviation using n = 7:
µW+ = 7(7 + 1)/4 = 14 and σW+ = √(7(7 + 1)(14 + 1)/24) = √35 ≈ 5.9161. Such a student would
presumably take W+ = 8 (or 9), so they would compute the approximate P-value as
P(W+ ≥ 7.5) = P(Z ≥ −1.10) ≈ 0.8643 or P(W+ ≥ 8.5) = P(Z ≥ −0.93) ≈ 0.8238.
While these are close to the right answer (and lead to the same conclusion), they are not
quite correct. In other situations, failing to ignore differences of 0 may lead to the wrong
conclusion.
Minitab output: Wilcoxon signed rank test (median = 0 versus median > 0)
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
Diff 7 6 6.0 0.853 -3.450

15.22. (a) Five of the six subjects rated drink A higher, by between 2 and 8 points. The
subject who rated drink B higher only gave it a 2-point edge. (b) For testing H0: µd = 0
versus Ha: µd ≠ 0, we have x̄ ≈ 4.167 and s ≈ 3.6560, so t = 2.79 (df = 5) and
P = 0.0384—enough evidence to reject H0. (c) For testing H0: Ratings have the same
distribution for both drinks versus Ha: One drink is systematically rated higher, we have
W+ = 19.5 and P = 0.075—not quite significant at α = 0.05. (d) For a sample this small,
the Wilcoxon test has low power. (See the related note in the solution to Exercise 15.24.)
Minitab output: Matched pairs t test
Variable N Mean StDev SE Mean T P-Value
Diff 6 4.17 3.66 1.49 2.79 0.038
Wilcoxon signed rank test
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
Diff 6 6 19.5 0.075 5.000

15.23. (a) With this additional subject, six of the seven subjects rated drink A higher,
and (as before) the subject who preferred drink B only gave it a 2-point edge. (b) For
testing H0: µd = 0 versus Ha: µd ≠ 0, we have x̄ ≈ 7.8571 and s ≈ 10.3187, so t = 2.01
(df = 6) and P = 0.0906. (c) For testing H0: Ratings have the same distribution for both
drinks versus Ha: One drink is systematically rated higher, we have W+ = 26.5 and
P = 0.043. (d) The new data point is an outlier (see the stemplot below), which may make
the t procedure inappropriate. This also increases the standard deviation of the differences,
which makes t insignificant. The Wilcoxon test is not sensitive to outliers, and the extra
data point makes it powerful enough to reject H0.

−0 | 2
 0 | 2
 0 | 5578
 1 |
 1 |
 2 |
 2 |
 3 | 0
Minitab output: Matched pairs t test
Variable N Mean StDev SE Mean T P-Value
Diff 7 7.86 10.32 3.90 2.01 0.091
Wilcoxon signed rank test
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
Diff 7 7 26.5 0.043 5.500

15.24. (a) The differences (treatment minus control) were 0.01622, 0.01102, and 0.01607. The
mean difference was x̄ ≈ 0.01444, and s ≈ 0.002960. The fact that all are positive supports
the idea that there was more growth in the treated plots. (b) For testing H0: µ = 0 versus
Ha: µ > 0, with µ the mean (treatment minus control) difference, we have t = x̄/(s/√3) ≈ 8.45,
df = 2, and P = 0.0069. We conclude that growth was greater in treated plots. (c) The
Wilcoxon statistic is W+ = 6, for which P = 0.091. We would not reject H0 (which states
that there is no difference among pairs). (d) A low-power test has a low probability of
rejecting H0 when it is false.
Minitab output: Wilcoxon signed rank test (median = 0 versus median > 0)
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
Diff 3 3 6.0 0.091 0.01485
Note: With only three pairs, the Wilcoxon signed rank test can never give a P-value smaller
than 0.091. This is one difference between some nonparametric tests and parametric tests
like the t test: With the t test, the power improves when we consider alternatives that are
farther from the null hypothesis; for example, if H0 says µ = 0, we have higher power for
the alternative µ = 10 than for µ = 5. With the Wilcoxon signed rank test, all alternatives
look the same; the values of W + and P would be the same if the three differences had been
100, 200, and 300.
Also, note that the “estimated median” in the Minitab output (0.01485) is not the same
as the median of the three differences (0.01607). The process of computing this point
estimate is not discussed in the text, but we will illustrate it for this simple case: The
Wilcoxon estimated median is the median of the set of Walsh averages of the differences.
This set consists of every possible pairwise average (xi + xj )/2 for i ≤ j; note that this
includes i = j, in which case the average is xi . In general, there are n(n + 1)/2 such
averages, so with n = 3 differences, we have 6 Walsh averages: the three differences
(0.01622, 0.01102, and 0.01607) and the averages of each pair of distinct differences
(0.013545, 0.01362, and 0.016145). The median of 0.01102, 0.013545, 0.01362, 0.01607,
0.016145, and 0.01622 is 0.014845.
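
This computation is easy to check in R; the Walsh-average estimate is what wilcox.test
reports as the "(pseudo)median" when a confidence interval is requested.

R sketch:
d <- c(0.01622, 0.01102, 0.01607)
s <- outer(d, d, "+") / 2                  # all pairwise averages (xi + xj)/2
walsh <- s[upper.tri(s, diag = TRUE)]      # keep i <= j: six Walsh averages
median(walsh)                              # 0.014845
wilcox.test(d, conf.int = TRUE)$estimate   # the same value, from R directly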

15.25. We examine the heart-rate increase (final minus resting) from low-rate exercise;
our hypotheses are H0: median = 0 versus Ha: median > 0. The statistic is W+ = 10
(the first four differences are positive, and the fifth is 0, so we drop it). We compute

P = P(W+ ≥ 9.5) = P((W+ − 5)/2.739 ≥ (9.5 − 5)/2.739) = P(Z ≥ 1.64) ≈ 0.0505.

This is right on the borderline of significance: It is fairly strong evidence that heart rate
increases, but (barely) not significant at 5%.
Minitab output: Wilcoxon signed rank test (median = 0 versus median > 0)
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
LowDiff 5 4 10.0 0.050 7.500

15.26. (a) We first find the Final − Resting differences for both exercise rates (Low: 15, 9,
6, 9, 0; Medium: 21, 24, 15, 15, 18), then compute the differences of these differences
(6, 15, 9, 6, 18). To this last list of differences, we apply the Wilcoxon signed rank test.
The hypotheses are H0 : median = 0 versus Ha : median > 0. (The rank sum test is not
appropriate because we do not have two independent samples.) (b) The statistic is W + = 15
(all five differences were positive), and the reported P-value is 0.030—fairly strong evidence
that medium-rate exercise increases are greater than low-rate exercise increases.
Minitab output: Wilcoxon signed rank test (median = 0 versus median > 0)
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
LowMed 5 5 15.0 0.030 10.50

15.27. For testing H0 : median = 0 versus Ha : median > 0, the Wilcoxon statistic is W + = 119
(14 of the 15 differences were positive, and the one negative difference was the smallest
in absolute value), and P < 0.0005—very strong evidence that there are more aggressive
incidents during moon days. This agrees with the results of the t and permutation tests.
(See the note in the solution to Exercise 15.24 for an explanation of the estimated median
reported by Minitab.)
Minitab output: Wilcoxon signed rank test (median = 0 versus median > 0)
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
diff 15 15 119.0 0.000 2.570

15.28. There are 17 nonzero differences; only one is negative (a difference of −6; since
ranks are based on absolute values, it is tied with the other differences of 6 in the list
below).

Diff:   1    1    2   2   2   3    3    3    3    3    3    6     6     6     6     6     6
Rank:  1.5  1.5   4   4   4  8.5  8.5  8.5  8.5  8.5  8.5  14.5  14.5  14.5  14.5  14.5  14.5

This gives W+ = 138.5. (Note that the only tie we really need to worry about is the last
group; all other ties involve only positive differences.)

15.29. (a) Stemplot below. The distribution is clearly right-skewed but has no outliers.
(b) W+ = 31 (only 4 of 12 differences were positive) and P = 0.556—there is no evidence
that the median is other than 105. (See the note in the solution to Exercise 15.24 for an
explanation of the estimated median reported by Minitab.)

 9 | 1
 9 | 5679
10 | 134
10 | 5
11 | 1
11 | 9
12 | 2

Minitab output: Wilcoxon signed rank test (median = 105 versus median ≠ 105)
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
Radon 12 12 31.0 0.556 103.2

15.30. If we compute Haiti content minus factory content (so that a negative difference
means that the amount of vitamin C decreased), we find that the mean change is −5.33
and the median is −6. (See the note in the solution to Exercise 15.24 for an explanation
of the estimated median reported by Minitab.) The stemplot (below) is right-skewed.
There are five positive differences; the Wilcoxon statistic is W+ = 37, for which
P < 0.0005. The differences are systematically negative, so vitamin C content is lower
in Haiti.

−1 | 4
−1 | 3322
−1 |
−0 | 9988
−0 | 7776666
−0 | 5444
−0 | 2
−0 | 1
 0 | 1
 0 | 33
 0 | 4
 0 |
 0 | 8

Minitab output: Wilcoxon signed rank test (median = 0 versus median < 0)
             N FOR   WILCOXON             ESTIMATED
         N   TEST    STATISTIC   P-VALUE  MEDIAN
Change   27  27      37.0        0.000    -5.500

15.31. (a) The Wilcoxon statistic is W+ = 0 (every observed weight gain was less than 16,
so all of the differences are negative), for which P = 0—very strong evidence against H0.
We conclude that the median weight gain is less than 16 pounds. (b) Minitab (output
below) gives the interval 3.75 to 5.90 kg
for the median weight gain. (For comparison, in the solution to Exercise 7.32, the 95%
confidence interval for the mean µ was about 3.80 to 5.66 kg. See the note in the solution
to Exercise 15.24 for an explanation of the estimated median reported by Minitab.)
Minitab output: Wilcoxon signed rank test (median = 16 versus median ≠ 16)
N FOR WILCOXON ESTIMATED
N TEST STATISTIC P-VALUE MEDIAN
Diff 16 16 0.0 0.000 4.800
Wilcoxon signed rank confidence interval
ESTIMATED ACHIEVED
N MEDIAN CONFIDENCE CONFIDENCE INTERVAL
Diff 16 4.80 94.8 ( 3.75, 5.90)

15.32. (a) For testing


H0 : The distribution of age at death is the same for all three groups
vs. Ha : At least one group is systematically higher or lower
we find H = 11.11 with df = 2, for which P = 0.004. (b) In the solution to Exercise 12.38,
ANOVA yielded F = 6.56 (df 2 and 120) and P = 0.002. The conclusion is the same with
either test.
Minitab output: Kruskal-Wallis test
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
1 67 73.00 63.4 0.46
2 32 68.00 46.8 -2.81
3 24 77.50 78.5 2.53
OVERALL 123 62.0
H = 11.11 d.f. = 2 p = 0.004
H = 11.12 d.f. = 2 p = 0.004 (adjusted for ties)

15.33. (a) For testing


H0 : The distribution of BMD is the same for all three groups
vs. Ha : At least one group is systematically higher or lower
we find H = 9.10 with df = 2, for which P = 0.011. (b) In the solution to Exercise 12.39,
ANOVA yielded F = 7.72 (df 2 and 42) and P = 0.001. The ANOVA evidence is slightly
stronger, but (at α = 0.05) the conclusion is the same.
Minitab output: Kruskal-Wallis test
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
1 15 0.2190 20.1 -1.05
2 15 0.2160 17.7 -1.93
3 15 0.2320 31.2 2.97
OVERALL 45 23.0
H = 9.10 d.f. = 2 p = 0.011
H = 9.12 d.f. = 2 p = 0.011 (adjusted for ties)

15.34. (a) The Kruskal-Wallis test (Minitab output below) gives H = 8.73, df = 4, and
P = 0.069—not significant at α = 0.05. Note, however, that the rankings clearly suggest
that vitamin C content decreases over time; the samples are simply too small to achieve
significance even with such seemingly strong evidence. (See also a related comment in
the solution to Exercise 15.24.) (b) The more accurate P-value is more in line with the
apparent strength of the evidence, and does change our conclusion. With it, we reject H0
and conclude that the distribution changes over time.
Minitab output: Kruskal-Wallis test
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
0 2 48.705 9.5 2.09
1 2 41.955 7.5 1.04
3 2 21.795 5.5 0.00
5 2 12.415 3.5 -1.04
7 2 8.320 1.5 -2.09
OVERALL 10 5.5
H = 8.73 d.f. = 4 p = 0.069

15.35. (a) Diagram below. (b) The stemplots (below) suggest greater density for high-jump
rats and a greater spread for the control group. (c) H = 10.66 with P = 0.005. We conclude
that bone density differs among the groups. ANOVA tests H0: all means are equal,
assuming Normal distributions with the same standard deviation. For Kruskal-Wallis, the
null hypothesis is that the distributions are the same (but not necessarily Normal). (d) There
is strong evidence that the three groups have different bone densities; specifically, the
high-jump group has the highest average rank (and the highest density), the low-jump
group is in the middle, and the control group is lowest.

Control      Low jump     High jump
55 | 4       55 |         55 |
56 | 9       56 |         56 |
57 |         57 |         57 |
58 |         58 | 8       58 |
59 | 33      59 | 469     59 |
60 | 03      60 | 57      60 |
61 | 14      61 |         61 |
62 | 1       62 |         62 | 2266
63 |         63 | 1258    63 | 1
64 |         64 |         64 | 33
65 | 3       65 |         65 | 00
66 |         66 |         66 |
67 |         67 |         67 | 4
Minitab output: Kruskal-Wallis test
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
Ctrl 10 601.5 10.2 -2.33
Low 10 606.0 13.6 -0.81
High 10 637.0 22.6 3.15
OVERALL 30 15.5
H = 10.66 d.f. = 2 p = 0.005
H = 10.68 d.f. = 2 p = 0.005 (adjusted for ties)

Random        -->  Group 1 (10 rats)  -->  Treatment 1: control    --\
assignment    -->  Group 2 (10 rats)  -->  Treatment 2: low jump   ---->  Observe bone density
              -->  Group 3 (10 rats)  -->  Treatment 3: high jump  --/



15.36. (a) For ANOVA, H0 : µ1 = µ2 = µ3 = µ4 versus Ha : Not all µi are equal. For
Kruskal-Wallis, H0 says that the distribution of the trapped insect count is the same for
all board colors; the alternative is that the count is systematically higher for some colors.
(b) In the order given, the medians are 46.5, 15.5, 34.5, and 15 insects; it appears that
yellow is most effective, green is in the middle, and white and blue are least effective. The
Kruskal-Wallis test statistic is H = 16.95, with df = 3; the P-value is 0.001, so we have
strong evidence that color affects the insect count (that is, the difference we observed is
statistically significant).
Minitab output: Kruskal-Wallis test
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
Lemon 6 46.50 21.2 3.47
White 6 15.50 7.3 -2.07
Green 6 34.50 14.8 0.93
Blue 6 15.00 6.7 -2.33
OVERALL 24 12.5
H = 16.95 d.f. = 3 p = 0.001
H = 16.98 d.f. = 3 p = 0.001 (adjusted for ties)

15.37. (a) I = 4, ni = 6, N = 24. (b) The table below lists color, insect count, and rank.
There are only three ties (and the second could be ignored, as both of those counts are for
white boards). The Ri (rank sums) are:
Yellow 17 + 20 + 21 + 22 + 23 + 24 = 127
White 3 + 4 + 5.5 + 9.5 + 9.5 + 12.5 = 44
Green 7 + 14 + 15 + 16 + 18 + 19 = 89
Blue 1 + 2 + 5.5 + 8 + 11 + 12.5 = 40

(c) H = [12/(24·25)] × [(127² + 44² + 89² + 40²)/6] − 3(25) = 91.953 − 75 = 16.953.
Under H0, this has approximately the chi-squared distribution with df = I − 1 = 3;
comparing to this distribution tells us that 0.0005 < P < 0.001.

Color:  B   B   W   W   W    B    G   B   W    W    B   W     B     G   G   G   Y   G   G   Y   Y   Y   Y   Y
Count:  7  11  12  13  14   14   15  16  17   17   20  21    21    25  32  37  38  39  41  45  46  47  48  59
Rank:   1   2   3   4  5.5  5.5   7   8  9.5  9.5  11  12.5  12.5  14  15  16  17  18  19  20  21  22  23  24
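
A sketch of both the hand computation and the built-in test in R, using the (color, count)
pairs from the ranked list above:

R sketch:
count <- c(38, 45, 46, 47, 48, 59,   12, 13, 14, 17, 17, 21,
           15, 25, 32, 37, 39, 41,    7, 11, 14, 16, 20, 21)
color <- rep(c("yellow", "white", "green", "blue"), each = 6)
Ri <- tapply(rank(count), color, sum)   # rank sums: blue 40, green 89, white 44, yellow 127
N <- 24
12 / (N * (N + 1)) * sum(Ri^2 / 6) - 3 * (N + 1)   # H = 16.953
kruskal.test(count, factor(color))      # H = 16.98 (adjusted for ties), df = 3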

15.38. (a) The stemplots (below) appear to suggest that logging reduces the number of
species per plot and that recovery is slow (the 1-year-after and 8-years-after stemplots are
similar). The logged stemplots have some outliers and appear to have more spread than the
unlogged stemplot. The medians are 18.5, 12.5, and 15. (b) For testing H0: all medians are
equal versus Ha: at least one median is different, we have H = 9.31, df = 2, and P = 0.010
(or 0.009, adjusted for ties). This is good evidence of a difference among the groups.

Unlogged     1 year ago     8 years ago
0 |          0 | 2          0 |
0 |          0 |            0 | 4
0 |          0 | 7          0 |
0 |          0 | 8          0 |
1 |          1 | 11         1 | 0
1 | 333      1 | 23         1 | 2
1 | 55       1 | 4555       1 | 455
1 |          1 |            1 | 7
1 | 899      1 | 8          1 | 88
2 | 01       2 |            2 |
2 | 22       2 |            2 |

Minitab output: Kruskal-Wallis test for logging data


LEVEL NOBS MEDIAN AVE. RANK Z VALUE
1 12 18.50 23.4 2.88
2 12 12.50 11.5 -2.47
3 9 15.00 15.8 -0.44
OVERALL 33 17.0
H = 9.31 d.f. = 2 p = 0.010
H = 9.44 d.f. = 2 p = 0.009 (adjusted for ties)

15.39. (a) Yes, the data support this statement: The percent of high-SES subjects who have
never smoked (68/211 = 32.2%) is higher than those percents for middle- and low-SES
subjects (17.3% and 23.7%, respectively), and the percent of current smokers among
high-SES subjects (51/211 = 24.2%) is lower than among the middle- (42.3%) and
low- (46.2%) SES groups. (b) X² = 18.510 with df = 4, for which P = 0.001. There is
a significant relationship between SES and smoking behavior. (c) H = 12.72 with df = 2,
so P = 0.002—or, after adjusting for ties, H = 14.43 and P = 0.001. The observed
differences are significant; some SES groups smoke systematically more.

Minitab output: Chi-square test
        Never    Former    Curr    Total
High       68        92      51      211
        58.68     83.57   68.75
Mid         9        21      22       52
        14.46     20.60   16.94
Low        22        28      43       93
        25.86     36.83   30.30
Total      99       141     116      356
ChiSq = 1.481 + 0.850 + 4.584 +
        2.062 + 0.008 + 1.509 +
        0.577 + 2.119 + 5.320 = 18.510
df = 4, p = 0.001

Kruskal-Wallis test
LEVEL     NOBS   MEDIAN   AVE. RANK   Z VALUE
High       211    2.000       162.4     -3.56
Mid         52    2.000       203.6      1.90
Low         93    2.000       201.0      2.46
OVERALL    356                178.5
H = 12.72  d.f. = 2  p = 0.002
H = 14.43  d.f. = 2  p = 0.001 (adjusted for ties)

15.40. (a) Choice of graphical summaries will vary; a back-to-back stemplot is shown below.

         x̄       s        Median
Women   165.16   56.515   175
Men     117.16   74.240   120

          Women | stem | Men
                |  0   | 033334
             96 |  0   | 66679999
       22222221 |  1   | 2222222
888888888875555 |  1   | 558
           4440 |  2   | 00344
                |  2   |
                |  3   | 0
              6 |  3   |

(b) The Wilcoxon rank sum test yields W = 1105.5, with two-sided P = 0.0050—significant
evidence of a difference. (Minitab output below.) (c) The t test yields t ≈ 2.82 with
df ≈ 54.2 and P ≈ 0.0067. (d) Both distributions have a high outlier, and the men's
distribution is skewed, making the use of a t test somewhat risky.

Minitab output: Wilcoxon rank sum test


studyF N = 30 Median = 175.00
studyM N = 30 Median = 120.00
W = 1105.5
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0050
The test is significant at 0.0045 (adjusted for ties)
Two-sample t test
N Mean StDev SE Mean
studyF 30 165.2 56.5 10
studyM 30 117.2 74.2 14
95% C.I. for mu study - mu C8: ( 14, 82)
T-Test mu study = mu C8 (vs not =): T= 2.82 P=0.0067 DF= 54

15.41. (a) [A histogram of repair response times for Verizon customers (frequency versus
response time in hours) accompanies this solution.] With only 10 CLEC service calls, it is
hardly necessary to make such a graph for them; we can simply observe that 7 of those 10
calls took 5 hours, which is quite different from the distribution for Verizon customers. The
means and medians tell the same story:

Verizon   x̄V ≈ 1.7263 hr   MV = 1 hr
CLEC      x̄C = 3.8 hr      MC = 5 hr

(b) The distributions are sharply skewed, and the sample sizes are quite different; the t test is
not reliable in situations like this. The Wilcoxon rank-sum test gives W = 4778.5, which is
highly significant (P = 0.0026 or 0.0006). We have strong evidence that response times for
Verizon customers are shorter. It is also possible to apply the Kruskal-Wallis test (with two
groups). While the P-values are slightly different (P = 0.005, or 0.001 adjusted for ties), the
conclusion is the same: We have strong evidence of a difference in response times.
Minitab output: Wilcoxon rank sum test
Verizon N = 95 Median = 1.000
CLEC N = 10 Median = 5.000
W = 4778.5
Test of ETA1 = ETA2 vs. ETA1 < ETA2 is significant at 0.0026
The test is significant at 0.0006 (adjusted for ties)
Kruskal-Wallis test
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
1 95 1.000 50.3 -2.80
2 10 5.000 78.7 2.80
OVERALL 105 53.0
H = 7.84 d.f. = 1 p = 0.005
H = 10.54 d.f. = 1 p = 0.001 (adjusted for ties)

15.42. Stemplots and other details can be found in the solution to Exercise 7.141.
[A Normal quantile plot of selling price ($1000) against Normal score accompanies this
solution.] (a) The distribution of prices for three-bedroom houses is clearly right-skewed,
with high outliers. (b) For testing H0: µ3 = µ4 versus Ha: µ3 ≠ µ4, we have t ≈ −4.475
with either df = 20.98 (P ≈ 0.0002) or df = 13 (P < 0.001). We reject H0 and conclude
that the mean prices are different (specifically, that 4BR houses are more expensive).
(c) We use the Wilcoxon rank sum test for the hypotheses H0: medians are equal versus
Ha: medians are different. We find W = 312 and P = 0.0001—significant evidence that
prices differ. This is equivalent to the conclusion reached in part (b).
Minitab output: Two-sample t test
N Mean StDev SE Mean
3BR 23 147561 61741 12874
4BR 14 266793 87275 23325
95% C.I. for mu 3BR - mu 4BR: ( -174820, -63644)
T-Test mu 3BR = mu 4BR (vs not =): T= -4.48 P=0.0002 DF= 20
Wilcoxon rank sum test
3BR N = 23 Median = 129900
4BR N = 14 Median = 259900
W = 312.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0001
The test is significant at 0.0001 (adjusted for ties)

15.43. See also the solutions to Exercises 1.79 and 12.35; the latter exercise requests the
same analysis for ANOVA. [A plot of median flower lengths (mm) by Heliconia variety
accompanies this solution.] The means, standard deviations, and medians (all in
millimeters) are:

Variety   n    x̄         s        M
bihai     16   47.5975   1.2129   47.12
red       23   39.7113   1.7988   39.16
yellow    15   36.1800   0.9753   36.11

The appropriate rank test is a Kruskal-Wallis test of H0: all three varieties have the same
length distribution versus Ha: at least one variety is systematically longer or shorter. We
reject H0 and conclude that at least one species has different lengths (H = 45.35, df = 2,
P < 0.0005).
Minitab output: Kruskal-Wallis test
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
1 16 47.12 46.5 5.76
2 23 39.16 26.7 -0.32
3 15 36.11 8.5 -5.51
OVERALL 54 27.5
H = 45.35 d.f. = 2 p = 0.000
H = 45.36 d.f. = 2 p = 0.000 (adjusted for ties)

15.44. (a) The mean and median suggest that iron content is least for aluminum pots and
greatest for iron pots. ANOVA requires Normal data with equal standard deviations; the
former is difficult to assess with such small samples, and for the latter, the largest-to-smallest
ratio is 1.99—just within our guidelines for pooling. (b) The Kruskal-Wallis test gives H = 8.00,
df = 2, P = 0.019. We conclude that vegetable iron content differs by pot type.

           n   M       x̄        s
Aluminum   4   1.185   1.2325   0.2313
Clay       4   1.615   1.46     0.4601
Iron       4   2.79    2.79     0.2399
Minitab output: Kruskal-Wallis test for vegetable dish
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
Alum 4 1.185 3.5 -2.04
Clay 4 1.615 5.5 -0.68
Iron 4 2.860 10.5 2.72
OVERALL 12 6.5
H = 8.00 d.f. = 2 p = 0.019

15.45. Use the Wilcoxon rank sum test with a two-sided alternative. For meat, W = 15 and
P = 0.4705, and for legumes, W = 10.5 and P = 0.0433 (or 0.0421). There is no evidence
of a difference in iron content for meat, but for legumes the evidence is significant at
α = 0.05.
Minitab output: Wilcoxon rank sum test for meat
Alum N = 4 Median = 2.050
Clay N = 4 Median = 2.375
W = 15.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.4705
Cannot reject at alpha = 0.05
Wilcoxon rank sum test for legumes
Alum N = 4 Median = 2.3700
Clay N = 4 Median = 2.4550
W = 10.5
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0433
The test is significant at 0.0421 (adjusted for ties)

15.46. Using a Kruskal-Wallis test, we find H = 9.85, df = 2, and P = 0.007. We conclude


that there is a difference in iron content for foods cooked in iron pots.
Minitab output: Kruskal-Wallis test for iron pots
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
Meat 4 4.695 10.5 2.72
Leg 4 3.705 6.5 0.00
Veg 4 2.860 2.5 -2.72
OVERALL 12 6.5
H = 9.85 d.f. = 2 p = 0.007

15.47. (a) The three pairwise comparisons are bihai-red, bihai-yellow, and red-yellow. (b) The
test statistics and P-values are given in the Minitab output below; all P-values are reported
as 0 to four decimal places. (c) All three are easily significant at the overall 0.05 level.
Minitab output: Wilcoxon rank sum test for bihai – red
bihai N = 16 Median = 47.120
red N = 23 Median = 39.160
W = 504.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0000
Wilcoxon rank sum test for bihai – yellow
bihai N = 16 Median = 47.120
yellow N = 15 Median = 36.110
W = 376.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0000
Wilcoxon rank sum test for red – yellow
red N = 23 Median = 39.160
yellow N = 15 Median = 36.110
W = 614.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0000

15.48. Multiple comparisons are appropriate as a follow-up to a significant result from a


Kruskal-Wallis test, so it only makes sense to do this for Exercises 15.44 and 15.46.
That means we have three comparisons from each of these exercises, for a total of 6. In
order to be significant at the overall 0.05 level, an individual P-value must be less than
0.05/6 = 0.0083. None of the differences are significant at this level; with such small
samples, these tests have low power. (For samples of size 4, W must be between 10 and 26,
so five of the six P-values are as small as they can be.)
Minitab output: Medians for the vegetable dishes
AlumVeg N = 4 Median = 1.185
ClayVeg N = 4 Median = 1.615
IronVeg N = 4 Median = 2.860
Aluminum versus clay pots (vegetable dishes)
W = 14.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.3123
Aluminum versus iron pots (vegetable dishes)
W = 10.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0304
Clay versus iron pots (vegetable dishes)
W = 10.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0304
Medians for iron pots
IronMeat N = 4 Median = 4.695
IronLeg N = 4 Median = 3.705
IronVeg N = 4 Median = 2.860
Meat versus legumes (iron pots)
W = 26.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0304
Meat versus vegetables (iron pots)
W = 26.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0304
Legumes versus vegetables (iron pots)
W = 26.0
Test of ETA1 = ETA2 vs. ETA1 ~= ETA2 is significant at 0.0304
Chapter 16 Solutions
The solutions for Chapter 16 present a special challenge. Because bootstrap and permutation
methods require software, the answers will vary because of (a) random variation due to
differences in resampling/rearrangement, and (b) possible systematic and feature differences
arising from the specific software used.
Because of (a), most of the solutions here give ranges of possible answers, rather than a
single answer. These ranges should include the results that most students should get from a
single bootstrap or permutation run. (Basically, for each such exercise, I reported the minimum
and maximum values from 1000 or more bootstraps or permutations.)
For (b), the text primarily refers to results from S-PLUS, but also mentions SAS and
SPSS. If you have other statistical software, you can learn about its bootstrapping capabilities
(if any) by consulting its documentation, or by doing a Web search for the name of your
software and “bootstrap.” Note that a free student version of S-PLUS is available at
www.onthehub.com/tibco, so your students may use it for this chapter, even if they normally
work with other software. (Faculty can download a 30-day evaluation copy.)
Many of these solutions were originally written by Tim Hesterberg (using S-PLUS) for
earlier editions of IPS, and have been edited and updated by Darryl Nester using R, the free,
open-source version of S-PLUS. One difference in R’s bootstrapping library versus that of
S-PLUS is that (at the time of this writing) it does not compute “tilting” confidence intervals,
so those results are not given. If your software finds tilting intervals, they will (for most of
these exercises) be similar to those found by other methods (percentile, BCa, etc.).

16.1. Student answers in this problem will vary substantially due to using different random
numbers. (If they do not, you should be suspicious.) (b) While students could get a sample
mean as low as 0, or as high as 29.78, 95% of all sample means should be between about 5
and 23. (c) Shown below is a stemplot for a set of 200 resamples. Even for such a large
number of resamples, the distribution is somewhat irregular; student stemplots (for 20
resamples) will be even more irregular. (d) The theoretical bootstrap standard error is about
4.694, but with only 20 resamples, there will be a fair amount of variation (although almost
certainly in the range 2.9 to 6.5).

 2 | 4
 3 | 08
 4 |
 5 | 09
 6 | 2288
 7 | 011133357
 8 | 0022588
 9 | 13477
10 | 0002355666889
11 | 1112233445557777788
12 | 00000012223333444566688
13 | 111111224555555688
14 | 0001144456667799
15 | 011225588888888889
16 | 0001444466677
17 | 133556668
18 | 02244478
19 | 0146799
20 | 001122447788
21 | 05
22 | 022225588
23 | 3
24 | 9
25 | 5

Note: The range of numbers (5 to 23) given in part (b) is based on 10000 resamples.
For part (d), the range of standard errors is based on the middle 99% of the SEs from
50000 separate resamples of size 20. The theoretical value is based on considering the
six repair times as a population to compute the standard deviation (dividing by 6 rather
than 5), yielding σ ≈ 11.497, so the theoretical standard error is σ/√6 ≈ 4.694. The
computation in the text (page 16-6) does not mention this detail, although it is discussed
briefly in Note 4 on page 16-57. Because bootstrap methods are generally not used with
small samples, and the difference is negligible for large samples, it usually does not matter.
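
A minimal R sketch of the resampling in this exercise; the vector times stands for the six
repair times given in the exercise (they are not reproduced here).

R sketch:
# times <- c(...)   # the six repair times from the exercise
means <- replicate(20, mean(sample(times, replace = TRUE)))
stem(means)         # stemplot of the 20 resample means, part (c)
sd(means)           # the bootstrap standard error, part (d); results vary by run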


16.2. The standard deviation of a sample measures the spread of that sample. The standard
error of a sample mean (or other statistic) estimates how much the mean would vary if you
were to take the means of many samples from the same population. The SE is smaller than
the sample standard deviation by a factor of √n.

16.3. (a) The bootstrap samples from the sample (that is, the data), not the population. (b) The
bootstrap samples with replacement. (c) Each resample should be the same size as the
original sample. (d) The bootstrap distribution is usually similar to the sampling distribution
in shape and spread, but not in center.

16.4. The bootstrap distribution is (usually) close to Normal, so we expect the sampling
distribution to also be close to Normal.

[Figure: a Normal quantile plot and a histogram of the bootstrap distribution of mean(IQ), with the observed mean marked.]

16.5. The bootstrap distribution is (usually) close to Normal, with some positive skewness. We
expect the sampling distribution to be close to Normal.

[Figure: a Normal quantile plot and a histogram of the bootstrap distribution of mean(CO2), with the observed mean marked.]

16.6. Due to the small sample size, the bootstrap distribution shows some discreteness (note
the small “stair-steps” in the quantile plot). This particular bootstrap distribution looks
reasonably Normal, but with a sample size this small, the sample skewness may vary
substantially, so we cannot say if the sampling distribution is really skewed.
Note: The small-sample variability in skewness is not discussed until Section 16.3.
Bootstrap methods are not well-suited to datasets this small.

[Figure: a Normal quantile plot and a histogram of the bootstrap distribution of mean(Hours), with the observed mean marked.]

16.7. The bootstrap distribution suggests that the sampling distribution should be close to
Normal.

[Figure: a Normal quantile plot and a histogram of the bootstrap distribution of mean(Luck), with the observed mean marked.]

16.8. The bootstrap distribution is non-Normal; specifically, it is skewed to the right. We


expect the sampling distribution to be skewed.
Note: This amount of skewness would not be a concern with some statistical procedures,
but it is when working with bootstrap distributions.

[Figure: a Normal quantile plot and a histogram of the bootstrap distribution of mean song length (seconds), with the observed mean marked.]

16.9. In each case, SEboot will vary. To get an idea of how much variability one might
observe, a range of "typical" bootstrap SEs is given, based on 500 trials. (a) For the IQ data,
s ≈ 14.8009, so SEx̄ ≈ 1.9108. SEboot will typically be between 1.77 and 2.01. (b) For
the CO2 data, s ≈ 4.8222, so SEx̄ ≈ 0.6960. SEboot will typically be between about 0.64 and
0.74. (c) For the video-watching data, s ≈ 3.8822, so SEx̄ ≈ 1.3726. SEboot will typically
be between about 1.20 and 1.36—almost certainly an underestimate. The bootstrap is biased
downward for estimating standard error by a factor of about √((n − 1)/n), which is about
0.94 when n = 8.

16.10. (a) [Figure: a histogram of the CALLCENTER80 call lengths, frequency versus call
length from 0 to 3000.] (b) The bootstrap distribution is clearly not Normal in the tails;
both the quantile plot and the histogram are clearly skewed to the right. [Figure: a Normal
quantile plot and a histogram of the bootstrap distribution of mean(Callcenter80 length),
with the observed mean marked.]

16.11. (a) The CALLCENTER20 bootstrap distribution is slightly skewed to the right, but it is
considerably less skewed than the CALLCENTER80 bootstrap distribution. (b) The standard
error for the smaller data set is much smaller: For CALLCENTER20, the standard error is
almost always between 21 and 25, and for CALLCENTER80, it is almost always between
35 and 41.
Note: The difference in standard errors is primarily because the sample standard
deviation for the CALLCENTER20 data is much smaller (103.8 versus 342.0).

[Figure: a Normal quantile plot and a histogram of the bootstrap distribution of mean(Callcenter20 length), with the observed mean marked.]

16.12. (a) The mean of the sample is x̄ = 13.7567, so the bootstrap bias estimate is
13.762 − 13.7567 = 0.0053. (b) SEboot = 4.725. (We do not need to divide the given
value by anything; it is already the estimated standard deviation of the sample mean.) (c) For
df = 5, the appropriate critical value is t* = 2.571, so the 95% bootstrap confidence interval
is x̄ ± t* SEboot = 13.7567 ± 12.146 = 1.6107 to 25.9027.
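
A sketch of this arithmetic in R, using the values given in the exercise:

R sketch:
xbar <- 13.7567                     # sample mean, part (a)
se.boot <- 4.725                    # bootstrap standard error, part (b)
tstar <- qt(0.975, df = 5)          # 2.571
xbar + c(-1, 1) * tstar * se.boot   # about 1.61 to 25.90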

16.13. See also the solution to Exercise 16.8. (a) The bootstrap distribution is skewed; a
t interval might not be appropriate. (b) The bootstrap t interval is x̄ ± t* SEboot, where
x̄ = 354.1 sec, t* = 2.0096 for df = 49, and SEboot is typically between 39.5 and 46.5.
This gives the range of intervals shown below. (c) The interval reported in Example 7.11
was 266.6 to 441.6 seconds.

Typical ranges
SEboot    39.5 to 46.5
t lower   260.7 to 274.7
t upper   433.5 to 447.5

16.14. (a) Based on 1000 resamples, SEboot is almost always between 3.8 and 4.5. (b) The bootstrap distribution looks reasonably Normal, with no appreciable bias, so a bootstrap t interval is appropriate. Typical ranges are below. (c) The t interval reported in Example 7.15 (page 439) was 1.2 to 18.7.

Typical ranges:
  SEboot     3.9 to 4.5
  t lower    0.88 to 2.09
  t upper    17.82 to 19.03
[Figure: Normal quantile plot and bootstrap distribution of diffmeans(DRP).]

16.15. The summary statistics given in Example 16.6 include standard deviations s1 ≈ 14.7 min for Verizon and s2 ≈ 19.5 min for CLEC, so SE_D = √(s1²/n1 + s2²/n2) ≈ 4.0820. (Computation from the original data gives SE_D ≈ 4.0827.) The standard error reported by the S-PLUS bootstrap routine (shown in the text following that example) is 4.052.

16.16. See also the solution to Exercise 16.6. (a) The bootstrap distribution found in the solution to Exercise 16.6 looks reasonably Normal. (b) The likely range of bootstrap intervals is below. (c) With x̄ = 6.75 hr, s ≈ 3.8822, and t* = 2.365 (df = 7), the usual 95% confidence interval is 3.50 to 10.00, so the bootstrap interval is almost always narrower than the usual interval.

Typical ranges:
  SEboot     1.2 to 1.4
  t lower    3.5 to 3.9
  t upper    9.6 to 10

16.17. See also the solution to Exercise 16.10. (a) The bootstrap bias is typically between −4 and 4, which is small relative to x̄ = 196.575 min. (b) Ranges for the bootstrap interval are given below. (c) SEx̄ ≈ 38.2392, while SEboot ranges from about 35 to 41. The usual t interval is 120.46 to 272.69 min.

Typical ranges:
  Bias       −4 to 4
  SEboot     35 to 41
  t lower    114 to 127
  t upper    266 to 279

16.18. (a) The two distributions should be similar (although the default plots may differ depending on what software is used). (b) Typical ranges for these quantities are shown below. (c) The interval in Example 16.5 was (210.19, 277.81).

Typical ranges:
  x̄25%      242.7 to 246.4
  Bias       −1.3 to 2.4
  SEboot     16 to 19
  t lower    205.5 to 211.5
  t upper    276.5 to 282.5
[Figure: Normal quantile plot and bootstrap distribution of trim25(Real estate price).]

16.19. The bootstrap distribution (below) is noticeably non-Normal in the tails, especially the low tail. In addition, the bias is larger than we would like (and almost always negative, because of the heavy low tail). A t interval is risky (and perhaps not appropriate), but students may elect to compute it anyway.

Typical ranges:
  Bias       −22.6 to −7.7
  SEboot     76.7 to 87.9
  t lower    140.3 to 162.6
  t upper    471.1 to 493.4
[Figure: Normal quantile plot and bootstrap distribution of stdev(Real estate price).]

16.20. (a) Shown below are stemplots and boxplots of tree diameter (cm), copied from the solution to Exercise 7.81. The north distribution is right-skewed, while the south distribution is left-skewed. It might be appropriate to use standard t methods in spite of the skewness because the sample sizes are relatively large, and there are no outliers in either distribution. (b) The bootstrap distribution appears to be quite close to Normal, with very little bias. (c) Typical ranges for SEboot and the bootstrap interval are given below. (d) The standard t interval is −19.09 to −2.58 (df = 55.7).

   North  Stem  South
   43322    0   2
      65    0   57
  443310    1   2
     955    1   8
            2   13
    8755    2   689
       0    3   2
     996    3   566789
      43    4   003444
       6    4   578
       4    5   0112
      85    5

[Figure: Side-by-side boxplots of tree diameter (cm), north versus south.]

Typical ranges:
  Bias       −0.4 to 0.5
  SEboot     3.7 to 4.3
  t lower    −19.5 to −18.3
  t upper    −3.4 to −2.2
[Figure: Normal quantile plot and bootstrap distribution of north DBH − south DBH.]

16.21. (a) The data appear to be roughly Normal, though with the typical random gaps and bunches that usually occur with relatively small samples. It appears from both the histogram and quantile plot that the mean is slightly larger than zero, but the difference is not large enough to rule out a N(0, 1) distribution. (b) The bootstrap distribution is extremely close to Normal with no appreciable bias. (c) SEx̄ ≈ 0.1308, and the usual t interval is −0.1357 to 0.3854. Typical results for SEboot and the bootstrap interval are below.

Typical ranges:
  Bias       −0.016 to 0.02
  SEboot     0.12 to 0.14
  t lower    −0.16 to −0.11
  t upper    0.36 to 0.41

[Figure: Histogram and Normal quantile plot of the Normal78 data.]

[Figure: Normal quantile plot and bootstrap distribution of mean(Normal78).]

16.22. Based on a quantile plot and histogram, the bootstrap distribution is quite non-Normal.

[Figure: Normal quantile plot and bootstrap distribution of median(Real estate price).]

16.23. Because the scores are all between 1 and 10, there can be no extreme outliers, so standard t methods should be safe. The bootstrap distribution appears to be quite Normal, with little bias. The usual t interval is 4.9256 to 6.8744, which is in the range of typical bootstrap intervals.

Typical ranges:
  Bias       −0.07 to 0.06
  SEboot     0.44 to 0.53
  t lower    4.8 to 5.0
  t upper    6.8 to 7.0

[Figure: Normal quantile plot and bootstrap distribution of mean(Luck).]

16.24. (a) The sample standard deviation is s ≈ 4.4149 mpg. (b) The typical range for SEboot is in the table below. (c) SEboot is quite large relative to s, suggesting that s is not a very accurate estimate. (d) There is substantial negative bias and some skewness, so a t interval is probably not appropriate.

Typical ranges:
  Bias       −0.22 to −0.09
  SEboot     0.55 to 0.65
  t lower    3.07 to 3.26
  t upper    5.57 to 5.76

[Figure: Normal quantile plot and bootstrap distribution of stdev(Fuel efficiency).]

16.25. The distribution is sharply right-skewed, so a trimmed mean is a good choice. (Students will likely choose the 25% trimmed mean, both because it is mentioned in the text and because it is given in the answer to this exercise.) For the sample, x̄25% = 1.74 billion dollars. Typical ranges for the bias and SEboot are given below; note that the bias is a substantial fraction of SEboot. The bootstrap distribution is strongly right-skewed, so a bootstrap t interval would be questionable; the typical range of intervals is also shown below.
Note: By definition, the mean wealth (trimmed or not) of billionaires must be more than $1 billion; the fact that the bootstrap interval can extend below that limit is an indication that we should not rely on it.

Typical ranges:
  Bias       0.048 to 0.114
  SEboot     0.29 to 0.41
  t lower    0.89 to 1.12
  t upper    2.36 to 2.59
[Figure: Histogram and Normal quantile plot of billionaire wealth.]

[Figure: Normal quantile plot and bootstrap distribution of trim25(Billionaire wealth).]
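The 25% trimmed mean bootstraps just like the mean; a minimal base-R sketch (assuming the wealth values are in a hypothetical vector wealth):

    set.seed(1)
    obs   <- mean(wealth, trim = 0.25)
    boots <- replicate(1000, mean(sample(wealth, replace = TRUE), trim = 0.25))
    c(bias = mean(boots) - obs, SEboot = sd(boots))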

16.26. (a) The bootstrap distribution for the CLEC mean is strongly right-skewed with mean 16.5, with bias and SEboot in the ranges shown below. For comparison, the bootstrap distribution for the ILEC mean in Figure 16.3 is barely skewed, with a mean of 8.41 and SEboot = 0.367. (b) Note that SEboot is much larger for CLEC than for ILEC. Because the ILEC bootstrap means vary so little, when we compute x̄ILEC − x̄CLEC, it is the latter term that primarily determines the shape of the distribution of the difference. Because of the minus sign, the right skewness of the CLEC means causes the difference to be left-skewed.

Typical ranges:
  Bias       −0.40 to 0.42
  SEboot     3.6 to 4.3
  t lower    7.6 to 9.0
  t upper    24.0 to 25.4
[Figure: Normal quantile plot and bootstrap distribution of mean(CLEC repair time).]

16.27. (a) x̄ would have a Normal distribution with mean 8.4 and standard deviation 14.7/√n. (b) and (c) Histograms below. The values of SEboot will be quite variable, both because of variation in the original sample and variation due to resampling. (d) Student answers will vary, depending on their samples. There may be some skewness (right or left) for smaller samples. SEboot should be roughly halved each time the sample size increases by a factor of 4, although for n = 10 and n = 40, the size of SEboot can vary considerably.

[Figure: Bootstrap distributions of mean(x10), mean(x40), and mean(x160).]
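The whole experiment takes only a few lines in base R; a sketch (the sample sizes and population parameters come from the exercise):

    set.seed(1)
    for (n in c(10, 40, 160)) {
      x     <- rnorm(n, mean = 8.4, sd = 14.7)
      boots <- replicate(1000, mean(sample(x, replace = TRUE)))
      cat("n =", n, " SEboot =", round(sd(boots), 3), "\n")
    }

Running this shows SEboot roughly halving as n quadruples.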

16.28. (a) The mean is 8.4, and the standard deviation is 14.7/√n. (b) and (c) Histograms below. The values of SEboot will be quite variable, both because of variation in the original sample and variation due to resampling. (d) Student answers may vary, depending on their samples. There may be substantial right skewness, and some irregularity, for smaller samples (the n = 10 histogram below is an extreme case), but the distribution should be closer to Normal for large samples. There will typically be little or no bias. SEboot should be roughly halved each time the sample size increases by a factor of 4, although for n = 10 and n = 40, the size of SEboot may vary considerably.

[Figure: Bootstrap distributions of mean(repair10), mean(repair40), and mean(repair160).]

16.29. Student answers should vary depending on their samples. They should notice that the
bootstrap distributions are approximately Normal for larger sample sizes. For small samples,
the sample could be skewed one way or the other in Exercise 16.27, and most should be
right skewed for Exercise 16.28. Some of that skewness should come through into the
bootstrap distribution.

16.30. For a 90% bootstrap percentile confidence interval, we choose the 5th and 95th
percentiles of the bootstrap distribution. For an 80% interval, use the 10th and 90th
percentiles.
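Given a vector of resampled statistics, these are one-liners in R; a sketch (assuming the bootstrap statistics are in a hypothetical vector boots):

    quantile(boots, c(0.05, 0.95))   # 90% bootstrap percentile interval
    quantile(boots, c(0.10, 0.90))   # 80% bootstrap percentile interval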

16.31. (a) The bootstrap distribution looks close to Normal (though that does not mean much with this small sample). The bias is small. (b) The typical range of bootstrap t intervals is below. (c) The bootstrap percentile interval is much narrower than the bootstrap t interval. (It is typically 70% to 80% as wide.)
Note: The reason that the percentile interval is narrower in this setting is that the bootstrap distribution has "heavy tails," visible in the slight curvature on the edges of the quantile plot. This inflates the standard error, and therefore makes the t interval wider than it should be.

Typical ranges:
  Bias              −0.54 to 0.54
  SEboot            4.35 to 5.01
  t lower           0.89 to 2.57
  t upper           24.94 to 26.63
  Percentile lower  3.7 to 5.9
  Percentile upper  22.1 to 24.5
[Figure: Normal quantile plot and bootstrap distribution of mean(repair6).]

16.32. (a) The sample standard deviation is s ≈ 14.8 with n = 60, so SEx̄ ≈ 1.9108. The usual t confidence interval is x̄ ± 2.001·SEx̄ ≈ 114.9833 ± 3.8235 = 111.16 to 118.81. (b) The bootstrap distribution appears to be reasonably close to Normal, except perhaps in the tails. Ranges for SEboot and the bootstrap t interval are given below. (c) The intervals agree fairly well, although resampling variation can produce different results. Here the formula interval would be fine.

Typical ranges:
  Bias       −0.19 to 0.23
  SEboot     1.78 to 2.04
  t lower    110.9 to 111.5
  t upper    118.5 to 119.1
[Figure: Normal quantile plot and bootstrap distribution of mean(IQ), with the bootstrap t, percentile, BCa, and usual t intervals marked.]

16.33. (a) The bootstrap percentile and t intervals are very similar, suggesting that the t intervals are acceptable. (b) Every interval (percentile and t) includes 0.
Note: In the solution to Exercise 16.31, the percentile intervals were always 70% to 80% as wide as the t intervals (because of the heavy tails of that bootstrap distribution). In this case, the width of the percentile interval is 93% to 106% of the width of the t interval.

Typical ranges:
  t lower           −0.16 to −0.11
  t upper           0.36 to 0.41
  Percentile lower  −0.17 to −0.09
  Percentile upper  0.35 to 0.42

16.34. (a) The distribution is skewed to the right, but has no extreme outliers, so standard t procedures should be safe unless high accuracy is needed. (b) The usual t interval is x̄ ± 2.045·SEx̄ = 99.8 ± 3.05 = 96.75 to 102.85. (c) The bootstrap distribution is very close to Normal with no appreciable bias; a t interval should be accurate. Ranges are in the table below. (d) The bootstrap percentile interval (ranges below) is typically similar to the bootstrap t interval, and both are similar to the standard t interval. We conclude that the usual t interval is accurate.
Note: The width of the percentile interval is typically 90% to 103% of the width of the t interval.

Typical ranges:
  Bias              −0.15 to 0.15
  SEboot            1.34 to 1.60
  t lower           96.7 to 97.0
  t upper           102.6 to 102.9
  Percentile lower  96.6 to 97.4
  Percentile upper  102.3 to 103.2

[Figure: Histogram of data30.]

[Figure: Normal quantile plot and bootstrap distribution of mean(data30).]

16.35. These intervals are given on page 16-37 of the text: The percentile interval is (−0.128, 0.356), and the bootstrap t interval is (−0.144, 0.358). The differences are small relative to the width of the intervals, so they do not indicate appreciable skewness.

16.36. (a) These intervals will vary in a manner similar to the bootstrap t and percentile
intervals; see the solution to Exercise 16.34 for likely ranges. (b) BCa and tilting intervals
are typically similar to the standard t interval (96.75 to 102.85), again suggesting that the
usual t interval is accurate. (c) For a quick check, we might use the percentile interval. For
a more accurate check, we should use a BCa or tilting interval (if they are available).

16.37. Typical ranges for the BCa interval are shown below; the tilting interval will be similar. Most intervals are fairly similar to the bootstrap t and percentile intervals from Example 16.10, suggesting that the simpler intervals are adequate.

Typical ranges:
  BCa lower  −0.19 to −0.11
  BCa upper  0.32 to 0.41
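In R, BCa (and percentile) intervals are available from the boot package; a minimal sketch (assuming the data are in a hypothetical vector x):

    library(boot)
    b <- boot(x, statistic = function(d, i) mean(d[i]), R = 1000)
    boot.ci(b, type = c("perc", "bca"))

The statistic function must accept a vector of resampled indices i, which is how boot() passes each resample.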

16.38. As was previously noted in the solutions to Exercises 16.8 and 16.13, the skewness is a cause for concern. The lower limit of the percentile interval is generally larger than the lower limit of the bootstrap t interval. The BCa interval is almost always shifted to the right relative to both the t and percentile intervals. The t and percentile intervals are inaccurate here; the more sophisticated BCa or tilting intervals are more reliable.
Typical ranges:
  t lower           260.7 to 274.7
  t upper           433.5 to 447.5
  Percentile lower  270.2 to 287.0
  Percentile upper  432.8 to 463.3
  BCa lower         281.4 to 300.0
  BCa upper         450.9 to 516.2

[Figure: Bootstrap distribution of mean(Seconds), with the bootstrap t, percentile, and BCa intervals marked.]

16.39. The percentile interval is shifted to the right relative to the bootstrap t interval. The
more accurate intervals are shifted even further to the right.
Typical ranges:
  t lower           114 to 127
  t upper           266 to 279
  Percentile lower  127 to 140
  Percentile upper  267 to 298
  BCa lower         137 to 152
  BCa upper         292 to 371

[Figure: Bootstrap distribution of mean(Callcenter80 length), with intervals marked.]

16.40. The percentile interval is typically shifted to the left relative to the others; the BCa
and tilting intervals are farther to the right. Based on these differences, the BCa or tilting
interval should be used.
Typical ranges:
  t lower           139.4 to 162.7
  t upper           470.9 to 494.2
  Percentile lower  122.8 to 151.3
  Percentile upper  437.7 to 479.5
  BCa lower         162.3 to 202.3
  BCa upper         484.7 to 609.1

[Figure: Bootstrap distribution of stdev(Real estate price), with intervals marked.]

16.41. The bootstrap distribution for the smaller sample is less skewed. The standard t interval
is 80.34 to 177.46; the bootstrap t interval is similar, and the other bootstrap intervals are
generally narrower and shifted to the right.
Note: Generally, a smaller sample should result in less regularity—that is, more
skewness, larger standard error, etc. In this case, the smaller sample does not contain the
nine highest call lengths, many of which would be considered outliers. Those increase the
skewness of the bootstrap distribution for CALLCENTER80.
Typical ranges:
  t lower           78.0 to 84.4
  t upper           173.3 to 179.8
  Percentile lower  80.8 to 92.8
  Percentile upper  169.7 to 181.6
  BCa lower         83.1 to 96.2
  BCa upper         173.0 to 191.9

[Figure: Bootstrap distribution of mean(Callcenter20 length), with intervals marked.]

16.42. See also the solution to Exercise 16.26. (a) The bootstrap distribution shows strong right skewness, making the formula t, bootstrap t, and (to a lesser extent) the percentile intervals unreliable. The BCa and tilting intervals adjust for skewness, so they should be accurate. (b) Typical ranges for these intervals are shown below. Relative to the sample mean x̄ ≈ 16.5, the BCa and tilting intervals are much more asymmetrical than the other intervals, because they take into account the skewness in the data. The bootstrap t ignores the skewness, and the percentile interval only catches part of the skewness. In practical terms, a t interval would tend to underestimate the true value: It would not stretch far enough to the right, so it would have a probability of missing the population mean higher than 5%. This is true to a lesser extent for the percentile interval.
Typical ranges:
  t lower           7.5 to 9.00
  t upper           24.0 to 25.5
  Percentile lower  9.5 to 10.6
  Percentile upper  23.9 to 27.3
  BCa lower         10.7 to 12.0
  BCa upper         26.8 to 37.9

[Figure: Bootstrap distribution of mean(CLEC repair time), with intervals marked.]

16.43. The observed difference is x̄ILEC − x̄CLEC ≈ −8.1. Ranges for all three intervals are given below. Because of the left skew of the bootstrap distribution, the t interval does not reach far enough to the left and reaches too far to the right, meaning that the interval would be too high too often, effectively overestimating where the true difference lies. This may also be true for the percentile interval, but considerably less so.
Typical ranges:
  t lower           −16.5 to −15.4
  t upper           −0.8 to 0.3
  Percentile lower  −18.5 to −16.0
  Percentile upper  −2.1 to −1.2
  BCa lower         −19.3 to −16.5
  BCa upper         −2.5 to −1.4

[Figure: Bootstrap distribution of ILEC mean − CLEC mean, with intervals marked.]

16.44. (a) The bootstrap distribution is extremely left-skewed, with consistent negative bias. This is definitely not a candidate for the bootstrap t interval. (b) As expected, the t interval is not acceptable; in particular, the upper limit is always greater than 1. (In the plot, it extends past the right border of the histogram.) The percentile and BCa intervals are often similar; while the BCa interval is theoretically more sophisticated, in this case, the simpler percentile method seems to be fine.

Typical ranges:
  Bias              −0.071 to −0.033
  SEboot            0.13 to 0.19
  t lower           0.47 to 0.57
  t upper           1.19 to 1.30
  Percentile lower  0.37 to 0.51
  Percentile upper  0.98 to 1.00
  BCa lower         0.38 to 0.55
  BCa upper         0.98 to 1.00

[Figure: Normal quantile plot and bootstrap distribution of corr(bots, spams/day), with intervals marked.]

16.45. (a) The bootstrap distribution is sharply left-skewed. (b) Shown below are ranges for the percentile and BCa intervals, as well as the (inappropriate) bootstrap t interval. The percentile and BCa intervals typically have similar upper limits, but the BCa lower limit is generally less than the percentile lower limit. The confidence intervals give more than enough evidence to reject H0: ρ = 0; in fact, we have strong evidence that the correlation is at least 0.99.

Typical ranges:
  Bias              −0.00019 to 0.00008
  t lower           0.9940 to 0.9948
  t upper           0.9995 to 1.0002
  Percentile lower  0.9931 to 0.9946
  Percentile upper  0.9988 to 0.9992
  BCa lower         0.9899 to 0.9937
  BCa upper         0.9985 to 0.9990

[Figure: Normal quantile plot and bootstrap distribution of corr(Debt2006, Debt2007), with intervals marked.]

16.46. We should resample whole observations. If the data are stored in a spreadsheet with
observations in rows and the x and y variables in two columns, then we should pick a
random sample of rows with replacement. When a row is picked, we put the whole row into
a bootstrap data set. By doing so, we maintain the relationship between x and y.
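A minimal sketch of this case resampling in base R (assuming the paired data are rows of a hypothetical data frame df with columns x and y):

    set.seed(1)
    boot.corr <- replicate(1000, {
      rows <- sample(nrow(df), replace = TRUE)   # resample whole rows
      cor(df$x[rows], df$y[rows])
    })
    sd(boot.corr)                                # bootstrap SE of the correlation

Resampling x and y separately would break the pairing and destroy the very correlation being estimated.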

16.47. (a) The regression equation for predicting salary (in $millions) is ŷ = 0.8125 + 7.7170x, where x is batting average. (The slope is not significantly different from 0: t = 0.744, P = 0.461.) (b) The bootstrap distribution is reasonably close to Normal, suggesting that any of the intervals would be reasonably accurate. Ranges for the intervals are given below. (c) These results are consistent with our conclusion about correlation: All intervals include zero, which corresponds to no (linear) relationship between batting average and salary.

Typical ranges:
  Bias              −0.56 to 1.35
  SEboot            9.26 to 10.6
  t lower           −13.6 to −10.8
  t upper           26.3 to 29.0
  Percentile lower  −12.7 to −7.7
  Percentile upper  26.0 to 31.7
  BCa lower         −13.6 to −7.6
  BCa upper         24.7 to 33.0

R output: lm(I(Salary/10^6) ~ Average, data=mlbsalaries)


Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.8125 2.6941 0.302 0.764
Average 7.7170 10.3738 0.744 0.461
Residual standard error: 2.718 on 48 degrees of freedom
Multiple R-squared: 0.0114, Adjusted R-squared: -0.009199
F-statistic: 0.5534 on 1 and 48 DF, p-value: 0.4606
[Figure: Normal quantile plot and bootstrap distribution of the regression line slope, with intervals marked.]
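Bootstrapping the slope uses the same row-resampling idea; a sketch, reusing the mlbsalaries data frame and formula from the R output above:

    set.seed(1)
    boot.slope <- replicate(1000, {
      rows <- sample(nrow(mlbsalaries), replace = TRUE)
      coef(lm(I(Salary/10^6) ~ Average, data = mlbsalaries[rows, ]))[2]
    })
    quantile(boot.slope, c(0.025, 0.975))   # percentile interval for the slope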

16.48. (a) The residuals versus bots plot suggests that spread increases with the number of bots—a violation of the conditions for regression. The small sample size makes it hard to draw conclusions from the quantile plot (it is not very linear, but that is primarily because of one point). (b) The bootstrap distribution is decidedly non-Normal; we should not use the t interval. (The percentile interval is more reliable, but is still somewhat risky because of the shape of the bootstrap distribution.) (c) Ranges for all three intervals are given in the table below. As expected, the bootstrap t interval is inaccurate, but the percentile and BCa intervals are quite similar. With t* = 2.306 for df = 8, the standard t interval is 0.1705 ± t*(0.03189) = 0.0969 to 0.2440—shifted to the right relative to BCa (and percentile).
Note: Because H0: β1 = 0 is equivalent to H0: ρ = 0, it might be tempting to think that bootstrapping the slope and bootstrapping the correlation should be equivalent. Although the distributions are usually similar, the resampling process also changes sx and sy, which complicates the relationship between ρ and β1.

Typical ranges:
  Bias              −0.02 to −0.01
  SEboot            0.045 to 0.053
  t lower           0.050 to 0.067
  t upper           0.274 to 0.290
  Percentile lower  0.028 to 0.049
  Percentile upper  0.207 to 0.216
  BCa lower         0.029 to 0.053
  BCa upper         0.207 to 0.224

R output: lm(formula = SpamsPerDay ~ Bots, data = spam)


Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.40192 4.31939 -0.788 0.453635
Bots 0.17048 0.03189 5.346 0.000689
Residual standard error: 9.246 on 8 degrees of freedom
Multiple R-squared: 0.7813, Adjusted R-squared: 0.754
F-statistic: 28.58 on 1 and 8 DF, p-value: 0.0006891
[Figure: Residuals versus spam bots, and Normal quantile plot of the residuals.]

[Figure: Normal quantile plot and bootstrap distribution of the regression line slope, with intervals marked.]

16.49. (a) The tails of the residuals are somewhat heavier than we would expect from a Normal distribution. (b) The bootstrap distribution is reasonably close to Normal, so the bootstrap t should be fairly accurate. (c) Ranges for all three bootstrap intervals are given in the table below; they all give similar results. With t* = 2.074 for df = 22, the standard t interval is 1.1159 ± t*(0.01808) = 1.0784 to 1.1534. The bootstrap intervals are fairly close to this.

Typical ranges:
  Bias              −0.0008 to 0.0026
  SEboot            0.016 to 0.019
  t lower           1.07 to 1.09
  t upper           1.14 to 1.16
  Percentile lower  1.08 to 1.09
  Percentile upper  1.14 to 1.16
  BCa lower         1.07 to 1.09
  BCa upper         1.14 to 1.16

R output: lm(formula = Debt2007 ~ Debt2006, data = debt)


Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.11113 3.60162 0.309 0.76
Debt2006 1.11594 0.01808 61.727 <2e-16
Residual standard error: 11.09 on 22 degrees of freedom
Multiple R-squared: 0.9943, Adjusted R-squared: 0.994
F-statistic: 3810 on 1 and 22 DF, p-value: < 2.2e-16
[Figure: Residuals versus debt in 2006, and Normal quantile plot of the residuals.]

[Figure: Normal quantile plot and bootstrap distribution of the regression line slope, with intervals marked.]

16.50. (a) The bootstrap distribution typically has slight left skewness for the full data set, and is close to Normal with the outliers removed. The bias is close to zero in each case, but SEboot is substantially larger for the full data set. (b) Ranges for the BCa intervals are given below; the interval is higher and much narrower with the outliers excluded.
Note: The outliers have the same effect on the standard t interval: With all points, the interval is 1.6240 to 3.2414, and with the outliers removed, the interval is 2.4679 to 3.5888.

Typical ranges:
  Bias (all points)     −0.04 to 0.04
  Bias (trimmed)        −0.02 to 0.03
  SEboot (all points)   0.33 to 0.39
  SEboot (trimmed)      0.22 to 0.26
  BCa lower (all)       1.42 to 1.76
  BCa upper (all)       2.99 to 3.19
  BCa lower (trimmed)   2.47 to 2.64
  BCa upper (trimmed)   3.44 to 3.63

[Figure: Normal quantile plot and bootstrap distribution of mean(aggdiff−all), with intervals marked.]
[Figure: Normal quantile plot and bootstrap distribution of mean(aggdiff−trimmed), with intervals marked.]

16.51. No, because we believe that one population has a smaller spread, but in order to pool
the data, the permutation test requires that both populations be the same when H0 is true.
16.52. The standard error is about √((0.04)(0.96)/250) ≈ 0.0124. We should not feel comfortable declaring this to be significant at the 5% level, because 0.04 is less than one SE below 0.05.

16.53. (a) The observed difference in means is (57 + 53)/2 − (19 + 37 + 41 + 42)/4 = 20.25. (b) Student results will vary, but should be one of the 15 equally likely resamples, which yield the values −20.25, −17.25, −16.5, −8.25, −5.25, −3.75, −3, 0, 5.25, 8.25, 9, 11.25, 12, 20.25 (the value 8.25 arises from two different resamples). (c) The histogram shape will vary considerably. (d) Out of 20 resamples, the number which yield a difference of 20.25 or more has a binomial distribution with n = 20 and p = 1/15, so most students should get between 0 and 4, for a P-value between 0 and 0.2. (e) As was noted in part (b), only one resample gives a difference of means greater than or equal to the observed value, so the exact P-value is 1/15 ≈ 0.0667.
Note: To determine the 15 possible values, note that the six numbers sum to 249. If the first two numbers add up to T, then the other four will add up to 249 − T, and the difference in means will be (1/2)T − (1/4)(249 − T) = (3/4)T − 62.25. The values of T range from 19 + 37 = 56 to 57 + 53 = 110.

16.54. (a) We test H0: µ1 = µ2 versus Ha: µ1 ≠ µ2, where µ1 is the mean selling price for all Seattle real estate transactions in 2001, and µ2 is the mean selling price for Seattle real estate transactions in 2002. (b) With x̄1 = 288.94 and x̄2 = 329.34, we find t ≈ 0.81, df ≈ 71.9, and P ≈ 0.4223. (If we assume equal standard deviations, df = 98 and P ≈ 0.4216.) (c) With 1000 resamples, the two-sided P-value will typically be between 0.34 and 0.50. These are reasonably consistent with the P-value from part (b), leading to the same conclusion: There is little evidence that µ1 and µ2 differ. (d) Ranges for the BCa interval are below. These intervals include 0 and suggest that the two means are not significantly different at the 0.05 level, which is consistent with the conclusions in parts (b) and (c).
Note: For part (c), if the true one-sided P-value is 0.21, then nearly all estimated one-sided P-values will be in the range 0.21 ± 3√((0.21)(0.79)/1000) = 0.21 ± 0.04, so most estimated two-sided P-values will be between 0.34 and 0.50. That's a wide range; for more accuracy, use more resamples.
For part (d), the lower and upper endpoints of the BCa intervals can vary quite a bit. When computing those intervals, R warns "Some BCa intervals may be unstable."

Typical ranges:
  Bias       −5 to 5
  SEboot     45 to 55
  BCa lower  −51 to −27
  BCa upper  137 to 201
[Figure: Normal quantile plot and permutation distribution of 2002 price − 2001 price.]
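A minimal base-R permutation test for the difference of means (assuming hypothetical vectors price2001 and price2002):

    set.seed(1)
    pooled <- c(price2001, price2002)
    n1     <- length(price2001)
    perm   <- replicate(1000, {
      s <- sample(pooled)                  # shuffle the pooled prices
      mean(s[-(1:n1)]) - mean(s[1:n1])     # 2002-sized group minus 2001-sized group
    })
    obs <- mean(price2002) - mean(price2001)
    mean(abs(perm) >= abs(obs))            # two-sided P-value estimate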

16.55. (a) The ILEC distribution (gray bars in the density histogram below) is clearly skewed to the right, while the CLEC distribution is skewed to the left (although with only 10 observations, that might not mean anything). (b) In keeping with the discussion in Example 16.13, we use a one-sided alternative. For a test of H0: µILEC = µCLEC versus Ha: µILEC < µCLEC, we find t ≈ −3.25, df ≈ 10.71, and P ≈ 0.004. (c) Based on 1000 resamples (each of size 1000), the P-value typically ranges from 0.001 to 0.010. The permutation test does not require Normal distributions, and gives more accurate answers in the case of skewness. A plot of the permutation distribution shows there is substantial skewness. (d) The difference is significant at the 5% level (and usually at the 1% level).

[Figure: Density histogram of repair times for ILEC (gray bars) and CLEC.]
[Figure: Normal quantile plot and permutation distribution of mean(ILEC time) − mean(CLEC time).]

16.56. The standard deviations are approximately √(P(1 − P)/n), giving the results below.

  Study     n        P       Standard deviation
  DRP       1000     0.015   0.00384
  Verizon   500,000  0.0183  0.00019

16.57. (a) The two populations should be the same shape, but skewed—or otherwise clearly
non-Normal—so that the t test is not appropriate. (b) Either test is appropriate if the two
populations are both Normal with the same standard deviation. (c) We can use a t test, but
not a permutation test, if both populations are Normal with different standard deviations.

16.58. With resamples of size 1000, the P-value typically ranges from 0.53 to 0.68. If the
price distributions in 2001 and 2002 were the same, then a difference in medians as large as
we observed would occur more than half the time. We conclude that the difference is easily
explained by chance, and is therefore not statistically significant.
Note: The permutation distribution (below) appears to be bimodal, but that does not
affect our conclusion.

[Figure: Normal quantile plot and permutation distribution of ILEC median − CLEC median.]

16.59. (a) We test H0: µ = 0 versus Ha: µ > 0, where µ is the population mean difference (gain = posttest − pretest) before and after the summer language institute. We find t ≈ 3.86, df = 19, and P ≈ 0.0005. (b) The quantile plot (below) looks odd because we have a small sample, and all differences are integers. (c) The P-value is almost always less than 0.002. Both tests lead to the same conclusion: The difference is statistically significant.
Note: The text states that "the histogram [of the permutation distribution] looks a bit odd." In fact, different software produces different default histograms, some of which look fine. (This statement was made about the default histogram produced by S-PLUS.) To avoid potential confusion, check what your software does, and (if necessary) tell students to ignore that part of the question.

[Figure: Normal quantile plot of gain (posttest − pretest).]

[Figure: Normal quantile plot and permutation distribution of mean(gain).]

16.60. (a) We test H0: µD = µC versus Ha: µD < µC, where µD is the mean driver-calculated mpg, and µC is the mean computer mpg. We find t ≈ 4.36, df = 19, and P ≈ 0.0002. This is very strong evidence against the null hypothesis. (b) The permutation test P-value is almost always 0.001 or less. This is reasonably close to the value from the t test.

[Figure: Normal quantile plot and permutation distribution of mean(Computer − Driver).]

16.61. We test H0: ρ = 0 versus Ha: ρ ≠ 0, where ρ is the population correlation. (One could also justify the one-sided alternative ρ > 0 in this case; the ultimate conclusion is the same for either alternative.) The permutation distribution (found by permuting the debts from one year, then computing the correlation) is roughly Normal. In the histogram of the permutation distribution below, the observed correlation (r ≈ 0.997) is not marked because it lies far out on the high tail, nearly five standard deviations above the mean (0). Consequently, the reported P-value is nearly always 0, confirming the very strong evidence found in the solution to Exercise 16.45.

[Figure: Normal quantile plot and permutation distribution of corr(Debt2006, Debt2007).]
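A sketch of this correlation permutation test (assuming a hypothetical data frame debt with columns Debt2006 and Debt2007, as in the earlier R output):

    set.seed(1)
    obs  <- cor(debt$Debt2006, debt$Debt2007)
    perm <- replicate(1000, cor(sample(debt$Debt2006), debt$Debt2007))
    mean(abs(perm) >= abs(obs))   # two-sided P-value estimate

Permuting one variable breaks any association while preserving both marginal distributions.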

16.62. (a) We test H0: ρ = 0 versus Ha: ρ > 0, where ρ is the population correlation. (b) With 10,000 resamples, the P-value from the permutation test will almost always be between 0.21 and 0.25. This does not provide significant evidence against the null hypothesis. (With fewer resamples, the P-values have a wider range, but any reasonable resample size will lead to the same conclusion.)

[Figure: Normal quantile plot and permutation distribution of corr(Salary, Average).]

16.63. For testing H0: σ1 = σ2 versus Ha: σ1 ≠ σ2, the permutation test P-value will almost always be between 0.065 and 0.095. In the solution to Exercise 7.105, we found F ≈ 1.50 with df = 29 and 29, for which P ≈ 0.2757—three or four times as large. In this case, the permutation test P-value is smaller, which is typical of short-tailed distributions.
[Figure: Normal quantile plot and permutation distribution of var(north) / var(south).]

16.64. (a) The group means (in µmol/L) are x̄1 ≈ 0.778 (group 1, uninfected) and x̄2 ≈ 0.620 (group 2, infected). (b) If R = µ1/µ2 is the ratio of the population means, we test H0: R = 1 versus Ha: R > 1. The one-sided P-value is typically between 0.014 and 0.021. The permutation distribution (below) is centered near 1 with standard deviation approximately 0.11; it is roughly Normal with some right skewness. We expect the permutation distribution to be centered at about 1, because that is the null hypothesis value for the ratio.
Note: Our permutation resampling will, on the average, produce x̄1 = x̄2, so it seems "obvious" that the ratio x̄1/x̄2 should equal 1 on the average. Of course, one should beware of accepting the "obvious"; in general, the expected value of a ratio is not equal to the ratio of the expected values, although it will often (as in this case) be close.

[Figure: Normal quantile plot and permutation distribution of mean(uninfected) / mean(infected).]

16.65. For the permutation test, we must resample in a way that is consistent with the null
hypothesis. Hence we pool the data—assuming that the two populations are the same—and
draw samples (without replacement) for each group from the pooled data. For the bootstrap,
we do not assume that the two populations are the same, so we sample (with replacement)
from each of the two datasets separately, rather than pooling the data first.
Note: Shown below is the bootstrap distribution for the ratio of means; comparing this
with the permutation distribution from the previous solution illustrates the effect of the
resampling method.
[Figure: Normal quantile plot and bootstrap distribution of mean(uninfected) / mean(infected).]
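The two resampling schemes are easy to contrast in code; a sketch with hypothetical vectors uninf and infected:

    set.seed(1)
    n1     <- length(uninf)
    pooled <- c(uninf, infected)
    # Permutation (assumes H0: the two populations are the same)
    perm.ratio <- replicate(1000, {
      s <- sample(pooled)                  # draw without replacement from the pool
      mean(s[1:n1]) / mean(s[-(1:n1)])
    })
    # Bootstrap (no null assumed): resample each group separately, with replacement
    boot.ratio <- replicate(1000,
      mean(sample(uninf, replace = TRUE)) / mean(sample(infected, replace = TRUE)))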

16.66. (a) Ranges for the BCa interval (based on 1000 resamples) are given below. Note that some of these intervals include 1, suggesting that for some resamples, we could not reject H0: R = 1. (b) The bootstrap distribution (shown with the solution to the previous problem) is right-skewed, with relatively little bias. Typically, the percentile interval is shifted slightly to the right of the BCa interval, which means that it suggests slightly stronger evidence against H0: R = 1.
Note: If students use a larger resample size, the intervals show less variability. With 10,000 resamples, both the BCa and percentile intervals will almost certainly (barely) not include 1; in particular, the BCa lower bound is typically between 1.003 and 1.023.

Typical ranges:
  Bias              −0.01 to 0.02
  SEboot            0.12 to 0.15
  BCa lower         0.97 to 1.06
  BCa upper         1.47 to 1.58
  Percentile lower  0.99 to 1.05
  Percentile upper  1.49 to 1.58

16.67. (a) The resampled CLEC standard deviation is sometimes 0, so for best results (that
is, to avoid infinite ratios), put that standard deviation in the numerator. Both bootstrap
distributions are shown below. (We do not need quantile plots to confirm that these
distributions are non-Normal.) Regardless of which ratio we use, the resulting P-value is
close to 0.37. (b) The difference in the P-values is evidence of the inaccuracy of the F test;
these distributions clearly do not satisfy the Normality assumption.

[Figure: Bootstrap distributions of sd(CLEC) / sd(ILEC) and sd(ILEC) / sd(CLEC).]

16.68. To test H0: p1 = p2 versus Ha: p1 ≠ p2, we resample (without replacement) from the pooled data: 2822 + 1553 = 4375 subjects, of which 198 + 295 = 493 have downloaded a podcast. The exact approach depends on the software used; one way is to code each of the 4375 responses as 0 or 1 and do a permutation test for a difference of means. Regardless of the approach, the result should be the same: The observed difference (p̂2 − p̂1 ≈ 0.1198) is about 12 standard deviations above the mean, and the reported P-value is nearly always 0.

[Figure: Normal quantile plot and permutation distribution of 2008 prop. − 2006 prop.]
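A sketch of the 0/1-coding approach in base R, using the counts given above:

    set.seed(1)
    x    <- c(rep(1, 493), rep(0, 4375 - 493))   # pooled 0/1 responses
    n2   <- 1553                                 # size of the 2008 sample
    perm <- replicate(1000, {
      s <- sample(x)
      mean(s[1:n2]) - mean(s[-(1:n2)])
    })
    obs <- 295/1553 - 198/2822                   # observed difference in proportions
    mean(abs(perm) >= abs(obs))                  # essentially 0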

16.69. The bootstrap distribution looks quite Normal, and (as a consequence) all of the
bootstrap confidence intervals are similar to each other, and also are similar to the standard
(large-sample) confidence interval: 0.0981 to 0.1415.
Note: At the time these solutions were written, R’s bootstrapping package would fail if
asked to find the BCa confidence interval for this exercise.

[Figure: Normal quantile plot and bootstrap distribution of 2008 prop. − 2006 prop.]

16.70. See the solution to Exercise 7.65 for stemplots and summary statistics, and the unpooled test; the solution to Exercise 7.86 gives the details for the pooled test. In Exercise 7.65, the text suggested that there was a prior suspicion that the sad group would be willing to pay more, so we used a one-sided alternative. Here we use a two-sided alternative. (a) The unpooled t statistic is t ≈ −4.30 with df ≈ 26.48, for which the two-sided P-value is P ≈ 0.0002. (b) The pooled test gives t ≈ −4.10, df = 29, and P ≈ 0.0003. (c) Apart from some granularity (visible in the quantile plot), the permutation distribution is reasonably Normal. The observed difference is about three standard deviations above the mean, and P < 0.002 (nearly always). (d) Student preferences will vary. Note that the permutation test is safest because it makes the fewest assumptions about the populations, while the pooled t test makes the most assumptions. Given the identical conclusions and the similarity of strength of the evidence, there is little reason to have a strong preference here.

[Figure: Normal quantile plot and permutation distribution of mean(Sad − Neutral).]

16.71. (a) The standard test of H0: σ1 = σ2 versus Ha: σ1 ≠ σ2 leads to F = 0.3443 with df 13 and 16; P ≈ 0.0587. (b) The permutation P-value is typically between 0.02 and 0.03. (c) The P-values are similar, even though technically, the permutation test is significant at the 5% level, while the standard test is (barely) not. Because the samples are too small to assess Normality, the permutation test is safer. (In fact, the population distributions are discrete, so they cannot follow Normal distributions.)

[Figure: Normal quantile plot and permutation distribution of var(Neutral) / var(Sad).]

16.72. To compare the means, we need a matched-pairs permutation test, which gives a P-value near 0.78—no reason to suspect a systematic difference in the operators' measurements.
There is not really a legitimate way to compare the spreads with the data we have. Most of the variation in each operator's measurements can be attributed to variation in the subjects being measured, rather than variation due to the operator's abilities. Nonetheless, we can do this comparison by randomly swapping (or not) observations within each matched pair, and then examining the ratio of variances (or equivalently, the standard deviations). The permutation distribution of the variance ratio is given below; it has P ≈ 0.66. In both cases there is not statistically significant evidence of a difference between the operators. The differences could easily arise by chance; even larger differences would occur by chance more than half the time.
Note: For a legitimate comparison of the spreads for the two operators, we would want to have multiple measurements on each subject. Although the ratio of variances is the most common comparison, we could also compute the difference between the variances (or standard deviations). These approaches yield P-values similar to those found for the ratio.

[Figure: Normal quantile plot and permutation distribution of mean(Operator 1 − Operator 2).]

[Figure: Normal quantile plot and permutation distribution of var(Operator 1) / var(Operator 2).]
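For the comparison of means, the matched-pairs permutation test randomly flips the sign of each within-pair difference; a minimal sketch (assuming a hypothetical vector d of Operator 1 − Operator 2 differences):

    set.seed(1)
    obs  <- mean(d)
    perm <- replicate(1000,
      mean(d * sample(c(-1, 1), length(d), replace = TRUE)))
    mean(abs(perm) >= abs(obs))   # two-sided P-value estimate

Swapping the two measurements within a pair is equivalent to negating that pair's difference.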

16.73. See the solution to Exercise 16.18 for another view of the bootstrap distribution. Ranges for the bootstrap t, percentile, and BCa intervals are given below. All have similar upper endpoints, but the lower endpoint for the bootstrap t is typically less than the others (because it ignores the skewness in the bootstrap distribution). It appears that any of the other intervals—including the percentile interval—would be more reliable.

Typical ranges:
  t lower           205.5 to 211.5
  t upper           276.5 to 282.5
  Percentile lower  208.4 to 216.5
  Percentile upper  276.2 to 288.8
  BCa lower         207.9 to 217.4
  BCa upper         276.0 to 293.6

[Figure: Bootstrap distribution of trim25(Price), with intervals marked.]

16.74. See the solution to Exercise 16.5 for the bootstrap distribution. (a) The bootstrap distribution is approximately Normal, with a very slight right skew. A t interval should be acceptable unless high accuracy is needed. (b) Ranges for the t and BCa intervals are given below. (c) The BCa interval is typically shifted to the right of the t interval, as we might expect because of the slight right skew.

Typical ranges:
  t lower    3.1 to 3.4
  t upper    5.8 to 6.1
  BCa lower  3.2 to 3.6
  BCa upper  5.9 to 6.5

16.75. All answers (including the shape of the bootstrap distribution) will depend strongly on
the initial sample of uniform random numbers. The median M of these initial samples will
be between about 0.36 and 0.64 about 95% of the time; this is the center of the bootstrap t
confidence interval. (a) For a uniform distribution on 0 to 1, the population median is 0.5.
Most of the time, the bootstrap distribution is quite non-Normal; three examples are shown
below. (b) SEboot typically ranges from about 0.04 to 0.12 (but may vary more than that,
depending on the original sample). The bootstrap t interval is therefore roughly M ± 2SEboot .
(c) The more sophisticated BCa and tilting intervals may or may not be similar to the
bootstrap t interval. The t interval is not appropriate because of the non-Normal shape of
the bootstrap distribution, and because SEboot is unreliable for the sample median (it depends
strongly on the sizes of the gaps between the observations near the middle).
Note: Based on 5000 simulations of this exercise, the bootstrap t interval M ± 2SEboot
will capture the true median (0.5) only about 94% of the time (so it slightly underperforms
its intended 95% confidence level). In the same test, both the percentile and BCa intervals
included 0.5 over 95% of the time, while at the same time being narrower than the
bootstrap t interval nearly two-thirds of the time. These two measures (achieved confidence
level, and width of confidence interval) both confirm the superiority of the other intervals.
The bootstrap percentile, BCa, and tilting intervals do fairly well despite the high
variability in the shape of the bootstrap distribution. They give answers similar to the exact
rank-based confidence intervals obtained by inverting hypothesis tests. One variation of
tilting intervals matches the exact intervals.

[Figure: Three example bootstrap distributions of median(uniform50).]
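One run of this experiment, as a base-R sketch:

    set.seed(1)
    u     <- runif(50)                                  # original uniform sample
    boots <- replicate(1000, median(sample(u, replace = TRUE)))
    median(u) + c(-2, 2) * sd(boots)                    # rough bootstrap t interval

Because the resampled median can only take values at (or midway between) the observed data points, the bootstrap distribution is discrete and often strikingly non-Normal.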

16.76. (a) The difference appears to be quite large; it should be significant. (b) The two-sided permutation P-value is about 0.000655. With 1000 resamples, students will typically get a P-value of no more than 0.004. (c) We conclude that there is significant evidence that the mean ages differ. The t test P-value is similar to the (true) permutation P-value, although student estimates of the latter might be too high.

  Male  Stem  Female
         1    9
         2    01
     3   2    2233
     5   2    4
     6   2
   899   2    9
    01   3
    22   3
     5   3

Note: Some software will compute the exact permutation test P-value; it is 110/167,960 ≈ 0.000655. This is about three times larger than the standard t test result: P ≈ 0.000223.
[Figure: Normal quantile plot and permutation distribution of mean(male age) − mean(female age).]

16.77. See the solution to Exercise 2.33 for a scatterplot. The permutation distribution (found by permuting one variable and computing the correlation) is roughly Normal, and the observed correlation (r ≈ 0.878) lies far out on the high tail, about three standard deviations above the mean (0). We conclude there is a significant positive relationship.

[Figure: Normal quantile plot and permutation distribution of corr(distress, activity).]

16.78. (a) The permutation distribution is centered near 0, because for a hypothesis test, we resample in a way that is consistent with H0: ρ = 0. In contrast, the bootstrap distribution is centered near the observed correlation of 0.878. Its right tail is bounded above by 1, whereas the left tail can be much longer. (b) Ranges for the intervals are given below. (Because of the skew, the t interval is a poor choice for this situation, as evidenced by the upper limit exceeding 1.) None of the intervals are even close to including 0; we conclude that there is a significant positive relationship.

Typical ranges:
  t lower           0.67 to 0.74
  t upper           1.02 to 1.09
  Percentile lower  0.61 to 0.71
  Percentile upper  0.95 to 0.97
  BCa lower         0.51 to 0.70
  BCa upper         0.94 to 0.97

[Figure: Normal quantile plot and bootstrap distribution of corr(distress, activity), with intervals marked.]

16.79. (a) The 2001 data is slightly skewed, but close to Normal given the sample size (50). The 2000 data is strongly right-skewed with two high outliers; a sample of size 20 is probably not enough to compensate. (b) The two-sided P-value for the permutation test is approximately 0.28. We conclude that there is not strong evidence that the mean selling prices are different for all Seattle real estate in 2000 and in 2001.

  2000 prices        2001 prices
   1 | 3346899        0 | 5677
   2 | 001488         1 | 0134445799
   3 | 3669           2 | 0011123444677899
   4 | 8              3 | 1123457
   5 |                4 | 25556788
   6 |                5 | 017
   7 |                6 | 8
   8 |                7 | 1
   9 |
  10 |
  11 | 0
  12 |
  13 |
  14 |
  15 |
  16 |
  17 |
  18 | 4

[Figure: Normal quantile plot and permutation distribution of 2000 mean − 2001 mean.]

16.80. (a) See the solution to Exercise 1.41 for stemplots. Summary statistics (all in units of minutes):

           x̄      s      Min  Q1   M    Q3   Max
  Men     117.2  74.24    0    60  120  150  300
  Women   165.2  56.51   60   120  175  180  360

(b) The (unpooled) two-sample t test of H0: µF = µM versus Ha: µF ≠ µM gives t ≈ 2.82, df ≈ 54.2, and P ≈ 0.0067—a significant difference. A 95% confidence interval for the difference µF − µM is 13.85 to 82.15 minutes. (c) A two-sided permutation test for the difference of means typically gives P no more than 0.02 (with 1000 resamples). The bootstrap distribution is slightly skewed; confidence intervals are similar to the standard t interval, although the percentile and BCa intervals are sometimes shifted to the left.

Typical ranges:
  t lower           12.0 to 17.0
  t upper           79.0 to 84.0
  Percentile lower  8.8 to 18.9
  Percentile upper  75.4 to 85.2
  BCa lower         7.4 to 20.8
  BCa upper         75.1 to 87.9
[Figure: Normal quantile plot and permutation distribution of female mean − male mean.]
[Figure: Normal quantile plot and bootstrap distribution of female mean − male mean, with intervals marked.]

16.81. The permutation and bootstrap distributions for the difference in medians are extremely non-Normal, with many gaps and multiple peaks. In this situation, we have conflicting results: The permutation test gives fairly strong evidence of a difference (the two-sided P-value is roughly 0.032), but the BCa interval for the difference in medians nearly always includes 0.

Typical ranges:
  t lower           9.6 to 16.5
  t upper           93.5 to 100.4
  Percentile lower  0 to 30
  Percentile upper  90 to 100
  BCa lower         −32.5 to 0
  BCa upper         75 to 90
[Figure: Normal quantile plot and permutation distribution of female median − male median.]

[Figure: Normal quantile plot and bootstrap distribution of female median − male median, with intervals marked.]

16.82. The standard test for equality of variances gives F ≈ 0.58 with df 29 and 29, for which P ≈ 0.1477. (a) Using a permutation test, the two-sided P-value is about 0.226. Ranges for the bootstrap intervals are below; the bootstrap t is a bad choice for this sharply skewed distribution. (b) The variances are equal if and only if the standard deviations are equal, so any conclusion about the ratio of variances from the bootstrap and permutation distributions has an equivalent conclusion about the ratio of standard deviations. (c) We have strong evidence that the means of the two distributions are different, but cannot reject H0: σF = σM. The evidence regarding the medians is mixed.

Typical ranges:
  t lower           −0.22 to 0.02
  t upper           1.14 to 1.38
  Percentile lower  0.17 to 0.23
  Percentile upper  1.24 to 1.62
  BCa lower         0.21 to 0.27
  BCa upper         1.42 to 2.42
[Figure: Normal quantile plot and permutation distribution of var(Female)/var(Male), with the observed ratio marked.]

[Figure: Normal quantile plot and bootstrap distribution of var(Female)/var(Male), with t, percentile, and BCa interval endpoints marked.]

16.83. See Exercise 8.55 for more details about this survey. The bootstrap distribution appears to be close to Normal; bootstrap intervals are similar to the large-sample interval (0.3146 to 0.3854). Typical ranges:

t lower             0.31 to 0.32
t upper             0.38 to 0.39
Percentile lower    0.30 to 0.32
Percentile upper    0.38 to 0.39

Note: At the time of this writing, R's bootstrapping package would not compute the BCa intervals for this exercise.

[Figure: Normal quantile plot and bootstrap distribution of the teen proportion − adult proportion, with t and percentile interval endpoints marked.]

16.84. The bootstrap distribution is slightly skewed, but close enough to Normal that there is little difference among the interval methods. Typical ranges:

t lower             1.54 to 1.56
t upper             1.73 to 1.76
Percentile lower    1.54 to 1.57
Percentile upper    1.73 to 1.77

Note: At the time of this writing, R's bootstrapping package would not compute the BCa intervals for this exercise.
[Figure: Normal quantile plot and bootstrap distribution of the teen proportion / adult proportion, with t and percentile interval endpoints marked.]
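For Exercises 16.83 and 16.84, the resampling unit is an individual 0/1 survey response. A sketch of the ratio-of-proportions bootstrap follows; the success counts and sample sizes are inputs the reader would supply (they are not repeated in these solutions), so the usage comment is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def boot_prop_ratio(x1, n1, x2, n2, n_boot=2000):
    """Percentile interval for p1/p2, resampling the 0/1 responses
    behind each sample proportion (x successes out of n)."""
    g1 = np.repeat([1.0, 0.0], [x1, n1 - x1])  # group 1 responses
    g2 = np.repeat([1.0, 0.0], [x2, n2 - x2])  # group 2 responses
    ratios = np.empty(n_boot)
    for b in range(n_boot):
        r1 = rng.choice(g1, size=n1, replace=True).mean()
        r2 = rng.choice(g2, size=n2, replace=True).mean()
        ratios[b] = r1 / r2
    return np.quantile(ratios, [0.025, 0.975])

# With the survey's actual counts, this interval is typically close to
# the percentile ranges tabled above (endpoints near 1.55 and 1.75);
# replacing r1 / r2 with r1 - r2 gives the Exercise 16.83 version.
```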

16.85. (a) This is the usual way of computing percent change: 89/54 − 1 = 0.65. (b) Subtract
1 from the confidence interval found in Exercise 16.84; this typically gives an interval
similar to 0.55 to 0.75. (c) Preferences will vary.

16.86. (a) Jocko’s mean estimate is $1827.5; the other garage’s mean is $1715. The matched-pairs t interval for the difference is $112.5 ± $88.52 = $23.98 to $201.02. (b) Because these are matched pairs, we resample the differences. The distribution is reasonably close to Normal; ranges for the bootstrap intervals are below. (c) The bootstrap t interval is similar to the standard t interval; the other intervals are typically narrower. Typical ranges:

Bias                −3.7 to 3.7
SEboot              34.7 to 39.9
t lower             22.3 to 34.0
t upper            191.0 to 202.7
Percentile lower    25.1 to 47.6
Percentile upper     175 to 195
BCa lower             20 to 45
BCa upper            170 to 195

[Figure: Normal quantile plot and bootstrap distribution of mean(Jocko − Other), with t, percentile, and BCa interval endpoints marked.]
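Because the data are matched pairs, the bootstrap resamples the differences themselves. A minimal sketch, assuming diffs holds the Jocko − Other differences:

```python
import numpy as np

rng = np.random.default_rng(1)

def boot_paired_mean(diffs, n_boot=2000):
    """Bootstrap distribution of the mean of matched-pairs differences."""
    diffs = np.asarray(diffs, float)
    means = np.array([
        rng.choice(diffs, size=len(diffs), replace=True).mean()
        for _ in range(n_boot)
    ])
    se_boot = means.std(ddof=1)                      # bootstrap SE
    percentile = np.quantile(means, [0.025, 0.975])  # percentile interval
    return se_boot, percentile

# se_boot should land in the 34.7-39.9 range tabled above, and the
# percentile interval is typically narrower than the t interval.
```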

16.87. (a) The mean ratio is 1.0596; the usual t interval is 1.0596 ± (2.262)(0.02355) = 1.0063 to 1.1128. The bootstrap distribution for the mean is close to Normal, and the bootstrap confidence intervals (typical ranges below) are usually similar to the usual t interval, but slightly narrower. Bootstrapping the median produces a clearly non-Normal distribution; the bootstrap t interval should not be used for the median. (Ranges for median intervals are not given.) (b) The ratio of means is 1.0656; the bootstrap distribution is noticeably skewed, so the bootstrap t is not a good choice, but the other methods usually give intervals similar to 0.75 to 1.55. Also shown below is the bootstrap distribution for the ratio of the medians. It is considerably less erratic than the median ratio, but we have still not included these confidence intervals. (c) For example, the usual t interval from part (a) could be summarized by the statement, “On average, Jocko’s estimates are 1% to 11% higher than those from other garages.”

Typical ranges, (a) mean ratio:
t lower             1.00 to 1.02
t upper             1.10 to 1.12
Percentile lower    1.00 to 1.03
Percentile upper    1.09 to 1.11
BCa lower           1.00 to 1.03
BCa upper           1.09 to 1.11

Typical ranges, (b) ratio of means:
t lower             0.59 to 0.68
t upper             1.46 to 1.54
Percentile lower    0.69 to 0.78
Percentile upper    1.45 to 1.64
BCa lower           0.69 to 0.78
BCa upper           1.45 to 1.66
[Figures: Normal quantile plots and bootstrap distributions of mean(Jocko/Other), median(Jocko/Other), mean(Jocko)/mean(Other), and median(Jocko)/median(Other), with t, percentile, and BCa interval endpoints marked.]
Chapter 17 Solutions

17.4. Possible examples of special causes might include: traffic, number of passengers on the
shuttle (especially if the shuttle makes several stops along the way), mechanical problems
with the shuttle.

17.5. The center line is at µ = 85 seconds. The control limits should be at µ ± 3σ/√6 = 85 ± 3(17/√6), which means about 64.18 and 105.82 seconds.

17.6. (a) With n = 5, the center line is unchanged (85 seconds), but the control limits are now µ ± 3σ/√5 = 62.18 and 107.82 seconds. (b) With n = 7, the center line is unchanged (85 seconds), but the control limits are now µ ± 3σ/√7 = 65.72 and 104.28 seconds. (c) To convert to minutes, divide the original center line and control limits by 60: The center line is 1.417 minutes, and the control limits are 1.070 and 1.764 minutes.

17.8. Common causes of variation might include the time it takes to call in the order, to make
the pizza, and to deliver it. Examples of special causes might include heavy traffic or
waiting for a train (causing delays in delivery), high demand for pizza (for example, during
events like the Super Bowl), etc.

17.9. The most common problems are related to the application of the color coat; that should
be the focus of our initial efforts.

[Figures: Pareto charts for Exercises 17.9 (frequency, in percent, of each paint-finish problem) and 17.10 (percent of losses for DRGs 209, 116, 107, 462, 109, 148, 430, 403, and 104).]

17.10. These DRGs account for a total of 80.5% of all losses. Certainly the first two (209 and
116) should be among those that are studied first; some students may also include 107, 462,
and so on.

17.11. Possible causes could include delivery delays due to traffic or a train, high demand
during special events, and so forth.



17.12. (a) The center line is at µ = 72°F; the control limits should be at µ ± 3σ/√5, which means about 71.46°F and 72.54°F. (b) For n = 5, c4 = 0.94 and B6 = 1.964, so the center line for the s chart is (0.94)(0.4) = 0.376°F, and the control limits are 0 and 0.7856°F.

17.13. (a) For the x chart, the center line is at µ = 1.028 lb; the control limits should be at µ ± 3σ/√3, which means about 0.9864 and 1.0696 lb. (b) For n = 3, c4 = 0.8862 and B6 = 2.276, so the center line for the s chart is (0.8862)(0.024) = 0.02127 lb, and the control limits are 0 and 0.05462 lb. (c) The control charts are below. (d) Both charts suggest that the process is in control.

[Figures: x chart and s chart for Exercise 17.13; all points lie within the control limits.]
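The s chart arithmetic in Exercises 17.12 and 17.13 follows one pattern: center line c4·σ and control limits B5·σ and B6·σ. A sketch, with the constants for n = 3 and n = 5 copied from the exercises above (they are table lookups, not computed here):

```python
# Control-chart constants keyed by sample size; these few values are
# taken from the exercises above (the full table is in the text).
CONSTANTS = {3: (0.8862, 0.0, 2.276),   # n = 3: (c4, B5, B6)
             5: (0.94,   0.0, 1.964)}   # n = 5

def s_chart(sigma, n):
    """Center line and control limits for an s chart with known sigma."""
    c4, b5, b6 = CONSTANTS[n]
    return b5 * sigma, c4 * sigma, b6 * sigma

print(s_chart(0.024, 3))  # about (0, 0.02127, 0.05462) -- Exercise 17.13(b)
print(s_chart(0.4, 5))    # about (0, 0.376, 0.7856)    -- Exercise 17.12(b)
```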

17.14. (a) Common causes might include processing time, normal workload fluctuation, or
postal delivery time. (b) s-type special causes might include a new employee working in the
personnel department. (c) Special causes affecting x might include a sudden large influx of
applications or perhaps introducing a new filing system for applications.

17.15. (a) The center line is at µ = 11.5 Kp; the control limits should be at µ ± 3σ/√4 = 11.5 ± 0.3 = 11.2 and 11.8 Kp. (b) See the charts below; points outside the control limits are marked with an “X.” (c) Set B is from the in-control process. The process mean shifted suddenly for Set A; it appears to have changed on about the 11th or 12th sample. The mean drifted gradually for the process in Set C.

[Figures: x charts for Sets A, B, and C.]

17.16. (a) For the x chart, the center line is 11.5, and the control limits are 11.2 and 11.8 (as
in Exercise 17.15). For n = 4, c4 = 0.9213 and B6 = 2.088, so the center line for the s
chart is (0.9213)(0.2) = 0.18426, and the control limits are 0 and 0.4176. (b) The s chart is
certainly out of control at sample 7 (and was barely in control at sample 6). After that, there
are a number of out-of-control points. The x chart is noticeably out of control at sample 8.
(c) A change in the mean does not affect the s chart; the effect on the x chart is masked
by the change in σ: Because of the increased variability, the sample means are sometimes
below the UCL even after the process mean shifts.

[Figures: x chart and s chart for Exercise 17.16; out-of-control points are marked with an “x.”]

17.17. For the s chart with n = 6, we have c4 = 0.9515, B5 = 0.029, and B6 = 1.874, so the center line is (0.9515)(0.001) = 0.0009515 inch, and the control limits are 0.000029 and 0.001874 inch. For the x chart, the center line is µ = 0.87 inch, and the control limits are µ ± 3σ/√6 ≈ 0.87 ± 0.00122 = 0.8688 and 0.8712 inch.

17.18. (a) For n = 5, we have c4 = 0.94, B5 = 0, and B6 = 1.964, so the center line is 0.11938, and the control limits are 0 and 0.249428. (b) The center line is µ = 4.22, and the control limits are µ ± 3σ/√5 = 4.0496 to 4.3904.

17.19. For the x chart, the center line is 43, and the control limits are 25.91 and 60.09.
For n = 5, c4 = 0.9400 and B6 = 1.964, so the center line for the s chart is
(0.9400)(12.74) = 11.9756, and the control limits are 0 and 25.02. The control charts
(below) show that sample 5 was above the UCL on the s chart, but it appears to have been
special cause variation, as there is no indication that the samples that followed it were out of
control.

[Figures: x chart and s chart for Exercise 17.19; the s for sample 5 is above the UCL.]

17.20. The new type of yarn would appear on the x chart because it would cause a shift in the
mean pH. (It might also affect the process variability and therefore show up on the s chart.)
Additional water in the kettle would change the pH for that kettle, which would change the
mean pH and also change the process variability, so we would expect that special cause to
show up on both the x and s charts.

17.21. (a) The process mean is the same as the center line: µ = 715. The control limits are three standard errors from the mean, so 30 = 3σ/√4, meaning that σ = 20. (b) If the mean changes to µ = 700, then x is approximately Normal with mean 700 and standard deviation σ/√4 = 10, so x will fall outside the control limits with probability 1 − P(685 < x < 745) = 1 − P(−1.5 < Z < 4.5) ≈ 0.0668. (c) With µ = 700 and σ = 30, x is approximately Normal with mean 700 and standard deviation σ/√4 = 15, so x will fall outside the control limits with probability 1 − P(685 < x < 745) = 1 − P(−1 < Z < 3) ≈ 0.16.
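The Normal probabilities in parts (b) and (c) can be checked directly; a sketch using scipy:

```python
from scipy.stats import norm

def p_out_of_limits(mu, sigma, n, lcl, ucl):
    """Probability that x-bar (the mean of n observations) falls outside
    the control limits when the process has the given mu and sigma."""
    se = sigma / n ** 0.5
    return 1 - (norm.cdf(ucl, mu, se) - norm.cdf(lcl, mu, se))

print(p_out_of_limits(700, 20, 4, 685, 745))  # about 0.0668 -- part (b)
print(p_out_of_limits(700, 30, 4, 685, 745))  # about 0.1600 -- part (c)
```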

17.22. c = 3.090. (Looking at Table A, there appear to be three possible answers—3.08, 3.09,
or 3.10. Software gives the answer 3.090232. . . .)

17.23. The usual 3σ limits are µ ± 3σ/√n for an x chart and (c4 ± 3c5)σ for an s chart. For 2σ limits, simply replace “3” with “2.” (a) µ ± 2σ/√n. (b) (c4 ± 2c5)σ.

17.24. (a) The R chart monitors the variability (spread) of the process. (b) The R chart is
commonly used because R is easier to compute (by hand) than s. (c) The x control limits
are affected because we estimate process spread using R instead of s.

17.25. (a) Shrinking the control limits would increase the frequency of false alarms, because
the probability of an out-of-control point when the process is in control will be higher
(roughly 5% instead of 0.3%). (b) Quicker response comes at the cost of more false alarms.
(c) The runs rule is better at detecting gradual changes. (The one-point-out rule is generally
better for sudden, large changes.)

17.26. (a) Either (ii) or (iii), depending on whether the deterioration happens quickly or
gradually. We would not necessarily expect that this deterioration would result in a
change in variability (s or R). (b) (i) s or R chart: A change in precision suggests altered
variability (s or R), but not necessarily a change in center (x). (c) (i) s or R chart:
Assuming there are other (fluent) customer service representatives answering the phones, this
new person would have unusually long times, which should most quickly show up as an
increase in variability. (d) (iii) A run on the x chart: “The runs signal responds to a gradual
shift more quickly than the one-point-out signal.”

17.27. We estimate σ̂ to be s/0.9213 ≈ 1.1180, so the x chart has center line x = 47.2 and control limits x ± 3σ̂/√4 ≈ 45.523 and 48.877. The s chart has center line s = 1.03 and control limits 0 and 2.088σ̂ ≈ 2.3344.

17.28. To estimate µ and σ, we compute x ≈ 1.0299 lb and s ≈ 0.0222 lb from the sample means and standard deviations given in Table 17.3. µ̂ = x is our estimate of µ; this is about 0.002 lb greater than the historical value (1.028 lb). To estimate σ, we use σ̂ = s/c4 = 0.0222/0.8862 ≈ 0.0251 lb—about 4.6% greater than the historical value (0.024 lb). Both of these differences are so small that, even if they are statistically significant, it seems unlikely that they suggest any noteworthy change in this process.

17.29. One possible x chart is shown, created with the (arbitrary) assumption that the experienced clerk processes invoices in an average of 2 minutes, while the new hire takes an average of 4 minutes. (The control limits were set arbitrarily as well.)

[Figure: one possible x chart for Exercise 17.29.]

Note: Such a process would not be considered to be in control for very long. The initial control limits might be developed based on a historical estimate of σ, but eventually we should assess that estimate based on our sample standard deviations. Because both clerks “are quite consistent, so that their times vary little from invoice to invoice,” each sample has a small value of s, so the revised estimate of σ would likely be smaller. At that point, the control limits (based on that smaller spread) will be moved closer to the center line.

17.30. (a) Sketches will vary quite a bit; many students will struggle with the implications of this situation on the appearance of the two charts. The two charts below were produced using a much more sophisticated approach than most students would take; they arose from a simulation taking samples of size 6 (3 from each clerk), where the experienced clerk’s processing time (in minutes) is N(2, 0.5) and the new hire’s processing time is N(4, 0.8). The center lines and control limits were estimated from the data. (b) For example, this would be acceptable if we are concerned with overall processing time and are not interested in individual processing times. In particular, it would not be appropriate to compute tolerance limits for this situation because the individual measurements do not have a Normal distribution.

[Figures: x chart and s chart from the simulated two-clerk process.]

Note: The situation described here demonstrates the problems that can arise when we do
not carefully consider the question of “rational subgroups” in our sampling design; see the
discussion on page 17-30.
This situation is related to the issue of distinguishing “within-groups” variation
from “between-groups” variation, as discussed in Chapter 12 (One-Way ANOVA). The
within-groups variation is variation in invoice processing time for each clerk, and the
between-groups variation is the difference between their processing times. In this case,
though, we are not paying attention to the explanatory variable (which clerk processed the
invoice), so all we see is a mixture of the two sources of variation. If—as was the case
here—the two clerks were fairly consistent so that within-groups variation is small, the
sample standard deviations are most affected by the between-groups variation.
Note that both charts show less variation than we typically see; nearly all the points are
no more than 1 or 2 standard deviations from the center line. To begin to understand why,
imagine an extreme case with no within-groups variation—where one clerk always takes
exactly 2 minutes, and the other always takes exactly 4 minutes. Then each sample would
contain the 6 numbers 2, 2, 2, 4, 4, and 4, so x = 3 and s ≈ 1.0954 for all samples, and the
control charts would have no variation at all.
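A simulation in the spirit of the one described in part (a) is sketched below. The sample sizes and the N(2, 0.5) and N(4, 0.8) clerk distributions are taken from the solution above; everything else is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# 20 samples of size 6: three invoices from each clerk per sample.
experienced = rng.normal(2, 0.5, size=(20, 3))
new_hire = rng.normal(4, 0.8, size=(20, 3))
samples = np.hstack([experienced, new_hire])

xbars = samples.mean(axis=1)           # per-sample means
sds = samples.std(axis=1, ddof=1)      # per-sample standard deviations
# The s values cluster near the value driven by the between-clerks gap
# (about 1.1 even with no within-clerk variation), illustrating why
# both charts show unusually little variation.
```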

17.31. (a) Average the 20 sample means and standard deviations and estimate µ to be µ̂ = x = 2750.7 and σ to be σ̂ = s/c4 = 345.5/0.9213 ≈ 375.0. (b) In the s chart shown in Figure 17.7, most of the points fall below the center line.

17.32. For the 15 samples, we have s = $799.1 and x = $6442.4. (a) σ̂ = s/c4 = 799.1/0.9650 ≈ 828.1; the center line is s, and the control limits are B5σ̂ = (0.179)($828.1) = $148.2 and B6σ̂ = (1.751)($828.1) = $1450.0. (b) For the x chart, the center line is x = $6442.4, and the control limits are x ± 3σ̂/√8 = $5564.1 to $7320.7. The control chart shows that the process is in control.

[Figures: s chart and x chart for Exercise 17.32; all points lie within the control limits.]

17.33. If the manufacturer practices SPC, that provides some assurance that the phones are
roughly uniform in quality—as the text says, “We know what to expect in the finished
product.” So, assuming that uniform quality is sufficiently high, the purchaser does not need
to inspect the phones as they arrive because SPC has already achieved the goal of that
inspection: to avoid buying many faulty phones. (Of course, a few unacceptable phones may
be produced and sold even when SPC is practiced—but inspection would not catch all such
phones anyway.)

17.34. The standard deviation of all 120 measurements is s ≈ $811.53, and the mean is x ≈ $6442.4 (the same as x—as it must be, provided all the individual samples were the same size). The natural tolerances are x ± 3s = $4007.8 to $8877.0.

17.35. The quantile plot does not suggest any serious deviations from Normality, so the natural tolerances should be reasonably trustworthy.

[Figure: Normal quantile plot of loss (dollars) for Exercise 17.35.]

Note: We might also assess Normality with a histogram or stemplot; this looks reasonably Normal, but we see that the number of losses between $6000 and $6500 is noticeably higher than we might expect from a Normal distribution. In fact, the smallest and largest losses were $4727 and $8794. These are both within the tolerances, but note that the minimum is quite a bit more than the lower limit of the tolerances ($4008). The large number of losses between $6000 and $6500 makes the mean slightly lower and therefore lowers both of the tolerance limits.

17.36. (a) About 99.9% meet the old specifications: If X is the water resistance on a randomly chosen jacket, then
P(1000 < X < 4000) = P((1000 − 2750)/383.8 < Z < (4000 − 2750)/383.8) = P(−4.56 < Z < 3.26) ≈ 0.9994
(b) About 97.4% meet the new specifications:
P(1500 < X < 3500) = P((1500 − 2750)/383.8 < Z < (3500 − 2750)/383.8) = P(−3.26 < Z < 1.95) ≈ 0.9738

17.37. If we shift the process mean to 2500 mm, about 99% will meet the new specifications:
P(1500 < X < 3500) = P((1500 − 2500)/383.8 < Z < (3500 − 2500)/383.8) = P(−2.61 < Z < 2.61) ≈ 0.9910
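This probability (and those in Exercise 17.36) come from the same Normal-area computation; a sketch:

```python
from scipy.stats import norm

def frac_in_spec(mu, sigma, lsl, usl):
    """Fraction of individual outputs inside [lsl, usl] under a
    Normal(mu, sigma) model for the process output."""
    return norm.cdf(usl, mu, sigma) - norm.cdf(lsl, mu, sigma)

print(frac_in_spec(2750, 383.8, 1500, 3500))  # about 0.974 -- Ex. 17.36(b)
print(frac_in_spec(2500, 383.8, 1500, 3500))  # about 0.991 -- Ex. 17.37
```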

17.38. (a) The means (1.2605 and 1.2645) agree exactly with those given; the standard deviations are the same up to rounding. (b) The s chart tracks process spread. For the 30 samples, we have s = 0.0028048, so σ̂ = s/c4 = s/0.7979 ≈ 0.003515; the center line is s, and the control limits are B5σ̂ = 0 and B6σ̂ = 2.606σ̂ ≈ 0.009161. Short-term variation seems to be in control. (c) For the x chart, which monitors the process center, the center line is x = 1.26185, and the control limits are x ± 3σ̂/√2 = 1.2544 to 1.2693. The control chart shows that the process is in control.
[Figures: s chart and x chart for Exercise 17.38; all points lie within the control limits.]

17.39. The mean of the 17 in-control samples is x = 43.4118, and the standard deviation is
11.5833, so the natural tolerances are x ± 3s = 8.66 to 78.16.

17.40. There were no out-of-control points, so we estimate the mean of the process using µ̂ = x = 1.26185. The estimated standard deviation is computed from the 60 individual data points; this gives s ≈ 0.003328. The natural tolerances are x ± 3s = 1.2519 to 1.2718.

17.41. Only about 44% of meters meet the specifications. Using the mean (43.4118) and standard deviation (11.5833) found in the solution to Exercise 17.39:
P(44 < X < 64) = P((44 − 43.4118)/11.5833 < Z < (64 − 43.4118)/11.5833) = P(0.05 < Z < 1.78) ≈ 0.4426

17.42. There is no clear deviation from Normality apart from granularity due to the limited
accuracy of the recorded measurements.

[Figures: Normal quantile plots for Exercises 17.42 (phantom measurement) and 17.43 (distance measurement).]

17.43. The limited precision of the measurements shows up in the granularity (stair-step
appearance) of the graph. Aside from this, there is no particular departure from Normality.

17.44. The standard deviation of all 60 weights is s ≈ 0.0224 lb, and the mean is x ≈ 1.02996 lb (the same as x, except for rounding error). The natural tolerances are x ± 3s = 0.9627 to 1.0972 lb.

17.45. The quantile plot, while not perfectly linear, does not suggest any serious deviations from Normality, so the natural tolerances should be reasonably trustworthy.

[Figure: Normal quantile plot of weight (lb) for Exercise 17.45.]

17.46. (a) For the 21 samples, we have s ≈ 0.2786, so σ̂ = s/c4 = 0.2786/0.9213 ≈ 0.3024; the center line is s, and the control limits are B5σ̂ = 0 and B6σ̂ = (2.088)(0.3024) ≈ 0.6313. Short-term variation seems to be in control. (b) For the x chart, the center line is 0 and the control limits are ±3σ̂/√4 = ±0.4536. The x chart suggests that the process mean has drifted. (Only the first four out-of-control points are marked.) One possible cause for the increase in the mean is that the machine that makes the bearings is gradually drifting out of adjustment.

[Figures: s chart and x chart for Exercise 17.46; out-of-control points are marked with an “x.”]

17.47. (a) (ii) A sudden change in the x chart: This would immediately increase the amount of
time required to complete the checks. (b) (i) A sudden change (decrease) in s or R because
the new measurement system will remove (or decrease) the variability introduced by human
error. (c) (iii) A gradual drift in the x chart (presumably a drift up, if the variable being
tracked is the length of time to complete a set of invoices).

17.49. The process is no longer the same as it was during the downward trend (from the
1950s into the 1980s). In particular, including those years in the data used to establish the
control limits results in a mean that is too high to use for current winning times, and a
standard deviation that includes variation attributable to the “special cause” of the changing
conditioning and professional status of the best runners. Such special cause variation should
not be included in a control chart.

17.50. The center line is 181 pounds and the control limits are 181 ± 3.2 = 177.8 and 184.2 pounds. The first four points are above the upper control limit; there are no runs (above or below the center line) longer than five. The overall impression is that Joe’s weight returns to being “in control”; it decreases fairly steadily, and the last 12 points are between the control limits.

[Figure: control chart of Joe’s weight (pounds) over 16 weeks.]

17.51. LSL and USL are specification limits on the individual observations. This means that
they do not apply to averages and that they are specified as desired output levels, rather than
being computed based on observation of the process. LCL and UCL are control limits for
the averages of samples drawn from the process. They may be determined from past data,
or independently specified, but the main distinction is that the purpose of control limits is to
detect whether the process is functioning “as usual,” while specification limits are used to
determine what percentage of the outputs meet certain specifications (are acceptable for use).

17.52. In each graph below, the large tick marks are 3σ apart, the smaller tick marks are 1σ apart, and the target is marked as “T.” Because Cp = 1.5, the specification limits are 9σ apart, located at T ± 4.5σ. The first two graphs could be flipped (i.e., the peak of the curve could be closer to the LSL than the USL). (a) Cpk = 0.5 means that the nearer specification limit is (0.5)(3σ) = 1.5σ above (or below) the mean. (b) Cpk = 1.0 means that the nearer specification limit is 3σ above (or below) the mean. (c) Cpk = 1.5 means that the nearer specification limit is (1.5)(3σ) = 4.5σ above (or below) the mean—so that µ falls exactly on the target (halfway between the specification limits).
Note: At the end of Example 17.16, the text notes that Cp = Cpk means “the process is properly centered”—that is, µ equals the target.

[Figures: three Normal curves with LSL, target T, and USL marked, for parts (a), (b), and (c).]

17.53. For computing Ĉpk, note that the estimated process mean (2750.7 mm) lies closer to the USL. (a) Ĉp = (4000 − 1000)/(6 × 383.8) ≈ 1.3028 and Ĉpk = (4000 − 2750.7)/(3 × 383.8) ≈ 1.0850. (b) Ĉp = (3500 − 1500)/(6 × 383.8) ≈ 0.8685 and Ĉpk = (3500 − 2750.7)/(3 × 383.8) ≈ 0.6508.
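The capability indices used here and in the next several exercises follow the standard formulas; a sketch (truncating Cpk at 0 when the mean falls outside the specification limits, matching the convention used in Exercise 17.59):

```python
def capability(mu, sigma, lsl, usl):
    """Estimate Cp = (USL - LSL)/(6 sigma) and
    Cpk = min(USL - mu, mu - LSL)/(3 sigma), floored at 0."""
    cp = (usl - lsl) / (6 * sigma)
    cpk = max(0.0, min(usl - mu, mu - lsl) / (3 * sigma))
    return cp, cpk

print(capability(2750.7, 383.8, 1000, 4000))  # about (1.30, 1.09) -- (a)
print(capability(2750.7, 383.8, 1500, 3500))  # about (0.87, 0.65) -- (b)
```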

17.54. (a) With the original specifications, Ĉp ≈ 1.3028 (unchanged from the previous exercise, because Ĉp does not depend on µ) and Ĉpk = (4000 − 2500)/(3 × 383.8) ≈ 1.3028. (b) Once again, Ĉp ≈ 0.8685 is unchanged. Ĉpk = (3500 − 2500)/(3 × 383.8) ≈ 0.8685.

17.55. In the solution to Exercise 17.44, we found that the mean and standard deviation of all 60 weights are x ≈ 1.02996 lb and s ≈ 0.0224 lb. (a) Ĉp = (1.10 − 0.94)/(6 × 0.0224) ≈ 1.1901 and Ĉpk = (1.10 − 1.03)/(3 × 0.0224) ≈ 1.0418. (These were computed with the unrounded values of x and s; rounding will produce slightly different results.) (b) Customers typically will not complain about a package that was too heavy.

17.56. A change to the process mean would not change Ĉp, but we could increase Ĉpk by centering the process mean between the specification limits, at µ = (1.10 + 0.94)/2 = 1.02 lb. With that change, Ĉpk increases to 1.1901 (the same as Ĉp before the change).
Note: The effect of this change is hard to predict if we suspect that the weight measurements are non-Normal, but the data do not suggest any such problems (see the solution to Exercise 17.45). Additionally, decreasing the process mean might have the undesirable effect of increasing customer dissatisfaction (see part (b) of the previous exercise).
17.57. (a) Cpk = (0.75 − 0.25)/(3σ) ≈ 0.5767. 50% of the output meets the specifications. (b) LSL and USL are 0.865 standard deviations above and below the mean, so the proportion meeting specifications is P(−0.865 < Z < 0.865) ≈ 0.6130. (c) The relationship between Cpk and the proportion of the output meeting specifications depends on the shape of the distribution.

[Figure: the output distribution for Exercise 17.57 on a 0-to-1 scale, with LSL = 0.25 and USL = 0.75 marked.]

17.58. In the solution to Exercise 17.31, we found σ̂ = s/c4 ≈ 375.0; from this, we compute Ĉpk = (3500 − 2500)/(3 × 375.0) ≈ 0.8889, which is larger than the previous value (0.8685).

17.59. See also the solution to Exercise 17.43. (a) Use the mean and standard deviation of the 85 remaining observations: µ̂ = x = 43.4118 and σ̂ = s = 11.5833. (b) Ĉp = 20/(6σ̂) ≈ 0.2878 and Ĉpk = 0 (because µ̂ is outside the specification limits). This process has very poor capability: The mean is too low and the spread too great. Only about 44% of the process output meets specifications.

[Figure: the output distribution with LSL = 44 and USL = 64 marked; the horizontal scale runs from 8.7 to 78.2 in steps of σ̂.]

17.60. See also the solution to Exercise 17.34. (a) About 89.5%: For the 120 observations in Table 17.7, we find µ̂ = x = $6442.4 and σ̂ = s = $811.53. Therefore, we estimate P($4500 < X < $7500) = P(−2.39 < Z < 1.30) = 0.9032 − 0.0084 = 0.8948. (b) Ĉp = (7500 − 4500)/(6 × 811.53) ≈ 0.6161. (c) Ĉpk = (7500 − 6442.4)/(3 × 811.53) ≈ 0.4344.

17.61. We have x = 22.005 mm and s = 0.009 mm, so we assume that an individual bearing diameter X follows a N(22.005, 0.009) distribution. (a) About 85.3% meet specifications:
P(21.985 < X < 22.015) = P((21.985 − 22.005)/0.009 < Z < (22.015 − 22.005)/0.009) = P(−2.22 < Z < 1.11) = 0.8665 − 0.0132 = 0.8533
(b) Ĉpk = (22.015 − 22.005)/(3 × 0.009) ≈ 0.3704.

17.62. (a) This is unlikely to have any beneficial effect; it would result in more frequent
adjustments, but these would often be unnecessary and so might degrade capability. Control
limits are for correcting special-cause variation, not common-cause variation. (b) If some of
the nonconforming bearings are due to operator error, further training may have the effect of
reducing σ and increasing Cpk . Part (d) offers a slightly different viewpoint. (c) Assuming
the new machine has less variability (smaller σ ), this should improve the process capability.
(d) The number of nonconforming bearings produced by an operator is (for the most part)
a result of random variation within the system; no incentive can cause the operator to do
better than the system allows. (e) Better raw material should (presumably) result in better
product, so this should improve the capability.

17.63. This graph shows a process with Normal output and Cp = 2. The tick marks are σ units apart; this is called “six-sigma quality” because the specification limits are (at least) six standard deviations above and below the mean.

[Figure: a Normal curve with LSL and USL six standard deviations below and above the mean.]

17.64. (a) The graph below shows the mean shifted toward the USL; it could also be shifted toward the LSL. As in the graph in the previous problem, tick marks are σ units apart. (b) Cpk = 4.5σ/(3σ) = 1.5. Six-sigma quality does not mean that Cpk ≥ 2; the latter is a stronger requirement. (c) The desired probability is 1 − P(−7.5 < Z < 4.5), for which software gives 3.4 × 10⁻⁶, or about 3.4 out-of-spec parts per million.

[Figure: a Normal curve with the mean shifted 1.5σ toward the USL.]

17.65. Students will have varying justifications for the sampling choice. Choosing six calls
per shift gives an idea of the variability and mean for the shift as a whole. If we took
six consecutive calls (at a randomly chosen time), we might see additional variability
in x because sometimes those six calls might be observed at particularly busy times
(when a customer has to wait for a long time until a representative is available or when a
representative is using the restroom).

17.66. (a) For n = 6, we have c4 = 0.9515, B5 = 0.029, and B6 = 1.874. With s = 29.985 seconds, we compute σ̂ = s/c4 ≈ 31.5134 seconds, so the initial s chart has center line s and control limits B5σ̂ ≈ 0.9139 and B6σ̂ ≈ 59.0561 seconds. There are four out-of-control points, from samples 28, 39, 42, and 46. (b) With the remaining 46 samples, s = 24.3015, so σ̂ = s/c4 ≈ 25.54 seconds, and the control limits are B5σ̂ ≈ 0.741 and B6σ̂ ≈ 47.86 seconds. There are no more out-of-control points. (The second s chart is not shown.) (c) We have center line x = 29.2087 seconds, and control limits x ± 3σ̂/√6 = −2.072 and 60.489 seconds. (The lower control limit should be ignored or changed to 0.) The x chart has no out-of-control points.

[Figures: s chart and x chart for Exercise 17.66; out-of-control s values are marked with an “x.”]

17.67. The outliers are 276 seconds (sample 28), 244 seconds (sample 42), and 333 seconds
(sample 46). After dropping those outliers, the standard deviations drop to 9.284, 6.708, and
31.011 seconds. (Sample #39, the other out-of-control point, has two moderately large times,
144 and 109 seconds; if they are removed, s drops to 3.416.)

17.68. For those 10 days, there were 961 absences and 10 · 987 = 9870 person-days available for work, so p = 961/9870 ≈ 0.09737, and:
CL = p = 0.09737, control limits: p ± 3√(p(1 − p)/987) = 0.06906 and 0.12567

17.69. (a) For those 10 months, there were 1028 overdue invoices out of 28,400 total invoices (opportunities), so p = 1028/28,400 ≈ 0.03620. (b) The center line and control limits are:
CL = p = 0.03620, control limits: p ± 3√(p(1 − p)/2840) = 0.02568 and 0.04671
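The p chart limits in this group of exercises all use p ± 3√(p(1 − p)/n), with a negative LCL truncated to 0; a sketch:

```python
import math

def p_chart(p_bar, n):
    """Center line and 3-sigma limits for a p chart based on samples
    of size n; a negative LCL is replaced by 0."""
    margin = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - margin), p_bar, p_bar + margin

print(p_chart(961 / 9870, 987))     # about (0.069, 0.097, 0.126) -- 17.68
print(p_chart(1028 / 28400, 2840))  # about (0.026, 0.036, 0.047) -- 17.69
```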

17.70. Based on 3.22 complaints per 1000 passengers, the center line is p = 3.22/1000 = 0.00322, and the control limits are p ± 3√(p(1 − p)/2500), which means about −0.000179 and 0.006619. As the problem says, we take LCL = 0.

17.71. The center line is at the historical rate (0.0189); the control limits are 0.0189 ± 3√(0.0189 · 0.9811/500), which means about 0.00063 and 0.03717.

17.72. For both operators, the center line is 0.0189. For the first operator, the control limits are those found in the previous solution: 0.00063 and 0.03717. For the second operator, the control limits are 0.0189 ± 3√(0.0189 · 0.9811/400), which yields −0.00153 (use 0) and 0.03933.

[Figure: p chart of the proportion of damaged eggs by hour for the two operators.]

Note: We could simplify this control chart with a couple of practical observations. First, we would not be concerned if the proportion of broken eggs were too low, so we could take the first operator’s LCL to be 0. In addition, for the first (second) operator, the hourly proportion will be a multiple of 0.002 (0.0025), so it will exceed the UCL if p̂ ≥ 0.038 (p̂ ≥ 0.04).

17.73. The center line is at p = 163/36,480 ≈ 0.004468; the control limits should be at p ± 3√(p(1 − p)/1520), which means about −0.00066 (use 0) and 0.0096.

17.74. The initial center line and control limits are:
CL = p = 0.01, control limits: p ± 3√(p(1 − p)/90,000) = 0.009005 and 0.010995
On a day when only 45,000 prescriptions are filled, the center line is unchanged, while the control limits change to:
p ± 3√(p(1 − p)/45,000) = 0.008593 and 0.011407

17.75. (a) The student counts sum to 9218, while the absentee total is 3277, so p = 3277/9218 ≈ 0.3555 and n̄ = 921.8. (b) The center line is p = 0.3555, and the control limits are p ± 3√(p(1 − p)/921.8) = 0.3082 and 0.4028. The p chart suggests that absentee rates are in control. (c) For October, the limits are 0.3088 and 0.4022; for June, they are 0.3072 and 0.4038. These limits appear as solid lines on the p chart, but they are not substantially different from the control limits found in (b). Unless n varies a lot from sample to sample, it is sufficient to use n̄.

[Figure: p chart of monthly absentee rates, with the month-specific limits drawn as solid lines.]

17.76. (a) p = 3.5/1,000,000 = 0.0000035. At 5000 pieces per day, we expect 0.0175 defects per day; in a 24-day month, we would expect 0.42 defects. (b) The center line is 0.0000035; assuming that every day we examine all 5000 pieces, the LCL is negative (so we use 0), and the UCL is 0.000083. (c) Note that most of the time, we will find 0 defects, so that p̂ = 0. If we should ever find even one defect, we would have p̂ = 0.0002, and the process would be out of control. On top of this, it takes an absurd amount of testing in order to catch the rare defect.

17.77. (a) p = 8000/1,000,000 = 0.008. We expect about (500)(0.008) = 4 defective orders per month. (b) The center line and control limits are:
CL = p = 0.008, control limits: p ± 3√(p(1 − p)/500) = −0.00395 and 0.01995
(We take the lower control limit to be 0.) It takes at least ten bad orders in a month to be out of control because (500)(0.01995) = 9.975.

17.78. Control charts focus on ensuring that the process is consistent, not that the product
is good. An in-control process may consistently produce some percentage of low-quality
products. Keeping a process in control allows one to detect shifts in the distribution of the
output (which may have been caused by some correctable error); it does not help in fixing
problems that are inherent to the process.

17.79. (a) The percents do not add to 100% because one customer might have several complaints; that is, he or she could be counted in several categories. (b) Clearly, top priority should be given to the process of creating, correcting, and adjusting invoices, as the three most common complaints involved invoices.

[Figure: Pareto chart of the frequency (percent) of each complaint category.]

17.80. (a) Use x and s charts to track the time required. (b) Use a p chart to track the
acceptance percentage. (c) Use x and s charts to track the thickness. (d) Use a p chart to
track the proportion of dropped calls.

17.81. On one level, these two events are similar: Points below the LCL on an x (s) chart
suggest that the process mean (standard deviation) may have decreased. The difference is in
the implications of such a decrease (if not due to a special cause). For the mean, a decrease
might signal a need to recalibrate the process in order to keep meeting specifications (that
is, to bring the process back into control). A decrease in the standard deviation, on the other
hand, typically does not indicate that adjustment or recalibration is necessary, but it will
require re-computation of the x chart control limits.

17.82. This situation calls for a p chart with center line p = 6/1000 = 0.006 and control limits p ± 3√(p(1 − p)/350) = 0.006 ± 0.01238. We take LCL = 0, and the UCL is 0.0184. (In order to exceed this UCL, we would need to reject at least 7 of the 350 lots.)

17.83. We find that s = 7.65, so with c4 = 0.8862 and B6 = 2.276, we compute σ̂ = 8.63 and UCL = 19.65. One point (from sample #1) is out of control. (And, if that cause were determined and the point removed, a new chart would have s for sample #10 out of control.) The second (lower) UCL line on the control chart is the final UCL, after removing both of those samples (per the instructions in Exercise 17.84).

[Figure: s chart showing the UCL for Exercise 17.83 and the lower, final UCL for Exercise 17.84.]

17.84. Without samples 1 and 10, s = 6.465, σ̂ = s/c4 ≈ 7.295, and the new UCL is 2.276σ̂ = 16.60; this line is shown on the control chart in the solution to the previous problem. Meanwhile, x = 834.5, and the control limits are x ± 3σ̂/√3 = 821.86 to 847.14. The x chart gives no indication of trouble—the process seems to be in control.

[Figure: x chart for Exercise 17.84; all points lie within the control limits.]

17.85. (a) As was found in the previous exercise, σ̂ = s/c4 ≈ 7.295. Therefore, Cp = 50/(6σ̂) ≈ 1.1423. This is a fairly small value of Cp; the specification limits are just barely wider than the 6σ̂ width of the process distribution, so if the mean wanders too far from 830, the capability will drop. (b) If we adjust the mean to be close to 830 mm × 10⁻⁴ (the center of the specification limits), we will maximize Cpk. Cpk is more useful when the mean is not in the center of the specification limits. (c) The value of σ̂ used for determining Cp was estimated from the values of s from our control samples. These are for estimating short-term variation (within those samples) rather than the overall process variation. To get a better estimate of the latter, we should instead compute the standard deviation s of the individual measurements used to obtain the means and standard deviations given in Table 17.11 (specifically, the 60 measurements remaining after dropping samples 1 and 10). These numbers are not available. (See “How to cheat on Cpk” on page 17-44 of Chapter 17.)

17.86. About 99.94%: With σ̂ = 7.295 and mean 830, we compute P(805 < X < 855) = P(−3.43 < Z < 3.43) = 0.9994.

17.87. (a) Use a p chart, with center line p = 15/5000 = 0.003 and control limits p ± 3√(p(1 − p)/100), or 0 to 0.0194. (b) There is little useful information to be gained from keeping a p chart: If the proportion remains at 0.003, about 74% of samples will yield a proportion of 0, and about 22% of proportions will be 0.01. To call the process out of control, we would need to see two or more unsatisfactory films in a sample of 100.
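The 74% and 22% figures in part (b) are Binomial(100, 0.003) probabilities, which can be checked directly; a sketch:

```python
from scipy.stats import binom

# Counts of unsatisfactory films in a sample of 100 when p = 0.003.
print(binom.pmf(0, 100, 0.003))  # about 0.74 -- sample proportion 0
print(binom.pmf(1, 100, 0.003))  # about 0.22 -- sample proportion 0.01
```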

17.88. Assuming x is (approximately) Normally distributed, the probability that it would fall within the 1σ level is about 0.68, so the probability that it does this 15 times is about 0.68¹⁵ ≈ 0.0031.

17.89. Several interpretations of this problem are possible, but for most reasonable interpretations, the probability is about 0.3%. From the description, it seems reasonable to assume that all three points are inside the control limits; otherwise, the one-point-out rule would take effect. Furthermore, the phrase “two out of three” could be taken to mean either “exactly two out of three,” or “at least two out of three.” (Given what we are trying to detect, the latter makes more sense, but students may have other ideas.)
For the kth point, we name the following events:

• Ak = “that point is no more than 2σ/√n from the center line,”
• Bk = “that point is 2 to 3 standard errors from the center line.”

For an in-control process, P(Ak) = 95% (or 95.45%) and P(Bk) = 4.7% (or 4.28%). The first given probability is based on the 68–95–99.7 rule; the second probability (in parentheses) comes from Table A or software.
Note that, for example, the probability that the first point gives no cause for concern, but the second and third are more than 2σ/√n from, and on the same side of, the center line, would be:
(1/2) P(A1 ∩ B2 ∩ B3) ≈ 0.10% (or 0.09%)
(The factor of 1/2 accounts for the second and third points being on the same side of the center line.) If the “other” point is the second or third point, this probability is the same, so if we interpret “two out of three” as meaning “exactly two out of three,” then the total probability is three times the above number:
P(false out-of-control signal from an in-control process) ≈ 0.31% (or 0.26%)
With the (more-reasonable) interpretation “at least two out of three”:
P(false out-of-control signal) = (1/2)P(A1 ∩ B2 ∩ B3) + (1/2)P(B1 ∩ A2 ∩ B3) + (1/2)P(B1 ∩ B2) ≈ 0.32% (or 0.27%)
