
R - (2017) Understanding and Applying Basic Statistical Methods Using R (Wilcox - R - R) (Sols.)

The document provides answers and R code for exercises related to understanding and applying basic statistical methods using R. It includes summaries of key concepts like measures of center, variability, outliers, and distributions. The exercises cover topics such as computing means, medians, standard deviations, identifying outliers, and understanding properties of statistical measures. R code solutions are provided for each exercise to demonstrate how to obtain the answers using R.


Wilcox Solutions

UNDERSTANDING AND APPLYING BASIC


STATISTICAL METHODS USING R:
ANSWERS TO THE EXERCISES

Chapter 1

1.

> x=c(-20, -15,-5, 8, 12, 9, 2, 23, 19) # Store the values in x


> sum(x) # Add the values in x.

[1] 33

2.

> mean(x)

[1] 3.666667

> # The R function mean is the built-in function for computing the
> # average, which is called the sample mean.

3. sum(x)/length(x) # Add the values in x and divide by the number of values in x,
which is determined with the R command length.
4.

> x=c(-20, -15,-5, 8, 12, 9, 2, 23, 19)


> sum(x[x>0]) # sum the values in x that are greater than 0.

[1] 73

5.

> x=c(-20, -15,-5, 8, 12, 9, 2, 23, 19)


> id=which(x==max(x)) # determine where in x the
> #values equal to the maximum are stored.
> #Here, id=8. That is x[8] contains the maximum value.
> # Because id=8, the next command says to ignore the value in x[8]
> # and compute the average (the sample mean)
> # of the remaining values.
> mean(x[-id])

[1] 1.25

6.

> x=c(-20, -15,-5, 8, 12, 9, 2, 23, 19)


> # The next command returns the values that
> # satisfy two conditions: |x|>= 8 and x<8.
> x[abs(x)>=8 & x<8]

[1] -20 -15

7.

> x=c(23, 18, 29, 22, 24, 27, 28, 19, 28, 23)
> mean(x)

[1] 24.1

> min(x)

[1] 18

> max(x)

[1] 29

8.

> y=c(2,4,8)
> z=c(1,5,2)
> 2*y

[1] 4 8 16

> y+z

[1] 3 9 10

> y-2

[1] 0 2 6

9.

> x = c(1, 8, 2, 6, 3, 8, 5, 5, 5, 5)
> sum(x)/length(x)

[1] 4.8

> x-4 # subtract 4 from each value in x

[1] -3 4 -2 2 -1 4 1 1 1 1

> max(x)-min(x) #Computes the difference between the largest and smallest values.

[1] 7

> # Or use the range command to compute the difference
> # between the largest and smallest values:
> range(x)[2]-range(x)[1]

[1] 7

10.
> x = c(1, 8, 2, 6, 3, 8, 5, 5, 5, 5)
> sum(x-mean(x))

[1] 1.776357e-15

The exact value is zero but, due to rounding error, R returns a value that is not exactly
equal to zero.
11.
id=!is.na(m[,1])
m[id,]
The first command identifies which rows in m are not missing based on the data in column
1. The second command returns the values in m ignoring any rows with NA in column 1.
12.
> mean(ChickWeight[,1])

[1] 121.8183

> mean(ChickWeight[,3])

[1] NA

> mean(as.numeric(ChickWeight[,3]))

[1] 26.25952

> chk=as.numeric(ChickWeight[,3]) # convert to numeric


> is.numeric(chk) #This verifies that values chk are numeric

[1] TRUE

13.
> m=matrix(c(3,5,6,NA,12, 23,NA,19,34,7),ncol=2)
> m

[,1] [,2]
[1,] 3 23
[2,] 5 NA
[3,] 6 19
[4,] NA 34
[5,] 12 7

elimna(m)
returns

[,1] [,2]
[1,] 3 23
[2,] 6 19
[3,] 12 7

14. The following R commands accomplish this goal.

> flag=chickwts[,2]=='horsebean'
> mean(chickwts[flag,1])

[1] 160.2

15.

> x=c(1, 8, 2, 6, 3, 8, 5, 5, 5, 5)
> sum(x[c(-3,-5)])

[1] 43

> sum(x[c(1,2,4,6,7,8,9,10)])

[1] 43

16.

> x=c(1, 8, 2, 6, 3, 8, 5, 5, 5, 5)
> sum(x[x!=5])

[1] 28

> sum(x[1:6])

[1] 28

17.

> x=c(1, 8, 2, 6, 3, 8, 5, 5, 5, 5)
> x[x==8]=7
> x

[1] 1 7 2 6 3 7 5 5 5 5

18.
> matrix(c(1:8),nrow=4)

[,1] [,2]
[1,] 1 5
[2,] 2 6
[3,] 3 7
[4,] 4 8

19.
> matrix(c(1:4,11:14),nrow=2,byrow=TRUE)

[,1] [,2] [,3] [,4]


[1,] 1 2 3 4
[2,] 11 12 13 14

Chapter 2
1. Here are the answers: (a) 22, (b) 2, (c) 20, (d) 484, (e) 27, (f) −41, (g) 2, (h) 220, (i)
12, (j) 54. Here are the R commands to get these answers:
> x=c(1,3,0,-2,4,-1,5,2,10)
> sum(x) # a

[1] 22

> sum(x[3:5]) #b

[1] 2

> sum(x[1:4]^3) #c

[1] 20

> sum(x)^2 # d

[1] 484

> 3*9 # e

[1] 27

> sum(x-7) # f

[1] -41

> 3*sum(x[1:5])-sum(x[6:9]) # g

[1] 2

> sum(10*x) #h

[1] 220

> sum(c(2:6)*x[2:6]) #i

[1] 12

> 6*9 # j

[1] 54

2. (a) ∑ Xi /i, (b) ∑ Ui^i, (c) (∑ Yi )^4.
3.
> x=c(1:3) # Store the values 1, 2 and 3 in x
> sum(x^2) # square the values in x and sum the results

[1] 14

> sum(x)^2 # sum the values in x and square the results.

[1] 36

4. (a) X̄ = −0.2, M = 0, (b)X̄ = 186.1667, M = 6.5. Using R:


> x=c(-1, 0, 3, 0, 2, -5)
> mean(x)

[1] -0.1666667

> median(x)

[1] 0

> x=c(2, 2, 3, 10, 100, 1000)


> mean(x)

[1] 186.1667

> median(x)

[1] 6.5

5. X̄ = 83.93 X̄t = 83.1, M = 80.5. Using R:

> x=c(73, 74, 92, 98, 100, 72, 74, 85, 76, 94, 89, 73, 76, 99)
> mean(x)

[1] 83.92857

> mean(x,tr=0.2) #tr=0.2 says to use 20% trimming

[1] 83.1

> median(x)

[1] 80.5
6. Because X̄ = ∑ Xi /n, nX̄ = ∑ Xi , so the answer is 23(14.7) = 338.1.
7. (a) 30.6, (b) 120.6, (c) 1020.6. Using R:

> x=c(3, 6, 8, 12, 23, 26, 37, 42, 49, 63)


> mean(x)

[1] 26.9

> x[10]=100
> mean(x)

[1] 30.6

> x[10]=1000
> mean(x)

[1] 120.6

These results illustrate that the sample mean is not resistant to outliers.
8. 24.5 in all three cases. This illustrates that the median is highly resistant to outliers.
9. One.
10. The minimum number of values that must be altered to make the 20% trimmed
mean arbitrarily large is g + 1, where g is 0.2n rounded down to the nearest integer. For the
median, about half of the values must be altered to make it arbitrarily large.
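The g + 1 claim can be checked numerically with base R's trimmed mean. This is a sketch using a small hypothetical sample with n = 9, so g = 1; altering one value has no effect, but altering g + 1 = 2 values does:

```r
x <- c(21, 36, 42, 24, 25, 36, 35, 49, 32)  # n = 9, so g = floor(0.2 * 9) = 1
x1 <- x; x1[x1 == 49] <- 10^6               # alter the g = 1 largest value
x2 <- x1; x2[x2 == 42] <- 10^6              # alter g + 1 = 2 values
mean(x1, tr = 0.2)  # unchanged: the inflated value is trimmed away
mean(x2, tr = 0.2)  # now arbitrarily large
```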
11. Putting the values in ascending order yields:
-19 -10 -7 -6 -3 -1 0 1 1 12 12 23.
The sample size is n = 12, so n/4 + 5/12 = 12/4 + 5/12 = 3.416667. So j = 3, h = 0.41667,
q1 = (1 − 0.41667)(−7) + 0.41667(−6) = −6.58
and

q2 = (1 − 0.41667)(12) + 0.41667(1) = 7.4

Using R:
x=c(0, 23, -1, 12, -10, -7, 1, -19, -6, 12, 1, -3)
idealf(x)
returns

$ql
[1] -6.583333

$qu
[1] 7.416667

This assumes that software written for the book, stored in the file Rallfun, has been installed
as described in Section 1.4.
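If Rallfun is not available, the ideal fourths can be reproduced in base R by following the j and h recipe above. This is a sketch, not the book's idealf function:

```r
x <- c(0, 23, -1, 12, -10, -7, 1, -19, -6, 12, 1, -3)
xs <- sort(x)
n <- length(xs)
j <- floor(n / 4 + 5 / 12)  # j = 3
h <- n / 4 + 5 / 12 - j     # h = 0.41667
ql <- (1 - h) * xs[j] + h * xs[j + 1]          # lower ideal fourth
qu <- (1 - h) * xs[n - j + 1] + h * xs[n - j]  # upper ideal fourth
c(ql, qu)  # -6.583333  7.416667
```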
12. Putting the values in ascending order yields
-12 -10 -7 -6 -6 -2 -1 -1 2 2 3 3 6 8 12
n = 15, 15/4 + 5/12 = 4.166667, so j = 4, k = n − j + 1 = 12, h = 0.16667, X(4) = −6,
X(5) = −6, X(11) = 3, X(12) = 3,
q1 = 0.16667(−6) + (1 − 0.16667)(−6) = −6 and
q2 = 0.16667(3) + (1 − 0.16667)(3) = 3. Using R:
x=c(-1, -10, 2, 2, -7, -2, 3, 3, -6, 12, -1, -12, -6, 8, 6)
idealf(x) returns

$ql
[1] -6

$qu
[1] 3

13. About a fourth.


14. The first requirement for a measure of location is that its value is between X(1) and
X(n) inclusive. That is, its value must be greater than or equal to X(1) , the smallest observed
value, and less than or equal to X(n) , the largest value. Clearly X(1) satisfies this property.
If we multiply all of the values by any constant c, then in particular X(1) becomes cX(1) . If
c is added to all of the values, then in particular X(1) becomes X(1) + c. So X(1) satisfies the
definition of a measure of location.
15.

> x=c(12, 6, 15, 3, 12, 6, 21, 15, 18 , 12)


> sum(x-mean(x))

[1] 0

16. Range=18, s2 = 32, s = 5.66. Using R:

> x=c(12, 6, 15, 3, 12, 6, 21, 15, 18 , 12)
> max(x)-min(x)

[1] 18

> var(x)

[1] 32

> sd(x)

[1] 5.656854

17. Note that nX̄ = ∑ Xi . So ∑(Xi − X̄) = (∑ Xi ) − nX̄ = 0.
18. s = 0.37.

> x=c(-4.10, -4.13,-5.09, -4.08, -4.10, -4.09, -4.12)


> sd(x)

[1] 0.3733121

19. s = 11.58.

> x=c(280, 295, 275, 305, 300, 290)


> sd(x)

[1] 11.58303

20. 20 is an outlier. Using R:


x=c(20, 121, 132, 123, 145, 151, 119, 133, 134, 130, 200)
outms(x)$out.value
returns 20, assuming the Rallfun file has been installed as described in Section 1.4.
21. Both 20 and 200 are outliers.
x=c(20, 121, 132, 123, 145, 151, 119, 133, 134, 130, 200)
outbox(x)$out.val
returns 20 and 200. And
outpro(x)
returns the same results. This differs from the classic rule due to masking. The very presence
of outliers can inflate the standard deviation to the point that outliers are missed based on
the classic outlier detection rule.
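The functions outms, outbox and outpro come from Rallfun; the contrast between the classic rule and the boxplot rule can be sketched in base R. Here quantile() stands in for the ideal fourths, so the fences may differ slightly from outbox, though the flagged values agree for these data:

```r
x <- c(20, 121, 132, 123, 145, 151, 119, 133, 134, 130, 200)
# Classic rule: flag values more than 2 standard deviations from the mean
classic <- x[abs(x - mean(x)) > 2 * sd(x)]
# Boxplot rule: flag values beyond 1.5 interquartile ranges from the quartiles
q <- quantile(x, c(0.25, 0.75))
fence <- 1.5 * (q[2] - q[1])
boxrule <- x[x < q[1] - fence | x > q[2] + fence]
classic  # 20 only: 200 is masked by the inflated standard deviation
boxrule  # 20 and 200
```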
22. Yes.
x=c(0, 121, 132, 123, 145, 151, 119, 133, 134, 130, 250)
outms(x)$out.value
returns 0 and 250.
23.
x=c(0, 121, 132, 123, 145, 151, 119, 133, 134, 130, 250)

outbox(x)$out.value
returns 0 and 250.
24.
x=c(20, 121, 132, 123, 145, 151, 119, 133, 134, 240, 250)
outms(x)$out.value
returns numeric(0), meaning that no outliers were found. The value of $n.out is 0.
25.
x=c(20, 121, 132, 123, 145, 151, 119, 133, 134, 240, 250)
outbox(x)$out.value
returns 20 240 250. That is, these three values are declared outliers.
26. Sometimes, even with two or more outliers, the classic rule might catch all of the
outliers, but the boxplot rule is better at avoiding masking.
27. 0.2 × n = 0.2 × 21 = 4.2, so g = 4, X̄t = 80.08.
28.

> x=c(21, 36, 42, 24, 25, 36, 35, 49, 32)
> mean(x)

[1] 33.33333

> mean(x,tr=0.2)

[1] 32.85714

> median(x)

[1] 35

29.
> x=c(21, 36, 42, 24, 25, 36, 35, 49, 32)
> mean(x)

[1] 33.33333

> mean(x,tr=0.2)

[1] 32.85714

> median(x)

[1] 35

30. Must alter at least 2. Only the largest observation is trimmed. So by increasing the
two largest observations, the 20% trimmed mean can be made arbitrarily large.
31. The minimum number of values needed to make the median greater than 1000 is 5.
Altering the four largest values does not impact the median. This illustrates that the mean
is least resistant to outliers and the median is the most resistant.
32.

> x=c(6, 3, 2, 7, 6, 5, 8, 9, 8, 11)
> mean(x)

[1] 6.5

> mean(x,0.2)

[1] 6.666667

> median(x)

[1] 6.5

33. Must alter at least g + 1, where g is the number of observations trimmed from both
tails. For 20% trimming, g = 0.2n rounded down to the nearest integer.
34. X̄ = 229.2, X̄t = 220.8, M = 221, X̄mm = 214.12. Using R:

> x=c(250, 220, 281, 247, 230, 209, 240, 160, 370, 274, 210, 204, 243, 251, 190,
+ 200, 130, 150, 177, 475, 221, 350, 224, 163, 272, 236, 200, 171, 98)
> mean(x)

[1] 229.1724

> mean(x,tr=.2)

[1] 220.7895

> median(x)

[1] 221

> mom(x)

[1] 214.12
35. The sample size is n = 9, so g = [0.2(9)] = 1, which means that the smallest value
is set equal to the smallest value not trimmed, which is 24. The largest value is set equal to
the largest value not trimmed, which is 42. So the Winsorized values are: 24, 24, 25, 32, 35,
36, 36, 42, 42.
36.
x=c(21, 36, 42, 24, 25, 36, 35, 49, 32)
winvar(x)
51.36
37. The sample variance would be expected to be larger because Winsorizing pulls in
extreme values resulting in less variability: s2 = 81, s2w = 51.4.
38. Yes, because Winsorizing pulls in extreme values. That is, the values are not as
spread out after Winsorizing.
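The winvar function is part of Rallfun; the Winsorize-then-var logic can be sketched in base R. The name winvar_base is made up for this sketch, which assumes g ≥ 1:

```r
winvar_base <- function(x, tr = 0.2) {
  xs <- sort(x)
  n <- length(xs)
  g <- floor(tr * n)              # number Winsorized in each tail
  xs[1:g] <- xs[g + 1]            # pull the g smallest values up
  xs[(n - g + 1):n] <- xs[n - g]  # pull the g largest values down
  var(xs)                         # ordinary sample variance of the Winsorized data
}
winvar_base(c(21, 36, 42, 24, 25, 36, 35, 49, 32))  # 51.36 (cf. Exercise 36)
```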
39.

> x=c(6, 3, 2, 7, 6, 5, 8, 9, 8, 11)
> var(x)

[1] 7.388889

winvar(x)

[1] 1.822222

40. The 20% Winsorized values are:


171 171 171 171 171 171 177 190 200 200 204 209 210 220 221 224
230 236 240 243 247 250 251 272 272 272 272 272 272
Computing the sample variance based on these Winsorized values yields the Winsorized sam-
ple variance: s2w = 1375.6
Using R:

> x=c(250, 220, 281, 247, 230, 209, 240, 160, 370, 274, 210, 204, 243, 251, 190,
+ 200, 130, 150, 177, 475, 221, 350, 224, 163, 272, 236, 200, 171, 98)

winvar(x)

[1] 1375.606

41. Using R:

> x=c(90, 76, 90, 64, 86, 51, 72, 90, 95, 78)

tmean(x)

[1] 82

winvar(x)

[1] 69.15556

Chapter 3
1. The mean is 2.85, the variance is 1.94697 and the standard deviation is 1.395.
Using R:

> x=c(1, 2, 3, 4, 5)
> fx=c(0.2, 0.3, 0.1, 0.25, 0.15) # Frequencies
> xbar=sum(x*fx)
> xbar

[1] 2.85

> VAR=sum((x-xbar)^2*fx)*100/99
> VAR

[1] 1.94697

> sd=sqrt(VAR) # standard deviation


> sd

[1] 1.395339

2. The mean is 2.52, the variance is 0.989, standard deviation = 0.995.


> x=c(1, 2, 3, 4)
> fx=c(0.2, 0.24, 0.4, 0.16) # Frequencies
> xbar=sum(x*fx)
> xbar

[1] 2.52

> VAR=sum((x-xbar)^2*fx)*50/49
> VAR

[1] 0.9893878

> sd=sqrt(VAR) # standard deviation


> sd

[1] 0.9946797

3. The mean is 3, the variance is 1.5, standard deviation = 1.22.


> x=c(0,1, 2, 3, 4, 5, 6)
> fx=c(0.015625, 0.093750, 0.234375, 0.312500, 0.234375, 0.093750,
+ 0.015625)
> xbar=sum(x*fx)
> xbar

[1] 3

> VAR=sum((x-xbar)^2*fx)*10000/9999
> VAR

[1] 1.50015

> sd=sqrt(VAR) # standard deviation


> sd

[1] 1.224806

4. The mean is 18.387, the variance is 85.04, standard deviation = 9.22.

> x=c(5, 10, 15, 20, 25, 50)


> f=c(20, 30, 10, 40, 50, 5)
> n=sum(f) # Determines the sample size by adding the frequencies
> fx=f/n # Relative frequencies.
> xbar=sum(x*fx)
> xbar

[1] 18.3871

> VAR=sum((x-xbar)^2*fx)*n/(n-1)
> VAR

[1] 85.04399

> SD=sqrt(VAR) # standard deviation


> SD

[1] 9.22193

5. The mean is 11.1, the variance is 42.3 and the standard deviation is 6.5.

> x=c(1, 5, 10, 20 )


> f=c(10, 20, 40, 30)
> n=sum(f) # Determines the sample size by adding the frequencies
> fx=f/n # Relative frequencies.
> xbar=sum(x*fx)
> xbar

[1] 11.1

> VAR=sum((x-xbar)^2*fx)*n/(n-1)
> VAR

[1] 42.31313

> SD=sqrt(VAR)
> SD

[1] 6.504854

6. The data for this exercise are stored on the author’s web page in the file ibtable2_1_dat.txt
and can be downloaded as described in Section 1.5.

> chol=scan(file='ibtable2_1_dat.txt')# Assuming the data are stored in
> # the main directory where R expects data.
> # Use file.choose if this is not the case. See Section 1.3.1.
> hist(chol,xlab='Change Cholesterol, Experimental Group')

[Histogram of chol; x-axis: Change Cholesterol, Experimental Group (−40 to 40); y-axis: Frequency (0 to 60)]

So the histogram might suggest that there are no outliers because there are no values
separated from the bulk of the data. But the R command

outbox(chol)

indicates that the values 25 and 31 are outliers and

outpro(chol)

indicates that the values 19, 20, 25 and 31 are outliers.

7. The data for this exercise are stored on the author’s web page in the file ibtable2_2_dat.txt
and can be downloaded as described in Section 1.5.

> chol=scan(file='ibtable2_2_dat.txt')# Placebo group
> # Assuming the data are in the
> # main directory where R expects data.
> # Use file.choose if this is not the case. See Section 1.3.
> hist(chol,xlab='Change Cholesterol, Placebo')

[Histogram of chol; x-axis: Change Cholesterol, Placebo (−40 to 80); y-axis: Frequency (0 to 60)]

So the histogram might suggest that there are two outliers because the values 68 and 71
are separated from the bulk of the data. The histogram might also suggest that there are no
outliers in the left tail. But the R command

outbox(chol)

indicates that the values -43, -36, -36, -35, 34, 39, 68, and 71 are outliers.

8. The data for this exercise are stored on the author’s web page in the file skull_height_dat.txt
and can be downloaded as described in Section 1.5.
> skull=c(121, 124, 129, 129, 130, 130, 131, 131, 132, 132, 132, 133, 133, 134, 134, 13
+ 136, 136, 136, 136, 137, 137, 138, 138, 138, 140, 143)
> hist(skull,xlab='Skull Height')

[Histogram of skull; x-axis: Skull Height (120 to 145); y-axis: Frequency (0 to 12)]

The R command
outbox(skull)
indicates that the value 121 is an outlier and the command
outpro(skull)
indicates that the values 121, 124 and 143 are outliers.
9. This can be verified with the R command
outms(skull)
assuming the data are still stored in the R variable skull.
But nevertheless, using the mean and standard deviation to detect outliers can result in
masking. That is, outliers can be missed due to the sensitivity of the mean and standard
deviation to outliers. Even extreme outliers can be missed.
10. The boxplot rule can declare values to be outliers that do not appear to be outliers
based on a histogram.
11. Use the R commands
exam=c(83, 69, 8, 72, 63, 88, 92, 81, 54, 57, 79, 84, 99, 74, 86, 71, 94, 71, 80, 51, 68, 81,
84, 92, 63, 99, 91)
stem(exam)
12. For the value 34.679 the leaf is 7, so the stem is 34.6, the values to the left of the leaf.
13. There would be only one stem.
14. The median=80, the lower and upper quartiles are 50 and 121, approximately, so
IQR=121-50=71. So any value greater than 121+1.5(71)=227.5 would be declared an outlier.
Here, the largest value not declared an outlier is 215.
15. As indicated in the answer to the previous exercise, values greater than 227.5 are
declared outliers. Values less than 50-1.5(71)= -56.5 are declared outliers as well.
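The fence arithmetic in Exercises 14 and 15 can be verified directly, using the approximate quartiles 50 and 121 stated above:

```r
q1 <- 50; q3 <- 121
iqr <- q3 - q1                                     # 71
c(lower = q1 - 1.5 * iqr, upper = q3 + 1.5 * iqr)  # -56.5 and 227.5
```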
16. Assuming the file film_dat.txt has been downloaded from the author’s web page,

> film=scan(file='film_dat.txt')
> boxplot(film)
[Boxplot of film; y-axis from 2 to 8]

An easy way to plot the relative frequencies is with the R command


splot(film)
17. Assuming the data file mismatch_dat.txt has been downloaded,

> mism=scan(file='mismatch_dat.txt')
> boxplot(mism)

[Boxplot of mism; y-axis from 0.0 to 3.0]

The R command
akerd(mism)
is one way of plotting the distribution using a kernel density estimator.
18. When the population histogram is symmetric and outliers are rare.
19. In some cases, 100 is sufficient but in others a much larger sample size is needed.
The difficulty is that it depends on the nature of the population distribution, which is not
known.
20. Generally, the boxplot is better than a histogram.
21. Not necessarily. Situations are encountered where the sample histogram is a poor
indication of the population histogram.

Chapter 4
1. No, the P (x) values do not sum to one.
2. No, can’t have a negative probability.
3. Yes, the values P (x) are all between 0 and 1 and they sum to one.
4. P (X < 3.4) = P (X ≤ 3) = 0.2 + 0.3 = 0.5.
5. 0, there are no values less than or equal to one.
6. P (X > 3) = P (X ≥ 4) = 0.4 + 0.1 = 0.5.

7. P (X ≥ 3) = 0.3 + 0.4 + 0.1 = 0.8.
8. 1-0.3=0.7
9. This corresponds to a binomial with probability of success p = 0.3 based on a single
trial. That is, n = 1. So the mean is np = 0.3 and the variance is np(1−p) = 0.3(0.7) = 0.21.
Alternatively, µ = ∑ xP (x) = 0(0.7) + 1(0.3) = 0.3. The variance is p(1 − p) = 0.21. Or
using R, the variance is

> x=c(0,1) # The sample space


> px=c(0.7,0.3) # the probabilities P(x)
> sum((x-0.3)^2*px)

[1] 0.21

The probability of getting a value less than the mean is P (X < 0.3) = P (X ≤ 0) = 0.7.
10. The sample space is 1, 2, 3, 4 and the corresponding probabilities are 0.2, 0.4, 0.3,
0.1. So the mean can be computed with R:
> x=c(1,2,3,4)
> px=c(0.2, 0.4, 0.3, 0.1)
> sum(x*px)

[1] 2.3

The variance is
> x=c(1,2,3,4)
> px=c(0.2, 0.4, 0.3, 0.1)
> sum((x-2.3)^2*px)

[1] 0.81

and the standard deviation is the square root of 0.81=0.9.


11. The mean=3.2, the variance is 1.76 and the standard deviation is 1.32665. Using R:
> x=c(1,2,3,4,5)
> px=c(0.2, 0.1, 0.1, 0.5, 0.1)
> xbar=sum(x*px)
> VAR=sum((x-xbar)^2*px)
> xbar

[1] 3.2

> VAR

[1] 1.76

> sqrt(VAR)

[1] 1.32665

12. We see that µ − σ = 3.2 − 1.33 = 1.87 and µ + σ = 3.2 + 1.33 = 4.53. But the only
possible values between 1.87 and 4.53 are 2, 3 and 4. So the answer is P (2) + P (3) + P (4) =
0.7.
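Using the sample space and probabilities from Exercise 11, this can be verified in R:

```r
x <- 1:5
px <- c(0.2, 0.1, 0.1, 0.5, 0.1)
mu <- sum(x * px)                         # 3.2
sigma <- sqrt(sum((x - mu)^2 * px))       # 1.32665
sum(px[x > mu - sigma & x < mu + sigma])  # P(2) + P(3) + P(4) = 0.7
```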
13. The mean=2, the standard deviation is 0.63.

> x=c(1,2,3)
> px=c(0.2, 0.6, 0.2)
> xbar=sum(x*px) # Mean
> SD=sqrt(sum((x-xbar)^2*px)) # Standard deviation
> xbar

[1] 2

> SD

[1] 0.6324555

14. Increase: the values are spread out more.

> x=c(0,2,4)
> px=c(0.2, 0.6, 0.2)
> mu=sum(x*px)
> sum((x-mu)^2*px)

[1] 1.6

which is greater than the variance in Exercise 13, which is 0.6324555^2 = 0.4.
15. The mean=3, the variance is 1.6. Using R:

> x=c(1,2,3,4,5)
> px=c( 0.15, 0.2, 0.3, 0.2, 0.15)
> xbar=sum(x*px)
> VAR=sum((x-xbar)^2*px)
> xbar

[1] 3

> VAR

[1] 1.6

Probability of being less than the mean is P (X < 3) = P (X ≤ 2) = 0.15 + 0.2 = 0.35.
16. Smaller because the most extreme values are less likely to occur.

> x=c(1:5)
> px=c(0.1, 0.25, 0.3, 0.25, 0.1)
> xbar=sum(x*px)
> VAR=sum((x-xbar)^2*px)
> VAR

[1] 1.3

17. Larger because the most extreme values are more likely to occur.

> x=c(1:5)
> px=c(rep(0.2,5))
> xbar=sum(x*px)
> VAR=sum((x-xbar)^2*px)
> VAR

[1] 2

18. (a) 0.030+0.180+0.090=0.3, (b) 0.03/0.3, (c) 0.09/0.3, (d) 0.108/0.18.


19. Independent. For example, P(Age < 30)=0.3, P(HIGH INCOME)=0.1 and 0.3(0.1)=0.03,
the corresponding cell probability. That is, the product rule holds. This can be verified for
the other cell probabilities in a similar manner.
20. (a) (757+496)/3398=1253/3398=0.3687463, (b) 757/1828=0.4141138, (c) 757/1253=0.60415,
(d) no, the product rule does not hold. (e) 757/3398 + 1074/3398=1831/3398=0.5388464,
(f) 1-757/3398=0.7772, (g) (757+496+1071)/3398= 0.6839317.
21. Yes: if the variance of the cost of a home depends on X, the crime rate, this can only
happen if the conditional probabilities change when you are told X.
22. Yes, knowing the value of X alters the probability associated with Y .
23. Yes, knowing X impacts the likelihood of the values of Y .
24. (a) P (X = 0) = P (X ≤ 0), which can be read from Table 2 and is equal to 0.006, (b)
P (X ≤ 3) is read directly from Table 2 and is 0.3823, (c) P (X < 3) = P (X ≤ 2)=0.1673,
(d) P (X > 4) = 1 − P (X ≤ 4)=1-0.633=0.367, (e) P(X≤ 5)-P(X ≤ 1)=0.7874. Or using R:

> pbinom(5,10,0.4)-pbinom(1,10,0.4)

[1] 0.787404

25. (a) P (0) = P (X ≤ 0) = 0.00475, (b) P (X ≤ 3) = 0.29687, (c) P (X < 3) = P (X ≤


2) = 0.1268277, (d) P (X > 4) = 1 − P (X ≤ 4) = 0.4845, (e) P (2 ≤ X ≤ 5) = P (X ≤
5) − P (X ≤ 1) = 0.686.
26. P(X ≤ 10)-P(X ≤ 9) = 0.1859. Using R:

> dbinom(10,15,0.6)

[1] 0.1859378

27. Table 2 in Appendix B does not contain entries for p = 0.35. So either use the
expression for the binomial probability function or use the R function dbinom. Using the
equation for the binomial probability function, the binomial coefficient corresponding to
exactly 2 successes in 7 trials is 7!/(2!5!) = 21. So the probability of exactly 2
successes is
> 21*0.35^2*0.65^5
[1] 0.2984848
Or using R:
> dbinom(2,7,0.35)
[1] 0.2984848
28. E(X)=np=18(0.6)=10.8 and the variance is np(1-p)=4.32.
29. E(X)=np=22(0.2)=4.4 and np(1-p)=3.52.
30. E(p̂) = p = 0.7, variance is p(1 − p)/n=0.0105
31. E(p̂) = p = 0.3, variance is p(1 − p)/n=0.007.
32. (a) With n = 10, P (p̂ ≤ 0.7) corresponds to P (X ≤ 7), where X is the number of
successes. From Table 2 in Appendix B, this yields 0.3222. (b) The strategy is to write the
probability so that Table 2 can be used: P (X ≥ 8) = 1 − P (X ≤ 7) =0.6778. (c) This is
the same as P (X = 8) = P (X ≤ 8) − P (X ≤ 7) =0.302. Or using R:

> dbinom(8,10,0.8)
[1] 0.3019899
33. Two heads and a tail has probability 0.441, which is the probability of two successes in
three trials when p = 0.7. The probability of three heads is the probability of three successes
in three trials, which is 0.343. Using R:

> dbinom(2,3,0.7) # Prob(2 heads)


[1] 0.441
> dbinom(3,3,0.7) # Prob(3 heads)
[1] 0.343
34. It is 0.5 because independence means that the probability of a success does not
depend on prior outcomes.
35. Note that this is a binomial distribution with probability 0.75 associated with the
event of interest and n = 5. But Table 2 in Appendix B does not contain entries for
p = 0.75, so either use the equation for the binomial probability function or use R. For
the first approach (a) 0.75^5, (b) 0.25^5, (c) it helps to note that the probability that at
least two lose money is one minus the probability that one or fewer lose money. That is,
P (X ≥ 2) = 1 − P (X ≤ 1). So the answer is 1 − 0.25^5 − 5(0.75)(0.25)^4 = 0.984.
Or the R function dbinom can be used:

> dbinom(5,5,0.75) # a

[1] 0.2373047

> dbinom(0,5,0.75) # b

[1] 0.0009765625

> 1-pbinom(1,5,0.75) # c

[1] 0.984375

36. Using Table 2 in Appendix B: (a)P (X < 11) = P (X ≤ 10) = 0.5858, (b) 0.7323, (c)
P (X > 9) = 1 − P (X ≤ 9) = 0.5754, (d) P (X ≥ 9) = 1 − P (X ≤ 8) = 0.7265.
37. E(X)=np=25(0.4)=10, VAR(X) = np(1-p)=6, E(p̂) = p = 0.4, VAR(p̂) = p(1 −
p)/n = 0.0096.
38. (a) From Table 1 in Appendix B P (Z ≥ 1.5) = 1 − P (Z ≤ 1.5) = 0.0668, (b)P (Z ≤
−2.5)= 0.0062, (c) P (Z < −2.5) = P (Z ≤ −2.5)=0.0062. (For continuous distributions
there is no distinction between < and ≤.) Note that P (Z ≤ −2.5) = P (Z = −2.5) + P (Z <
−2.5). But the area of a line is zero. That is, P (Z = −2.5) = 0. (d) P (−1 ≤ Z ≤ 1) = P (Z ≤
1) − P (Z ≤ −1) = 0.683.
39. (a) 0.691, (b) 0.894, (c) 0.799, (d) 0.928. Using R:

> pnorm(0.5) # a

[1] 0.6914625

> 1-pnorm(-1.25) # b

[1] 0.8943502

> pnorm(1.28)-pnorm(-1.28) #c

[1] 0.7994549

> pnorm(1.8)-pnorm(-1.8) # d

[1] 0.9281394

>

40. Using Table 1 in Appendix B, (a) 0.31, (b) 0.885, (c) 0.018,(d) 0.221. Using R:

> pnorm(-0.5) # a

[1] 0.3085375

> pnorm(1.2) # b

[1] 0.8849303

> 1-pnorm(2.1) # c

[1] 0.01786442

> pnorm(0.28)-pnorm(-0.28) # d

[1] 0.2205225

41. Using Table 1 in Appendix B, (a) c is the 0.0099 quantile, which is −2.33, (b) c is the
0.9732 quantile, which is 1.93, (c) c is the 1-0.5691=0.4309 quantile, which is −0.174, (d) c
is the (1+0.2358)/2=0.6179 quantile, which is 0.3.
42. Using Table 1 in Appendix B, (a) 1.43, (b) -0.01, (c) 1.7, (d) 1.28. Using R:
> qnorm(1-0.0764) # a

[1] 1.429711

> qnorm(1-0.5040) # b

[1] -0.01002668

> qnorm((1+0.9108)/2) # c

[1] 1.699633

> qnorm((1+0.8)/2) # d

[1] 1.281552

43. Convert to Z scores and use Table 1 in Appendix B:
(a) P (X ≤ 40) = P (Z ≤ (40 − 50)/9) = 0.133,
(b) P (X ≤ 55) = P (Z ≤ (55 − 50)/9) = 0.71,
(c) 1 − P (X ≤ 60) = 1 − P (Z ≤ (60 − 50)/9) = 0.133,
(d) P (X ≤ 60) − P (X ≤ 40) = P (Z ≤ (60 − 50)/9) − P (Z ≤ (40 − 50)/9) = 0.733.
44.
> pnorm(22,20,9) # a

[1] 0.5879296

> 1-pnorm(17,20,9) # b

[1] 0.6305587

> 1-pnorm(15,20,9) # c

[1] 0.7107426

> pnorm(38,20,9)-pnorm(2,20,9) # d

[1] 0.9544997

45. Using the R function pnorm:

> pnorm(1,0.75,0.5)-pnorm(0.5,0.75,0.5) # a

[1] 0.3829249

> pnorm(1.25,0.75,0.5)-pnorm(0.25,0.75,0.5) # b

[1] 0.6826895

46. Standardizing, we see that P (µ − cσ < X < µ + cσ) = P (−c < (X − µ)/σ < c) =
P (−c < Z < c) = 0.95, so c is the 0.975 quantile, which is 1.96.
47. 1.28. Proceeding as done in Exercise 46, P (−c < Z < c) = 0.8. So c is the 0.9
quantile, which is 1.28.
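Both values of c can be obtained with qnorm rather than Table 1:

```r
qnorm(0.975)  # c for Exercise 46: 1.959964
qnorm(0.9)    # c for Exercise 47: 1.281552
```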
48. P (X > 78) = P (Z > (78 − 68)/10) = 0.16.
49. P (X > c) = 0.05. So, P (Z > (c − 68)/10) = 0.05. From Table 1 in Appendix B,
(c-68)/10=1.645, so c = 84.45.
50. P (X > 62) = P (Z > (62 − 58)/3) = 1 − 0.91 = 0.09.
51.

> pnorm(115000,100000,10000)-pnorm(85000,100000,10000)

[1] 0.8663856

52. P (X ≥ 0) = P (Z ≥ (0 − (−300))/100) = P (Z ≥ 3) = 0.001.


53. P ((40000 − 50000)/10000 ≤ Z ≤ (60000 − 50000)/10000) = P (−1 ≤ Z ≤ 1) = 0.68.
54. P (Z ≤ (550 − 450)/50) − P (Z ≤ (350 − 450)/50) = 0.954.
55.

> 1-pnorm(260,230,25)

[1] 0.1150697

56.

> 1-pnorm(20000,14000,3500)

[1] 0.04323813

57. Yes. There is a formal proof that is not given in the text. Notice that regardless of
how large the sample size might be, outliers can have an inordinate impact on the sample
mean. This provides some sense of why the population mean is sensitive to small changes in
the tails of a distribution.
58. Small changes in the tails of a distribution can substantially alter the variance. This
was illustrated with the mixed normal distribution.
59. No, could be much larger. This was illustrated in the text using the mixed normal.
60. No. This is related to the previous exercise.
61. Yes, as illustrated in the text. See Figure 4.6.
62. Yes, the population mean can fall in the extreme portion of the tail.

Chapter 5
1. (a) With n = 25 and p = 0.5, note that the event p̂ ≤ 15/25 is the same as x/25 ≤
15/25, which corresponds to x ≤ 15. So P (p̂ ≤ 15/25) = P (X ≤ 15) = 0.885, (b)
1 − 0.885 = 0.115, (c) This is P (X ≤ 15) − P (X ≤ 9) = 0.7705. Using R:

> pbinom(15,25,0.5) # (a)

[1] 0.8852385

> 1-pbinom(15,25,0.5) # (b)

[1] 0.1147615

> pbinom(15,25,0.5)-pbinom(9,25,.5) #(c)

[1] 0.7704771

2. With n = 10 and p = 0.05, the probability of getting p̂ = 0.1 is (using Table 2 in


Appendix B) P (X = 1) = P (X ≤ 1) − P (X ≤ 0) = 0.3151. Using R:

> dbinom(1,10,.05)

[1] 0.3151247

3. 0. It is impossible to get p̂ = 0.05 when n = 10.


4. Note that with n = 25, there are only two ways of getting p̂ ≤ 0.05: when the number
of successes is 0 or 1. The probability of getting 0 or 1 successes, when the probability of
success is p=0.1, is P(0)+P(1)= 0.2712. Or proceeding as described in Section 5.1, compute
0.05(25) = 1.25, round down to the nearest integer yielding 1, in which case P (p̂ ≤ 0.05) =
P (X ≤ 1), where X is a binomial random variable based on 25 trials where the probability
of success is 0.1. So the probability is

> pbinom(1,25,0.1)

[1] 0.2712059

5. Assuming independence might be unreasonable. That is, knowing the husband’s
response might impact the probabilities associated with the wife’s response.
6. 0.4. As noted in the text E(p̂) = p. That is, on average p̂ estimates p.
7. The variance (squared standard error) of p̂, which is p(1 − p)/n = 0.4(0.6)/30 = 0.008.
8. (a) P (Z ≤ 4(29 − 30)/2) = 0.0228. Using R:
> pnorm(4*(29-30)/2)

[1] 0.02275013

(b) 1 − P (X ≤ 30.5) = 0.1587,


(c) P (Z ≤ 2) − P (Z ≤ −2) = 0.9545.
9. (a) P (Z ≤ 5(4 − 5)/5) = 0.1587, (b) 1 − P (X ≤ 7)= 0.023, (c) 0.977 − 0.023 = 0.954.
10. Using R, and noting that X̄ has a normal distribution with mean 100000 and standard
deviation σ/√n = 10000/4, P (X̄ ≤ 95000) is
> pnorm(95000,100000,10000/4)

[1] 0.02275013

11.
> pnorm(102500,100000,10000/4)-pnorm(97500,100000,10000/4)

[1] 0.6826895

12. P (Z ≤ 3(800 − 750)/100) − P (Z ≤ 3(700 − 750)/100) = 0.866.
13. √(s^2/n) = √(160.78/9) = 4.23.
14. (a) P (Z ≤ 4(34 − 36)/5) = 0.055, (b) 0.788, (c) 0.992 (d) 0.788-0.055.
15. Using the central limit theorem means that by assumption, the sample size is suffi-
ciently large to assume normality. (a) P (Z ≤ 5(24 − 25)/3) = 0.047, (b) 0.952 (c) 1-0.047,
(d) 0.952-0.047.
16. A sample size that is relatively small, particularly when sampling from a skewed,
heavy-tailed distribution.
17. Symmetric distributions, or a skewed distribution that is relatively light-tailed,
roughly meaning that outliers are relatively rare, and n is not too small.
18. Using R:
x=c(4, 8, 23, 43, 12, 11, 32, 15, 6, 29)
msmedse(x)
7.570377
19. No tied values.
20.
x=c(5, 7, 2, 3, 4, 5, 2, 6, 7, 3, 4, 6, 1, 7, 4)
msmedse(x)
0.9705612
21. There are tied values.
22. No, because the estimate of the standard error might be highly inaccurate.

23. The value 201 is clearly an outlier suggesting that the standard error of the mean
will be relatively large. It can be seen that sM = 3.60 while s/√n = 17.4.
24. When tied values are impossible and the sample size is not too small.
25. The Winsorized standard deviation is 181.459, n = 19, so the estimated standard
error of the 20% trimmed mean is 181.459/(0.6√19) = 69.4. Or using R
x=c(59, 106, 174, 207, 219, 237, 313, 365, 458, 497, 515, 529, 557, 615, 625,
645, 973, 1065, 3215)
trimse(x)
69.38258
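The trimse function is from Rallfun; the number it returns follows from the formula stated above, sw /((1 − 2(0.2))√n):

```r
sw <- 181.459         # 20% Winsorized standard deviation from the answer above
n <- 19
sw / (0.6 * sqrt(n))  # about 69.38, the estimated standard error
```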
26. The value 3215 appears to be an outlier; this is confirmed using the MAD-Median
rule. This outlier inflates the standard deviation but it does not impact the Winsorized
standard deviation.
27. The sample mean has the smallest standard error under normality. So if there is an
ideal estimator, it must be the mean, but under non-normality it can perform poorly.
28. No. Distributions can be approximately normal yet the standard error of the sample
mean can be large relative to the standard error of the 20% trimmed mean and median. If
the distribution is heavy-tailed, meaning outliers are common, a trimmed mean or median
can provide a more accurate estimate of the population mean when dealing with symmetric
distributions.
29. The standard errors can differ substantially as was illustrated in the text.
30.
median.vals=NULL
for(study in 1:10000){
x=rlnorm(25)
median.vals[study]=median(x)
}
mean(median.vals<1.5)
This last command computes the proportion of values that are less than 1.5.

Chapter 6
1. See the first paragraph in Section 6.1.
2. Because σ is known and normality is assumed, c corresponds to the 1 − α/2 quantile
of a standard normal distribution. A 0.8 confidence interval means that 1 − α = 0.8, so
1 − α/2 = 0.9, and from Table 1 in Appendix B c = 1.28. In a similar manner, for a 0.92 and
0.98 confidence interval, the values for c are 1.75, 2.33, respectively. R can be used instead
to determine c:

> qnorm(0.9) # For a 0.8 confidence interval

[1] 1.281552

> qnorm(0.96) # For a 0.92 confidence interval

[1] 1.750686

> qnorm(0.99) # For a 0.98 confidence interval

[1] 2.326348

3. Because σ is given and normality is assumed, c = 1.96, so the 0.95 confidence interval
is 45 ± 1.96(5/5) = (43.04, 46.96).
4. The 0.995 quantile for a standard normal distribution is c = 2.576, so the 0.99
confidence interval is 45 ± 2.576(5/5) = (42.424, 47.576).
5. No. The 0.95 confidence interval is

> 1150-1.96*25/6

[1] 1141.833

> 1150+1.96*25/6

[1] 1158.167

Because (1141.8, 1158.2) does not contain the value 1200, the data do not support the claim
that the population mean is 1200.
6. In each case, σ is given, normality is assumed, so c = 1.96, which is read from Table 1
in Appendix B. The resulting confidence intervals are: (a) (52.55, 77.45), (b) (180.82, 189.18),
(c) (10.68, 27.32). Using R:

> 65-1.96*22/sqrt(12)

[1] 52.55233

> 65+1.96*22/sqrt(12) # (a)

[1] 77.44767

> 185-1.96*10/sqrt(22)

[1] 180.8213

> 185+1.96*10/sqrt(22) # (b)

[1] 189.1787

> 19-1.96*30/sqrt(50)

[1] 10.68442

> 19+1.96*30/sqrt(50) # (c)

[1] 27.31558

7. Length = 2cσ/√n. When n is doubled, the length is 2cσ/√(2n). Therefore, the ratio of the lengths is 1/√2. That is, the length is decreased by a factor of 1/√2. For 4n, the ratio is 1/2.
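The ratio can be confirmed numerically; the values of c, σ, and the sample sizes below are arbitrary choices for illustration, not taken from the exercise:

```r
# Length of the CI when sigma is known: 2*c*sigma/sqrt(n)
ci.length <- function(cval, sigma, n) 2 * cval * sigma / sqrt(n)
r2 <- ci.length(1.96, 5, 20) / ci.length(1.96, 5, 10)  # n doubled
r4 <- ci.length(1.96, 5, 40) / ci.length(1.96, 5, 10)  # n quadrupled
c(r2, 1 / sqrt(2), r4)  # r2 matches 1/sqrt(2) and r4 equals 1/2
```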
8. Using R:

> x=c(20.01, 19.88, 20.00, 19.99)


> mean(x)-1.96*0.01/sqrt(4) # Lower end of CI
[1] 19.9602
> mean(x)+1.96*0.01/sqrt(4) # Upper end of CI
[1] 19.9798
That is, the 0.95 confidence interval is CI = (19.9602, 19.9798). (b) No, the confidence interval
does not contain the desired value, 20. (c) Just because a confidence interval contains 20, this
does not necessarily mean that µ = 20. Indeed, it can be argued that surely the population
mean differs from 20 at some decimal place.
9. Because σ is given and normality is assumed, from Table 1 in Appendix B, c = 2.58.
Using R provides a bit more accuracy. The R command qnorm(0.995) returns the 0.995
quantile of a standard normal distribution. The R commands for the lower and upper ends
of the 0.99 confidence interval are:

> 2.1-qnorm(0.995)*0.25/sqrt(10) #
[1] 1.896363
> 2.1+qnorm(0.995)*0.25/sqrt(10)
[1] 2.303637
and so the 0.99 confidence interval is (1.896363, 2.303637).
10. Using R:

> 1.7-qnorm(0.95)*0.3/sqrt(45)
[1] 1.62644
> 1.7+qnorm(0.95)*0.3/sqrt(45)
[1] 1.77356
So the 0.90 CI=(1.62644, 1.77356).
11. (a) This says that the value t is the 0.995 quantile of Student’s T distribution
with ν = 20 degrees of freedom. From Table 4 in Appendix B, t0.995 = 2.84534. (b)
P (T ≥ t) = 0.025 means that P (T ≤ t) = 0.975. That is t is the 0.975 quantile, which is
2.085963, (c) P (−t ≤ T ≤ t) = 0.90 means that P (T ≤ t) = 0.95. So t = 1.725. Using R:

> qt(0.995,20) # (a)

[1] 2.84534

> qt(0.975,20) # (b)

[1] 2.085963

> qt(0.95,20) # (c)

[1] 1.724718

12. Note that the sample standard deviation s is given, not the population standard
deviation σ. This means that Student’s T distribution is used when computing a confidence
interval. That is, determine t using Table 4 with ν = n − 1 degrees of freedom. (a) Because
n = 10, ν = 9 and t = 2.262157. Using R, the lower and upper ends of the 0.95 confidence
interval are
> 26-qt(0.975,9)*9/sqrt(10)

[1] 19.56179

> 26+qt(0.975,9)*9/sqrt(10) # (a)

[1] 32.43821

> 132-qt(0.975,17)*20/sqrt(18)

[1] 122.0542

> 132+qt(0.975,17)*20/sqrt(18) #(b)

[1] 141.9458

> 52-qt(0.975,24)*12/sqrt(25)

[1] 47.04664

> 52+qt(0.975,24)*12/sqrt(25) # (c)

[1] 56.95336

That is, the 0.95 confidence intervals are (a) (19.56, 32.44), (b) (122.05, 141.95), and (c) (47.05, 56.95).
13. Using R, the lower and upper ends of the 0.99 confidence interval are
> 26-qt(0.995,9)*9/sqrt(10)

[1] 16.75081

> 26+qt(0.995,9)*9/sqrt(10) # (a)

[1] 35.24919

> 132-qt(0.995,17)*20/sqrt(18)

[1] 118.3376

> 132+qt(0.995,17)*20/sqrt(18) #(b)

[1] 145.6624

> 52-qt(0.995,24)*12/sqrt(25)

[1] 45.28735

> 52+qt(0.995,24)*12/sqrt(25) # (c)

[1] 58.71265

That is, the 0.99 confidence intervals are: (a) (16.75081, 35.24919), (b) (118.3376, 145.6624),
(c) (45.28735, 58.71265).
14. The easiest way is to use the R function trimci with the argument tr=0. (This
assumes that the R package Rallfun described in Chapter 1 has been installed.) Here are
the R commands:
x=c(77,87,88,114,151,210,219,246,253,262,296,299,306,
376,428,515, 666,1310,2611)
trimci(x,tr=0)
The 0.95 confidence interval returned is (161.5030, 734.7075). The command
trimcibt(x,tr=0) applies the bootstrap-t method. The resulting 0.95 confidence interval
is (−276.0116, 1172.22). A boxplot indicates a skewed distribution with outliers. Using
Student’s T gives a shorter confidence interval but the boxplot suggests that the actual
probability coverage is less than 0.95.
15. Assuming normality means that Student’s T is used. Using R
x=c(5, 12, 23, 24, 18, 9, 18, 11, 36, 15)
trimci(x,tr=0)
The 0.95 confidence interval is (10.69766, 23.50234).
16.

> 34-1.96*3

[1] 28.12

> 34+1.96*3

[1] 39.88

That is, the 0.95 confidence interval is CI=(28.12, 39.88).
17. With no tied values, the McKean–Schrader estimator can be used to compute a confidence interval assuming the sample median is approximately normal. For a 0.99 confidence interval, use the 0.995 quantile of a standard normal distribution, which is c = 2.58. So the 0.99 confidence interval for the population median is (262 − 2.58(77.83901), 262 + 2.58(77.83901)) = (61.2, 462.8).
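The arithmetic can be reproduced in R, where 262 is the sample median and 77.83901 the McKean–Schrader estimate of its standard error, both taken from the exercise:

```r
m <- 262              # sample median
se <- 77.83901        # McKean-Schrader estimate of the standard error
cval <- qnorm(0.995)  # 0.995 standard normal quantile, about 2.576
ci <- m + c(-1, 1) * cval * se
ci
```

Using the more precise quantile from qnorm, rather than the rounded value 2.58, changes the endpoints only slightly.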
18. Section 6.4.3 describes how to solve this problem. Here are the details: For a binomial
with probability of success 0.5, the probability of getting k − 1 = 1 successes or less, based
on n = 10 trials, is 0.01074219. That is, P (Y ≤ 1) = 0.01074219 where Y has a binomial
distribution with probability of success 0.5. So the probability coverage for the median is
1-2(0.01074219)=0.9785156.
19. The R function pbinom can be used, which was described in Section 4.5.1. The
probability coverage is
> 1-2*pbinom(3,15,0.5)

[1] 0.9648438

20. There are n = 19 values and X(3) = 88. That is, 88 is the third smallest value meaning that, in the notation used in Section 6.4.3, k = 3. So the probability coverage is
> 1-2*pbinom(2,19,0.5)

[1] 0.9992714
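The recipe used in Exercises 18 and 19 — coverage = 1 − 2P(Y ≤ k − 1) for the interval (X(k), X(n−k+1)) — can be collected into a small helper; the function name here is made up for illustration:

```r
# Probability coverage of (X(k), X(n-k+1)) as a CI for the population median
med.coverage <- function(n, k) 1 - 2 * pbinom(k - 1, n, 0.5)
med.coverage(10, 2)  # Exercise 18: 0.9785156
med.coverage(15, 4)  # Exercise 19: 0.9648438
```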

21. The sample size is n = 15, the number of successes is X = 5, so p̂ = 5/15 and CI = (0.09476973, 0.5718969) using Equation (6.10). Using Agresti–Coull, c = 1.96, ñ = 15 + 1.96² = 18.84, p̃ = (5 + 1.96²/2)/18.84 = 0.3673139.
0.3673139-1.96*sqrt(0.3673139*(1-0.3673139)/18.84)=0.15
0.3673139+1.96*sqrt(0.3673139*(1-0.3673139)/18.84)=0.585.
That is, the 0.95 confidence interval is (0.15, 0.585). To apply the Agresti-Coull method
using R:
acbinomci(5,15)
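The Agresti–Coull hand calculation above can also be scripted directly; this is just a sketch of the same arithmetic:

```r
x <- 5; n <- 15; cval <- 1.96
ntil <- n + cval^2               # 18.8416
ptil <- (x + cval^2 / 2) / ntil  # 0.3673139
ci <- ptil + c(-1, 1) * cval * sqrt(ptil * (1 - ptil) / ntil)
round(ci, 3)  # approximately (0.150, 0.585)
```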
22. The squared standard error is estimated with p̂(1 − p̂)/n. (a) p̂ = 5/25 = 0.2, so the estimated squared standard error is
> 0.2*(1-0.2)/25

[1] 0.0064

(b)
> phat=12/48
> phat*(1-phat)/48

[1] 0.00390625

(c)

> phat=80/100
> phat*(1-phat)/100

[1] 0.0016

(d)

> phat=160/300
> phat*(1-phat)/300

[1] 0.0008296296

23. For a 0.95 confidence interval, c = 1.96, the lower and upper ends of the confidence
interval can be computed as follows:

> phat=10/100
> phat-1.96*sqrt(phat*(1-phat)/100)

[1] 0.0412

> phat+1.96*sqrt(phat*(1-phat)/100)

[1] 0.1588

That is, the 0.95 confidence interval is (0.0412, 0.1588). For Agresti-Coull

> x=10
> xtil=x+1.96^2/2
> n=100
> ntil=n+1.96^2
> ptil=xtil/ntil
> ptil-1.96*sqrt(ptil*(1-ptil)/100)

[1] 0.05231745

> ptil+1.96*sqrt(ptil*(1-ptil)/100)

[1] 0.1772784

So the 0.95 confidence interval is (0.052, 0.177).


24.

> phat=290/1000
> phat-1.96*sqrt(phat*(1-phat)/1000)

[1] 0.2618755

> phat+1.96*sqrt(phat*(1-phat)/1000)

[1] 0.3181245

So the 0.95 confidence interval is (0.2618755, 0.3181245).


25.

> phat=60/1000
> phat-1.96*sqrt(phat*(1-phat)/1000)

[1] 0.04528041

> phat+1.96*sqrt(phat*(1-phat)/1000)

[1] 0.07471959

So, (0.04528041, 0.07471959).


26. With the number of successes equal to one, using Equation (6.15) can be unsatisfactory because it assumes p̂ has a normal distribution, which might not be approximately true when the probability of success is close to zero or one.
27. Using Blyth’s method

> n=600
> alpha=0.1
> 1-(1-alpha/2)^(1/600)

[1] 8.548517e-05

> 1-(alpha/2)^(1/600)

[1] 0.004980443

That is, the 0.90 confidence interval is (0.0000855, 0.00498). Using instead the Schilling–Doi
method via the R command binomLCO(1,600,alpha=0.1), the confidence interval is (0.000,
0.008).
28. 300/4=75, so

> phat=75/300
> cval=qnorm(0.995) # cval is the $1-alpha/2=0.995$ quantile.
> phat-cval*sqrt(phat*(1-phat)/300)

[1] 0.1856043

> phat+cval*sqrt(phat*(1-phat)/300)

[1] 0.3143957

The 0.99 confidence interval is (0.1856043, 0.3143957).


29.

> phat=5/250
> phat-1.96*sqrt(phat*(1-phat)/250)

[1] 0.00264542

> phat+1.96*sqrt(phat*(1-phat)/250)

[1] 0.03735458

So the 0.95 confidence interval is (0.00264542, 0.03735458).


30. Using Blyth’s method

> n=250
> alpha=0.01
> 1-alpha^(1/250)

[1] 0.01825206

So the 0.99 confidence interval is (0, 0.0183). The 0.99 confidence interval returned by the R command binomLCO(0,250,alpha=.01) is (0.000, 0.021).
31. 0.18 ± 1.96√(0.18(0.82)/1000), or using R

> phat=180/1000
> phat-1.96*sqrt(phat*(1-phat)/1000)

[1] 0.1561878

> phat+1.96*sqrt(phat*(1-phat)/1000)

[1] 0.2038122

So the 0.95 confidence interval is (0.156, 0.204).


32. Actual probability coverage can be substantially less, as well as substantially greater, than intended, depending on how skewed the distributions are and the likelihood of encountering outliers.
33. Even when the distribution of the sample mean is approximately normal, Student’s
T can yield inaccurate confidence intervals. A large sample size might be needed to get an
accurate confidence interval. Just how large depends on the degree to which sampling is
from a skewed distribution and the likelihood of encountering outliers.
34. Here are the commands to create a boxplot:

> lsat=c(545,555,558,572,575,576,578,580,594,605,635,651,653,661,666)
> boxplot(lsat,xlab='LSAT')

[Boxplot of the LSAT data]

No outliers are indicated and the data are skewed: the median is relatively close to
the lower quartile. If indeed the population distribution is skewed, Student’s T can yield
inaccurate confidence intervals as indicated in Section 6.5.
35. (a) With n = 24, the number of observations trimmed from each tail is 0.2n rounded down to the nearest integer, which is g = 4. So the degrees of freedom are n − 2g − 1 = 24 − 2(4) − 1 = 15. For a 0.95 confidence interval, use the 0.975 quantile of Student's T distribution: t0.975 = 2.13. The confidence interval is 52 ± 2.13√12/(0.6√24). Using R

> 52-2.13*sqrt(12)/(0.6*sqrt(24))

[1] 49.48977

> 52+2.13*sqrt(12)/(0.6*sqrt(24))

[1] 54.51023

That is, the 0.95 confidence is (49.49, 54.51).


(b) With n = 36, g = 7 observations are trimmed from each tail, so ν = 36 − 14 − 1 = 21 and t0.975 = 2.08. The 0.95 confidence interval is 10 ± 2.08√30/(0.6√36).

(c) With n = 12, g = 2, so the degrees of freedom are ν = 12 − 4 − 1 = 7 and t0.975 = 2.365. The 0.95 confidence interval is 16 ± 2.365√9/(0.6√12).
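All three parts apply the same formula, X̄t ± t·sw/(0.6√n), where sw is the Winsorized standard deviation supplied in the exercise; a small helper (the name trim.ci is made up here) makes the pattern explicit:

```r
# 0.95 CI for a 20% trimmed mean given the Winsorized standard deviation sw
trim.ci <- function(xbar.t, sw, n, alpha = 0.05) {
  g <- floor(0.2 * n)  # observations trimmed from each tail
  df <- n - 2 * g - 1  # degrees of freedom
  tval <- qt(1 - alpha / 2, df)
  xbar.t + c(-1, 1) * tval * sw / (0.6 * sqrt(n))
}
trim.ci(52, sqrt(12), 24)  # part (a): about (49.49, 54.51)
```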
37. The R commands:
aware=c(59, 106, 174, 207, 219, 237, 313, 365, 458, 497, 515, 529, 557, 615, 625,
645, 973, 1065, 3215)
trimci(aware)
The 0.95 confidence interval returned by this last command is (293.6, 595.9).
38. Chapter 5 illustrated that the measure of location having the smallest standard error
depends on the distribution from which observations are sampled. For a normal distribution,
the shortest possible confidence interval for the mean is obtained using Student’s T, roughly
because the sample mean has smaller standard error than other estimators that might be
used. But for heavy-tailed distributions, a confidence interval based on a trimmed mean
or median can be substantially shorter than the confidence interval based on Student’s T
because the standard error of the sample mean can be substantially larger than the standard
error of other measures of location that might be used as noted in Chapter 5.
39. Using R:
x=c(5, 60, 43, 56, 32, 43, 47, 79, 39, 41)
trimci(x)
trimci(x,tr=0)
For a 20% trimmed mean, the 0.95 confidence interval is (34.8, 54.8). For the mean, it is
(30.7, 58.3).
40. The R command
outbox(x)
indicates that the value 5 is an outlier. So it is not surprising that the confidence interval
based on the 20% trimmed mean is shorter.
41. There are two ways of doing this. First, sort the values in t.vals. This can be done
with the R command
tsort=sort(t.vals).
Then use Equation (6.13) with a=tsort[25] and b=tsort[975]. Or sort the absolute value
of t.vals using the R command tsort=sort(abs(t.vals)) and then use Equation (6.12)
with t=tsort[950].
42. If the data are stored in the R variable x, the following R commands can be used:
t.vals=NULL
for(i in 1:1000){
z=sample(x,25,replace=T)
t.vals[i]=trimci(z,tr=0,null.value=mean(x),pr=F)$test.stat
}
In effect, the last command computes the test statistic T using data sampled from a distri-
bution that has a mean equal to the value of mean(x). Said another way, in the bootstrap
world, the population mean is known; it is X̄, the sample mean based on the observed data.
That is, bootstrap samples are generated from a distribution (the observed data) that has a
population mean equal to X̄. So the value of T based on a bootstrap sample is

T∗ = (X̄∗ − X̄)/(s∗/√n),

where X̄ ∗ is the sample mean based on a bootstrap sample and s∗ is the sample standard
deviation, again based on the bootstrap sample. The R code described above yields 1000
T∗ values, which in turn can be used to determine critical values. More precisely, let T∗(1) ≤ · · · ≤ T∗(1000) be the T∗ values written in ascending order. Then the values for a and b used to compute a confidence interval are a = T∗(26) and b = T∗(974). Note, for example, that 2.5% of the T∗ values are less than T∗(26). That is, a estimates the 0.025 quantile of the distribution of T.
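Putting the pieces of Exercises 41 and 42 together, here is a sketch of the full bootstrap-t computation. The data in x are simulated stand-ins, and T* is computed directly from its definition rather than via trimci:

```r
set.seed(1)
x <- rexp(25)  # hypothetical data; any observed sample could be used
t.vals <- NULL
for (i in 1:1000) {
  z <- sample(x, 25, replace = TRUE)                   # bootstrap sample
  t.vals[i] <- sqrt(25) * (mean(z) - mean(x)) / sd(z)  # T* value
}
tsort <- sort(t.vals)
a <- tsort[26]   # estimates the 0.025 quantile of the distribution of T
b <- tsort[975]  # estimates the 0.975 quantile
# One common form of the resulting bootstrap-t confidence interval:
c(mean(x) - b * sd(x) / sqrt(25), mean(x) - a * sd(x) / sqrt(25))
```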

Chapter 7
1. The standard deviation σ is known, so use the standard normal distribution (Table 1 in Appendix B) to determine the critical value. The hypothesis H0 : µ ≥ 80 means that the critical value is in the left tail; it is the 0.05 quantile, which is −1.645. The test statistic is Z = √10(78 − 80)/5 = −1.265, which is greater than the critical value, so fail to reject.
2. For a two-tailed test, reject if Z is less than the α/2 quantile or greater than the
1 − α/2 quantile, which are −1.96 and 1.96, respectively. Again, Z = −1.265, so fail to
reject.
3. The 0.95 confidence interval is X̄ ± 1.96σ/√n = 78 ± 1.96(5)/√10 = (74.9, 81.1). This
interval contains the hypothesized value, 80, so fail to reject.
4. Exercise 1 indicates that the test statistic is Z = −1.265. The null hypothesis
corresponds to Case 1 in Section 7.1.3. So the p-value is given by Equation (7.4) and is
P(Z ≤ −1.265) = 0.103. Using R:

> pnorm(-1.265)

[1] 0.1029357

5. The null hypothesis corresponds to Case 3 in Section 7.1.3. The p-value is 2(1 − P(Z ≤ |−1.265|)) = 0.206. Using R:

> 2*(1-pnorm(1.265))

[1] 0.2058713

6. Because σ is known, the test statistic is Z = √49(120 − 130)/5 = −14 and the critical
value is read from Table 1 (the standard normal distribution), which is −1.645. Because
−14 < −1.645, reject.
7. The test statistic is the same as in Exercise 6. Now reject if Z is less than -1.96 or
greater than 1.96. Z = −14, so reject.
8. 120 ± 1.96(5)/√49 = (118.6, 121.4). This interval does not contain the hypothesized
value, so reject. Using R:

> 120-1.96*5/sqrt(49)

[1] 118.6

> 120+1.96*5/sqrt(49)

[1] 121.4

9. Yes, fail to reject because X̄, the estimate of the population mean, is consistent with
H0 .
10. H0 : µ ≤ 232. Z = 5(240 − 232)/4 = 10, the critical value is 2.326348, Z exceeds the critical value, so reject.
11. Z = √20(565 − 546)/40 = 2.12. The critical values are −1.96 and 1.96, so reject.
12. A confidence interval provides a range of values that is likely to include the population mean, assuming normality. A p-value, in contrast, summarizes the strength of the empirical evidence regarding whether the population mean is less than or greater than the hypothesized value.
13. This is Case 3 in Section 7.2. The critical value is −2.326348, so power is
> pnorm(-2.326348-5*(56-60)/5)

[1] 0.9529005

14. This is Case 1 in Section 7.2. The critical value is 1.96. So power is
> 1-pnorm(1.96-6*(103-100)/8)

[1] 0.6140919

15. This is Case 3 in Section 7.2. The critical values are −1.96 and 1.96, so power is
> pnorm(-1.96-7*(47-50)/10) +(1-pnorm(1.96-7*(47-50)/10))

[1] 0.5556945
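Exercises 13 through 15 repeat the same normal power calculation; for the two-sided case it can be collected into a helper function (power.z is a hypothetical name, not a built-in):

```r
# Power of the two-sided Z test when sigma is known and sampling is from a normal
power.z <- function(n, mu, mu0, sigma, alpha = 0.05) {
  cval <- qnorm(1 - alpha / 2)           # critical value
  shift <- sqrt(n) * (mu - mu0) / sigma  # standardized shift
  pnorm(-cval - shift) + (1 - pnorm(cval - shift))
}
power.z(49, 47, 50, 10)  # Exercise 15: about 0.556
```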

16. Might commit a Type II error. Power might be relatively low.


17. Power is
> pnorm(-1.645-sqrt(10)*(46-48)/5)

[1] 0.3519397

18.

> pnorm(-1.645-sqrt(20)*(46-48)/5)

[1] 0.5571923

> pnorm(-1.645-sqrt(30)*(46-48)/5)

[1] 0.7074293

> pnorm(-1.645-sqrt(40)*(46-48)/5)

[1] 0.8118737

19. Increase α. But the Type I error probability is higher.
20. Because s is given, not σ, use Student’s T with critical values read from Table 4 in
Appendix B. The degrees of freedom are 25-1=24. Testing for equality with α = 0.05, the
critical value is t0.975 = 2.063899. Here is how to determine the critical value using R:

> qt(0.975,24)

[1] 2.063899

So, (a) T = 5(44 − 42)/10 = 1, fail to reject. (b) T = 0.5, fail to reject. (c) T = 2.5, reject.
21. Power is higher when the standard deviation, s, is likely to be small.
22. The degrees of freedom are 16-1=15 and the critical value is t0.95 = 1.75305. Here is
how to determine the critical value using R:

> qt(0.95,15)

[1] 1.75305

So (a) T = 0.8, fail to reject. (b) T = 0.4, fail to reject. (c) T = 2, reject.


23. Fail to reject because X̄ is consistent with H0 .
24. The sample size is n = 10, so the degrees of freedom are 9 and the critical value is t0.975 = 2.262157. That is, the null hypothesis is rejected if |T| ≥ 2.262157. T = √10(46.4 − 45)/11.27 = 0.39, fail to reject.
25. Given s, not σ, so use Student’s T. The degrees of freedom are 100-1=99. Because
the hypothesis is that µ is greater than 10.5, the critical value will be the 0.025 quantile of
Student’s T distribution. If using Table 4, determine the 0.975 quantile t0.975 and multiply
by −1 to get the critical value. The resulting critical value is t0.025 = −1.984. Here is how
to determine the critical value using R:

> qt(0.025,99)

[1] -1.984217

The test statistic is T = √100(9.79 − 10.5)/2.72 = −2.61. This is less than the critical value,
so reject.
26. Using R:

> 4*(40-38)/4 # The test statistic T

[1] 2

> qt(0.99,15) # The critical value

[1] 2.60248

T = 2, which is less than the critical value, fail to reject.
27. The degrees of freedom are 8, testing for equality, α/2 = 0.025, so the critical values are the 0.025 and 0.975 quantiles of Student's T distribution, which are −2.306004 and 2.306004, respectively. Here is how to determine the critical values using R:

> qt(0.025,8)

[1] -2.306004

> qt(0.975,8)

[1] 2.306004

Using R to compute the test statistic:

> sqrt(9)*(33-32)/4

[1] 0.75

Because Student’s
√ T = 0.75 is between the lower and upper critical values, fail to reject.
28. T = 10(146 − 150/2.5 = −1.6. Because the goal is to test for equality, reject if T
is less than t0.025 = −2.26 or greater than t0.975 = 2.26. That is, reject if |T | ≥ t0.975 , so fail
to reject.
29. Using R:
> times=c(42, 90, 84, 87, 116, 95, 86, 99, 93, 92,
+ 121, 71, 66, 98, 79, 102, 60, 112, 105, 98)
> t.test(times,alternative='greater',conf.level=0.99,mu=80)

One Sample t-test

data: times
t = 2.2737, df = 19, p-value = 0.01739
alternative hypothesis: true mean is greater than 80
99 percent confidence interval:
78.85461 Inf
sample estimates:
mean of x
89.8

30. With n = 20 and 20% trimming, g = 4 observations are trimmed from both tails, so
the degrees of freedom are 20-8-1=11. So with α = 0.05 and when testing for exact equality,
the critical values are t0.025 = −2.2 and t0.975 = 2.2. Using R to compute the test statistics:

> 0.6*sqrt(20)*(44-42)/9 # (a) fail to reject

[1] 0.5962848

> 0.6*sqrt(20)*(43-42)/9 # (b) fail to reject

[1] 0.2981424

> 0.6*sqrt(20)*(43-42)/3 # (c) fail to reject

[1] 0.8944272

31. Now the degrees of freedom are 16-6-1=9. The critical value is t0.95 , which is equal
to

> qt(0.95,9)

[1] 1.833113

Using R, the test statistic T is

> 0.6*sqrt(16)*(44-42)/9 # (a) fail to reject

[1] 0.5333333

> 0.6*sqrt(16)*(43-42)/9 # (b) fail to reject

[1] 0.2666667

> 0.6*sqrt(16)*(43-42)/3 # (c) fail to reject

[1] 0.8

32. n = 10, so the degrees of freedom are 5. Testing for equality with α = 0.05, the critical value is t0.975 = 2.571, meaning that the null hypothesis is rejected if |Tt| ≥ 2.571. The test statistic is Tt = 0.6√10(42.17 − 45)/1.73 = −3.1, reject.
33. The degrees of freedom are 25-10-1=14, the critical value is t0.995 = 2.977, the test
statistic is

> 0.6*sqrt(25)*(5.1-4.8)/7

[1] 0.1285714

That is, Tt = 0.129, |Tt| < 2.977, fail to reject. So based on Tukey's three decision rule, make no decision about whether the trimmed mean is less than or greater than 4.8.
34. Works poorly when testing hypotheses about the mean. But for a 20% trimmed
mean or the median, performs relatively well.
35. test=(mean(x)-null.value)*sqrt(length(x))/sigma.
36. R commands:
flag=which(ToothGrowth[,3]==0.5)
trimci(ToothGrowth[flag,1],null.value=8,tr=0)
returns the 0.95 confidence interval
(8.499046, 12.710954)
which does not contain the value 8. The p-value is 0.018.
trimci(ToothGrowth[flag,1],tr=0.2,null.value=8)
returns a p-value equal to 0.096. A plot of the distribution, based on the R command
akerd(ToothGrowth[flag,1])
indicates a skewed distribution suggesting that a 20% trimmed mean better reflects the
typical value. The mean is 10.605, which is further from the hypothesized value 8 compared
to the 20% trimmed mean, which is 9.975. Also, the standard errors of the mean and 20%
trimmed mean are 1.006178 and 1.084262, respectively. Because the mean has a smaller
standard error, this further explains why the mean rejects and the 20% trimmed mean does
not.
37. Would expect that based on the median or MOM, would fail to reject. One reason
is that because the mean has a smaller standard error than the 20% trimmed mean, this
suggests that the standard error of the mean will be smaller than the standard error of
MOM or the median. Also, the distribution is skewed with the null value relatively close
to the central portion of the distribution. So would expect that the difference between the
null value and the sample mean is larger than the difference between the null value and the
median or MOM. Noting that the dosage is stored in column 3, this can be verified with the
R commands:
id=which(ToothGrowth[,3]==0.5)
momci(ToothGrowth[id,1],null.value=8)
which returns

$ci
[1] 7.829412 12.520000

$p.value
[1] 0.07

$est.mom
[1] 10.03158

and using the median


sintv2(ToothGrowth[id,1],null.value=8)

$n
[1] 20

$ci.low
[1] 7.511711

$ci.up
[1] 11.42943

$p.value
[1] 0.086

Chapter 8
1.

> x=c(5, 8, 9, 7, 14)


> y=c(3, 1, 6, 7, 19)
> lsfit(x,y)$coef

Intercept X
-8.477876 1.823009

tsreg(x,y)$coef
returns
Intercept X
-7.968254 1.746032
2. Using the least squares regression line, the sum of squared residuals is computed with
the following R commands:

> x=c(5, 8, 9, 7, 14)


> y=c(3, 1, 6, 7, 19)
> sum(lsfit(x,y)$residuals^2)

[1] 46.58407

3.

> x=c(5, 8, 9, 7, 14)


> y=c(3, 1, 6, 7, 19)
> yhat=2*x-9
> res.sq=sum((y-yhat)^2) #sum of the squared residuals.
> res.sq

[1] 53

Least squares minimizes the sum of the squared residuals. So for any choice for the slope
and intercept, the sum of the squared residuals will be greater than or equal to 46.584, the
sum of squared residuals based on the least squares estimator, as indicated in the answer to
Exercise 1.
4. Note that Σ(Xi − X̄)² = (n − 1)s²x. So in Equation (8.5), C = (n − 1)s²x = 24(12) = 288 and the estimated slope is 144/288 = 0.5.
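The identity used here, C = Σ(Xi − X̄)² = (n − 1)s²x, and the slope formula b1 = A/C can be checked against lsfit with made-up data:

```r
set.seed(8)
x <- rnorm(25)
y <- rnorm(25)
A <- sum((x - mean(x)) * (y - mean(y)))
C <- (length(x) - 1) * var(x)  # equals sum((x - mean(x))^2)
c(A / C, lsfit(x, y)$coef[2])  # the two slope computations agree
```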


5. The file cancer_rate_dat.txt has labels in line 1, so use header=TRUE when reading the data into R. As noted in the text, values in this file are separated by &. So an R command for reading the data is:
can=read.table('cancer_rate_dat.txt',header=TRUE,sep='&')
Or use the file.choose command to read the data. See Section 1.3.
The R command
labels(can)
indicates the labels used in the file.
The R command
lsfit(can$calories,can$Rate)$coef
returns an estimate of slope and intercept, namely b1 = −0.0355, b0 = 39.93, respectively.
6.
> SAT=c(500, 530, 590, 660, 610, 700, 570, 640)
> GPA=c(2.3, 3.1, 2.6, 3.0, 2.4, 3.3, 2.6, 3.5)
> lsfit(SAT,GPA)$coef

Intercept X
0.484615385 0.003942308

7.
> SAT=c(500, 530, 590, 660, 610, 700, 570, 640)
> GPA=c(2.3, 3.1, 2.6, 3.0, 2.4, 3.3, 2.6, 3.5)
> sum(lsfit(SAT,GPA)$residuals)

[1] -6.938894e-18

The exact value can be shown to be zero.


8.
> x=c(40, 41, 42, 43, 44, 45, 46)
> y=c(1.62, 1.63, 1.90, 2.64, 2.05, 2.13, 1.94)
> lsfit(x,y)$coef

Intercept X
-1.25321429 0.07535714

9. Ŷ = −0.0355(600) + 39.93 = 18.63, but daily calories of 600 is greater than any value
used to compute the slope and intercept. That is, extrapolation is being used.
10.

> MOU=c(63.3, 60.1, 53.6, 58.8, 67.5, 62.5)
> TIMES=c(241.5, 249.8, 246.1, 232.4, 237.2, 238.4)
> lsfit(MOU,TIMES)$coef

Intercept X
270.8927673 -0.4919535

The negative slope indicates that as MOU increases, times decrease. But this needs to be checked by testing the hypothesis of a zero slope using a method that allows heteroscedasticity, and the possible impact of outliers needs to be considered. Using the R command
olshc4(MOU,TIMES)
tests the hypothesis of a zero slope using a method that allows heteroscedasticity, which fails to reject with α = 0.05. The p-value is 0.265.
11. Here are the R commands for plotting the data and computing the least squares
estimate of the intercept and slope:

> x=c(1, 2, 3, 4, 5, 6)
> y=c(1, 4, 7, 7, 4, 1)
> plot(x,y)
> lsfit(x,y)$coef

Intercept X
4.000000e+00 -5.838669e-16

[Scatterplot of y versus x]

So, assuming a straight line, the slope is estimated to be virtually zero, but this is clearly
not the case.
12. Using R:

> x=c(1, 2, 3, 4, 5, 6)
> y=c(4, 5, 6, 7, 8, 2)
> plot(x,y)
> lsfit(x,y)$coef

Intercept X
5.333333e+00 -6.369458e-16

[Scatterplot of y versus x]

In this case, due to a single unusual value the estimate of the slope is zero, yet all but
one of the points lie on a straight line that has a positive slope.
13. A study might indicate that for a certain range of vitamin A intake, health improves.
But it would be a mistake to conclude that more vitamin A is always better. For vitamin A
intake outside the range of values used in a study, the association with health might differ
substantially. Also, it should not be assumed a straight regression line provides a satisfactory
summary of the association between two variables.
14. Can use the R commands:
diab=read.table('diabetes_sockett_dat.txt',header=TRUE)
lplot(diab[,1],diab[,2])
15. Assuming that the data have been read into the R variable diab as described in the
answer to Exercise 14, the R commands
flag=diab[,1]<=7
olshc4(diab[flag,1],diab[flag,2])
return a p-value equal to 0.0257 for the slope.
For the Theil–Sen estimator, use the R command
regci(diab[flag,1],diab[flag,2])
which returns:

$regci
ci.low ci.up Estimate S.E. p-value
Intercept 2.22000000 4.5058824 3.0500000 0.6077580 0.003338898
Slope 1 0.06896552 0.4871795 0.3333333 0.1062048 0.023372287

16. The data are stored in columns 4 and 8 in the file read dat.txt. Here are the R
commands:
read=read.table('read_dat.txt',skip=13)
The first 13 lines describe the variables, which is why the argument skip=13 was used.
spearci(read[,4],read[,8])
returns a p-value equal to 0.014, and
scorciMC(read[,4],read[,8])
returns a p-value equal to 0.002.
The plot returned by this last command indicates outliers among the independent variable
only. That is, the plot suggests that there are no outliers that require taking into account
the overall structure of the data as discussed in Section 8.6. So it is not surprising that
Spearman’s correlation gives a result that is very similar to the skipped correlation.
17. With n = 10 the degrees of freedom are 10-2=8. The estimate of the squared standard
error is 35/140. The critical value, using the R command qt(0.975,8), is t0.975 = 2.306004.
Or the critical value can be read from Table 4 in Appendix B. So the ends of the confidence
interval are

> -1.5-qt(0.975,8)*sqrt(35/140)

[1] -2.653002

> -1.5+qt(0.975,8)*sqrt(35/140)

[1] -0.3469979

That is, the 0.95 confidence interval is (−2.65, −0.35).


18. In this case, 1 − α/2 = 0.99, so the ends of the of 0.98 confidence intervals are given
by

> -1.5-qt(0.99,8)*sqrt(35/140)

[1] -2.94823

> -1.5+qt(0.99,8)*sqrt(35/140)

[1] -0.05177028

That is, the 0.98 confidence interval is (−2.948, −0.052).
19. Least squares regression can be negatively impacted by non-normality, heteroscedasticity, and outliers.
20. X̄ = 15/30 = 0.5, Ȳ = 30/30 = 1, b1 = 30/10 = 3, b0 = 1 − 3(0.5) = −0.5.
21. (a) b1 = 180/60 = 3, b0 = 20 − 3(7) = −1. (b) T = −1·√(36(60)/(121(1922))) = −0.09637, the degrees of freedom are n − 2 = 38 − 2 = 36, so the critical value is t0.99 = 2.43, |T| < 2.43, so fail to reject. Using R:
so fail to reject. Using R:

> -1*sqrt((36*60)/(121*1922)) #The test statistic

[1] -0.09637347

> qt(0.99,36) # the critical value

[1] 2.434494

Again, the absolute value of the test statistic is less than the critical value, so fail to reject.
(c) 3 ± 1.688298√(121/60) = (0.602, 5.398). Using R:

> 3-qt(0.95,36)*sqrt(121/60)

[1] 0.6024587

> 3+qt(0.95,36)*sqrt(121/60)

[1] 5.397541

22. (a) b1 = 100/400 = 0.25, b0 = 10 − 0.25(12) = 7. (b) The degrees of freedom are 41 − 2 = 39, the critical value is t0.95 = 1.684875, so the 0.90 confidence interval is 0.25 ± 1.684875√(144/400) = (−0.76, 1.26).
23. The degrees of freedom are 18 − 2 = 16, t0.975 = 2.119905, and the 0.95 confidence interval is 3.1 ± 2.119905√(36/144) = (2.04, 4.16), so a reasonable decision is that the slope is greater than 2.
24. Degrees of freedom are 20 − 2 = 18, t0.975 = 2.100922. A confidence interval for the intercept is
b0 ± t√(s²Y.X ΣXᵢ² / (nΣ(Xᵢ − X̄)²)).
So for the situation at hand the 0.95 confidence interval is
6 ± 2.100922√(25(169)/(20(90))) = (2.78, 9.22).
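The interval for the intercept can be verified in R:

```r
tval <- qt(0.975, 18)  # 2.100922
ci <- 6 + c(-1, 1) * tval * sqrt(25 * 169 / (20 * 90))
round(ci, 2)  # (2.78, 9.22)
```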
25. (a) The degrees of freedom are 27-2=25.

> r=200/sqrt(100*625)
> r

[1] 0.8

> test.stat.T=r*sqrt(25/(1-r^2)) # Test statistic T


> test.stat.T

[1] 6.666667

> qt(0.995,25) # critical value

[1] 2.787436

|T | exceeds the critical value, reject.


(b) The degrees of freedom are 5-2=3.

> r=10/sqrt(16*25)
> test.stat.T=r*sqrt(3/(1-r^2)) # Test statistic T
> test.stat.T

[1] 1

> qt(0.975,3) # critical value

[1] 3.182446

Fail to reject.
26. The degrees of freedom are 29 − 2 = 27, r = 40/√(100(64)) = 0.5, T = 0.5√(27/(1 − 0.5²)) = 3, t0.95 = 1.7, reject. This indicates dependence, but look at the answers to Exercise 27.
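Computing the pieces of Exercise 26 directly in R:

```r
r <- 40 / sqrt(100 * 64)           # Pearson's correlation, 0.5
tstat <- r * sqrt(27 / (1 - r^2))  # test statistic on 27 degrees of freedom
c(r, tstat, qt(0.95, 27))          # tstat exceeds the 0.95 quantile, so reject
```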
27. (a). Yes, if the correlation is positive, this means that the least squares regression
has a positive slope because b1 = rsy /sx . (b) Yes, outliers can mask a negative association.
More generally, even a single outlier might have an inordinate impact on Pearson’s correlation
resulting in a misleading indication of the strength of the association among the bulk of the
points. (c) Plot the data. Check the impact of using a method that deals with outliers and
heteroscedasticity.
28. Nothing, this does not change r. For example,

> set.seed(46)
> x=rnorm(30) # generate 30 values from a standard normal.
> y=rnorm(30)
> cor(x,y)

[1] 0.1888607

> cor(x,3*y)

[1] 0.1888607

Recall that Pearson’s correlation is given by
r = A/√(CD),
where
A = Σ(Xi − X̄)(Yi − Ȳ),
C = Σ(Xi − X̄)²
and
D = Σ(Yi − Ȳ)².
Multiplying Y by the constant c, A becomes cA and D becomes c²D, so r = cA/√(C(c²D)) = A/√(CD) does not change.

29. The absolute value of the slope gets larger. Multiplying Y by 3, the slope and the
intercept are multiplied by 3 as well. For example,

> set.seed(46)
> x=rnorm(30)
> y=rnorm(30)
> lsfit(x,y)$coef

Intercept X
-0.0005236425 0.2094865640

> lsfit(x,3*y)$coef

Intercept X
-0.001570928 0.628459692

The previous exercise demonstrated that multiplying Y by a constant c ≠ 0 does not alter r. But sy becomes csy. Because
b1 = r(sy/sx),
the slope is altered.
30. Increasing σ² to 2 means that the variance of the residuals increases, so the correlation will get smaller. See Figure 8.5.
To explain it another way, recall that r² = VAR(Ŷ)/VAR(Y). Also note that increasing σ² to 2, the variance of the Y values, VAR(Y), increases as well. But because the slope is the same for both situations, VAR(Ŷ) does not change when σ² is increased to 2. Consequently, r² decreases.
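A small simulation sketch illustrates the point; the regression coefficients, sample size and seed here are arbitrary choices, and the slope is the same in both cases:

```r
set.seed(46)
x = rnorm(500)
y1 = 2 + 0.5*x + rnorm(500, sd=1)          # sigma^2 = 1
y2 = 2 + 0.5*x + rnorm(500, sd=sqrt(2))    # sigma^2 = 2, same slope
cor(x, y1)
cor(x, y2)    # typically smaller than cor(x, y1)
```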
31. The slope and intercept were chosen so as to minimize the second sum, Σ(Yi − Ŷi)². So in particular, if Ȳ is used to predict Y, Σ(Yi − Ȳ)² will be larger than Σ(Yi − Ŷi)².
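This can be checked numerically; for any least squares fit, the residual sum of squares is never larger than the sum of squared errors that results from using Ȳ as the prediction. The simulated data here are arbitrary:

```r
set.seed(46)
x = rnorm(30)
y = 2*x + rnorm(30)
b = lsfit(x, y)$coef          # least squares intercept and slope
yhat = b[1] + b[2]*x
sum((y - mean(y))^2)          # error using the mean to predict Y
sum((y - yhat)^2)             # smaller: the least squares error
```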

32. The end of Section 8.5.1 describes several features of data that impact r, each of
which influence the ability of r to detect an association. Another issue is power based on
the sample size of the available data.

33. Outliers, for example, can result in a large r but a poor fit.
34. No. You need to look at the data. Restricting the range of X can increase as well as
decrease r.
35. The confidence interval can be relatively long and is potentially inaccurate.
36. The confidence interval can be relatively inaccurate due to using the wrong standard
error.

Chapter 9
1.

> s1sq=8
> s2sq=24
> n1=20
> n2=10
> sqp=((n1-1)*s1sq+(n2-1)*s2sq)/(n1+n2-2) # Estimate of the
> # assumed common variance.
> sqp

[1] 13.14286

> T=(15-12)/sqrt(sqp*(1/n1+1/n2))
> T

[1] 2.136637

> qt(0.975,n1+n2-2) # Critical value, degrees of freedom = n1+n2-2

[1] 2.048407

Because |T | > 2.048, reject.


2.

> s1sq=4
> s2sq=16
> n1=20
> n2=30
> sqp=((n1-1)*s1sq+(n2-1)*s2sq)/(n1+n2-2) # Estimate of the
> # assumed common variance.
> sqp

[1] 11.25

3.

> T=(45-36)/sqrt(11.25*(1/20+1/30))
> T

[1] 9.29516

> qt(0.975,20+30-2)

[1] 2.010635

Because |T | > 2.0106, reject.


4.

> W=(45-36)/sqrt(4/20+16/30) # Welch's test statistic


> W

[1] 10.50974

> q1=4/20
> q2=16/30
> df=(q1+q2)^2/(q1^2/19+q2^2/29) #Degrees of freedom
> df

[1] 45.13947

> qt(0.975,df) #Critical value

[1] 2.013932

|W | > 2.014, reject.


5. Welch’s test might have more power, roughly because W is larger than T .
6. Because the sample sizes are equal, the estimate of the assumed common variance is
the average of the individual variances, which is 25.

> T=(86-80)/sqrt(25*2/20)
> T

[1] 3.794733

> qt(0.995,20+20-2) # The critical value.

[1] 2.711558

Because |T | > 2.71, reject.


7.

> W=(86-80)/sqrt(2*25/20)
> W

[1] 3.794733

> q1=25/20
> q2=25/20
> df=(q1+q2)^2/(q1^2/19+q2^2/19) #Degrees of freedom
> df

[1] 38

> qt(0.995,df) # The critical value.

[1] 2.711558

Because |W | > 2.71, reject.


8. Notice that in the last two exercises, with equal sample variances, Student’s T and
Welch give exactly the same result, suggesting that when the sample variances are approxi-
mately equal, the choice between T and W makes little difference.
9.
> q1=21/16
> q2=29/16
> df=(q1+q2)^2/(q1^2/15+q2^2/15) #Degrees of freedom
> df

[1] 29.25117

> qt(0.975,df) #Critical Value

[1] 2.044467

> CI=(10-5)-qt(0.975,df)*sqrt(21/16+29/16) #Lower end of confidence interval


> CI[2]=(10-5)+qt(0.975,df)*sqrt(21/16+29/16) #Upper end of confidence interval
> CI

[1] 1.385859 8.614141

That is, the 0.95 confidence interval is (1.4, 8.6). Based on Tukey’s three decision rule, decide
that the first group has the larger population mean.
10.
> sqp=(21+29)/2 # equal sample sizes so average the sample variances
> # to get an estimate of the assumed common variance
> CI=(10-5)-qt(0.975,30)*sqrt(sqp*(1/16+1/16)) #Lower end of confidence interval
> CI[2]=(10-5)+qt(0.975,30)*sqrt(sqp*(1/16+1/16)) #Upper end of confidence interval
> CI

[1] 1.389738 8.610262

That is, the 0.95 confidence interval is (1.39, 8.61). Based on Tukey’s three decision rule,
decide that the first group has the larger population mean.
11.

> x=c(132, 204, 603, 50, 125, 90, 185, 134)
> y=c(92, -42, 121, 63, 182, 101, 294, 36)
> t.test(x,y)

Welch Two Sample t-test

data: x and y
t = 1.1922, df = 11.193, p-value = 0.2579
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-71.17601 240.17601
sample estimates:
mean of x mean of y
190.375 105.875

The p-value is 0.258, so no, fail to reject and make no decision about which group has the
larger population mean.
12. Using R:

> x=c(11.1, 12.2, 15.5, 17.6, 13.0, 7.5, 9.1, 6.6, 9.5, 18.0, 12.6)
> y=c(18.2, 14.1, 13.8, 12.1, 34.1, 12.0, 14.1, 14.5, 12.6,
+ 12.5, 19.8, 13.4, 16.8, 14.1, 12.9)
> t.test(x,y)

Welch Two Sample t-test

data: x and y
t = -1.9546, df = 23.931, p-value = 0.06242
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-7.4082012 0.2021406
sample estimates:
mean of x mean of y
12.06364 15.66667

So in the notation used in the text, the test statistic is W = −1.95, |W | < t0.975 = 2.06,
fail to reject. Or fail to reject because the p-value is greater than 0.05 and make no decision
about which group has the larger population mean.
13. The actual probability coverage could differ substantially from 0.95.
14.
twobinom(15,24,23,42) returns

$p.value
[1] 0.5594805

twobicipv(15,24,23,42) returns

$p.value
[1] 0.999
$ci
[1] -0.2018983 0.3377457
In this particular case, the p-value returned by Beal’s method is substantially larger than
the p-value returned by the Storer–Kim method, suggesting that generally the Storer–Kim
method will have more power.
15. The R command

twobinom(20,98,30,70) returns

$p.value
[1] 0.00174666

$p1
[1] 0.2040816

$p2
[1] 0.4285714
The R command

twobicipv(20,98,30,70) returns

$p.value
[1] 0.005

$ci
[1] -0.37821798 -0.06101773
So both methods reject with α = 0.01, but note that the Storer–Kim method has a lower
p-value again suggesting better power.
16. Using Equation (9.20):

> p1=20/121
> p2=15/80
> Z=(p1-p2)/sqrt(p1*(1-p1)/121 + p2*(1-p2)/80)
> Z

[1] -0.4025341

Because |Z| < 1.96, fail to reject with α = 0.05.

twobinom(20,121,15,80) returns

$p.value
[1] 0.6943439

$p1
[1] 0.1652893

$p2
[1] 0.1875

twobicipv(20,121,15,80) returns

$p.value
[1] 0.999
$ci
[1] -0.14712631 0.09712294
$p1
[1] 0.1652893
$p2
[1] 0.1875

Again, the p-values differ substantially, but this is not always the case.
17.

> x=c(22, 23, 12, 11, 30, 22, 7, 42, 24, 33, 28, 19, 4, 34, 15, 26, 50,
+ 27, 20, 30, 14, 42)
> y=c(17, 22, 16, 16, 14, 29, 20, 20, 19, 14, 10, 8, 26, 9, 14, 17, 21, 16,
+ 14, 11, 14, 11, 29, 13, 4, 16, 16, 7, 21)
> t.test(x,y) #Welch's test

Welch Two Sample t-test

data: x and y
t = 3.0586, df = 29.544, p-value = 0.004694
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
2.760369 13.875995
sample estimates:
mean of x mean of y
24.31818 16.00000

> t.test(x,y,var.equal=TRUE) # Student's T test.
Two Sample t-test

data: x and y
t = 3.3162, df = 49, p-value = 0.001724
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
3.277457 13.358907
sample estimates:
mean of x mean of y
24.31818 16.00000
So Student’s T yields a shorter confidence interval, but the confidence interval based on
Student’s T can be less accurate compared to the confidence interval based on Welch’s
method.
18. Z = 7.5/3.930 = 1.91, which is less than the 0.975 critical value read from Table 1
in Appendix B, which is 1.96. Using R:
x=c(22, 23, 12, 11, 30, 22, 7, 42, 24, 33, 28, 19, 4, 34, 15, 26, 50, 27, 20,
30, 14, 42)
y=c(17, 22, 16, 16, 14, 29, 20, 20, 19, 14, 10, 8, 26, 9, 14, 17, 21, 16, 14,
11, 14, 11, 29, 13, 4, 16, 16, 7, 21)
msmed(x,y)
returns

$test
Group Group test crit se p.value
[1,] 1 2 1.908167 1.959964 3.930473 0.05665605

$psihat
Group Group psihat ci.lower ci.upper
[1,] 1 2 7.5 -0.2035862 15.20359
19. Using R:
x=c(22, 23, 12, 11, 30, 22, 7, 42, 24, 33, 28, 19, 4, 34, 15, 26, 50, 27, 20,
30, 14, 42)
y=c(17, 22, 16, 16, 14, 29, 20, 20, 19, 14, 10, 8, 26, 9, 14, 17, 21, 16, 14,
11, 14, 11, 29, 13, 4, 16, 16, 7, 21)
medpb2(x,y)
returns

$p.value
[1] 0.01

$ci
[1] 3 14

This indicates that the 0.95 confidence interval for the difference between the population
medians is (3, 14). The p-value indicates that the hypothesis of equal medians would be
rejected at the 0.01 level, in contrast to the method based on the McKean–Schrader estimator,
which has a p-value equal to 0.057. One explanation is that there are tied values, the point
being that using a method that allows tied values can make a practical difference. (The R
functions that use the McKean–Schrader estimator check for tied values and print a warning
message if any are found.)
20. For Storer–Kim:
twobinom(49,105,101,156)
returns

$p.value
[1] 0.003722579

$p1
[1] 0.4666667

$p2
[1] 0.6474359
Beal’s method:

twobicipv(49,105,101,156)
returns

$p.value
[1] 0.008

$ci
[1] -0.31510569 -0.04021425
So in this case the two methods give similar p-values.
21. Using R:
twobinom(11,23,10,23)
returns

$p.value
[1] 0.7515369

$p1
[1] 0.4782609

$p2
[1] 0.4347826

So fail to reject.
22. Student’s T controls the Type I error probability when comparing identical distribu-
tions. It is possible that Student’s T accurately detects a difference in the distributions that
might be missed by other methods.
23. In terms of understanding the nature of any difference that might exist, Student’s T
can be highly inaccurate in some situations. Confidence intervals for the difference between
the means can be inaccurate when distributions differ. Problems with heteroscedasticity
can be reduced by using a method that allows heteroscedasticity, such as Welch’s test or a
bootstrap-t method. But inaccurate confidence intervals, as well as relatively low power due
to outliers, are always a concern when using means. Other methods might have substantially
more power, such as techniques based on the median or 20% trimmed mean. Even with
discrete distributions where outliers are rare, the methods in Section 9.11 might have more
power. The only safe way to determine whether alternative methods make a difference is to
try them.
24. The median better reflects the typical value for skewed distributions, it can have
relatively high power when outliers are common. Accurate confidence intervals can be com-
puted over a fairly broad range of situations, much broader than methods based on means.
But for skewed distributions, it is possible that means have more power, even when the Type
I error probability is controlled reasonably well, because the difference between the means
might be larger. As noted in Chapter 2, there are situations where the mean can be argued
to be a better measure of location. Also, when dealing with distributions that have light
tails (outliers are relatively uncommon), the mean or some smaller amount of trimming can
have a smaller standard error.
25. Proceed as described in the last example of Section 9.2.2. B = 1000, A = 10, C = 2,
Q = (A + 0.5C)/B = 0.01 + 0.5(0.002) = 0.011. Set P equal to Q or 1 − Q, whichever is
smaller, so P = 0.011 and the p-value is 2P = 2(0.011) = 0.022.
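In R, the computation is:

```r
B = 1000; A = 10; C = 2
Q = (A + 0.5*C)/B      # 0.011
P = min(Q, 1 - Q)
2*P                    # the p-value: 0.022
```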
26. Assuming the data have been read into the R matrix salk, with columns correspond-
ing to the two groups,
t.test(salk[,1],salk[,2])
returns

t = 3.5139, df = 51.639, p-value = 0.0009277


alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
30.05544 110.11123
sample estimates:
mean of x mean of y
18.00000 -52.08333

27. Assuming the data have been read into the R variable dana,
yuen(dana[,1],dana[,2])
returns

$n1

[1] 19

$n2
[1] 19

$est.1
[1] 282.6923

$est.2
[1] 444.7692

$ci
[1] -326.090263 1.936417

$p.value
[1] 0.05254339

$dif
[1] -162.0769

$se
[1] 79.28439

$teststat
[1] 2.044248

$crit
[1] 2.068671
So the p-value is 0.05254339 and the 0.95 confidence interval is (-326.090263, 1.936417).
28.
g1=c(2, 4, 4, 2, 2, 2, 4, 3, 2, 4, 2, 3, 2, 4, 3, 2, 2, 3, 5, 5, 2, 2)
g2=c(5, 1, 4, 4, 2, 3, 3, 1, 1, 1, 1, 2, 2, 1, 1, 5, 3, 5)
disc2com(g1,g2)
returns a p-value equal to 0.008.
29. Assuming the data for the control group have been read into the R variable A1 and
the data for the experimental group are stored in B3, the R commands
pts=seq(-0.4,0.3,0.1)
ancJN(A1$cort1-A1$cort2,A1$MAPAFREQ,
B3$cort1-B3$cort2,B3$MAPAFREQ, xout=T,regfun=ols,pts=pts)
compare the control group to the intervention group taking into account the cortisol awak-
ening response (CAR). The values for the covariate are taken to be −0.4, −0.3, −0.2, −0.1,
0, 0.1, 0.2 and 0.3, as indicated by the first R command. The output looks like this:
X Est1 Est2 DIF TEST se ci.low
[1,] -0.4 69.64765 90.02571 -20.3780642 -2.99337759 6.807716 -38.96313

[2,] -0.3 71.31745 88.36983 -17.0523809 -3.21080797 5.310931 -31.55122
[3,] -0.2 72.98724 86.71394 -13.7266977 -3.47394203 3.951332 -24.51383
[4,] -0.1 74.65704 85.05806 -10.4010144 -3.55373016 2.926788 -18.39115
[5,] 0.0 76.32684 83.40217 -7.0753312 -2.66185867 2.658042 -14.33178
[6,] 0.1 77.99664 81.74629 -3.7496479 -1.12493458 3.333214 -12.84932
[7,] 0.2 79.66644 80.09041 -0.4239647 -0.09318134 4.549888 -12.84516
[8,] 0.3 81.33624 78.43452 2.9017186 0.48470699 5.986542 -13.44154
ci.hi p.value
[1,] -1.7929998 0.0027590816
[2,] -2.5535382 0.0013236235
[3,] -2.9395618 0.0005128718
[4,] -2.4108835 0.0003798087
[5,] 0.1811225 0.0077710499
[6,] 5.3500269 0.2606167576
[7,] 11.9972300 0.9257595006
[8,] 19.2449774 0.6278842077
A plot of the regression lines suggest that they cross close to where the CAR is 0.2. For CAR
negative, typical MAPAFREQ values tend to be higher among participants who received
intervention. No significant difference is found when the CAR is positive.
30. The R function ancJN uses the Theil–Sen estimator by default. So now the analysis
is done with the R commands
pts=seq(-0.4,0.3,0.1)
ancJN(A1$cort1-A1$cort2,A1$MAPAFREQ,
B3$cort1-B3$cort2,B3$MAPAFREQ, xout=T,pts=pts)
Again, the values for the covariate are taken to be −0.4, −0.3, −0.2, −0.1, 0, 0.1, 0.2 and
0.3, as indicated by the first R command. The output looks like this:

$output
X Est1 Est2 DIF TEST se ci.low
[1,] -0.4 72.77228 93.08529 -20.313011 -2.70947053 7.497041 -40.77993
[2,] -0.3 74.04465 91.51616 -17.471512 -2.99898528 5.825808 -33.37597
[3,] -0.2 75.31703 89.94704 -14.630013 -3.40723321 4.293810 -26.35211
[4,] -0.1 76.58940 88.37791 -11.788513 -3.78585817 3.113829 -20.28927
[5,] 0.0 77.86177 86.80879 -8.947014 -3.22064238 2.778022 -16.53101
[6,] 0.1 79.13415 85.23966 -6.105515 -1.72681529 3.535708 -15.75800
[7,] 0.2 80.40652 83.67054 -3.264016 -0.66560732 4.903816 -16.65143
[8,] 0.3 81.67890 82.10141 -0.422517 -0.06492126 6.508145 -18.18975
ci.hi p.value
[1,] 0.1539103 0.0067390692
[2,] -1.5670566 0.0027088039
[3,] -2.9079108 0.0006562503
[4,] -3.2877614 0.0001531790
[5,] -1.3630152 0.0012790364
[6,] 3.5469685 0.0842008407

[7,] 10.1234027 0.5056621260
[8,] 17.3447187 0.9482366918

So again, for CAR negative, typical MAPAFREQ values tend to be higher among participants
who received intervention. No significant difference is found when the CAR is positive.
Looking at the plot returned by ancJN, the regression lines based on the Theil–Sen estimator
appear to cross close to where the CAR is 0.3, rather than 0.2 when using least squares
regression.

Chapter 10
1. Because the sample sizes are equal, simply average the three sample variances yielding
MSWG=(6.214+3.982+2.214)/3=4.14.
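This can be verified with R:

```r
mean(c(6.214, 3.982, 2.214))   # MSWG = 4.136667, which rounds to 4.14
```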
2. Using R:
g1=c(3, 5, 2, 4, 8, 4, 3, 9)
g2=c(4, 4, 3, 8, 7, 4, 2, 5)
g3=c(6, 7, 8, 6, 7, 9, 10, 9)
x=cbind(g1,g2,g3) # This stores the data in a matrix
anova1(x)
returns

$F.test
[1] 6.053237

$p.value
[1] 0.00839879

$df1
[1] 2

$df2
[1] 21

$MSBG
[1] 25.04167

$MSWG
[1] 4.136905

An alternative approach would have been to store the data in list mode:
x=list()
x[[1]]=c(3, 5, 2, 4, 8, 4, 3, 9)
x[[2]]=c(4, 4, 3, 8, 7, 4, 2, 5)
x[[3]]=c(6, 7, 8, 6, 7, 9, 10, 9)
anova1(x)

3. Assume the data are stored in the R variable x as described in the answer to Exercise
2.
t1way(x,tr=0)
returns

$TEST
[1] 7.774918

$nu1
[1] 2

$nu2
[1] 13.40326

$n
[1] 8 8 8

$p.value
[1] 0.005733349

As indicated, the p-value is 0.0057, confirming that the null hypothesis would be rejected
with α = 0.01.
4. Entering data into a matrix as described in Section 1.3.3, one way of doing this is as
follows:
dat=matrix(c(15,17,22,9,12,15,17,20,23,13,12,17),ncol=4)
anova1(dat)
returns

$F.test
[1] 4.210526

$p.value
[1] 0.04615848

$df1
[1] 3

$df2
[1] 8

$MSBG
[1] 40

$MSWG
[1] 9.5

5. If the data are stored in the R variable dat, and because the sample sizes are equal,
can use the R command mean(apply(dat,2,var)). The second argument in the command
apply is 2 indicating that it will compute the variance for each column of the data. Then
command mean averages these sample variances. This yields 9.5, which agrees with the value
of MSWG returned by anova1 in the previous exercise.
6. Assuming the data are stored in dat as described in the answer to Exercise 4,
t1way(dat,tr=0)
returns

$TEST
[1] 3.37717

$nu1
[1] 3

$nu2
[1] 4.42256

$n
[1] 3 3 3 3

$p.value
[1] 0.1240676

7. With 20% trimming and only n = 3 observations per group, 0.2 × 3 = 0.6 rounds down to zero, so no values get trimmed.
8. It is not known when the power of a test for equal variances is high enough to detect situations where unequal variances are a practical issue. At least five published papers have found this strategy to be unsatisfactory.
9. F=MSBG/MSWG=2.3. The degrees of freedom are ν1 = J − 1 = 5 − 1 = 4 and
ν2 = nJ − J = (5(15) − 5) = 70, and the critical value is f0.95 = 2.5, F is less than the
critical value, so fail to reject.
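Rather than reading f0.95 from a table, the critical value can be obtained with R:

```r
qf(0.95, 4, 70)   # critical value: about 2.5
```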
10. Using R with list mode is convenient because the sample sizes are not equal.
dat=list()
dat[[1]]=c(9,10,15)
dat[[2]]=c(16,8,13,6)
dat[[3]]=c(7,6,9)
anova1(dat)
returns

$F.test
[1] 1.145033

$p.value

[1] 0.3713448

$df1
[1] 2

$df2
[1] 7

$MSBG
[1] 14.40833

$MSWG
[1] 12.58333

11. For equal sample sizes, MSBG estimates σp² + nσµ², the assumed common variance plus the variation among the population means multiplied by the common sample size. For the situation at hand, the variation among the population means is

> var(c(3,4,5,6,7))

[1] 2.5

The sample size is n = 10 and the common variance among the five groups is σp² = 2. So MSBG is estimating 2+10(2.5)=27. Because the null hypothesis is false, MSBG estimates a larger quantity, on average, than the quantity being estimated by MSWG, which is σp² = 2.
12. Using R:
g=list()
g[[1]]=c(10, 11, 12, 9, 8, 7)
g[[2]]=c(10, 66, 15, 32, 22, 51)
g[[3]]=c(1, 12, 42, 31, 55, 19)
anova1(g)
returns p.value = 0.082
and
t1way(g,tr=0)
returns p.value=0.046. Heteroscedasticity might explain this. The variance for group 1 is
substantially smaller compared to groups 2 and 3.
13.
The degrees of freedom are ν1 = J − 1 = 5 and ν2 = N − J = 8, so the number of groups
is J = 6 and the total sample size is N = 14.
14. The distributions differ, suggesting that in particular the means differ. But the
details about which groups differ and how groups differ are unclear.
15. Low power due to outliers, violating the equal variance assumption (homoscedastic-
ity), differences in skewness, small sample sizes, relatively large standard errors.
16. No. Power of the test for equal variances might be too low. Five published papers,
cited in the text, indicate that this strategy for salvaging the homoscedasticity assumption
is unsatisfactory.

17. Unclear whether the test has enough power to detect a departure from normality
that has practical importance.
18. Generate data so that the groups have unequal variances.
19. There is a main effect for A: µ11 + µ12 = 180, which is not equal to µ21 + µ22 = 120. Similarly, there is a main effect for Factor B: 110 + 80 ≠ 70 + 40. There is no interaction: 110 − 70 = 80 − 40.
20. There is a main effect for A and B and an interaction. Factor A: µ11 + µ12 = 10 + 20 = 30, which is not equal to µ21 + µ22 = 40 + 10 = 50. For Factor B: 10 + 40 ≠ 20 + 10. Interaction: 10 − 20 ≠ 40 − 10.
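These checks can be organized in R by storing the Exercise 20 means in a matrix; arranging the rows as the levels of Factor A is simply one convenient layout:

```r
mu = matrix(c(10, 20, 40, 10), nrow=2, byrow=TRUE)   # rows are levels of Factor A
rowSums(mu)                                  # 30 and 50: a main effect for A
colSums(mu)                                  # 50 and 30: a main effect for B
(mu[1,1] - mu[1,2]) == (mu[2,1] - mu[2,2])   # FALSE: an interaction
```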
21.
x=list()
x[[1]]=c( 7, 9, 8, 12, 8, 7, 4, 10, 9, 6)
x[[2]]=c(10, 13, 9, 11, 5, 9, 8, 10, 8, 7)
x[[3]]=c(12, 11, 15, 7, 14, 10, 12, 12, 13, 14)
med1way(x)
The p-value is 0.018. However, there are tied values. The R function Qanova is a better
choice, but the sample sizes are less than 20, suggesting that Qanova might not adequately
control the probability of a Type I error. (Consequently, the R function medpb would be
better a choice in this situation, which is described in Section 12.6.1.)
22. The p-value returned by Qanova is 5e-04=0.0005, so using a method that allows tied
values when comparing medians can make a practical difference. This function is preferable
to using med1way because there are tied values, but because the sample sizes are relatively
small, the function medpb in Chapter 12 is probably better in terms of controlling the Type
I error probability.
23. Still assuming the data are stored in the R variable x, t1way(x) returns a p-value
equal to 0.0022.
24. Still assuming the data are stored in the R variable x, the p-value returned by
kruskal.test(x)
is 0.004. The p-value returned by
bdm(x)
is 0.0016. Both methods indicate that the distributions differ.

Chapter 11
1. Three different R commands for performing the paired T test are illustrated.
corkall=read.table('corkall.dat',header=T) # Or read the
# data using file.choose; see Chapter 1.
labels(corkall) # Returns the labels which are: "N" "E" "S" "W"
t.test(corkall[,2]-corkall[,3]) # Or use
t.test(corkall$E-corkall$S) # Or use
t.test(corkall$E,corkall$S,paired=T)
Each of the last three commands returns:

Paired t-test

data: corkall$E and corkall$S
t = -1.7514, df = 27, p-value = 0.09122
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-7.6002671 0.6002671
sample estimates:
mean of the differences
-3.5

2. Continuing as described in the previous answer, and remembering that trimci uses a
20% trimmed mean by default,
trimci(corkall[,2]-corkall[,3])
returns

$ci
[1] -7.30965260 -0.02368073

$estimate
[1] -3.666667

$test.stat
[1] -2.12353

$se
[1] 1.726685

$p.value
[1] 0.04868652

3. Continuing as described in the last two exercises


yuend(corkall[,2],corkall[,3])
returns

$ci
[1] -8.657220 1.101665

$p.value
[1] 0.1207526

$est1
[1] 43.61111

$est2
[1] 47.38889

$dif
[1] -3.777778

So fail to reject at the 0.05 level, in contrast to the previous exercise, which was based on difference scores. This illustrates that the choice between using difference scores and comparing the marginal trimmed means can make a practical difference.
4. The sample trimmed mean of the difference scores is not necessarily equal to the
difference between the trimmed means of the marginal distributions. In Exercise 2, the
20% trimmed mean of the difference scores is −3.67. But in Exercise 3, the difference
between the 20% trimmed means associated with the east and south side of the trees is
43.61111 − 47.38889 = −3.78.
5. Yes. Similar to the answer to Exercise 4, the population trimmed mean of the difference
scores is not necessarily equal to the difference between the population trimmed means of
the marginal distributions. Consequently, power can depend on which method is used.
6. The R command
trimcibt(corkall[,2]-corkall[,3],tr=0)
returns

$estimate
[1] -3.5

$ci
[1] -7.6260961 0.6260961

$test.stat
[1] -1.751449

$p.value
[1] 0.0918197

7. The R command
trimcibt(corkall[,2]-corkall[,3],tr=0.2) returns

$estimate
[1] -3.666667

$ci
[1] -7.0710545 -0.2622789

$test.stat
[1] -2.12353

$p.value
[1] 0.03672788

8. Using R:
x=c(10, 14, 15, 18, 20, 29, 30, 40)
y=c(40, 8, 15, 20, 10, 8, 2, 3)
signt(x,y)
returns

$Prob_x_less_than_y
[1] 0.2857143

$ci
[1] 0.0435959 0.7103684

$n
[1] 8

$N
[1] 7

$p.value
[1] 0.46
Note: The result N = 7 means that among the n = 8 difference scores, seven are not
equal to zero and are used in the analysis.
wilcox.test(x,y)
returns

Wilcoxon signed rank test with continuity correction

data: x and y
V = 21, p-value = 0.2719
Note: The value indicated by V is the first test statistic W described in Section 11.4, which
assumes no tied values and can be used to compute an exact p-value.

9. Using R:

> x=c(86, 71, 77, 68, 91, 72, 77, 91, 70, 71, 88, 87)
> y=c(88, 77, 76, 64, 96, 72, 65, 90, 65, 80, 81, 72)
> wilcox.test(x,y,paired=TRUE)
Wilcoxon signed rank test with continuity correction

data: x and y
V = 41.5, p-value = 0.4765
alternative hypothesis: true location shift is not equal to 0

signt(x,y)

$Prob_x_less_than_y
[1] 0.3636364

$ci
[1] 0.1276735 0.6924891

$n
[1] 12

$N
[1] 11

$p.value
[1] 0.55
10. Here is one way of doing this with R:
g1=which(Indometh[,2]==0.5)
g2=which(Indometh[,2]==0.75)
trimci(Indometh[g1,3]-Indometh[g2,3],tr=0)
which returns:

$ci
[1] 0.1206732 0.6859935

$estimate
[1] 0.4033333

$test.stat
[1] 3.668014

$se
[1] 0.1099596

$p.value
[1] 0.01447331
11. Using R:

> library(WRS) #This command is not necessary if Rallfun is installed.


> g1=which(Indometh[,2]==0.5)
> g2=which(Indometh[,2]==0.75)
> akerd(Indometh[g1,3]-Indometh[g2,3]) #Plots the distribution

[1] "Done"

> # of the difference scores.


> trimcibt(Indometh[g1,3]-Indometh[g2,3],tr=0.2)

[1] "Taking bootstrap samples. Please wait."


$estimate
[1] 0.335

$ci
[1] 0.1267734 0.5432266

$test.stat
[1] 5.935775

$p.value
[1] 0.02337229

Using the mean rather than a 20% trimmed mean:


trimcibt(Indometh[g1,3]-Indometh[g2,3],tr=0)

returns

$estimate
[1] 0.4033333

$ci
[1] 0.009749199 0.796917467
A boxplot indicates an extreme outlier, which explains why the confidence interval for
the mean is so much longer than the confidence interval based on the 20% trimmed mean.
12. The read.table command is not appropriate because the goal is not to read columns
of data, with each column representing a different variable. There is only one variable with
more than one value in each row.
Here are R commands that can be used:
z1=scan(file='CESD_before_dat.txt')
z2=scan(file='CESD_after_dat.txt')
trimci(z1-z2,tr=0)
which returns

[1] 0.4367194 2.3598450

$estimate
[1] 1.398282

$test.stat
[1] 2.860787

$se
[1] 0.4887753

$p.value
[1] 0.004499525

$n
[1] 326
For a 20% trimmed mean use
trimci(z1-z2,tr=0.2) which returns

$ci
[1] 0.1434972 1.6711967

$estimate
[1] 0.9073469

$test.stat
[1] 2.342703

$se
[1] 0.3873077

$p.value
[1] 0.02015272

$n
[1] 326

One way of using the median is with the R command


sintv2(z1-z2)
which returns

[1] "Duplicate values detected; hdpb might have more power"


$n
[1] 326

$ci.low
[1] 0

$ci.up
[1] 2

$p.value
[1] 0.79

Using instead
hdpb(z1-z2)
returns

$ci
[1] -0.1412505 1.6235765

$n
[1] 326

$estimate
[1] 0.6218725

$p.value
[1] 0.134

demonstrating again that when there are tied values and inferences are made based on
the median, using methods designed for tied values can make a practical difference. The
R function hdpb uses a percentile bootstrap method in conjunction with the usual sample
median M replaced by an alternative estimator of the population median not described in this
book. Readers interested in the details are referred to the description of the Harrell–Davis
estimator in Wilcox (2012b).
13. Still assuming the data are stored in the R variables z1 and z2
yuend(z1,z2)
returns

$ci
[1] -0.2710108 1.7812149

$p.value
[1] 0.1482984

$est1
[1] 11.47449

$est2
[1] 10.71939

$dif
[1] 0.755102

$se
[1] 0.5202874

$teststat
[1] 1.451317
In general, the difference between the trimmed means associated with two dependent groups
is not equal to the trimmed mean of the difference scores.
14. Still assuming the data are stored in the R variables z1 and z2, the R command
wilcox.test(z1,z2,paired=T)
returns

data: z1 and z2
V = 26948.5, p-value = 0.01392
So, if testing at the α = 0.05 level, reject and conclude that the distributions differ.
15. Still assuming the data are stored in the R variables z1 and z2 as described in the
answer to Exercise 12, the R command
comdvar(z1,z2)
returns

$p.value
[1] 0.01300589
$est1
[1] 114.4718

$est2
[1] 89.1083
So conclude that the variances differ and that the variance among the measures of depressive
symptoms is smaller after intervention.
16. First, read the data into R:
Z=read.table('CESDMF123_dat.txt',header=TRUE)
or use the R command file.choose as illustrated in Chapter 1.
As explained in the text, columns 2-4 contain the measures of depressive symptoms taken
at three different times. So the R command
rmanova(Z[,2:4],tr=0)
tests the hypothesis of equal means and returns

$test
[1] 1.24193

$df
[1] 1.889864 134.180342

$p.value
[1] 0.2908471

$tmeans
[1] 13.59722 12.05778 12.44444
So in particular, fail to reject with a Type I error of α = 0.05.
17. Method BPRM does not assume a common correlation. As in Exercise 16, it is as-
sumed the data have been read into the R variable Z, in which case the data are in columns
2-4. The R command
bprm(Z[,2:4])
returns

$test.stat
[1] 1.444947

$nu1
[1] 1.852673

$p.value
[1] 0.2363027

So in particular, fail to reject with α = 0.05. If the p-value had been less than 0.05, conclude
that the distributions differ in some manner.
When using Friedman’s test the R command
friedman.test(Z[,2:4])
produces an error because Z is a data frame. Unlike the R function bprm, Z must be converted
to a matrix, which can be done with the command
friedman.test(as.matrix(Z[,2:4]))
The p-value is 0.611, which differs substantially from the p-value based on method BPRM.
Friedman’s test is based on more restrictive assumptions, which can impact power.
18. As in the previous two exercises, it is assumed the data have been read into the R
variable Z. Again the measures taken at times 1-3 are in columns 2-4. Column 5 indicates
whether a participant is male (1) or female (2). The first step is to store the data in a form
that can be accepted by the R function bwtrim. This is done with the command
M=bw2list(Z,5,c(2:4)).
To compare the groups with a 10% trimmed mean, use the command
bwtrim(2,3,M,tr=0.1)
which returns

$Qa
[1] 2.292791

$Qa.p.value
[,1]
[1,] 0.1435347

$Qb
[1] 1.123135

$Qb.p.value
[,1]
[1,] 0.3443356

$Qab
[1] 2.435661

$Qab.p.value
[,1]
[1,] 0.1122762

So when testing at the α = 0.05 level, fail to reject the hypothesis of main effects and the
hypothesis of no interaction.

Chapter 12
1. The probability of one or more Type I errors can be unacceptably high.

2. MSWG=11.6. The test statistic is T = |15 − 10|/√(11.6(1/20 + 1/20)) = 4.64. Or using
R, the test statistic is

> abs(15-10)/sqrt(11.6*(1/20+1/20))

[1] 4.642383

The total sample size is N = 100. Because there are 5 groups the degrees of freedom are
ν = 100−5 = 95 and the critical value (Table 4 in Appendix B) is t0.975 = 1.985251. Because
T ≥ 1.985251, reject. Or in the context of Tukey’s three decision rule, decide that the first
group has the largest population mean. But even assuming normality and homoscedasticity,
FWE is not controlled because more than three groups are being compared.
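The critical value can also be obtained with R rather than Table 4:

```r
qt(0.975, 95)   # critical value: about 1.985
```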
3. The test statistic is T = |15 − 10|/√(11.6(1/20 + 1/20)/2) = 6.565. Using R, the test
statistic is

> abs(15-10)/sqrt(11.6*(1/20+1/20)/2)

[1] 6.565322

The total sample size is N = 100. There are 5 groups so the degrees of freedom are
ν = 100 − 5 = 95. With FWE equal to 0.05, the critical value is q = 3.9 (Table 9 in
Appendix B). Because T > 3.9, reject. The critical value can be determined using R rather
than Table 9:

> qtukey(0.95,5,df=95)

[1] 3.932736

where the first argument 0.95 means that FWE is equal to 1-0.95=0.05, the second argument
indicates the number of groups, and df indicates the degrees of freedom. Assuming normality
and homoscedasticity, FWE is controlled. But violating these assumptions, FWE might not
be controlled adequately and power might be relatively poor.
4. Because there are equal sample sizes, MSWG = (5+6+4+10+15)/5 = 8 and the test
statistic is T = |20 − 12|/√(8(1/10 + 1/10)) = 6.325. Using R, the test statistic is

> abs(20-12)/sqrt(8*(1/10+1/10))

[1] 6.324555

The total sample size is 50, there are 5 groups, so the degrees of freedom are ν = 50 − 5 = 45.
The critical value is

> qt(0.975,df=45)

[1] 2.014103

(or the critical value t0.975 can be read from Table 4 in Appendix B). T is greater than the
critical value, so reject.
5. Because there are equal sample sizes, MSWG = (5+6+4+10+15)/5 = 8. The test
statistic is T = |20 − 12|/√(4(1/10 + 1/10)) = 8.944. Or using R, the test statistic is

> abs(20-12)/sqrt(4*(1/10+1/10))

[1] 8.944272

The total sample size is 50, there are 5 groups, so the degrees of freedom are ν = 50 − 5 = 45.
The critical value is

> qtukey(0.95,5,df=45)

[1] 4.018417

(or the critical value can be read from Table 9 in Appendix B). Because T is greater than
the critical value, reject. Or in the context of Tukey’s three decision rule, decide that the
first group has the larger population mean.
6. An outlier in any group increases the corresponding variance, which in turn increases
MSWG. The result is that power decreases.
7. Because the Tukey–Kramer method uses MSWG, again outliers inflate MSWG, which
in turn decreases power.
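A quick numerical illustration of this point (the data are hypothetical): a single outlier can drastically inflate the sample variance, and MSWG is just the average of the group variances.

```r
x = c(2, 3, 4, 5, 6)      # a small hypothetical sample
var(x)                    # [1] 2.5
xout = c(2, 3, 4, 5, 60)  # same sample, last value now an outlier
var(xout)                 # [1] 639.7, greatly inflated
```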
8. Basically, apply Welch’s method, only read the critical value from Table 10 in Appendix
B rather than Table 4. The critical value in Table 10 depends on how many tests are to
be performed. Here, all pairwise comparisons are planned, there are J = 5 groups, so the
number of tests is C = (J² − J)/2 = (5² − 5)/2 = 10. The test statistic for comparing groups
1 and 2 is W = (15 − 10)/√(4/20 + 9/20) = 6.2. The degrees of freedom are:

> q1=4/20
> q2=9/20
> df=(q1+q2)^2/(q1^2/19+q2^2/19)
> df

[1] 33.10309

In the notation used in the text, the degrees of freedom are ν̂ = 33. From Table 10, the
critical value is 2.99, |W | > 2.99, so reject.
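The test statistic itself can also be computed with R, mirroring the style of the other solutions:

```r
q1 = 4/20; q2 = 9/20         # squared standard errors for groups 1 and 2
W = (15 - 10)/sqrt(q1 + q2)  # test statistic
W                            # [1] 6.201737
```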
9. As in Exercise 8, apply Welch’s method, only read the critical value from Table 10 in
Appendix B rather than Table 4. The critical value in Table 10 depends on how many tests
are to be performed. Here, all pairwise comparisons are planned, so the number of tests is
C = (5² − 5)/2 = 10. The test statistic for comparing groups 1 and 2 is

> W=(20-12)/sqrt(5/10 + 6/10)


> W

[1] 7.627701

The degrees of freedom are
> q1=5/10
> q2=6/10
> df=(q1+q2)^2/(q1^2/9+q2^2/9)
> df

[1] 17.85246

Each squared standard error is based on a sample size of 10, so each term in the denominator
of the degrees of freedom uses n − 1 = 9. The test statistic |W | = 7.63 easily exceeds the
critical value read from Table 10, so reject. Rather than use Table 10, the
critical value can be determined via the R command
smmcrit(17.85,10),
where the first argument is the degrees of freedom and the second argument indicates the
number of hypotheses that are to be tested.
10. The test statistic is W = (20 − 12)/√(5/10 + 6/10) = 7.63. The degrees of freedom
are ν = ∞, the number of hypotheses to be tested is C = (4² − 4)/2 = 6, so from Table
10 the critical value is 2.63, reject. When using R, the critical value is determined by the
command
smmcrit(Inf,6)
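When ν = ∞, the studentized maximum modulus distribution reduces to the maximum of C independent standard normal variables in absolute value, so (under that independence assumption) the critical value 2.63 can be checked with base R:

```r
C = 6
crit = qnorm((1 + 0.95^(1/C))/2)  # solves P(max |Z| <= c) = 0.95 for C independent Z's
crit                              # approximately 2.63
```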
11. Possibly invalid results when comparing the median of the first group to any other
group due to the poor estimate of the standard error associated with the first group. All
known methods for estimating the standard error of the sample median can perform poorly
when there are tied values.
12. Note that the hypothesis is H0 : µ1 − µ2 + µ4 − µ5 = 0. So the linear contrast
coefficients are: 1, −1, 0, 1, −1, 0. The test statistic is
W = (24 − 36 + 14 − 24)/√(48/8 + 56/7 + 64/8 + 25/5) = −4.23.
Using R, the test statistic is
> (24-36+14-24)/sqrt(48/8+56/7+64/8+25/5)

[1] -4.233902

The degrees of freedom are


> q1=48/8
> q2=56/7
> q4=64/8
> q5=25/5
> V1=(q1+q2+q4+q5)^2
> V2=q1^2/7+q2^2/6+q4^2/7+q5^2/4 # each term is divided by n-1
> DF=V1/V2
> DF

[1] 23.3636

That is, ν = 23.36. So from Table 4 in Appendix in B, the critical value is t0.975 = 2.07. Or
using R, the critical value is

> qt(0.975,23.36)

[1] 2.066895

Because |W | exceeds the critical value, reject.


13. There are three groups associated with factor B, all pairwise comparisons are to be
performed, so the number of tests is C = (3² − 3)/2 = 3. Here is how to compute the degrees
of freedom:

> q2=56/7
> q3=60/12
> q5=25/5
> q6=40/10
> V1=(q2+q3+q5+q6)^2
> V2=q2^2/6+q3^2/11+q5^2/4+q6^2/9
> DF=V1/V2
> DF

[1] 23.0837

So the degrees of freedom are ν = 23.08. From Table 10 in Appendix B, the critical value is
approximately 2.57. Or using the R command
smmcrit(23.08,3),
the critical value is 2.566.
14. The hypothesis is H0 : µ1 − µ2 − µ4 + µ5 = 0. So the linear contrast coefficients are
1, −1, 0, −1, 1, 0. Using R:

> q1=48/8
> q2=56/7
> q4=64/8
> q5=25/5
> W=(24-36-14+24)/sqrt(q1+q2+q4+q5) # Test statistic
> W

[1] -0.3849002

> V1=(q1+q2+q4+q5)^2
> V2=q1^2/7+q2^2/6+q4^2/7+q5^2/4
> DF=V1/V2
> DF

[1] 23.3636

> qt(0.975,DF) # Critical value

[1] 2.066895

That is, because only one test is to be performed, the critical value is based on Student’s T
distribution, which can be read from Table 4 in Appendix B instead. Because the absolute
value of the test statistic, |W | = 0.38, is less than the critical value t0.975 = 2.06, fail to
reject. Or in the context of Tukey’s three decision rule, make no decision about which group
has the larger population mean.
15. With five tests and FWE equal to 0.05, the Bonferroni method would reject when
the p-value is less than or equal to 0.05/5=0.01. So none of the five hypotheses would be
rejected.
16. The largest p-value is 0.049, which is less than 0.05, so both methods reject all five
hypotheses.
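Both adjustments are available through R's built-in p.adjust function. The five p-values are not repeated here, so the values below are hypothetical, chosen only to be consistent with the descriptions in Exercises 15 and 16 (all larger than 0.01, with largest equal to 0.049):

```r
pvals = c(0.02, 0.025, 0.03, 0.04, 0.049)       # hypothetical p-values
p.adjust(pvals, method = "bonferroni") <= 0.05  # Bonferroni: no rejections
p.adjust(pvals, method = "hochberg") <= 0.05    # Hochberg: reject all five
```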
17. Both Rom’s method and Hochberg’s method have as much or more power than the
Bonferroni method.
18. The largest p-value is 0.24, so fail to reject the corresponding hypothesis. The
next largest p-value is 0.12, which is greater than 0.05/2 = 0.025, so fail to reject the
corresponding hypothesis. The next largest p-value is 0.04, which is greater than
0.05/3 = 0.0167, so fail to reject the corresponding hypothesis. The next largest p-value is
0.005, which is less than 0.05/4 = 0.0125, so reject the corresponding hypothesis as well as
the remaining hypothesis associated with the smallest p-value.
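The step-down logic above can be verified with p.adjust; the smallest p-value is not stated in the solution, so 0.001 is used below purely as a hypothetical placeholder:

```r
pvals = c(0.001, 0.005, 0.04, 0.12, 0.24)    # 0.001 is hypothetical
p.adjust(pvals, method = "hochberg") <= 0.05 # only the two smallest are rejected
```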
19. In general, for L means µ1, . . . , µL and L contrast coefficients c1, . . . , cL, the linear
contrast is Σ cℓ µℓ, summing over ℓ = 1, . . . , L, where Σ cℓ = 0. For the situation at hand,
multiplying the means µ1, µ2, µ3, µ4, µ5, µ6 by the corresponding linear contrast coefficients
1, 1, 0, 0, −1, −1 and adding the results yields µ1 + µ2 − µ5 − µ6. That is, the goal is to
test H0 : µ1 + µ2 − µ5 − µ6 = 0.
20. The linear contrast is 1(µ1 )−2(µ2 )+1(µ4 ), so the null hypothesis is H0 : µ1 −2µ2 +µ4 =
0.
21. H0 : 2µ1 − µ3 − µ5 = 0. So the linear contrast coefficients are 2, 0, −1, 0, −1.
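Evaluating a linear contrast in R is a one-line computation: multiply the contrast coefficients by the means and sum. The means below are hypothetical, used only to illustrate:

```r
con = c(2, 0, -1, 0, -1)  # contrast coefficients from Exercise 21
mu = c(10, 12, 8, 9, 6)   # hypothetical group means
sum(con)                  # coefficients must sum to 0: [1] 0
sum(con*mu)               # estimated contrast 2(10) - 8 - 6: [1] 6
```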
22.

> A=matrix(c(2,0,-1,0,-1))
> A

[,1]
[1,] 2
[2,] 0
[3,] -1
[4,] 0
[5,] -1

23. Contrast coefficients are stored in a matrix for which the number of rows is equal to
the number of groups. So in this particular case, the matrix should have four rows because
there are four groups.
24.

> matrix(c(1, 0, 0, -1, 0, 1, -1, 0),ncol=2)

[,1] [,2]
[1,] 1 0

[2,] 0 1
[3,] 0 -1
[4,] -1 0

25. The R command


con2way(2,3)$conAB
returns

[,1] [,2] [,3]


[1,] 1 1 0
[2,] -1 0 1
[3,] 0 -1 -1
[4,] -1 -1 0
[5,] 1 0 -1
[6,] 0 1 1

So there are three linear contrasts relevant to interactions, which correspond to the three
columns of this matrix.

Chapter 13
1. Using Equation (13.2):

> nj=c(23, 14, 8, 32) # Observed frequencies for the 4 categories


> n=sum(nj) # Total number of observations.
> k=4 # Number of categories
> XSQ=sum((nj-n/k)^2)/(n/k) # Test statistic
> XSQ

[1] 17.18182

> qchisq(0.95,k-1) # Critical value

[1] 7.814728

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=c(0.25,0.25,0.25,0.25))

Chi-squared test for given probabilities

data: nj
X-squared = 17.182, df = 3, p-value = 0.0006484

The test statistic exceeds the critical value, reject. Or because the p-value is less than 0.05,
reject.
2. Using Equation (13.2):

> nj=c(23, 34, 43, 53, 16) # Observed frequencies for the 5 categories
> n=sum(nj) # Total number of observations.
> k=5 # Number of categories
> XSQ=sum((nj-n/k)^2)/(n/k) # Test statistic
> XSQ

[1] 26.23669

> qchisq(0.99,k-1) # Critical value

[1] 13.2767

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=c(0.20,0.20,0.20,0.20,0.20))

Chi-squared test for given probabilities

data: nj
X-squared = 26.237, df = 4, p-value = 2.835e-05

The test statistic exceeds the critical value, reject. Or because the p-value is less than 0.01,
reject.
3. Using Equation (13.2):

> nj=c(6, 20, 30, 35, 10, 5) # Observed frequencies for the 6 categories
> n=sum(nj) # Total number of observations.
> k=6 # Number of categories
> XSQ=sum((nj-n/k)^2)/(n/k) # Test statistic
> XSQ

[1] 46.03774

> qchisq(0.99,k-1) # Critical value

[1] 15.08627

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=rep(1/6,6))

Chi-squared test for given probabilities

data: nj
X-squared = 46.038, df = 5, p-value = 8.923e-09

The test statistic exceeds the critical value, reject. Or because the p-value is less than 0.01,
reject.
4. Using Equation (13.2):

> nj=c(9, 30, 15) # Observed frequencies for the 3 categories
> n=sum(nj) # Total number of observations.
> k=3 # Number of categories
> XSQ=sum((nj-n/k)^2)/(n/k) # Test statistic
> XSQ

[1] 13

> qchisq(0.95,k-1) # Critical value

[1] 5.991465

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=rep(1/3,3))

Chi-squared test for given probabilities

data: nj
X-squared = 13, df = 2, p-value = 0.001503

The test statistic exceeds the critical value, reject. Or because the p-value is less than 0.05,
reject.
5. Using Equation (13.4):
> nj=c(10, 40, 50, 5) # Observed frequencies
> n=sum(nj) # Total number of observations.
> p=c(0.1, 0.3, 0.5, 0.1) #Hypothesized probabilities
> k=4 # Number of categories
> XSQ=sum((nj-n*p)^2/(n*p))# Test statistic
> XSQ

[1] 5.31746

> qchisq(0.95,k-1) # Critical value

[1] 7.814728

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=c(0.1, 0.3, 0.5, 0.1))

Chi-squared test for given probabilities

data: nj
X-squared = 5.3175, df = 3, p-value = 0.15

The test statistic is less than the critical value, fail to reject. Or because the p-value is
greater than 0.05, fail to reject.
6. Using Equation (13.4):

> nj=c(20, 50, 40, 10, 15) # Observed frequencies
> n=sum(nj) # Total number of observations.
> p=c(0.2, 0.3, 0.3, 0.1, 0.1) #Hypothesized probabilities
> k=5 # Number of categories
> XSQ=sum((nj-n*p)^2/(n*p))# Test statistic
> XSQ

[1] 5.123457

> qchisq(0.95,k-1) # Critical value

[1] 9.487729

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=c(0.2, 0.3, 0.3, 0.1, 0.1))

Chi-squared test for given probabilities

data: nj
X-squared = 5.1235, df = 4, p-value = 0.2749

The test statistic is less than the critical value, fail to reject. Or because the p-value is
greater than 0.05, fail to reject.
7. Using Equation (13.4):

> nj=c(40,50,10) # Observed frequencies


> n=sum(nj) # Total number of observations.
> p=c(0.5, 0.3, 0.2) #Hypothesized probabilities
> k=3 # Number of categories
> XSQ=sum((nj-n*p)^2/(n*p))# Test statistic
> XSQ

[1] 20.33333

> qchisq(0.95,k-1) # Critical value

[1] 5.991465

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=c(0.5, 0.3, 0.2))

Chi-squared test for given probabilities

data: nj
X-squared = 20.333, df = 2, p-value = 3.843e-05

The test statistic exceeds the critical value, reject. Or because the p-value is less than 0.05,
reject.
8. Using Equation (13.4):
> nj=c(439, 168, 133, 60) # Observed frequencies
> n=sum(nj) # Total number of observations.
> p=c(9/16, 3/16,3/16,1/16) #Hypothesized probabilities
> k=4 # Number of categories
> XSQ=sum((nj-n*p)^2/(n*p))# Test statistic
> XSQ

[1] 6.355556

> qchisq(0.95,k-1) # Critical value

[1] 7.814728

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=c(9/16, 3/16, 3/16, 1/16))

Chi-squared test for given probabilities

data: nj
X-squared = 6.3556, df = 3, p-value = 0.09554

The test statistic is less than the critical value, fail to reject. Or because the p-value is
greater than 0.05, fail to reject.
9. Using Equation (13.4) or Equation (13.2):
> nj=c(38, 31, 40, 39, 40, 44, 48) # Observed frequencies
> n=sum(nj) # Total number of observations.
> p=rep(1/7,7) #Hypothesized probabilities
> k=7 # Number of categories
> XSQ=sum((nj-n*p)^2/(n*p))# Test statistic
> XSQ

[1] 4.15

> qchisq(0.95,k-1) # Critical value

[1] 12.59159

> # Or the built-in chi-square test can be used:


> chisq.test(nj,p=rep(1/7,7))

Chi-squared test for given probabilities

data: nj
X-squared = 4.15, df = 6, p-value = 0.6564

The test statistic is less than the critical value, fail to reject. Or because the p-value is
greater than 0.05, fail to reject.
10. p̂11 = 35/200. The R command
acbinomci(35,200)
returns a 0.95 confidence interval: (0.128, 0.234).
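The function acbinomci is not part of base R (it comes with the book's supplementary functions). If it is unavailable, a similar, though not identical, interval can be obtained with base R's exact (Clopper-Pearson) method:

```r
binom.test(35, 200)$conf.int  # exact 0.95 confidence interval for p
```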
11. Using R:

> (abs(60-57)-1)^2/(60+57) # The test statistic


[1] 0.03418803
The critical value, read from Table 3 with 1 degree of freedom and α = 0.05, is 3.8415, so fail
to reject. Or in terms of Tukey’s three decision rule, make no decision. Using the built-in R
function

> mcnemar.test(matrix(c(20,60,57,63),ncol=2))

McNemar's Chi-squared test with continuity correction

data: matrix(c(20, 60, 57, 63), ncol = 2)
McNemar's chi-squared = 0.034188, df = 1, p-value = 0.8533

which gives the same result.
12. The R command
chi.test.ind(matrix(c(35,80,42,43),ncol=2))
returns

$test.stat
[1] 7.433703

$p.value
[1] 0.006401347
That is, the test statistic is X² = 7.4, and the hypothesis of independence is rejected with
α = 0.05. No, the phi coefficient is known to be an unsatisfactory measure of the strength
of an association.
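The function chi.test.ind is not part of base R; the same test statistic can be obtained with base R's chisq.test by turning off Yates' continuity correction:

```r
m = matrix(c(35, 80, 42, 43), ncol = 2)
chisq.test(m, correct = FALSE)  # X-squared = 7.4337, matching chi.test.ind
```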
13. Based on the notation in Table 13.4, the odds of saying yes among high income
individuals is 35/42 = 0.833. The odds among low income individuals is 80/43 = 1.860. The
odds ratio is θ̂ = 0.833/1.860 = 0.448. Among individuals with high incomes, the relative
likelihood of being optimistic about the future is about half the relative likelihood of being
optimistic among those with a low income.
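The arithmetic can be done in R, following the style of the other solutions:

```r
odds.high = 35/42     # odds of yes among high income individuals
odds.low = 80/43      # odds of yes among low income individuals
odds.high/odds.low    # estimated odds ratio, about 0.448
```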
14. The proportion of agreement is p̂ = (30 + 70 + 40)/320 = 0.4375. The R command
binomci(140,320)
returns a 0.95 confidence interval for the proportion of agreement: (0.385, 0.494).
15. The following R commands can be used:
flag=(kyphosis[,1]=='present')
logSM(kyphosis[,2],flag,xlab='Age (in months)',ylab='P(Kyphosis)')

