Dale Zimmerman
Summer 2016
Definition of biostatistics
• Statistics — the science of collecting, describing, analyzing,
and interpreting data, so that inferences (conclusions about a
population based on data from merely a sample) can be made
with quantifiable certainty.
2
Example applications of Biostatistics
3
Data collection
Some issues:
• Accuracy of measurement
For the most part, we will not do our own data collection in this
class, but will use existing data sets.
4
Descriptive Statistics
• numerical
• tabular
• graphical
5
Data Types
6
Data types: Continuous data
Examples:
7
Data types: Discrete data
Examples:
8
Data types: Ordinal data
Examples:
9
Data types: Categorical data
Examples:
• Blood type (A+, A-, B+, B-, O+, O-, AB+, AB-)
Can assign numbers to the category labels, but the order and magni-
tude of the numbers have no meaning. So almost no mathematical
operations can be performed on the data (exception: counting the
number of individuals that fall into each category).
10
Data types: Final remarks
• The line between continuous and discrete data may sometimes
appear blurry, due to measurement devices which are not “in-
finitely accurate.” Key discriminator: Would the data be dis-
crete if we could measure to an infinite level of accuracy?
11
Descriptive statistics: Measures of Center
• Mean
• Median
• Mode
12
Measures of Center: The Mean
X1 , X2 , . . . , Xn
X̄ = (X1 + X2 + · · · + Xn)/n.
X̄ = (1/n) ∑_{i=1}^{n} Xi.
13
Measures of Center: The Mean
14
Measures of Center: The Mean
The mean is the “balance point” for the data, i.e. it is where a fulcrum
would need to be put to balance equally weighted objects placed on
the number line at X1 , X2 , . . . , Xn .
15
Measures of Center: The Median
The median is the middle value in the ordering of all data values
from smallest to largest. Clearly this requires the data to be ordinal,
discrete, or continuous.
If we represent the ordered values in our data set generically as
16
Measures of Center: The Median
Toy examples:
• Data 1,2,3,4,5: X̃ = 3
• Data 6,7,8,9,10: X̃ = 8
• Data 4,6,8,10,12: X̃ = 8
• Data 1,1,1,1,36: X̃ = 1
17
Measures of Center: The Mode
The mode is the datum that occurs most frequently in the sample.
Toy examples:
The mode is the most common value, but it may or may not be rep-
resentative of the dataset’s center.
18
Mean vs. Median vs. Mode
• Mode is well-defined for all data types, median requires at
least ordinal data, mean requires at least discrete data.
• Mean is distorted more than the others if the data are skewed
(definition to come later) or contain outliers.
• Units for all three are the same as the units of the data.
19
Descriptive Statistics: Measures of Dispersion (Spread)
20
Measures of Dispersion: The Range
21
Measures of Dispersion: The Interquartile Range (IQR)
IQR = Q3 − Q1
where Q1 and Q3 are the first and third quartiles (Q2 , the second
quartile, coincides with the median).
How are the first and third quartiles defined?
22
Measures of Dispersion: The Interquartile Range (IQR)
23
Measures of Dispersion: The Variance
We’d like a measure of spread that utilizes information from all the observations. How about the mean deviation, (1/n) ∑_{i=1}^{n} (Xi − X̄)?
We can avoid the problem of negative deviations canceling out positive deviations by squaring each deviation, i.e. the mean squared deviation
(1/n) ∑_{i=1}^{n} (Xi − X̄)².
Note that the computational formula involves summing the squares of the data (i.e. to compute ∑_{i=1}^{n} Xi² we square first and then sum), as well as squaring the sum of the data.
25
Measures of Dispersion: The Variance
26
Toy examples
• Data 6,7,8,9,10: Range = 10 − 6 = 4, IQR = 9 − 7 = 2,
s² = (1/4)[(6 − 8)² + (7 − 8)² + (8 − 8)² + (9 − 8)² + (10 − 8)²] = 2.5, or
s² = (1/4)[(6² + 7² + 8² + 9² + 10²) − 40²/5] = (1/4)(330 − 320) = 2.5,
s = √2.5 = 1.58
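These toy computations are easy to verify in R (the software the course's plot labels come from); a minimal sketch:

x <- c(6, 7, 8, 9, 10)
diff(range(x))   # range: 10 - 6 = 4
IQR(x)           # interquartile range: 9 - 7 = 2
var(x)           # sample variance: 2.5
sd(x)            # sample standard deviation: 1.58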
27
Sample statistics vs. population parameters
28
Sample statistics vs. population parameters
29
Computing X and s2 from grouped data
30
Computing X and s2 from grouped data
X̄ = (1/n) ∑_{i=1}^{n} Xi = (1/n) ∑_i fi X[i],
s² = [1/(n − 1)] (∑_i fi X[i]² − (∑_i fi X[i])²/n)
X̄ = 70/100 = 0.7,  s² = (394 − 70²/100)/99 = 3.48,  s = 1.87
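As a check, the grouped-data formulas can be evaluated in R using the snail frequency table recalled on a later slide:

xv <- c(0, 1, 2, 3, 4, 5, 8, 15)   # distinct values X[i]
f  <- c(69, 18, 7, 2, 1, 1, 1, 1)  # frequencies f_i
n  <- sum(f)                        # 100
xbar <- sum(f * xv) / n             # 70/100 = 0.7
s2 <- (sum(f * xv^2) - sum(f * xv)^2 / n) / (n - 1)   # (394 - 49)/99 = 3.48
c(xbar, s2, sqrt(s2))               # 0.70, 3.48, 1.87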
31
Linear transformations of data
Yi = aXi + b, i = 1, 2, . . . , n.
32
Linear transformations of data
Examples:
33
Linear transformations of data
What happens to measures of center when the data are linearly trans-
formed? Consider the following examples:
34
Linear transformations of data
What happens to measures of spread when the data are linearly trans-
formed? Consider the same example:
35
In general, sY = |a| sX (with a similar result for the range and IQR), while sY² = a² sX².
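A quick numerical illustration in R (the values of a and b here are hypothetical):

x <- c(1, 2, 3, 4, 5)
a <- -2; b <- 10          # hypothetical linear transformation Y = aX + b
y <- a * x + b
c(sd(y), abs(a) * sd(x))  # equal: s_Y = |a| s_X
c(var(y), a^2 * var(x))   # equal: s_Y^2 = a^2 s_X^2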
36
Accuracy and Precision
Recall the frequency table used to summarize data from 100 sam-
pling quadrats surveyed in NY, Xi = # of Cepaea nemoralis snails
per quadrat:
# of snails, X[i] Frequency, fi
0 69
1 18
2 7
3 2
4 1
5 1
8 1
15 1
Total 100
We can add columns to this table that give relative frequencies (proportions of the total # of observations that fall in each category) and cumulative relative frequencies (proportions of the total # of observations that fall in each category or previous categories).
39
40
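In R, the two extra columns can be added as follows (a sketch using the snail frequencies):

x <- c(0, 1, 2, 3, 4, 5, 8, 15)
f <- c(69, 18, 7, 2, 1, 1, 1, 1)
rel <- f / sum(f)             # relative frequencies
cumrel <- cumsum(rel)         # cumulative relative frequencies
data.frame(x, f, rel, cumrel)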
Frequency table for first age guess data
41
Bar graphs
• A graphical display of the distribution of frequencies (or rela-
tive frequencies)
42
Bar graph for snail data
43
Histograms
• Similar to a bar graph
44
Histogram of Old Faithful geyser eruption durations
[Histogram of geyser$duration: eruption durations from 1 to 5 minutes on the x-axis, frequencies from 0 to 80 on the y-axis]
45
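The figure can be reproduced with the geyser data frame in R's MASS package, which contains the eruption durations plotted above:

library(MASS)           # provides the geyser data
hist(geyser$duration)   # histogram of eruption durations (minutes)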
46
Histogram of guesses of Dr. Z’s age
47
Data shape
48
Data shape: Skewness
49
Effects of data shape on measures of center
• For a symmetric distribution, mean = median
50
Five-number summaries and box plots
51
Step-by-step procedure for constructing a box plot
1. Draw a horizontal (or vertical) reference scale based on the
extent of the data.
2. Draw a box whose sides (or top and bottom) are located at Q1
and Q3 .
3. Draw a vertical (horizontal) line segment at the median.
4. Compute the fences, f1 = Q1 − 1.5 × IQR and f3 = Q3 + 1.5 × IQR.
5. Extend a line segment (so-called whisker) from Q1 out to the most extreme observation that is at or inside the fence, i.e., ≥ f1. Repeat on the other side, i.e., from Q3 to the most extreme observation that is ≤ f3. Mark the end of these line segments with a ×.
52
6. Mark any observations beyond the fences with an open circle, ◦; these are regarded as outliers.
7. If you are constructing more than one box plot for comparison
purposes, use the same scale for all of them and put them side-
by-side (or one on top of another)
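R's boxplot() function carries out steps 1-7 automatically (its default range = 1.5 matches the fences above). For instance, using the girls' SIDS ages that appear on p. 258 of these notes:

x <- c(53, 56, 60, 60, 78, 87, 102, 117, 134, 160, 277)
boxplot(x, horizontal = TRUE)   # whiskers end at the most extreme observations inside the fences; outliers drawn as circles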
53
54
Box plots for guesses of Dr. Z’s age
55
Describing shape from a box plot
• For an (approximately) symmetric distribution, Q2 will be near
the middle of the box, and the two whiskers will be nearly the
same length.
• Right (left) skewness lengthens the box and the whisker to the
right (left) of the median, relative to the lengths of the box and
whisker on the other side of the median.
56
Probability: Basic concepts and terminology
• S = {1, 2, 3, 4, 5, 6}
• S = [0.00, 4.00]
58
Probability: Basic concepts and terminology
60
Probability: Basic concepts and terminology
61
Counting outcomes
62
Counting outcomes
65
Counting outcomes
66
Complement of an event
Example 2: Toss a fair coin 6 times. If A = {first and last tosses are heads}, then A′ = {either the first toss or the last toss (or both) are tails}.
Probability of complement:
P(A′) = 1 − P(A).
67
Intersection of two events
Example: Roll a fair die once. If A = {3, 6}, B = {2, 3, 4}, and
C = {1, 2, 4}, then:
A ∩ B = {3},
A ∩ C = ∅,
B ∩ C = {2, 4}.
68
Conditional probability
P(B|A) = P(A ∩ B) / P(A).
69
Conditional probability examples
70
Independent events
Example: Roll a fair die once. Let A = {1, 2}, B = {1, 3, 5}, C =
{1, 2, 3}. Then A and B are independent, but A and C are not inde-
pendent.
71
Mutually exclusive events
Example: Roll a fair die once. Then A = {3, 6} and B = {1, 4, 5} are
mutually exclusive.
Note: If A and B are mutually exclusive, then P(A ∩ B) = 0 (and vice
versa).
72
Union of two events
Example: Roll a fair die once. If A = {3, 6}, B = {2, 3, 4}, and
C = {1, 2, 4}, then:
A ∪ B = {2, 3, 4, 6},
A ∪ C = {1, 2, 3, 4, 6},
B ∪ C = {1, 2, 3, 4}.
73
Probability of the union of two events
In general,
74
Probability practice with a biological problem
75
Probability practice with a biological problem (cont’d)
76
Probability practice with a biological problem (cont’d)
2. Calculate the probability that the selected tree is dead, given that
it contains GLY.
77
Probability practice with a biological problem (cont’d)
3. Calculate the probability that the selected tree is dead, given that
it does not contain GLY.
78
Probability practice with a biological problem (cont’d)
79
Probability practice with a biological problem (cont’d)
80
Bayes Rule
P(B|A) = P(B) P(A|B) / P(A)
81
Application of Bayes rule: Diagnostic screening
Suppose that a nurse tests a person for TB using the “skin test.”
Define
Suppose further that for the population of interest, P(B) = .03, P(A|B) = .90, and P(A|B′) = .05. Then
P(B|A) =
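The arithmetic can be sketched in R directly from the numbers above:

pB <- 0.03; pA_B <- 0.90; pA_Bc <- 0.05
pA <- pB * pA_B + (1 - pB) * pA_Bc   # total probability: P(A) = 0.0755
pB * pA_B / pA                        # Bayes rule: P(B|A), roughly 0.36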
82
Random variables
83
Random variables
84
Discrete random variables: the probability density
function
For every discrete random variable, the numerical values it can take
on can be listed (prior to conducting the random experiment) and a
probability can be associated with each one. The function, f (x) =
P(X = x), that assigns these probabilities is called the probability
density function (pdf).
Example: Toss a fair coin three times, let X = # heads. The pdf is:
x f (x)
0 .125
1 .375
2 .375
3 .125
85
Discrete random variables: the probability density
function
86
Discrete random variables: the probability density
function
87
Discrete random variables: the probability density
function
88
Discrete random variables: the cumulative distribu-
tion function
Example: Toss a fair coin three times, let X = # heads. The CDF is:
x f (x) F(x)
0 .125 .125
1 .375 .500
2 .375 .875
3 .125 1.000
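Both the pdf and the CDF of this example are built into R:

dbinom(0:3, size = 3, prob = 0.5)   # pdf:  0.125 0.375 0.375 0.125
pbinom(0:3, size = 3, prob = 0.5)   # CDF:  0.125 0.500 0.875 1.000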
89
Discrete random variables: the expected value
90
Discrete random variables: the expected value
91
Discrete random variables: the variance
92
The binomial distribution: basic framework
96
The binomial distribution: biological examples
97
The binomial distribution: the CDF and Table C.1
For selected values of n and p, this CDF is given in Table C.1 of our
text, and we can use it to avoid having to compute the pdf.
Re-do of Example 1:
98
The binomial distribution: the CDF and Table C.1
Re-do of Example 2:
Re-do of Example 3:
99
The binomial distribution: Mean, variance, and shape
• µ = E(X) = np
• σ² = np(1 − p)
100
The Poisson distribution: basic framework
• # of hurricanes in a year
• # of snails in a 1 m × 1 m quadrat
101
The Poisson distribution: basic framework
103
The Poisson distribution: the CDF and Table C.2
104
The Poisson distribution: changing the length of the
time period
The third part of the framework for a Poisson random variable (p. 98
of these notes) tells us that if the number of basic events in a period
of unit length has expected value µ, then the number of basic events
in a period of length t is a Poisson random variable with parameter
(and mean) tµ. So, with proper modification we can compute prob-
abilities for events involving a period of any length.
Hurricane example, continued: What is the probability that there are
10 or fewer hurricanes from 2016-2018 (inclusive)?
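In R this is a one-liner; the yearly mean mu below is a placeholder, since its value is given elsewhere in the notes:

mu <- 5                      # hypothetical mean # of hurricanes per year
ppois(10, lambda = 3 * mu)   # P(10 or fewer hurricanes over the 3-year period)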
105
Poisson approximation to the binomial distribution
n ≥ 100, np ≤ 10.
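A quick comparison in R, for one (hypothetical) case satisfying the rule:

# n = 1000, p = 0.005, so n >= 100 and np = 5 <= 10
pbinom(8, size = 1000, prob = 0.005)   # exact binomial probability
ppois(8, lambda = 5)                    # Poisson approximation: 0.932, very close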
106
Poisson approximation to the binomial distribution
107
Continuous random variables
108
Continuous random variables
Thus, if a and b are two real numbers such that a < b, then
P(a ≤ X < b) = P(X = a) + P(a < X < b) = 0 + P(a < X < b).
109
Continuous random variables: the pdf
• f(x) ≥ 0;
• the area under the graph of f (and above the x-axis) is equal
to 1.0;
110
Continuous random variables: the pdf
Equivalently, f(x) = 1/2 if 1 < x < 3, and f(x) = 0 otherwise.
Then, e.g., P(1 < X < 2) = (1/2) · 1 = 1/2, P(1.5 < X < 4) =
We can also solve problems like: Find a number c such that P(X < c) = 2/3.
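The uniform pdf here corresponds to punif() and qunif() in R:

punif(2, 1, 3) - punif(1, 1, 3)     # P(1 < X < 2) = 0.5
punif(4, 1, 3) - punif(1.5, 1, 3)   # P(1.5 < X < 4) = 0.75
qunif(2/3, 1, 3)                     # c such that P(X < c) = 2/3: c = 2.33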
111
Continuous random variables: the pdf
Then, e.g.:
• P(X > 1) =
• P(|X| < 1) =
• Find a number c such that P(0 < X < c) = 1/4:
112
Continuous random variables: the CDF
113
Continuous random variables: mean and variance
114
The normal distribution: Introduction
115
The normal distribution: the pdf
The normal distribution’s pdf, the “normal curve,” has the following
features:
116
The normal distribution: the pdf
117
The standard normal distribution and its CDF
The specific normal curve, N(0, 1), is called the standard normal
distribution, and the corresponding random variable is denoted by
Z. We write
Z ∼ N(0, 1).
We can use Table C.3 and knowledge of the symmetry of the stan-
dard normal curve around 0 to obtain the probabilities of many events
involving Z.
• P(Z < 0) =
• P(Z > 0) =
119
The standard normal distribution: Determining prob-
abilities
• P(−0.39 < Z < 1.64) =
120
The standard normal distribution: Inverse problems
We can also solve “inverse” problems, where we are given the prob-
ability and asked to find a z-value. For example, find z such that:
121
The normal distribution: Standardization
122
The normal distribution: Standardization
123
The normal distribution: Standardization
124
Normal approximation to the binomial distribution
125
Normal approximation to the binomial distribution
126
Normal approximation to the binomial distribution
Example: Recall the Caribbean trip you take, on which you are bit-
ten by 120 mosquitos. Suppose that each time you are bitten by a
mosquito on the trip, the probability that the mosquito is carrying
the Zika virus is .10. What is the probability that at least one of the
120 bites will be from a Zika virus carrier?
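A sketch of both the exact answer and the normal approximation (with continuity correction) in R:

n <- 120; p <- 0.10
1 - pbinom(0, n, p)                          # exact: 1 - 0.9^120, essentially 1
mu <- n * p; sigma <- sqrt(n * p * (1 - p))  # mu = 12, sigma = 3.29
1 - pnorm((0.5 - mu) / sigma)                # normal approximation, also essentially 1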
127
Some practice combining concepts of Chapters 2 & 3
128
(c) What is the probability that fewer than 3 people in the sample are
lactose intolerant, given that an odd number of people in the sample
are lactose intolerant?
(d) What is the probability that an odd number of people in the sample are lactose intolerant, given that fewer than 3 people in the sample are lactose intolerant?
129
Example 2: Suppose Z ∼ N(0, 1). (a) Find P({0.50 < Z < 1.50} ∩ {Z < 1.00}).
130
(c) In a random sample of size 6 from a population whose distribu-
tion is N(0, 1), what is the probability distribution of X, where X
is defined as the number of observations in the sample that are less
than 1.00?
131
Sampling distributions: Introduction
132
Sampling distributions: Introduction
133
Sampling distribution of X: A discrete example
Now think of the x’s as the values of objects in some very large pop-
ulation, and the f (x)’s as their corresponding relative frequencies.
Imagine taking a random sample of size 2 from this population with
replacement, and let X represent the mean of this sample. Then the
probability distribution of X is as follows:
134
Sampling distribution of X: A discrete example
x̄  f(x̄)
1.0  (.4)(.4) = .16
1.5  (.4)(.1)+(.1)(.4) = .08
2.0  (.4)(.2)+(.2)(.4)+(.1)(.1) = .17
2.5  (.4)(.3)+(.3)(.4)+(.1)(.2)+(.2)(.1) = .28
3.0  (.1)(.3)+(.3)(.1)+(.2)(.2) = .10
3.5  (.2)(.3)+(.3)(.2) = .12
4.0  (.3)(.3) = .09
µ_X̄ = 2.40, σ²_X̄ = 0.82
135
Sampling distribution of X: A discrete example
If we take a random sample of size 3 (rather than 2), then the pdf of
X is:
x̄  f(x̄)
1.0  .064
1.3̄  .048
1.6̄  .108
2.0  .193
2.3̄  .126
2.6̄  .165
3.0  .152
3.3̄  .063
3.6̄  .054
4.0  .027
µ_X̄ = 2.40, σ²_X̄ = 0.546
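Both tables can be generated by brute-force enumeration in R; here is the n = 2 case, using the population pdf implied by the products above (values 1-4 with probabilities .4, .1, .2, .3):

x <- 1:4; p <- c(0.4, 0.1, 0.2, 0.3)
xbar <- outer(x, x, "+") / 2                 # all 16 possible sample means
pr <- outer(p, p)                            # their probabilities
tapply(as.vector(pr), as.vector(xbar), sum)  # reproduces the n = 2 table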
136
Sampling distribution of X: A discrete example
[Three bar graphs of the pdf of X̄ for samples of size 1, 2, and 3; x-axes 0 to 5, probability axes 0.0 to 0.5]
137
Sampling distribution of X: A discrete example
138
Sampling distribution of X: A discrete example
139
The Central Limit Theorem (CLT)
1. has mean µX ;
Amazing!
140
The Central Limit Theorem (CLT)
X̄ ∼ N(µ, σ²/n) (approximately) for n sufficiently large.
Or equivalently,
(X̄ − µ)/(σ/√n) ∼ N(0, 1) (approximately) for n sufficiently large.
The quantity σ/√n is called the standard error of the mean.
Note: as the sample size increases, the standard error of the mean
decreases.
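A small simulation sketch in R makes this concrete (the exponential population here is an arbitrary non-normal choice with σ = 1):

set.seed(1)
for (n in c(5, 30, 100)) {
  xbars <- replicate(10000, mean(rexp(n, rate = 1)))  # 10,000 sample means
  cat(n, sd(xbars), 1 / sqrt(n), "\n")                # empirical SE vs sigma/sqrt(n)
}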
141
The Central Limit Theorem in practice
142
The CLT in practice: Examples
143
The CLT in practice: Examples
• What is the (approximate) probability that the average birth
weight of 5 infants randomly selected from this population is
less than 3000g?
144
The CLT in practice: Examples
• What birth weight (approximately) cuts off the lower 5% of
the distribution of this population’s birth weights?
145
The CLT in practice: Examples
• What birth weight (approximately) cuts off the lower 5% of
the distribution of sample mean birth weights based on sam-
ples of size 5 drawn from this population?
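All three CLT questions on these slides reduce to pnorm()/qnorm() calls; the population mean and SD below are placeholders, since their values appear on an earlier slide not shown here:

mu <- 3400; sigma <- 600          # hypothetical birth-weight mean and SD (g)
pnorm(3000, mu, sigma / sqrt(5))  # P(mean of 5 birth weights < 3000 g)
qnorm(0.05, mu, sigma)            # lower 5% point of individual birth weights
qnorm(0.05, mu, sigma / sqrt(5))  # lower 5% point of sample means, n = 5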
146
Interval estimation: Introduction
• X̄ is a point estimate of µ
• s² is a point estimate of σ²
147
Confidence interval for µ: Derivation
P(−2.58 ≤ (X̄ − µ)/(σ/√n) ≤ 2.58) = 0.99.
(The interval from −2.58 to +2.58 captures the middle 99% of the standard normal distribution.)
Algebraic manipulations to get µ by itself:
148
Confidence interval for µ: Derivation
149
Confidence interval for µ: Example
1.2, 2.4, 1.3, 1.3, −1.0, 3.8, 0.0, 0.8, 4.6, 1.4
For these data, X̄ = 1.58 hrs. Assume that σ = 1.5 hrs and that the
distribution of sleep gain (B over A) is symmetric. Find an approx-
imate 99% confidence interval for µ, the mean additional hours of
sleep gained by using drug B rather than drug A among all insomni-
acs.
Answer:
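As a check, the interval is X̄ ± 2.576 σ/√n, computed in R as:

xbar <- 1.58; sigma <- 1.5; n <- 10
xbar + c(-1, 1) * qnorm(0.995) * sigma / sqrt(n)   # 99% CI: about (0.36, 2.80)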
150
Confidence interval for µ: Levels of confidence
151
Confidence interval for µ: Levels of confidence
152
Confidence interval for µ: Unknown σ
153
Diversion: The t distributions
• Like the N(0, 1), the t distributions are continuous, symmetric,
bell-shaped, with mean 0.
• The larger that n (or n − 1) is, the more closely the t distribution resembles the N(0, 1).
154
Confidence interval for µ: Unknown σ
where t_{1−α/2,n−1} is a number (from Table C.4) that cuts off the upper 100(α/2)% of the probability of the t distribution with n − 1 degrees of freedom.
This is an exact 100(1 − α)% CI when the random sample is drawn from a normally-distributed population. If the population is not normally distributed, then the interval given above is an adequate approximate 100(1 − α)% CI provided that the sample size is sufficiently large (see p. 142 for how large is "large enough").
155
Confidence interval for µ: Example with unknown σ
1.2, 2.4, 1.3, 1.3, −1.0, 3.8, 0.0, 0.8, 4.6, 1.4
For these data, X = 1.58 hrs and s = 1.66 hrs. Assume that the distri-
bution of sleep gain (B over A) is symmetric. Find an approximate
99% confidence interval for µ, the mean additional hours of sleep
gained by using drug B rather than drug A among all insomniacs.
Answer:
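The only change from the previous example is the t quantile; in R:

xbar <- 1.58; s <- 1.66; n <- 10
xbar + c(-1, 1) * qt(0.995, df = n - 1) * s / sqrt(n)   # t_{.995,9} = 3.25; CI about (-0.13, 3.29)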
156
Confidence interval for µ: Factors affecting width
The narrower the confidence interval, the more precisely we've pinned down µ. The width of a 100(1 − α)% confidence interval for µ is
• Level of confidence
• Sample size
157
Confidence interval for σ²: Introduction
159
Confidence interval for σ²: Derivation
Write χ²_{n−1,α/2} and χ²_{n−1,1−α/2} for the values that cut off 100(α/2)% from each tail of the χ²_{n−1} distribution (leaving 100(1 − α)% of the distribution in the middle). Then
P(χ²_{n−1,α/2} ≤ (n − 1)s²/σ² ≤ χ²_{n−1,1−α/2}) = 1 − α
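Inverting the probability statement gives the CI ((n − 1)s²/χ²_{n−1,1−α/2}, (n − 1)s²/χ²_{n−1,α/2}); a sketch in R, using the CETP sample (n = 16, s = 8) from p. 203 as an illustration:

n <- 16; s2 <- 8^2
(n - 1) * s2 / qchisq(c(0.975, 0.025), df = n - 1)   # 95% CI for sigma^2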
160
Confidence interval for σ²: Formula
161
Confidence interval for σ²: Example
162
Confidence interval for a proportion: Introduction
p̂ = (# successes)/n = (∑_{i=1}^{n} Xi)/n = X̄.
163
Confidence interval for a proportion: Sampling dis-
tribution of p̂
Using the result at the bottom of the previous page, we can derive the following approximate 100(1 − α)% CI for p:
(p̂ − z_{1−α/2} √(p̂(1 − p̂)/n), p̂ + z_{1−α/2} √(p̂(1 − p̂)/n)).
165
Confidence interval for a proportion: Example
166
Confidence interval for a proportion: Sample size con-
siderations
and the margin of error, defined as half the width of this CI, is
z_{1−α/2} √(p̂(1 − p̂)/n).
167
Confidence interval for a proportion: Sample size con-
siderations
or equivalently,
n ≥ 4(1.96)² p̂(1 − p̂) / (.02)².
168
Confidence interval for a proportion: Sample size con-
siderations
n = 4(1.96)²(0.5)(0.5) / (.02)² = 9604.
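The arithmetic in R:

4 * 1.96^2 * 0.5 * 0.5 / 0.02^2   # worst case p-hat = 0.5 gives n = 9604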
170
Interpretations of confidence intervals
171
Interpretations of confidence intervals
Incorrect interpretations:
172
Hypothesis testing: Introduction
173
Hypothesis testing: Null and alternative hypotheses
174
Hypothesis testing: Null and alternative hypotheses
• The investigator must specify these two hypotheses according
to the problem at hand and his/her goals.
• Depending on their goals, one investigator's H0 may be different from another investigator's H0, even for the same problem.
• The burden of proof is always on the investigator to provide
strong evidence that the null hypothesis is false (this is consis-
tent with the scientific method).
• H0: µ = µ0 versus Ha: µ ≠ µ0
• H0: µ ≤ µ0 versus Ha: µ > µ0
• H0: µ ≥ µ0 versus Ha: µ < µ0
177
Hypothesis testing: Type I and Type II errors
178
Hypothesis testing: Type I and Type II errors
179
Hypothesis testing: Type I and Type II errors
Define:
180
Hypothesis testing: Power
The power of a test is the probability, using that test, that we will
reject H0 when it is false.
• Thus, Power = 1 − β.
• Some tests have higher power than others, but often at the
price of more restrictive assumptions
181
Hypothesis testing: Six-step procedure
1. Formulate H0 and Ha , based on the scientific question of in-
terest.
182
Hypothesis testing: Six-step procedure
5. (a) If the test statistic is more extreme than the critical value(s),
then reject H0 ; otherwise do not reject H0 . OR
(b) If the P value is less than α, reject H0; otherwise do not reject H0.
183
Test statistics for hypotheses about a population mean
X̄ − µ0.
184
Test statistics for hypotheses about a population mean
According to the CLT, the standard error of X̄ is σ/√n. So, if σ is known to us, we may use as our test statistic the "z-statistic"
Z = (X̄ − µ0) / (σ/√n).
• Dividing by σ/√n calibrates the discrepancy between X̄ and µ0 in units of standard error.
185
Test statistics for hypotheses about a population mean
186
Critical values for testing hypotheses about a popula-
tion mean
• z_{1−α} if Ha is µ > µ0
• z_{α} if Ha is µ < µ0
187
Critical values for testing hypotheses about a popula-
tion mean
• t_{1−α,n−1} if Ha is µ > µ0
• t_{α,n−1} if Ha is µ < µ0
If our computed test statistic is more extreme than the critical value,
we reject H0 ; otherwise, we do not reject H0 .
188
Hypothesis testing for a population mean: Example
A researcher wanted to test the hypothesis that the mean body temperature of African elephants was 96.0°F. He has no prior notion about the direction in which the mean body temperature of elephants differs from 96.0°F, if it is not equal to 96.0°F.
190
Hypothesis testing for a population mean: Example
191
Hypothesis testing: More on P values
• The P value approach to HT is equivalent to the critical value
approach; they always yield the same decision about H0 .
192
Hypothesis testing: Variations on the elephant exam-
ple
Suppose we change the significance level from .05 to .01. How does
the test change?
193
Hypothesis testing: Variations on the elephant exam-
ple
194
Equivalence between hypothesis tests and confidence
intervals
Consider testing
H0: µ = µ0 versus Ha: µ ≠ µ0
−t_{1−α/2,n−1} ≤ (X̄ − µ0)/(s/√n) ≤ t_{1−α/2,n−1}.
195
Equivalence between hypothesis tests and confidence
intervals
197
Hypothesis testing for a population variance: Intro-
duction
198
HT for a population variance: Test statistic
199
HT for a population variance: Test statistic
The further that s²/σ0² is from 1.0, the greater the discrepancy (and the stronger the evidence against H0). Equivalently, the further that
(n − 1)s²/σ0²
For critical values and P values, we use the fact that our test statistic has a chi-square distribution with n − 1 degrees of freedom when H0 is true.
200
HT for a population variance: Critical values
Suppose the significance level is α. Let χ²_{n−1,a} be the 100a-th percentile of the χ²_{n−1} distribution, i.e.
Then:
• if Ha is two-sided, we reject H0 if
201
HT for a population variance: P values
• if Ha is two-sided, P value =
202
HT for a population variance: Example
A healthy lifestyle undoubtedly plays a role in longevity, but so does genetic makeup. Recent studies
have linked large cholesterol particles to longevity. A variant of a gene called CETP encoding the
cholesteryl ester transfer protein apparently causes the formation of large cholesterol particles. In
a particular population the life spans for males are normally distributed with a mean of 74.2 yrs and
a standard deviation of 10.0 yrs. A sample of 16 males in this population that had the variant CETP
gene lived an average of 81.2 yrs with a standard deviation of 8.0 yrs. Does this establish that CETP
variant carriers are significantly less variable in their life spans than the general population?
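A sketch of the test in R (H0: σ ≥ 10 vs. Ha: σ < 10, i.e. σ0² = 100):

n <- 16; s2 <- 8^2
chisq <- (n - 1) * s2 / 100   # test statistic: 9.6
pchisq(chisq, df = n - 1)     # left-tail P value, roughly 0.15, so do not reject H0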
203
Nonparametric methods for hypothesis testing
The HT methods we’ve learned so far require either (a) the sampled
population to be normally distributed, or (b) the sample size to be
large enough for the CLT to “steer” the distribution of the test statis-
tic sufficiently close to its reference distribution (Z, t, χ²). What if
neither (a) nor (b) is satisfied?
In that case, we use alternative HT methods called nonparametric or
distribution-free methods. Though more widely applicable than the
parametric methods already learned, they are not as powerful.
204
The sign test: Introduction
205
The sign test: Hypotheses and test statistics
• H0: M = M0 versus Ha: M ≠ M0
• H0: M ≤ M0 versus Ha: M > M0
• H0: M ≥ M0 versus Ha: M < M0
207
The sign test: P value testing approach
P(bin(n, 1/2) ≤ S−) < α.
P(bin(n, 1/2) ≤ S+) < α.
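Because the reference distribution is bin(n, 1/2), the sign test can be sketched with binom.test() in R; the counts below are hypothetical:

n <- 9; S_plus <- 7                                       # hypothetical: 7 of 9 differences above M0
binom.test(S_plus, n, p = 0.5, alternative = "greater")   # P value for Ha: M > M0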
208
The sign test: P value testing approach
209
The sign test: Example 1 (Problem 6.32 in text)
Abuse of substances containing toluene (for example, various glues) can produce neurological symp-
toms. In an investigation of the mechanism of these toxic effects, researchers measured the concentra-
tions of certain chemicals in the brains of rats who had been exposed to a toluene-laden atmosphere.
The concentrations (ng/gm) of the brain chemical norepinephrine in the medulla region of the brain of
9 toluene-exposed rats were determined and recorded below:
Does the exposure to toluene significantly increase norepinephrine levels in rat medullas above the
normal median level of 530 ng/gm?
210
The sign test: Example 2
211
The Wilcoxon signed-rank test: Introduction
• Tests hypotheses about a population median, M (tests same
hypotheses as sign test)
• Since symmetry implies that the mean and median are equal,
this test tests the same hypotheses as the t test too
212
The Wilcoxon signed-rank test: Test statistic
213
The Wilcoxon signed-rank test: Test statistic
Note:
• Also, if ties occur we average all the successive ranks that are
tied. For example,
214
The Wilcoxon signed-rank test statistic: Example
W+ = 36, W− = 9
215
The Wilcoxon signed-rank test: Rationale
1 + 2 + 3 + · · · + n = n(n + 1)/2.
So, if the population distribution is symmetric and M = M0, we would expect both W+ and W− to equal roughly n(n + 1)/4. If either of them is too small (i.e., too far away from n(n + 1)/4), we should reject H0.
How far is too far? We need a reference distribution for the Wilcoxon
signed rank statistic, which is provided in Table C.6. Call this distri-
bution W (n).
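R's wilcox.test() computes W+ (reported as V) and its P value from W(n); a sketch with hypothetical data, in the spirit of the toluene example:

x <- c(540, 555, 521, 558, 565, 577, 580, 586, 592)   # hypothetical norepinephrine values
wilcox.test(x, mu = 530, alternative = "greater")      # tests H0: M <= 530 vs Ha: M > 530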
216
The Wilcoxon signed-rank test: P value testing ap-
proach
Three cases:
217
The Wilcoxon signed-rank test: P value testing ap-
proach
218
The Wilcoxon signed-rank test: Example 1
219
The Wilcoxon signed-rank test: Example 2
P(W (n) )=
220
Comparing two population means: Introduction
The question of interest: Are the mean ages at death due to SIDS
identical for boys and girls?
221
Comparing two population means: Introduction
• H0: µ1 = µ2 versus Ha: µ1 ≠ µ2
• H0: µ1 ≤ µ2 versus Ha: µ1 > µ2
• H0: µ1 ≥ µ2 versus Ha: µ1 < µ2
Note that each hypothesis can also be expressed in terms of the difference of the means. For example, the last pair of hypotheses can also be expressed as follows:
• H0: µ1 − µ2 ≥ 0 versus Ha: µ1 − µ2 < 0
223
Comparing two population means: Types of sampling
225
Comparing two means via paired sampling
226
Comparing two means via paired sampling
227
Comparing two means via paired sampling
(X̄d − 0)/(sd/√n) < −t_{1−α/2,n−1}  or  (X̄d − 0)/(sd/√n) > t_{1−α/2,n−1}.
228
Comparing two means via paired sampling: Example
An experiment was performed to study the effects of irradiation on bacterial contamination in meat.
The logarithm of the direct microscopic count (log DMC) of bacteria in 12 meat samples was measured
before irradiating the 12 meat samples, and then again afterwards. The data were as follows:
229
Comparing two means via paired sampling: Example
H0: µd ≤ 0 versus Ha: µd > 0.
(X̄d − 0)/(sd/√n) = .258/√(.127/12) = 2.51  (here sd² = .127).
If we test at the .05 significance level, the critical value is t.95,11 =
1.796, so we reject H0 . Conclusion: there is statistically significant
evidence that irradiation reduces bacterial contamination in meat.
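The computation, verified in R:

tstat <- 0.258 / sqrt(0.127 / 12)   # 2.51 (here .127 is s_d^2)
qt(0.95, df = 11)                    # critical value: 1.796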
230
Comparing two means via independent sampling
231
Comparing two means via independent sampling, as-
suming equal variances
Assuming that σ1² = σ2², the two sample variances (s1² and s2²) are estimates of the same quantity. So it makes sense to combine, or "pool" them:
sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)
• The divisor, n1 + n2 − 2, is the degrees of freedom here
• The sample variance from the larger of the two samples gets
more weight in the pooled estimate (makes sense!)
232
Comparing two means via independent sampling, as-
suming equal variances
233
Comparing two means via independent sampling, as-
suming equal variances
More specifically:
(X̄1 − X̄2) / √(sp²(1/n1 + 1/n2)) < −t_{1−α/2,n1+n2−2}
or
(X̄1 − X̄2) / √(sp²(1/n1 + 1/n2)) > t_{1−α/2,n1+n2−2}.
234
Comparing two means via independent sampling, as-
suming equal variances
• To test H0: µ1 ≥ µ2 versus Ha: µ1 < µ2, we reject H0 if
(X̄1 − X̄2) / √(sp²(1/n1 + 1/n2)) < −t_{1−α,n1+n2−2}
and to test H0: µ1 ≤ µ2 versus Ha: µ1 > µ2, we reject H0 if
(X̄1 − X̄2) / √(sp²(1/n1 + 1/n2)) > t_{1−α,n1+n2−2}
235
Comparing two means via independent sampling, as-
suming equal variances: Example
In a study of the periodical cicada (Magicicada septendecim), researchers measured the hind tibia
lengths of the shed skins of 110 individuals: 60 males and 50 females. Some summary statistics for
the tibia lengths were as follows:
Gender ni X̄i si
Males 60 78.42 2.87
Females 50 80.44 3.52
Let µ1 and σ1² represent the mean and variance of hind tibia lengths for the entire population of male periodical cicadas at shedding; define µ2 and σ2² similarly for females. We want to test H0: µ1 = µ2 versus Ha: µ1 ≠ µ2 at, say, the .05 level of significance, and suppose we're willing to assume that σ1² = σ2².
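From these summary statistics the pooled test can be sketched in R:

n1 <- 60; xb1 <- 78.42; s1 <- 2.87
n2 <- 50; xb2 <- 80.44; s2 <- 3.52
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)   # pooled variance, about 10.1
t <- (xb1 - xb2) / sqrt(sp2 * (1/n1 + 1/n2))                 # about -3.3
2 * pt(-abs(t), df = n1 + n2 - 2)                            # two-sided P value, about 0.001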
236
Comparing two means via independent sampling, as-
suming equal variances: Example
Test statistic:
Critical values:
P-value:
Conclusion:
237
Comparing two means via independent sampling, as-
suming equal variances: Example
238
Comparing two means via independent sampling, when
variances are possibly unequal
In this case s1² and s2² aren't necessarily estimating the same quantity, so we do not pool them. We sum them instead (actually we sum scaled versions of them).
100(1 − α)% confidence interval for µ1 − µ2:
(X̄1 − X̄2) ± t_{1−α/2,ν} √(s1²/n1 + s2²/n2).
239
Comparing two means via independent sampling, when
variances are possibly unequal
(X̄1 − X̄2) / √(s1²/n1 + s2²/n2)
240
Comparing two means via independent sampling, when
variances are possibly unequal: Example
Let’s revisit the periodical cicada example, only this time let’s not
assume that the two population variances are equal. Then our test
statistic is
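A sketch in R, using the Welch-Satterthwaite approximation for the degrees of freedom (one common choice for ν):

n1 <- 60; xb1 <- 78.42; s1 <- 2.87
n2 <- 50; xb2 <- 80.44; s2 <- 3.52
t <- (xb1 - xb2) / sqrt(s1^2/n1 + s2^2/n2)         # about -3.3
v <- (s1^2/n1 + s2^2/n2)^2 /
     ((s1^2/n1)^2/(n1 - 1) + (s2^2/n2)^2/(n2 - 1))  # approximate df, about 94
2 * pt(-abs(t), df = v)                             # two-sided P value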
241
Pros and cons of paired sampling
242
Pros and cons of paired sampling: Numerical demon-
stration
245
Pros and cons of paired sampling: Numerical demon-
stration
247
Sources of variation that paired sampling controls for:
Examples
• Bacterial contamination in paired meat samples, before and
after irradiation
248
The irradiated meat example, revisited: Incorrectly
analyzed by an independent-sampling based approach
249
For testing H0: µ1 ≤ µ2 vs. Ha: µ1 > µ2 at the .05 significance level, the critical value of t is t.95,22 = 1.717. The test statistic is
t = (X̄1 − X̄2) / √(sp²(1/n1 + 1/n2)) = 0.857.
250
Nonparametric tests for two populations: Introduc-
tion
The previously described tests for the means of two populations are
strictly valid only when either:
251
Nonparametric tests for two populations: Introduc-
tion
When the sampling is paired, we may test hypotheses about the me-
dian of the population of within-paired differences using either the
• sign test, or
252
The irradiated meat example, revisited: Sign test and
Wilcoxon signed-rank test for the population median
of paired differences
Recall, once again, the following log DMC data in 12 meat samples before and after irradiation. The
investigator wishes to test
H0: Md ≤ 0 versus Ha: Md > 0.
S− = , S+ = , W− = 7, W+ = 71.
Thus, the sign test does not reject H0 (at a = .05), but the Wilcoxon
signed rank test does reject H0 (indicating that there is and is not,
respectively, statistically significant evidence that irradiation reduces
bacterial contamination in meat).
Why the difference in conclusions?
254
Wilcoxon rank-sum test
• H0: M1 = M2 versus Ha: M1 ≠ M2
• H0: M1 ≤ M2 versus Ha: M1 > M2
• H0: M1 ≥ M2 versus Ha: M1 < M2
255
Wilcoxon rank-sum test: Test statistic
1. Conceptually pool the data from both samples into one sample
and rank the data from smallest to largest. Replace the data
with their ranks.
2. Sum the ranks that correspond to the smaller of the two sam-
ples (Sample 1). Call this rank sum W1 , which is our test statis-
tic.
256
Wilcoxon rank-sum test: Critical values
257
Wilcoxon rank-sum test: Example
Recall (from page 221 of these notes) the following data, which are
the ages (in days) at time of death for random samples of 11 girls
and 16 boys who died from SIDS:
Girls 53 56 60 60 78 87 102 117 134 160 277
Boys 46 52 58 59 77 78 80 81 84 103 114 115 133 134 167 175
Histograms of these data show that for both girls and boys, the data
are right-skewed. Thus, the corresponding populations are probably
not normally distributed, and the sample sizes are relatively small.
Question of interest: Are the median ages at death due to SIDS iden-
tical for boys and girls?
258
So we wish to test
H0: M1 = M2 versus Ha: M1 ≠ M2.
Original data:
Girls 53 56 60 60 78 87 102 117 134 160 277
Boys 46 52 58 59 77 78 80 81 84 103 114 115 133 134 167 175
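In R the whole procedure is wilcox.test(); ties in the pooled data mean the P value is approximate:

girls <- c(53, 56, 60, 60, 78, 87, 102, 117, 134, 160, 277)
boys  <- c(46, 52, 58, 59, 77, 78, 80, 81, 84, 103, 114, 115, 133, 134, 167, 175)
wilcox.test(girls, boys)   # two-sided Wilcoxon rank-sum (Mann-Whitney) test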
Several pages back, we saw that the specifics of inference for the
difference in two population means based on independent samples
depended on whether or not we assumed that the two population
variances were equal. If we do make this assumption, it’s desirable
to have some justification for doing so. This leads us to consider the
problem of testing the hypothesis that the two population variances
are equal.
We may also be interested in testing whether two population vari-
ances are equal for its own sake.
260
Comparing two population variances: Hypotheses
261
Comparing two population variances: Test statistic
262
Comparing two population variances: Critical values
Values in Table C.7 are critical values in the right tail, i.e. F_{1−α,(ν1,ν2)}:
• F.95,(6,11) =
• F.99,(60,25) =
• F.975,(47,83) =
Critical values in the left tail are not tabled (to save space), but can be obtained from critical values in the right tail as follows:
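R's qf() supplies both tails directly, which is a handy check on Table C.7:

qf(0.95, df1 = 6, df2 = 11)    # F_{.95,(6,11)}
qf(0.99, df1 = 60, df2 = 25)   # F_{.99,(60,25)}
qf(0.05, df1 = 6, df2 = 11)    # left tail; equals 1 / qf(0.95, 11, 6)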
264
Comparing two population variances: Critical values
ν1 = n1 − 1 and ν2 = n2 − 1.
Conclusion:
266
Hypothesis testing: Some loose ends . . .
267
Hypothesis testing: Some loose ends . . . (continued)
268
Hypothesis testing: Some loose ends . . . (continued)
(b) Same scenario as (a), except suppose that we do not reject the
second H0 . Would we also not reject the first H0 ?
270
Hypothesis testing: Some loose ends . . . (continued)
(c) Suppose we are asked to test
H0 : µ = 65 versus Ha : µ 6= 65
Suppose further that X is larger than 65, and that we reject the
second H0 (at a particular level of significance, say .05).
Would we also reject the first H0 (at the same level of signifi-
cance)?
271
Hypothesis testing: Some loose ends . . . (continued)
(d) Same scenario as (c), except suppose that we do not reject the
second H0 . Would we also not reject the first H0 ?
272
Hypothesis testing: Some loose ends . . . (continued)
H0 : µ1 = µ2 versus Ha : µ1 6= µ2
274
Testing hypotheses on more than two population means:
Motivating example
A study was carried out to determine whether two dietary supplements derived from red clover were
more effective than a placebo in reducing hot flashes in post-menopausal women. The randomized,
double-blind trial was conducted using 252 menopausal women, aged 45 to 60 years, who were expe-
riencing at least 35 hot flashes per week. After a 2-week period in which all were given a placebo, the
women were randomly assigned to Promensil (82 mg of total isoflavones per day), Rimostil (57 mg of
total isoflavones per day), or an identical placebo; and then followed up for 12 weeks. The table below
provides summary statistics on the number of hot flashes (per day) experienced by the women at the
end of the trial.
275
Testing hypotheses on more than two population means:
Motivating example
Promensil Rimostil Placebo
ni 84 83 85
X̄i 5.1 5.4 5.0
si 4.1 4.6 3.2
Assuming normality, analyze these data to determine whether there are any differences in the mean
number of hot flashes per day for these three treatments.
276
Testing hypotheses on more than two population means:
Motivating example
277
Testing hypotheses on more than two population means:
Motivating example
If the data being used to perform each hypothesis test were independent of the data being used to perform the others, the overall Type I error probability could be computed as follows:
But some of the data are the same in each of the tests, so independence doesn't hold and the previous calculation isn't valid.
Bottom line: We can’t control (determine) the overall Type I error
probability by doing multiple two-sample t tests. If we want to con-
trol a, we need to take a completely different approach, which is
called the Analysis of Variance (ANOVA).
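For the record, the (invalid-here) independence calculation referred to above would be, in R:

alpha <- 0.05
1 - (1 - alpha)^3   # 0.143: overall Type I error for 3 independent tests at level .05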
278
The ANOVA: Set-up and notation
279
The ANOVA: Set-up and notation
More notation:
X̄i. =
si² =
280
The ANOVA: The hypotheses tested
H0 : µ1 = µ2 = · · · = µk
in such a way that the overall Type I error probability can be pre-
specified (see pp. 277-278 of these notes).
281
The ANOVA: Underlying assumptions
1. The samples are independent random samples from their re-
spective populations.
282
The ANOVA: Partitioning deviations from grand mean
Xij − X̄.. =
283
The ANOVA: Partitioning deviations from grand mean
Data  Xij − X̄..  X̄i. − X̄..  Xij − X̄i.
9    −11   −10   −1
10   −10   −10   0
11   −9    −10   1
19   −1    0     −1
21   1     0     1
28   8     10    −2
30   10    10    0
32   12    10    2
284
The ANOVA: Partitioning the sums of squares
It works!
So what?
286
The ANOVA: Test statistic
287
The ANOVA: Toy example
So,
F = [600/(3 − 1)] / [12/(8 − 3)] = 125.0
The critical value for a test at the .05 significance level is F.95,2,5 =
5.79, so we would reject H0 for these “data.”
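The same numbers come out of R's ANOVA machinery; the grouping below is inferred from the sums of squares (k = 3 groups, N = 8):

y <- c(9, 10, 11, 19, 21, 28, 30, 32)
g <- factor(rep(c("A", "B", "C"), times = c(3, 2, 3)))
anova(lm(y ~ g))   # reproduces SSTreat = 600, SSError = 12, F = 125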
288
The ANOVA: Some remarks
• Although the hypotheses being tested are concerned with pop-
ulation means, the test statistic is a ratio of measures of spread!
Why is this reasonable?
289
The ANOVA: Computational formulas for sums of
squares
Additional notation:
T.. =
290
The ANOVA: Computational formulas for sums of
squares
In practice, the following formulas for the sums of squares are alge-
braically equivalent, but not as painful or prone to numerical errors:
SSTotal = ∑_{i=1}^{k} ∑_{j=1}^{ni} Xij² − T..²/N
SSTreat = ∑_{i=1}^{k} (Ti.²/ni) − T..²/N
SSError = ∑_{i=1}^{k} (ni − 1)si²
Actually, we only need to compute any two of these, and we can
then get the third using the fact that SSTotal = SSTreat + SSError .
291
The ANOVA table
292
The ANOVA: A real example
We get
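A sketch of the computation from the summary statistics on p. 276, using the SS formulas from p. 291, in R:

n <- c(84, 83, 85); xb <- c(5.1, 5.4, 5.0); s <- c(4.1, 4.6, 3.2)
grand <- sum(n * xb) / sum(n)
SSTreat <- sum(n * (xb - grand)^2)                  # about 7.3
SSError <- sum((n - 1) * s^2)                       # about 3990.5
Fstat <- (SSTreat / 2) / (SSError / (sum(n) - 3))   # about 0.23
1 - pf(Fstat, 2, sum(n) - 3)                        # P value, about 0.80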
293
The ANOVA: A real example
Since the computed F statistic is not in the right tail, we do not re-
ject H0 at the .05 level of significance (or at any other typical level
of significance).
Conclusion: The mean number of hot flashes per day is not signifi-
cantly different for the three treatments.
294
Mean separation: Introduction
295
Mean separation: Protected F method
tij = (X̄i. − X̄j.) / √(MSError (1/ni + 1/nj)).
An experiment was performed to study the psychological effects of exercise on male college students.
Four groups of college men were studied:
• Quitters (Q): people who volunteered to participate in the exercise program but did not follow
through
At the beginning and end of the experiment, a psychological test was taken by each person. The scoring
on the exam was measured in such a way that a greater degree of satisfaction/confidence/happiness at
the end of the experiment corresponded to a greater difference in the two exam scores taken by each
person. We therefore think of the µi ’s as representing mean satisfaction levels.
At the 0.10 level of significance, we want to test H0 : µE = µQ = µJ = µS versus Ha : at least one mean
is different from the others.
297
Mean separation: Example
Results:
Group ni X̄i. si
E 5 57.40 10.46
Q 10 51.90 6.42
J 10 58.20 9.49
S 11 49.73 6.27
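From these summaries, MSError and any pairwise tij can be sketched in R:

n <- c(E = 5, Q = 10, J = 10, S = 11)
xb <- c(E = 57.40, Q = 51.90, J = 58.20, S = 49.73)
s  <- c(E = 10.46, Q = 6.42, J = 9.49, S = 6.27)
MSE <- sum((n - 1) * s^2) / (sum(n) - 4)                  # about 62.9, with 32 df
(xb["J"] - xb["S"]) / sqrt(MSE * (1/n["J"] + 1/n["S"]))   # e.g., t_JS; compare to t quantiles, 32 df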
298
Mean separation: Example
299
Mean separation: Example
300
Nonparametric ANOVA (Kruskal-Wallis test): Introduction
301
Kruskal-Wallis test: The hypotheses tested
H0 : M1 = M2 = · · · = Mk
in such a way that the overall Type I error probability can be pre-
specified.
302
Kruskal-Wallis test: Underlying assumptions
1. The samples are independent random samples from their re-
spective populations.
2. The population distributions are all the same shape; they differ
(possibly) only insofar as their medians are concerned.
303
Kruskal-Wallis test: Test statistic
304
Kruskal-Wallis test: Test statistic
If H0 is true, then we would expect each Ri/ni to be fairly close to (N + 1)/2; so if H is large it casts doubt on H0.
Computational formula for test statistic:
H = [12/(N(N + 1))] ∑_{i=1}^{k} (Ri²/ni) − 3(N + 1).
305
Kruskal-Wallis test: Critical values and P values
306
Kruskal-Wallis test: Example
(Problem 8.16 from textbook.) To compare the efficacy of three insect repellants, 19 volunteers applied
fixed amounts of repellant to their hand and arm and then placed them in a chamber with several
hundred hungry female Culex erraticus mosquitoes. The repellants were citronella, N,N-diethyl-meta-
toluamide (DEET) 20%, and Avon Skin So Soft hand lotion. The data recorded below are the times in
minutes until first bite; ranks are given in parentheses.
307
Kruskal-Wallis test: Example
Test statistic:
H = [12/(19 · 20)] (39.5²/6 + 94²/6 + 56.5²/7) − 3(20) = 9.13.
Critical value: χ²_{.95,2} = 5.99.
So we reject H0 . There is statistically significant evidence that the
median time to first bite for at least one repellant is different from
the others.
Pairwise Wilcoxon rank sum tests for each pairwise comparison of
medians show that median times to first bite are significantly differ-
ent for citronella and DEET 20%, and also for Avon Skin So Soft
and DEET 20%; but median times to first bite are not significantly
different for citronella and Avon Skin So Soft.
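The arithmetic checks out in R:

12 / (19 * 20) * (39.5^2 / 6 + 94^2 / 6 + 56.5^2 / 7) - 3 * 20   # H = 9.13 (up to rounding)
qchisq(0.95, df = 2)                                              # critical value 5.99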
308
Hypothesis testing for the probabilities of a distribu-
tion of a categorical variable
309
Hypothesis testing for the probabilities of a distribu-
tion of a categorical variable
Now we consider situations where the variable of interest is categor-
ical. Examples:
310
Hypothesis testing for the probabilities of a distribu-
tion of a categorical variable
More generally, we may want to test hypotheses about all the pro-
portions, for example whether the ratio of three color morphs in a
salamander population is Black:Red-striped:Red-spotted = 1:2:1.
311
The binomial and proportions tests: Introduction
• H0: p = p0 versus Ha: p ≠ p0
• H0: p ≤ p0 versus Ha: p > p0
• H0: p ≥ p0 versus Ha: p < p0
Note: we’ve already dealt with a confidence interval for p (p. 165).
312
The binomial and proportions tests: Introduction
313
The binomial test: Test statistic
The test statistic for the binomial test is simply the number of suc-
cesses in the sample, i.e.
S = # of successes in sample.
You might have noticed that this is similar to the sign test statistic;
in fact it’s equivalent to the sign test when p0 = 0.5.
314
The binomial test: P values
• If Ha is 2-sided, reject H0 if
315
The binomial test: Beer bottling example
Beer drinkers and brewmeisters have long known that exposure to light can cause a “skunky” taste and
smell in beer. In fact, chemical studies have shown how the light-sensitive compounds in hops called
isohumulones degrade forming free radicals that bond to sulfur to cause the skunky taste. Most bottled
beer is sold in green or brown bottles to prevent this. Miller Genuine Draft (MGD) is claimed to be
made from chemically altered hops that don’t break down into free radicals in light and, therefore, the
beer can be sold in less expensive clear bottles. The company thinks the extra cost of a dark bottle will
pay off for them only if more than 60% of beer drinkers would prefer MGD unexposed to light. In a
taste test of MGD stored for 6 months in light-tight containers or exposed to light, a panel of 20 tasters
preferred the light-tight beer 16 times. Should the company use dark bottles?
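The binomial test is exactly binom.test() in R:

binom.test(16, 20, p = 0.6, alternative = "greater")   # P = P(bin(20, .6) >= 16), about 0.051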
317
The proportions test: Test statistic and critical values
When the sample sizes are such that np0 > 5 and n(1 − p0) > 5, we
can still do the binomial test, but alternatively we can get a good
approximation to it using the normal approximation to the binomial.
The test statistic is
z = (S − np0) / √(np0(1 − p0)),
or equivalently,
z = (p̂ − p0) / √(p0(1 − p0)/n)
where p̂ is the sample proportion of successes, S/n (defined previ-
ously on page 158 of these notes).
Critical values are either z_{1−α}, z_{α}, or ±z_{1−α/2} depending on whether Ha points to the right, to the left, or is two-sided.
318
The proportions test: Atlanta immunization example
Recall the Atlanta immunization example introduced on page 308.
In order to test the hypotheses
H0: p ≥ 0.9 versus Ha: p < 0.9,
• H0: p1 = p2 versus Ha: p1 ≠ p2
• H0: p1 ≤ p2 versus Ha: p1 > p2
• H0: p1 ≥ p2 versus Ha: p1 < p2
320
Comparing two population proportions: Test statistic
321
and a “pooled” sample proportion,
p̂c = (n1 p̂1 + n2 p̂2)/(n1 + n2) = (total # of successes)/(total sample size).
Rationale for pooled sample proportion: If the null hypothesis is
true, then p̂1 and p̂2 are estimating the same quantity, so we get a
better estimate by combining them (analogous to pooling the sam-
ple variance for the t test comparing means when we are willing to
assume the two population variances are equal).
Test statistic:
z = (p̂1 − p̂2) / √(p̂c(1 − p̂c)(1/n1 + 1/n2))
322
Comparing two population proportions: Critical val-
ues
This test, like the proportions test for the proportion of a single pop-
ulation, is based on the normal approximation to the binomial dis-
tribution. So we get critical values (and P values) from the standard
normal distribution.
Critical values are z_{1−α}, z_{α}, or ±z_{1−α/2} depending on whether Ha points to the right, to the left, or is 2-sided.
323
Comparing two population proportions: Chronic wast-
ing disease example
On page 166 of these notes, we described a study in which 272 deer were legally
killed by hunters in the Mount Horeb area of SW Wisconsin in 2001-02. From
tissue sample analysis, it was determined that 9 of the deer had chronic wasting
disease (a disease similar to mad cow disease). If 272 deer from the population in
that same region were sampled next winter, and 16 tested positive for the disease,
would that be statistically significant evidence of a change in the infection rate?
324
Comparing two population proportions: Chronic wast-
ing disease example
Sample proportions:
p̂1 = 9/272 = .03309,  p̂2 = 16/272 = .05882,  p̂c = (9 + 16)/(272 + 272) = .04596
Test statistic:
z = (.03309 − .05882) / √(.04596(1 − .04596)(1/272 + 1/272)) = −1.43
Critical values (taking α = .05): ±z.975 = ±1.96
So we don’t reject H0 . If the sample infection rate changed to this
degree, it would not constitute statistically significant evidence for a
change in the population’s infection rate.
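R's prop.test() gives an equivalent answer (its chi-squared statistic is z²):

prop.test(c(9, 16), c(272, 272), correct = FALSE)   # X-squared = (-1.43)^2 = 2.05, P about 0.15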
325
Confidence interval for the difference in two popula-
tion proportions
326
Chi-square test for goodness-of-fit: Introduction
In some situations with categorical data, there are more than two
categories (levels) and hence more than one proportion parameter of
interest. Consider the following example.
The nests of the wood ant, Formica rufa, are constructed from small twigs and
wood chips. As part of a study of where ants build these nests, the direction of
greatest slope was recorded for 42 such nests in Pound Wood, Essex, England.
Compass directions were divided into four classes: North, East, South, West. The
direction of greatest slope for the 42 nests were 3, 8, 24, and 7 (in the same order
of listing as the compass directions). Do the ants prefer any particular direction of
exposure over another?
327
Chi-square test for goodness-of-fit: Introduction
H0: pN = pE = pS = pW = 1/4 versus
Ha: At least one proportion is different from the others
328
Chi-square test for goodness-of-fit: Hypotheses tested
Here p01, p02, . . . , p0k are proportions we specify. (In the wood ant example they are all 1/4.)
Note: The textbook describes H0 and Ha in an equivalent way, but
using words; it doesn’t use the notation p1 , p01 , etc.
329
Chi-square test for goodness-of-fit: Test statistic
330
Chi-square test for goodness-of-fit: Test statistic
Note:
331
Chi-square test for goodness-of-fit: Critical value
Note: What we have just tested is what the textbook would call an
extrinsic model. The text also describes an intrinsic model and mod-
ifies the χ² test slightly for such a model; you can skip this.
332
Chi-square test for goodness-of-fit: Wood ant exam-
ple
Recall the wood ant example, for which the hypotheses to be tested
are
H0: pN = pE = pS = pW = 1/4 versus
Ha: At least one proportion is different from the others.
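In R, the whole test is one call:

chisq.test(c(3, 8, 24, 7), p = rep(1/4, 4))   # expected counts all 10.5; X-squared = 24.48 on 3 df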
333
Chi-square test for goodness-of-fit: Wood ant exam-
ple
334
Chi-square test for r × k contingency tables
335
Contingency tables example: The CASH Study
The CASH (Cancer and Steroid Hormone) study was conducted during the 1980s
to investigate the relationship between oral contraceptive use and three cancers
(breast, endometrial, and ovarian) in U.S. women. Part of this very comprehensive
study investigated whether family history of breast cancer was a risk factor for
breast cancer. 4730 women having breast cancer, and 4688 women not having
breast cancer, were asked how many of their first-degree relatives (mother, sister,
daughter) had breast cancer. The results are displayed below:
Scientific question: Does family history of breast cancer increase a woman’s own
risk of breast cancer?
336
Contingency tables example: beetles in logs
A method commonly used by ecologists to detect an association between species
(possibly mutualistic, parasitic, or something else) is to take a series of obser-
vational units where the species live or forage, such as ponds or trees, and then
count the number of those units in which both species are found, neither species
is found, or one or the other of the species is found.
In one such study, 500 logs in a forest were sampled for the presence of two beetle
species (labeled here generically as Species A and Species B). Results were as
follows:
338
with similar definitions for pYes,0, pYes,1, and pYes,≥2. Then estimates of, for example, pNo,0 and pYes,1 are
p̂No,0 = 4403/4688 = 0.9392,  p̂Yes,1 = 511/4730 = 0.1080.
In the beetles-in-logs study, a single population is sampled, and we
have for example,
p̂PP = 202/500 = 0.404,  p̂AP = 106/500 = 0.212.
339
Chi-square test for contingency tables: Hypotheses
versus
340
Chi-square test for contingency tables: Hypotheses
versus
341
Chi-square test for contingency tables: Test statistic
The test statistic for testing these hypotheses is once again a chi-
square statistic,
χ² = ∑_{i=1}^{r} ∑_{j=1}^{k} (Oij − Eij)² / Eij,
where
E11 = (282)(308)/500 = 173.712,  E12 = (282)(192)/500 = 108.288,
E22 = (218)(192)/500 = 83.712,  E21 = (218)(308)/500 = 134.288.
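With the cell counts implied by the marginal totals above (a reconstruction: 202, 80, 106, 112, consistent with row totals 282/218 and column totals 308/192), the test can be sketched in R:

logs <- matrix(c(202, 80, 106, 112), nrow = 2, byrow = TRUE)   # rows: A present/absent; cols: B present/absent
chisq.test(logs, correct = FALSE)   # X-squared about 27.5 on 1 df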
343
Chi-square test for contingency tables: Critical val-
ues
(r − 1)(k − 1),
344
Chi-square test for contingency tables: Breast cancer
example
Review the breast cancer example from several pages back. The
Oi j ’s are
346
Chi-square test for contingency tables: Beetles-in-logs
example
349
Correlation and regression analysis: Overview
Let X and Y represent the two continuous variables. Often, it makes
sense to think that one of the two variables may depend on the other;
we let Y represent that variable (the dependent variable) and let X
represent the other variable (the independent, or explanatory, vari-
able).
Statistical approach to understanding the relationship, if any, be-
tween X and Y : We imagine that X and Y exist for each member
of a large (possibly infinitely large) population. We take a finite ran-
dom sample of size n from the population and measure X and Y on
each sampled individual.
This yields data (X1 ,Y1 ), (X2 ,Y2 ), . . . , (Xn ,Yn ) which we may use to
make inferences about the relationship between X and Y for the pop-
ulation as a whole.
350
Correlation and regression analysis: Overview
351
Correlation and regression analysis: Overview
352
Correlation and regression analysis: Overview
354
Pearson’s correlation coefficient
Some properties of r:
• −1 ≤ r ≤ 1
• r = 0 ⇔ no linear relationship exists between X and Y
• r > 0 ⇔ positive linear relationship; r = 1 ⇔ perfect positive linear relationship
• r < 0 ⇔ negative linear relationship; r = −1 ⇔ perfect negative linear relationship
• The closer r is to 1, the stronger the positive linear relationship; the closer r is to −1, the stronger the negative linear relationship
355
Pearson’s correlation coefficient for various scatter-
plots
356
Inference for a population correlation coefficient: Hy-
potheses
H0: ρ = 0 versus Ha: ρ ≠ 0.
357
Inference for a population correlation coefficient: Test
statistic and critical values
• t_{1−α,n−2} if Ha is ρ > 0
• t_{α,n−2} if Ha is ρ < 0
• ±t_{1−α/2,n−2} if Ha is ρ ≠ 0
This testing approach is valid, provided that either X and Y are normally distributed or n is sufficiently large (n ≥ 25).
358
Correlation analysis example: Relationship between
heart disease and a fatty diet
Y = 100[log(# of deaths from heart disease per 100,000 males aged 55-59) − 2]
X = fat calories as a percent of total calories in diet.
Do these data indicate that heart disease and a fatty diet are associated?
359
Correlation analysis example: Relationship between
heart disease and a fatty diet
Scatter diagram:
Let’s test
H0: ρ = 0 versus Ha: ρ ≠ 0
at the 0.05 level of significance.
Test statistic: t = 0.45 / √((1 − 0.45²)/(22 − 2)) = 2.23.
361
Another correlation analysis example: Relationship
between heart disease and telephone abundance
Data from the same 22 countries from the previous example are also available on
the variable
X = # of telephones per 1000 population.
Are heart disease (Y ) and telephone abundance associated?
362
Another correlation analysis example: Relationship
between heart disease and telephone abundance
Scatter diagram:
When we test
H0: ρ = 0 versus Ha: ρ ≠ 0
364
Nonparametric correlation coefficients
In those situations where the sample size is relatively small (< 25)
and the population of values of the variable of interest is not nor-
mally distributed, we cannot safely do correlation analysis with Pear-
son’s correlation coefficient. Instead, we use a nonparametric corre-
lation coefficient, which is based on the ranks of the data.
Our textbook describes the following 2 nonparametric correlation
coefficients:
Xi     Yi     rXi  rYi  di
1.72   0.19   3    1    2
0.58   0.92   1    2    −1
1.12   1.54   2    3    −1
So rS = 1 − 6(6)/(3(8)) = −0.5.
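R agrees with the hand computation:

x <- c(1.72, 0.58, 1.12); y <- c(0.19, 0.92, 1.54)
cor(x, y, method = "spearman")   # -0.5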
367
Spearman’s correlation coefficient: Hypothesis test-
ing
H0: ρS = 0 versus Ha: ρS ≠ 0
368
Spearman’s correlation coefficient: Examples
Let’s revisit the example of heart disease versus fatty diet considered
previously. For those data we have
n = 22, rS = 0.39.
The critical values for a two-sided test at a = 0.05 are ±0.425, so
we do not reject H0 . (In fact, 0.05 < P < 0.10). So we conclude
that there is not statistically significant evidence of a relationship
between heart disease and a fatty diet.
Recall that with the parametric test (the t-test), we concluded there
was statistically significant evidence of a linear relationship between
these 2 variables. What gives? Two probable explanations:
• Nonparametric tests are not as powerful as parametric tests
• There are 2 outliers that strongly affect r but not rS
369
Spearman’s correlation coefficient: Examples
n = 22, rS = 0.54.
370
Correlation analysis: Final remarks
• Correlation analysis assumes that a linear relationship exists
between Y and X (i.e. Yi = AXi + B, possibly with A = 0).
• Correlation analysis seeks to determine if the linear relation-
ship is positive, negative, or 0; and how strong it is.
• Effect of transformations:
– Linear transformations, Ui = AU Xi + BU and Vi = AV Yi + BV (with AU, AV > 0), have no effect on either r or rS
– Monotone increasing transformations (like taking logs)
have no effect on rS (because the ranking remains the
same), but r may change
– Non-monotonic nonlinear transformations affect r and rS
in unpredictable ways
371
Regression analysis: Introduction
1. What is the equation of the straight line that best fits the data?
372
Simple linear regression analysis: Introduction
373
Simple linear regression analysis: Conceptual foun-
dation
µ_{Y|X} and σ²_{Y|X}.
µ_{Y|X} = α + βX;
374
Simple linear regression analysis: Conceptual foun-
dation
Here:
• By eye?
375
Least squares estimation of a and b
376
Least squares estimation of a and b
Ŷ = α̂ + β̂X,
on the scatterplot.
377
Least squares estimation of a and b : Toy example
β̂ =
α̂ =
378
Prediction using the least squares regression line
µ̂_{Y|x} = α̂ + β̂x.
Ŷ|x = α̂ + β̂x.
379
Prediction using the least squares regression line
380
Simple linear regression analysis: Assumptions for
inference
α̂, β̂, and µ̂_{Y|x} are all point estimates of their respective parameters.
To obtain confidence intervals for these parameters and do hypoth-
esis tests, we need to make some further assumptions. We assume
that:
• µ_{Y|X} = α + βX (as before)
381
Simple linear regression model
Picture:
This model is called the simple linear regression model. Its param-
eters are α, β, and σY².
382
Estimation of residual variance
383
Estimation of residual variance: Toy example
For the toy example on page 378 of these notes, recall that
β̂ = , α̂ =
Xi  Yi  α̂ + β̂Xi  Yi − (α̂ + β̂Xi)
1   0
3   3
4   2
2   1
2   2
So sY² =
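As a check on the blanks above, R's lm() gives the least squares fit and the residual variance:

x <- c(1, 3, 4, 2, 2); y <- c(0, 3, 2, 1, 2)
fit <- lm(y ~ x)
coef(fit)                # alpha-hat = -0.154, beta-hat = 0.731
sum(resid(fit)^2) / 3    # s_Y^2 = SSE/(n - 2), about 0.81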
384
Confidence interval for Y |x
385