Chapter 6: Inference for categorical data

OpenIntro Statistics, 4th Edition

Slides developed by Mine Çetinkaya-Rundel of OpenIntro.


The slides may be copied, edited, and/or shared via the CC BY-SA license.
Some images may be included under fair use guidelines (educational purposes).
Inference for a single proportion
Two scientists want to know if a certain drug is effective against
high blood pressure. The first scientist wants to give the drug to
1000 people with high blood pressure and see how many of them
experience lower blood pressure levels. The second scientist wants
to give the drug to 500 people with high blood pressure, and not give
the drug to another 500 people with high blood pressure, and see
how many in both groups experience lower blood pressure levels.
Which is the better way to test this drug?

(a) All 1000 get the drug


(b) 500 get the drug, 500 don’t

Results from the GSS

The GSS asks the same question. Below is the distribution of responses from the 2010 survey:

Response                       Count
All 1000 get the drug             99
500 get the drug, 500 don’t      571
Total                            670

Parameter and point estimate

We would like to estimate the proportion of all Americans who have good intuition about experimental design, i.e. who would answer “500 get the drug, 500 don’t”. What are the parameter of interest and the point estimate?

• Parameter of interest: Proportion of all Americans who have good intuition about experimental design.

p (a population proportion)

• Point estimate: Proportion of sampled Americans who have good intuition about experimental design.

p̂ (a sample proportion)

Inference on a proportion

What percent of all Americans have good intuition about experimental design, i.e. would answer “500 get the drug, 500 don’t”?

• We can answer this research question using a confidence interval, which we know is always of the form

point estimate ± ME

• And we also know that ME = critical value × standard error of the point estimate.

Standard error of a sample proportion

$$SE_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$

Sample proportions are also nearly normally distributed

Central limit theorem for proportions

Sample proportions will be nearly normally distributed with mean equal to the population proportion, p, and standard error equal to $\sqrt{\frac{p(1 - p)}{n}}$.

$$\hat{p} \sim N\left(\text{mean} = p,\ SE = \sqrt{\frac{p(1 - p)}{n}}\right)$$

But of course this is true only under certain conditions... Any guesses?

Independent observations, at least 10 successes and 10 failures

Note: If p is unknown (most cases), we use p̂ in the calculation of the standard error.

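The central limit theorem claim above can be checked with a small simulation. The following is a minimal sketch in Python and is not part of the original slides; the function name `simulate_sample_proportions`, the seed, the number of repetitions, and the values p = 0.85 and n = 670 are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` samples of size n and return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

# Assumed values, chosen only for illustration.
p, n = 0.85, 670
p_hats = simulate_sample_proportions(p, n, reps=2000)

# Standard error predicted by the formula sqrt(p(1 - p) / n).
theoretical_se = sqrt(p * (1 - p) / n)

print("mean of simulated p-hats:", round(mean(p_hats), 4))
print("sd of simulated p-hats:  ", round(stdev(p_hats), 4))
print("theoretical SE:          ", round(theoretical_se, 4))
```

If the conditions hold, the mean of the simulated proportions should sit close to p and their standard deviation close to the theoretical SE.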
Back to experimental design...

The GSS found that 571 out of 670 (85%) Americans answered the question on experimental design correctly. Estimate (using a 95% confidence interval) the proportion of all Americans who have good intuition about experimental design.

Given: n = 670, p̂ = 0.85. First, check conditions.

1. Independence: The sample is random, and 670 < 10% of all Americans; therefore we can assume that one respondent’s response is independent of another’s.
2. Success-failure: 571 people answered correctly (successes) and 99 answered incorrectly (failures); both are greater than 10.

We are given that n = 670 and p̂ = 0.85. We also just learned that the standard error of the sample proportion is $SE = \sqrt{\frac{p(1 - p)}{n}}$. Which of the below is the correct calculation of the 95% confidence interval?

(a) $0.85 \pm 1.96 \times \sqrt{\frac{0.85 \times 0.15}{670}}$ → (0.82, 0.88)
(b) $0.85 \pm 1.65 \times \sqrt{\frac{0.85 \times 0.15}{670}}$
(c) $0.85 \pm 1.96 \times \frac{0.85 \times 0.15}{\sqrt{670}}$
(d) $571 \pm 1.96 \times \sqrt{\frac{571 \times 99}{670}}$

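The interval in option (a) can be reproduced numerically. This is a hedged sketch rather than anything from the slides: the helper name `proportion_ci` is made up for illustration, and it takes the rounded p̂ = 0.85 as input so that it mirrors the slide's arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a single proportion."""
    # Critical value z* for the requested confidence level (about 1.96 for 95%).
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # For a confidence interval the standard error uses the observed p-hat.
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * se
    return p_hat - margin_of_error, p_hat + margin_of_error

lower, upper = proportion_ci(0.85, 670)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Passing the unrounded proportion 571/670 instead of 0.85 shifts the endpoints slightly, so small differences in the last digit are expected.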
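Choosing a sample size

How many people should you sample in order to cut the margin of error of a 95% confidence interval down to 1%?

$$ME = z^\star \times SE$$

$$0.01 \ge 1.96 \times \sqrt{\frac{0.85 \times 0.15}{n}} \quad \rightarrow \text{Use } \hat{p} \text{ from previous study}$$
$$0.01^2 \ge 1.96^2 \times \frac{0.85 \times 0.15}{n}$$
$$n \ge \frac{1.96^2 \times 0.85 \times 0.15}{0.01^2}$$
$$n \ge 4898.04 \quad \rightarrow n \text{ should be at least 4,899}$$
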
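The algebra above amounts to solving the margin-of-error inequality for n and rounding up. A minimal sketch, assuming a hypothetical helper named `required_sample_size` (not from the slides):

```python
from math import ceil

def required_sample_size(p, margin, z_star=1.96):
    """Smallest whole n for which z* * sqrt(p(1 - p) / n) is at most `margin`."""
    # Rearranged inequality: n >= z*^2 * p * (1 - p) / margin^2
    return ceil(z_star**2 * p * (1 - p) / margin**2)

# p-hat = 0.85 comes from the previous study; the target margin of error is 1%.
n_needed = required_sample_size(0.85, 0.01)
print(n_needed)
```

Rounding up rather than to the nearest integer matters here: a sample of 4,898 would leave the margin of error just above 1%.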
What if there isn’t a previous study?

... use p̂ = 0.5

Why?

• If you don’t know any better, 50-50 is a good guess.
• p̂ = 0.5 gives the most conservative estimate: the highest possible sample size.

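The "most conservative" claim follows from the fact that p(1 − p) is largest at p = 0.5. The sketch below is an illustration only; the helper `required_sample_size`, the grid of candidate values, and the 1% margin are assumptions chosen to match the sample-size example.

```python
from math import ceil

def required_sample_size(p, margin=0.01, z_star=1.96):
    """Smallest whole n for which z* * sqrt(p(1 - p) / n) is at most `margin`."""
    return ceil(z_star**2 * p * (1 - p) / margin**2)

# Search a grid of candidate proportions for the one that maximizes p(1 - p).
candidates = [i / 100 for i in range(1, 100)]
worst_p = max(candidates, key=lambda p: p * (1 - p))

# Required sample sizes for a few planning values of p.
sizes = {p: required_sample_size(p) for p in (0.1, 0.3, 0.5, 0.7, 0.85)}

print("p that maximizes p(1 - p):", worst_p)
for p, n in sizes.items():
    print(f"p = {p}: n >= {n}")
```

With a 1% margin of error, planning with p̂ = 0.5 calls for roughly 9,600 respondents, about twice what the p̂ = 0.85 planning value requires.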
CI vs. HT for proportions

• Success-failure condition:
  • CI: At least 10 observed successes and failures
  • HT: At least 10 expected successes and failures, calculated using the null value
• Standard error:
  • CI: calculate using the observed sample proportion: $SE = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
  • HT: calculate using the null value: $SE = \sqrt{\frac{p_0(1 - p_0)}{n}}$

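The only difference between the two success-failure checks is which proportion is plugged in. A small sketch, assuming a hypothetical helper `success_failure_ok` and an illustrative null value of 0.80:

```python
def success_failure_ok(n, p, minimum=10):
    """Check that both the success and failure counts implied by p are at least `minimum`."""
    successes = n * p
    failures = n * (1 - p)
    return successes >= minimum and failures >= minimum

n = 670

# Confidence interval: use the observed proportion (571 of 670 correct answers).
ci_ok = success_failure_ok(n, 571 / n)

# Hypothesis test: use the null value (0.80 is an assumed example value here).
ht_ok = success_failure_ok(n, 0.80)

print("CI condition met:", ci_ok)
print("HT condition met:", ht_ok)
```

For a small sample, such as n = 20 with p = 0.10, the same check fails because only 2 successes are expected.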
The GSS found that 571 out of 670 (85%) Americans answered the question on experimental design correctly. Do these data provide convincing evidence that more than 80% of Americans have a good intuition about experimental design?

$$H_0: p = 0.80 \qquad H_A: p > 0.80$$

$$SE = \sqrt{\frac{0.80 \times 0.20}{670}} = 0.0154$$
$$Z = \frac{0.85 - 0.80}{0.0154} = 3.25$$
$$\text{p-value} = 1 - 0.9994 = 0.0006$$

[Figure: normal curve of sample proportions, with 0.8 and 0.85 marked on the horizontal axis]

Since the p-value is low, we reject $H_0$. The data provide convincing evidence that more than 80% of Americans have a good intuition about experimental design.

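The test above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the slides' own method: the function name `one_proportion_z_test` is hypothetical, and the normal CDF from the standard library stands in for a Z table.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(p_hat, n, p0):
    """Upper-tailed z-test of H0: p = p0 against HA: p > p0."""
    # For a hypothesis test the standard error uses the null value p0.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Upper-tail area beyond the observed z score.
    p_value = 1 - NormalDist().cdf(z)
    return se, z, p_value

se, z, p_value = one_proportion_z_test(0.85, 670, 0.80)
print(f"SE = {se:.4f}, Z = {z:.2f}, p-value = {p_value:.4f}")
```

Because the slide rounds the SE to 0.0154 before dividing, its Z of 3.25 differs slightly from the unrounded value computed here; the conclusion is the same.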
11% of 1,001 Americans responding to a 2006 Gallup survey stated that they have objections to celebrating Halloween on religious grounds. At the 95% confidence level, the margin of error for this survey is ±3%. A news piece on this study’s findings states: “More than 10% of all Americans have objections on religious grounds to celebrating Halloween.” At the 95% confidence level, is this news piece’s statement justified?

(a) Yes
(b) No
(c) Cannot tell

Recap - inference for one proportion

• Population parameter: p, point estimate: p̂
• Conditions:
  • independence
    - random sample and 10% condition
  • at least 10 successes and failures
    - if not → randomization
• Standard error: $SE = \sqrt{\frac{p(1 - p)}{n}}$
  • for CI: use p̂
  • for HT: use $p_0$

Difference of two proportions
Melting ice cap

Scientists predict that global warming may have big effects on the
polar regions within the next 100 years. One of the possible effects
is that the northern ice cap may completely melt. Would this bother
you a great deal, some, a little, or not at all if it actually happened?

(a) A great deal


(b) Some
(c) A little
(d) Not at all

Results from the GSS

The GSS asks the same question. Below are the distributions of responses from the 2010 GSS as well as from a group of introductory statistics students at Duke University:

               GSS   Duke
A great deal   454     69
Some           124     30
A little        52      4
Not at all      50      2
Total          680    105

Parameter and point estimate

• Parameter of interest: Difference between the proportions of all Duke students and all Americans who would be bothered a great deal by the northern ice cap completely melting.

$p_{Duke} - p_{US}$

• Point estimate: Difference between the proportions of sampled Duke students and sampled Americans who would be bothered a great deal by the northern ice cap completely melting.

$\hat{p}_{Duke} - \hat{p}_{US}$

Inference for comparing proportions

• The details are the same as before...
• CI: point estimate ± margin of error
• HT: Use $Z = \frac{\text{point estimate} - \text{null value}}{SE}$ to find the appropriate p-value.
• We just need the appropriate standard error of the point estimate ($SE_{\hat{p}_{Duke} - \hat{p}_{US}}$), which is the only new concept.

Standard error of the difference between two sample proportions

$$SE_{(\hat{p}_1 - \hat{p}_2)} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

Conditions for CI for difference of proportions

1. Independence within groups:
  • The US group is sampled randomly and we’re assuming that the Duke group represents a random sample as well.
  • 105 < 10% of all Duke students and 680 < 10% of all Americans.
  We can assume that the attitudes of Duke students in the sample are independent of each other, and attitudes of US residents in the sample are independent of each other as well.
2. Independence between groups: The sampled Duke students and the US residents are independent of each other.
3. Success-failure: At least 10 observed successes and 10 observed failures in each of the two groups.

Construct a 95% confidence interval for the difference between the proportions of Duke students and Americans who would be bothered a great deal by the melting of the northern ice cap ($p_{Duke} - p_{US}$).

Data               Duke     US
A great deal         69    454
Not a great deal     36    226
Total               105    680
p̂                 0.657  0.668

$$(\hat{p}_{Duke} - \hat{p}_{US}) \pm z^\star \times \sqrt{\frac{\hat{p}_{Duke}(1 - \hat{p}_{Duke})}{n_{Duke}} + \frac{\hat{p}_{US}(1 - \hat{p}_{US})}{n_{US}}}$$
$$= (0.657 - 0.668) \pm 1.96 \times \sqrt{\frac{0.657 \times 0.343}{105} + \frac{0.668 \times 0.332}{680}}$$
$$= -0.011 \pm 1.96 \times 0.0497$$
$$= -0.011 \pm 0.097$$
$$= (-0.108, 0.086)$$

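The interval above can be recomputed directly from the counts. A minimal sketch, assuming a hypothetical helper `two_proportion_ci` and the slide's critical value of 1.96:

```python
from math import sqrt

def two_proportion_ci(x1, n1, x2, n2, z_star=1.96):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    # Each group's observed proportion feeds its own variance term.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Group 1 is Duke (69 of 105), group 2 is the US sample (454 of 680).
lower, upper = two_proportion_ci(69, 105, 454, 680)
print(f"95% CI for p_Duke - p_US: ({lower:.3f}, {upper:.3f})")
```

Working from unrounded proportions moves the upper endpoint by about 0.001 relative to the slide. Either way the interval contains 0, so it gives no evidence of a difference between the two groups.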
Which of the following is the correct set of hypotheses for testing if the proportion of all Duke students who would be bothered a great deal by the melting of the northern ice cap differs from the proportion of all Americans who would?

(a) $H_0: p_{Duke} = p_{US}$
    $H_A: p_{Duke} \ne p_{US}$
(b) $H_0: \hat{p}_{Duke} = \hat{p}_{US}$
    $H_A: \hat{p}_{Duke} \ne \hat{p}_{US}$
(c) $H_0: p_{Duke} - p_{US} = 0$
    $H_A: p_{Duke} - p_{US} \ne 0$
(d) $H_0: p_{Duke} = p_{US}$
    $H_A: p_{Duke} < p_{US}$

Both (a) and (c) are correct.

Flashback to working with one proportion

• When constructing a confidence interval for a population proportion, we check if the observed numbers of successes and failures are at least 10.

$$n\hat{p} \ge 10 \qquad n(1 - \hat{p}) \ge 10$$

• When conducting a hypothesis test for a population proportion, we check if the expected numbers of successes and failures are at least 10.

$$np_0 \ge 10 \qquad n(1 - p_0) \ge 10$$

Pooled estimate of a proportion

• In the case of comparing two proportions where $H_0: p_1 = p_2$, there isn’t a given null value we can use to calculate the expected number of successes and failures in each sample.
• Therefore, we need to first find a common (pooled) proportion for the two groups, and use that in our analysis.
• This simply means finding the proportion of total successes among the total number of observations.

Pooled estimate of a proportion

$$\hat{p} = \frac{\#\text{ of successes}_1 + \#\text{ of successes}_2}{n_1 + n_2}$$

Calculate the estimated pooled proportion of Duke students and Americans who would be bothered a great deal by the melting of the northern ice cap. Which sample proportion ($\hat{p}_{Duke}$ or $\hat{p}_{US}$) is the pooled estimate closer to? Why?

Data               Duke     US
A great deal         69    454
Not a great deal     36    226
Total               105    680
p̂                 0.657  0.668

$$\hat{p} = \frac{\#\text{ of successes}_1 + \#\text{ of successes}_2}{n_1 + n_2} = \frac{69 + 454}{105 + 680} = \frac{523}{785} = 0.666$$

Do these data suggest that the proportion of all Duke students who would be bothered a great deal by the melting of the northern ice cap differs from the proportion of all Americans who do? Calculate the test statistic, the p-value, and interpret your conclusion in context of the data.

Data               Duke     US
A great deal         69    454
Not a great deal     36    226
Total               105    680
p̂                 0.657  0.668

$$Z = \frac{\hat{p}_{Duke} - \hat{p}_{US}}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n_{Duke}} + \frac{\hat{p}(1 - \hat{p})}{n_{US}}}}$$
$$= \frac{0.657 - 0.668}{\sqrt{\frac{0.666 \times 0.334}{105} + \frac{0.666 \times 0.334}{680}}} = \frac{-0.011}{0.0495} = -0.22$$
$$\text{p-value} = 2 \times P(Z < -0.22) = 2 \times 0.41 = 0.82$$

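The pooled test above can be sketched as follows. The function name `pooled_two_proportion_z_test` is a hypothetical label for illustration, and the standard library's normal CDF replaces the Z table used on the slide.

```python
from math import sqrt
from statistics import NormalDist

def pooled_two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled proportion in the SE."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate: total successes over total observations.
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) / n1 + p_pool * (1 - p_pool) / n2)
    z = (p1 - p2) / se
    # Two-sided p-value: double the tail area beyond |z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return p_pool, se, z, p_value

# Group 1 is Duke (69 of 105), group 2 is the US sample (454 of 680).
p_pool, se, z, p_value = pooled_two_proportion_z_test(69, 105, 454, 680)
print(f"pooled p = {p_pool:.3f}, SE = {se:.4f}, Z = {z:.2f}, p-value = {p_value:.2f}")
```

The unrounded statistic differs from the slide's −0.22 only in the second decimal place. With a p-value this large, the usual reading is that the data do not provide convincing evidence of a difference between the two proportions.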
Recap - comparing two proportions

• Population parameter: $(p_1 - p_2)$, point estimate: $(\hat{p}_1 - \hat{p}_2)$
• Conditions:
  • independence within groups
    - random sample and 10% condition met for both groups
  • independence between groups
  • at least 10 successes and failures in each group
    - if not → randomization (Section 6.4)
• $SE_{(\hat{p}_1 - \hat{p}_2)} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
  • for CI: use $\hat{p}_1$ and $\hat{p}_2$
  • for HT:
    • when $H_0: p_1 = p_2$: use $\hat{p}_{pool} = \frac{\#\,suc_1 + \#\,suc_2}{n_1 + n_2}$
    • when $H_0: p_1 - p_2 =$ (some value other than 0): use $\hat{p}_1$ and $\hat{p}_2$
      - this is pretty rare

Reference - standard error calculations

             one sample                        two samples

mean         $SE = \frac{s}{\sqrt{n}}$         $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

proportion   $SE = \sqrt{\frac{p(1-p)}{n}}$    $SE = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$

• When working with means, it’s very rare that σ is known, so we usually use s.
• When working with proportions,
  • if doing a hypothesis test, p comes from the null hypothesis
  • if constructing a confidence interval, use p̂ instead
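To illustrate the confidence-interval case, here is a minimal sketch of a 95% interval for a difference of two proportions, where each group's own p̂ goes into the standard error. It reuses the Duke and US counts from the earlier two-proportion example; the variable names and the choice of a 95% level are assumptions made for illustration.

```r
# 95% confidence interval for a difference of two proportions.
# For a CI there is no null value, so the SE uses each group's own p-hat.
x1 <- 69;  n1 <- 105   # Duke: successes, sample size
x2 <- 454; n2 <- 680   # US: successes, sample size

p1_hat <- x1 / n1
p2_hat <- x2 / n2

se_ci  <- sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_star <- qnorm(0.975)                       # critical value for 95% confidence
ci     <- (p1_hat - p2_hat) + c(-1, 1) * z_star * se_ci

round(c(se = se_ci, lower = ci[1], upper = ci[2]), 3)
```

The interval spans zero, which agrees with the hypothesis test's failure to reject H0 for these data.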
Chi-square test of goodness of fit (GOF)
Weldon’s dice

• Walter Frank Raphael Weldon (1860 - 1906) was an English
evolutionary biologist and a founder of biometry. He was the joint
founding editor of Biometrika, with Francis Galton and Karl Pearson.
• In 1894, he rolled 12 dice 26,306 times and recorded the number
of 5s or 6s (which he considered to be a success).
• It was observed that 5s or 6s occurred more often than
expected, and Pearson hypothesized that this was probably
due to the construction of the dice. Most inexpensive dice
have hollowed-out pips, and since opposite sides add to 7, the
face with 6 pips is lighter than its opposing face, which has
only 1 pip.
Labby’s dice

• In 2009, Zacariah Labby (U of Chicago) repeated Weldon’s
experiment using a homemade dice-throwing, pip-counting machine.
http://www.youtube.com/watch?v=95EErdouO2w
• The rolling-imaging process took about 20 seconds per roll.
• Each day there were ∼150 images to process manually.
• At this rate Weldon’s experiment was repeated in a little more
than six full days.
• Recommended reading:
http://galton.uchicago.edu/about/docs/labby09dice.pdf
Labby’s dice (cont.)

• Labby did not actually observe the same phenomenon that
Weldon observed (higher frequency of 5s and 6s).
• Automation allowed Labby to collect more data than Weldon
did in 1894: instead of recording “successes” and “failures”,
Labby recorded the individual number of pips on each die.
Expected counts

Labby rolled 12 dice 26,306 times. If each side is equally likely
to come up, how many 1s, 2s, · · · , 6s would he expect to have
observed?

(a) $\frac{1}{6}$
(b) $\frac{12}{6}$
(c) $\frac{26{,}306}{6}$
(d) $\frac{12 \times 26{,}306}{6} = 52{,}612$
Summarizing Labby’s results

The table below shows the observed and expected counts from
Labby’s experiment.

Outcome   Observed   Expected
1         53,222     52,612
2         52,118     52,612
3         52,465     52,612
4         52,338     52,612
5         52,244     52,612
6         53,285     52,612
Total     315,672    315,672

Why are the expected counts the same for all outcomes but the
observed counts are different? At a first glance, does there appear
to be an inconsistency between the observed and expected counts?
Setting the hypotheses

Do these data provide convincing evidence of an inconsistency
between the observed and expected counts?

H0 : There is no inconsistency between the observed and the
expected counts. The observed counts follow the same
distribution as the expected counts.
HA : There is an inconsistency between the observed and the
expected counts. The observed counts do not follow the same
distribution as the expected counts. There is a bias in which
side comes up on the roll of a die.
Evaluating the hypotheses

• To evaluate these hypotheses, we quantify how different the
observed counts are from the expected counts (O − E).
• Large deviations from what would be expected based on
sampling variation (chance) alone provide strong evidence for
the alternative hypothesis.
• This is called a goodness of fit test since we’re evaluating how
well the observed data fit the expected distribution.
Anatomy of a test statistic

• The general form of a test statistic is

$$\frac{\text{point estimate} - \text{null value}}{\text{SE of point estimate}}$$

• This construction is based on
1. identifying the difference between a point estimate and an
expected value if the null hypothesis were true, and
2. standardizing that difference using the standard error of the
point estimate.
These two ideas will help in the construction of an appropriate
test statistic for count data.
Chi-square statistic

When dealing with counts and investigating how far the observed
counts are from the expected counts, we use a new test statistic
called the chi-square (χ2 ) statistic.

χ2 statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(O - E)^2}{E} \quad \text{where } k = \text{total number of cells}$$
Calculating the chi-square statistic

Outcome   Observed   Expected   $\frac{(O-E)^2}{E}$
1         53,222     52,612     $\frac{(53{,}222 - 52{,}612)^2}{52{,}612} = 7.07$
2         52,118     52,612     $\frac{(52{,}118 - 52{,}612)^2}{52{,}612} = 4.64$
3         52,465     52,612     $\frac{(52{,}465 - 52{,}612)^2}{52{,}612} = 0.41$
4         52,338     52,612     $\frac{(52{,}338 - 52{,}612)^2}{52{,}612} = 1.43$
5         52,244     52,612     $\frac{(52{,}244 - 52{,}612)^2}{52{,}612} = 2.57$
6         53,285     52,612     $\frac{(53{,}285 - 52{,}612)^2}{52{,}612} = 8.61$
Total     315,672    315,672    24.73
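The table's arithmetic can be reproduced with a few lines of R. This is a minimal sketch using the observed counts above and equal expected counts; the variable names are illustrative.

```r
# Chi-square statistic for Labby's dice, from the observed counts in the table.
observed <- c(53222, 52118, 52465, 52338, 52244, 53285)

# Under H0 every face is equally likely: 12 dice x 26,306 rolls, split six ways.
expected <- rep(12 * 26306 / 6, times = 6)

contrib <- (observed - expected)^2 / expected   # one term per outcome
chi_sq  <- sum(contrib)

round(contrib, 2)   # per-outcome contributions
round(chi_sq, 2)    # total
```

Outcomes 1 and 6 contribute the two largest terms, so most of the total comes from those two faces.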


Why square?

Squaring the difference between the observed and the expected
outcome does two things:

• Any standardized difference that is squared will now be
positive.
• Differences that already looked unusual will become much
larger after being squared.

When have we seen this before?
The chi-square distribution

• In order to determine if the χ2 statistic we calculated is
considered unusually high or not, we need to first describe its
distribution.
• The chi-square distribution has just one parameter called
degrees of freedom (df), which influences the shape, center,
and spread of the distribution.

Remember: So far we’ve seen three other continuous distributions:

- normal distribution: unimodal and symmetric with two parameters: mean and
standard deviation
- T distribution: unimodal and symmetric with one parameter: degrees of freedom
- F distribution: unimodal and right skewed with two parameters: degrees of freedom
for the numerator (between group variance) and denominator (within group variance)
Which of the following is false?

[Figure: χ2 density curves for 2, 4, and 9 degrees of freedom; horizontal axis from 0 to 25]

As the df increases,

(a) the center of the χ2 distribution increases as well
(b) the variability of the χ2 distribution increases as well
(c) the shape of the χ2 distribution becomes more skewed (less
like a normal)
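The effect of df on the curve can also be checked numerically. The sketch below relies on standard results that are not stated on the slides: a chi-square distribution with df degrees of freedom has mean df, standard deviation sqrt(2·df), and skewness sqrt(8/df). It confirms the mean by numerical integration of the density; the variable names are illustrative.

```r
# How center, spread, and skew of the chi-square distribution change with df.
df <- c(2, 4, 9)

summary_table <- data.frame(
  df       = df,
  mean     = df,            # center grows with df
  sd       = sqrt(2 * df),  # spread grows with df
  skewness = sqrt(8 / df)   # skew shrinks as df grows
)

# Numerical check of the mean for each df: integral of x * density over (0, Inf).
numeric_mean <- sapply(df, function(k) {
  integrate(function(x) x * dchisq(x, df = k), lower = 0, upper = Inf)$value
})

summary_table
round(numeric_mean, 3)
```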
Finding areas under the chi-square curve

• p-value = tail area under the chi-square distribution (as usual)
• For this we can use technology, or a chi-square probability
table.
Finding areas under the chi-square curve (cont.)

Estimate the shaded area (above the cutoff value of 10) under the
χ2 curve with df = 6.

> pchisq(q = 10, df = 6, lower.tail = FALSE)

[1] 0.124652

Finding areas under the chi-square curve (cont.)

Estimate the shaded area (above the cutoff value of 17) under the
χ2 curve with df = 9.

[Figure: χ2 curve with df = 9, area above 17 shaded]

(a) 0.05
(b) 0.02
(c) between 0.02 and 0.05
(d) between 0.05 and 0.1
(e) between 0.01 and 0.02

> pchisq(q = 17, df = 9, lower.tail = FALSE)

[1] 0.04871598
Finding areas under the chi-square curve (one more)

Estimate the shaded area (above 30) under the χ2 curve with df = 10.

[Figure: χ2 curve with df = 10, area above 30 shaded]

(a) greater than 0.3
(b) between 0.005 and 0.001
(c) less than 0.001
(d) greater than 0.001
(e) cannot tell using this table

> pchisq(q = 30, df = 10, lower.tail = FALSE)

[1] 0.0008566412
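The table lookup behind this kind of question can be sketched with `qchisq`, which returns the cutoff for a given tail area. The tail areas below are an assumption about the table's column headings, which are not shown on the slide; they are the values that appear in the answer choices.

```r
# Upper-tail cutoffs for df = 10, the kind of row a chi-square table lists.
tail_areas <- c(0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.001)
cutoffs    <- qchisq(tail_areas, df = 10, lower.tail = FALSE)

round(setNames(cutoffs, tail_areas), 2)

# 30 lies beyond the last cutoff, so its tail area is below the smallest listed value.
30 > max(cutoffs)
```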
Back to Labby’s dice

• The research question was: Do these data provide convincing
evidence of an inconsistency between the observed and
expected counts?
• The hypotheses were:
H0 : There is no inconsistency between the observed and the
expected counts. The observed counts follow the same
distribution as the expected counts.
HA : There is an inconsistency between the observed and the
expected counts. The observed counts do not follow the same
distribution as the expected counts. There is a bias in which
side comes up on the roll of a die.
• We had calculated a test statistic of χ2 = 24.73.
• All we need is the df and we can calculate the tail area (the
p-value) and make a decision on the hypotheses.
Degrees of freedom for a goodness of fit test

• When conducting a goodness of fit test to evaluate how well
the observed data follow an expected distribution, the degrees
of freedom are calculated as the number of cells (k) minus 1.

df = k − 1

• For dice outcomes, k = 6, therefore

df = 6 − 1 = 5
Finding a p-value for a chi-square test

The p-value for a chi-square test is defined as the tail area above
the calculated test statistic.

p-value = $P(\chi^2_{df=5} > 24.73)$ is less than 0.001

[Figure: χ2 curve with df = 5, area above 24.73 shaded]
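This tail area can be computed with `pchisq`, as in the earlier examples. A minimal sketch, using the total from the calculation table (24.73) and df = k − 1 for the six dice outcomes:

```r
# Tail area above the test statistic for Labby's dice (k = 6 outcomes).
chi_sq <- 24.73          # sum of the (O - E)^2 / E column
df     <- 6 - 1          # df = k - 1

p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
p_value < 0.001
```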
Conclusion of the hypothesis test

We calculated a p-value less than 0.001. At the 5% significance level,
what is the conclusion of the hypothesis test?

(a) Reject H0 , the data provide convincing evidence that the dice
are fair.
(b) Reject H0 , the data provide convincing evidence that the dice
are biased.
(c) Fail to reject H0 , the data provide convincing evidence that the
dice are fair.
(d) Fail to reject H0 , the data provide convincing evidence that the
dice are biased.
Turns out...

• The 1-6 axis is consistently shorter than the other two (2-5
and 3-4), thereby supporting the hypothesis that the faces
with one and six pips are larger than the other faces.
• Pearson’s claim that 5s and 6s appear more often due to the
carved-out pips is not supported by these data.
• Dice used in casinos have flush faces, where the pips are
filled in with a plastic of the same density as the surrounding
material and are precisely balanced.

Recap: p-value for a chi-square test

• The p-value for a chi-square test is defined as the tail area
above the calculated test statistic.
• This is because the test statistic is never negative, and a
higher test statistic means a stronger deviation from the null
hypothesis.

[Figure: χ2 curve with the upper tail area labeled as the p-value]
Conditions for the chi-square test

1. Independence: Each case that contributes a count to the table
must be independent of all the other cases in the table.
2. Sample size: Each particular scenario (i.e. cell) must have at
least 5 expected cases.
3. df > 1: Degrees of freedom must be greater than 1.

Failing to check conditions may unintentionally affect the test’s
error rates.
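The second and third conditions can be checked directly from the expected counts. The sketch below uses a hypothetical helper, `check_chisq_conditions()`, written for illustration (it is not a base R function), and assumes a one-way table so that df = k − 1. Independence cannot be verified from the counts; it depends on how the data were collected.

```r
# Check the sample-size and df conditions from a vector of expected counts.
# check_chisq_conditions() is a hypothetical helper, not a base R function.
check_chisq_conditions <- function(expected) {
  list(
    min_expected   = min(expected),
    sample_size_ok = all(expected >= 5),        # every cell has >= 5 expected cases
    df             = length(expected) - 1,      # goodness of fit: df = k - 1
    df_ok          = (length(expected) - 1) > 1
  )
}

# Labby's dice: six equally likely outcomes.
dice <- check_chisq_conditions(rep(52612, 6))

# A small made-up example that fails the sample-size condition.
small <- check_chisq_conditions(c(12, 9, 3))

c(dice_sample_size = dice$sample_size_ok, dice_df = dice$df_ok)
c(small_sample_size = small$sample_size_ok, small_df = small$df_ok)
```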
2009 Iran Election

There was lots of talk of election fraud in the 2009 Iran election.
We’ll compare the data from a poll conducted before the election
(observed data) to the reported votes in the election to see if the
two follow the same distribution.

Candidate               Observed # of      Reported % of
                        voters in poll     votes in election
(1) Ahmadinejad         338                63.29%
(2) Mousavi             136                34.10%
(3) Minor candidates    30                 2.61%
Total                   504                100%
                        ↓                  ↓
                        observed           expected
                                           distribution
Hypotheses

What are the hypotheses for testing if the distributions of reported
and polled votes are different?

H0 : The observed counts from the poll follow the same distribution
as the reported votes.
HA : The observed counts from the poll do not follow the same
distribution as the reported votes.
Calculation of the test statistic

Candidate               Observed # of     Reported % of        Expected # of
                        voters in poll    votes in election    votes in poll
(1) Ahmadinejad         338               63.29%               504 × 0.6329 = 319
(2) Mousavi             136               34.10%               504 × 0.3410 = 172
(3) Minor candidates    30                2.61%                504 × 0.0261 = 13
Total                   504               100%                 504

$$\frac{(O_1 - E_1)^2}{E_1} = \frac{(338 - 319)^2}{319} = 1.13$$

$$\frac{(O_2 - E_2)^2}{E_2} = \frac{(136 - 172)^2}{172} = 7.53$$

$$\frac{(O_3 - E_3)^2}{E_3} = \frac{(30 - 13)^2}{13} = 22.23$$

$$\chi^2_{df = 3 - 1 = 2} = 30.89$$
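The same steps can be sketched in R. This is a minimal sketch that rounds the expected counts to whole voters, as the table does; the variable names are illustrative. The total prints as about 30.90 rather than 30.89 because the slide adds the already-rounded terms. Passing the unrounded shares to base R's `chisq.test(observed, p = reported)` would use unrounded expected counts and give a somewhat smaller statistic (about 30.2).

```r
# Goodness of fit: poll counts vs. reported election percentages.
observed <- c(Ahmadinejad = 338, Mousavi = 136, Minor = 30)
reported <- c(0.6329, 0.3410, 0.0261)         # reported vote shares (null distribution)

expected <- round(sum(observed) * reported)   # rounded to whole voters, as in the table
contrib  <- (observed - expected)^2 / expected
chi_sq   <- sum(contrib)
df       <- length(observed) - 1

p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

expected
round(contrib, 2)
round(chi_sq, 2)
```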
Conclusion

Based on these calculations what is the conclusion of the hypothesis
test?

(a) p-value is low, H0 is rejected. The observed counts from the
poll do not follow the same distribution as the reported votes.
(b) p-value is high, H0 is not rejected. The observed counts from
the poll follow the same distribution as the reported votes.
(c) p-value is low, H0 is rejected. The observed counts from the
poll follow the same distribution as the reported votes.
(d) p-value is low, H0 is not rejected. The observed counts from
the poll do not follow the same distribution as the reported
votes.
Wednesday, 29 Nov

Chi-square test of independence


Popular kids

In the dataset popular, students in grades 4-6 were asked whether
good grades, athletic ability, or popularity was most important to
them. A two-way table separating the students by grade and by
choice of most important factor is shown below. Do these data
provide evidence to suggest that goals vary by grade?

        Grades   Popular   Sports
4th     63       31        25
5th     88       55        33
6th     96       55        32

[Figure: plot of goals (Grades, Popular, Sports) by grade (4th, 5th, 6th)]
Chi-square test of independence
*normally, on the test the table has 2 rows, 2 columns

• The hypotheses are:
H0 : Grade and goals are independent. Goals do not vary by grade.
HA : Grade and goals are dependent. Goals vary by grade.
• The test statistic is calculated as

$$\chi^2_{df} = \sum_{i=1}^{k} \frac{(O - E)^2}{E} \quad \text{where } df = (R - 1) \times (C - 1),$$

where k is the number of cells, R is the number of rows, and C
is the number of columns.

Note: We calculate df differently for one-way and two-way tables.

• The p-value is the area under the χ2df curve, above the
calculated test statistic.
Expected counts in two-way tables

$$\text{Expected Count} = \frac{(\text{row total}) \times (\text{column total})}{\text{table total}}$$

        Grades   Popular   Sports   Total
4th     63       31        25       119
5th     88       55        33       176
6th     96       55        32       183
Total   247      141       90       478

$$E_{row\ 1,\ col\ 1} = \frac{119 \times 247}{478} = 61 \qquad E_{row\ 1,\ col\ 2} = \frac{119 \times 141}{478} = 35$$
Expected counts in two-way tables

What is the expected count for the highlighted cell (5th grade, Popular)?

        Grades   Popular   Sports   Total
4th     63       31        25       119
5th     88       55        33       176
6th     96       55        32       183
Total   247      141       90       478

(a) $\frac{176 \times 141}{478}$ → 52 (observed 55: more than the expected # of 5th graders
have a goal of being popular)
(b) $\frac{119 \times 141}{478}$
(c) $\frac{176 \times 247}{478}$
(d) $\frac{176 \times 478}{478}$
Calculating the test statistic in two-way tables

Expected counts are shown in parentheses next to the observed counts.

        Grades     Popular    Sports     Total
4th     63 (61)    31 (35)    25 (23)    119
5th     88 (91)    55 (52)    33 (33)    176
6th     96 (95)    55 (54)    32 (34)    183
Total   247        141        90         478

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(63 - 61)^2}{61} + \frac{(31 - 35)^2}{35} + \cdots + \frac{(32 - 34)^2}{34} = 1.3121$$

df = (R − 1) × (C − 1) = (3 − 1) × (3 − 1) = 2 × 2 = 4
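The expected counts and the test statistic can be reproduced from the observed table. This is a minimal sketch with illustrative names. It keeps the expected counts unrounded, which is what yields 1.3121; plugging the whole-number expected counts shown in the table into the formula gives a smaller total of roughly 1.1.

```r
# Expected counts and chi-square statistic for the grade-by-goal table.
popular <- matrix(
  c(63, 31, 25,
    88, 55, 33,
    96, 55, 32),
  nrow = 3, byrow = TRUE,
  dimnames = list(grade = c("4th", "5th", "6th"),
                  goal  = c("Grades", "Popular", "Sports"))
)

# Expected count = (row total) x (column total) / (table total), for every cell.
expected <- outer(rowSums(popular), colSums(popular)) / sum(popular)

chi_sq <- sum((popular - expected)^2 / expected)
df     <- (nrow(popular) - 1) * (ncol(popular) - 1)

round(expected, 1)
round(chi_sq, 4)
df
```

Base R's `chisq.test(popular)` carries out the same computation and also stores the expected counts in its `$expected` component.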
Calculating the p-value

Which of the following is the correct p-value for this hypothesis test?

χ2 = 1.3121 df = 4

[Figure: χ2 curve with df = 4, area above 1.3121 shaded]

(a) more than 0.3
(b) between 0.3 and 0.2
(c) between 0.2 and 0.1
(d) between 0.1 and 0.05
(e) less than 0.001
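With technology instead of a table, the tail area follows from `pchisq`, as in the earlier examples. A minimal sketch using the statistic and df from the previous slide:

```r
# Tail area above the statistic for the grade-by-goal table.
chi_sq <- 1.3121
df     <- (3 - 1) * (3 - 1)

p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
round(p_value, 3)
```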
Conclusion

Do these data provide evidence to suggest that goals vary by
grade?

H0 : Grade and goals are independent. Goals do not vary by
grade.
HA : Grade and goals are dependent. Goals vary by grade.

Since p-value is high, we fail to reject H0 . The data do not provide
convincing evidence that grade and goals are dependent. It
doesn’t appear that goals vary by grade.
