
Biostats Lecture 9 Difference of Two Proportions v2

The document covers statistical methods for comparing two proportions, including hypothesis testing and confidence intervals. It discusses the importance of pooled estimates and variance calculations, as well as the chi-square test for goodness of fit and testing independence in contingency tables. Practical examples are provided to illustrate the application of these statistical techniques in analyzing categorical data.

Uploaded by

Cesar Calderon

IPHS 405

Inference for Categorical Data


Comparing Difference of Two Proportions
Biostats Lecture 9

Hua Yun Chen


Lester Arguelles

Biostats Lecture 9 (Diez Chapter 6.2)
Difference of Two Proportions (6.2)
Sampling Distribution of the Difference of Two Proportions (6.2.1)
Confidence Intervals for p1-p2 (6.2.2)
Hypothesis Tests for the Difference of Two Proportions (6.2.3)
More on 2-Proportions Hypothesis Tests (6.2.4)
Examining the Standard Error Formula (6.2.5)

Hypothesis testing about two proportions
Baby Sex Proportion with Smoker Mother and Nonsmoker Mother

Consider variables: sex_baby and smoke.


Test of equality of two population proportions
1. $p_1$: the probability that the baby is a boy when the mother does not smoke.

2. $p_2$: the probability that the baby is a boy when the mother smokes.

3. The null hypothesis is that the two probabilities are equal, $H_0: p_1 = p_2$.
Estimate of the difference of two proportions from the sample

Mother smokes    Female    Male    Total
Nonsmoker        49        51      100
Smoker           19        31      50
Total            68        82      150
The Pooled Estimate Of A Proportion
Is Needed For HT
In the case of comparing two proportions where H0: p1 = p2, there
isn't a given null value we can use to calculate the expected
number of successes and failures in each sample.
Therefore, we first need to find a common (pooled) proportion
for the two groups, and use that in our analysis.
This simply means finding the proportion of total successes
among the total number of observations.
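Pooled estimate of a proportion:

$\hat{p}_{pool} = \dfrac{x_1 + x_2}{n_1 + n_2}$,

where $x_1$, $x_2$ are the numbers of successes and $n_1$, $n_2$ the sample sizes in the two groups.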
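As a minimal sketch, the pooled estimate can be computed from the baby-sex table. The variable names are illustrative, and treating a male baby as the "success" is an assumption that follows the slide's definition of the first probability.

```python
# Counts from the baby-sex table; a "success" is assumed to be a male baby.
x1, n1 = 51, 100   # nonsmoking mothers: male babies, total births
x2, n2 = 31, 50    # smoking mothers: male babies, total births

p1_hat = x1 / n1                  # sample proportion for nonsmokers
p2_hat = x2 / n2                  # sample proportion for smokers
p_pool = (x1 + x2) / (n1 + n2)    # total successes over total observations

print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}, pooled = {p_pool:.4f}")
```

The pooled value, 82/150, is about 0.5467, which matches the Male column proportion in the contingency table used later in the lecture.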

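Variance estimates under the null hypothesis
Let $\hat{p}_{pool}$ be the pooled estimate of the common proportion and $n_1$, $n_2$ the two sample sizes.
1. The variance of $\hat{p}_1$ under the null hypothesis: $\hat{p}_{pool}(1-\hat{p}_{pool})/n_1$

2. The variance of $\hat{p}_2$: $\hat{p}_{pool}(1-\hat{p}_{pool})/n_2$

3. The variance of $\hat{p}_1 - \hat{p}_2$: $\hat{p}_{pool}(1-\hat{p}_{pool})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)$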
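The three variance estimates above can be sketched in a few lines. The function name `pooled_se` is hypothetical, and the counts again assume a male baby is the "success".

```python
import math

def pooled_se(x1, n1, x2, n2):
    """Standard error of p1_hat - p2_hat under H0: p1 = p2 (pooled estimate)."""
    p_pool = (x1 + x2) / (n1 + n2)
    var_p1 = p_pool * (1 - p_pool) / n1   # variance of p1_hat under H0
    var_p2 = p_pool * (1 - p_pool) / n2   # variance of p2_hat under H0
    var_diff = var_p1 + var_p2            # independent samples: variances add
    return math.sqrt(var_diff)

se_null = pooled_se(51, 100, 31, 50)
print(f"standard error under H0: {se_null:.4f}")
```

For the baby-sex data this gives a standard error of roughly 0.086.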
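Steps for testing the hypothesis
1. Find the point estimate for $p_1 - p_2$: $\hat{p}_1 - \hat{p}_2$.

2. Find the variance estimate and standard error for $\hat{p}_1 - \hat{p}_2$ under $H_0$, using the pooled estimate $\hat{p}_{pool}$:
$SE = \sqrt{\hat{p}_{pool}(1-\hat{p}_{pool})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

3. Find the test statistic, $z = (\hat{p}_1 - \hat{p}_2)/SE$.

Steps for testing the hypothesis (continuing)
4. Find the p-value (2-sided), $2\,P(Z \ge |z|)$ for a standard normal $Z$.

5. Compare the p-value with the type I error rate $\alpha$.

6. Make a decision.
In this example we fail to reject the null hypothesis (which is not the same as accepting it).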
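The six steps can be sketched end to end as follows. This is an illustration under stated assumptions: the function name is hypothetical, a male baby is taken as the "success", and $\alpha = 0.05$ is assumed since the slides do not fix a value.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided tail area
    return z, p_value

z, p_value = two_proportion_z_test(51, 100, 31, 50)
alpha = 0.05   # assumed type I error rate
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.3f}, p-value = {p_value:.3f}, decision: {decision}")
```

With these counts the statistic is about -1.28 and the two-sided p-value about 0.20, so the sketch reaches the same decision as the slide.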
Confidence interval comparing
two proportions
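Comparing two population proportions
1. When two proportions may not be equal, we are interested in finding the difference: $p_1 - p_2$.

2. The first step is to obtain the corresponding quantity from the sample (a point estimator), $\hat{p}_1 - \hat{p}_2$.

3. The next step is to find the variance of $\hat{p}_1 - \hat{p}_2$.

Variance Estimates
a). The variance of $\hat{p}_1$: $\hat{p}_1(1-\hat{p}_1)/n_1$

b). The variance of $\hat{p}_2$: $\hat{p}_2(1-\hat{p}_2)/n_2$

c). The variance of $\hat{p}_1 - \hat{p}_2$: $\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}$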

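Comparing two population proportions (continuing)
4. Find the standard error for $\hat{p}_1 - \hat{p}_2$, $SE = \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$.

5. Determine the margin of error. For a 95% confidence interval, $ME = 1.96 \times SE$.

6. Find the confidence interval, $(\hat{p}_1 - \hat{p}_2) \pm ME$.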


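Comparing two population proportions (continuing)
6. Find the confidence interval, $(\hat{p}_1 - \hat{p}_2) \pm 1.96 \times SE$, where $SE$ is the standard error of $\hat{p}_1 - \hat{p}_2$.

7. Interpretation of the confidence interval: we are 95% confident that $p_1 - p_2$ lies between the lower and upper limits of the interval.

Practice: What is the 99% confidence interval for $p_1 - p_2$?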
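The confidence-interval steps can be sketched as a small function. The name `diff_proportion_ci` and its `level` parameter are hypothetical, and a male baby is again assumed to be the "success". Unlike the hypothesis test, the standard error here uses each group's own sample proportion rather than the pooled estimate.

```python
import math
from statistics import NormalDist

def diff_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Normal-approximation confidence interval for p1 - p2 (unpooled SE)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for 95%
    margin = z_star * se
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin

lower, upper = diff_proportion_ci(51, 100, 31, 50)
print(f"95% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

For the baby-sex data the 95% interval is roughly (-0.276, 0.056). It contains 0, which is consistent with failing to reject $H_0: p_1 = p_2$. Passing a different `level` reuses the same steps for other confidence levels.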
Conditions Necessary for Normal Approximation
Independence between groups: The subjects in the birth weight study are sampled independently.

Success-failure: At least 10 observed successes and 10 observed failures in each group.
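A quick check of the success-failure condition might look like the sketch below. The helper name and the choice of a male baby as the "success" are assumptions made for illustration. Independence has to be judged from the study design and cannot be checked from the counts.

```python
def success_failure_ok(successes, n, minimum=10):
    """True when a group has at least `minimum` observed successes and failures."""
    failures = n - successes
    return successes >= minimum and failures >= minimum

# (male babies, total births) per group, from the baby-sex table
groups = {"Nonsmoker": (51, 100), "Smoker": (31, 50)}
checks = {name: success_failure_ok(x, n) for name, (x, n) in groups.items()}
print(checks)
```

Both groups in the baby-sex table pass the check, since the smallest cell count is 19.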

Chi-square test of goodness of fit.
Goodness of fit test for one-way table.
1. The test for a proportion applies to a binary (two-category) variable.
2. For a categorical variable with more than two categories, a test of goodness of fit can be done.
3. The test of goodness of fit examines whether the sample data follow a given distribution.
4. The test statistic approximately follows a chi-square distribution.
Example on the distribution of mother’s age at birth

1. Mother’s age at birth is categorized into three intervals: less than 25, 25 to under 35, and 35 or older.
2. The frequency table

Age range                        <25 years     25 to <35 years    >=35 years    Total
Counts                           64            65                 21            150
Frequency                        0.427         0.433              0.140         1.0
Hypothetical population freq.    0.4           0.4                0.2           1.0
Expected frequency               150*0.4=60    150*0.4=60         150*0.2=30    150
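Chi-square test of goodness of fit

1. Test statistic: $\chi^2 = \sum_i \dfrac{(O_i - E_i)^2}{E_i}$, where $O_i$ and $E_i$ are the observed and expected counts in category $i$.

2. For the example: $\chi^2 = \dfrac{(64-60)^2}{60} + \dfrac{(65-60)^2}{60} + \dfrac{(21-30)^2}{30} = 3.383$, with $3 - 1 = 2$ degrees of freedom.

3. P-value = 1 - CHISQ.DIST(3.383, 2, TRUE) = 0.184.
We fail to reject the null hypothesis.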
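The goodness-of-fit calculation for the mother's age example can be reproduced with the standard library alone. This sketch relies on the fact that for 2 degrees of freedom the chi-square upper-tail probability has the closed form $e^{-x/2}$, so it does not generalize to other degrees of freedom without a statistics library.

```python
import math

observed = [64, 65, 21]        # counts for <25, 25 to <35, >=35 years
null_probs = [0.4, 0.4, 0.2]   # hypothetical population frequencies

n = sum(observed)
expected = [n * p for p in null_probs]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Upper-tail probability of a chi-square with df = 2 is exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The statistic is about 3.383 and the p-value about 0.184, in agreement with the slide.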
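Testing independence (No association) in contingency table
Test of independence (no association)
Question: Is baby sex (BS) associated with whether the mother smokes (MS)? No association implies $P(BS = b, MS = m) = P(BS = b) \times P(MS = m)$ for every combination of $b$ and $m$.

Observed counts, with expected counts under no association (cells) and marginal proportions (totals) in parentheses:

Mother smoke     BS=Female     BS=Male       Total
MS=Nonsmoker     49(45.33)     51(54.67)     100(0.6667)
MS=Smoker        19(22.67)     31(27.33)     50(0.3333)
Total            68(0.4533)    82(0.5467)    150(1)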
Steps for Testing of independence

Step 1. Find the marginal distribution estimates

Mother smoke     BS=Female     BS=Male       Total
MS=Nonsmoker                                 100(0.6667)
MS=Smoker                                    50(0.3333)
Total            68(0.4533)    82(0.5467)    150(1)
Steps for Testing of independence

Step 2. Find the joint cell probability under the null hypothesis of no association. For example, P(MS=Nonsmoker, BS=Female) = 0.6667 × 0.4533 = 0.3022.

Mother smoke     BS=Female    BS=Male    Total
MS=Nonsmoker     0.3022       0.3645     0.6667
MS=Smoker        0.1511       0.1822     0.3333
Total            0.4533       0.5467     1

Do the same for the rest of the cells.


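Steps for Testing of independence

Step 3. Find the expected cell counts under the null hypothesis of no association. For example, the expected count for MS=Nonsmoker, BS=Female is 150 × 0.3022 = 45.33.

Mother smoke     BS=Female    BS=Male    Total
MS=Nonsmoker     45.33        54.67      100
MS=Smoker        22.67        27.33      50
Total            68           82         150

Do the same for the rest of the cells.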
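Steps 1 to 3 can be sketched for the whole table at once. The variable names are illustrative; the counts are the observed cells from the baby-sex table.

```python
observed = [[49, 51],   # nonsmoking mothers: female, male babies
            [19, 31]]   # smoking mothers: female, male babies

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Step 1: marginal distribution estimates
row_probs = [r / n for r in row_totals]
col_probs = [c / n for c in col_totals]

# Step 2: joint cell probabilities under no association
joint_probs = [[rp * cp for cp in col_probs] for rp in row_probs]

# Step 3: expected cell counts under no association
expected = [[n * jp for jp in row] for row in joint_probs]

for row in expected:
    print([round(e, 2) for e in row])
```

Rounded to two decimals, the expected counts are 45.33 and 54.67 for nonsmokers and 22.67 and 27.33 for smokers.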


Steps for Testing of independence

Step 4. Compare the expected cell counts under the null hypothesis with the observed cell counts

Mother smoke     BS=Female            BS=Male              Total
MS=Nonsmoker     49(45.33)[0.2966]    51(54.67)[0.2459]    100
MS=Smoker        19(22.67)[0.5931]    31(27.33)[0.4919]    50
Total            68                   82                   150

Calculate $(O - E)^2 / E$ for each cell, where $O$ is the observed count and $E$ the unrounded expected count. The result is shown in square brackets.

Steps for Testing of independence

Step 5. Compute the chi-square statistic.

Mother smoke     BS=Female    BS=Male    Total
MS=Nonsmoker     0.2966       0.2459     0.5425
MS=Smoker        0.5931       0.4919     1.0850
Total            0.8897       0.7378     1.6275

Calculate $\chi^2 = \sum (O - E)^2 / E$ over all cells; here $\chi^2 = 1.6275$.
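Steps 4 and 5 can be sketched as below. The expected counts are computed as (row total × column total) / n, which is algebraically the same as multiplying the marginal proportions by n. Because the code keeps the expected counts unrounded, individual cell values can differ in the third decimal place from a hand calculation that rounds the expected counts first.

```python
observed = [[49, 51],   # nonsmoking mothers: female, male babies
            [19, 31]]   # smoking mothers: female, male babies

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Step 4: per-cell contribution (O - E)^2 / E
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Step 5: chi-square statistic is the sum over all cells
chi_sq = sum(sum(row) for row in contributions)
print(f"chi-square = {chi_sq:.4f}")
```

The statistic comes out at about 1.63.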


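Steps for Testing of independence

Step 6. Determine the p-value.

This can be done in Excel as follows. CHISQ.DIST(statistic, degrees of freedom, cumulative) with cumulative = TRUE returns the lower-tail probability, so the p-value is

1 - CHISQ.DIST(statistic, degrees of freedom, TRUE).

The degrees of freedom are determined by (number of rows - 1) × (number of columns - 1); for a 2x2 table this is 1.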
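Outside Excel, the same upper-tail probability can be sketched with the standard library for small tables. The function below is a hypothetical helper that only covers 1 and 2 degrees of freedom, where closed forms exist; other cases would need a statistics library. The statistic is taken as approximately 1.63 for the baby-sex table.

```python
import math

def chi_square_p_value(statistic, df):
    """Upper-tail chi-square probability; this sketch handles df = 1 and 2 only."""
    if df == 1:
        # A chi-square with 1 df is a squared standard normal.
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise NotImplementedError("this sketch covers df = 1 and df = 2 only")

rows, cols = 2, 2
df = (rows - 1) * (cols - 1)              # degrees of freedom for a 2x2 table
p_value = chi_square_p_value(1.63, df)    # statistic of about 1.63
print(f"df = {df}, p-value = {p_value:.3f}")
```

The p-value is about 0.20, so the hypothesis of no association between baby sex and mother's smoking is not rejected at the usual 0.05 level.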
Compare with test of two proportions
1. The result of the test of independence (no association) is usually similar to that of the test of two proportions in a 2x2 table.

2. The advantage of the test of independence is that it can be directly applied to a 2x3 table, 3x2 table, 3x3 table, and any JxK table.

3. For a JxK table, the degrees of freedom for the chi-square distribution is $(J-1)(K-1)$.
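For a 2x2 table the two tests are closely linked: the square of the pooled two-proportion z statistic equals the chi-square statistic for independence. The sketch below checks this numerically on the baby-sex table, taking a male baby as the "success" (an assumption made for illustration).

```python
import math

observed = [[49, 51],   # nonsmoking mothers: female, male babies
            [19, 31]]   # smoking mothers: female, male babies

n = sum(map(sum, observed))
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Chi-square statistic for independence
chi_sq = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2) for j in range(2)
)

# Pooled two-proportion z statistic for the proportion of male babies
p1_hat = observed[0][1] / row_totals[0]
p2_hat = observed[1][1] / row_totals[1]
p_pool = col_totals[1] / n
se = math.sqrt(p_pool * (1 - p_pool) * (1 / row_totals[0] + 1 / row_totals[1]))
z = (p1_hat - p2_hat) / se

print(f"z^2 = {z ** 2:.4f}, chi-square = {chi_sq:.4f}")
```

Both quantities agree (about 1.63), so the two tests give the same two-sided p-value when the z-test uses the pooled standard error.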
Practice
The following table gives the results of a study investigating low birth weight and the race of the mother.

Race     Low birth weight    Normal birth weight
Black    13                  18
White    25                  80
Other    27                  43

Test the hypothesis that there is no association between race and birth weight.
