Difference of Proportion
This test checks whether the difference between the proportions of two
populations is significant.
Formulae:
1. If population proportions 𝑃1 and 𝑃2 are known, then test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}} = \frac{p_1 - p_2}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}}$$
Where 𝑝1 = sample proportion of first sample
𝑝2 = sample proportion of second sample
𝑛1 = sample size of first sample
𝑛2 = sample size of second sample
𝑃1 = Proportion of first population
𝑃2 = Proportion of second population
𝑄1 = 1 − 𝑃1
𝑄2 = 1 − 𝑃2
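In this case standard error is
$$\mathrm{S.E.}_{p_1-p_2} = \sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$$
2. If population proportions are not known, then the pooled estimate $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ with $q = 1 - p$ is used and test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
In this case standard error is
$$\mathrm{S.E.}_{p_1-p_2} = \sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$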
3. 95% confidence interval for difference between population proportions is
given by
$$|p_1 - p_2| \pm 1.96\,\mathrm{S.E.}_{p_1-p_2} = |p_1 - p_2| \pm 1.96\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}$$
4. 99% confidence interval for difference between population proportions is
given by
$$|p_1 - p_2| \pm 2.58\,\mathrm{S.E.}_{p_1-p_2} = |p_1 - p_2| \pm 2.58\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}$$
5. 99.73% (approximately 100%) confidence interval for difference between
population proportions is given by
$$|p_1 - p_2| \pm 3\,\mathrm{S.E.}_{p_1-p_2} = |p_1 - p_2| \pm 3\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}$$
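The formulae above can be sketched in Python as follows. This is only a minimal illustration using the standard library; the function names (`se_known`, `se_pooled`, `confidence_interval`) and the `level` parameter are assumptions made for this sketch and do not come from the notes. The multipliers 1.96 and 2.58 are recovered from the standard normal distribution rather than hard-coded.

```python
from math import sqrt
from statistics import NormalDist


def se_known(P1, P2, n1, n2):
    """Standard error of p1 - p2 when population proportions P1, P2 are known."""
    return sqrt(P1 * (1 - P1) / n1 + P2 * (1 - P2) / n2)


def se_pooled(p1, p2, n1, n2):
    """Standard error of p1 - p2 using the pooled estimate p (P1, P2 unknown)."""
    p = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled estimate
    q = 1 - p
    return sqrt(p * q * (1 / n1 + 1 / n2))


def confidence_interval(p1, p2, se, level=0.95):
    """Interval |p1 - p2| +/- z * S.E., following the convention of the notes."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%, 2.58 for 99%
    d = abs(p1 - p2)
    return d - z * se, d + z * se


if __name__ == "__main__":
    se = se_known(0.30, 0.25, 1200, 900)
    print(confidence_interval(0.30, 0.25, se, level=0.95))
```

The test statistic in either case is then `(p1 - p2) / se`, with `se` taken from whichever function matches the situation.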
Before an increase in excise duty on tea, 400 people out of a sample of 500
were found to be tea drinkers. After the increase in duty, 400 people were
tea drinkers in a sample of 600 people. Using the standard error of
proportion, state whether there is a significant decrease in the consumption
of tea at the 5% level of significance. Take the value of Z at the 5% level of
significance as 1.645.
Sol. Here
Sample size of first sample = 𝑛1 = 500
Sample size of second sample = 𝑛2 = 600
No. of tea drinkers in first sample = 𝑋1 = 400
No. of tea drinkers in second sample = 𝑋2 = 400
Sample proportion of first sample = $p_1 = \frac{X_1}{n_1} = \frac{400}{500} = 0.8$
Sample proportion of second sample = $p_2 = \frac{X_2}{n_2} = \frac{400}{600} = 0.67$
Let 𝑃1 and 𝑃2 be population proportions of persons who are regular tea drinkers
before and after increase in duty respectively.
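𝐻0 : 𝑃1 = 𝑃2 i.e. the difference between proportions is not significant and there
is no significant decrease in consumption of tea after increase in duty.
𝐻1 : 𝑃1 > 𝑃2 (One-tailed test)
Pooled estimate $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{400 + 400}{500 + 600} = \frac{800}{1100} = 0.7272$
$q = 1 - p = 1 - 0.7272 = 0.2728$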
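$$z = \frac{p_1 - p_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.8 - 0.67}{\sqrt{0.7272(0.2728)\left(\frac{1}{500} + \frac{1}{600}\right)}} = \frac{0.13}{\sqrt{(0.1984)(0.002 + 0.0016)}}$$
$$= \frac{0.13}{\sqrt{(0.1984)(0.0036)}} = \frac{0.13}{\sqrt{0.00071}} = \frac{0.13}{0.026646} = 4.8788$$
$\Rightarrow |z| = 4.8788$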
Critical value of z at 5% level of significance for one tailed test is 1.645
Since |𝑧| = 4.8788 > 1.645, so we reject 𝐻0 & accept 𝐻1 and therefore there is
significant decrease in consumption of tea after increase in duty.
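As a cross-check, the tea example can be recomputed in a few lines of Python. This is a sketch with illustrative variable names, not part of the original solution. It carries full precision instead of the rounded intermediate values used above, so it gives a z of roughly 4.94 rather than 4.8788; the conclusion is the same.

```python
from math import sqrt

# Data from the problem statement
n1, n2 = 500, 600   # sample sizes before and after the duty increase
x1, x2 = 400, 400   # tea drinkers in each sample

p1, p2 = x1 / n1, x2 / n2
p = (x1 + x2) / (n1 + n2)            # pooled estimate (n1*p1 + n2*p2 = x1 + x2)
q = 1 - p
se = sqrt(p * q * (1 / n1 + 1 / n2))  # standard error of p1 - p2
z = (p1 - p2) / se

critical = 1.645                      # one-tailed critical value at 5%, as given
print(f"p = {p:.4f}, S.E. = {se:.5f}, z = {z:.4f}")
print("reject H0" if z > critical else "do not reject H0")
```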
(Two-tailed test)
Pooled estimate $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{80 + 65}{400 + 420} = \frac{145}{820} = 0.1768$
𝑞 = 1 − 𝑝 = 1 − 0.1768 = 0.8232
$$z = \frac{p_1 - p_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.2 - 0.1547}{\sqrt{0.1768(0.8232)\left(\frac{1}{400} + \frac{1}{420}\right)}} = \frac{0.0453}{\sqrt{(0.1455)(0.0025 + 0.0023)}}$$
$$= \frac{0.0453}{\sqrt{(0.1455)(0.0048)}} = \frac{0.0453}{\sqrt{0.00069}} = \frac{0.0453}{0.026267} = 1.7246$$
$\Rightarrow |z| = 1.7246$
Critical value of z at 5% level of significance for two tailed test is 1.96
Since |𝑧| = 1.7246 < 1.96, so we accept 𝐻0 i.e. there is no significant
difference in the defaulter rate for two classes of tax payers.
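The same two-tailed calculation can be sketched in Python. The counts (80 of 400 and 65 of 420) are read off the pooled-estimate line above, and the function name `pooled_z` is a hypothetical helper introduced only for this illustration. Without intermediate rounding, z comes out at roughly 1.70, which is still below 1.96.

```python
from math import sqrt


def pooled_z(x1, n1, x2, n2):
    """z statistic for the difference of two sample proportions (pooled S.E.)."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)   # pooled estimate
    q = 1 - p
    se = sqrt(p * q * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


z = pooled_z(80, 400, 65, 420)
critical = 1.96                 # two-tailed critical value at 5%
print(f"z = {z:.4f}")
print("reject H0" if abs(z) > critical else "do not reject H0")
```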
In two large populations there are 30% and 25% fair-haired people
respectively. Is this difference likely to be hidden in samples of 1200 and
900 respectively?
Sol. Here 𝑃1 = 0.30, 𝑃2 = 0.25, 𝑛1 = 1200 and 𝑛2 = 900.
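$$z = \frac{P_1 - P_2}{\sqrt{\frac{P_1(1-P_1)}{n_1} + \frac{P_2(1-P_2)}{n_2}}} = \frac{P_1 - P_2}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}}$$
$$= \frac{0.3 - 0.25}{\sqrt{\frac{0.3(0.7)}{1200} + \frac{0.25(0.75)}{900}}} = \frac{0.05}{\sqrt{\frac{0.21}{1200} + \frac{0.1875}{900}}} = \frac{0.05}{\sqrt{0.000175 + 0.000208}}$$
$$= \frac{0.05}{\sqrt{0.000383}} = \frac{0.05}{0.01957} = 2.5549$$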
Critical value of z at 5% level of significance for two tailed test is 1.96
Since |𝑧| = 2.5549 > 1.96, so we reject 𝐻0 & accept 𝐻1 and conclude that the
difference is not likely to be hidden in samples.
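A short Python check of this known-proportion case is given below. It is a sketch with illustrative variable names; at full precision z is roughly 2.55, in line with the value obtained above.

```python
from math import sqrt

# Population proportions and sample sizes from the problem statement
P1, P2 = 0.30, 0.25
n1, n2 = 1200, 900

# Standard error with known population proportions
se = sqrt(P1 * (1 - P1) / n1 + P2 * (1 - P2) / n2)
z = (P1 - P2) / se

critical = 1.96   # two-tailed critical value at 5%
print(f"S.E. = {se:.5f}, z = {z:.4f}")
print("difference unlikely to be hidden" if abs(z) > critical
      else "difference may be hidden")
```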