
Unit-3 PROPORTIONS

The document discusses estimating proportions from sample data and constructing confidence intervals for population proportions. It provides formulas for calculating large sample confidence intervals for a proportion p based on the sample proportion x/n. Several examples are worked through, including constructing 95% and 99% confidence intervals for proportions given values of x and n. The document also introduces hypothesis testing for a single proportion p, discussing the test statistic and critical regions for tests comparing p to a hypothesized value p0.

Uploaded by

sujeen killa

Inferences concerning Proportions

Estimation of Proportions:

A proportion is estimated from the number of times, X, that an appropriate event
occurs in n trials, occasions, or observations. The point estimator of the population
proportion is usually the sample proportion \( \hat{p} = X/n \), namely, the proportion of
the time that the event actually occurs.

The large sample confidence interval for the proportion p is

\[
\frac{x}{n} - z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} < p < \frac{x}{n} + z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}}
\]

where the degree of confidence is (1-α)100% and \( \hat{p} = x/n \) is the sample
proportion of successes.

Problem 1: If x = 36 of n = 100 persons interviewed are familiar with the tax
incentives for installing certain energy-saving devices, construct a 95% confidence
interval for the corresponding true proportion.

Solution: Given that x = 36, n = 100, and the confidence is 95%,

α = 0.05 and \( z_{\alpha/2} = z_{0.025} = 1.96 \).

Using these values in the confidence interval for p,

\[
\frac{x}{n} - z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} < p < \frac{x}{n} + z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}}
\]

\[
\frac{36}{100} - 1.96\sqrt{\frac{\frac{36}{100}\left(1-\frac{36}{100}\right)}{100}} < p < \frac{36}{100} + 1.96\sqrt{\frac{\frac{36}{100}\left(1-\frac{36}{100}\right)}{100}}
\]

\[
0.266 < p < 0.454
\]

Note that the maximum error of estimate is

\[
E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}
\]

where p is estimated by the sample proportion x/n.
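
The interval above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name proportion_ci is an assumption made for illustration, and z is taken from Python's standard library rather than from a printed table, so the last digits may differ slightly from hand computation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, confidence=0.95):
    """Large-sample confidence interval for a proportion p (illustrative helper)."""
    p_hat = x / n                                # sample proportion x/n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)   # maximum error of estimate E
    return p_hat - margin, p_hat + margin


# Problem 1: x = 36, n = 100, 95% confidence
low, high = proportion_ci(36, 100, 0.95)
print(f"{low:.3f} < p < {high:.3f}")
```

With the data of Problem 1 this reproduces the interval 0.266 < p < 0.454.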

Problem 2: In a sample survey conducted in a large city, 136 of 400 persons answered
yes to the question of whether their city’s public transportation is adequate. With 99%
confidence, what can we say about the maximum error, if \( \frac{x}{n} = \frac{136}{400} = 0.34 \) is used as
an estimate of the corresponding true proportion?

Solution: Since \( \frac{x}{n} = \frac{136}{400} = 0.34 \) and the confidence is 99%, α = 0.01 and
\( z_{\alpha/2} = z_{0.005} = 2.575 \).

Then the maximum error of estimate is

\[
E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} = 2.575\sqrt{\frac{(0.34)(0.66)}{400}} = 0.061
\]
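
The maximum error formula is a one-line computation. The sketch below is illustrative only: the name max_error is an assumption, and the tabulated value 2.575 is passed in directly so that the result matches the hand calculation in the text.

```python
from math import sqrt


def max_error(p_hat, n, z):
    """Maximum error of estimate E = z * sqrt(p(1 - p) / n) (illustrative helper)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)


Z_005 = 2.575  # tabulated z_{0.005}, as used in the text

# Problem 2: 136 of 400 answered yes, 99% confidence
e = max_error(136 / 400, 400, Z_005)
print(round(e, 3))
```

For Problem 2 this gives E of about 0.061, in agreement with the solution.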

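Note: The sample size needed to keep the maximum error at E, when an estimate p of
the proportion is available, is

\[
n = p(1-p)\left(\frac{z_{\alpha/2}}{E}\right)^2
\]

When the proportion p is unknown, take p = 1/2 (the value that makes p(1 - p)
largest), which gives

\[
n = \frac{1}{4}\left(\frac{z_{\alpha/2}}{E}\right)^2
\]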

Problem 3: What is the size of the smallest sample required to estimate an unknown
proportion of customers who would pay for an additional service, to within a
maximum error of 0.06 with at least 95% confidence?

Solution: From the given data, the maximum error of estimate is E = 0.06.

The confidence is 95%, so α = 0.05 and hence \( z_{\alpha/2} = z_{0.025} = 1.96 \).

Then the sample size is

\[
n = \frac{1}{4}\left(\frac{z_{\alpha/2}}{E}\right)^2 = \frac{1}{4}\left(\frac{1.96}{0.06}\right)^2 = 266.78
\]

which is rounded up to n = 267.
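
The sample-size rule, including the rounding up to the next whole number, can be sketched as follows. The function name sample_size and its optional p argument are assumptions made for illustration; the tabulated z value is supplied by the caller.

```python
from math import ceil


def sample_size(E, z, p=None):
    """Smallest n giving maximum error E; uses p = 1/2 when p is unknown."""
    if p is None:
        p = 0.5  # worst case: p(1 - p) is largest at p = 1/2
    return ceil(p * (1 - p) * (z / E) ** 2)


# Problem 3: E = 0.06, 95% confidence (z_{0.025} = 1.96), p unknown
print(sample_size(0.06, 1.96))
```

Rounding up rather than to the nearest integer guarantees that the maximum error does not exceed E.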

Problem 4: In a random sample of 200 claims filed against an insurance company
writing collision insurance on cars, 84 exceeded $3,500. Construct a 95% confidence
interval for the true proportion of claims filed against this insurance company that
exceed $3,500, using the large sample confidence interval formula.

Solution: Given that x = 84, n = 200, and the confidence is 95%,

α = 0.05 and \( z_{\alpha/2} = z_{0.025} = 1.96 \).

Using these values in the confidence interval for p,

\[
\frac{x}{n} - z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} < p < \frac{x}{n} + z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}}
\]

\[
\frac{84}{200} - 1.96\sqrt{\frac{\frac{84}{200}\left(1-\frac{84}{200}\right)}{200}} < p < \frac{84}{200} + 1.96\sqrt{\frac{\frac{84}{200}\left(1-\frac{84}{200}\right)}{200}}
\]

\[
0.42 - 0.0684 < p < 0.42 + 0.0684
\]

\[
0.3516 < p < 0.4884
\]

Problem 5: In a random sample of 200 claims filed against an insurance company
writing collision insurance on cars, 84 exceeded $3,500. What can we say with 99%
confidence about the maximum error if we use the sample proportion as an estimate
of the true proportion of claims filed against this insurance company that exceed
$3,500?

Solution: We know that the maximum error of estimate is

\[
E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}
\]

Here the sample size is n = 200, x = 84, and the confidence is 99%.

Hence α = 1 - 0.99 = 0.01

Then \( z_{\alpha/2} = z_{0.005} = 2.575 \)

The maximum error of estimate is

\[
E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} = 2.575\sqrt{\frac{\frac{84}{200}\left(1-\frac{84}{200}\right)}{200}} = 0.08987
\]

Problem 6: In a random sample of 400 industrial accidents, it was found that 231
were due at least partially to unsafe working conditions. Construct a 99% confidence
interval for the corresponding true proportion using the large sample confidence
interval formula.

Solution: The large sample confidence interval for p is

\[
\frac{x}{n} - z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} < p < \frac{x}{n} + z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}}
\]

where the degree of confidence is (1-α)100%.

From the given data, n = 400, x = 231, α = 1 - 0.99 = 0.01, and \( z_{\alpha/2} = z_{0.005} = 2.575 \).

Then the 99% confidence interval is

\[
\frac{231}{400} - 2.575\sqrt{\frac{\frac{231}{400}\left(1-\frac{231}{400}\right)}{400}} < p < \frac{231}{400} + 2.575\sqrt{\frac{\frac{231}{400}\left(1-\frac{231}{400}\right)}{400}}
\]

\[
0.5775 - 0.0636 < p < 0.5775 + 0.0636
\]

\[
0.5139 < p < 0.6411
\]

Problem 7: In a random sample of 90 sections of pipe in a chemical plant, 15 showed
signs of serious corrosion. Construct a 95% confidence interval for the true proportion
of pipe sections showing signs of serious corrosion, using the large sample confidence
interval formula.

Solution: The large sample confidence interval for p is

\[
\frac{x}{n} - z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} < p < \frac{x}{n} + z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}}
\]

where the degree of confidence is (1-α)100%.

From the given data, n = 90, x = 15, α = 1 - 0.95 = 0.05, and \( z_{\alpha/2} = z_{0.025} = 1.96 \).

Then the 95% confidence interval is

\[
\frac{15}{90} - 1.96\sqrt{\frac{\frac{15}{90}\left(1-\frac{15}{90}\right)}{90}} < p < \frac{15}{90} + 1.96\sqrt{\frac{\frac{15}{90}\left(1-\frac{15}{90}\right)}{90}}
\]

\[
0.1667 - 0.0770 < p < 0.1667 + 0.0770
\]

\[
0.0897 < p < 0.2437
\]

Hypothesis concerning One Proportion

Here we test the null hypothesis p = p0 against one of the alternative hypotheses
p < p0, p > p0, or p ≠ p0 with the use of the statistic

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}
\]

which is a random variable having approximately the standard normal distribution.

Critical regions for testing p = p0

Alternative Hypothesis     Reject null hypothesis if:
p < p0                     Z < -zα
p > p0                     Z > zα
p ≠ p0                     Z < -zα/2 or Z > zα/2

Problem 8: Transceivers provide wireless communication among electronic
components of consumer products. Responding to a need for a fast, low-cost test of
Bluetooth-capable transceivers, engineers developed a product test at the wafer level.
In one set of trials with 60 devices selected from different wafer lots, 48 devices
passed. Test the null hypothesis p = 0.70 against the alternative hypothesis p > 0.70
at the 0.05 level of significance.

Solution: Null hypothesis: p = 0.70

Alternative hypothesis: p > 0.70

Level of significance α = 0.05

Then z0.05 = 1.645

The null hypothesis is rejected if Z > 1.645, where the test statistic is

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}
\]

From the given data, x = 48, n = 60, and p0 = 0.70

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}} = \frac{48 - 60(0.70)}{\sqrt{60(0.70)(0.30)}} = 1.69
\]

Since Z = 1.69 is greater than 1.645, the null hypothesis is rejected. So we accept the
alternative hypothesis; that is, p > 0.70.
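
The test statistic and the one-sided decision rule can be sketched in a few lines. This is an illustrative check, not part of the original notes: the function name one_proportion_z is an assumption, and the critical value 1.645 is the tabulated one used in the text.

```python
from math import sqrt


def one_proportion_z(x, n, p0):
    """Test statistic Z = (X - n*p0) / sqrt(n*p0*(1 - p0)) (illustrative helper)."""
    return (x - n * p0) / sqrt(n * p0 * (1 - p0))


Z_05 = 1.645  # tabulated z_{0.05}

# Problem 8: 48 of 60 devices passed, H0: p = 0.70 against p > 0.70
z = one_proportion_z(48, 60, 0.70)
reject = z > Z_05  # right-tailed critical region
print(round(z, 2), reject)
```

For a left-tailed alternative the comparison becomes z < -Z_05, and for a two-sided alternative abs(z) is compared with the tabulated zα/2.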

Problem 9: A manufacturer of submersible pumps claims that at most 30% of the
pumps require repairs within the first 5 years of operation. If a random sample of 120
of these pumps includes 47 which required repairs within the first 5 years, test the null
hypothesis p = 0.30 against the alternative hypothesis p > 0.30 at the 0.05 level of
significance.

Solution: Null hypothesis: p = 0.30

Alternative hypothesis: p > 0.30

Level of significance α = 0.05

Then z0.05 = 1.645

The null hypothesis is rejected if Z > 1.645, where the test statistic is

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}
\]

From the given data, x = 47, n = 120, and p0 = 0.30

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}} = \frac{47 - 120(0.30)}{\sqrt{120(0.30)(0.70)}} = 2.1913
\]

Since Z = 2.1913 is greater than 1.645, the null hypothesis is rejected. So we accept the
alternative hypothesis; that is, p > 0.30.

Problem 10: The performance of a computer is observed over a period of 2 years to
check the claim that the probability is 0.20 that its downtime will exceed 5 hours in
any given week. Testing the null hypothesis p = 0.20 against the alternative hypothesis
p ≠ 0.20, what can we conclude at the level of significance α = 0.05, if there were only
11 weeks in which the downtime of the computer exceeded 5 hours?

Solution: Null hypothesis: p = 0.20

Alternative hypothesis: p ≠ 0.20

Level of significance α = 0.05

Then z0.025 = 1.96

The null hypothesis is rejected if Z < -1.96 or Z > 1.96, where the test statistic is

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}
\]

From the given data, x = 11, n = 104 (the number of weeks in 2 years), and p0 = 0.20

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}} = \frac{11 - 104(0.20)}{\sqrt{104(0.20)(0.80)}} = -2.402
\]

Since Z = -2.402 is less than -1.96, the null hypothesis is rejected. So we accept the
alternative hypothesis; that is, p ≠ 0.20.

Problem 11: To check on an ambulance service’s claim that at least 40% of its calls
are life-threatening emergencies, a random sample was taken from its files, and it was
found that only 49 of 150 calls were life-threatening emergencies. Can the null
hypothesis p = 0.40 be rejected against the alternative hypothesis p < 0.40 if the
probability of a Type-I error is to be at most 0.01?

Solution: Null hypothesis: p = 0.40

Alternative hypothesis: p < 0.40

Level of significance α = 0.01 (the probability of a Type-I error is to be at most 0.01)

Then z0.01 = 2.33

The null hypothesis is rejected if Z < -2.33, where the test statistic is

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}
\]

From the given data, x = 49, n = 150, and p0 = 0.40

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}} = \frac{49 - 150(0.40)}{\sqrt{150(0.40)(0.60)}} = -1.8333
\]

Since Z = -1.8333 is not less than -2.33, the null hypothesis cannot be rejected at the
0.01 level of significance. (At the 0.05 level, where z0.05 = 1.645, Z = -1.8333 is less
than -1.645 and the null hypothesis would be rejected.)

Problem 12: In a random sample of 600 cars making a right turn at a certain
intersection, 157 pulled into the wrong lane. Test the null hypothesis that actually
30% of all drivers make this mistake at the given intersection, using the alternative
hypothesis p ≠ 0.30 and the level of significance

(a) α = 0.05 (b) α = 0.01

Solution: Null hypothesis: p = 0.30

Alternative hypothesis: p ≠ 0.30

The null hypothesis is rejected if Z < -zα/2 or Z > zα/2, where the test statistic is

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}
\]

From the given data, x = 157, n = 600, and p0 = 0.30, so for both parts

\[
Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}} = \frac{157 - 600(0.30)}{\sqrt{600(0.30)(0.70)}} = -2.049
\]

(a) At the level of significance α = 0.05, z0.025 = 1.96.

Since Z = -2.049 is less than -1.96, the null hypothesis is rejected. So we accept the
alternative hypothesis; that is, p ≠ 0.30.

(b) At the level of significance α = 0.01, z0.005 = 2.575.

Since Z = -2.049 lies between -2.575 and 2.575, the null hypothesis is accepted; that
is, p = 0.30 is accepted.

Conclusion: The null hypothesis is rejected at the 0.05 level of significance and
accepted at the 0.01 level of significance.
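
The same Z value leads to different decisions at the two levels because only the critical value changes. The sketch below illustrates this under stated assumptions: the helper name and the CRITICAL dictionary are introduced here for illustration, and the critical values are the tabulated ones used in the text.

```python
from math import sqrt


def one_proportion_z(x, n, p0):
    """Test statistic Z = (X - n*p0) / sqrt(n*p0*(1 - p0)) (illustrative helper)."""
    return (x - n * p0) / sqrt(n * p0 * (1 - p0))


# Tabulated two-sided critical values z_{alpha/2} used in the text
CRITICAL = {0.05: 1.96, 0.01: 2.575}

# Problem 12: 157 of 600 cars, H0: p = 0.30 against p != 0.30
z = one_proportion_z(157, 600, 0.30)

# True means "reject H0" at that level of significance
decisions = {alpha: abs(z) > z_crit for alpha, z_crit in CRITICAL.items()}
print(round(z, 3), decisions)
```

The statistic is computed once; the decision is True (reject) at α = 0.05 and False at α = 0.01.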

HYPOTHESIS CONCERNING SEVERAL PROPORTIONS

Many engineering problems concern a random variable that follows the binomial
distribution. For example, consider a production process that manufactures items that
are classified as either acceptable or defective. Modeling the occurrence of defectives
with the binomial distribution is usually reasonable when the binomial parameter p
represents the proportion of defective items produced. Consequently, many
engineering decision problems involve hypothesis testing about p.

Suppose that we are interested in testing whether two or more binomial populations
have the same parameter p. Let us consider k different binomial populations whose
parameters are respectively p1, p2, …, pk. Now we are interested in testing the null
hypothesis p1 = p2 = … = pk = p against the alternative hypothesis that these population
proportions are not all equal. To perform a suitable large sample test of this
hypothesis, we require independent random samples of sizes n1, n2, …, nk from the k
different populations. The numbers of successes and failures in each of these k samples
are given by the following table:

             Sample 1    Sample 2    …    Sample k    Total
Successes    x1          x2          …    xk          x
Failures     n1 – x1     n2 – x2     …    nk – xk     n – x
Total        n1          n2          …    nk          n

In the above table x represents the total number of successes, n – x the total
number of failures, and n the total number of trials. The entry in the cell belonging to
the ith row and jth column is called the observed frequency oij, with i = 1, 2 and j = 1,
2, …, k. Let us denote the observed proportion of successes in the combined samples
by \( \hat{p} \). Its value is given by

\[
\hat{p} = \frac{x}{n}
\]

Hence the expected numbers of successes and failures for the jth sample are estimated
by the following formulae:

\[
e_{1j} = n_j\hat{p} = n_j\cdot\frac{x}{n}
\]

\[
e_{2j} = n_j(1-\hat{p}) = n_j\left(1-\frac{x}{n}\right) = n_j\cdot\frac{n-x}{n}
\]

The quantities e1j and e2j are called the expected cell frequencies for j = 1, 2, …, k.
The test statistic for tests concerning differences among proportions is given by

\[
\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{k}\frac{(o_{ij}-e_{ij})^2}{e_{ij}}
\]

Decision: Reject the null hypothesis if the value of \( \chi^2 \) exceeds \( \chi^2_{\alpha} \) with k – 1 degrees
of freedom.

Problem 13: Samples of three kinds of materials, subjected to extreme temperature
changes, produced the results shown in the following table:

                   Material A    Material B    Material C    Total
Crumbled           41            27            22            90
Remained intact    79            53            78            210
Total              120           80            100           300

Use the 0.05 level of significance to test whether, under the stated conditions, the
probability of crumbling is the same for the three kinds of materials.

Solution:
Null Hypothesis, H0: p1 = p2 = p3
Alternative Hypothesis, H1: p1, p2 and p3 are not all equal.
Level of significance, α = 0.05

\[
\hat{p} = \frac{90}{300} = 0.30
\]

The expected frequencies for the cells are given as follows:

e11 = 120 × 90/300 = 36
e12 = 80 × 90/300 = 24
e13 = 90 – (36 + 24) = 30 (since 36 + 24 + e13 = 90)
e21 = 120 – 36 = 84 (since 36 + e21 = 120)
e22 = 80 – 24 = 56
e23 = 100 – 30 = 70

Test statistic,

\[
\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{3}\frac{(o_{ij}-e_{ij})^2}{e_{ij}}
= \frac{(41-36)^2}{36} + \frac{(27-24)^2}{24} + \frac{(22-30)^2}{30} + \frac{(79-84)^2}{84} + \frac{(53-56)^2}{56} + \frac{(78-70)^2}{70} = 4.575
\]

\[
\chi^2_{0.05} = 5.991 \quad \text{for } 3 - 1 = 2 \text{ d.f.}
\]

Since \( \chi^2 = 4.575 \) does not exceed \( \chi^2_{0.05} = 5.991 \) (3 – 1 = 2 d.f.), we cannot reject the null
hypothesis at the 0.05 level of significance. Hence the data are consistent with the
probability of crumbling being the same for the three kinds of material.

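
The expected frequencies and the chi-square statistic follow mechanically from the table of counts. The sketch below is illustrative only: the function name chi_square_proportions is an assumption, and the critical value is the tabulated one quoted in the text rather than one computed by a library.

```python
def chi_square_proportions(successes, sizes):
    """Chi-square statistic for testing p1 = p2 = ... = pk (illustrative helper)."""
    x, n = sum(successes), sum(sizes)
    p_hat = x / n                      # pooled proportion of successes
    chi2 = 0.0
    for x_j, n_j in zip(successes, sizes):
        e1 = n_j * p_hat               # expected successes e_1j
        e2 = n_j * (1 - p_hat)         # expected failures e_2j
        chi2 += (x_j - e1) ** 2 / e1 + ((n_j - x_j) - e2) ** 2 / e2
    return chi2


CHI2_05_2DF = 5.991  # tabulated chi-square value for alpha = 0.05, 2 d.f.

# Problem 13: crumbled counts and sample sizes for materials A, B, C
chi2 = chi_square_proportions([41, 27, 22], [120, 80, 100])
print(round(chi2, 3), chi2 > CHI2_05_2DF)
```

Only the success counts and sample sizes are needed as input, since the failure counts are nj – xj.
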
Problem 14: Four methods are under development for making disks of a
superconducting material. Fifty disks are made by each method and they are checked
for superconductivity when cooled with liquid nitrogen.

                   Method 1    Method 2    Method 3    Method 4    Total
Superconductors    31          42          22          25          120
Failures           19          8           28          25          80
Total              50          50          50          50          200

Perform a chi-square test with α = 0.05 to test whether the probability of
superconductivity is the same for the four methods.

Solution:
Null Hypothesis, H0: p1 = p2 = p3 = p4
Alternative Hypothesis, H1: p1, p2, p3 and p4 are not all equal.
Level of significance, α = 0.05

\[
\hat{p} = \frac{120}{200} = 0.60
\]

The expected frequencies for the cells are given as follows:

e11 = 50 × 120/200 = 30
e12 = 50 × 120/200 = 30
e13 = 50 × 120/200 = 30
e14 = 50 × 120/200 = 30
e21 = 50 – 30 = 20
e22 = 50 – 30 = 20
e23 = 50 – 30 = 20
e24 = 50 – 30 = 20

Test statistic,

\[
\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{4}\frac{(o_{ij}-e_{ij})^2}{e_{ij}}
= \frac{(31-30)^2}{30} + \frac{(42-30)^2}{30} + \frac{(22-30)^2}{30} + \frac{(25-30)^2}{30} + \frac{(19-20)^2}{20} + \frac{(8-20)^2}{20} + \frac{(28-20)^2}{20} + \frac{(25-20)^2}{20} = 19.50
\]

\[
\chi^2_{0.05} = 7.815 \quad \text{for } 4 - 1 = 3 \text{ d.f.}
\]

Since \( \chi^2 = 19.50 \) exceeds \( \chi^2_{0.05} = 7.815 \) (4 – 1 = 3 d.f.), we reject the null hypothesis at the
0.05 level of significance. Hence the probability of superconductivity is not the same
for the four methods.

Problem 15: The following data come from a study in which random samples of the
employees of three government agencies were asked questions about their pension
plan:

                            Agency 1    Agency 2    Agency 3    Total
For the pension plan        67          84          109         260
Against the pension plan    33          66          41          140
Total                       100         150         150         400

Use the 0.01 level of significance to test the null hypothesis that the actual proportions
of employees favoring the pension plan are the same.

Solution:
Null Hypothesis, H0: p1 = p2 = p3
Alternative Hypothesis, H1: p1, p2 and p3 are not all equal.
Level of significance, α = 0.01

\[
\hat{p} = \frac{260}{400} = 0.65
\]

The expected frequencies for the cells are given as follows:

e11 = 100 × 260/400 = 65
e12 = 150 × 260/400 = 97.5
e13 = 260 – (65 + 97.5) = 97.5
e21 = 100 – 65 = 35
e22 = 150 – 97.5 = 52.5
e23 = 150 – 97.5 = 52.5

Test statistic,

\[
\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{3}\frac{(o_{ij}-e_{ij})^2}{e_{ij}}
= \frac{(67-65)^2}{65} + \frac{(84-97.5)^2}{97.5} + \frac{(109-97.5)^2}{97.5} + \frac{(33-35)^2}{35} + \frac{(66-52.5)^2}{52.5} + \frac{(41-52.5)^2}{52.5} = 9.39
\]

\[
\chi^2_{0.01} = 9.210 \quad \text{for } 3 - 1 = 2 \text{ d.f.}
\]

Since \( \chi^2 = 9.39 \) exceeds \( \chi^2_{0.01} = 9.210 \) (3 – 1 = 2 d.f.), we have to reject the null hypothesis
at the 0.01 level of significance. Hence the proportion of employees favoring the
pension plan is not the same in the three agencies.

Hypothesis concerning two proportions

This is a particular case of several proportions with k = 2. In this case we proceed
as follows:

Null Hypothesis, H0: p1 = p2
Alternative Hypothesis, H1: p1 < p2 (or) p1 > p2 (or) p1 ≠ p2

Given x1, x2, n1 and n2

Level of significance = α

Test Statistic,

\[
Z = \frac{\frac{X_1}{n_1} - \frac{X_2}{n_2}}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}
\quad \text{with} \quad \hat{p} = \frac{X_1 + X_2}{n_1 + n_2}
\]

Critical Regions for testing the Null Hypothesis, H0: p1 = p2

Alternative Hypothesis     Reject null hypothesis if
p1 < p2                    Z < -zα
p1 > p2                    Z > zα
p1 ≠ p2                    Z < -zα/2 or Z > zα/2

Finally, we state the decision: either accept or reject the null hypothesis.

Also, the (1 – α)100% large sample confidence interval for the difference of two
proportions, p1 – p2, is given by

\[
\frac{x_1}{n_1} - \frac{x_2}{n_2} \pm z_{\alpha/2}\sqrt{\frac{\frac{x_1}{n_1}\left(1-\frac{x_1}{n_1}\right)}{n_1} + \frac{\frac{x_2}{n_2}\left(1-\frac{x_2}{n_2}\right)}{n_2}}
\]

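
Both formulas above can be sketched as short functions. This is an illustrative check rather than part of the original notes: the names two_proportion_z and difference_ci are assumptions, and the example uses the tractor counts of Problem 16 (16 of 200 versus 14 of 400) with the tabulated value z0.025 = 1.96.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: p1 = p2, using the pooled estimate of p."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


def difference_ci(x1, n1, x2, n2, z):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2 - margin, p1 - p2 + margin


z = two_proportion_z(16, 200, 14, 400)
low, high = difference_ci(16, 200, 14, 400, z=1.96)
print(round(z, 3), round(low, 3), round(high, 3))
```

Note that the test uses the pooled proportion (valid under H0: p1 = p2), while the confidence interval uses the two sample proportions separately.
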
Problem 16: A study shows that 16 of 200 tractors produced on one assembly line
required extensive adjustments before they could be shipped, while the same was
true for 14 of 400 tractors produced on another assembly line. At the 0.01 level of
significance, does this support the claim that the second production line does superior
work? Also construct a 95% confidence interval for p1 - p2.

Solution:
Null Hypothesis, H0: p1 = p2
Alternative Hypothesis, H1: p1 > p2

Given x1 = 16, x2 = 14, n1 = 200 and n2 = 400

Level of significance, α = 0.01

Test Statistic,

\[
Z = \frac{\frac{X_1}{n_1} - \frac{X_2}{n_2}}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}
\quad \text{with} \quad \hat{p} = \frac{X_1 + X_2}{n_1 + n_2}
\]

\[
Z = \frac{\frac{16}{200} - \frac{14}{400}}{\sqrt{0.05(1-0.05)\left(\frac{1}{200}+\frac{1}{400}\right)}}
\quad \text{with} \quad \hat{p} = \frac{16 + 14}{200 + 400} = 0.05
\]

\[
Z = \frac{0.045}{\sqrt{(0.0475)(0.0075)}} = 2.384
\]

From table 3, Z0.01 = 2.33

Critical region for testing the Null Hypothesis, H0: p1 = p2

Alternative Hypothesis     Reject null hypothesis if
p1 > p2                    Z > zα

Decision: Since Z = 2.384 exceeds Z0.01 = 2.33, we have to reject the null hypothesis
and accept the alternative hypothesis. That is, the true proportion of tractors requiring
extensive adjustments is greater for the first assembly line than for the second.

A 95% confidence interval for the difference of two proportions is given by

\[
\left(\frac{x_1}{n_1} - \frac{x_2}{n_2}\right) \pm z_{\alpha/2}\sqrt{\frac{\frac{x_1}{n_1}\left(1-\frac{x_1}{n_1}\right)}{n_1} + \frac{\frac{x_2}{n_2}\left(1-\frac{x_2}{n_2}\right)}{n_2}}
\]

\[
= (0.08 - 0.035) \pm z_{0.025}\sqrt{\frac{0.08(1-0.08)}{200} + \frac{0.035(1-0.035)}{400}}
\]

\[
= (0.08 - 0.035) \pm 1.96\sqrt{\frac{0.08(1-0.08)}{200} + \frac{0.035(1-0.035)}{400}}
\]

\[
= 0.045 \pm 1.96\sqrt{0.000368 + 0.00008443}
\]

\[
0.003 < p_1 - p_2 < 0.087
\]

Problem 17: Photolithography plays a central role in manufacturing integrated circuits
made on thin disks of silicon. Prior to a quality improvement program, too many
rework operations were required. In a sample of 200 units, 26 required reworking of
the photolithographic step. Following training in the use of Pareto charts and other
approaches to identify significant problems, improvements were made. A new sample
of size 200 had only 12 that needed rework. Is this sufficient evidence at the 0.01 level
of significance that the improvements have been effective in reducing rework?

Solution:
Null Hypothesis, H0: p1 = p2
Alternative Hypothesis, H1: p1 > p2
Given x1 = 26, x2 = 12, n1 = 200 and n2 = 200
Level of significance, α = 0.01

Test Statistic,

\[
Z = \frac{\frac{X_1}{n_1} - \frac{X_2}{n_2}}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}
\quad \text{with} \quad \hat{p} = \frac{X_1 + X_2}{n_1 + n_2}
\]

\[
Z = \frac{\frac{26}{200} - \frac{12}{200}}{\sqrt{0.095(1-0.095)\left(\frac{1}{200}+\frac{1}{200}\right)}}
\quad \text{with} \quad \hat{p} = \frac{26 + 12}{200 + 200} = 0.095
\]

\[
Z = \frac{0.07}{\sqrt{(0.085975)(0.01)}} = 2.3873
\]

Z0.01 = 2.33

Critical region for testing the Null Hypothesis, H0: p1 = p2

Alternative Hypothesis     Reject null hypothesis if
p1 > p2                    Z > zα

Decision: Since Z = 2.3873 exceeds Z0.01 = 2.33, we have to reject the null hypothesis
and accept the alternative hypothesis. That is, there is sufficient evidence that the
improvements have been effective in reducing rework.

Problem 18: The owner of a machine shop must decide which of two snack-vending
machines to install in his shop. If each machine is tested 250 times and the first
machine fails to work (neither delivers the snack nor returns the money) 13 times and
the second machine fails to work 7 times, test at the 0.05 level of significance whether
the difference between the corresponding sample proportions is significant.

Solution:
Null Hypothesis, H0: p1 = p2
Alternative Hypothesis, H1: p1 ≠ p2
Given x1 = 13, x2 = 7, n1 = 250 and n2 = 250
Level of significance, α = 0.05

Test Statistic,

\[
Z = \frac{\frac{X_1}{n_1} - \frac{X_2}{n_2}}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}
\quad \text{with} \quad \hat{p} = \frac{X_1 + X_2}{n_1 + n_2}
\]

\[
Z = \frac{\frac{13}{250} - \frac{7}{250}}{\sqrt{0.04(1-0.04)\left(\frac{1}{250}+\frac{1}{250}\right)}}
\quad \text{with} \quad \hat{p} = \frac{13 + 7}{250 + 250} = 0.04
\]

\[
Z = \frac{0.024}{\sqrt{(0.0384)(0.008)}} = 1.369
\]

Z0.025 = 1.96

Critical region for testing the Null Hypothesis, H0: p1 = p2

Alternative Hypothesis     Reject null hypothesis if
p1 ≠ p2                    Z < -zα/2 or Z > zα/2

Decision: Since Z = 1.369 lies between -1.96 and 1.96, we cannot reject the null
hypothesis. So we accept the null hypothesis; the difference between the sample
proportions is not significant.
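
The notes describe the two-proportion test as the case k = 2 of the test for several proportions. One way to see the connection, sketched below with the counts of Problem 18, is that the square of the Z statistic equals the chi-square statistic computed from the same 2 × 2 table. The variable names are chosen here for illustration; this check is not part of the original solution.

```python
from math import sqrt

x1, n1, x2, n2 = 13, 250, 7, 250  # failure counts and trials from Problem 18

# Two-proportion Z statistic with the pooled estimate
p_pool = (x1 + x2) / (n1 + n2)
z = (x1 / n1 - x2 / n2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# Chi-square statistic for the same data treated as k = 2 samples
chi2 = 0.0
for x_j, n_j in ((x1, n1), (x2, n2)):
    e1 = n_j * p_pool            # expected count in row 1
    e2 = n_j * (1 - p_pool)      # expected count in row 2
    chi2 += (x_j - e1) ** 2 / e1 + ((n_j - x_j) - e2) ** 2 / e2

print(round(z, 3), round(z ** 2, 3), round(chi2, 3))
```

Because Z squared equals the chi-square value, the two-sided Z test at level α and the chi-square test with 1 degree of freedom lead to the same decision; the chi-square form cannot express the one-sided alternatives p1 < p2 or p1 > p2.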

Home work: A study showed that 64 of 180 persons who saw a photocopying
machine advertised during the telecast of a baseball game and 75 of 180 other persons
who saw it advertised on a variety show remembered the brand name 2 hours later.
Use the 0.05 level of significance to test whether the difference between the
corresponding sample proportions is significant.
