
Fstats ch2 PDF

(i) Calculate a 95% confidence interval for the mean using the sample standard deviation. The interval is 10.41 ± 1.54 or (8.87, 11.95). (ii) Degrees of freedom for a sample of size 11 is 10. The t-value for a 95% confidence interval with 10 degrees of freedom is 2.228. The confidence interval using t is 10.41 ± 1.67 or (8.74, 12.08). (iii) A 90% confidence interval using t with 10 degrees of freedom is 10.41 ± 1.37 or (9.04, 11.78).

Uploaded by

Sachin Dhiman
Copyright
© © All Rights Reserved
Chapter 2 Estimation

2 ESTIMATION

Objectives
After studying this chapter you should
• be able to calculate confidence intervals for the mean of a
normal distribution with unknown variance;
• be able to calculate confidence intervals for the variance and
standard deviation of a normal distribution;
• be able to calculate approximate confidence intervals for a
proportion;
• be able to calculate approximate confidence intervals for the
mean of a Poisson distribution.

2.0 Introduction
In Chapter 9 of the text Statistics the idea of a confidence interval
was introduced. Confidence intervals are used when we want to
estimate a population parameter from a sample. The parameter
may be estimated by a single value (a point estimate) but it is
usually preferable to estimate it by an interval which will give
some indication of the amount of uncertainty attached to the
estimate. In Statistics, estimation of the mean of a normal
population with known standard deviation was considered.
If $\bar{x}$ is the mean of a random sample of size n from a normal
distribution with mean µ and standard deviation σ there is a
probability of 0.95 that $\bar{x}$ lies within $1.96\,\sigma/\sqrt{n}$ of µ. If this is the
case the interval
$$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$
will contain µ. This interval is called a 95% confidence interval
for µ.
If further samples of size n were taken and the calculation
repeated, different intervals would be calculated. 95% of these
intervals would contain µ, but 5% would not.
[Figure: distribution of $\bar{x}$ with central area 0.95 between $\mu - 1.96\,\sigma/\sqrt{n}$ and $\mu + 1.96\,\sigma/\sqrt{n}$ and area 0.025 in each tail]
Note: although µ is unknown, it does not vary, it is the intervals
that vary. It is possible to calculate 99% or even 99.9%
confidence intervals which would be wider than the 95% interval
but it is not possible to calculate 100% confidence intervals.


Example
The length of time a bus takes to travel from Chorlton to All
Saints in the morning rush hour is normally distributed with
standard deviation 4 minutes. A random sample of 6 journeys
took 23, 19, 25, 34, 24 and 28 minutes. Find

(i) a 95% confidence interval for the mean journey time,

(ii) a 99% confidence interval for the mean journey time.

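Solution
The sample mean is 153/6 = 25.5.

(i) 95% confidence interval is
$$25.5 \pm 1.96 \times \frac{4}{\sqrt{6}}$$
i.e. 25.5 ± 3.20 or (22.3, 28.7)
[Figure: standard normal distribution with central area 0.95 between −1.96 and 1.96 and area 0.025 in each tail]

(ii) 99% confidence interval is
$$25.5 \pm 2.576 \times \frac{4}{\sqrt{6}}$$
i.e. 25.5 ± 4.21 or (21.3, 29.7)
[Figure: standard normal distribution with central area 0.99 between −2.576 and 2.576 and area 0.005 in each tail]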
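The calculation in this example can be checked with a short program. The following is a minimal sketch in Python using only the standard library; the function name `z_interval` and the variable names are illustrative choices, not part of the text. It assumes, as the example does, that σ is known.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_interval(data, sigma, level=0.95):
    """Confidence interval for a normal mean when sigma is known."""
    n = len(data)
    # Two-sided critical value, e.g. about 1.96 when level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    centre = mean(data)
    return centre - half_width, centre + half_width


journey_times = [23, 19, 25, 34, 24, 28]  # minutes, from the example
interval_95 = z_interval(journey_times, sigma=4, level=0.95)
interval_99 = z_interval(journey_times, sigma=4, level=0.99)
print(interval_95, interval_99)
```

Rounded to one decimal place the two results agree with (22.3, 28.7) and (21.3, 29.7).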

2.1 Confidence interval for mean: standard deviation unknown
The example above of the bus journey times is somewhat
unrealistic. How was it known that the bus journey times were
normally distributed with a standard deviation of 4 minutes? If so
much was known about the distribution how was it that the mean
was unknown?

Probably the journey times for similar journeys had been studied
and the standard deviation found to be about four minutes. The
statement that the standard deviation was 'known' was probably
something of an exaggeration. The same argument applies to the
normal distribution. However, as you are dealing with the sample
mean, no great error will result from assuming a normal
distribution unless the distribution is extremely unusual.

If the standard deviation is completely unknown or if you do not
wish to accept a rough estimate, there is still a way forward.
Provided the sample has more than one member, an estimate of
the standard deviation may be made from the sample.

In the case of the bus journey times σ̂ = 5.09; this is most easily
found by using the $\sigma_{n-1}$ or $s_{n-1}$ button on a calculator.
Alternatively the formula
$$\hat{\sigma}^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
or equivalent can be used.
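As a check on the figure of 5.09, the formula can be evaluated directly. This is a small sketch in Python (standard library only); the variable names are illustrative. The library function `statistics.stdev` uses the same n − 1 divisor, so it plays the role of the calculator button.

```python
from math import sqrt
from statistics import mean, stdev

journey_times = [23, 19, 25, 34, 24, 28]  # minutes, from the bus example

x_bar = mean(journey_times)
# Sum of squared deviations from the sample mean.
sum_sq = sum((x - x_bar) ** 2 for x in journey_times)
sigma_hat = sqrt(sum_sq / (len(journey_times) - 1))

# statistics.stdev applies the same n - 1 divisor.
assert abs(sigma_hat - stdev(journey_times)) < 1e-9
print(round(sigma_hat, 2))  # prints 5.09
```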

For a 95% confidence interval, σ known, the formula $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$
was used. Now the known standard deviation σ will be replaced by
an estimate σ̂. There will be some uncertainty in this estimate
since, if you started again and took a different sample of the same
size, the estimate σ̂ would almost certainly be different. It
therefore seems reasonable that to allow for this extra uncertainty,
the interval should be widened by increasing the figure of 1.96
which came from tables of the normal distribution. How much you
need to increase it by has fortunately been calculated for you and is
tabulated in tables of the t distribution.

The required value of t will, however, depend on the sample size. If
you had a random sample of 1000 bus journeys then you would be
able to make a very accurate estimate of the population standard
deviation and there would be no real need to change the
figure 1.96. However, in this case with a sample of 6 there is quite a
lot of uncertainty in the estimate and you may need to make quite a
large change. If the sample had been of size 2 there would have
been a huge amount of uncertainty and a very large change may have
been necessary.

The amount of uncertainty is measured by the degrees of freedom.
Degrees of freedom are a concept related to the mathematical
definition of the chi-squared distribution. For your purposes they
may be thought of as a measure of the number of pieces of
information you have to estimate the standard deviation. It is
impossible to use a sample of size one to make an estimate of the
standard deviation. The standard deviation is a measure of spread
and one item can tell nothing about how spread out a distribution is.
A sample of size two gives one piece of information about the spread
and a sample of size three gives two pieces. In general, a sample of
size n gives n − 1 pieces of information and an estimate made from
such a sample is said to be based on n − 1 degrees of freedom.

In the example there were 6 bus journey times and so the estimate of
5.09 for the standard deviation is based on 5 degrees of freedom.
To find a 95% confidence interval you therefore require the
upper and lower 0.025 tails of the t distribution with 5 degrees
of freedom, denoted $t_5$. As with the standard normal distribution,
the t distribution is symmetrical about zero and the required values
are ±2.571.
[Figure: $t_5$ distribution with central area 0.95 between −2.571 and 2.571 and area 0.025 in each tail]


A 95% confidence interval for the mean, making no assumption
about the standard deviation, is given by
$$25.5 \pm 2.571 \times \frac{5.09}{\sqrt{6}}$$
i.e. 25.5 ± 5.34 or (20.2, 30.8)
Note: For the use of the t distribution to be valid the data must be
normally distributed. However, small deviations from
normality will not seriously affect the results.

In general, the formula is
$$\bar{x} \pm t_{n-1}\,\frac{\hat{\sigma}}{\sqrt{n}}$$
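The general formula can be sketched in Python as follows. The Python standard library has no t distribution, so this sketch assumes the value of t is read from tables, as in the text, and passed in; the function name `t_interval` is an illustrative choice.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_value):
    """Interval x_bar +/- t * sigma_hat / sqrt(n).

    t_value is the tabulated t value with n - 1 degrees of freedom.
    """
    n = len(data)
    half_width = t_value * stdev(data) / sqrt(n)
    centre = mean(data)
    return centre - half_width, centre + half_width


journey_times = [23, 19, 25, 34, 24, 28]
# 2.571 is the tabulated upper 0.025 point of t with 5 degrees of freedom.
low, high = t_interval(journey_times, t_value=2.571)
print(round(low, 1), round(high, 1))
```

For the bus journey times this gives the interval (20.2, 30.8) found above. Any other sample can be handled in the same way by supplying the appropriate table value.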

Example
The resistances (in ohms) of a random sample from a batch of
resistors were

2314 2456 2389 2361 2360 2332 2402

Assuming that the sample is from a normal distribution calculate

(i) a 95% confidence interval for the mean,

(ii) a 90% confidence interval for the mean.

Solution
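The data gives $\bar{x} = 2373.4$ and σ̂ = 47.4.

(i) $t_{6,\,0.025} = 2.447$, so the 95% confidence interval for the mean is given by
$$2373.4 \pm 2.447 \times \frac{47.4}{\sqrt{7}}$$
⇒ 2373.4 ± 43.8
⇒ (2330, 2417).
[Figure: $t_6$ distribution with central area 0.95 between −2.447 and 2.447 and area 0.025 in each tail]

(ii) $t_{6,\,0.05} = 1.943$, giving the 90% confidence interval for the mean as
$$2373.4 \pm 1.943 \times \frac{47.4}{\sqrt{7}}$$
⇒ 2373.4 ± 34.8
⇒ (2339, 2408).
[Figure: $t_6$ distribution with central area 0.90 between −1.943 and 1.943 and area 0.05 in each tail]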


Activity 1

The following data are random observations from a normal
distribution with mean 10.

6.86 11.53 12.41 12.08 12.80 10.42 8.99 9.55 8.23 5.84

7.59 5.96 10.14 10.12 10.22 10.42 11.83 8.73 11.57 11.83

10.76 9.93 10.63 7.94 12.44 12.49 9.63 9.45 13.40 10.78

13.44 11.85 13.62 13.24 12.56 10.56 10.77 8.51 11.65 9.36

8.12 11.88 11.68 7.36 7.07 10.04 9.55 12.97 10.85 8.58

8.27 9.22 11.36 9.43 8.80 9.07 7.66 13.16 8.34 7.12

3.49 13.04 13.16 11.48 8.30 10.01 10.29 11.78 13.18 8.18

10.00 12.27 14.18 9.91 9.62 7.48 8.50 10.53 13.06 6.74

6.05 9.96 7.51 10.19 9.07 9.29 6.01 12.02 10.04 10.64

9.74 8.23 9.45 5.41 9.68 10.64 6.77 10.76 8.10 10.33

8.34 11.61 9.72 11.24 12.84 6.10 10.78 8.27 8.52 7.42

8.91 8.52 10.66 14.06 9.37 10.44 11.81 9.87 9.78 10.44

10.82 10.10 7.68 11.87 7.49 9.99 8.54 4.65 5.37 8.83

15.00 10.02 9.41 8.16 9.54 9.32 6.15 12.59 12.24 13.02

9.80 8.61 8.92 8.86 11.92 13.01 14.11 11.57 10.46 11.27

8.35 8.95 9.12 7.20 11.20 13.42 13.46 12.80 10.99 10.33

14.31 7.72 9.88 10.57 13.20 11.90 8.48 9.41 7.76 10.35

8.78 9.45 11.48 10.96 7.68 9.26 14.29 8.35 6.80 8.29

8.83 10.72 10.02 11.80 13.56 13.00 10.79 7.51 8.15 10.14

11.02 8.49 9.82 8.97 9.86 7.74 11.81 9.87 10.77 9.18

Starting at any point in the table take a sample of size 3 and
calculate σ̂. Calculate an 80% confidence interval for the
mean using the t-distribution.

Now take the next sample of 3 and repeat the calculation. Carry
on until you have calculated at least 20, and preferably more,
intervals. If possible work with a group so that the labour of
calculation may be divided up between you.

What proportion of your intervals contain 10, the population
mean? Is this approximately the proportion you would have
expected?

Recalculate the intervals using z-values, i.e. calculate, for each
sample
$$\bar{x} \pm 1.282\,\frac{\hat{\sigma}}{\sqrt{3}}.$$
What proportion of these intervals contain 10? Which set of
intervals gave results more in line with your expectations?
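If a computer is available, the same experiment can be imitated by simulation rather than by reading samples from the table. The following Python sketch makes several assumptions that are not part of the activity: it draws fresh random samples of size 3 from a normal distribution with mean 10 and standard deviation 2, it uses the table value 1.886 for the upper 0.10 point of t with 2 degrees of freedom, and the seed and number of trials are arbitrary.

```python
import random
from math import sqrt

random.seed(1)  # arbitrary seed so that a run can be repeated

T_80_2DF = 1.886  # tabulated upper 0.10 point of t with 2 degrees of freedom
Z_80 = 1.282      # normal value used in the second part of the activity


def coverage(multiplier, trials=20000, n=3, mu=10.0, sigma=2.0):
    """Proportion of simulated intervals x_bar +/- multiplier * s / sqrt(n) containing mu."""
    hits = 0
    for _ in range(trials):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        if abs(x_bar - mu) <= multiplier * s / sqrt(n):
            hits += 1
    return hits / trials


t_coverage = coverage(T_80_2DF)
z_coverage = coverage(Z_80)
print(t_coverage, z_coverage)
```

The first proportion should be close to 0.8, while the second should fall noticeably short of it, which is the effect the activity is designed to show.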

Exercise 2A
1. Samples of a high temperature lubricant were tested and the
temperature (°C) at which they ceased to be effective were as follows:
235 242 235 240 237 234 239 237
Calculate a 95% confidence interval for the mean.

2. In a study aimed at improving the design of bus cabs the
functional arm reach of a random sample of bus drivers was
measured. The results, in mm, were as follows:
701, 642, 651, 700, 672, 674, 656, 649
Calculate a 95% confidence interval for the mean.

3. As part of a research study on pattern recognition a random
sample of students on a design course were asked to examine a
picture and see if they could recognise a word. The picture
contained the word 'technology' written backwards. The times, in
seconds, taken to recognise the word were as follows:
55, 28, 79, 54, 87, 61, 62, 68, 38
Calculate
(a) a 95% confidence interval for the mean,
(b) a 99% confidence interval for the mean.

2.2 Confidence interval for the standard deviation and the variance
In Section 2.1 the standard deviation of a population of bus
journey times was estimated from a sample of bus journey times.
As was stated, the estimate is subject to uncertainty as, if
another sample of the same size were taken, a different estimate
would almost certainly result. Provided the data comes from a
normal distribution it is possible to estimate the population
standard deviation with a confidence interval rather than using a
single figure (or point estimate).

To do this you need to use the fact that for a sample of size n
from a normal distribution
$$\frac{\sum (x - \bar{x})^2}{\sigma^2}$$
is distributed as $\chi^2_{n-1}$ (chi-squared with n − 1 degrees of freedom).

$\sum (x - \bar{x})^2$ is equal to $(n - 1)\hat{\sigma}^2$ and this is usually the easiest
way of calculating it.

In the case of the bus journey times a sample of size 6 gave
$\hat{\sigma}^2 = 5.09^2 = 25.9$.
If you find the upper and lower 0.025 tails of $\chi^2$, there will be
a probability of 0.95 that $\sum (x - \bar{x})^2 / \sigma^2$ will lie between them.
[Figure: chi-squared distribution with central area 0.95 between 0.831 and 12.833 and area 0.025 in each tail]

A 95% confidence interval for the variance is defined by
$$0.831 < 5 \times \frac{25.9}{\sigma^2} < 12.833$$
$$\Rightarrow\ 0.006415 < \frac{1}{\sigma^2} < 0.0990965$$
$$\Rightarrow\ 10.09 < \sigma^2 < 155.9$$
and this is a 95% confidence interval for the variance.

Although the variance has a very important role to play in the
mathematical theory of statistics, for practical applications the
standard deviation is a much more useful statistic. A 95%
confidence interval for the standard deviation is found simply
by taking the square root of the two limits,

i.e. 3.2 < σ < 12.5

Note that the interval is not symmetrical about the point
estimate which was 5.09. Clearly the interval would not make
sense unless it contained the point estimate and this is a useful
check on the calculation. However, as the standard deviation
cannot be negative but has no upper limit, it would not be
expected that the point would lie exactly in the middle.
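The steps used for the bus journey times can be sketched in Python. The standard library has no chi-squared distribution, so this sketch assumes the two tail values are read from tables and passed in; the function name `variance_interval` is an illustrative choice. Because the program carries unrounded figures, its limits may differ from those in the text in the last digit.

```python
from math import sqrt
from statistics import variance


def variance_interval(data, chi2_lower, chi2_upper):
    """Interval for sigma^2 based on sum((x - x_bar)^2) / sigma^2 ~ chi-squared(n - 1)."""
    # (n - 1) * sigma_hat^2 equals the sum of squared deviations.
    sum_sq = (len(data) - 1) * variance(data)
    # Dividing by the larger table value gives the lower limit, and vice versa.
    return sum_sq / chi2_upper, sum_sq / chi2_lower


journey_times = [23, 19, 25, 34, 24, 28]
# 0.831 and 12.833 are the tabulated 0.025 tail points of chi-squared, 5 d.f.
var_low, var_high = variance_interval(journey_times, 0.831, 12.833)
sd_low, sd_high = sqrt(var_low), sqrt(var_high)
print(var_low, var_high, sd_low, sd_high)
```

Taking square roots of the variance limits gives the interval for the standard deviation, roughly 3.2 to 12.5 as above.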

Is it possible to calculate a 100% confidence interval for the
standard deviation?

Example
In processing grain in the brewing industry, the percentage
extract recovered is measured. A particular brewery introduces
a new source of grain and the percentage extract on eleven
separate days is as follows:

95.2, 93.1, 93.5, 95.9, 94.0, 92.0, 94.4, 93.2, 95.5, 92.3, 95.4

(a) Regarding the sample as a random sample from a normal
population, calculate
(i) a 90% confidence interval for the population variance,
(ii) a 90% confidence interval for the population mean.
(b) The previous source of grain gave daily percentage extract
figures which were normally distributed with mean 94.2 and
standard deviation 2.5. A high percentage extract is desirable
but the brewery manager also requires as little day to day
variation as possible. Without further calculation, compare
the two sources of grain. (AEB)

Solution
For this data

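n = 11, $\bar{x} = 94.045$, σ̂ = 1.34117

(a)
(i) 90% confidence interval for variance is given by
$$3.94 < 10 \times \frac{1.34117^2}{\sigma^2} < 18.307$$
$$\Rightarrow\ 0.2190 < \frac{1}{\sigma^2} < 1.0178$$
$$\Rightarrow\ 0.98 < \sigma^2 < 4.57$$
[Figure: chi-squared distribution with central area 0.90 between 3.940 and 18.307 and area 0.05 in each tail]

(ii) 90% confidence interval for mean is given by
$$94.045 \pm 1.812 \times \frac{1.34117}{\sqrt{11}}$$
⇒ 94.045 ± 0.733
⇒ (93.31, 94.78)
[Figure: $t_{10}$ distribution with central area 0.90 between −1.812 and 1.812 and area 0.05 in each tail]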

(b) The mean of the previous source of grain was 94.2. This lies
in the middle of the confidence interval calculated for the
mean of the new source of grain. There is therefore no
evidence that the means differ.
The standard deviation of the previous source of grain was
2.5 and hence the variance was 2.5² = 6.25.
This is above the upper limit of the confidence interval for
the variance of the new source of grain. This suggests that
the new source gives less variability.
Combining these two conclusions suggests that the new
source is preferable to the previous source.


Activity 2
Take a sample of size 6 from the data in Activity 1. Calculate
an 80% confidence interval for the standard deviation.

Take further samples of size 6 and repeat the calculation at least
20 times. Work in a group, if possible, so that the calculations
may be shared.

The data in Activity 1 came from a normal distribution with
standard deviation 2. What proportion of your intervals contain
the value 2? Is this proportion similar to the proportion you
would expect?
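This activity can also be imitated by simulation. The Python sketch below rests on assumptions that are not part of the activity: it draws fresh samples of size 6 from a normal distribution with mean 10 and standard deviation 2 instead of using the table, it takes 1.610 and 9.236 as the tabulated 0.10 tail points of chi-squared with 5 degrees of freedom, and the seed and number of trials are arbitrary.

```python
import random
from math import sqrt

random.seed(2)  # arbitrary seed so that a run can be repeated

# Tabulated lower and upper 0.10 tail points of chi-squared with 5 d.f.
CHI2_LOWER, CHI2_UPPER = 1.610, 9.236


def sd_interval_covers(sample, sigma):
    """True if the 80% interval for the standard deviation contains sigma."""
    n = len(sample)
    x_bar = sum(sample) / n
    sum_sq = sum((x - x_bar) ** 2 for x in sample)
    low = sqrt(sum_sq / CHI2_UPPER)
    high = sqrt(sum_sq / CHI2_LOWER)
    return low < sigma < high


trials = 20000
hits = sum(
    sd_interval_covers([random.gauss(10.0, 2.0) for _ in range(6)], sigma=2.0)
    for _ in range(trials)
)
sd_coverage = hits / trials
print(sd_coverage)
```

The proportion printed should be close to 0.8.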

Exercise 2B
1. Using the data in Questions 1, 2 and 3 of Exercise 2A, calculate
95% confidence intervals for the population standard deviations.

2. The external diameter, in cm, of a random sample of piston rings
produced on a particular machine were
9.91, 9.89, 10.12, 9.98, 10.09, 9.81, 10.01, 9.99, 9.86
Calculate a 95% confidence interval for the standard deviation.
Assume a normal distribution.
Do your results support the manufacturer's claim that the standard
deviation is 0.06 cm?

3. The vitamin C content of a random sample of 5 lemons was
measured. The results in 'mg per 10 g' were
1.04, 0.95, 0.63, 1.62, 1.11
Assuming a normal distribution calculate a 95% confidence interval
for the standard deviation.
A greengrocer claimed that the method of determining the vitamin C
content was extremely unreliable and that the observed variability
was more due to errors in the determination rather than to actual
differences between lemons. To check this 7 independent
determinations were made of the vitamin C content of the same
lemon. The results were as follows
1.21, 1.22, 1.21, 1.23, 1.24, 1.23, 1.22
Assuming a normal distribution, calculate a 90% confidence interval
for the standard deviation of the determinations. Does your result
support the greengrocer's claim?

2.3 Confidence interval for proportions
An insurance company receives a large number of claims for
storm damage. A manager wished to estimate the proportion of
these claims which were for less than £500. He examined a
random sample of 120 claims and found that 18 of them were
for less than £500.

Obviously the best estimate of the proportion of all claims
which are for less than £500 is 18/120 = 0.15.

To calculate a confidence interval for the proportion we must
observe that r, the number of claims for less than £500 in a
sample of size n, will be an observation from a binomial
distribution. This will have mean np and variance np (1 − p ).
From Statistics you know that a binomial distribution can be
approximated by a normal distribution if n is large and np is
reasonably large.

In this case n = 120 and p is estimated by 0.15. Hence r can be
approximated by a normal distribution with variance
120 × 0.15 × 0.85 = 15.3 and standard deviation √15.3 = 3.912.

The proportion of claims for less than £500 is estimated by r/n.

This has variance
$$\frac{np(1 - p)}{n^2} = \frac{p(1 - p)}{n}.$$

In this case the distribution of the proportion can be approximated
by a normal distribution with variance
$$\frac{0.15 \times 0.85}{120} = 0.0010625$$
and standard deviation 0.03260.

Hence an approximate 95% confidence interval for the
proportion is given by

0.15 ± 1.96 × 0.0326

⇒ 0.15 ± 0.064

⇒ (0.086, 0.214).
In general, the formula for an approximate confidence interval
for a proportion is
$$\hat{p} \pm z\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
where p̂ is an estimate of p and z is the appropriate value from
normal tables to give the required percentage confidence
interval.

Note this formula is valid only if the parameters of the binomial
distribution make it reasonable to use the normal approximation.
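The formula for a proportion can be sketched in Python using only the standard library. The function name `proportion_interval` is an illustrative choice, and the sketch does not check whether the normal approximation is reasonable; that judgement is left to the user.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, level=0.95):
    """Approximate interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# 18 of the 120 sampled claims were for less than 500 pounds.
claims_low, claims_high = proportion_interval(18, 120)
print(round(claims_low, 3), round(claims_high, 3))
```

For the insurance claims this reproduces the interval (0.086, 0.214).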


Example
Employees of a firm carrying out motorway maintenance are
issued with brightly coloured waterproof jackets. These come
in five different sizes numbered 1 to 5. The last 40 jackets
issued were of the following sizes
2 3 3 1 3 3 2 4 3 2 5 4 1 2 3 3 2 4 5 3
2 4 4 1 5 3 3 2 3 3 1 3 4 3 3 2 5 1 4 4

(a) (i) Find the proportion in the sample requiring size 3.
Assuming the 40 employees can be regarded as a
random sample of all employees, calculate an
approximate 95% confidence interval for the
proportion, p, of all employees requiring size 3.
(ii) Give two reasons why the confidence interval
calculated in (i) is approximate rather than exact.

(b) Your estimate of p is p̂.
(i) What percentage is associated with the approximate
confidence interval p̂ ± 0.1?
(ii) How large a sample would be needed to obtain an
approximate 95% confidence interval of the form
p̂ ± 0.1? (AEB)

Solution
(a) (i) There are 15 out of 40 requiring size 3, a proportion of
15/40 = 0.375. An approximate 95% confidence interval
is given by
$$0.375 \pm 1.96 \times \sqrt{\frac{0.375 \times 0.625}{40}}$$
⇒ 0.375 ± 0.150
⇒ (0.225, 0.525).
(ii) The confidence interval is approximate because an
estimate of p is used (the true value is unknown) and
because the normal distribution is used as an
approximation to the binomial distribution.

(b) (i) Confidence interval is given by
$$\hat{p} \pm z\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$
Hence if interval is p̂ ± 0.1,
$$0.1 = z\sqrt{\frac{0.375 \times 0.625}{40}}$$
i.e. z = 1.306;
this is (1 − 2 × 0.096) × 100 = 81 per cent confidence
interval.
[Figure: standard normal distribution with area 0.9042 below 1.306 and area 0.0958 above]

(ii) 95% confidence interval
$$\hat{p} \pm 1.96\sqrt{\frac{0.375 \times 0.625}{n}}.$$
If this is p̂ ± 0.1,
$$0.1 = 1.96\sqrt{\frac{0.375 \times 0.625}{n}}$$
⇒ √n = 9.4888
⇒ n = 90.04
Sample of size 90 needed.
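Both parts of (b) amount to rearranging the same formula, once for z and once for n. The following Python sketch (standard library only, with illustrative variable names) repeats the arithmetic of the solution; it uses the table value 1.96, as the text does.

```python
from math import sqrt
from statistics import NormalDist

p_hat = 15 / 40
half_width = 0.1

# (b)(i): the z value, and hence the confidence level, for p_hat +/- 0.1 with n = 40.
z = half_width / sqrt(p_hat * (1 - p_hat) / 40)
level = 2 * NormalDist().cdf(z) - 1

# (b)(ii): the n for which a 95% interval has the form p_hat +/- 0.1.
z_95 = 1.96  # table value used in the text
n_required = (z_95 / half_width) ** 2 * p_hat * (1 - p_hat)

print(round(z, 3), round(100 * level), round(n_required, 2))
```

The results agree with z = 1.306, a confidence level of about 81 per cent and n = 90.04.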

Exercise 2C
1. When a random sample of 80 climbing ropes were subjected to a
strain equivalent to the weight of ten climbers, 12 of them broke.
Calculate a 95% confidence interval for the proportion of ropes
which would break under this strain.

2. Data from a completed questionnaire were entered into a computer
as a series of binary digits (i.e. each digit was 0 or 1). A check
on 1000 digits revealed errors in 19 of them. Assuming the
probability of an error is the same for each digit entered,
calculate a 90% confidence interval for the proportion of digits
where an error will be made.

3. A large civil engineering firm issues every new employee with a
safety helmet. Five different sizes are available numbered 1 to 5.
A random sample of 90 employees required the following sizes
2 4 2 2 2 5 4 5 4 4
4 2 4 3 4 2 3 1 5 4
3 2 3 3 3 4 3 2 4 4
3 4 4 5 3 3 3 2 4 4
2 2 3 2 3 2 3 3 5 4
2 3 4 2 4 3 2 2 3 2
3 4 2 3 4 5 2 3 3 2
4 3 2 2 3 3 3 2 3 4
2 3 2 4 2 3 3 2 2 3
Calculate an approximate 90% confidence interval for the
proportion of employees requiring size 2. (AEB)

2.4 Confidence interval for the mean of a Poisson distribution
In an area of moorland, plants of a certain variety are known to
be distributed at random at a constant average rate; that is, they
are distributed according to the Poisson distribution. A
biologist counts the number of plants in a randomly chosen
square of area 10 m² and finds 142 plants. An approximate
confidence interval for the average number of plants in a square
of area 10 m² can be found by approximating the Poisson
distribution by a normal distribution with mean 142 and
standard deviation √142. This approximation is valid since the
mean is large.

In this case a 95% approximate confidence interval is given by
$$142 \pm 1.96\sqrt{142}$$
⇒ 142 ± 23.4

⇒ (118.6, 165.4).
In general the formula is
$$m \pm z\sqrt{m}$$
where m is the observed value and z is the appropriate value
from normal tables to give the required percentage confidence
interval. Remember this is only valid if the mean is reasonably
large so that the normal distribution gives a good
approximation.

If a confidence interval for the mean number of plants per m²
was required, the interval calculated for 10 m² would be divided
by 10. The result would be 14.2 ± 2.34 or (11.86, 16.54).
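The Poisson calculation, including the rescaling to a different area, can be sketched in Python with the standard library. The function name `poisson_interval` and its `scale` parameter are illustrative choices; the sketch assumes the observed count is large enough for the normal approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist


def poisson_interval(count, level=0.95, scale=1.0):
    """Approximate interval m +/- z * sqrt(m), multiplied by scale to change units."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(count)
    return (count - half_width) * scale, (count + half_width) * scale


per_10_m2 = poisson_interval(142)              # plants per 10 square metres
per_m2 = poisson_interval(142, scale=1 / 10)   # plants per square metre
print(per_10_m2, per_m2)
```

The interval is always calculated from the total count and only then rescaled.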

The calculation could have been carried out by finding the mean
number of plants observed in 10 areas of 1 m² and basing the
calculation on this. However, there would be no advantage in
this as it would give the identical answer.

For example in this case, if 10 separate m² had been observed,
the mean number of plants in each square would be 14.2 and the
distribution would be approximated by a normal distribution with
mean 14.2 and standard deviation √14.2. Since you now have the
mean of 10 observations the calculation for a 95% interval would be
$$14.2 \pm 1.96\sqrt{\frac{14.2}{10}}$$
⇒ 14.2 ± 2.34, as before.
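The agreement between the two calculations is not a coincidence. As a short check, written here in LaTeX notation, the half-width based on the mean of 10 observations is exactly one tenth of the half-width based on the total count:

```latex
1.96\sqrt{\frac{14.2}{10}}
  = 1.96\sqrt{\frac{142}{100}}
  = \frac{1.96\sqrt{142}}{10}
  \approx \frac{23.4}{10}
  = 2.34
```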

It is probably always easiest to work in terms of the total
number of events observed and then scale the final answer as
required.

Why is a confidence interval calculated for the mean of a
Poisson distribution only an approximation?


Exercise 2D
1. Cars pass a point on a motorway during the morning rush hour at
random at a constant average rate. An observer counts 212 cars
passing during a 5 minute interval. Calculate
(a) a 95% confidence interval for the mean number of cars passing
in a 5 minute interval,
(b) a 95% confidence interval for the mean number of cars passing
in a one minute interval,
(c) a 90% confidence interval for the mean number of cars passing
in a two minute interval,
(d) a 99% confidence interval for the mean number of cars passing
in an hour.

2. The number of times a machine needs resetting on a night shift
follows a Poisson distribution. On three randomly selected nights
it was reset 9, 5 and 11 times. Calculate a 95% confidence
interval for the average number of times it needs resetting per
night.

3. The number of a certain type of organism suspended in a liquid
follows a Poisson distribution. 10 cc of the liquid are found to
contain 35 of the organisms. Calculate
(a) a 90% confidence interval for the mean number of organisms
per 10 cc,
(b) a 95% confidence interval for the mean number of organisms
per cc,
(c) a 99% confidence interval for the mean number of organisms
per 100 cc.
A further 10 cc of the liquid were examined and found to contain
26 of the organisms. Modify your answers to (a), (b) and (c) to
take account of this additional data.

2.5 Miscellaneous Exercises


1. The development engineer of a company making razors records the
time it takes him to shave, on seven mornings, using a standard
razor made by the company. The times, in seconds, were
217, 210, 254, 237, 232, 228, 243
Assuming that this may be regarded as a random sample from a
normal distribution, calculate a 95% confidence interval for the
mean. (AEB)

2. A car insurance company found that the average amount it was
paying on bodywork claims was 435 with a standard deviation of
141. The next eight bodywork claims were subjected to extra
investigation before payment was agreed. The payments, in pounds,
on these claims were
48, 109, 237, 192, 403, 98, 264, 68.
(a) Assuming the data can be regarded as a random sample from a
normal distribution, calculate a 90% confidence interval for
(i) the mean payment after extra investigation,
(ii) the standard deviation of the payments after extra
investigation.
(b) Explain to the manager whether or not your results suggest
that the distribution of payments has changed after special
investigation and comment on her suggestion that in future all
claims over £900 should be subject to special investigation. (AEB)

3. A car manufacturer purchases large quantities of a particular
component. Tests have shown that 3% fail to function and that, of
those that do function, the mean working life is 2400 hours with a
standard deviation of 650 hours. The manufacturer is particularly
concerned with the large variability and the supplier undertakes
to improve the design so that the standard deviation is reduced to
300 hours.
(a) A random sample of 310 of the new components contained 12
which failed to function. Calculate an approximate 95% confidence
interval for the proportion which fail to function.
(b) A random sample of 3 new components tested had working lives
of 2730, 3120 and 2300 hours.
Assuming that the claim of a standard deviation of 300 hours is
correct and that the lives of the new components follow a normal
distribution, calculate
(i) a 90% confidence interval for the mean working life of the
components,
(ii) how many components it would be necessary to test to make
the width of a 90% confidence interval for the mean just less
than 100 hours.
(c) Lives of components commonly follow a distribution that is not
normal. If the assumption of normality is invalid in this case,
comment briefly on the amount of uncertainty in your answers to
(b)(i) and (b)(ii).
(d) Using all the information available, compare the two designs
and recommend which one should be used. (AEB)

4. The resistances (in ohms) of a sample from a batch of resistors
were
2314, 2456, 2389, 2361, 2360, 2332, 2402.
Assuming that the sample is from a normal distribution,
(a) Calculate a 90% confidence interval for the standard deviation
of the batch.
Past experience suggests that the standard deviation, σ, is 35 ohms.
(b) Calculate a 95% confidence interval for the mean resistance of
the batch
(i) assuming σ = 35,
(ii) making no assumption about the standard deviation.
(c) Compare the merits of the confidence intervals calculated in
(b). (AEB)

5. Packets of baking powder have a nominal weight of 200 g. The
distribution of weights is normal and the standard deviation is
7 g. Average quantity system legislation states that, if the
nominal weight is 200 g,
(i) the average weight must be at least 200 g,
(ii) not more than 2.5% of packages may weigh less than 191 g,
(iii) not more than 1 in 1000 packages may weigh less than 182 g.
A random sample of 30 packages had the following weights:
218 207 214 189 211 206 203 217 183 186
219 213 207 214 203 204 195 197 213 212
188 221 217 184 186 216 198 211 216 200
(a) Calculate a 95% confidence interval for the mean weight.
(b) Find the proportion of packets in the sample weighing less
than 191 g and use your result to calculate an approximate 95%
confidence interval for the proportion of all packets weighing
less than 191 g.
(c) Assuming that the mean weight is at the lower limit of the
interval calculated in (a), what proportion of packets would weigh
less than 182 g?
(d) Discuss the suitability of the packets from the point of view
of the average quantity system. A simple adjustment will change
the mean weight of future packages. Changing the standard
deviation is possible but very expensive. Without carrying out any
further calculations, discuss any adjustments you might
recommend. (AEB)

6. A car manufacturer introduces a new method of assembling a new
component. The old method had a mean assembly time of 42 minutes
with a standard deviation of 4 minutes. The manufacturer would
like the assembly time to be as short as possible and to have as
little variation as possible. He expects the new method to have a
smaller mean but to leave the variability unchanged. A random
sample of assembly times, in minutes, taken after the new method
had become established was
27, 19, 68, 41, 17, 52, 35, 72, 38.
A statistician glanced at the data and said she thought the
variability had increased.
(a) Suggest why she said this.
(b) Assuming the data may be regarded as a random sample from a
normal distribution, calculate a 95% confidence interval for the
standard deviation. Does this confirm the statistician's claim or
not?
(c) Calculate a 90% confidence interval for the mean using a
method which is appropriate in the light of your answer to (b).
(d) Comment on the suitability of the new process. (AEB)

7. Stud anchors are used in the construction industry. Samples are
tested by embedding them in concrete and applying a steadily
increasing load until the stud fails.
(a) A sample of 6 tests gave the following maximum loads in kN
27.0, 30.5, 28.0, 23.0, 27.5, 26.5
Assuming a normal distribution for maximum loads, find 95%
confidence intervals for
(i) the mean,
(ii) the standard deviation.
(b) If the mean was at the lower end and the standard deviation at
the upper end of the confidence intervals calculated in (a), find
the value of k which the maximum load would exceed with
probability 0.99.
Safety regulations state that the greatest load that may be
applied under working conditions is $(\bar{x} - 2\hat{\sigma})/3$ where $\bar{x}$ is the
mean and $\hat{\sigma}^2$ is the unbiased estimate of variance calculated
from a sample of 6 tests. Calculate this figure for the data above
and comment on the adequacy of this regulation in these
circumstances. (AEB)

8. A campaign to combat the economic devastation caused to
coalfield communities by pit closures employed a researcher. The
campaign organisers wished to know the proportion of redundant
miners who were able to find alternative employment within a year
of becoming redundant.
(a) The researcher found that the probability of a redundant miner
visited at home refusing to answer a questionnaire is 0.2.
What is the probability that on a day when he visits twelve
redundant miners at home
(i) 3 or fewer will refuse to answer the questionnaire,
(ii) exactly 3 will refuse to answer the questionnaire,
(iii) at least 10 will agree to answer the questionnaire?
(b) The researcher decided to try a postal survey and as a pilot
scheme sent out 70 questionnaires to randomly selected redundant
miners. There were 26 completed questionnaires returned. Calculate
an approximate 95% confidence interval for the proportion of
redundant miners who would return a completed questionnaire.
(c) Of the 26 who replied, 10 had obtained employment. Of all
redundant miners who would reply to a questionnaire, a proportion
p have obtained employment. Approximately how many replies would
be necessary to obtain a 95% confidence interval of width 0.1 for p?
(d) Using the results of (b) and (c) estimate approximately how
many letters should be sent out to give a high probability of
obtaining sufficient replies to calculate the confidence interval
in (c). Explain your answer. (AEB)

9. It is known that repeated weighings of the same object on a
particular chemical balance give readings which are normally
distributed with mean equal to the mass of the object. Past
experience suggests that the standard deviation, σ, is 0.25 mg.
Seven repeated weighings gave the following readings (mg).
19.3, 19.5, 19.1, 19.0, 19.8, 19.7, 19.4
(a) Use the data to calculate a 95% confidence interval for σ.
(b) Calculate a 95% confidence interval for the mass of the object
assuming σ = 0.25 mg.
(c) Calculate a 95% confidence interval for the mass of the
object, making no assumption about σ, and using only data from
the sample.
(d) Give two reasons for preferring the confidence interval
calculated in (b) to that calculated in (c).
