Chapter 2-Part 2
03/16/2025
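The best unbiased estimate of $\sigma^2$, the population variance, is $\hat{\sigma}^2$, where
$\hat{\sigma}^2 = \dfrac{n}{n-1}\,s^2$
and $s^2$ is the variance of the sample.
There are alternative formats for $\hat{\sigma}^2$:
$\hat{\sigma}^2 = \dfrac{\sum (x - \bar{x})^2}{n-1}$
or
$\hat{\sigma}^2 = \dfrac{1}{n-1}\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)$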
Example 2.7
A railway enthusiast simulates train journeys and records the number of minutes x, to the
nearest minute, trains are late according to the schedule being used. A random sample of 50
journeys gave the following times.
17 5 3 10 4 3 10 5 2 14
3 14 5 5 21 9 22 36 14 34
22 4 23 6 8 15 41 23 13 7
6 13 33 8 5 34 26 17 8 43
24 14 23 4 19 5 23 13 12 10
Given that $\sum x = 738$ and $\sum x^2 = 16\,526$, calculate, to two decimal places, unbiased estimates of the mean and the variance
of the population from which this sample was drawn.
Example 2.8
For the data given in Example 2.7, estimate the proportion of trains that are more than 25
minutes late.
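The two train examples can be checked numerically. The sketch below is only an illustration, assuming Python's standard `statistics` module; the variable names are introduced here and are not part of the original examples. It computes the unbiased estimates of the mean and variance (divisor n - 1) and uses the sample proportion as the estimate of the population proportion.

```python
from statistics import mean, variance

# Late times in minutes for the 50 simulated journeys listed in the example.
late_minutes = [
    17, 5, 3, 10, 4, 3, 10, 5, 2, 14,
    3, 14, 5, 5, 21, 9, 22, 36, 14, 34,
    22, 4, 23, 6, 8, 15, 41, 23, 13, 7,
    6, 13, 33, 8, 5, 34, 26, 17, 8, 43,
    24, 14, 23, 4, 19, 5, 23, 13, 12, 10,
]

n = len(late_minutes)

# The sample mean is an unbiased estimate of the population mean.
mean_estimate = mean(late_minutes)

# statistics.variance divides by n - 1, so it is the unbiased estimate.
variance_estimate = variance(late_minutes)

# Proportion of sampled trains that are more than 25 minutes late.
proportion_over_25 = sum(1 for x in late_minutes if x > 25) / n

print(f"mean estimate      = {mean_estimate:.2f}")       # 14.76
print(f"variance estimate  = {variance_estimate:.2f}")   # 114.96
print(f"proportion over 25 = {proportion_over_25:.2f}")  # 0.14
```

Seven of the 50 journeys exceed 25 minutes, which gives the estimated proportion 0.14.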
2.4 Interval Estimates
Another way of using a sample value to give a good idea of an unknown population
parameter is to construct an interval, known as a confidence interval.
In general terms, this is an interval that has a specified probability of including the parameter.
The interval is usually written (a, b), and the end-values, a and b, are known as confidence limits.
The probabilities most often used in confidence intervals are 90%, 95% and 99%. Suppose
you do not know the mean of a particular population and you want to work out a 95%
confidence interval for it.
You would need to construct an interval (a, b) so that $P(a < \mu < b) = 0.95$. In this case, the probability that the
interval includes $\mu$ is 0.95 or 95%.
The interval that you construct uses the value of the mean of a random sample of size n
taken from the population. This mean is denoted by $\bar{x}$.
Before constructing your interval for $\mu$, it is essential to ask the following questions.
- Is the distribution of the population normal or not?
- Do you know the variance of the population?
- Is the sample size large or small?
Your answers will then determine how to proceed.
The following theory illustrates various situations.
(a) Confidence interval for $\mu$, the population mean
• of a normal population,
• with known variance $\sigma^2$
• using any size sample, large or small
Consider first how to calculate the end-values of the most commonly used interval, the 95%
confidence interval. The method can then be adapted for other levels of confidence. Note
that it is useful to be able to follow the theory for the derivation of the end-points, but in
practice you will probably only need to be able to apply the formula.
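For random samples of size $n$, if $X \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$.
Standardising gives $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$, and $P(-1.96 < Z < 1.96) = 0.95$.
Therefore a 95% confidence interval for $\mu$ is
$\left(\bar{x} - 1.96\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\right)$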
Example 2.10
The mass of vitamin E in a capsule manufactured by a certain drug company is normally distributed
with standard deviation 0.042 mg. A random sample of five capsules was analysed and the mean mass
of vitamin E was found to be 5.12 mg. Calculate a symmetric 95% confidence interval for the
population mean mass of vitamin E per capsule. Give the values of the end-points of the interval
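As a rough check of Example 2.10, the known-variance interval can be sketched in a few lines. This is an illustration only: the function name `normal_mean_interval` is introduced here, and the critical value comes from the standard library's `NormalDist` rather than from printed tables, so the last decimal place may differ slightly from a table-based answer.

```python
from math import sqrt
from statistics import NormalDist


def normal_mean_interval(sample_mean, sigma, n, level=0.95):
    """Symmetric confidence interval for the mean when sigma is known."""
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Example 2.10: sigma = 0.042 mg, n = 5 capsules, sample mean = 5.12 mg.
lower, upper = normal_mean_interval(5.12, 0.042, 5)
print(f"95% interval: ({lower:.3f}, {upper:.3f}) mg")  # (5.083, 5.157)
```

Because the population is stated to be normal with known standard deviation, the small sample size of five does not prevent use of the normal critical value.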
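If $\bar{x}$ is the mean of a random sample of size $n$, where $n$ is large, taken from a non-normal
population with known variance $\sigma^2$, then $\bar{X}$ is approximately normally distributed by the central limit theorem,
and a 95% confidence interval for $\mu$ is given by
$\left(\bar{x} - 1.96\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\right)$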
Example 2.11
The heights of men in a particular district are distributed with mean $\mu$ and standard
deviation $\sigma$. On the basis of the results obtained from a random sample of 100 men from
the district, the 95% confidence interval for $\mu$ was calculated and found to be
Calculate
(a) the value of the sample mean,
(b) the value of $\sigma$,
(c) a symmetric 90% confidence interval for $\mu$.
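The method for this kind of question can be sketched as follows: the sample mean is the midpoint of the interval, and the half-width equals $1.96\,\sigma/\sqrt{n}$, which can be solved for $\sigma$. The limits used below (170.0 and 172.0) are hypothetical values chosen purely for illustration, not the limits of the example, and the function name is introduced here.

```python
from math import sqrt
from statistics import NormalDist


def recover_mean_and_sigma(lower, upper, n, level=0.95):
    """Work backwards from a symmetric interval to the sample mean and sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    sample_mean = (lower + upper) / 2      # midpoint of the interval
    half_width = (upper - lower) / 2       # equals z * sigma / sqrt(n)
    sigma = half_width * sqrt(n) / z
    return sample_mean, sigma


# Hypothetical 95% limits for a sample of 100 (illustration only).
sample_mean, sigma = recover_mean_and_sigma(170.0, 172.0, 100)

# Re-use the recovered values to build a symmetric 90% interval.
z90 = NormalDist().inv_cdf(0.95)
half_90 = z90 * sigma / sqrt(100)
lower_90, upper_90 = sample_mean - half_90, sample_mean + half_90

print(f"sample mean = {sample_mean:.2f}, sigma = {sigma:.3f}")
print(f"90% interval: ({lower_90:.3f}, {upper_90:.3f})")
```

The 90% interval is narrower than the 95% interval because its critical value is smaller.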
Example 2.12
The result X of a stress test is known to be a normally distributed random variable with
mean $\mu$ and standard deviation 1.3. It is required to have a 95% symmetrical confidence
interval for $\mu$ with total width less than 2. Find the least number of tests that should be
carried out to achieve this.
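One way to check the answer to Example 2.12 is to search for the smallest $n$ whose interval width $2 \times 1.96 \times 1.3/\sqrt{n}$ falls below 2. The sketch below assumes the standard library's `NormalDist` for the critical value; `least_sample_size` is a name introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def least_sample_size(sigma, max_width, level=0.95):
    """Smallest n for which the symmetric interval is narrower than max_width."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    n = 1
    # Total width of the interval is 2 * z * sigma / sqrt(n).
    while 2 * z * sigma / sqrt(n) >= max_width:
        n += 1
    return n


n_required = least_sample_size(sigma=1.3, max_width=2)
print(n_required)  # 7
```

Algebraically, the condition is $\sqrt{n} > 1.96 \times 1.3 = 2.548$, so $n > 6.49$ and the least whole number of tests is 7.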
(c) Confidence interval for $\mu$, the population mean
• of a normal or non-normal population,
• with unknown variance
• using a large sample, n
When calculating confidence intervals it is often the case that the population variance, $\sigma^2$, is
not known. Provided that the sample size, $n$, is large ($n \geq 30$, say), it is permissible to use $\hat{\sigma}^2$,
the best unbiased estimate for $\sigma^2$.
Ideally the distribution of $X$ should be normal, but an approximate confidence interval can
also be given when the distribution of $X$ is not normal. Remember that in both cases, $n$ must
be large.
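A 95% confidence interval for $\mu$ is then given by
$\left(\bar{x} - 1.96\,\dfrac{\hat{\sigma}}{\sqrt{n}},\ \bar{x} + 1.96\,\dfrac{\hat{\sigma}}{\sqrt{n}}\right)$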
Example 2.13
The fuel consumption of a new model of car is being tested. In one trial, 50 cars, chosen at
random, were driven under identical conditions and the distances, $x$ km, covered on 1 litre of
petrol were recorded. The results gave the following totals:
Calculate a confidence interval for the mean petrol consumption, in kilometres per litre, of
cars of this type.
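When a question supplies only the totals $\sum x$ and $\sum x^2$, the large-sample interval can be built directly from them. The sketch below is an illustration under stated assumptions: the totals used (600 and 7396 for n = 50) are hypothetical, not the totals of the example, and the 95% level and the function name are choices made here.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_interval(n, sum_x, sum_x2, level=0.95):
    """Approximate interval for the mean from summary totals (n large)."""
    mean_estimate = sum_x / n
    # Unbiased variance estimate: (sum x^2 - (sum x)^2 / n) / (n - 1).
    variance_estimate = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(variance_estimate / n)
    interval = (mean_estimate - half_width, mean_estimate + half_width)
    return mean_estimate, variance_estimate, interval


# Hypothetical totals for 50 observations (illustration only).
mean_estimate, variance_estimate, interval = large_sample_interval(50, 600, 7396)
print(f"mean = {mean_estimate:.2f}, variance estimate = {variance_estimate:.2f}")
print(f"95% interval: ({interval[0]:.3f}, {interval[1]:.3f})")
```

The same function applies to any example in this section that gives the sample size and the two totals; only the `level` argument changes for a 90% or 99% interval.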
Example 2.14
The height, $x$, of each man in a random sample of 200 men living in the UK was measured. The
following results were obtained:
(a) Calculate unbiased estimates of the mean and variance of the heights of men living in the
UK.
(b) Determine an approximate 90% confidence interval for the mean height of men living in
the UK. Name the theorem that you have assumed.
(d) Confidence interval for $\mu$ when
• the population is normal,
• the variance $\sigma^2$ is unknown,
• the sample size $n$ is small
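When calculating confidence intervals, you have already encountered the situation when large
samples ($n \geq 30$, say) are taken from a normal population with unknown variance $\sigma^2$.
For large samples, $\dfrac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}}$ is approximately distributed as $N(0, 1)$.
For small samples, $T = \dfrac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}}$, where $\hat{\sigma}^2$ is the unbiased estimate of $\sigma^2$, has a $t$-distribution.
When $n$ is large, the difference between the $t$-distribution and the normal distribution is negligible.
For samples of size $n$, it can be shown that $T \sim t(n-1)$, that is, $T$ has a $t$-distribution with $n - 1$ degrees of freedom.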
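The small-sample result leads to an interval of the same shape as the large-sample one, with a critical value from the $t$-distribution in place of 1.96. The following is a sketch of the standard argument, where $T = (\bar{X} - \mu)/(\hat{\sigma}/\sqrt{n})$ has a $t$-distribution with $n - 1$ degrees of freedom and $t_c$ (a label introduced here) is the tabulated value satisfying $P(-t_c < T < t_c) = 0.95$:

```latex
P\!\left(-t_c < \frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}} < t_c\right) = 0.95
\quad\Longrightarrow\quad
\bar{x} - t_c\,\frac{\hat{\sigma}}{\sqrt{n}} \;<\; \mu \;<\; \bar{x} + t_c\,\frac{\hat{\sigma}}{\sqrt{n}}
```

Since $t_c$ is larger than 1.96 for small $n$, the resulting interval is wider than the corresponding normal-based interval, reflecting the extra uncertainty from estimating $\sigma^2$.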
Example 2.15
Consider $T$ following a $t$-distribution with $\nu$ degrees of freedom, i.e. $T \sim t(\nu)$. Find
(i)
(ii)
(iii)
Example 2.16
The random variable $T$ has a $t$-distribution with 14 degrees of freedom, i.e. $T \sim t(14)$. Find the value of $t$ for
which
(i) (ii)
Example 2.17
The mass, in grams, of biscuits of a particular brand follows a normal distribution with
mean $\mu$. Ten packets of biscuits are chosen at random and their masses noted. The results, in
grams, are
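The theory needed to derive the confidence interval for $p$, the population proportion, is based on the sampling distribution
of proportions, $P_s$. This states that, provided the sample size n is large, the distribution of $P_s$ is
approximately normal, so
$P_s \sim N\!\left(p, \dfrac{pq}{n}\right)$, where $q = 1 - p$.
You are then able to find approximate confidence intervals for p as follows: an approximate 95% confidence interval is
$p_s \pm 1.96\sqrt{\dfrac{p_s q_s}{n}}$, where $p_s$ is the observed sample proportion and $q_s = 1 - p_s$.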
Example 2.19
A manufacturer wants to assess the proportion of defective items in a large batch
produced by a particular machine. He tests a random sample of 300 items and finds that
45 items are defective.
Calculate an approximate 95% confidence interval for the proportion of defective items
in the batch.
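Example 2.19 can be checked with a short calculation. The sketch below is illustrative only: it assumes the usual large-sample normal approximation with the sample proportion substituted into the standard error, and `proportion_interval` is a name introduced here.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, level=0.95):
    """Approximate interval for a population proportion (n large)."""
    p_s = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Standard error uses the sample proportion in place of the unknown p.
    half_width = z * sqrt(p_s * (1 - p_s) / n)
    return p_s, (p_s - half_width, p_s + half_width)


# Example 2.19: 45 defective items in a random sample of 300.
p_s, (lower, upper) = proportion_interval(45, 300)
print(f"sample proportion = {p_s:.2f}")                # 0.15
print(f"95% interval: ({lower:.4f}, {upper:.4f})")     # (0.1096, 0.1904)
```

So the proportion of defective items in the batch is estimated to lie between roughly 11% and 19%.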
Example 2.20
In a random sample of 400 carpet shops, it was discovered that 136 of them sold carpets at
below the list prices recommended by the manufacturer.
(a) Estimate the percentage of all carpet shops selling below list price.
(b) Calculate an approximate 90% confidence interval for the proportion of shops that sell
below list price and explain briefly what this means.
(c) What size sample would have to be taken in order to estimate the percentage to within
confidence?
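Parts (a) and (b) of Example 2.20 follow the same pattern as before, and part (c) reverses the calculation to find a sample size. In the sketch below, the margin of 0.02 and the 95% level used for the sample-size step are hypothetical values chosen for illustration, since the target in part (c) is not legible here; the function name is also introduced for this sketch.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Example 2.20: 136 of 400 sampled shops sold below list price.
n_shops, below_list = 400, 136
p_s = below_list / n_shops  # part (a): 0.34, i.e. 34%

# Part (b): approximate 90% interval for the proportion.
z90 = NormalDist().inv_cdf(0.95)
half_width = z90 * sqrt(p_s * (1 - p_s) / n_shops)
lower, upper = p_s - half_width, p_s + half_width


def sample_size_for_margin(p_estimate, margin, level):
    """Smallest n with z * sqrt(p(1 - p) / n) no larger than the margin."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ceil(z ** 2 * p_estimate * (1 - p_estimate) / margin ** 2)


# Part (c) pattern, with a HYPOTHETICAL target: within 0.02 at 95% confidence.
n_needed = sample_size_for_margin(p_s, margin=0.02, level=0.95)

print(f"estimate = {p_s:.0%}")
print(f"90% interval: ({lower:.3f}, {upper:.3f})")
print(f"sample size for the hypothetical target: {n_needed}")
```

The sample-size step rearranges the half-width formula to $n \geq z^2 p_s q_s / E^2$, where $E$ is the required margin, and rounds up to the next whole number.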