Lecture 7 CIs A
LECTURE 7
RECALL: THE NORMAL DISTRIBUTION
RECALL: STATISTICAL RESULTS BASED ON SAMPLES
THE SAMPLING DISTRIBUTION
THE MEAN OF THE SAMPLING DISTRIBUTION
THE STANDARD ERROR OF THE SAMPLING DISTRIBUTION
SAMPLE SIZE AND STANDARD ERROR
Example 7.1
Consider a population of employees. Suppose X is the time
taken (in minutes) for an employee to word process a
business letter.
The distribution of times (in minutes) to word process a
business letter is modelled on the normal distribution:
X is N(10, 4).
Now take a random sample of 10 employees, measure the
time it takes for each employee to word process the letter
and find the mean of this sample of employees.
Repeat this process with a different sample of 10 employees,
then repeat over and over again, graphing the results for all
of the samples.
Then repeat the process above by taking random samples of
50 employees.
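The repeated sampling in Example 7.1 can be imitated with a short simulation. This is only a sketch: it assumes that N(10, 4) means a mean of 10 and a variance of 4 (so a standard deviation of 2), and the function name `sample_means`, the number of repeats and the seed are illustrative choices that do not come from the lecture.

```python
import random
import statistics


def sample_means(n, repeats=5000, mu=10.0, sigma=2.0, seed=0):
    """Draw `repeats` random samples of size n and return their means.

    Assumption: X is N(10, 4) with 4 read as the variance, so sigma = 2.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]


means_10 = sample_means(10)   # samples of 10 employees
means_50 = sample_means(50)   # samples of 50 employees

for n, means in ((10, means_10), (50, means_50)):
    print(
        f"n={n}: mean of sample means = {statistics.fmean(means):.2f}, "
        f"spread of sample means = {statistics.stdev(means):.3f}, "
        f"sigma/sqrt(n) = {2.0 / n ** 0.5:.3f}"
    )
```

In both cases the sample means centre on 10, while the spread of the sample means is noticeably smaller for samples of 50 than for samples of 10.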
INTERPRETATION CONTINUED
WHEN THE SHAPE OF THE DISTRIBUTION IS UNKNOWN OR NOT NORMAL
THE CENTRAL LIMIT THEOREM
INTRODUCING CONFIDENCE INTERVALS
A confidence interval (CI) is used to estimate a
population parameter (a single number that describes
a population) from statistics (numbers that describe
a sample of data).
Instead of giving a single estimate of the
population mean, we can give a range of
values for the population mean, called a
confidence interval.
Confidence intervals are determined using
Which is the basis of calculating a confidence interval.
MARGIN OF ERROR
The margin of error measures the variation in the
random samples due to chance.
Note: as you didn’t get to sample everyone in the
population,
you expect your sample results to be “out” by a
certain amount just by chance, and
you acknowledge that your results could change with
subsequent samples and that they’re only accurate to
within a certain range (the margin of error).
The ultimate goal when making an estimate using a
confidence interval is to have a small margin of error.
The narrower the CI, the more precise the
results.
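To see how the width of a CI responds to sample size and population variability, the margin of error can be computed for a few cases. This sketch uses the large-sample form z × σ/√n that appears later in the lecture; the helper name `margin_of_error` and the value σ = 20 are hypothetical and chosen only for illustration.

```python
import math


def margin_of_error(sigma, n, z=1.96):
    """Large-sample margin of error: z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)


sigma = 20.0  # hypothetical population standard deviation
for n in (25, 100, 400):
    print(f"n={n}: margin of error = {margin_of_error(sigma, n):.2f}")

# A less variable population gives a narrower interval at the same n.
print(f"sigma=10, n=100: margin of error = {margin_of_error(10.0, 100):.2f}")
```

Quadrupling the sample size halves the margin of error, and halving the population standard deviation has the same effect.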
SO HOW DO YOU ENSURE THAT YOUR CI WILL BE NARROW ENOUGH?
FACTOR: POPULATION VARIABILITY
FACTOR: CHOOSING THE SAMPLE SIZE
EXAMPLE 7.2
95% CI FOR THE POPULATION MEAN
(FOR LARGE SAMPLES N≥30)
99% CI FOR THE POPULATION MEAN
(FOR LARGE SAMPLES N≥30)
99% CI FOR THE POPULATION MEAN
(FOR LARGE SAMPLES N≥30)
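$\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}$
where $\bar{x}$ is the sample mean (1.96 gives the 95% CI; use 2.58 in
place of 1.96 for the 99% CI).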
Example 7.4
150 people were asked what their weekly income is.
The sample mean was calculated as £378 and the
sample standard deviation as £111.80.
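95% CI
$\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}$
99% CI
$\bar{x} - 2.58\,\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + 2.58\,\frac{\sigma}{\sqrt{n}}$
Here $\bar{x}$ is the sample mean; for a large sample, the sample
standard deviation is used as the estimate of $\sigma$.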
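Calculate:
95% CI
$378 - 1.96 \times \frac{111.80}{\sqrt{150}}$ to $378 + 1.96 \times \frac{111.80}{\sqrt{150}}$
99% CI
$378 - 2.58 \times \frac{111.80}{\sqrt{150}}$ to $378 + 2.58 \times \frac{111.80}{\sqrt{150}}$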
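The arithmetic for Example 7.4 can be checked with a few lines of code. This is a sketch that treats the sample standard deviation as the estimate of σ (reasonable for n = 150) and centres the interval on the sample mean; the helper name `large_sample_ci` is illustrative.

```python
import math

n = 150
sample_mean = 378.0   # pounds per week
sample_sd = 111.80    # pounds per week, used as the estimate of sigma

standard_error = sample_sd / math.sqrt(n)


def large_sample_ci(mean, se, z):
    """Return (lower, upper) limits of mean ± z * se."""
    return mean - z * se, mean + z * se


ci_95 = large_sample_ci(sample_mean, standard_error, 1.96)
ci_99 = large_sample_ci(sample_mean, standard_error, 2.58)

print(f"standard error = {standard_error:.2f}")
print(f"95% CI: {ci_95[0]:.2f} to {ci_95[1]:.2f}")
print(f"99% CI: {ci_99[0]:.2f} to {ci_99[1]:.2f}")
```

The standard error is about £9.13, giving a 95% CI of roughly £360.11 to £395.89 and a 99% CI of roughly £354.45 to £401.55. The 99% interval is wider because a higher level of confidence needs a larger margin of error.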
CI FOR THE DIFFERENCE OF 2 POPULATION MEANS
(WHEN N1, N2 ≥30)
INTERPRETING THE CI FOR THE DIFFERENCE OF 2 MEANS
Example 7.5
Ace Delivery Service operates a fleet of delivery vans. Currently,
the company has all of its drivers pay for diesel using the same
brand of credit card, a Texgas credit card. However, the
company’s senior management has now decided that Quik-Chek,
a chain of convenience stores that also sells diesel (but does
not accept credit cards), may be worth investigating.
A random sample of diesel prices (per litre) at 35 Texgas petrol
stations and 40 Quik-Chek petrol stations, nationwide, are
summarised in the table.

                     Texgas   Quik-Chek
sample size          35       40
mean                 £1.48    £1.39
standard deviation   8p       6p
Solutions
b)
Zero is not contained within the 95% CI and both the lower and
upper limits are positive, indicating that the price of Texgas
diesel is significantly higher than that of Quik-Chek.
We are 95% confident that Texgas is, on average, between 6p and
12p per litre more expensive than Quik-Chek.
So the management of Ace Delivery Service should move to
Quik-Chek, as there is a significant difference between the
prices.
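The interval quoted in part b) can be reproduced from the summary table. This sketch assumes the usual large-sample formula for the difference of two means, (x̄1 − x̄2) ± 1.96 × √(s1²/n1 + s2²/n2); the slide showing the working is not reproduced above, so the formula and the variable names are an assumption. Prices are converted to pounds.

```python
import math

# Summary statistics from the table (prices in pounds per litre).
n_texgas, mean_texgas, sd_texgas = 35, 1.48, 0.08
n_quik, mean_quik, sd_quik = 40, 1.39, 0.06

difference = mean_texgas - mean_quik

# Assumed large-sample standard error of the difference of two means.
se_difference = math.sqrt(sd_texgas ** 2 / n_texgas + sd_quik ** 2 / n_quik)

z_95 = 1.96
lower = difference - z_95 * se_difference
upper = difference + z_95 * se_difference

print(f"difference in means = {difference * 100:.1f}p")
print(f"95% CI: {lower * 100:.1f}p to {upper * 100:.1f}p")
```

Under these assumptions the limits come out at about 5.8p and 12.2p, which is consistent with the "between 6p and 12p" stated in the solution, and the interval does not contain zero.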
c)
No, it doesn’t matter if the diesel prices (i.e. the data
values) are not normally distributed, as the sample sizes are
sufficiently large (n1, n2 ≥ 30) to assume that the
sampling distribution (of sample means) will be approximately
normally distributed.
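The point made in part c) can be illustrated by simulation. This sketch draws samples from a deliberately skewed population and checks how often the sample mean falls within 1.96 standard errors of the population mean. The exponential population, the sample size of 40, the number of repeats and the seed are hypothetical choices for illustration and are not the diesel price data.

```python
import math
import random
import statistics

rng = random.Random(1)

n = 40            # sample size, at least 30
repeats = 5000    # number of repeated samples
pop_mean = 1.0    # an exponential population with rate 1 has mean 1
pop_sd = 1.0      # and standard deviation 1 (strongly right-skewed)

means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(repeats)
]

standard_error = pop_sd / math.sqrt(n)
inside = sum(abs(m - pop_mean) <= 1.96 * standard_error for m in means) / repeats

print(f"mean of sample means = {statistics.fmean(means):.3f}")
print(f"proportion within 1.96 standard errors = {inside:.3f}")
```

Even though the individual values are far from normal, the proportion comes out close to 0.95, which is what a normal sampling distribution of the mean would predict.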