Inferential Statistics

This document provides information about inferential statistics including: - Using sample statistics to infer population parameters - The empirical rule and how it relates to standard deviations from the mean and percentages of data included - Examples demonstrating how to use the empirical rule to determine what percentage of data lies within 1 and 2 standard deviations of the mean - Key aspects of sampling including that samples should be random and representative of the population - How margin of error depends on sample size and can be used to create confidence intervals for estimating population proportions with a certain degree of confidence, typically 95%

Uploaded by

Cameron

Night 1

Modular Course 5
Summary or Descriptive Statistics:
Numerical and graphical summaries
of data.

INFERENTIAL STATISTICS:
USING THE SAMPLE STATISTICS TO INFER (TO) POPULATION PARAMETERS.
Making Decisions based on the
Empirical Rule (Standard Normal
Curve)

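[Figure: standard normal curve on a z-axis from -3 to 3, with 68% of the area within ±1, 95% within ±2 and 99.7% within ±3]
Empirical Rule
[Figure: normal curve with axis μ - 3σ, μ - 2σ, μ - 1σ, μ, μ + 1σ, μ + 2σ, μ + 3σ; 68% of the data lies within μ ± 1σ, 95% within μ ± 2σ and 99.7% within μ ± 3σ]
Most Important for Inferential Stats on our Syllabus
[Figure: normal curve with 95% of the area between μ - 2σ and μ + 2σ]
95% of normal data lies within 2 standard deviations of the mean.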
Example 1
IQ scores are normally distributed with a mean of 100 and a standard deviation of 15.
Use the Empirical Rule to show that 95% of IQ scores in the population are between 70 and 130.

Solution
95% of the IQ scores are within ±2 standard deviations of the mean.
100 + 2(15) = 100 + 30 = 130
100 - 2(15) = 100 - 30 = 70
So 95% of IQ scores in the population are between 70 and 130.
Example 2
The number of sandwiches sold by a shop from 12 noon to 2 pm each day is normally distributed.
The distribution has a mean of 42.6 sandwiches and a standard deviation of 8.2.
Use the Empirical Rule to identify the range of values around the mean that includes 68%
of the sale numbers.

Solution
68% of the sales are within ±1 standard deviation of the mean.
42.6 + 1(8.2) = 42.6 + 8.2 = 50.8
42.6 - 1(8.2) = 42.6 - 8.2 = 34.4
Solution: 68% of the sales are between 34.4 and 50.8 sandwiches.
Your Turn
Question
The table below shows the prices charged per room of 40 B&B houses in Galway.
Race - Week B&B prices per room (€)
56 75 60 70 80 70 50 90 80 75
75 50 75 50 70 60 65 60 50 70
84 70 70 60 60 70 70 70 40 60
70 80 60 65 55 50 70 80 50 55

(i) Calculate, correct to one decimal place, the mean and standard deviation of the data.
(ii) Show that the empirical rule holds true for 1 standard deviation around the mean.
(iii) Show that the empirical rule holds true for 2 standard deviations around the mean.
Solution
(i) Using a calculator: Mean = 65.5, SD = 11.2
(ii) Upper Range = Mean + 1(Standard Deviation) = 65.5 + 11.2 = 76.7
Lower Range = Mean - 1(Standard Deviation) = 65.5 - 11.2 = 54.3
Of the forty houses, 27 (67.5%) charge between €54.30 and €76.70.
Therefore approximately 68% of the prices lie within 1 standard deviation of the mean.
(iii) Upper Range = Mean + 2(Standard Deviation) = 65.5 + 22.4 = 87.9
Lower Range = Mean - 2(Standard Deviation) = 65.5 - 22.4 = 43.1
Of the forty houses, 38 (95%) charge between €43.10 and €87.90.
Therefore approximately 95% of the prices lie within 2 standard deviations of the mean.
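The check in parts (i) to (iii) can also be done by machine. The following is a minimal Python sketch, assuming the population standard deviation (dividing by n) is the one intended; the helper name `share_within` is made up for illustration.

```python
from statistics import fmean, pstdev

# Race-week B&B prices per room (€) from the table in the question.
prices = [
    56, 75, 60, 70, 80, 70, 50, 90, 80, 75,
    75, 50, 75, 50, 70, 60, 65, 60, 50, 70,
    84, 70, 70, 60, 60, 70, 70, 70, 40, 60,
    70, 80, 60, 65, 55, 50, 70, 80, 50, 55,
]

mean = fmean(prices)
sd = pstdev(prices)  # population standard deviation (divide by n)

def share_within(data, centre, spread, k):
    """Fraction of data lying within k spreads of the centre."""
    lower, upper = centre - k * spread, centre + k * spread
    return sum(lower <= x <= upper for x in data) / len(data)

within_1 = share_within(prices, mean, sd, 1)
within_2 = share_within(prices, mean, sd, 2)
print(round(mean, 1), round(sd, 1), within_1, within_2)
```

The two fractions can then be compared with the 68% and 95% figures from the empirical rule.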

Inferential Statistics:

Sampling
For Leaving Cert we deal with two types of sampling:

1. Sample Proportion (Ordinary Level and Higher Level)

2. Sample Means (Higher Level)


Sampling
We are usually unable to collect information about a total population.
The aim of sampling is to draw reasonable conclusions about a population
by obtaining information from a relatively small sample of that
population.

When a sample from a population is selected we hope that the data we get represents the population as a whole.

To ensure this:
1. The sample must be random;
2. Every member of the population must have an equal chance of being included.

[Figure: a population with six samples (Sample 1 to Sample 6) drawn from it]
Population Proportions and Margin of Error
A sample of 25 students in a school were asked if they spent over €5 on mobile phone calls over the last week. 10 students had spent over €5.
The proportion of the sample of 25 who spent over €5 was $\hat{p} = \frac{10}{25} = 0.4 = 40\%$.
Can we say that 40% of the students in the school (population) spent over €5?

The answer is no: unless the sample size is the same as the population size, we can't say for certain.

However, we can say with a certain degree of confidence that, if the sample is large enough and representative, the proportion of the sample will be approximately the same as the proportion of the population.
Population Proportions and Margin of Error
How confident we are is usually expressed as a percentage.
We already saw (from the empirical rule) that approximately 95% of the area of a normal curve lies within ±2 standard deviations of the mean.

This means that we are 95% confident that the population proportion is within ±2 standard deviations of the sample proportion. These ±2 standard deviations are our margin of error, and the percentage margin of error that this represents depends on the sample size.

If n = 1000 the percentage margin of error is about ±3%.

95% is the confidence level we are working with, but other confidence levels also exist (e.g. 90% and 99%), for which a different margin of error applies, again depending on sample size.

At the 95% level of confidence,
Margin of Error = $\frac{1}{\sqrt{n}}$
where n is the sample size.
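As a small illustration of the formula above, here is a Python sketch that evaluates 1/√n for a few sample sizes. The function name `margin_of_error` is an assumption made for this example.

```python
import math

def margin_of_error(n):
    """Approximate 95% margin of error for a sample proportion: 1/sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1 / math.sqrt(n)

for n in (25, 100, 400, 1000):
    print(n, f"{margin_of_error(n):.2%}")
```

Because n sits under a square root, the sample size must be quadrupled to halve the margin of error.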
Confidence interval for population proportion using Margin of Error
[Figure: a 95% confidence interval drawn on a number line around the sample proportion, with the population proportion marked inside it]
We are 95% confident that the population proportion is inside this confidence interval.
Showing a 95% confidence interval.
Question. A sample of 25 students in a school were asked if they spent over €5 on mobile phone calls over the last week. 10 students had spent over €5.
The proportion of the sample of 25 who spent over €5 was $\hat{p} = \frac{10}{25} = 0.4$.
Margin of Error = $\frac{1}{\sqrt{25}} = 0.2$
So the 95% confidence interval is 0.4 ± 0.2, that is, from 20% to 60%.

[Figure: different 95% confidence intervals from repeated samples]
95% of the time, the true population proportion is in the interval I made with my sampled proportion and the margin of error.
Some Notes on Margin of Error
• As the sample size increases the margin of error decreases.

• A sample of about 50 has a margin of error of about 14% at the 95% level of confidence: $\frac{1}{\sqrt{50}} = \pm 14.14\%$

• A sample of about 1000 has a margin of error of about 3% at the 95% level of confidence: $\frac{1}{\sqrt{1000}} = \pm 3.16\%$

• The size of the population does not matter.

• If we double the sample size (1000 to 2000) we do not halve the margin of error (it falls from about 3.16% to about 2.24%).

• Margin of error estimates how accurately the results of a poll reflect the “true” feelings of the population.
 

Sample Size    Margin of Error
25             ±20%
64             ±12.5%
100            ±10%
256            ±6.25%
400            ±5%
625            ±4%
1111           ±3%
1600           ±2.5%
2500           ±2%
Example 1
A company claims that 30% of people who eat their "Rice Crispy Bun" product really liked it.
The confidence level is cited as 95%.
In June an independent survey was carried out on 625 randomly selected people to see if they liked the "Rice Crispy Bun" product.
(i) Calculate the margin of error.
(ii) The result of the survey in June was that 125 liked the "Rice Crispy Bun" product. According to the June survey, would you say that at a 5% level of significance the company was correct in stating that 30% of people who eat their "Rice Crispy Bun" product really liked it?
Solution
(i) Margin of Error = $\frac{1}{\sqrt{n}} = \frac{1}{\sqrt{625}} = \pm 0.04 = \pm 4\%$
(ii) Reason: The company claims that 30% like the product.
The margin of error is plus or minus 0.04.
According to the survey 125 out of 625, that is 20%, liked the product, so the 95% confidence interval runs from 16% to 24%.
30% is outside this interval, so the survey does not support the company's claim.
[Figure: confidence interval from 16% to 24% with the claimed 30% lying outside it]
Example 2
In a survey I want a margin of error of ±5% at the 95% level of confidence.
What sample size must I pick in order to achieve this?
Solution
Margin of Error = ±0.05
$0.05 = \frac{1}{\sqrt{n}}$
$(0.05)^2 = \frac{1}{n}$
$n = \frac{1}{0.0025}$
n = 400
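The rearrangement in this solution, n = 1/(margin)², can be wrapped in a short helper. This is a sketch only; the name `sample_size_for` and the decision to round up to the next whole person are assumptions for illustration.

```python
import math

def sample_size_for(margin):
    """Smallest n with 1/sqrt(n) <= margin (95% level, sample proportion)."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    # round() guards against floating-point noise such as 399.99999999999994
    return math.ceil(round(1 / margin ** 2, 9))

print(sample_size_for(0.05))  # the margin of ±5% used in the example
```

Rounding up rather than to the nearest whole number keeps the margin of error at or below the target.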
Your Turn
Question
A sweet company claims that 10% of the M&M's it produces are green.
Students found that in a large sample of 500 M&M's 60 were green.
(i) Calculate the margin of error.
(ii) State whether 60 greens from 500 is an unusually high proportion of green M&M's if the claim by the company is assumed to be true.
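Solution
(i) Margin of Error = $\frac{1}{\sqrt{n}} = \frac{1}{\sqrt{500}} = \pm 0.045 = \pm 4.5\%$
(ii) Reason: $\hat{p} = \frac{60}{500} = 0.12 = 12\%$
[Figure: confidence interval from 7.5% to 16.5% with the claimed 10% lying inside it]
10% is between 7.5% and 16.5% (inside the confidence interval) so it seems not to be unusual.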
Recognising the Concept of a Hypothesis Test
Testing claims about a population.
Null Hypothesis: The null hypothesis, denoted by H0, is a claim or statement about a population. We assume this statement is true until proven otherwise. (The null hypothesis means that nothing is wrong with the claim or statement.)

Alternative Hypothesis: The alternative hypothesis, denoted by H1, is a claim or statement which opposes the original statement about a population.
Courtroom Analogy to Teach Formal Language
• At the start of a trial it is assumed the defendant is not guilty.

• Then the evidence is presented to the judge and jury.

• The null hypothesis is that the defendant is not guilty (H0).

• If the jury reject the null hypothesis (H0), this means that they find the defendant guilty.

• If the jury fail to reject the null hypothesis (H0), this means that they find the defendant not guilty.
Often we need to make a decision about a population based on a sample.

1. Is a coin which is tossed biased if we get a run of 8 heads in 10 tosses?
Assuming that the coin is not biased is called a NULL HYPOTHESIS (H0).
Assuming that the coin is biased is called an ALTERNATIVE HYPOTHESIS (H1).

2. During a 5 minute period a new machine produces fewer faulty parts than an old machine.
Assuming that the new machine is no better than the old one is called a NULL HYPOTHESIS (H0).
Assuming that the new machine is better than the old one is called an ALTERNATIVE HYPOTHESIS (H1).

3. Does a new drug for Hay-Fever work effectively?
Assuming that the new drug does not work effectively is called a NULL HYPOTHESIS (H0).
Assuming that the new drug does work effectively is called an ALTERNATIVE HYPOTHESIS (H1).
Hypothesis test on a population proportion using Margin of Error
[Figure: 95% confidence interval around the sample proportion on a number line. If the claimed % (H0) is inside the interval: fail to reject H0. If the claimed % (H0) is outside the interval, on either side: reject H0.]
Example 1
Go Fast Airlines provides internal flights in Ireland, short haul flights to Europe and long haul
flights to America and Asia. Each month the company carries out a survey among 1000
passengers. The company repeatedly advertises that 70% of their customers are satisfied with
their overall service. 664 of the sample stated they were satisfied with the overall service. Would
you say that the company were correct in saying that 70% of their customers were satisfied?
State the null hypotheses and state your conclusions clearly.
Null Hypothesis: The proportion of passengers who are satisfied with the service is unchanged at 70%. H0: p = 0.7
Alternative Hypothesis: The proportion of passengers who are satisfied with the service is not 70%. H1: p ≠ 0.7

Evidence:
Sample Proportion = $\frac{664}{1000} = 0.664 = 66.4\%$
Margin of Error = $\frac{1}{\sqrt{1000}} = 0.0316 = 3.16\%$
[Figure: confidence interval from 63.24% to 69.56% with the claimed 70% lying outside it]

Conclusion
The 70% is outside the range 63.24% to 69.56% of our confidence interval, so we reject the null hypothesis.
There is sufficient evidence to reject the claim that the percentage of passengers who are happy with the service is 70% at the 5% level of significance.

Possible Actions:
• Change the advertisement from 70% to 65%.
• Meet with staff to come up with suggestions about how to improve the level of satisfaction.
• Do a further survey to find out more detail about why the level of satisfaction has changed.
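The margin-of-error test used in this example (build the interval around the sample proportion, then check whether the claimed value falls inside it) can be sketched in Python as follows. The function name `margin_test` is hypothetical, and the 1/√n margin is the Ordinary Level approximation from these slides.

```python
import math

def margin_test(claimed_p, successes, n):
    """Compare a claimed proportion with the sample proportion ± 1/sqrt(n)."""
    p_hat = successes / n
    margin = 1 / math.sqrt(n)
    lower, upper = p_hat - margin, p_hat + margin
    reject = not (lower <= claimed_p <= upper)
    return lower, upper, reject

# Go Fast Airlines: claim of 70%, 664 satisfied out of 1000 surveyed.
lower, upper, reject = margin_test(0.70, 664, 1000)
print(f"{lower:.4f} to {upper:.4f}:", "reject H0" if reject else "fail to reject H0")
```

The same function can be reused for the other margin-of-error questions in this section by changing the three arguments.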
Your Turn
Question 1
It is generally agreed that 40% of the voting public are in favour of a change of government.
A survey was carried out on 900 randomly selected people to see if there was a change in support for the government. The result was that 42% are now in favour of a change of government.
(i) Calculate the margin of error.
(ii) State the null and alternative hypotheses.
(iii) At a 5% level of significance, would you accept or reject the null hypothesis? Give a reason for your conclusion.
Solution
(i) Margin of Error = $\frac{1}{\sqrt{n}} = \frac{1}{\sqrt{900}} = \pm 0.03 = \pm 3\%$ (to the nearest percent)
(ii) Null hypothesis, H0: "There is no change in the support for the government"
Alternative hypothesis, H1: "There is a change in the support for the government"
(iii) We fail to reject (accept) H0, the null hypothesis.
Reason: See the diagram below. 40% is inside the confidence interval.
[Figure: confidence interval from 39% to 45% with the claimed 40% lying inside it: fail to reject]
Question 2
RTÉ claim that 60% of all viewers watch the Late Late Show every Friday night. An independent survey was carried out on 400 randomly selected viewers to see if the claim was true. The result of the survey was that 180 were watching the Late Late Show.
I. Calculate the margin of error.
II. State the Null and Alternative Hypotheses.
III. Would you accept or reject the Null Hypothesis according to this survey? Give a reason for your conclusion.
Question 2: Solution
I. Margin of Error = $\frac{1}{\sqrt{400}} = 0.05 = 5\%$

II. Null hypothesis: 60% of viewers watch the Late Late Show.
Alternative hypothesis: The proportion of viewers who watch the Late Late Show is not 60%.
Sample Proportion = $\frac{180}{400} = 0.45 = 45\%$
[Figure: confidence interval from 40% to 50% with the claimed 60% lying outside it: reject]
III. According to the survey, there is sufficient evidence to reject the Null Hypothesis. Reason: 60% is outside the confidence interval.
Empirical Rule
[Figure: normal curve with 68% of the data within μ ± 1σ, 95% within μ ± 2σ and 99.7% within μ ± 3σ]
What about 1.5 standard deviations, or 0.8?


Night 2
Normal Distribution to
Standard Normal Distribution
Different sets of data have different means and standard deviations, but any that are normally distributed have the same bell-shaped type of curve.
In order to avoid unnecessary calculations and graphing, the scale of a Normal Distribution curve is converted to a standard scale called the z score or standard unit scale.
[Figure: two Normal Distribution curves beside the Standard Normal Curve. Normal Distributions: μ = 13, σ = 3 (axis 4, 7, 10, 13, 16, 19, 22) and μ = 278, σ = 12 (axis 242, 254, 266, 278, 290, 302, 314). Standard Normal Distribution: μ = 0, σ = 1 (axis -3 to 3).]
Standard Normal Distribution
If μ = 0 and σ = 1 we would plot $\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}z^2}$
This graph gives the Standard Normal Graph with a standardised scale.
Total area under the curve:
$P(-\infty < z < \infty) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}z^2}\,dz = 1$
[Figure: standard normal curve; the axis points μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ, μ + 3σ correspond to the z-scores -3, -2, -1, 0, 1, 2, 3]
The area between the Standard Normal Curve and the z-axis between -∞ and +∞ is 1.
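The statement that the total area is 1 can be checked numerically. The sketch below uses the trapezium rule on the interval from -8 to 8 as a stand-in for -∞ to +∞ (an assumption: the tails beyond ±8 are negligibly small); the function names are made up for this illustration.

```python
import math

def phi(z):
    """Standard normal density: exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def area(a, b, steps=10_000):
    """Trapezium-rule estimate of the area under phi between a and b."""
    h = (b - a) / steps
    inner = sum(phi(a + i * h) for i in range(1, steps))
    return h * (0.5 * phi(a) + inner + 0.5 * phi(b))

print(round(area(-8, 8), 6))  # essentially the whole curve
print(round(area(-1, 1), 4))  # within 1 standard deviation
print(round(area(-2, 2), 4))  # within 2 standard deviations
```

The last two areas can be compared with the 68% and 95% figures of the empirical rule.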
Standard Units (z-scores)
$z = \frac{x - \mu}{\sigma}$
x is a data point
μ is the population mean
σ is the standard deviation of the population
z-scores define the position of a score in relation to the mean using the standard deviation as a unit of measurement.

z-scores are very useful for comparing data points in different distributions.

The z-score is the number of standard deviations by which the score departs from the mean.
This standardises the distribution.
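To show how z-scores compare data points from different distributions, here is a minimal Python sketch. The two sets of exam figures are hypothetical and are not taken from the slides.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations by which x departs from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical marks: the same raw mark of 70 in two different exams.
maths = z_score(70, mu=60, sigma=10)
english = z_score(70, mu=64, sigma=4)
print(maths, english)
```

Under these assumed figures the mark of 70 is further above the mean, in standard units, in the second exam than in the first, even though the raw marks are equal.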
Reading z-values From Tables
Example 1
Using the tables find P(Z ≤ 1.31).
For a given z, the table (pp. 36-37) gives
$P(Z \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{1}{2}t^2}\,dt$
[Figure: standard normal curve with the area to the left of z = 1.31]
P(Z ≤ 1.31) can be read from the tables directly.
P(Z ≤ 1.31) = 0.9049 = 90.49%
Example 2
Using the tables find P(Z ≥ 1.32).
[Figure: standard normal curve with the area to the right of z = 1.32]
The table only gives the area to the left of z, but the fact that the total area under the curve equals 1 allows us to use P(Z ≥ z) = 1 - P(Z ≤ z).
P(Z ≥ 1.32) = 1 - P(Z ≤ 1.32)
P(Z ≥ 1.32) = 1 - 0.9066 = 0.0934 = 9.34%
Example 3
Using the tables find P(Z ≤ -0.74).
[Figure: standard normal curve with the area to the left of z = -0.74, which equals the area to the right of z = 0.74]
The tables only work for positive values, but the curve is symmetrical about the mean, z = 0, so both areas are the same and hence both probabilities are equal.
P(Z ≤ -0.74) = P(Z ≥ 0.74)
P(Z ≤ -0.74) = 1 - P(Z ≤ 0.74)
P(Z ≤ -0.74) = 1 - 0.7704 = 0.2296 = 22.96%
Example 4
Using the tables find P(-1.32 ≤ z ≤ 1.29).
[Figure: area between z = -1.32 and z = 1.29, found as the area to the left of 1.29 minus the area to the left of -1.32]
P(-1.32 ≤ z ≤ 1.29) = P(z ≤ 1.29) - [1 - P(z ≤ 1.32)]
= 0.9015 - [1 - 0.9066] = 0.8081 = 80.81%
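The four table look-ups in Examples 1 to 4 can be cross-checked with Python's standard library, which provides the same cumulative probability P(Z ≤ z) through `statistics.NormalDist`. This is offered as a checking aid, not as a replacement for the tables; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_left = Z.cdf(1.31)                    # Example 1: P(Z <= 1.31)
p_right = 1 - Z.cdf(1.32)               # Example 2: P(Z >= 1.32)
p_negative = Z.cdf(-0.74)               # Example 3: P(Z <= -0.74)
p_between = Z.cdf(1.29) - Z.cdf(-1.32)  # Example 4: P(-1.32 <= Z <= 1.29)

for p in (p_left, p_right, p_negative, p_between):
    print(f"{p:.4f}")
```

The printed values should agree with the table answers up to rounding in the fourth decimal place, since the tables themselves are rounded.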
Your Turn
Question 1
The amounts due on a mobile phone bill in Ireland are normally distributed with a mean of €53 and a
standard deviation of €15. If a monthly phone bill is chosen at random, find the probability that the
amount due is between €47 and €74.
Solution
$z_1 = \frac{x - \mu}{\sigma} = \frac{47 - 53}{15} = -0.4$
$z_2 = \frac{x - \mu}{\sigma} = \frac{74 - 53}{15} = 1.4$
P(-0.4 < Z < 1.4) = P(Z ≤ 1.4) - [1 - P(Z ≤ 0.4)]
P(-0.4 < Z < 1.4) = 0.9192 - [1 - 0.6554]
P(-0.4 < Z < 1.4) = 0.5746
[Figure: normal curve for the phone bills with x-axis 8, 23, 38, 53, 68, 83, 98 (z = -3 to 3), with x = 47 (z = -0.4) and x = 74 (z = 1.4) marked]
Question 2
The mean percentage achieved by a student in a statistics exam is 60%.
The standard deviation of the exam marks is 10%.
(i) What is the probability that a randomly selected student scores above 80%?
(ii) What is the probability that a randomly selected student scores below 45%?
(iii) What is the probability that a randomly selected student scores between 50% and 75%?
(iv) Suppose you were sitting this exam and you are offered a prize for getting a mark which is
greater than 90% of all the other students sitting the exam?
What percentage would you need to get in the exam to win the prize?
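Solution
(i) $z = \frac{x - \mu}{\sigma} = \frac{80 - 60}{10} = 2$
P(Z > 2) = 1 - P(Z < 2)
P(Z > 2) = 1 - 0.9772 = 0.0228 = 2.28%
[Figure: normal curve with x-axis 30 to 90 (z = -3 to 3); area to the right of x = 80 (z = 2)]

(ii) $z = \frac{x - \mu}{\sigma} = \frac{45 - 60}{10} = -1.5$
P(Z < -1.5) = P(Z > 1.5) = 1 - P(Z < 1.5)
P(Z < -1.5) = 1 - 0.9332 = 0.0668 = 6.68%
[Figure: normal curve with x-axis 30 to 90 (z = -3 to 3); area to the left of x = 45 (z = -1.5)]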
Question 2: Solution
(iii) $z_1 = \frac{50 - 60}{10} = -1$ and $z_2 = \frac{75 - 60}{10} = 1.5$
P(-1 < Z < 1.5) = P(Z ≤ 1.5) - [1 - P(Z ≤ 1)]
P(-1 < Z < 1.5) = 0.9332 - [1 - 0.8413]
P(-1 < Z < 1.5) = 0.7745
[Figure: normal curve with x-axis 30 to 90 (z = -3 to 3); area between x = 50 (z = -1) and x = 75 (z = 1.5)]

(iv) From the tables, an area of 90% (0.9) corresponds to z = 1.28.
$z = \frac{x - \mu}{\sigma}$
$1.28 = \frac{x - 60}{10} \Rightarrow x = 72.8$ marks
[Figure: normal curve with x-axis 30 to 90 (z = -3 to 3); x = 72.8 (z = 1.28) marked]
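Part (iv) reads the tables in reverse: from an area back to a z-value. As a cross-check, the sketch below uses `inv_cdf` from Python's `statistics.NormalDist`, assuming (as the solution does) that the exam marks are normally distributed.

```python
from statistics import NormalDist

marks = NormalDist(mu=60, sigma=10)  # exam marks from Question 2

z_90 = NormalDist().inv_cdf(0.90)    # z with 90% of the area to its left
cutoff = marks.inv_cdf(0.90)         # mark that is above 90% of students

print(round(z_90, 2), round(cutoff, 1))
```

Working on the `marks` distribution directly avoids the separate step of converting z back to x.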
For Higher Level Leaving Cert use z scores
[Figure: standard normal curve (tables pp. 36-37) with 95% of the area between z = -1.96 and z = +1.96, and 2.5% in each tail]
Confidence interval for population proportion
[Figure: a 95% confidence interval around the sample proportion, with the population proportion marked inside it]
We are 95% confident that the population proportion is inside this confidence interval.
Example 1
Skygo provides Wifi in the Galway area. In March the company carries out a survey among 625 of its customers. The company advertises that 60% of their customers were satisfied with their download speeds. 370 of the sample stated they were satisfied with their download speed. Create a 95% confidence interval based on your sample.
Sample Proportion = $\hat{p} = \frac{370}{625} = 0.592$
Confidence Limits = $\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.592 \pm 1.96\sqrt{\frac{(0.592)(0.408)}{625}} = 0.592 \pm 1.96(0.0196)$
[Figure: 95% confidence interval from 55.36% to 63.04%]
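The Higher Level interval in this example can be computed with a few lines of Python. This is a sketch under the assumption that the confidence limits are the sample proportion ± 1.96 standard errors, which is consistent with the limits quoted above; the function name `proportion_ci` is made up for illustration.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Confidence limits p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error

lower, upper = proportion_ci(370, 625)
print(f"{lower:.2%} to {upper:.2%}")
```

The code keeps the standard error unrounded, so its limits may differ from 55.36% and 63.04% in the last digit, which appear to come from rounding the standard error to 0.0196 first.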
Your Turn
Question 1:
The Sunday Independent reports that the government's approval rating is at 65%. The
paper states that the poll is based on a random sample of 972 voters and that the margin
of error is 3%
Show that the pollsters used a 95% level of confidence.
Question 1: Solution
Confidence Limits = $\hat{p} \pm z\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
$0.03 = z\sqrt{\frac{(0.65)(0.35)}{972}}$
$0.03 = z(0.0153)$
$z = 1.96$
Therefore they are using a 95% level of confidence.


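Question 2
It is known that 30% of a certain kind of apple seed will germinate. In an experiment 85 out of 300 seeds germinated. Construct a 95% confidence interval for the population proportion based on this sample.
Sample Proportion = $\hat{p} = \frac{85}{300} \approx 0.2833$
Confidence Limits = $\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \approx 0.2833 \pm 1.96\sqrt{\frac{(0.2833)(0.7167)}{300}} \approx 0.2833 \pm 0.0510$
[Figure: 95% confidence interval from approximately 23.2% to 33.4%]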
Sample Means
The data below are the heights in cm, of a population of 100, 15 year old students
165 161 170 182 176 185 180 155 154 166
165 152 174 167 165 171 172 150 181 165
166 161 174 158 166 168 164 150 155 170
168 144 164 154 177 173 178 158 165 175
180 174 152 167 148 175 153 162 180 175
157 172 155 140 147 160 152 166 168 158
153 165 160 143 166 167 167 163 158 160
150 157 172 167 184 172 165 159 158 177
179 174 156 178 165 179 174 148 175 166
157 159 163 165 162 153 145 170 176 180

From the list above, the Mean of the Population is $\mu = \frac{\sum x}{n} = 164.72$

From the list above, the Standard Deviation of the Population is $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}} = 10.21$
It does not matter what the original distribution is: for large samples, the sample means will always be approximately normally distributed. Use Java applets to demonstrate this.

[Figure: a single sample of 5 data points and a single sample of 10 data points. The black arrows are the data points; the red dot is the mean of the sample.]
Naturally, if we choose a sample size of 100 (the original population size) the mean of the sample will be the same as the mean of the population.

As the sample size increases the standard error will decrease.
Why? ……………
The sample means are normally distributed.
For a sample size of 30:
1. $\mu_{\bar{x}} \approx \mu$
2. $\sigma_{\bar{x}} \approx \frac{\sigma}{\sqrt{n}}$

[Figure: a population with six samples (Sample 1 to Sample 6) drawn from it]
Summary
[Table: Mean and Standard Deviation (Standard Error) for the Population, a Large Sample and the Sample Means]
In practice, from the table above, we can say that for n ≥ 30:
1. The sample means are normally distributed.
2. The mean of the sample means is the same as the population mean: $\mu_{\bar{x}} = \mu$
3. The standard deviation of the sample means is equal to $\frac{\sigma}{\sqrt{n}}$; this is called the standard error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
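The three summary statements can be explored with a small simulation. The sketch below is illustrative only: the skewed population is randomly generated (it is not the heights data from the slides), samples are drawn with replacement, and the seed is fixed so that the run is repeatable.

```python
import math
import random
from statistics import fmean, pstdev

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical skewed population (not the heights data from the slides).
population = [random.expovariate(1.0) for _ in range(10_000)]
mu, sigma = fmean(population), pstdev(population)

n = 30  # sample size
sample_means = [
    fmean(random.choices(population, k=n))  # one sample, drawn with replacement
    for _ in range(5_000)
]

print("population mean      ", round(mu, 3))
print("mean of sample means ", round(fmean(sample_means), 3))
print("sigma / sqrt(n)      ", round(sigma / math.sqrt(n), 3))
print("sd of sample means   ", round(pstdev(sample_means), 3))
```

The first pair of printed values and the second pair should each be close to one another, even though the population itself is far from normal.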
KEY IDEA: see the sampling distribution simulation at
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
In the Standard Normal Distribution we want the value of z1 such that 95% of the population lies in the interval -z1 ≤ z ≤ z1.
[Figure: standard normal curve with area 0.95 between -z1 and z1 and area 0.025 in each tail]
P(z ≤ z1) = 0.95 + 0.025 = 0.975
⇒ z1 = 1.96 and -z1 = -1.96
Therefore in a Normal Distribution 95% of the population lies within 1.96 standard deviations of the mean.
Since the sample means are normally distributed, 95% of sample means lie within 1.96 standard errors of μ (the population mean).
$\therefore \bar{x} - 1.96\sigma_{\bar{x}} < \mu < \bar{x} + 1.96\sigma_{\bar{x}}$
As $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, the confidence limits are $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$
Example 1
A random sample of 250 cars was taken; the mean age of the cars was 4.5 years and the standard deviation was 2.2 years.
(i) Find the 95% confidence interval for the mean age of all cars.
(ii) What size sample is required to estimate the mean age, with 95% confidence, to within ±0.3 years?

(i) The confidence limits are $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$
$= 4.5 \pm 1.96\frac{2.2}{\sqrt{250}}$
4.5 - 1.96(0.139) < μ < 4.5 + 1.96(0.139)
4.23 < μ < 4.77
This means that we can say with 95% confidence that the mean age of all cars in the population is between 4.23 years and 4.77 years.

(ii) $\pm 1.96\frac{\sigma}{\sqrt{n}} = \pm 0.3$
$1.96\frac{2.2}{\sqrt{n}} = 0.3$
$\Rightarrow \sqrt{n} = \frac{(1.96)(2.2)}{0.3} = 14.373$
$\Rightarrow n = (14.373)^2 = 206.6$, so a sample of 207 cars is required.
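Both parts of this example follow from the same expression, 1.96σ/√n. A minimal Python sketch is given below; the function names `mean_ci` and `sample_size_for_mean` are assumptions made for illustration.

```python
import math

def mean_ci(x_bar, sigma, n, z=1.96):
    """Confidence limits x_bar ± z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def sample_size_for_mean(sigma, margin, z=1.96):
    """Smallest n for which z * sigma / sqrt(n) <= margin."""
    return math.ceil((z * sigma / margin) ** 2)

lower, upper = mean_ci(4.5, 2.2, 250)   # part (i)
print(round(lower, 2), round(upper, 2))
print(sample_size_for_mean(2.2, 0.3))   # part (ii)
```

The sample size is rounded up because a fraction of a car cannot be sampled and rounding down would leave the margin slightly above 0.3 years.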
Example 2
A random sample of 144 male students in a large university was taken and their heights measured.
The mean height was 175 cm. The standard deviation of all the male students in the university was 9 cm.
(i) Give a 95% confidence interval for the mean height of all the male students.
(ii) Show that the width of the confidence interval would decrease if the sample size was 225 instead of 144.

(i) n = 144, $\bar{x}$ (mean of the sample) = 175, σ (standard deviation of the population) = 9, μ (population mean) is unknown.
We calculate the standard error of the mean using $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$\sigma_{\bar{x}} = \frac{9}{\sqrt{144}} = 0.75$
As the sample size is large, the best estimate of μ is $\bar{x}$, which is 175 cm.
Now we have to give a range of values in which the true population mean (μ) lies.
This will be with a 95% level of certainty.
$\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}$
175 - 1.96(0.75) ≤ μ ≤ 175 + 1.96(0.75)
173.53 ≤ μ ≤ 176.47
The true population mean lies within the range 173.53 cm to 176.47 cm with 95% certainty.
(ii) If a sample of 225 were taken the standard error would be
$\sigma_{\bar{x}} = \frac{9}{\sqrt{225}} = 0.6$
$\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}$
175 - 1.96(0.6) ≤ μ ≤ 175 + 1.96(0.6)
173.82 ≤ μ ≤ 176.18
The true population mean lies within the range 173.82 cm to 176.18 cm with 95% certainty.
This is narrower than the previous confidence interval.
As you increase the sample size you decrease the width of the confidence interval.
A study addressed the issue of whether pregnant women can correctly guess the sex of their baby.
Among 104 recruited subjects, 57 correctly guessed the sex of the baby.
Use these sample data to test the claim that the success rate of such guesses is no different from the 50% success rate expected with random chance guesses. Use a 5% significance level.
(based on data from “Are Women Carrying ‘Basketballs’ Really Having Boys? Testing Pregnancy Folklore,” by Perry, DiPietro, and Constigan, Birth, Vol. 26, No. 3)

Solution:
The original claim is that the success rate is no different from 50%.
H0: p = 0.5
H1: p ≠ 0.5
$\hat{p} = \frac{57}{104} = 0.548$
$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} = \frac{0.548 - 0.50}{\sqrt{\frac{(0.5)(0.5)}{104}}} = 0.98$
At the 5% level of significance the critical values are ±1.96.
As 0.98 is between -1.96 and 1.96 we fail to reject the null hypothesis.

There is not sufficient evidence to warrant rejection of the claim that women who guess the sex of their babies have a success rate equal to 50%.
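The test statistic in this solution can be reproduced with a short Python sketch. The function name `proportion_z` is hypothetical; the critical value 1.96 is the two-tailed 5% value used throughout these slides.

```python
import math

def proportion_z(successes, n, p0):
    """Test statistic z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

z = proportion_z(57, 104, 0.5)
reject = abs(z) > 1.96  # two-tailed test at the 5% level of significance
print(round(z, 2), "reject H0" if reject else "fail to reject H0")
```

Note that the standard error here uses the hypothesised proportion p0 rather than the sample proportion, matching the formula in the solution.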
Your Turn
A survey was carried out to find the weekly rental costs of holiday apartments in a certain country.
A random sample of 400 apartments was taken. The mean of the sample was €320 and the standard deviation was €50.
Form a 95% confidence interval for the mean weekly rental costs of holiday apartments in that country.

The confidence limits are $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$
$= 320 \pm 1.96\frac{50}{\sqrt{400}}$
320 - 1.96(2.5) < μ < 320 + 1.96(2.5)
315.1 < μ < 324.9
Between €315.10 and €324.90
Night 3
Hypothesis Testing

Often we need to make a decision about a population based on a sample.

In a trial you are presumed innocent until proven guilty.
Assuming that an accused person is innocent (nothing has happened) is called a NULL HYPOTHESIS (H0).
Assuming that an accused person is not innocent is called an ALTERNATIVE HYPOTHESIS (H1).

1. Is a coin which is tossed biased if we get a run of 8 heads in 10 tosses?
Assuming that the coin is not biased is called a NULL HYPOTHESIS (H0).
Assuming that the coin is biased is called an ALTERNATIVE HYPOTHESIS (H1).

2. During a 5 minute period a new machine produces fewer faulty parts than an old machine.
Assuming that the new machine is no better than the old one is called a NULL HYPOTHESIS (H0).
Assuming that the new machine is better than the old one is called an ALTERNATIVE HYPOTHESIS (H1).

3. Does a new drug for Hay-Fever work effectively?
Assuming that there is no difference between the new drug and the current drug is called a NULL HYPOTHESIS (H0).
Testing the Null Hypothesis using z-values

A Two Tailed Test.
[Figure: standard normal curve with critical values -1.96 and 1.96. Between them: fail to reject H0. In each tail (2.5% of the area each): reject H0.]
The critical values for a 5% level of significance are z = -1.96 or z = 1.96.

The statistical method used to determine whether H0 is true or not is called HYPOTHESIS TESTING.
Statisticians speak of “not accepting or accepting H0 at a certain level”. This level is called the LEVEL OF SIGNIFICANCE. (The 5% level of significance is on the syllabus.)

If the value of z lies outside the range -1.96 < z < 1.96 (that is, in the critical region) we reject H0.
Suppose we take a large sample of size n from a population with a mean of $\mu$ and a standard deviation of $\sigma$.
We calculate the mean of the sample, $\bar{x}$. (For large samples, the sampling distribution of $\bar{x}$ has mean $\mu_{\bar{x}} = \mu$.)
We can also calculate the standard error $\sigma_{\bar{x}}$ by using $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
We want to test the hypothesis that the sample comes from a population with a
particular value of $\mu$, called $\mu_0$.

Step 1. State the null and alternative hypotheses.


Null Hypothesis H0: $\mu = \mu_0$
Alternative Hypothesis H1: $\mu \neq \mu_0$
Note 1: We are not using $\mu > \mu_0$ or $\mu < \mu_0$; no direction is stipulated.
Therefore this is a two-tailed test. (Only the two-tailed test is on the Leaving Cert. course.)
Note 2: The null hypothesis always has an equals sign and uses population parameters.
Step 2. Convert the observed results into z units. (Calculate the test statistic.)
The test statistic is a Standard Normal Z score with $Z = \frac{X - \mu}{\sigma}$.
As we are dealing with the sampling distribution of the mean,
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$.
(This is the difference between the value we have observed from our sample
and the hypothesised value from the population, divided by the standard error.)
$Z = \frac{\text{Observed Value} - \text{Hypothesised Value}}{\text{Standard Error}}$

Step 3. Write down the critical values. ( a sketch also helps ) .


Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
Once we have the value of Z, we compare it to our critical values and decide
whether or not to reject the null hypothesis.
Review of the steps involved
in Hypothesis Testing:
1. Write down the null hypothesis H0 and the alternative hypothesis H1

2. Convert the observed results into z units. (Calculate the test statistic).

3. Write down the critical values. (a sketch also helps).

4. Reject H0 if z is in the critical regions, otherwise fail to reject H0.
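The four steps can be sketched as a small Python function. This is an illustration under stated assumptions rather than part of the course material: the name `two_tailed_z_test` and its parameters are hypothetical, the critical value defaults to 1.96 for the 5% level, and the figures passed in are those of the pen example in this section.

```python
from math import sqrt

def two_tailed_z_test(sample_mean, mu_0, sigma, n, critical=1.96):
    """Return (z, reject) for H0: mu = mu_0 against H1: mu != mu_0."""
    # Step 2: test statistic = (observed - hypothesised) / standard error
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu_0) / standard_error
    # Steps 3 and 4: reject H0 when z falls outside -critical < z < critical
    reject = abs(z) > critical
    return z, reject

# Pen example: sample mean 497 hours from 81 pens, claimed mean 500, sd 10
z, reject = two_tailed_z_test(497, 500, 10, 81)
print(f"z = {z:.2f}, reject H0: {reject}")
```

Step 1, stating the hypotheses in words, still has to be done by hand; the function only covers the arithmetic and the comparison with the critical values.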


Example 1
A company manufactures pens with a mean writing life of 500 hours and
a standard deviation of 10 hours. A retailer examines a sample of 81 pens from
a supplier who claims to only sell pens from this company and finds their mean
life is 497 hours. Are these pens genuine products from the company?

Step 1. State the null and alternative hypotheses.


Null Hypothesis H0: The sample of pens are genuine products from the company. $\mu = 500$
Alternative Hypothesis H1: The sample of pens are not genuine products from the company. $\mu \neq 500$
Note: If we are not given a level of significance, we must write it down: 5% (the only level on the Leaving Cert. course).

Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{497 - 500}{10 / \sqrt{81}} = -2.7$
Step 3. Write down the critical values. (A sketch also helps.)

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = -2.7 is in the lower reject region.]
Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
We reject the null hypothesis as -2.7 is in the reject region.
This means that there is sufficient evidence, at the 5% level, to conclude that the pens are not genuine.
Example 2
A tyre company claims that the mean life of tyres that it produces is 11,000 miles
with a standard deviation of 552 miles. An independent supplier of tyres wants to investigate
the company's claim. A test on a random sample of 36 tyres from the company gave a mean
life of 10,000 miles.
Carry out a hypothesis test using a significance level of 5% to see if there is evidence to support
the company's claim.

Step 1. State the null and alternative hypotheses.


Null Hypothesis H0: The company produces tyres with a mean life of 11,000 miles. $\mu = 11{,}000$
Alternative Hypothesis H1: The company produces tyres whose mean life is not 11,000 miles. $\mu \neq 11{,}000$

Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{10{,}000 - 11{,}000}{552 / \sqrt{36}} = -10.87$
Step 3. Write down the critical values. (A sketch also helps.)

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = -10.87 is in the lower reject region.]
Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
We reject the null hypothesis as -10.87 is in the reject region.
We can conclude that there is evidence to suggest that the company's claim is not true.
Your Turn
Question 1
A neurologist is testing the effect of a drug on response time by injecting 36 rats
with a unit dose of a new drug.
The neurologist measures the response time of each rat to a stimulus.
The neurologist knows that the mean response time for rats not injected is 0.75 seconds.
The mean of the 36 injected rats' response time is 0.6 seconds with a standard deviation of 0.2 seconds.
Can you conclude that the drug has an effect on response time?
Question 1: Solution

Step 1. State the null and alternative hypotheses.


Null Hypothesis H0: The drug has no effect. $\mu = 0.75$ seconds
Alternative Hypothesis H1: The drug has an effect. $\mu \neq 0.75$ seconds

Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
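$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{0.6 - 0.75}{0.2 / \sqrt{36}} = -4.5$
Note: we are approximating $\sigma$ with the sample standard deviation, 0.2, as we don't know $\sigma$.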
Step 3. Write down the critical values. (A sketch also helps.)

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = -4.5 is in the lower reject region.]
Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
We reject the null hypothesis as -4.5 is in the reject region.
We can conclude that there is evidence to suggest that the drug has an effect on response time.
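In this solution the population standard deviation is unknown, so the sample standard deviation stands in for it. A short Python sketch of that substitution follows; the variable names are illustrative, and treating the sample value as the population value is the same large-sample approximation the solution itself makes.

```python
from math import sqrt

n = 36
sample_mean = 0.6   # mean response time of the injected rats (seconds)
sample_sd = 0.2     # sample standard deviation, used in place of sigma
mu_0 = 0.75         # known mean response time for rats not injected

# Estimated standard error: sample sd divided by the square root of n
standard_error = sample_sd / sqrt(n)
z = (sample_mean - mu_0) / standard_error
print(f"standard error = {standard_error:.4f}, z = {z:.2f}")
```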
Example 2
In an examination taken by a large number of students the mean mark was 51.5 and the
standard deviation was 8.5. In a random sample of 49 students in a particular town,
it was found that among the students in this town the mean mark was 50.
At the 5% level of significance, investigate if there is evidence to conclude that the
students of this town did as well as students in general.

STEP 1. State the null and alternative hypotheses.


Null Hypothesis H0: The students in this town did as well as all other students. $\mu = 51.5$
Alternative Hypothesis H1: The students in this town did not do as well as all other students. $\mu \neq 51.5$

Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{50 - 51.5}{8.5 / \sqrt{49}} = -1.24$
Step 3. Write down the critical values. (A sketch also helps.)

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = -1.24 is in the fail to reject region.]

Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.

We fail to reject the null hypothesis as -1.24 is in the fail to reject region.
There is not sufficient evidence to conclude that the students in this town
performed differently from students in general.
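The same decision can be viewed on the original scale of marks. As a hedged Python sketch (the variable names are illustrative), the critical values -1.96 and 1.96 are converted back into the range of sample means that would not lead to rejecting H0, and the observed mean of 50 is checked against that range.

```python
from math import sqrt

mu_0, sigma, n = 51.5, 8.5, 49
sample_mean = 50

# 1.96 standard errors either side of the hypothesised mean
margin = 1.96 * sigma / sqrt(n)
low, high = mu_0 - margin, mu_0 + margin

# A sample mean inside (low, high) corresponds to -1.96 < z < 1.96
fail_to_reject = low < sample_mean < high
print(f"{low:.2f} to {high:.2f}; fail to reject H0: {fail_to_reject}")
```

A sample mean of 50 lies inside the range 49.12 to 53.88, which agrees with z = -1.24 lying between the critical values.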
Example 3
The weights of newborn babies in Ireland are known to have a mean of 3.42 kg
and a standard deviation of 0.9 kg. Assuming that the weights are normally distributed,
a random sample of 500 babies whose mothers smoked heavily during pregnancy is taken.
If the mean weight of this sample is 3.28 kg, can we conclude at the 5% significance level
that heavy smoking by mothers during pregnancy has an effect on the weight of their babies at birth?

STEP 1. State the null and alternative hypotheses.


Null Hypothesis H0: Heavy smoking during pregnancy by mothers has no effect on the weight
of their babies at birth. $\mu = 3.42$ kg
Alternative Hypothesis H1: Heavy smoking during pregnancy by mothers has an effect on the weight
of their babies at birth. $\mu \neq 3.42$ kg

Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{3.28 - 3.42}{0.9 / \sqrt{500}} = -3.48$
Step 3. Write down the critical values. (A sketch also helps.)

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = -3.48 is in the lower reject region.]
Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
We reject the null hypothesis as -3.48 is in the reject region.
We can conclude that there is evidence to suggest that babies' weights are
affected if their mothers smoke heavily during pregnancy.
p - value
Instead of comparing the value of our test statistic to the critical values, we can get a specific p-value
for our test statistic by looking up its value on the tables.
The p-value measures the strength of the evidence in the data against the null hypothesis.
The smaller the p-value, the less likely it is that the sample results come from a situation
where the null hypothesis is true.

p-value at the 5% Significance Level

If $p \leq 0.05$: There is sufficient evidence to reject the null hypothesis H0 (if p is low, H0 must go).
If $p > 0.05$: There is not sufficient evidence to reject the null hypothesis H0, so we fail to reject it.
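The rule above can be sketched in Python. This is an illustration, not the slides' own method: it assumes Python 3.8 or later for `statistics.NormalDist`, the function names `two_tailed_p_value` and `decision` are hypothetical, and the z-score used is the one from the pen example earlier in this section.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Probability, if H0 is true, of a z-score at least as extreme as z."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail  # both tails count in a two-tailed test

def decision(p, alpha=0.05):
    """Apply the rule: reject H0 when p is at most alpha."""
    return "reject H0" if p <= alpha else "fail to reject H0"

p = two_tailed_p_value(-2.7)  # z from the pen example
print(f"p = {p:.4f}: {decision(p)}")
```

Comparing p with 0.05 always gives the same decision as comparing z with the critical values -1.96 and 1.96; the p-value simply adds a measure of how strong the evidence is.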
Example 1
Medical consultants for large companies are concerned about the effects of stress
on company executives. The mean systolic blood pressure for males aged
35 to 44 years of age is, according to national health statistics, 128 with
a standard deviation of 15. A sample of 72 male executives in this age
group was selected from companies. Their mean blood pressure was 130.
(i) Construct a 95% confidence interval for the mean systolic blood pressure
for the executives. Interpret this interval.
(ii) Carry out a hypothesis test using a significance level of 5% to see if there
is evidence to suggest that the mean systolic blood pressure for executives
is different to the national average. Clearly state the null and alternative
hypothesis and your conclusion. Give a p-value for this hypothesis test
and interpret this p-value.
(i) $n = 72$, $\sigma = 15$, $\bar{x} = 130$
95% confidence interval: $\bar{x} \pm 1.96 \left( \frac{\sigma}{\sqrt{n}} \right)$
95% confidence interval: $130 \pm 1.96 \left( \frac{15}{\sqrt{72}} \right)$
$130 \pm 3.46$
$[126.54, 133.46]$
This means that the mean systolic blood pressure ($\mu$) for all male executives aged 35 to 44
in large companies lies in the range 126.54 to 133.46, with 95% confidence.
This range includes the national average of 128.

(ii) Carry out a hypothesis test.


STEP 1. State the null and alternative hypotheses.
Null Hypothesis H0: The mean systolic blood pressure for male executives in the age group 35-44
is the same as the national average. $\mu = 128$
Alternative Hypothesis H1: The mean systolic blood pressure for male executives in the age group 35-44
is not the same as the national average. $\mu \neq 128$
Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{130 - 128}{15 / \sqrt{72}} = 1.13$
Step 3. Write down the critical values. ( a sketch also helps ) .

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = 1.13 is in the fail to reject region.]
Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
We fail to reject the null hypothesis as 1.13 is in the fail to reject region.
There is not sufficient evidence to suggest that
the mean systolic blood pressure for male executives in the age group 35-44
differs from the national average of 128.
Step 5. p-value in a Two Tailed Test.
The probability of getting a value > 1.13, from the tables, is 1 - 0.8708 = 0.1292.
The probability of getting a value < -1.13 is also 0.1292.
The p-value is the sum of these two probabilities = 2(0.1292) = 0.2584
This p-value is greater than 0.05,
so there is not enough evidence to reject the null hypothesis.
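The table lookup in Step 5 can be mirrored in Python. The sketch below assumes Python 3.8 or later for `statistics.NormalDist` and keeps z rounded to 1.13, as it is when read from the tables; the variable names are illustrative.

```python
from statistics import NormalDist

z = 1.13  # test statistic, rounded to two decimal places as in the tables

upper_tail = 1 - NormalDist().cdf(z)   # P(Z > 1.13)
lower_tail = NormalDist().cdf(-z)      # P(Z < -1.13), equal by symmetry
p_value = upper_tail + lower_tail

print(f"{upper_tail:.4f} + {lower_tail:.4f} = {p_value:.4f}")
```

The result may differ from 0.2584 in the fourth decimal place, because the tables round each tail probability before it is doubled.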

Two things to note:


1. The p-value answers the question: if the null hypothesis is true, what is the
probability that a sample mean lands at least this far from the value
we expected (128) through random sampling variation alone? A p-value of 0.26 means in this
case that there is a 26% chance that the mean blood pressure of a
sample of this size will be 2 or more units (130 - 128 = 2) away from the population mean,
just because of random variation in sampling.
This is not enough evidence to reject the null hypothesis. The 5%
level of significance means that we only reject the null hypothesis
if that probability is 5% or less. At 26%, a difference this large
arises too easily by chance.

2. The tail probability is doubled to get the p-value because we are doing
a two-tailed test.
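The meaning of the 26% figure in point 1 can be illustrated by simulation. The Python sketch below rests on assumptions that are not in the slides: individual blood pressures are drawn from a normal distribution with mean 128 and standard deviation 15 (that is, H0 is taken to be true), 5000 samples of 72 are generated, and the seed is fixed only to make the run repeatable.

```python
import random

random.seed(1)  # fixed seed so that the run is repeatable

mu_0, sigma, n = 128, 15, 72
trials = 5000
extreme = 0

for _ in range(trials):
    # Draw one sample of 72 readings from a population where H0 is true
    sample = [random.gauss(mu_0, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    # Count sample means that are 2 or more units away from 128
    if abs(sample_mean - mu_0) >= 2:
        extreme += 1

proportion = extreme / trials
print(f"Proportion of samples at least 2 units from 128: {proportion:.3f}")
```

The proportion should come out close to the p-value of about 0.26, although the exact figure depends on the seed and the number of trials.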


Example 2
A new diet is advertised with the claim that participants will lose an average of 4 kg during the
first week on this diet. A random sample of 40 people on this diet showed a mean weight loss
of 3.6 kg, with a standard deviation of 1 kg.
(i) Calculate a 95% confidence interval for the mean weight loss of all participants on this diet.
Interpret this interval.
(ii) Test the claim made in the advertisement for this diet at a 5% level of significance.
Clearly state your null and alternative hypotheses and your conclusion.
Give a p-value for this hypothesis test and interpret this p-value.

(i) $n = 40$, $\sigma = 1$, $\bar{x} = 3.6$
95% confidence interval: $\bar{x} \pm 1.96 \left( \frac{\sigma}{\sqrt{n}} \right)$
95% confidence interval: $3.6 \pm 1.96 \left( \frac{1}{\sqrt{40}} \right)$
$3.6 \pm 0.31$
$[3.29, 3.91]$
This means that the mean weight loss ($\mu$) lies in the range 3.29 kg to 3.91 kg, with 95% confidence.
This range does not include the weight loss (4 kg) as advertised.
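Checking whether the advertised value falls inside the 95% confidence interval gives the same verdict as the two-tailed test at the 5% level in part (ii), since both use the same standard error and the same multiplier of 1.96. A minimal Python sketch of that check, with illustrative variable names:

```python
from math import sqrt

n, sample_mean, sd = 40, 3.6, 1.0
claimed_mean = 4.0  # weight loss stated in the advertisement (kg)

margin = 1.96 * sd / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

# If the claimed mean lies outside the interval, a two-tailed
# test at the 5% level would reject it
claim_inside = lower <= claimed_mean <= upper
print(f"[{lower:.2f}, {upper:.2f}] contains 4 kg: {claim_inside}")
```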
(ii) Carry out a hypothesis test.
Step 1. State the null and alternative hypotheses.
Null Hypothesis H0: The average weight loss during the first week of this diet is 4 kg. $\mu = 4$ kg
Alternative Hypothesis H1: The average weight loss during the first week of this diet is not 4 kg. $\mu \neq 4$ kg

Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{3.6 - 4}{1 / \sqrt{40}} = -2.53$
Step 3. Write down the critical values. (A sketch also helps.)

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = -2.53 is in the lower reject region.]
Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
We reject the null hypothesis as -2.53 is in the reject region.
We can conclude that there is evidence to suggest that
the average weight loss during the first week of this diet is not 4 kg,
so the advertising claim does not seem to be true.
Step 5. p-value in a Two Tailed Test.
The probability of getting a value > 2.53, from the tables, is 1 - 0.9943 = 0.0057, or 0.006 to three decimal places.
The probability of getting a value < -2.53 is also 0.006.
The p-value is the sum of these two probabilities = 2(0.006) = 0.012

The p-value is very small: if the mean weight loss really were 4 kg, there would be only
about a 1.2% chance of a sample mean this far from 4 kg arising from sampling variability alone. This is
strong evidence for rejecting the company's claim.
Your Turn
Question 1
The mean hourly wage in an EU country is €10. A sample of 35 individuals in the capital city
of the country has a mean hourly wage of €10.83 with a standard deviation of €3.35 per hour.
(i) Construct a 95% confidence interval for the mean hourly wage in the capital city.
Interpret this interval.
(ii) Is there evidence to suggest that hourly wages for workers in the capital city are
different from the national hourly wage?
Test the hypothesis using a 5% level of significance.
Clearly state the null and alternative hypotheses and your conclusion.
Give a p-value for this hypothesis test and interpret this p-value.
Question 1: Solution
(i) $n = 35$, $\sigma = 3.35$, $\bar{x} = 10.83$
95% confidence interval: $\bar{x} \pm 1.96 \left( \frac{\sigma}{\sqrt{n}} \right)$
95% confidence interval: $10.83 \pm 1.96 \left( \frac{3.35}{\sqrt{35}} \right)$
$10.83 \pm 1.11$
$[9.72, 11.94]$
This means that the mean hourly wage ($\mu$) for workers in the capital city lies in the range €9.72 to €11.94,
with 95% confidence.
This range includes the mean hourly rate for the country (€10).
(ii) Carry out a hypothesis test.

Step 1. State the null and alternative hypotheses.
Null Hypothesis H0: The average hourly wage for a worker in the capital is the same as that of a worker
in the rest of the country. $\mu$ = €10
Alternative Hypothesis H1: The average hourly wage for a worker in the capital is not the same as that
of a worker in the rest of the country. $\mu \neq$ €10

Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{10.83 - 10}{3.35 / \sqrt{35}} = 1.466$
Step 3. Write down the critical values. (A sketch also helps.)

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = 1.466 is in the fail to reject region.]
Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
We fail to reject the null hypothesis as 1.466 is in the fail to reject region.
There is not sufficient evidence to suggest that
the hourly wage for workers in the capital differs from that in the rest of the country.
Step 5. p-value in a Two Tailed Test.
The probability of getting a value > 1.466, from the tables, is 1 - 0.9286 = 0.0714
The probability of getting a value < -1.466 is also 0.0714.
The p-value is the sum of these two probabilities = 2(0.0714) = 0.1428
This p-value is greater than 0.05,
so there is not enough evidence to reject the null hypothesis.
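Parts (i) and (ii) of this solution can be reproduced together. The Python sketch below is an illustration only; it assumes Python 3.8 or later for `statistics.NormalDist`, uses the sample standard deviation in place of the population value as the solution does, and its variable names are not from the slides.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sd, mu_0 = 35, 10.83, 3.35, 10

standard_error = sd / sqrt(n)

# Part (i): 95% confidence interval for the mean hourly wage
interval = (sample_mean - 1.96 * standard_error,
            sample_mean + 1.96 * standard_error)

# Part (ii): test statistic and two-tailed p-value
z = (sample_mean - mu_0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"95% CI: {interval[0]:.2f} to {interval[1]:.2f}")
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

The computed p-value can differ slightly from 0.1428 in the later decimal places, because the worked solution reads a rounded probability from the tables.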
Question 2
A machine filling bottles of natural mineral water is set to deliver 0.725 litres with a
standard deviation of 0.01 litres. A sample of 50 bottles is checked and the mean quantity
is found to be 0.721 litres.
At the 5% level of significance, investigate if there is evidence to suggest that the mean
of this sample is different from the expected mean of 0.725 litres.
Question 2: Solution

STEP 1. State the null and alternative hypotheses.


Null Hypothesis H0: The mean volume delivered is the same as the expected volume.
$\mu = 0.725$ litres
Alternative Hypothesis H1: The mean volume delivered is not the same as the expected volume.
$\mu \neq 0.725$ litres

Step 2. Convert the observed results into z units. ( Calculate the test statistic ) .
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{0.721 - 0.725}{0.01 / \sqrt{50}} = -2.83$
Step 3. Write down the critical values. (A sketch also helps.)

[Sketch: standard normal curve with critical values z = -1.96 and z = 1.96 (2.5% in each tail). z = -2.83 is in the lower reject region.]

Step 4. Reject H0 if Z is in the critical regions, otherwise fail to reject H0.
We reject the null hypothesis as -2.83 is in the reject region.
We can conclude that there is evidence to suggest that
the mean volume delivered is not the same as the expected volume of 0.725 litres.
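As a final check, the test statistics for every example and question in this section can be recomputed in one pass. The Python sketch below is an illustration rather than part of the slides; the labels are shorthand, and each tuple holds the sample mean, hypothesised mean, standard deviation and sample size quoted in the corresponding problem.

```python
from math import sqrt

# (label, sample mean, hypothesised mean, standard deviation, sample size)
examples = [
    ("pens", 497, 500, 10, 81),
    ("tyres", 10_000, 11_000, 552, 36),
    ("rats", 0.6, 0.75, 0.2, 36),
    ("exam marks", 50, 51.5, 8.5, 49),
    ("birth weights", 3.28, 3.42, 0.9, 500),
    ("blood pressure", 130, 128, 15, 72),
    ("diet", 3.6, 4, 1, 40),
    ("wages", 10.83, 10, 3.35, 35),
    ("bottles", 0.721, 0.725, 0.01, 50),
]

results = {}
for label, x_bar, mu_0, sd, n in examples:
    z = (x_bar - mu_0) / (sd / sqrt(n))
    results[label] = round(z, 2)
    verdict = "reject H0" if abs(z) > 1.96 else "fail to reject H0"
    print(f"{label:15s} z = {z:7.2f}  {verdict}")
```

The three cases where H0 is not rejected (exam marks, blood pressure and wages) are exactly those whose z-scores lie between -1.96 and 1.96.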
