Estadistica Basica - bkb9 PDF
9 Data Analysis
Worked Example 1
Find
(a)
the mean
(b)
the median
(c)
the mode
(d)
the range
Solution
(a)
The mean is
\[ \frac{5+6+2+4+7+8+3+5+6+6}{10} = \frac{52}{10} = 5.2 . \]
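(b)
In order, the numbers are 2, 3, 4, 5, 5, 6, 6, 6, 7, 8. There are 10 values, so the median is the mean of the 5th and 6th values:
\[ \text{median} = \frac{5+6}{2} = \frac{11}{2} = 5.5 . \]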
(c)
From the list above it is easy to see that 6 appears more than any other number, so
mode = 6 .
(d)
The range is the difference between the smallest and largest numbers, in this case
2 and 8. So the range is 8 − 2 = 6 .
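The four measures in this worked example can be checked with a short Python sketch. It is only an illustration using the standard `statistics` module; the variable names are chosen here and are not part of the original text.

```python
from statistics import mean, median, multimode

# The ten numbers used in the worked example
data = [5, 6, 2, 4, 7, 8, 3, 5, 6, 6]

data_mean = mean(data)              # total of the values divided by how many there are
data_median = median(data)          # middle value (mean of the two middle values here)
data_modes = multimode(data)        # list of the most frequent value(s)
data_range = max(data) - min(data)  # largest value minus smallest value

print(data_mean, data_median, data_modes, data_range)
```

`multimode` returns a list because a set of numbers can have more than one mode.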
9.1
Worked Example 2
Five people play golf and at one hole their scores are
3, 4, 4, 5, 7.
For these scores, find
(a)
the mean
(b)
the median
(c)
the mode
(d)
the range .
Solution
(a)
The mean is
\[ \frac{3+4+4+5+7}{5} = \frac{23}{5} = 4.6 . \]
(b)
median = 4 .
(c)
mode = 4 .
(d)
The range is the difference between the smallest and largest numbers, in this case
3 and 7, so
range = 7 − 3
= 4.
Exercises
1.
2.
Find the mean, median, mode and range of each set of numbers below.
(a)
3, 4, 7, 3, 5, 2, 6, 10
(b)
(c)
17, 18, 16, 17, 17, 14, 22, 15, 16, 17, 14, 12
(d)
(e)
(f)
Twenty children were asked their shoe sizes. The results are given below.
8, 6, 7, 6, 5, 4½, 7½, 6½, 8½, 10,
7, 5, 5½, 8, 9, 7, 5, 6, 8½
(b)
the median
the mean
(c)
the mode
3.
4.
5.
6.
Find
(i)
the mean
(ii)
the median
(iii)
the mode.
(b)
Which average would you use if you wanted to claim that the staff were:
(i)
well paid
(ii) badly paid?
(c)
Two people work in a factory making parts for cars. The table shows how many
complete parts they make in one week.
Worker   Mon   Tue   Wed   Thu   Fri
Fred     20    21    22    20    21
Harry    30    15    12    36    28
(a)
(b)
(c)
A gardener buys 10 packets of seeds from two different companies. Each pack
contains 20 seeds and he records the number of plants which grow from each pack.
Company A
20
20
20
20
20
20
20
Company B
17
18
15
16
18
18
17
15
17
18
(a)
Find the mean, median and mode for each company's seeds.
(b)
(c)
(d)
7.
(a)
(b)
If he scores 70 in his next test, does his mean score increase or decrease?
Find his new mean score.
(c)
Which has increased most, his mean score or his median score?
Richard keeps a record of the number of fish he catches over a number of fishing
trips. His records are:
1, 0, 2, 0, 0, 0, 12, 0, 2, 0, 0, 1, 18, 0, 2, 0, 1.
(a)
Why does he object to talking about the mode and median of the number of
fish caught?
8.
(b)
(c)
Richard's friend, Najir, also goes fishing. The mode of the number of fish
he has caught is also 0 and his range is 15.
What is the largest number of fish that Najir has caught?
A garage owner records the number of cars which visit his garage on 10 days.
The numbers are:
204, 310, 279, 314, 257, 302, 232, 261, 308, 217.
9.
(a)
(b)
The owner hopes that the mean will increase if he includes the number of
cars on the next day. If 252 cars use the garage on the next day, will the
mean increase or decrease?
The children in a class state how many children there are in their family.
The numbers they state are given below.
1, 2, 1, 3, 2, 1, 2, 4, 2, 2, 1, 3, 1, 2,
2, 2, 1, 1, 7, 3, 1, 2, 1, 2, 2, 1, 2, 3
(a)
(b)
10.
The mean number of people visiting Jane each day over a five-day period is 8.
If 10 people visit Jane the next day, what happens to the mean?
11.
The table shows the maximum and minimum temperatures recorded in six cities
one day last year.
City
Maximum
Minimum
Los Angeles
22 C
12 C
Boston
22 C
3 C
Moscow
18 C
9 C
Atlanta
27 C
8 C
Archangel
13 C
15 C
Cairo
28 C
13 C
(a)
(b)
(c)
Work out the difference between the maximum temperature and the
minimum temperature for Moscow.
(LON)
12.
13.
Here are the number of goals scored by a school football team in their matches this
term.
3, 2, 0, 1, 2, 0, 3, 4, 3, 2
(a)
(b)
14.
(a)
(b)
The 8 members of Nelson House tug of war team have a mean weight of
64 kilograms.
Which team do you think will win a tug of war between Hereward House
and Nelson House? Give a reason for your answer.
(MEG)
15.
Pupils in Year 8 are arranged in eleven classes. The class sizes are
23, 24, 24, 26, 27, 28, 30, 24, 29, 24, 27.
(a)
(b)
16.
What does this tell you about the class sizes in Year 9 compared with those
in Year 8?
(SEG)
A school has to select one pupil to take part in a General Knowledge Quiz.
Kim and Pat took part in six trial quizzes. The following lists show their scores.
Kim   28   24   21   27   24   26
Pat   33   19   16   32   34   18
(b)
Which pupil would you choose to represent the school? Explain the reason
for your choice, referring to the mean scores and ranges.
(MEG)
Information
The study of statistics was begun by an English mathematician, John Graunt (1620–1674).
He collected and studied the death records in various cities in Britain and, despite the fact
that people die randomly, he was fascinated by the patterns he found.
17.
(i)
(ii)
(b)
Do you think it is better to count all eight marks, or to count only the six
remaining marks? Use the means and the ranges to explain your answer.
(c)
The eight marks obtained by Tonya in the same competition have a mean
of 5.2 and a range of 0.6. Explain why none of her marks could be as high
as 5.9.
(MEG)
Worked Example 1
A football team keep records of the number of goals it scores per match during a season.
No. of Goals   Frequency
0              8
1              10
2              12
3              3
4              5
5              2
Solution
The table above can be used, with a third column added.

No. of Goals   Frequency         No. of Goals × Frequency
0              8                 0 × 8 = 0
1              10                1 × 10 = 10
2              12                2 × 12 = 24
3              3                 3 × 3 = 9
4              5                 4 × 5 = 20
5              2                 5 × 2 = 10
TOTALS         40                73
               (Total matches)   (Total goals)

The mean can now be calculated.
\[ \text{Mean} = \frac{73}{40} = 1.825 . \]
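The "third column" method can be sketched in Python as follows. This is a minimal illustration, assuming the frequency table is stored as (value, frequency) pairs; the function name is hypothetical and chosen here for clarity.

```python
def mean_from_frequency_table(table):
    """Return the mean of data given as (value, frequency) pairs."""
    total_frequency = sum(freq for _, freq in table)       # e.g. total matches
    total_value = sum(value * freq for value, freq in table)  # e.g. total goals
    return total_value / total_frequency

# Goals per match and how many matches had that score
goals_table = [(0, 8), (1, 10), (2, 12), (3, 3), (4, 5), (5, 2)]

goals_mean = mean_from_frequency_table(goals_table)
print(goals_mean)
```

Each product value × frequency corresponds to one entry of the third column, and the two sums are the totals in the bottom row.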
Worked Example 2
The bar chart shows how many cars were sold by a salesman over a period of time.
[Bar chart: Frequency (vertical axis) against Cars sold per day (horizontal axis)]
Solution
The data can be transferred to a table and a third column included as shown.
Cars sold daily   Frequency      Cars sold × Frequency
0                 2              0 × 2 = 0
1                 4              1 × 4 = 4
2                 3              2 × 3 = 6
3                 6              3 × 6 = 18
4                 3              4 × 3 = 12
5                 2              5 × 2 = 10
TOTALS            20             50
                  (Total days)

\[ \text{Mean} = \frac{50}{20} = 2.5 \]
Worked Example 3
A police station kept records of the number of road traffic accidents in their area each day
for 100 days. The figures below give the number of accidents per day.
1 4 3 5 5 2 5 4 3 2 0 3 1 2 2 3 0 5 2 1
3 3 2 6 2 1 6 1 2 2 3 2 2 2 2 5 4 4 2 3
3 1 4 1 7 3 3 0 2 5 4 3 3 4 3 4 5 3 5 2
4 4 6 5 2 4 5 5 3 2 0 3 3 4 5 2 3 3 4 4
1 3 5 1 1 2 2 5 6 6 4 6 5 8 2 5 3 3 5 4
9.2
Solution
The first step is to draw out and complete a tally chart. The final column shown below
can then be added and completed.
Number of Accidents   Frequency   Number of Accidents × Frequency
0                     4           0 × 4 = 0
1                     10          1 × 10 = 10
2                     22          2 × 22 = 44
3                     23          3 × 23 = 69
4                     16          4 × 16 = 64
5                     17          5 × 17 = 85
6                     6           6 × 6 = 36
7                     1           7 × 1 = 7
8                     1           8 × 1 = 8
TOTALS                100         323

\[ \text{Mean} = \frac{323}{100} = 3.23 . \]
Exercises
1.
A survey of 100 households asked how many cars there were in each household
The results are given below.
No. of Cars
Frequency
0
1
2
3
4
5
70
21
3
1
2.
The survey of question 1 also asked how many TV sets there were in each household. The results are given below.
No. of TV Sets   Frequency
0                2
1                30
2                52
3                8
4                5
5                3
3.
A manager keeps a record of the number of calls she makes each day on her mobile
phone.
Number of calls
per day
Frequency
12
10
14
A cricket team keeps a record of the number of runs scored in each over.
No. of Runs   Frequency
0             3
1             2
2             1
3             6
4             5
5             4
6             2
7             1
8             1
6.
(a)
(b)
How many times was the number of worms seen greater than the mean?
As part of a survey, a station recorded the number of trains which were late each
day. The results are listed below.
0 1 2 4 1 0 2 1 1 0
1 2 1 3 1 0 0 0 0 5
2 1 3 2 0 1 0 1 2 1
1 0 0 3 0 1 2 1 0 0
Construct a table and calculate the mean number of trains which were late each
day.
7.
Hannah drew this bar chart to show the number of repeated cards she got when she
opened packets of football stickers.
[Bar chart: Frequency (vertical axis) against Number of repeats (horizontal axis)]
9.
In a season a football team scored a total of 55 goals. The table below gives a
summary of the number of goals per match.
Goals per Match
Frequency
0
1
2
3
4
5
4
6
8
2
1
(a)
(b)
A traffic warden is trying to work out the mean number of parking tickets he has
issued per day. He produced the table below, but has accidentally rubbed out some
of the numbers.
Tickets per day
Frequency
0
1
2
3
4
5
6
TOTALS
1
10
7
20
2
26
72
10.
Copy and complete the frequency table below using a class interval of 10
and starting at 30.
Weight Range (w)
Tally
Frequency
30 w < 40
(b)
11.
(a)
(b)
Calculate the mean number of children per family. Show your working.
Describe two changes that have occurred in the number of children per
family since 1960.
(SEG)
Worked Example 1
The mean of a sample of 6 numbers is 3.2. An extra value of 3.9 is included in the
sample. What is the new mean?
9.3
Solution
Total of original numbers = 6 × 3.2
= 19.2
New total = 19.2 + 3.9
= 23.1
New mean = 23.1 ÷ 7
= 3.3
Worked Example 2
The mean number of a set of 5 numbers is 12.7. What extra number must be added to
bring the mean up to 13.1?
Solution
Total of the original numbers
= 5 × 12.7
= 63.5
Total of the new numbers = 6 × 13.1
= 78.6
Difference = 78.6 − 63.5
= 15.1
So the extra number is 15.1.
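Both worked examples rest on the same idea: total = mean × number of values. A minimal Python sketch of the two calculations is given below; the function names are hypothetical and are only used for illustration.

```python
def mean_after_adding(old_mean, count, new_value):
    """New mean when one extra value joins `count` values with mean `old_mean`."""
    old_total = old_mean * count
    return (old_total + new_value) / (count + 1)

def value_needed(old_mean, count, target_mean):
    """Extra value that moves the mean of `count` values to `target_mean`."""
    old_total = old_mean * count
    new_total = target_mean * (count + 1)
    return new_total - old_total

# Worked Example 1: 6 numbers with mean 3.2, extra value 3.9
new_mean = mean_after_adding(3.2, 6, 3.9)

# Worked Example 2: 5 numbers with mean 12.7, target mean 13.1
extra = value_needed(12.7, 5, 13.1)

print(round(new_mean, 2), round(extra, 2))
```

The results are rounded when printed because decimal values such as 3.2 are not stored exactly in floating-point arithmetic.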
Exercises
1.
The mean height of a class of 28 students is 162 cm. A new girl of height 149 cm
joins the class. What is the mean height of the class now?
2.
After 5 matches the mean number of goals scored by a football team per match is
1.8. If they score 3 goals in their 6th match, what is the mean after the 6th match?
3.
The mean number of children ill at a school is 3.8 per day, for the first 20 school
days of a term. On the 21st day 8 children are ill. What is the mean after 21 days?
4.
The mean weight of 25 children in a class is 58 kg. The mean weight of a second
class of 29 children is 62 kg. Find the mean weight of all the children.
5.
A salesman sells a mean of 4.6 conservatories per day for 5 days. How many must
he sell on the sixth day to increase his mean to 5 sales per day?
6.
Adrian's mean score for four tests is 64%. He wants to increase his mean to 68%
after the fifth test. What does he need to score in the fifth test?
7.
The mean salary of the 8 people who work for a small company is 15 000. When
an extra worker is taken on this mean drops to 14 000. How much does the new
worker earn?
8.
The mean of 6 numbers is 12.3. When an extra number is added, the mean changes
to 11.9. What is the extra number?
9.
When 5 is added to a set of 3 numbers the mean increases to 4.6. What was the
mean of the original 3 numbers?
10.
Three numbers have a mean of 64. When a fourth number is included the mean is
doubled. What is the fourth number?
Worked Example 1
The table below gives data on the heights, in cm, of 51 children.
Class Interval   140 ≤ h < 150   150 ≤ h < 160   160 ≤ h < 170   170 ≤ h < 180
Frequency        6               16              21              8
(a)
(c)
(b)
Solution
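(a)
Class Interval   Mid-point   Frequency   Mid-point × Frequency
140 ≤ h < 150    145         6           145 × 6 = 870
150 ≤ h < 160    155         16          155 × 16 = 2480
160 ≤ h < 170    165         21          165 × 21 = 3465
170 ≤ h < 180    175         8           175 × 8 = 1400
Totals                       51          8215

\[ \text{Mean} = \frac{8215}{51} \approx 161 \text{ (to the nearest cm)} \]
(b)
The median is the 26th value. In this case it lies in the 160 ≤ h < 170 class interval.
The first two intervals contain 6 + 16 = 22 values, so the 4th value in the interval is needed. It is estimated as
\[ 160 + \frac{4}{21} \times 10 = 162 \text{ (to the nearest cm)} \]
(c)
The modal class is 160 ≤ h < 170 as it contains the most values.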
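The mid-point method for the mean, and the choice of modal class, can be sketched in Python as follows. This is an illustration only: it assumes each class is stored as a (lower bound, upper bound, frequency) triple, and the function names are chosen here rather than taken from the text.

```python
def grouped_mean(classes):
    """Estimate the mean by treating every value as the mid-point of its class."""
    total_frequency = sum(freq for _, _, freq in classes)
    total = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    return total / total_frequency

def modal_class(classes):
    """Return the (lower, upper) bounds of the class with the largest frequency."""
    lower, upper, _ = max(classes, key=lambda c: c[2])
    return lower, upper

# Heights in cm: 140 <= h < 150, 150 <= h < 160, and so on
heights = [(140, 150, 6), (150, 160, 16), (160, 170, 21), (170, 180, 8)]

estimated_mean = grouped_mean(heights)
print(round(estimated_mean), modal_class(heights))
```

The result is an estimate because the individual heights inside each class are not known.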
9.4
Also note that when we speak of someone by age, say 8, then the person could be any age
from 8 years 0 days up to 8 years 364 days (365 in a leap year!). You will see how this is
tackled in the following example.
Worked Example 2
The ages of children in a primary school were recorded in the table below.
Age         5–6   7–8   9–10
Frequency   29    40    38
(a)
(b)
(c)
Solution
(a)
To estimate the mean, we must use the mid-point of each interval; so, for example,
for '5–6', which really means
5 ≤ age < 7 ,
the mid-point is taken as 6.

Class Interval   Mid-point   Frequency   Mid-point × Frequency
5–6              6           29          6 × 29 = 174
7–8              8           40          8 × 40 = 320
9–10             10          38          10 × 38 = 380
Totals                       107         874

\[ \text{Mean} = \frac{874}{107} \approx 8.17 \text{ years} \]
(b)
The median is given by the 54th value, which we have to estimate. There are 29
values in the first interval, so we need to estimate the 25th value in the second
interval. As there are 40 values in the second interval, the median is estimated as
being $\frac{25}{40}$ of the way along the second interval. This has width 9 − 7 = 2 years, so the
median is estimated by
\[ \frac{25}{40} \times 2 = 1.25 \]
from the start of the interval. Therefore the median is estimated as
7 + 1.25 = 8.25 years.
(c)
Worked Example 1 uses what are called continuous data, since height can be of any value.
(Other examples of continuous data are weight, temperature, area, volume and time.)
The next example uses discrete data, that is, data which can take only a particular value,
such as the integers 1, 2, 3, 4, . . . in this case.
The calculations for mean and mode are not affected but estimation of the median
requires replacing the discrete grouped data with an approximate continuous interval.
Worked Example 3
The number of days that children were missing from school due to sickness in one year
was recorded.
Number of days off sick   1–5   6–10   11–15   16–20   21–25
Frequency                 12    11     10      4       3
(a)
(b)
(c)
Solution
(a)
The estimate is made by assuming that all the values in a class interval are equal to
the midpoint of the class interval.
Class Interval   Mid-point   Frequency   Mid-point × Frequency
1–5              3           12          3 × 12 = 36
6–10             8           11          8 × 11 = 88
11–15            13          10          13 × 10 = 130
16–20            18          4           18 × 4 = 72
21–25            23          3           23 × 3 = 69
Totals                       40          395

\[ \text{Mean} = \frac{395}{40} = 9.875 \text{ days.} \]
(b)
As there are 40 pupils, we need to consider the mean of the 20th and 21st values.
These both lie in the 6–10 class interval, which is really the 5.5–10.5 class interval,
so this interval contains the median.
As there are 12 values in the first class interval, the median is found by considering
the 8th and 9th values of the second interval.
As there are 11 values in the second interval, the median is estimated as being
$\frac{8.5}{11}$ of the way along the second interval.
But the length of the second interval is 10.5 − 5.5 = 5 , so the median is estimated by
\[ \frac{8.5}{11} \times 5 = 3.86 \]
from the start of this interval. Therefore the median is estimated as
5.5 + 3.86 = 9.36 days.
(c)
The modal class is 1–5, as this class contains the most entries.
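The median estimates in these worked examples all follow one pattern: find the class containing the middle position, then move the appropriate fraction of the way along it. A minimal Python sketch is given below. It assumes the middle position is taken as (n + 1) ÷ 2, which matches the 26th, 54th and 20.5th positions used in the three examples, and that discrete classes such as 6–10 are entered with boundaries 5.5 and 10.5. The function name is hypothetical.

```python
def grouped_median(classes):
    """Estimate the median from (lower bound, upper bound, frequency) triples."""
    n = sum(freq for _, _, freq in classes)
    position = (n + 1) / 2            # position of the middle value
    cumulative = 0                    # number of values in earlier classes
    for lower, upper, freq in classes:
        if cumulative + freq >= position:
            # Fraction of the way along this class, times the class width
            fraction = (position - cumulative) / freq
            return lower + fraction * (upper - lower)
        cumulative += freq
    raise ValueError("the table contains no data")

# Days off sick, with the discrete classes 1-5, 6-10, ... widened by 0.5 each side
days_off = [(0.5, 5.5, 12), (5.5, 10.5, 11), (10.5, 15.5, 10),
            (15.5, 20.5, 4), (20.5, 25.5, 3)]

# Ages 5-6, 7-8, 9-10, treated as 5 <= age < 7 and so on
ages = [(5, 7, 29), (7, 9, 40), (9, 11, 38)]

print(round(grouped_median(days_off), 2), grouped_median(ages))
```

Other textbooks use n ÷ 2 as the middle position for grouped data, which gives slightly different estimates.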
Exercises
1.
A door to door salesman keeps a record of the number of homes he visits each day.
Homes visited
Frequency
(a)
(b)
(c)
2.
20 29
30 39
40 49
24
60
21
40 w < 45
45 w < 50
50 w < 55
11
15
(a)
(b)
(c)
Frequency
10
A stopwatch was used to find the time that it took a group of children to run 100 m.
Time (seconds)
Frequency
(a)
(c)
(d)
4.
10 19
Mean (kg)
3.
09
10 t < 15
15 t < 20
20 t < 25
25 t < 30
16
21
(a)
(b)
(c)
0 d < 0.5
30
22
19
5.
The ages of the children at a youth camp are summarised in the table below.
Age (years)
68
9 11
12 14
15 17
Frequency
22
29
25
6 10
11 15
16 25
20
42
12
Frequency
15
Frequency
8.
6 10
20
11 15
26
16 20
32
(a)
(c)
(a)
(i)
(iii)
(b)
1 10
11 20
21 30
15
(ii)
31 40
11
41 50
3
Another class took the same test. Their results are given below.
Correct answers
Frequency
(i)
(iii)
(c)
(b)
21 25
1 10
11 20
21 30
31 40
41 50
14
20
(ii)
Information
A quartile is one of 3 values (lower quartile, median and upper quartile) which divides
data into 4 equal groups.
A percentile is one of 99 values which divides data into 100 equal groups.
The lower quartile corresponds to the 25th percentile. The median corresponds to the
50th percentile. The upper quartile corresponds to the 75th percentile.
9.
29 children are asked how much pocket money they were given last week.
Their replies are shown in this frequency table.
Pocket money
10.
Frequency
f
0 1.00
12
1.01 2.00
2.01 3.00
3.01 4.00
(a)
(b)
Calculate an estimate of the mean amount of pocket money received per child.
(NEAB)
The graph shows the number of hours a sample of people spent viewing television
one week during the summer.
40
30
Number of
people
20
10
(a)
10
20
30
40
50
Viewing time (hours)
60
70
Number of
people
0 h < 10
13
10 h < 20
27
20 h < 30
33
30 h < 40
40 h < 50
50 h < 60
(b)
Another survey is carried out during the winter. State one difference you
would expect to see in the data.
(c)
Use the mid-points of the class intervals to calculate the mean viewing time
for these people. You may find it helpful to use the table below.
Viewing time (h hours)   Mid-point   Frequency   Mid-point × Frequency
0 ≤ h < 10               5           13          65
10 ≤ h < 20              15          27          405
20 ≤ h < 30              25          33          825
30 ≤ h < 40              35
40 ≤ h < 50              45
50 ≤ h < 60              55
(SEG)
11.
20
21
22
23
24
25
26
27
28
29
Frequency
10
(a)
(b)
(c)
12.
The following list shows the maximum daily temperature, in °F, throughout the
month of April.
(a)
56.1
49.4
63.7
56.7
55.3
53.5
52.4
57.6
59.8
52.1
45.8
55.1
42.6
61.0
61.9
60.2
57.1
48.9
63.2
68.4
55.5
65.2
47.3
59.1
53.6
52.3
46.9
51.3
56.7
64.3
Frequency
40 < T ≤ 50
50 < T ≤ 54
54 < T ≤ 58
58 < T ≤ 62
62 < T ≤ 70
(b)
Use the table of values in part (a) to calculate an estimate of the mean of this
distribution. You must show your working clearly.
(c)
(MEG)
Worked Example 1
Height (cm)      Frequency
90 < h ≤ 100     5
100 < h ≤ 110    22
110 < h ≤ 120    30
120 < h ≤ 130    31
130 < h ≤ 140    18
140 < h ≤ 150    6
Solution
The table below shows how to calculate
the cumulative frequencies.
Height (cm)      Frequency   Cumulative Frequency
90 < h ≤ 100     5           5
100 < h ≤ 110    22          5 + 22 = 27
110 < h ≤ 120    30          27 + 30 = 57
120 < h ≤ 130    31          57 + 31 = 88
130 < h ≤ 140    18          88 + 18 = 106
140 < h ≤ 150    6           106 + 6 = 112
[Cumulative frequency graph: Cumulative Frequency against Height (cm), with the points (90,0), (100,5), (110,27), (120,57), (130,88), (140,106) and (150,112) joined by straight line segments]
Note
A more accurate graph is found by drawing a smooth curve through the points, rather
than using straight line segments.
[Cumulative frequency graph: Cumulative Frequency against Height (cm), with the points (90,0), (100,5), (110,27), (120,57), (130,88), (140,106) and (150,112) joined by a smooth curve]
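The running totals and the points to plot in Worked Example 1 can be produced with a short Python sketch. It is an illustration only, using `itertools.accumulate` from the standard library; the variable names are chosen here.

```python
from itertools import accumulate

# Upper end of each height class (cm) and the frequency of that class
upper_bounds = [100, 110, 120, 130, 140, 150]
frequencies = [5, 22, 30, 31, 18, 6]

# Each cumulative frequency is the previous total plus the next frequency
cumulative = list(accumulate(frequencies))

# Points are plotted at the upper end of each class, starting from (90, 0)
points = [(90, 0)] + list(zip(upper_bounds, cumulative))

print(cumulative)
print(points)
```

Plotting at the upper end of each class reflects the fact that, for example, 27 children have a height of at most 110 cm.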
Worked Example 2
The cumulative frequency graph below gives the results of 120 students on a test.
[Cumulative frequency graph: Cumulative Frequency (0 to 120) against Test Score (0 to 100)]
9.5
Use the graph to find:
(a)
(b)
(c)
(d)
Solution
(a)
There are 120 students, so start at 60 on the cumulative frequency axis, read across to the curve and then down to the score axis.
Median = 53
(b)
To find out the inter-quartile range, we must consider the middle 50% of the
students.
To find the lower quartile, start at $\frac{1}{4} \times 120 = 30$. This gives
Lower Quartile = 43 .
To find the upper quartile, start at $\frac{3}{4} \times 120 = 90$. This gives
Upper Quartile = 67 .
The inter-quartile range is therefore 67 − 43 = 24 .
(c)
10% of 120 = 12, and 120 − 12 = 108. Starting at 108 on the cumulative frequency axis gives a test score of 79.
(d)
Starting at a test score of 75 gives a cumulative frequency of 103, so the number of students above this score is
120 − 103 = 17 .
As in Worked Example 1, a more accurate estimate for the median and inter-quartile
range is obtained if you draw a smooth curve through the data points.
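Reading a median or quartile from a cumulative frequency graph drawn with straight line segments is the same as linear interpolation between the plotted points. The Python sketch below illustrates this. The points for the test-score graph are not listed in the text, so the sketch reuses the height data from Worked Example 1; the function name is hypothetical, and a hand-drawn smooth curve would give slightly different readings.

```python
def read_from_graph(points, target):
    """Horizontal-axis value where a straight-line graph reaches `target`."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            # Fraction of the way up this segment, applied to its width
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the range of the graph")

# (height in cm, cumulative frequency) from Worked Example 1
points = [(90, 0), (100, 5), (110, 27), (120, 57),
          (130, 88), (140, 106), (150, 112)]

n = points[-1][1]                                   # total number of values, 112
lower_quartile = read_from_graph(points, n / 4)     # start at 28
median = read_from_graph(points, n / 2)             # start at 56
upper_quartile = read_from_graph(points, 3 * n / 4) # start at 84
inter_quartile_range = upper_quartile - lower_quartile

print(round(lower_quartile, 1), round(median, 1),
      round(upper_quartile, 1), round(inter_quartile_range, 1))
```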
Exercises
1.
Make a cumulative frequency table for each set of data given below. Then draw a
cumulative frequency graph and use it to find the median and inter-quartile range.
(a)
John weighed each apple in a large box. His results are given in this table.
Weight of
apple (g)
60 < w ≤ 80   80 < w ≤ 100   100 < w ≤ 120   120 < w ≤ 140   140 < w ≤ 160
Frequency
(b)
28
33
27
Pasi asked the students in his class how far they travelled to school each day.
His results are given below.
Distance (km)
Frequency
0 < d 1 1< d 2
5
2<d3
3<d4
12
4<d5 5<d6
5
9.5
(c)
A P.E. teacher recorded the distances children could reach in the long jump
event. His records are summarised in the table below.
1< d 2
2<d3
3<d4
12
Frequency
2.
0<m5
Frequency Field A
22
Frequency Field B
11
34
3.
20 < m 25
25 < m 30
10
(a)
(b)
(c)
4.
4<d5 5<d6
2<l3 3<l4
4<l5 5<l6
6<l7 7<l8
Frequency Type A
10
22
Frequency Type B
38
(a)
Use cumulative frequency graphs to find the median and inter-quartile range
for each type of battery.
(b)
The table below shows how the height of girls of a certain age vary. The data was
gathered using a large-scale survey.
Height (cm)     Frequency
50 < h ≤ 55     100
55 < h ≤ 60     300
60 < h ≤ 65     2400
65 < h ≤ 70     1300
70 < h ≤ 75     700
75 < h ≤ 80     150
80 < h ≤ 85     50
              Percentage of Population
Very Tall     5%
Tall          15%
Normal        60%
Short         15%
Very short    5%
Use a cumulative frequency graph to find the heights of children in each category.
5.
Awarded to
500
250
50
The sales made during 1995 and 1996 are shown in the table below.
Value of sales
(1000)
0 < V 100 100 < V 200 200 < V 300 300 < V 400 400 < V 500
Frequency 1996
15
10
Frequency 1995
18
Use cumulative frequency graphs to find the values of sales needed to obtain each
bonus in the years 1995 and 1996.
6.
The histogram shows the cost of buying a particular toy in a number of different
shops.
7
6
5
Frequency 4
3
2
1
0
2.00
(a)
(b)
2.20
2.40
2.60
Price ()
2.80
3.00
(ii)
(iii)
(iv)
(v)
Comment on which of your answers are exact and which are estimates.
7.
Laura and Joy played 40 games of golf together. The table below shows Laura's
scores.
Scores (x)
70 < x ≤ 80   80 < x ≤ 90   90 < x ≤ 100   100 < x ≤ 110   110 < x ≤ 120
Frequency
(a)
15
17
40
30
Cumulative
Frequency
20
10
60
70
80
90
100
110
120
Score
(b)
(c)
8.
Joy's median score was 103. The inter-quartile range of her scores was 6.
(i)
(ii)
The winner of a game of golf is the one with the lowest score.
Who won most of these 40 games? Give a reason for your choice.
(NEAB)
A sample of 80 electric light bulbs was taken. The lifetime of each light bulb was
recorded. The results are shown below.
Lifetime (hours)
800
900
Frequency
13
Cumulative Frequency
17
1000
17
1100
22
1200
1300
1400
20
(a)
Copy and complete the table of values for the cumulative frequency.
(b)
90
80
70
60
q
Cumulative
50
Frequency
40
30
20
10
0
800
900
1000
1100
1200
1300
1400
1500
Lifetime (hours)
Use your graph to estimate the number of light bulbs which lasted more than
1030 hours.
(d)
Use your graph to estimate the inter-quartile range of the lifetimes of the
light bulbs.
(e)
A second sample of 80 light bulbs has the same median lifetime as the first
sample. Its inter-quartile range is 90 hours. What does this tell you about
the difference between the two samples?
(SEG)
The numbers of journeys made by a group of people using public transport in one
month are summarised in the table.
Number of journeys
Number of people
(a)
010
1120
2130
3140
4150
5160
6170
60
70
Number of journeys
10
20
30
40
50
Cumulative frequency
(b)
(i)
40
30
q
9.
(c)
Cumulative
Frequency
20
10
10
20
30
40
Number of journeys
171
50
60
70
9.5
(c)
(ii)
(iii)
Use your graph to estimate the number of people who made more than
44 journeys in the month.
40
30
Cumulative
Frequency
20
10
10
20
30
40
Number of journeys
50
60
70
y( )
80
60
q
10.
Cumulative
Frequency
40
20
100000
200000 240000
i ()
House prices in 1992
This grouped frequency table gives the percentage distribution of house prices (p)
in England in 1993.
(a)
Percentage of houses
in this class interval
0 p < 40 000
26
19
22
15
Use the data above to complete the cumulative frequency table below.
House prices (p)
in pounds 1993
Cumulative
Frequency (%)
0 p < 40 000
0 p < 52 000
0 p < 68 000
0 p < 88 000
0 p < 120 000
0 p < 160 000
0 p < 220 000
(b)
(c)
In 1992 the price of a house was 100 000. Use both cumulative frequency
graphs to estimate the price of this house in 1993. Make your method clear.
(LON)
11.
The lengths of a number of nails were measured to the nearest 0.01 cm, and the
following frequency distribution was obtained.
Length of nail
(x cm)
Number of nails
10
24
32
17
4
Cumulative Frequency
9.5
(a)
(b)
100
80
60
Cumulative
Frequency
40
20
0.98
1.00
1.02
1.04
1.06
1.08
1.10
1.12
1.14
12.
(ii)
A wedding was attended by 120 guests. The distance, d miles, that each guest
travelled was recorded in the frequency table below.
Distance
(d miles)
Number of
guests
0 < d 10 10 < d 20
26
38
20 < d 30
30 < d 50
20
20
50 < d 100
12
(a)
(b)
(i)
Distance
(d miles)
d 10
d 20
d 30
Number of
guests
(ii)
d 50
d 100
d 140
120
120
100
80
Cumulative
Frequency
60
40
20
20
40
60
80
100
120
140
(c)
(i)
(ii)
Give a reason for the large difference between the mean distance and
the median distance.
(MEG)
[Figure: two frequency polygons, Frequency against Length, with the same mean but different spreads; one polygon is drawn dotted]
The range (highest value − lowest value) gives a simple measure of how much the data
are spread out.
9.6
Standard deviation (s.d.) is a much more useful measure and is given by the formula:
\[ \text{s.d.} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} \]
where
$x_i$ = value of each data point,
$\bar{x}$ = mean of the data,
$n$ = number of data points.
Then $(x_i - \bar{x})^2$ gives the square of the difference between each value and the mean
(squaring exaggerates the effect of data points far from the mean and gets rid of negative
values), and
\[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
gives an average value to these differences. If all the data were the same, then each $x_i$
would equal $\bar{x}$ and the expression would be zero.
Finally we take the square root of the expression so that the dimensions of the standard
deviation are the same as those of the data.
So standard deviation is a measure of the spread of the data. The greater its value, the
more spread out the data are. This is illustrated by the two frequency polygons shown
above. Although both sets of data have the same mean, the data represented by the
'dotted' frequency polygon will have a greater standard deviation than the other.
Worked Example 1
Find the mean and standard deviation of the numbers,
6, 7, 8, 5, 9.
Solution
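The mean, $\bar{x}$, is given by
\[ \bar{x} = \frac{6+7+8+5+9}{5} = \frac{35}{5} = 7 . \]
The standard deviation is then
\[ \text{s.d.} = \sqrt{\frac{1+0+1+4+4}{5}} = \sqrt{\frac{10}{5}} = \sqrt{2} \approx 1.414 . \]
An equivalent formula for the standard deviation is
\[ \text{s.d.} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2} . \]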
This expression is much more convenient for calculations done without a calculator. The
proof of the equivalence of this formula is given below although it is beyond the scope of
the GCSE syllabus.
Proof
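You can see the proof of the equivalence of the two formulae by noting that
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})^2 &= \sum_{i=1}^{n} \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) \\
&= \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} (2 x_i \bar{x}) + \sum_{i=1}^{n} \bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + \bar{x}^2 \sum_{i=1}^{n} 1
\end{align*}
(since the expressions $2\bar{x}$ and $\bar{x}^2$ are common for each term in the summation).
But $\sum_{i=1}^{n} 1 = n$, since you are summing $1 + 1 + \dots + 1$ ($n$ terms), and $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, by
definition, thus
\begin{align*}
\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
&= \frac{1}{n} \left( \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + \bar{x}^2 n \right)
&& \text{(substituting } \textstyle\sum_{i=1}^{n} 1 = n \text{)} \\
&= \frac{1}{n} \sum_{i=1}^{n} x_i^2 - 2\bar{x} \, \frac{\sum_{i=1}^{n} x_i}{n} + \bar{x}^2
&& \text{(dividing by } n \text{)} \\
&= \frac{1}{n} \sum_{i=1}^{n} x_i^2 - 2\bar{x}^2 + \bar{x}^2
&& \text{(substituting } \bar{x} \text{ for } \tfrac{1}{n} \textstyle\sum_{i=1}^{n} x_i \text{)} \\
&= \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar{x}^2 .
\end{align*}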
Worked Example 2
Find the mean and standard deviation of each of the following sets of numbers.
(a)
10, 11, 12, 13, 14
(b)
5, 6, 12, 18, 19
Solution
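(a)
\[ \bar{x} = \frac{10 + 11 + 12 + 13 + 14}{5} = \frac{60}{5} = 12 \]
The standard deviation can now be calculated using the alternative formula.
\[ \text{s.d.} = \sqrt{\frac{10^2 + 11^2 + 12^2 + 13^2 + 14^2}{5} - 12^2} = \sqrt{146 - 144} = \sqrt{2} \approx 1.414 \]
(b)
\[ \bar{x} = \frac{5 + 6 + 12 + 18 + 19}{5} = 12 \]
\[ \text{s.d.} = \sqrt{\frac{5^2 + 6^2 + 12^2 + 18^2 + 19^2}{5} - 12^2} = \sqrt{178 - 144} = \sqrt{34} \approx 5.831 \]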
Note that both sets of numbers have the same mean value, but that set (b) has a much
larger standard deviation. This is expected, as the spread in set (b) is clearly far more
than in set (a).
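The two formulae for standard deviation can be compared numerically with a short Python sketch. It is only an illustration; the function names are chosen here, and both functions divide by n, as in the formulae of this section.

```python
from math import sqrt

def sd_definition(values):
    """Square root of the mean squared difference from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_alternative(values):
    """Square root of (mean of the squares minus square of the mean)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum(x * x for x in values) / n - mean ** 2)

set_a = [10, 11, 12, 13, 14]
set_b = [5, 6, 12, 18, 19]

for values in (set_a, set_b):
    print(round(sd_definition(values), 3), round(sd_alternative(values), 3))
```

Python's standard library function `statistics.pstdev` also divides by n and should agree with both functions, whereas `statistics.stdev` divides by n − 1 and gives a larger value.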
Worked Example 3
The table below gives the number of road traffic accidents per day in a small town.
Accidents per day
Frequency
Solution
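The necessary calculations for each datapoint, $x_i$, are set out below.

Accidents per day ($x_i$)   Frequency ($f_i$)   $x_i^2$   $x_i f_i$   $x_i^2 f_i$
0                           5                   0         0           0
1                           8                   1         8           8
2                           6                   4         12          24
3                           3                   9         9           27
4                           2                   16        8           32
5                           0                   25        0           0
6                           1                   36        6           36
TOTALS                      25                            43          127

From the table,
\[ n = 25 , \qquad \sum x_i f_i = 43 , \qquad \sum x_i^2 f_i = 127 . \]
The mean is
\[ \bar{x} = \frac{\sum x_i f_i}{n} = \frac{43}{25} = 1.72 . \]
The standard deviation is
\[ \text{s.d.} = \sqrt{\frac{\sum x_i^2 f_i}{n} - \bar{x}^2} = \sqrt{\frac{127}{25} - 1.72^2} = 1.457 . \]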
Most scientific calculators have statistical functions which will calculate the mean and
standard deviation of a set of data.
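The frequency-table calculation of Worked Example 3 can also be done in a few lines of Python. The sketch below is illustrative only: the function name is hypothetical, and the frequency of 5 for days with 0 accidents is an assumption, chosen so that the frequencies total 25 as stated in the solution.

```python
from math import sqrt

def mean_and_sd_from_table(table):
    """Mean and standard deviation from (value, frequency) pairs."""
    n = sum(f for _, f in table)                  # total frequency
    sum_xf = sum(x * f for x, f in table)         # total of x times f
    sum_x2f = sum(x * x * f for x, f in table)    # total of x squared times f
    mean = sum_xf / n
    sd = sqrt(sum_x2f / n - mean ** 2)
    return mean, sd

# (accidents per day, number of days); the first frequency is assumed (see above)
accidents = [(0, 5), (1, 8), (2, 6), (3, 3), (4, 2), (5, 0), (6, 1)]

accident_mean, accident_sd = mean_and_sd_from_table(accidents)
print(round(accident_mean, 2), round(accident_sd, 3))
```

A row with value 0 contributes nothing to either total, so only the total frequency depends on the assumed entry.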
Exercises
1.
(a)
(b)
2.
3.
Find the mean and standard deviation of each set of data given below.
A   51    56    51    49   53    62
B   71    76    71    69   73    82
C   102   112   102   98   106   124
Describe the relationship between each set of numbers and also the relationship between their means and standard deviations.
Two machines, A and B, fill empty packets with soap powder. A sample of boxes
was taken from each machine and the weight of powder (in kg) was recorded.
A
2.27
2.31
2.18
2.2
2.26
2.24
2.78
2.62
2.61
2.51
2.59
2.67
2.62
2.68
(a)
(b)
2.70
Two groups of students were trying to find the acceleration due to gravity.
Each group conducted 5 experiments.
Group A   9.4   9.6   10.2   10.8   10.1
Group B   9.5   9.7   9.6    9.4    9.8
Find the mean and standard deviation for each group, and comment on their results.
4.
The number of matches per box was counted for 100 boxes of matches.
The results are given in the table below.
Number of Matches
Frequency
44
45
46
47
48
49
28
31
14
15
8
2
50
When two dice were thrown 50 times the total scores shown below were obtained.
Score       2   3   4   5   6    7   8   9   10   11   12
Frequency   1   0   4   8   12   9   7   5   3    1    0
The length of telephone calls from an office was recorded. The results are given in
the table below.
Length of call (mins)
Frequency
0 < t 0.5
10
12
The charges (to the nearest ) made by a garage for repair work on cars in one
week are given in the table below.
Charge ()
20 29
30 49
50 99
100 149
150 199
200 300
Frequency
10
22
8.
Thirty families were selected at random in two different countries. They were
asked how many children there were in each family.
Country A
Country B
1
2
2
2
2
3
5
0
1
2
0
2
1
2
1
1
2
1
1
1
3
1
2
0
4
1
0
2
4
2
3
9
6
4
4
3
2
1
3
5
4
5
2
4
2
1
1
5
1
2
1
2
5
2
Find the mean and standard deviation for each country and comment on the results.
9.
(a)
(b)
10.
Show that the standard deviation of every set of five consecutive integers is
the same as the answer to part (a).
(LON)
Ten students sat a test in Mathematics, marked out of 50. The results are shown
below for each student.
25, 27, 35, 4, 49, 10, 12, 45, 45, 48
(a)
The same students also sat an English test, marked out of 50. The mean and
standard deviation are given by
mean = 30, standard deviation = 3.6.
(b)
11.
Ten boys sat a test which was marked out of 50. Their marks were
28, 42, 35, 17, 49, 12, 48, 38, 24 and 27.
(a)
Calculate
(i)
(ii)
Ten girls sat the same test. Their marks had a mean of 30 and a standard deviation
of 6.5.
(b)
12.
There are twenty pupils in class A and twenty pupils in class B. All the pupils in
class A were given an I.Q. test. Their scores on the test are given below.
100, 104, 106, 107, 109, 110, 113, 114, 116, 117,
118, 119, 119, 121, 124, 125, 127, 127, 130, 134.
13.
(a)
(b)
Class B takes the same I.Q. test. They obtain a mean of 110 and a standard
deviation of 21. Compare the data for class A and class B.
(c)
Class C has only 5 pupils. When they take the I.Q. test they all score 105.
What is the value of the standard deviation for class C?
(SEG)
(a)
10
(i)
(ii)
A set of 10 different students took the same test. Their scores are listed below.
(b)
14.
After making any necessary calculations for the second set, compare the two
sets of scores. Your answer should be understandable to someone who does
not study Statistics.
(MEG)
Number of people
10
(a)
(b)
A Normal Distribution has approximately 68% of its data values within one
standard deviation of the mean.
Use your answers to part (a) to check if the given distribution satisfies this
property of a Normal Distribution. Show your working clearly.
(MEG)