
S1.2 Calculating Means and Standard Deviations

This document covers the calculation of means and standard deviations in statistics, specifically for Edexcel AS-Level Maths. It explains how to compute the mean from data sets and frequency tables, as well as the standard deviation, including its sensitivity to outliers. Additionally, it discusses coding techniques to simplify calculations and provides examples and examination-style questions for practice.

Uploaded by

stefanalbert2302
Copyright
© © All Rights Reserved

AS-Level Maths: Statistics 1 for Edexcel

S1.2 Calculating means and standard deviations

© Boardworks Ltd 2005
Contents

Calculating means
Calculating standard deviations
Coding

Mean

The mean is the most widely used average in statistics. It is found by adding up all the values in the data and dividing by how many values there are.

Notation: If the data values are $x_1, x_2, x_3, \ldots, x_n$, then the mean is

$\bar{x} = \dfrac{x_1 + x_2 + x_3 + \ldots + x_n}{n} = \dfrac{\sum x_i}{n}$

Here $\bar{x}$ is the mean symbol, and $\sum x_i$ means the total of all the x values.

Note: The mean takes into account every piece of data, so it is affected by outliers in the data. The median is preferred over the mean if the data contains outliers or is skewed.
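
As a minimal sketch of the definition and the note above, the following Python computes the mean as the total divided by the number of values, and compares it with the median when an outlier is present. The data values are hypothetical and chosen only for illustration; they do not come from the slides.

```python
from statistics import median


def mean(values):
    """Return the sum of the values divided by how many values there are."""
    return sum(values) / len(values)


# Hypothetical data, chosen only for illustration.
data = [4, 5, 5, 6, 7]
with_outlier = data + [45]  # the same data with one extreme value added

# Mean and median of the original data.
print(mean(data), median(data))

# The outlier moves the mean much further than it moves the median.
print(mean(with_outlier), median(with_outlier))
```

With these values the mean rises from 5.4 to 12, while the median only moves from 5 to 5.5.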



Mean

If data are presented in a frequency table:

Value     Frequency
$x_1$     $f_1$
$x_2$     $f_2$
…         …
$x_n$     $f_n$

then the mean is

$\bar{x} = \dfrac{x_1 f_1 + x_2 f_2 + \ldots + x_n f_n}{\sum f_i} = \dfrac{\sum x_i f_i}{\sum f_i}$



Mean

Example: The table shows the results of a survey into household size. Find the mean size.

To find the mean, we add a third column to the table.

Household size, x    Frequency, f    x × f
1                    20              20
2                    28              56
3                    25              75
4                    19              76
5                    16              80
6                    6               36

TOTAL                114             343

Mean = 343 ÷ 114 = 3.01 (to 3 s.f.)
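
The household-size calculation can be sketched in Python as below. The function name `mean_from_frequencies` and the list-of-pairs layout are assumptions made for this illustration, not part of the slides.

```python
def mean_from_frequencies(table):
    """Return sum(x * f) / sum(f) for a list of (value, frequency) pairs."""
    total_fx = sum(x * f for x, f in table)  # total of the x × f column
    total_f = sum(f for _, f in table)       # total frequency
    return total_fx / total_f


# (household size, frequency) pairs from the survey table.
household = [(1, 20), (2, 28), (3, 25), (4, 19), (5, 16), (6, 6)]

print(round(mean_from_frequencies(household), 2))  # 343 / 114, prints 3.01
```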



Standard deviation

There are three commonly used measures of spread (or dispersion): the range, the inter-quartile range and the standard deviation.

The standard deviation is widely used in statistics to measure spread. It is based on all the values in the data, so it is sensitive to the presence of outliers in the data.

The variance is related to the standard deviation:

variance = (standard deviation)²

The following formulae can be used to find the variance and s.d.:

$\text{variance} = \dfrac{\sum (x_i - \bar{x})^2}{n}$

$\text{s.d.} = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$



Standard deviation

Example: The mid-day temperatures (in °C) recorded for one week in June were: 21, 23, 24, 19, 19, 20, 21

First we find the mean: $\bar{x} = \dfrac{21 + 23 + \ldots + 21}{7} = \dfrac{147}{7} = 21$ °C

$x_i$     $x_i - \bar{x}$     $(x_i - \bar{x})^2$
21        0                   0
23        2                   4
24        3                   9
19        -2                  4
19        -2                  4
20        -1                  1
21        0                   0
Total:                        22

$\text{variance} = \dfrac{\sum (x_i - \bar{x})^2}{n}$

So variance = 22 ÷ 7 = 3.143

So, s.d. = √3.143 = 1.77 °C (3 s.f.)
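
The temperature example can be checked with a short Python sketch that follows the definition directly: find the mean, square each deviation from it, and divide the total by n. The function names are assumptions made for this illustration.

```python
from math import sqrt


def variance(values):
    """Variance with divisor n: the mean of the squared deviations from the mean."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / n


def std_dev(values):
    """Standard deviation: the square root of the variance."""
    return sqrt(variance(values))


temps = [21, 23, 24, 19, 19, 20, 21]  # mid-day temperatures in °C

print(round(variance(temps), 3), round(std_dev(temps), 2))  # prints 3.143 1.77
```

These functions divide by n, as the slides do. Python's standard library function `statistics.pstdev` uses the same divisor, whereas `statistics.stdev` divides by n − 1 and gives a slightly larger value.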
Standard deviation

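There is an alternative formula which is usually a more convenient way to find the variance. Start from

$\text{variance} = \dfrac{\sum (x_i - \bar{x})^2}{n}$

But,

$\sum (x_i - \bar{x})^2 = \sum (x_i^2 - 2 x_i \bar{x} + \bar{x}^2)$

$= \sum x_i^2 - 2\bar{x} \sum x_i + n\bar{x}^2$

$= \sum x_i^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2$   (since $\sum x_i = n\bar{x}$)

$= \sum x_i^2 - n\bar{x}^2$

Therefore, $\text{variance} = \dfrac{\sum x_i^2}{n} - \bar{x}^2$ and $\text{s.d.} = \sqrt{\dfrac{\sum x_i^2}{n} - \bar{x}^2}$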
Standard deviation

Example (continued): Looking again at the temperature data for June: 21, 23, 24, 19, 19, 20, 21

We know that $\bar{x} = \dfrac{147}{7} = 21$ °C

Also, $\sum x_i^2 = 21^2 + 23^2 + \ldots + 21^2 = 3109$

So, $\text{variance} = \dfrac{\sum x_i^2}{n} - \bar{x}^2 = \dfrac{3109}{7} - 21^2 = 3.143$

s.d. = 1.77 °C

Note: Essentially the standard deviation is a measure of how close the values are to the mean value.
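
As a quick check that the two variance formulae agree, the sketch below evaluates both on the June temperature data. The function names are assumptions made for this illustration.

```python
from math import isclose, sqrt


def variance_definition(values):
    """Variance as the mean squared deviation from the mean."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / n


def variance_shortcut(values):
    """Variance as (sum of squares) / n minus the square of the mean."""
    n = len(values)
    m = sum(values) / n
    return sum(x * x for x in values) / n - m ** 2


temps = [21, 23, 24, 19, 19, 20, 21]

print(sum(x * x for x in temps))  # sum of squares, prints 3109
print(isclose(variance_definition(temps), variance_shortcut(temps)))  # prints True
print(round(sqrt(variance_shortcut(temps)), 2))  # prints 1.77
```

The comparison uses `isclose` rather than `==` because the two formulae can differ by a tiny floating-point rounding error even though they are algebraically identical.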



Calculating standard deviation from a table

When the data is presented in a frequency table, the formula for finding the standard deviation needs to be adjusted slightly:

$\text{s.d.} = \sqrt{\dfrac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2}$

Example: A class of 20 students were asked how many times they exercise in a normal week. Find the mean and the standard deviation.

Number of times exercise taken    Frequency
0                                 5
1                                 3
2                                 5
3                                 4
4                                 2
5                                 1



Calculating standard deviation from a table
The table can be extended to help find the mean and the s.d.

No. of times exercise taken, x    Frequency, f    x × f    x² × f
0                                 5               0        0
1                                 3               3        3
2                                 5               10       20
3                                 4               12       36
4                                 2               8        32
5                                 1               5        25

TOTAL:                            20              38       116

$\bar{x} = \dfrac{38}{20} = 1.9$

$\text{s.d.} = \sqrt{\dfrac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2} = \sqrt{\dfrac{116}{20} - 1.9^2} = 1.48$
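
The extended-table method can be sketched in Python as follows. The function name `mean_and_sd` and the (value, frequency) layout are assumptions made for this illustration.

```python
from math import sqrt


def mean_and_sd(table):
    """Return (mean, s.d.) for a list of (value, frequency) pairs."""
    n = sum(f for _, f in table)                   # total frequency
    mean = sum(f * x for x, f in table) / n        # sum(f x) / sum(f)
    mean_sq = sum(f * x * x for x, f in table) / n # sum(f x^2) / sum(f)
    return mean, sqrt(mean_sq - mean ** 2)


# (number of times exercise taken, frequency) pairs from the table.
exercise = [(0, 5), (1, 3), (2, 5), (3, 4), (4, 2), (5, 1)]

mean, sd = mean_and_sd(exercise)
print(round(mean, 2), round(sd, 2))  # prints 1.9 1.48
```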



Calculating standard deviation from a table

If data is presented in a grouped frequency table, it is only possible to estimate the mean and the standard deviation. This is because the exact data values are not known. An estimate is obtained by using the mid-point of an interval to represent each of the values in that interval.

Example: The table shows the annual mileage for the employees of an insurance company. Estimate the mean and standard deviation.

Annual mileage, x        Frequency
0 ≤ x < 5000             6
5000 ≤ x < 10,000        17
10,000 ≤ x < 15,000      14
15,000 ≤ x < 20,000      5
20,000 ≤ x < 30,000      3



Calculating standard deviation from a table
Mileage            Frequency, f    Mid-point, x    f × x      f × x²
0 – 5000           6               2500            15,000     37,500,000
5000 – 10,000      17              7500            127,500    956,250,000
10,000 – 15,000    14              12,500          175,000    2,187,500,000
15,000 – 20,000    5               17,500          87,500     1,531,250,000
20,000 – 30,000    3               25,000          75,000     1,875,000,000

TOTAL              45                              480,000    6,587,500,000

$\bar{x} = \dfrac{480{,}000}{45} = 10{,}667$ miles

$\text{s.d.} = \sqrt{\dfrac{6{,}587{,}500{,}000}{45} - 10{,}667^2} = 5711$ miles
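
The mid-point estimate can be sketched in Python as below, using the frequencies from the worked table (6, 17, 14, 5, 3). The function name and the (lower bound, upper bound, frequency) layout are assumptions made for this illustration.

```python
from math import sqrt

# (lower bound, upper bound, frequency), using the frequencies in the worked table.
groups = [
    (0, 5000, 6),
    (5000, 10000, 17),
    (10000, 15000, 14),
    (15000, 20000, 5),
    (20000, 30000, 3),
]


def estimate_mean_and_sd(groups):
    """Estimate the mean and s.d. by treating every value as its class mid-point."""
    n = sum(f for _, _, f in groups)
    mids = [((lo + hi) / 2, f) for lo, hi, f in groups]  # (mid-point, frequency)
    mean = sum(f * m for m, f in mids) / n
    sd = sqrt(sum(f * m * m for m, f in mids) / n - mean ** 2)
    return mean, sd


mean, sd = estimate_mean_and_sd(groups)
print(round(mean), round(sd))  # prints 10667 5711
```

The sketch keeps the unrounded mean when computing the standard deviation; to the nearest mile this gives the same answer as using the rounded value 10,667.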
Notes about standard deviation

Here are some notes to consider about standard deviation.

In many distributions that are roughly symmetrical and bell-shaped, about two-thirds of the data will lie within 1 standard deviation of the mean, whilst nearly all the data values (about 95%) will lie within 2 standard deviations of the mean.

Values that lie more than 2 standard deviations from the mean are sometimes classed as outliers. Any such values should be treated carefully.

Standard deviation is measured in the same units as the original data. Variance is measured in the same units squared.

Most calculators have built-in functions which will find the standard deviation for you. Learn how to use this facility on your calculator.
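
The "more than 2 standard deviations from the mean" rule can be sketched as a small Python check. The function name `flag_outliers`, its parameter `k`, and the extra reading of 35 °C are all hypothetical, introduced only for this illustration.

```python
from math import sqrt


def flag_outliers(values, k=2):
    """Return the values lying more than k standard deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum(x * x for x in values) / n - mean ** 2)
    return [x for x in values if abs(x - mean) > k * sd]


temps = [21, 23, 24, 19, 19, 20, 21]  # June temperatures from the earlier example

print(flag_outliers(temps))         # prints [] because every value is within 2 s.d.
print(flag_outliers(temps + [35]))  # prints [35]; 35 is a hypothetical extra reading
```

Note that adding the extreme value changes both the mean and the standard deviation, so the threshold itself moves. This is one reason flagged values should be examined rather than discarded automatically.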
Examination style question

Examination style question: The ages of the people in a cinema queue one Monday afternoon are shown in the stem-and-leaf diagram:

Key: 2 | 3 means 23 years old

2 | 3 6
3 | 1 6 6
4 | 1 2 5 6 9
5 | 0 4 7
6 | 1

a) Explain why the diagram suggests that the mean and standard deviation can be sensibly used as measures of location and spread respectively.
b) Calculate the mean and the standard deviation of the ages.
c) The mean and the standard deviation of the ages of the people in the queue on Monday evening were 29 and 6.2 respectively. Compare the ages of the people queuing at the cinema in the afternoon with those in the evening.
Examination style question

a) The mean and the standard deviation are appropriate, as the distribution of ages is roughly symmetrical and there are no outliers.

b) $\sum x_i = 597$, so $\bar{x} = \dfrac{597}{14} = 42.64286 \approx 42.6$

$\sum x_i^2 = 27{,}131$, so $\text{s.d.} = \sqrt{\dfrac{27{,}131}{14} - 42.64286^2} = 10.9$

c) The cinemagoers in the evening had a smaller mean age, meaning that they were, on average, younger than those in the afternoon. The standard deviation for the ages in the evening was also smaller, suggesting that the evening audience were closer together in age.
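
The totals in part b) can be verified with a short Python sketch that rebuilds the ages from the stem-and-leaf diagram. Representing the diagram as a dictionary of stems and leaves is an assumption made for this illustration.

```python
from math import sqrt

# Stem-and-leaf diagram as {stem: leaves}; key: 2 | 3 means 23 years old.
stem_and_leaf = {
    2: [3, 6],
    3: [1, 6, 6],
    4: [1, 2, 5, 6, 9],
    5: [0, 4, 7],
    6: [1],
}

# Rebuild each age as 10 × stem + leaf.
ages = [10 * stem + leaf for stem, leaves in stem_and_leaf.items() for leaf in leaves]

n = len(ages)
total = sum(ages)                    # sum of x
total_sq = sum(a * a for a in ages)  # sum of x squared
mean = total / n
sd = sqrt(total_sq / n - mean ** 2)

print(n, total, total_sq)            # prints 14 597 27131
print(round(mean, 1), round(sd, 1))  # prints 42.6 10.9
```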
Combining sets of data

Sometimes in examination questions you are asked to pool two sets of data together.

Example: Six male and five female students sit an A level examination. The mean marks were 52% and 57% for the males and females respectively. The standard deviations were 14 and 18 respectively. Find the combined mean and the standard deviation for the marks of all 11 students.



Combining sets of data

Let $x_1, \ldots, x_6$ be the marks for the 6 male students.
Let $y_1, \ldots, y_5$ be the marks of the 5 female students.

To find the overall mean, we first need to find the total marks for all 11 students.

As $\bar{x} = 52$, $\sum x = 6 \times 52 = 312$

As $\bar{y} = 57$, $\sum y = 5 \times 57 = 285$

Therefore $\sum x + \sum y = 312 + 285 = 597$

So the combined mean is: $\dfrac{597}{11} = 54.2727\ldots \approx 54.3\%$



Combining sets of data
To find the overall standard deviation, we need to find the total of the marks squared for all 11 students.

Notice that the formula $\text{s.d.} = \sqrt{\dfrac{\sum x_i^2}{n} - \bar{x}^2}$ rearranges to give $\sum x^2 = n(\text{s.d.}^2 + \bar{x}^2)$

As $\text{s.d.}_x = 14$, $\sum x^2 = 6(14^2 + 52^2) = 17{,}400$

As $\text{s.d.}_y = 18$, $\sum y^2 = 5(18^2 + 57^2) = 17{,}865$

Therefore, $\sum x^2 + \sum y^2 = 35{,}265$

So the combined s.d. is: $\sqrt{\dfrac{35{,}265}{11} - 54.27^2} = 16.1\%$ to 3 s.f.
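
The pooling method can be sketched in Python: recover each group's total and total of squares from its size, mean and s.d., add them, and apply the usual formulae to the combined totals. The function name `pooled_mean_and_sd` and the (n, mean, s.d.) layout are assumptions made for this illustration.

```python
from math import sqrt


def pooled_mean_and_sd(groups):
    """Combine groups given as (n, mean, s.d.) triples into one mean and s.d."""
    n = sum(size for size, _, _ in groups)
    # sum(x) for a group is n × mean.
    total = sum(size * m for size, m, _ in groups)
    # sum(x^2) for a group is n × (s.d.^2 + mean^2).
    total_sq = sum(size * (s ** 2 + m ** 2) for size, m, s in groups)
    mean = total / n
    return mean, sqrt(total_sq / n - mean ** 2)


# Six males (mean 52, s.d. 14) and five females (mean 57, s.d. 18).
mean, sd = pooled_mean_and_sd([(6, 52, 14), (5, 57, 18)])
print(round(mean, 1), round(sd, 1))  # prints 54.3 16.1
```

Note that the combined s.d. is not simply an average of 14 and 18: it also reflects the gap between the two group means.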



Coding

Coding is a technique that can simplify the numerical effort required in finding a mean or standard deviation.



Coding

Adding
If a number b is added to each piece of data, the mean value is also increased by b.
The standard deviation is unchanged.

Multiplying
If each piece of data is multiplied by a, the mean value is multiplied by a.
The standard deviation is also multiplied by a.

More formally, if $y_i = a x_i + b$ then:

$\bar{y} = a\bar{x} + b$

$\text{s.d.}_y = a \times \text{s.d.}_x$ (for $a > 0$; in general the multiplier is $|a|$, since a standard deviation cannot be negative)
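
The two coding rules can be checked numerically with a short Python sketch. The values a = 3 and b = 10 are hypothetical, chosen only for illustration, and are applied to the June temperature data from the earlier example.

```python
from math import isclose, sqrt


def mean_sd(values):
    """Return (mean, s.d.) using divisor n."""
    n = len(values)
    m = sum(values) / n
    return m, sqrt(sum((x - m) ** 2 for x in values) / n)


# Hypothetical coding constants, chosen only for illustration.
a, b = 3, 10
x = [21, 23, 24, 19, 19, 20, 21]
y = [a * xi + b for xi in x]  # coded values y = a x + b

mean_x, sd_x = mean_sd(x)
mean_y, sd_y = mean_sd(y)

print(isclose(mean_y, a * mean_x + b))  # prints True: the mean follows the same coding
print(isclose(sd_y, abs(a) * sd_x))     # prints True: adding b does not change the spread
```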
Coding

Example: Find the mean and the standard deviation of the values in the table. Use the transformation below to help you.

$y = \dfrac{1}{10}x - 5$

Using the given transformation, add a y column to the table.

x     Frequency    y
50    3            0
60    5            1
70    7            2
80    4            3
90    1            4


Coding

y        Frequency, f    y × f    y² × f
0        3               0        0
1        5               5        5
2        7               14       28
3        4               12       36
4        1               4        16

Total    20              35       85

To find the mean: $\bar{y} = \dfrac{35}{20} = 1.75$

To find the s.d.: $\text{s.d.} = \sqrt{\dfrac{\sum f_i y_i^2}{\sum f_i} - \bar{y}^2} = \sqrt{\dfrac{85}{20} - 1.75^2} = 1.09$



Coding

You have now found the mean and standard deviation of y. To find them for the x values, you must reverse the coding.

We can rearrange: $y = \dfrac{1}{10}x - 5$ to get: $x = 10y + 50$

Therefore the mean of x is: $\bar{x} = 10\bar{y} + 50 = 10 \times 1.75 + 50 = 67.5$

And the standard deviation of x is: 10 × 1.09 = 10.9

Note how the coding helped to simplify the calculations by making the numbers smaller.
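
The whole coding example can be sketched end to end in Python: code the x values, find the mean and s.d. of y from the frequency table, then reverse the coding. The variable names are assumptions made for this illustration.

```python
from math import sqrt

# (x, frequency) pairs from the table.
table = [(50, 3), (60, 5), (70, 7), (80, 4), (90, 1)]

# Apply the coding y = x / 10 - 5 to each x value.
coded = [(x / 10 - 5, f) for x, f in table]

n = sum(f for _, f in coded)
mean_y = sum(f * y for y, f in coded) / n
sd_y = sqrt(sum(f * y * y for y, f in coded) / n - mean_y ** 2)

# Reverse the coding: x = 10 y + 50, so the s.d. is scaled by 10 only.
mean_x = 10 * mean_y + 50
sd_x = 10 * sd_y

print(mean_y, round(sd_y, 2))  # prints 1.75 1.09
print(mean_x, round(sd_x, 1))  # prints 67.5 10.9
```

The constant 50 shifts the mean but has no effect on the standard deviation, which is why only the factor of 10 appears when decoding the s.d.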
