AL S1 1.2 Representation of Data

Uploaded by

liuxingba57

Head to www.savemyexams.com for more awesome resources

CIE A Level Maths: Probability & Statistics 1
1.2 Representation of Data
Contents
1.2.1 Data Presentation
1.2.2 Stem and Leaf Diagrams
1.2.3 Box Plots & Cumulative Frequency
1.2.4 Histograms

© 2015-2024 Save My Exams, Ltd. · Revision Notes, Topic Questions, Past Papers

1.2.1 Data Presentation


Data Presentation
What graphs and diagrams should I be familiar with?
You will be expected to be able to use a variety of graphs such as:
Stem-and-leaf diagrams
Can be used with ungrouped data of a single variable
Shows all the data and the shape of its distribution
Box plots
Can be used with ungrouped data of a single variable
Shows the range, interquartile range and quartiles clearly
Very useful for comparing data patterns quickly
Cumulative frequency graphs
Can be used with continuous grouped data of a single variable
Shows the running total of the frequencies that fall below the upper bound of each class
Histograms
Can be used with continuous grouped data of a single variable
Can be used with varying group sizes
Shows the frequency of each group, represented by the area of each bar
You might be expected to draw a full diagram or to add to an incomplete diagram
What should I look out for when interpreting graphs?
Look carefully at the context of the information given in the graph
Check the scales on both axes carefully, including units
Sometimes the numbers will be abbreviated to fit on the scale, for example if a population is given
in millions then the number 60 will represent 60 000 000
Look carefully at the labels and units to determine how a value should be read
If there is more than one graph represented on the same set of axes take extra care to ensure you are
reading from the correct one
Beware of misleading graphs: the scales on the axes, the units and the representation can all be
manipulated to make a graph look more or less convincing

Worked example
A student is collecting information on his friends’ interests and believes that his friends who only have
dogs spend more time outside than his friends who only have cats. He has surveyed 20 friends with
only cats and 20 friends with only dogs and has written down the total amount of time, rounded to the
nearest hour, each of them spent outside last week. Describe, with a reason, which diagram would be
best for the student to use to display the data.


Examiner Tip
Take the time needed when working with diagrams. They are usually ‘easy marks’ questions, but it is
common for students to rush them and make silly mistakes.


1.2.2 Stem and Leaf Diagrams


Stem and Leaf Diagrams
What is a stem and leaf diagram?
A stem and leaf diagram shows ALL RAW data and groups it into class intervals
Stem and leaf diagrams lend themselves to two-digit data but can be used with three-digit data, rarely
more

The numbers in brackets indicate how many values are in that class interval
These are not always included but can be useful when there is a large amount of data to display
How do I draw a stem and leaf diagram?
Identify the stems and the leaves
Leaves are always single digits
e.g. the number 122 would be represented by 12 | 2
If starting from unordered data draw two diagrams
The first diagram should get the data into the right format
i.e. a list of stems with their corresponding leaves
The second diagram should have stems and leaves in order, with a key
This helps accuracy as values are less likely to be missed out
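
The two-pass method above can be sketched in Python. This is only an illustration: the function names and the data are invented here, and it assumes non-negative whole numbers whose last digit is the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: ordered leaves} for non-negative whole numbers.

    The leaf is the final digit and the stem is everything before it,
    so 122 becomes stem 12, leaf 2.
    """
    rows = defaultdict(list)
    # First pass: put each value into the right row, in any order
    for v in values:
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    # Second pass: list every stem from smallest to largest and order the leaves
    stems = range(min(rows), max(rows) + 1)
    return {stem: sorted(rows[stem]) for stem in stems}

def format_diagram(rows):
    """Lay out each row as 'stem | leaves  (count)'."""
    lines = []
    for stem, leaves in rows.items():
        leaf_text = " ".join(str(leaf) for leaf in leaves)
        lines.append(f"{stem:>3} | {leaf_text}  ({len(leaves)})")
    return "\n".join(lines)

# Hypothetical data, invented for illustration only
times = [23, 35, 31, 47, 28, 35, 40, 29, 52]
diagram = stem_and_leaf(times)
print(format_diagram(diagram))
print("Key: 2 | 3 represents 23")
```

Counting the leaves in each row, as the bracketed numbers do, is a quick way to check that no value has been missed.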
What are stem and leaf diagrams used for?
The data is arranged into classes so at a glance it is possible to see the modal class interval
As the data is in order the median, quartiles, maximum and minimum can be identified easily
Check you can do this – find the minimum, maximum, median and upper and lower quartiles from
the stem and leaf diagram at the start of this revision note
Note that these five values are those needed in order to construct a box‑and-whisker diagram
(box plot)
Outliers, once defined, can be easily identified and removed
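
As a rough sketch, the five values and any outliers can be found from ordered data as follows. The data is invented, the quartiles are taken as the medians of the lower and upper halves (one common convention, assumed here), and the outlier rule of 1.5 × interquartile range beyond the quartiles is also an assumption, since a question will normally state the definition to use.

```python
from statistics import median

def five_number_summary(ordered):
    """Return (min, Q1, median, Q3, max) for data already sorted ascending.

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves, leaving out the middle value when the count is odd.
    """
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + n % 2:]  # skip the middle value when n is odd
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Hypothetical data, invented for illustration only
data = sorted([23, 28, 29, 31, 35, 35, 40, 47, 52, 58, 95])
minimum, q1, q2, q3, maximum = five_number_summary(data)
iqr = q3 - q1

# Assumed outlier definition: more than 1.5 x IQR beyond the quartiles
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(minimum, q1, q2, q3, maximum, iqr, outliers)
```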
What about back-to-back stem and leaf diagrams?
These are used when it is helpful for the data to be split into two comparable categories such as
boy/girl, child/adult, UK/non-UK, etc.


Note that the leaves on the left-hand side of the stems (Boys) increase from the centre outwards
Are there any variations on stem and leaf diagrams?
There are a few minor variations on stem and leaf diagrams that you may see online or in different
textbooks

Some or all the different/extra features in the diagram above may appear
These differences can be applied to back-to-back stem and leaf diagrams
With large amounts of data, the stems may be split into two rows
Every stem will be listed twice
The first row for a stem will contain leaves 0 - 4
The second row will contain leaves 5 - 9
What might I be asked to do with a stem and leaf diagram?
You may be asked to draw or complete a stem and leaf diagram
Find statistical measures – median, quartiles and interquartile range in particular
From which you may be required to draw a box-and-whisker diagram
Identify and remove outliers
Compare data shown by stem and leaf diagrams (either separate or back-to-back); comment on two
things, each in terms of both the maths and the context of the question
a comment about average (use median)
e.g. the girls’ median of 88% was higher than the boys’ median of 65% so on average the girls
performed better on the test


a comment about variation (spread) (use interquartile range)


e.g. the girls’ interquartile range of 30% was greater than the boys’ 15% so the boys had
more consistent scores on the test
Analyse what would happen to statistical measures such as the median and quartiles if a value
changed or a new value were to be added to the data
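
A short sketch of this kind of analysis, using invented data: adding one value below the median and one above it leaves the median unchanged, whereas adding two values on the same side shifts it.

```python
from statistics import median

# Hypothetical ordered times in seconds, invented for illustration only
adults = [25, 27, 30, 33, 36, 38, 41]

before = median(adults)
# One new value either side of the median: the middle value stays the same
balanced = median(adults + [23, 42])
# Two new values both below the median: the middle moves down
lopsided = median(adults + [20, 22])

print(before, balanced, lopsided)
```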

Worked example
The following stem and leaf diagrams show the times taken by some children and adults to complete a
level on a computer game.

2 | 3 represents a time of 23 seconds


(a) Compare the times taken to complete the level between the children and the adults.
(b) It is later discovered that two of the adults’ times, 23 and 42 seconds, had been omitted
from the diagram.
Briefly explain whether adding these times would change the adults’ median time.


Examiner Tip
Accuracy is important
(Lightly) tick off values as you add them to a stem and leaf diagram
Check you have the right number of data values in total on your diagram
Other checks can include ensuring the median has the same number of values either side
of it


1.2.3 Box Plots & Cumulative Frequency


Box Plots
What is a box plot?
A box plot is a graph that clearly shows key statistics from a data set
It shows the median, quartiles, minimum and maximum values and outliers
It does not show any other individual data items
The middle 50% of the data will be represented by the box section of the graph and the lower and
upper 25% of the data will be represented by each of the whiskers
Only one axis is used when graphing a box plot
It is still important to make sure the axis has a clear, even scale and is labelled with units
Box plots are often used for comparing two sets of data
Both box plots will be drawn one above the other on the same scale on the x-axis
They are useful for comparing data because it is easy to see the main shape of the distribution of
the data from a box plot


Worked example
The incomplete box plot below shows the tail lengths in cm of some students’ pets.

(i) Given that the median tail length was 21 cm, complete the box plot.

(ii) Find the range and interquartile range of the tail lengths.


Examiner Tip
Remember a box plot is a graph and should be treated like one, even though there is only one axis.
It should have a title and a clear, even scale that is labelled with units if there are any. If drawing
two box plots on the same axis, label each one clearly.


Cumulative Frequency
What is a cumulative frequency graph?
A cumulative frequency graph is used with data that has been organised into a grouped frequency
table
The cumulative frequency graph can be used to find estimates of percentiles and quartiles
As the data is grouped, it is not possible to find the actual values of these statistics
What are the main features of a cumulative frequency graph?
A cumulative frequency graph considers how much data there is up to a certain value, including the
data in that group and all of the groups below it
Cumulative frequency will always be plotted on the y – axis
Consider the scale carefully because this will usually be a large number
You may be asked to add to one or both axes, remember to label both axes clearly and include
units on the x – axis if they are needed
The cumulative frequency is calculated by adding the frequency in each group, or class, to the
frequency in the ones before
This is essentially accumulating the frequencies as you go
The cumulative frequency for each class must be plotted against the upper boundary of each
corresponding class
The cumulative frequency that corresponds with each upper boundary will not only consider the
frequency of the data in that class, but all of the data in the groups below it too
When the points have been plotted they should be joined up with a smooth curve
However, some may be joined with straight lines from point to point
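
The accumulation described above can be sketched in a few lines of Python. The grouped data is invented for illustration, and starting the graph at the lowest boundary with a cumulative frequency of 0 is a common convention assumed here.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(30, 35, 4), (35, 40, 10), (40, 45, 16), (45, 50, 12), (50, 60, 8)]

upper_bounds = [upper for _, upper, _ in classes]
frequencies = [freq for _, _, freq in classes]

# Running total of the frequencies, one entry per class
cumulative = list(accumulate(frequencies))

# Points to plot: (upper boundary, cumulative frequency)
points = list(zip(upper_bounds, cumulative))
# Assumed convention: begin at the lowest boundary with nothing accumulated
points.insert(0, (classes[0][0], 0))

print(points)
```

The final cumulative frequency should equal the total frequency, which gives a quick check on the arithmetic.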
How do we read statistics from a cumulative frequency graph?
Quartiles and percentiles can be read from a cumulative frequency graph
The median, $Q_2$, is read from the y – axis scale at the $\frac{n}{2}$th value
The lower quartile, $Q_1$, is read from the $\frac{n}{4}$th value and the upper quartile, $Q_3$, is read from the $\frac{3n}{4}$th value
Any percentile can be read from the graph by finding the percent of the total frequency and
reading from the value on the y - axis
To read the corresponding data value once the position on the y – axis is known, use a ruler to draw a
line from the y – axis to the graph and then down to the x – axis
Sometimes the frequency of values greater than or less than a particular data value will need to be
found; this time you will have to read from the x – axis to the y – axis
Take particular care if the question asks for a frequency greater than a particular data value: the
value found from the y – axis will need to be subtracted from the total frequency
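
The readings above can be imitated numerically. The sketch below assumes the plotted points are joined with straight lines (so it uses linear interpolation, not a smooth curve) and uses invented points; the helper names are hypothetical.

```python
def interpolate(x, xs, ys):
    """Read y at x from straight-line segments; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        y0, y1 = ys[i], ys[i + 1]
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the plotted range")

# Hypothetical plotted points: class boundaries and cumulative frequencies
bounds = [30, 35, 40, 45, 50, 60]
cum_freq = [0, 4, 14, 30, 42, 50]
n = cum_freq[-1]  # total frequency

def value_at(cf):
    """Read across from the y-axis to the graph, then down to the x-axis."""
    return interpolate(cf, cum_freq, bounds)

def cf_at(value):
    """Read up from the x-axis to the graph, then across to the y-axis."""
    return interpolate(value, bounds, cum_freq)

q1 = value_at(n / 4)
q2 = value_at(n / 2)
q3 = value_at(3 * n / 4)
iqr = q3 - q1

# Frequency greater than 55: subtract the reading from the total frequency
more_than_55 = n - cf_at(55)
percent_more_than_55 = 100 * more_than_55 / n

print(q1, q2, q3, iqr, more_than_55, percent_more_than_55)
```

Readings taken by eye from a hand-drawn smooth curve will differ slightly from these interpolated values, which is why they are only ever estimates.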


Worked example
The cumulative frequency graph below shows the lengths in cm, l , of a group of puppies in a training
group.

(i) Given that the group 40 < l ≤ 45 was one of the groups used in the data collection,
find the number of puppies that were in this group.

(ii) Use the graph to find an estimate for the interquartile range of the puppies.

(iii) x % of the puppies are greater than 53.5 cm long. Use your graph to find an estimate
for the value of x .


Examiner Tip
If you are asked to read values from your graph make sure you use a ruler and mark the lines on
clearly to show where you took your readings from. Remember that the graph shows the
accumulated frequencies so if you need only the frequency you may need to subtract the
previous value.


1.2.4 Histograms
Histograms
What is a histogram?
A histogram is similar to a bar chart but with some key differences
A histogram is for displaying grouped continuous data whereas a bar chart is for discrete or
qualitative data
There will never be any gaps between the bars of adjacent groups in a histogram
Whilst in a bar chart the frequency is read from the height of the bar, in a histogram the height of
the bar is the frequency density
On a histogram frequency density is plotted on the y – axis
This allows a histogram to be plotted for unequal class intervals
It is particularly useful if data is spread out at either or both ends
The area of each bar on a histogram will be proportional to the frequency in that class
How do I draw a histogram?
Step 1. Always check that there are no gaps between the upper boundary of a class and the lower
boundary of the next class
If there are gaps you will need to close them by changing the boundaries before carrying out any
calculations
Consider whether the values are rounded or truncated before closing the gaps
Step 2. Find the class width of each group by subtracting the lower boundary from the upper
boundary
Step 3. Calculate the frequency density for each group using the formula:
$$\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$$
Step 4. The histogram will be drawn with the data values on the x – axis and frequency density on the y – axis
Remember that the scale on both axes must be even, although the class widths may be uneven
Both axes should be clearly labelled and units included on the x – axis
Most often, the bars will have different widths
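
Steps 1 to 3 can be sketched as below. The classes are invented, and the gap-closing step assumes the values were rounded to the nearest whole unit, so each class limit moves by 0.5; truncated data would need different boundaries.

```python
# Hypothetical classes recorded to the nearest whole unit:
# (lower limit, upper limit, frequency)
raw_classes = [(10, 19, 6), (20, 24, 10), (25, 29, 12), (30, 49, 8)]

def close_gaps(classes, half_gap=0.5):
    """Step 1: turn class limits into boundaries.

    Assumes rounding to the nearest unit, so 10-19 becomes 9.5-19.5.
    """
    return [(lo - half_gap, hi + half_gap, freq) for lo, hi, freq in classes]

def frequency_densities(classes):
    """Steps 2 and 3: frequency divided by class width, for each class."""
    return [freq / (hi - lo) for lo, hi, freq in classes]

closed = close_gaps(raw_classes)
widths = [hi - lo for lo, hi, _ in closed]
densities = frequency_densities(closed)

print(widths)
print(densities)
```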
How do we interpret a histogram?
It is important to remember that the y – axis does not tell us the frequency of each bar in the histogram
The frequency of a class is found by
Frequency = Frequency Density × Class Width
You may be asked to find the frequency of part of a bar within a histogram
Find the area of that section of the bar using any information you have already found out
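
One way to sketch the "part of a bar" calculation is to add up the area of every bar, or section of a bar, that lies inside the interval of interest. The bars below are invented, and the estimate assumes the data is spread evenly within each class.

```python
def estimate_frequency(bars, start, end):
    """Estimate the frequency between start and end from histogram bars.

    bars: (lower boundary, upper boundary, frequency density)
    Assumes values are spread evenly within each class.
    """
    total = 0.0
    for lo, hi, density in bars:
        # Width of the section of this bar that lies inside [start, end]
        overlap = min(hi, end) - max(lo, start)
        if overlap > 0:
            total += density * overlap  # area = frequency density x width
    return total

# Hypothetical histogram bars, invented for illustration only
bars = [(0, 10, 1.2), (10, 15, 3.0), (15, 20, 2.0), (20, 40, 0.5)]

part_and_whole = estimate_frequency(bars, 12, 20)  # part of one bar plus a whole bar
half_bar = estimate_frequency(bars, 30, 40)        # half of the widest bar

print(part_and_whole, half_bar)
```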


Worked example
The table below and its corresponding histogram show the mass, in kg, of some newborn bottlenose
dolphins.

Mass, m (kg)    Frequency
4 ≤ m < 8       4
8 ≤ m < 10      15
10 ≤ m < 12     19
12 ≤ m < 15     9
15 ≤ m < 30     6

(a) Complete the histogram.

(b) Estimate the number of dolphins whose mass is greater than 13 kg.
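
The arithmetic behind both parts can be set out from the table alone. This is a sketch of the working, not the original model answer; part (b) assumes the masses are spread evenly within the 12 ≤ m < 15 class.

```latex
% Frequency density = frequency / class width, for each class in the table
\begin{align*}
4 \le m < 8:   &\quad \frac{4}{8 - 4} = 1 \\
8 \le m < 10:  &\quad \frac{15}{10 - 8} = 7.5 \\
10 \le m < 12: &\quad \frac{19}{12 - 10} = 9.5 \\
12 \le m < 15: &\quad \frac{9}{15 - 12} = 3 \\
15 \le m < 30: &\quad \frac{6}{30 - 15} = 0.4
\end{align*}

% (b) Part of the 12 <= m < 15 bar (from 13 to 15) plus the whole of the last bar
\[
\text{estimate} = 3 \times (15 - 13) + 6 = 12
\]
```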


Examiner Tip


Look carefully at the scales on the axes; it will rarely be a simple 1 unit to 1 square.
