2. Summarising data
Frequency distributions
A frequency distribution is a table showing all possible values a variable can take and the number of times the variable
takes each of those values (the frequency).
Often the data will be presented in groups (or classes). For continuous data, these groups must cover all possible data
values in the range, so there can be no gaps between the classes.
Example 1:
The following table shows the times, recorded to the nearest minute, taken by students to complete an IQ test.
The three commonly used measures of the average are the mean (often written as $\bar{x}$), the median and the mode. These can
all be found easily on the GDC.
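As a cross-check on the GDC, the three averages can also be computed with Python's standard `statistics` module. This is only a minimal sketch; it uses the data set 1, 1, 3, 6, 8 from Example 2, and the variable names are chosen here for illustration.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 6, 8]  # data set from Example 2

data_mean = mean(data)      # sum of the values divided by how many there are: 19 / 5
data_median = median(data)  # middle value once the data are in order
data_mode = mode(data)      # value that occurs most often

print(data_mean, data_median, data_mode)
```

If a data set has several values tied for most frequent, `statistics.multimode` returns all of them.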
Example 2:
For the data set 1, 1, 3, 6, 8 find:
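Key point:
The mean, $\bar{x}$, of a set of data is
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n}, \quad \text{where } n = \sum_{i=1}^{k} f_i$$
Here $x_i$ are the data values and $f_i$ are their frequencies.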
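For data given in a frequency table, the mean is the sum of each value multiplied by its frequency, divided by the total frequency. The sketch below applies that idea to a hypothetical table; the values and frequencies are invented for illustration and do not come from these notes.

```python
# Hypothetical frequency table: data value -> frequency (invented for illustration)
freq = {1: 3, 2: 5, 3: 8, 4: 4}

n = sum(freq.values())                        # total frequency
total = sum(x * f for x, f in freq.items())   # sum of (value x frequency)
table_mean = total / n

print(n, total, table_mean)
```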
Example 3:
If the mean of 1, 2, 4, 5, a, 2a is 8, find the value of a.
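One way to approach Example 3 is to write the mean as an equation in a and rearrange it. The sketch below follows that route; the variable names are illustrative only.

```python
from fractions import Fraction

known_sum = 1 + 2 + 4 + 5   # the four known data values
n = 6                       # six data values in total, including a and 2a
target_mean = 8

# (known_sum + a + 2a) / n = target_mean  =>  3a = n * target_mean - known_sum
a = Fraction(n * target_mean - known_sum, 3)

# Check by substituting back into the data set
data = [1, 2, 4, 5, a, 2 * a]
check_mean = sum(data) / n

print(a, check_mean)
```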
AIHL mathematics
Topic 4 – Statistics and probability
Example 4:
Given that the mean of the following data set is 4.2, find y.
Example 5:
For the following frequency table, estimate the mean value.
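With grouped data the individual values are unknown, so the usual approach is to treat every value in a class as if it sat at the class midpoint; this is why the result is only an estimate. The sketch below uses a hypothetical grouped table (the class boundaries and frequencies are invented for illustration and are not the table from Example 5).

```python
# Hypothetical grouped table: (lower bound, upper bound) -> frequency
grouped = {(0, 10): 4, (10, 20): 10, (20, 30): 6}

n = sum(grouped.values())  # total frequency

# Use the midpoint of each class as a stand-in for every value in that class
weighted_total = sum(((lo + hi) / 2) * f for (lo, hi), f in grouped.items())
estimated_mean = weighted_total / n

print(estimated_mean)
```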
Modal Class
It is not possible to find the mode from grouped data as we do not have any information on individual data values, so the
best we can do is identify the modal group, which is simply the one with the highest frequency.
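Picking out the modal class amounts to finding the class with the largest frequency. A minimal sketch on a hypothetical grouped table (invented for illustration, not taken from these notes):

```python
# Hypothetical grouped table: (lower bound, upper bound) -> frequency
grouped = {(0, 10): 4, (10, 20): 10, (20, 30): 6}

# The modal class is the class whose frequency is highest
modal_class = max(grouped, key=grouped.get)

print(modal_class)
```

If two classes tie for the highest frequency, `max` returns only the first one it meets, so a tie would need to be checked separately.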
Example 6:
Find the modal class in the table below.
Example 6:
Find the upper and lower quartiles of 3, 3, 5, 5, 12, 15.
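Calculators and software differ in how they define quartiles. The sketch below assumes the median-of-halves convention: the lower quartile is the median of the lower half of the ordered data and the upper quartile is the median of the upper half, with the middle value left out when the number of values is odd. The function name `quartiles` is hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    s = sorted(values)
    if len(s) < 2:
        raise ValueError("need at least two data values")
    half = len(s) // 2     # size of each half; the middle value is skipped if n is odd
    lower = s[:half]
    upper = s[-half:]
    return median(lower), median(upper)

q1, q3 = quartiles([3, 3, 5, 5, 12, 15])
print(q1, q3)
```

Other conventions (for example, the interpolating methods in `statistics.quantiles`) can give slightly different quartiles for the same data.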
How far the data values lie from the average is known as the dispersion (or spread) of the data, and there are several measures of it.
1. The range: largest data value – smallest data value (measures the width of the whole data set)
2. The interquartile range (IQR): IQR = Q3 – Q1 (measures the width of the central half of the data set)
3. The standard deviation ($\sigma$): calculated by GDC. The standard deviation can be thought of as the typical distance of
a data value from the mean.
4. The variance ($\sigma^2$): the square of the standard deviation
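The four measures above can be sketched in Python as follows. The data set is hypothetical, the quartiles assume the median-of-halves convention, and the standard deviation is taken to be the population standard deviation (`statistics.pstdev`), which is an assumption about which value is intended here.

```python
from statistics import median, pstdev, pvariance

data = sorted([2, 4, 4, 4, 5, 5, 7, 9])  # hypothetical data set

data_range = data[-1] - data[0]          # largest value minus smallest value

half = len(data) // 2
q1 = median(data[:half])                 # median of the lower half
q3 = median(data[-half:])                # median of the upper half
iqr = q3 - q1                            # width of the central half of the data

sigma = pstdev(data)                     # population standard deviation
variance = pvariance(data)               # square of the standard deviation

print(data_range, iqr, sigma, variance)
```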
Example 7:
In a quality control process, eggs are weighed and the following 10 masses, in grams, are found:
Identifying outliers
For this course the data value x is an outlier if:
$$x < Q_1 - 1.5 \times \text{IQR} \quad \text{or} \quad x > Q_3 + 1.5 \times \text{IQR}$$
Example 7:
A set of data has lower quartile 60 and upper quartile 70. Find the range of values for which data would be flagged as
outliers.
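The boundaries for Example 7 can be sketched as below, assuming the usual rule that a value is an outlier when it lies more than 1.5 × IQR below the lower quartile or above the upper quartile. The names `lower_fence`, `upper_fence` and `is_outlier` are hypothetical.

```python
q1, q3 = 60, 70                 # quartiles given in the example
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr    # values below this are flagged
upper_fence = q3 + 1.5 * iqr    # values above this are flagged

def is_outlier(x):
    """Flag x if it lies more than 1.5 x IQR outside the nearest quartile."""
    return x < lower_fence or x > upper_fence

print(lower_fence, upper_fence)
```

With these quartiles, `is_outlier(44)` is `True` while `is_outlier(45)` is `False`, because a value exactly on the boundary is not flagged under a strict inequality.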
Adding/subtracting a constant, k, to every data value will:
• change the mean, median and mode by k
• not change the standard deviation or IQR
Multiplying every data value by a positive constant, k, will:
• multiply the mean, median and mode by k
• multiply the standard deviation and IQR by k
Example 8:
The mean of a data set is 12 and the standard deviation is 15. If 100 is added to every data value, what would be the
new mean and standard deviation?
Example 9:
The median of a data set is 2.4 and the interquartile range is 3.6. If every data item is halved, find the new median
and interquartile range.
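The effect of adding a constant or multiplying by a positive constant can be checked numerically. The sketch below uses a hypothetical data set rather than the (unknown) data behind Examples 8 and 9, and it uses the population standard deviation as the measure of spread.

```python
from statistics import mean, median, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical data set

shifted = [x + 100 for x in data]      # add a constant to every value
scaled = [x * 0.5 for x in data]       # multiply every value by a positive constant

# Adding 100 moves the averages by 100 but leaves the spread unchanged
print(mean(data), mean(shifted))
print(pstdev(data), pstdev(shifted))

# Halving every value halves both the averages and the spread
print(median(data), median(scaled))
print(pstdev(data), pstdev(scaled))
```

The same reasoning applied to Example 8 gives a new mean of 12 + 100 = 112 with the standard deviation still 15, and applied to Example 9 gives a new median of 1.2 and a new interquartile range of 1.8.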
Problems
Answers