Statistics & Probability
CSEC MATHEMATICS
TYPES OF DATA
• Categorical data – descriptive data, e.g. colours, types of music, design of carnival
costumes.
• Numerical data – numerical information, e.g. numbers of people in a crowd, or
measurements.
There are two types of numerical data:
• Continuous data – data that is measured and can take any value within a range, e.g. the
height of a person
• Discrete data – data that is counted and can only take specific values, e.g. the number
of books in a bag
COLLECTING DATA
When data is first collected, it is unorganised. This is called raw data.
Example: This list of test marks (out of 10) is raw data:
2, 4, 2, 6, 3, 8, 3, 3, 5, 6
We can organise it by writing the marks in order:
2, 2, 3, 3, 3, 4, 5, 6, 6, 8
MEAN, MEDIAN & MODE
There are three different types of average:
• Mean
• Median
• Mode
$\text{Mean} = \dfrac{\text{sum of the values}}{\text{number of values}}$
Example: The mean mark of the set of test marks $= \dfrac{2+2+3+3+3+4+5+6+6+8}{10} = \dfrac{42}{10} = 4.2$
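The calculation above can be checked with a short Python sketch. It uses the ten test marks from the example; the variable names are chosen only for illustration.

```python
# Raw test marks (out of 10) from the example
marks = [2, 4, 2, 6, 3, 8, 3, 3, 5, 6]

# Organise the raw data by writing the marks in ascending order
ordered = sorted(marks)

# Mean = sum of the values / number of values
total = sum(marks)
count = len(marks)
mean = total / count

print(ordered)
print(total, count, mean)
```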
The mode is the value that occurs most often.
Example: The modal mark for the set of test marks: 3
The median is the middle value when the values have been arranged in ascending or
descending order of size.
When there is an even number of values, the median is the average of the two middle
values.
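In the example of the test marks, the middle values are 3 and 4, so the median mark is 3.5.
For any set of $n$ values, the median is the $\dfrac{n+1}{2}$th value. When this is not a whole
number, e.g. 5.5 for the 10 test marks, the median lies halfway between the 5th and 6th values.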
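As a rough sketch, the median and mode rules can be written as two small Python functions. The function names `median` and `modes` are made up for this illustration, and `modes` returns a list because a data set may have more than one mode.

```python
from collections import Counter


def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def modes(values):
    """All values that occur most often, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


marks = [2, 4, 2, 6, 3, 8, 3, 3, 5, 6]
median_mark = median(marks)
modal_marks = modes(marks)
print(median_mark, modal_marks)
```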
ORGANISING DISCRETE DATA
This list of marks is raw, discrete data:
1, 4, 2, 5, 3, 4, 3, 5, 4, 4, 5, 3, 5, 2, 4, 2, 3, 4
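For an ungrouped frequency table, we list each different value against the number of times
it occurs. This number is called its frequency.
The mean value of data in a frequency table is $\bar{x} = \dfrac{\sum fx}{\sum f}$, where $f$ is the
frequency of the value $x$.
For example, when $\sum fx = 286$ and $\sum f = 60$, $\bar{x} = \dfrac{286}{60} \approx 4.8$ (to 1 decimal place).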
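A minimal sketch of an ungrouped frequency table for the list of 18 marks given in this section, followed by the mean worked out as the sum of fx divided by the sum of f. The variable names are illustrative only.

```python
from collections import Counter

# Raw, discrete data from this section
marks = [1, 4, 2, 5, 3, 4, 3, 5, 4, 4, 5, 3, 5, 2, 4, 2, 3, 4]

# Frequency table: each different value against the number of times it occurs
frequency = dict(sorted(Counter(marks).items()))

# Mean from the table: (sum of f * x) / (sum of f)
sum_fx = sum(x * f for x, f in frequency.items())
sum_f = sum(frequency.values())
mean = sum_fx / sum_f

for x, f in frequency.items():
    print(x, f)
print(sum_fx, sum_f, mean)
```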
LINE GRAPHS
A line graph is used to illustrate how the value of a quantity changes over time.
This table shows the average daily temperature at noon, in degrees Celsius, for each month
of a year on an island.
EXAMPLE
• To draw a cumulative frequency curve, plot the first point at the lower boundary of
the first class with a cumulative frequency of 0.
• Then plot the other points, using the upper class boundary of each class as the
horizontal coordinate and the cumulative frequency as the vertical coordinate.
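The plotting rule can be sketched in Python as below. The class boundaries and frequencies are invented purely for illustration and do not come from this lesson.

```python
from itertools import accumulate

# Hypothetical grouped data: classes 0-10, 10-20, 20-30, 30-40
boundaries = [0, 10, 20, 30, 40]   # lower boundary of the first class, then each upper boundary
frequencies = [3, 7, 8, 2]         # one frequency per class

# Running total of the frequencies
cumulative = list(accumulate(frequencies))

# First point: lower boundary of the first class with cumulative frequency 0.
# Other points: (upper class boundary, cumulative frequency).
points = [(boundaries[0], 0)] + list(zip(boundaries[1:], cumulative))
print(points)
```

Joining these points with a smooth curve gives the cumulative frequency curve.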
Example:
Patrick and Michael both play cricket.
Patrick’s scores over five innings are 35, 38, 38, 41 and 43.
Michael’s scores over five innings are 0, 21, 38, 38 and 98.
They both have a mean score of 39, a median of 38 and a mode of 38.
But their scores are quite different; Michael’s scores are more spread out.
RANGE
The range of a set of data is the difference between the largest value and the
smallest value.
Patrick’s range is 43 − 35 = 8, while Michael’s range is 98 − 0 = 98.
This shows that Michael’s scores are much more spread out than Patrick’s.
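A short sketch of the range calculation for the two batsmen. Patrick's scores are taken here as 35, 38, 38, 41 and 43, which is an assumption: these are the values consistent with the stated mean of 39, median of 38 and mode of 38. The range depends only on the largest and smallest scores.

```python
def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


# Assumed scores for Patrick (see the note above); Michael's are as given
patrick = [35, 38, 38, 41, 43]
michael = [0, 21, 38, 38, 98]

patrick_range = data_range(patrick)
michael_range = data_range(michael)
print(patrick_range, michael_range)
```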
QUARTILE RANGES
• The median is the middle value of a set of data.
Median item = $\dfrac{n+1}{2}$, where $n$ is the number of items or values
Lower quartile item = $\dfrac{n+1}{4}$, where $n$ is the number of items or values
Upper quartile item = $\dfrac{3(n+1)}{4}$, where $n$ is the number of items or values
EXAMPLE
• If there are 11 items, the median is the $\dfrac{11+1}{2}$ = 6th item.
• The lower quartile will be in position $\dfrac{11+1}{4} = 3$, or the 3rd item.
• The upper quartile will be in position $\dfrac{3(11+1)}{4} = 9$, or the 9th item.
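The three position formulas, (n + 1)/4, (n + 1)/2 and 3(n + 1)/4, can be checked with a small Python sketch. The function name `quartile_positions` is invented for this illustration.

```python
def quartile_positions(n):
    """Positions (counted from 1) of the lower quartile, median and upper quartile."""
    lower = (n + 1) / 4
    median = (n + 1) / 2
    upper = 3 * (n + 1) / 4
    return lower, median, upper


# 11 items, as in the example
lower, median, upper = quartile_positions(11)
print(lower, median, upper)
```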
INTERQUARTILE RANGE
The interquartile range measures the difference between the upper quartile and
the lower quartile.
It shows the range of the middle 50% of the data, and so is not affected by outliers.
The semi-interquartile range = $\dfrac{\text{interquartile range}}{2}$
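A minimal sketch of the interquartile and semi-interquartile range, using the quartile positions (n + 1)/4 and 3(n + 1)/4. The 11 scores are invented for illustration, and the sketch assumes n + 1 is a multiple of 4 so that both positions are whole numbers.

```python
def interquartile_range(values):
    """Upper quartile minus lower quartile, using whole-number quartile positions."""
    ordered = sorted(values)
    n = len(ordered)
    if (n + 1) % 4 != 0:
        raise ValueError("this sketch needs n + 1 to be a multiple of 4")
    lower_position = (n + 1) // 4        # counted from 1
    upper_position = 3 * (n + 1) // 4    # counted from 1
    lower_quartile = ordered[lower_position - 1]
    upper_quartile = ordered[upper_position - 1]
    return upper_quartile - lower_quartile


# Hypothetical data set of 11 values (not from the lesson)
scores = [3, 5, 7, 8, 9, 11, 13, 14, 16, 18, 21]
iqr = interquartile_range(scores)
semi_iqr = iqr / 2
print(iqr, semi_iqr)
```

With these values the lower quartile is the 3rd item (7) and the upper quartile is the 9th item (16).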
EXAMPLE
Tulane took 27 Maths tests. Here are her scores:
44 ÷ 4 = 11