T3C1 Module

This document discusses different types of data and methods for organizing and visualizing data, including: 1) Data are either qualitative or quantitative, and quantitative data are further classified as discrete or continuous. Qualitative data describe attributes, while quantitative data are measured numerically. 2) Stem-and-leaf diagrams organize data into "stems" and "leaves" to show the distribution and to compare data sets. 3) Histograms use class intervals and frequencies to display the distribution of quantitative data, while cumulative frequency curves (ogives) show the running total of frequencies up to each class interval.

Uploaded by

Wendy Boon
Copyright
© All Rights Reserved

STPM Mathematics T [Term 3] Chapter 1

DATA DESCRIPTION

1.1 Types of Data


Data can be categorised broadly as qualitative data or quantitative data.
Qualitative data, also known as categorical data, describe attributes and cannot be measured directly by an instrument. They are used to show relative preference or comparison.
Example: colour, blood type, race, type of material
Quantitative data are numerical data that can be counted or measured, usually in a standard unit.
Example: mass, height, age, blood pressure, revenue, number of customers

Quantitative data can be further categorised into two types: discrete data and continuous data.
Discrete data represent counts and take only separate, distinct values. Example: number of absentees, number of books, number of subjects
Continuous data can take any value within a range; the recorded value is limited only by the precision of the measuring instrument. Example: height (165.4 cm, 178.8 cm), mass (58.9 kg, 60.0 kg, 76.3 kg).

1.2 Stem-and-Leaf Diagram


Stem-and-leaf plot
• Data are organised into two columns: the first column is the stem (the leading digit or digits) and the second column is the leaf (the final digit). A key is given to explain how each entry is read.
• Two different sets of raw data can be compared using a back-to-back stem-and-leaf plot, in which both sets share a common stem.
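
The construction described above can be sketched in a few lines of Python. This is only an illustration: the function name `stem_and_leaf` is hypothetical, the tens digit is taken as the stem, and the input is just the first row of the ship counts from Example 1.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative whole numbers by stem (tens) and leaf (units)."""
    plot = defaultdict(list)
    for value in sorted(values):          # sorting first keeps the leaves in order
        stem, leaf = divmod(value, 10)    # e.g. 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

# First row of the ship counts in Example 1
ships = [23, 25, 16, 47, 30, 12, 22]
plot = stem_and_leaf(ships)

for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
print("Key: 1 | 2 means 12 ships")
```

Stems with no data are simply skipped by this sketch; on paper they are normally still written, with an empty leaf row.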

EXAMPLE 1

1. The number of ships anchored at a port every week is recorded for 28 weeks. The data
obtained are shown below. Construct a stem-and-leaf diagram to represent the data.
23 25 16 47 30 12 22
29 31 14 28 26 26 17
23 15 13 41 28 10 30
15 26 26 20 21 52 29

Dr. Ley 1
016-8674543

2. The heights of students (to the nearest cm) in a class are given below.
155 145 153 142 155 157
156 149 144 157 150 147
155 154 152 153 151 148
151 152 147 156 146 144
Construct a stem-and-leaf plot for the heights of these students using class intervals of 5
cm.

3. Draw a stem-and-leaf plot for the following data.


10 22 5 15 18 12 8 16 14 12
21 17 16 14 13 11 9 19 14 11

4. Below are the ages of lecturers in College A and College B.

College A College B

50 41 32 36 26 36 53 53 33 30 39 42 45 37 28 25 53 33 22 40
25 38 52 48 54 45 47 47 38 24 41 51 49 25 44 24 33 35 37 24
24 30 47 49 52 50 45 44 47 44 25 35 24 26 29 53 26 23 28 38

Draw a back-to-back stem-and-leaf plot for the data above.


1.3 Histograms and Cumulative Frequency Curves (Ogives)


Histogram
• Constructed from a frequency distribution table
• Similar to a bar chart, but with no gaps between the bars
• The horizontal axis is usually marked with the lower and upper class boundaries

Class intervals and their boundaries:

Height      Frequency   Lower boundary   Upper boundary
150 – 159   8
160 – 169   12

Class width = Upper boundary − Lower boundary
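
As a rough illustration of how class boundaries and class width are obtained from class limits, the sketch below assumes the heights are recorded to the nearest cm, so each boundary lies half a unit beyond the stated limit. The function name `class_boundaries` and its `precision` parameter are hypothetical.

```python
def class_boundaries(lower_limit, upper_limit, precision=1):
    """Return (lower boundary, upper boundary) for a class such as 150 - 159.

    precision is the unit to which the data were recorded
    (assumed here to be 1 cm).
    """
    half_gap = precision / 2
    return lower_limit - half_gap, upper_limit + half_gap

lower, upper = class_boundaries(150, 159)
width = upper - lower   # class width = upper boundary - lower boundary

print(lower, upper, width)
```

Note that the width comes out as 10, not 9: it is measured between boundaries, not between the stated class limits.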


EXAMPLE 2

1. The masses (to the nearest kg) of 40 students are shown below.

42 51 53 63 43 65 54 41
51 52 58 60 40 46 45 68
54 48 60 53 51 51 54 58
52 40 61 47 49 46 44 58
53 45 63 67 56 48 59 49

Complete the following table.

Mass (kg) Frequency Lower boundary Upper boundary

40 – 44 6

45 – 49

50 – 54

55 – 59

60 – 64

65 – 69

2. Complete the following table to represent the data.

22.6 22.1 22.2 21.5 23.4
23.5 21.2 23.0 21.5 21.9
21.6 21.8 21.7 21.7 22.0
21.6 21.9 22.8 22.7 23.7
23.4 22.6 22.8 22.9 22.7
22.8 22.2 23.3 22.3 22.7
23.0 23.2 22.1 22.5 22.8
22.1 22.5 22.3 21.4 22.7
21.1 22.5 22.4 22.2 23.1

Diameter (cm)   Lower boundary   Frequency
21.0 – 21.4
21.5 – 21.9
22.0 – 22.4
22.5 – 22.9
23.0 – 23.4
23.5 – 23.9


3. Complete the table below and construct a histogram.

Mass (kg) Frequency Lower boundary Upper boundary

5–8 5

9 – 12 7

13 – 16 10

17 – 20 8

21 – 24 6


4. Complete the table and construct a histogram.

Volume (ml) Frequency Lower boundary Upper boundary

3.0 – 3.2 7

3.3 – 3.5 5

3.6 – 3.8 9

3.9 – 4.1 8

4.2 – 4.4 10


Cumulative frequency and Ogive


The cumulative frequency of a class is the total of the frequencies of all classes up to and including that class.
An ogive is the curve obtained by plotting cumulative frequency against upper class boundary. It is
particularly useful for estimating the median and quartiles of a data set.

EXAMPLE 3

1. The table below shows the masses of 25 students in a class.

Mass (kg) 30 – 37 38 – 45 46 – 53 54 – 61 62 – 69

Frequency 2 4 13 5 1

Complete the following table.

Mass (kg)   Frequency   Upper boundary   Cumulative frequency
            0           29.5             0
30 – 37     2
38 – 45     4
46 – 53     13
54 – 61     5
62 – 69     1

Take note!
When constructing a cumulative frequency table, an additional class interval before the
first class interval is added to introduce the zero cumulative frequency value.
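
The running totals for the mass table in Example 3, question 1 can be sketched as follows. The sketch assumes the masses are recorded to the nearest kg, so each upper boundary is the upper class limit plus 0.5, which agrees with the 29.5 given in the table for the extra starting class.

```python
from itertools import accumulate

upper_limits = [37, 45, 53, 61, 69]   # upper class limits from the table
frequencies = [2, 4, 13, 5, 1]

# Extra starting point (29.5, 0) so that the ogive begins at zero
upper_boundaries = [29.5] + [limit + 0.5 for limit in upper_limits]
cumulative = [0] + list(accumulate(frequencies))

for boundary, total in zip(upper_boundaries, cumulative):
    print(f"{boundary:>5}  {total:>3}")
```

The pairs printed here are the points that would be plotted and joined with a smooth curve to form the ogive; the last cumulative frequency should always equal the total frequency, 25.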


2. The table below shows the marks of 40 students in a test.

Marks       20 – 29   30 – 39   40 – 49   50 – 59   60 – 69   70 – 79

Frequency   5         16        10        6         2         1

(a) Complete the table below.

Marks     Frequency   Upper boundary   Cumulative frequency
          0                            0
20 – 29   5
30 – 39

(b) Construct an ogive with a suitable scale.

(c) Using your graph, estimate the number of students who failed the test if the pass
mark is 36.

(d) 5% of the students are awarded grade A. Use your ogive to estimate the
minimum mark needed to qualify for grade A.

(e) Use your ogive to estimate the median, quartiles and interquartile range.


1.4 Measures of Central Tendency and Dispersion


Measures of central tendency describe the centre of a data set. The three common measures are
the mean, median and mode.
Mean – the arithmetic average: the sum of the values divided by the number of values. In probability it is also known as the expected value.
Median – the value that falls in the middle when the data are arranged in ascending order. For an even number of values, it is the average of the two middle values.
Mode – the value with the highest frequency.
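
The three measures can be checked quickly with Python's standard `statistics` module. This is a small sketch using the seven values from Example 4, question 1, not a substitute for the written working.

```python
import statistics

data = [77, 76, 68, 57, 63, 76, 67]

mean = statistics.mean(data)      # sum of the values divided by 7
median = statistics.median(data)  # middle value after sorting
mode = statistics.mode(data)      # most frequent value

print(sorted(data))
print(round(mean, 2), median, mode)
```

Sorting the data by hand first, as the printed list shows, makes the median (the 4th of 7 values) easy to read off.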

EXAMPLE 4

1. Determine the mode, mean and median of the following set of data.
77, 76, 68, 57, 63, 76, 67

2. Determine the mode, mean and median of the following set of data.
15, 12, 25, 31, 8, 18, 25, 30, 36

3. Given that the mean of the set of data 5, 7, 9, 12, 𝑥, 19, 9, 10 is 11.
(a) Find the value of 𝑥.

(b) Determine the median and mode of the set of data.


4. The following frequency distribution table shows the number of pencils each student has
for a group of students.

Number of pencils 1 2 3 4 5

Number of students 4 7 5 𝑦 1

(a) Given that the mean number of pencils is 2.5, calculate the value of 𝑦.

(b) State the mode and determine the median of the data.

5. A set of numbers has eight numbers. The mean of the set of numbers is 11.
(a) Find ∑ 𝑥.

(b) When a number 𝑚 is added to the set of numbers, the mean becomes 10. Find 𝑚.


Mean, Median and Mode for Grouped Data


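Mean → $\bar{x} = \dfrac{\sum fx}{\sum f}$, where $x$ is the midpoint of each class

Modal class → the class having the highest frequency

Mode → determined from the histogram
Mode → using the formula: $m_o = a + \left(\dfrac{d_1}{d_1 + d_2}\right) c$
where 𝑎 = lower boundary of the modal class
$d_1$ = frequency of the modal class minus frequency of the class before it
$d_2$ = frequency of the modal class minus frequency of the class after it
𝑐 = class width of the modal class

Median → $m = L + \left(\dfrac{\frac{N}{2} - F}{f}\right) C$
where 𝐿 = lower boundary of the class in which the median lies
𝑁 = total frequency
𝐹 = cumulative frequency before the median class
𝐶 = class width (size of the class interval)
𝑓 = frequency of the median class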
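
The grouped-data formulas can be tried out on the mass table from Example 3, question 1 (classes 30 – 37 up to 62 – 69). The sketch below assumes masses recorded to the nearest kg, so the lower boundaries end in .5 and every class has width 8. The function names are hypothetical, and in the mode formula `d1` and `d2` are taken as the differences between the modal-class frequency and the frequencies of the classes before and after it.

```python
lower_boundaries = [29.5, 37.5, 45.5, 53.5, 61.5]
frequencies = [2, 4, 13, 5, 1]
width = 8  # class width C, the same for every class here

def grouped_mean(lower_boundaries, frequencies, width):
    midpoints = [lb + width / 2 for lb in lower_boundaries]
    return sum(f * x for f, x in zip(frequencies, midpoints)) / sum(frequencies)

def grouped_median(lower_boundaries, frequencies, width):
    N = sum(frequencies)
    F = 0  # cumulative frequency before the current class
    for L, f in zip(lower_boundaries, frequencies):
        if F + f >= N / 2:               # first class reaching N/2 is the median class
            return L + (N / 2 - F) / f * width
        F += f

def grouped_mode(lower_boundaries, frequencies, width):
    i = frequencies.index(max(frequencies))          # modal class
    before = frequencies[i - 1] if i > 0 else 0
    after = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = frequencies[i] - before
    d2 = frequencies[i] - after
    return lower_boundaries[i] + d1 / (d1 + d2) * width

mean = grouped_mean(lower_boundaries, frequencies, width)
median = grouped_median(lower_boundaries, frequencies, width)
mode = grouped_mode(lower_boundaries, frequencies, width)
print(round(mean, 2), round(median, 2), round(mode, 2))
```

All three results are estimates, because grouping hides the individual values; they should all fall inside or near the modal class 46 – 53.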

EXAMPLE 5

1. Identify the modal class and calculate the mean for each of the following tables.
(a)
Number of books   Frequency
0 – 4             5
5 – 9             8
10 – 14           12
15 – 19           10
20 – 24           5

(b)


2. For each of the following frequency distribution tables, construct a histogram to
represent the data and determine its mean, median and mode.

(a)
Bonus (RM) Frequency

100 – 190 4

200 – 290 13

300 – 390 12

400 – 490 6

500 – 590 3

600 – 690 2


(b)

Time record (s) Frequency

50.0 – 52.9 1

53.0 – 55.9 3

56.0 – 58.9 11

59.0 – 61.9 15

62.0 – 64.9 12

65.0 – 67.9 8


3. Determine the mode from the following histogram.


4. Estimate the median of the sets of data from the given ogive.


5. Determine the mode of each of the following frequency distribution tables without using a
histogram.
(a)

Mass (g)            100 – 150   150 – 200   200 – 250   250 – 300   300 – 350
Number of mangoes   28          75          42          26          10

(b)

Number of accidents   0 – 4   5 – 9   10 – 14   15 – 19   20 – 24   25 – 29   30 – 34
Number of weeks       4       6       11        15        8         5         3


Frequency Density and Non-Uniform Histogram


A histogram with unequal class widths differs from one with equal class widths: the height of
each rectangle no longer represents the frequency. Instead, the area of each rectangle is
proportional to the frequency, and the height is given by the frequency density:
$\text{Frequency density} = \dfrac{\text{Frequency}}{\text{Class width}}$
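
As a rough check of the idea, the sketch below computes frequency densities for the waiting-time table in Example 6, question 2, where the class boundaries are given directly. The variable names are illustrative only.

```python
# Class boundaries and frequencies from the waiting-time table (80 patients)
boundaries = [0, 3.5, 10.5, 14.0, 17.5, 24.5, 35.0, 49.0]
frequencies = [12, 18, 15, 6, 8, 15, 6]

widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
densities = [f / w for f, w in zip(frequencies, widths)]

for lo, hi, d in zip(boundaries, boundaries[1:], densities):
    print(f"{lo:>5} - {hi:<5} density = {d:.4f}")

# Area of each rectangle (height x width) recovers the frequency
total_area = sum(d * w for d, w in zip(densities, widths))
print(total_area)
```

The last line illustrates why frequency density is used: the total area of the rectangles equals the total frequency, whatever the class widths are.
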
EXAMPLE 6

1. Calculate the frequency density for each class of the following data and construct a
histogram to represent the data.

Time (min)   4 – 7   8    9 – 10   11   12 – 16   17 – 20
Frequency    12      20   18       22   15        13

Time (min)          4 – 7   8    9 – 10   11   12 – 16   17 – 20
Class width
Frequency           12      20   18       22   15        13
Frequency density


2. The waiting time for 80 patients who are seeking treatment from a doctor in a clinic is
shown in the following frequency distribution.

Waiting time (minutes) Number of patients

0 – 3.5 12

3.5 – 10.5 18

10.5 – 14.0 15

14.0 – 17.5 6

17.5 – 24.5 8

24.5 – 35.0 15

35.0 – 49.0 6

Construct a histogram to represent the data.


3. The histogram shows information about the heights of some tomato plants.

26 plants have a height of less than 20 cm. Work out the total number of tomato plants.


4. The histogram gives information about the heights of some plants.

There are 360 plants with a height of 20 cm or less.


Work out the number of plants with a height of more than 20 cm.


Measures of dispersion describe how widely a set of data is spread.
The measures of interest include the range, interquartile range, variance and standard deviation.

Range refers to the difference between the largest and smallest values.
Range = Largest value − Smallest value
Interquartile range
Interquartile range = Third quartile, $Q_3$ − First quartile, $Q_1$
For grouped data, the first and third quartiles can be determined by modifying the formula for
the median:
First quartile → $Q_1 = L + \left(\dfrac{\frac{N}{4} - F}{f}\right) C$
where 𝐿 = lower boundary of the class in which the first quartile lies
𝑁 = total frequency
𝐹 = cumulative frequency before the quartile class
𝐶 = class width (size of the class interval)
𝑓 = frequency of the quartile class
Third quartile → $Q_3 = L + \left(\dfrac{\frac{3N}{4} - F}{f}\right) C$
where 𝐿 = lower boundary of the class in which the third quartile lies
𝑁 = total frequency
𝐹 = cumulative frequency before the quartile class
𝐶 = class width (size of the class interval)
𝑓 = frequency of the quartile class
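
Since the quartile formulas differ from the median formula only in the position N/4 or 3N/4, one helper can cover both. The sketch below is an illustration under assumptions: `grouped_quantile` is a hypothetical name, and the data are the masses from Example 3, question 1 with boundaries taken to the nearest 0.5 kg and a class width of 8.

```python
def grouped_quantile(lower_boundaries, frequencies, width, fraction):
    """Estimate the value below which `fraction` of the grouped data lies."""
    N = sum(frequencies)
    position = fraction * N      # N/4 for Q1, 3N/4 for Q3
    F = 0                        # cumulative frequency before the current class
    for L, f in zip(lower_boundaries, frequencies):
        if f > 0 and F + f >= position:
            return L + (position - F) / f * width
        F += f
    raise ValueError("fraction must lie between 0 and 1")

lower_boundaries = [29.5, 37.5, 45.5, 53.5, 61.5]
frequencies = [2, 4, 13, 5, 1]

q1 = grouped_quantile(lower_boundaries, frequencies, 8, 0.25)
q3 = grouped_quantile(lower_boundaries, frequencies, 8, 0.75)
iqr = q3 - q1
print(round(q1, 2), round(q3, 2), round(iqr, 2))
```

Here both quartiles happen to fall in the same class (46 – 53) because that class holds 13 of the 25 students, so the same L, F and f are used in both formulas.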

Standard deviation measures how far the data typically lie from the mean. Variance
is the square of the standard deviation. The formulae are summarised as follows:

Standard deviation, $\sigma$
Ungrouped data: $\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{N}} = \sqrt{\dfrac{\sum x^2}{N} - (\bar{x})^2}$
Grouped data: $\sigma = \sqrt{\dfrac{\sum f x^2}{\sum f} - (\bar{x})^2}$

Variance, $\sigma^2$
Ungrouped data: $\sigma^2 = \dfrac{\sum (x - \bar{x})^2}{N} = \dfrac{\sum x^2}{N} - (\bar{x})^2$
Grouped data: $\sigma^2 = \dfrac{\sum f x^2}{\sum f} - (\bar{x})^2$
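
The two ungrouped forms of the variance (the definition using deviations from the mean, and the shortcut using the sum of squares) should always agree. A small sketch with the seven values from Example 7, question 1(a) shows this; the variable names are illustrative.

```python
import math

data = [4, 6, 9, 3, 5, 12, 10]
N = len(data)
mean = sum(data) / N

# Definition: average squared deviation from the mean
variance_definition = sum((x - mean) ** 2 for x in data) / N

# Shortcut: mean of the squares minus the square of the mean
variance_shortcut = sum(x * x for x in data) / N - mean ** 2

standard_deviation = math.sqrt(variance_definition)
print(mean, round(variance_definition, 4), round(variance_shortcut, 4),
      round(standard_deviation, 4))
```

Both forms divide by N, as in the formulae of this section, rather than by N − 1 as some calculators do in their sample mode.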

EXAMPLE 7

1. (a) Determine the variance and standard deviation of the set of data 4, 6, 9, 3, 5, 12, 10.

(b) The table shows the distribution of the scores obtained when a die is thrown 25
times. Determine the variance and standard deviation of the distribution.

Score Frequency

1 3

2 4

3 3

4 6

5 4

6 5


2. The variance of a set of 8 numbers $x_1, x_2, \ldots, x_8$ is 51.5. It is given that $\sum x^2 = 3004$. Find

(a) the mean, $\bar{x}$,

(b) the value of $\sum x$.

3. The heights of a group of 8 students have a mean of 160 cm and a standard deviation of 12
cm. Find
(a) the sum of the heights of the students,

(b) the sum of the squares of the heights of the students.

4. The sums $\sum x$ and $\sum x^2$ of ten values are 62 and 438 respectively. If two numbers, 5 and 8,
are taken away from the ten values, find the new mean and standard deviation.


1.5 Box-and-Whisker Plots


A box-and-whisker plot is drawn from the quartiles, namely
$Q_1$, the median ($Q_2$) and $Q_3$, together with the smallest and largest values. It is also useful for identifying outliers.

A data value $x$ is considered an outlier if it lies outside the following range:

$Q_1 - 1.5(\mathrm{IQR}) < x < Q_3 + 1.5(\mathrm{IQR})$
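
The outlier rule can be sketched as a pair of small helper functions. The names `outlier_fences` and `is_outlier` are hypothetical; the quartiles used are those given for Mathematics in Example 8, question 1, and the mark of 5 in the last line is an invented value used only to show a case that fails the test.

```python
def outlier_fences(q1, q3):
    """Return the lower and upper limits of the non-outlier range."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """A value is an outlier if it is not strictly inside the two limits."""
    low, high = outlier_fences(q1, q3)
    return not (low < x < high)

# Mathematics marks: Q1 = 45, Q3 = 70, minimum = 10, maximum = 90
low, high = outlier_fences(45, 70)
print(low, high)
print(is_outlier(10, 45, 70), is_outlier(90, 45, 70))
print(is_outlier(5, 45, 70))   # hypothetical mark below the lower limit
```

With these quartiles the limits are wide enough that neither the minimum nor the maximum Mathematics mark counts as an outlier, so the whiskers would extend to 10 and 90.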

1.6 Pearson Coefficient of Skewness and Data Distribution


Symmetrical distribution

• $Q_2 - Q_1 = Q_3 - Q_2$

Positively skewed distribution (to the right)

• $Q_2 - Q_1 < Q_3 - Q_2$

Negatively skewed distribution (to the left)

• $Q_2 - Q_1 > Q_3 - Q_2$

The shape of a distribution can also be determined without drawing a graph. This is done by
calculating the Pearson coefficient of skewness, given by
$\text{Pearson coefficient of skewness} = \dfrac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}} = \dfrac{3(\bar{x} - m)}{\sigma}$
A positive value denotes positive skewness and a negative value denotes negative skewness. For a distribution with only one mode,
the coefficient of skewness can also be calculated as
$\text{Pearson coefficient of skewness} = \dfrac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$
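
Both versions of the coefficient can be sketched with Python's standard `statistics` module. The data are the fifteen values listed in Example 8, question 5, and `pstdev` is used because the formulae in these notes divide by N (population form). The two versions generally give different numbers, but for a clearly skewed, single-mode data set they should agree in sign.

```python
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
sigma = statistics.pstdev(data)   # standard deviation dividing by N

skew_from_median = 3 * (mean - median) / sigma
skew_from_mode = (mean - mode) / sigma

print(round(skew_from_median, 4), round(skew_from_mode, 4))
```

Both values are negative here, which matches the shape of the data: most values are bunched at the high end with a tail towards the low end.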


EXAMPLE 8

1. The following table summarises the marks in the Mathematics and Biology tests
for the students in a class.

Subject       Minimum   Maximum   Median   First quartile   Third quartile
Mathematics   10        90        60       45               70
Biology       35        85        60       48               72

Draw two box-and-whisker plots for these data and comment on the distribution of
the marks for the two subjects.

2. The Autoclassic company has 48 used cars for sale. The table below shows the ages, 𝑥 (in
years), of the cars.

Age (𝑥) 1 2 3 4 5 6 7 8 9

Frequency 7 12 8 6 5 4 3 2 1

(a) Find the median, first and third quartiles for this distribution.

(b) Represent these data using a box-and-whisker plot.


3. The following stem-and-leaf plot shows the maximum temperature for each day from 1st
August to 23rd August in a town. Draw a box-whisker plot and use your plot to identify
the outliers.

Stem | Leaf
7 | 6 7
7 | 0 2 2 3
6 | 5 7 8 8 8 9 9
6 | 2 3 3 4 4 4 4 4
5 | 9
5 | 1

Key: 5 | 9 means 59°F


Quartile 1 Quartile 3

Median

Boundary of box-whisker plot;

Outlier :

4. Calculate the Pearson coefficient of skewness for the frequency distribution given below.

Number of children 1 2 3 4 5 6

Number of families 3 10 7 4 1 1


5. A set of data consists of the following numbers.

1 2 2 3 3

3 4 4 4 4

5 5 5 5 5

(a) Find the mean and standard deviation.

(b) Calculate the Pearson coefficient of skewness.
