
Chapter 2 - Frequency Distributions and Graphs (For Student)

Chapter 2 of SEHH1028 Elementary Statistics covers frequency distributions and graphs, detailing how to organize raw data into frequency distributions, both categorical and grouped. It explains the construction of frequency tables, the importance of class limits, boundaries, and widths, and provides examples for calculating these elements. Additionally, it includes exercises for practice on determining class widths and constructing frequency distributions.

Uploaded by

mqkdncdkt9

SEHH1028

ELEMENTARY STATISTICS
CHAPTER 2
Frequency Distributions and Graphs
Frequency Distributions and Graphs
2.1 Organizing Data
2.2 Histograms, Frequency Polygons and Ogives
2.3 Other Types of Graphs
2.4 Paired Data and Scatter Plots

SEHH1028 Elementary Statistics Page 2


2.1 Organizing Data
• Raw Data are data collected in original form.
• Frequency is the number of values in a specific class of
the distribution.
• Frequency Distribution organizes raw data into table
format.
– Grouped and ungrouped frequency distributions are used for
numeric data, such as height and weight.
– Categorical frequency distribution is used for data that can
be placed in specific categories, such as blood types and grades.



2.1 Organizing Data
Categorical Frequency Distribution
Categorical frequency distributions are often used to summarize
nominal or ordinal data.
Example: At a college financial aid office, students who applied for
scholarship were classified according to their class rank:
Fr = freshman (year 1) So = sophomore (year 2)
Jr = junior (year 3) Se = Senior (year 4)
Below are the class ranks of the 40 students applying for scholarships.
Fr Fr Fr Fr Fr Jr Fr Fr So Fr
Fr So Jr So Fr So Fr Fr Fr So
Se Jr Jr So Fr Fr Fr Fr Fr So
Se Se Jr Jr Se So So So So So



2.1 Organizing Data
Categorical Frequency Distribution
Constructing a categorical frequency distribution for the data on the previous slide:
1. Construct a table as shown below.
2. Tally the data and place the results in column B.
3. Count the tallies and place the results in column C.
4. Find the percentage of values in each class by using the formula

$$\% = \frac{f}{n} \times 100\%$$

where f is the frequency of the class and n is the total frequency.
5. Find the totals for columns C and D.

(A) Class   (B) Tally   (C) Frequency (f)   (D) Percent
Fr
So
Jr
Se



2.1 Organizing Data
Categorical Frequency Distribution
The completed categorical frequency distribution
(A) Class   (B) Tally   (C) Frequency (f)   (D) Percent (%)
Fr                      18                  45
So                      12                  30
Jr                      6                   15
Se                      4                   10
Total                   40                  100
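The tallying and percentage steps can be sketched in Python. This is a minimal illustration, not part of the course material: the variable names are assumptions, and the data are the 40 class ranks listed in the example.

```python
from collections import Counter

# Class ranks of the 40 scholarship applicants, copied from the example.
raw = """
Fr Fr Fr Fr Fr Jr Fr Fr So Fr
Fr So Jr So Fr So Fr Fr Fr So
Se Jr Jr So Fr Fr Fr Fr Fr So
Se Se Jr Jr Se So So So So So
""".split()

counts = Counter(raw)        # tally: frequency of each class
n = sum(counts.values())     # total frequency

# Percent for each class = frequency * 100 / total frequency
table = {rank: (counts[rank], counts[rank] * 100 / n)
         for rank in ["Fr", "So", "Jr", "Se"]}

for rank, (f, pct) in table.items():
    print(f"{rank:<6}{f:>4}{pct:>6.0f}")
print(f"{'Total':<6}{n:>4}{sum(p for _, p in table.values()):>6.0f}")
```

The frequencies and percentages it prints agree with columns C and D of the completed table.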



2.1 Organizing Data
Grouped Frequency Distribution
Grouped frequency distributions are often used to
summarize numeric data.
Example: Distribution of the battery life of 25 batteries in hours
(The raw data are integers)
Class limits   Class boundaries   Class midpoint   Tally   Frequency   Cumulative frequency
24-30          23.5-30.5          27                       3           3
31-37          30.5-37.5          34                       1           4
38-44          37.5-44.5          41                       5           9
45-51          44.5-51.5          48                       9           18
52-58          51.5-58.5          55                       6           24
59-65          58.5-65.5          62                       1           25



2.1 Organizing Data
Grouped Frequency Distribution
• Each raw data value is placed into a category called a
class.
• Lower class limit represents the smallest data value
that can be included in the class (same decimal place
value as the data).
• Upper class limit represents the largest data value that
can be included in the class (same decimal place value
as the data).



2.1 Organizing Data
Grouped Frequency Distribution
• Class boundaries are used to separate the classes so
that there are no gaps in the frequency distribution
(average of the upper class limit of a class and the lower
class limit of the next class).
• Class width for a class in a frequency distribution is
found by subtracting the lower (or upper) class limit of
one class from the lower (or upper) class limit of the
next class.
• Class midpoint can be found by dividing the sum of the
lower and upper boundaries (or limits) by 2.
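As a rough sketch, the three definitions can be checked against the battery life example (classes 24-30, 31-37, and so on). The variable names are assumptions, and extending the first and last class by the same half-gap is also an assumption, since the definition only covers boundaries between two adjacent classes.

```python
# Class limits from the battery life example (integer data).
limits = [(24, 30), (31, 37), (38, 44), (45, 51), (52, 58), (59, 65)]

# Gap between the upper limit of one class and the lower limit of the next.
# The boundary sits halfway across this gap, e.g. (30 + 31) / 2 = 30.5.
gap = limits[1][0] - limits[0][1]

# Assumption: the outermost boundaries are extended by the same half-gap.
boundaries = [(lo - gap / 2, hi + gap / 2) for lo, hi in limits]

# Class width: lower limit of the next class minus lower limit of this class.
width = limits[1][0] - limits[0][0]

# Class midpoint: sum of the lower and upper limits divided by 2.
midpoints = [(lo + hi) / 2 for lo, hi in limits]

print(boundaries)
print(width)
print(midpoints)
```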



2.1 Organizing Data
Grouped Frequency Distribution
• Frequency of a class is the number of data values
contained in the class.
• Cumulative frequency of a class is the number of data
values in the entire data set that are less than or equal to
the upper class limit of the class.
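A cumulative frequency is a running total of the class frequencies. A minimal sketch, using the frequencies from the battery life example (the variable names are assumptions):

```python
from itertools import accumulate

# Class frequencies from the battery life example, in class order.
frequencies = [3, 1, 5, 9, 6, 1]

# Running total: each entry counts all values up to that class's upper limit.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [3, 4, 9, 18, 24, 25]
```

The last cumulative frequency always equals the total number of data values, here 25.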



2.1 Organizing Data
Grouped Frequency Distribution
Rules for constructing grouped frequency distribution
1. There should be between 5 and 20 classes.
2. The classes must be mutually exclusive.
(i.e. no overlapping classes)
3. The classes must be continuous.
(i.e. no gaps between classes)
4. The classes must be exhaustive.
(i.e. cover all data values in the data set)
5. The classes must be equal in width.



2.1 Organizing Data
Grouped Frequency Distribution
Class width of a grouped frequency distribution
• The number of classes is usually decided by the researcher, following
rule number 1 on the previous page.
• Once the number of classes has been chosen, the class width can
be computed by the following formula:

$$\text{class width} = \frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}}$$

• The class width should be rounded up to the next higher value with the
number of decimal places equal to the maximum number of decimal
places observed in the data. A quotient that is already exact at that
precision is still moved up to the next value (for example, 4 becomes 5
for integer data).
• The rounding rule ensures that the number of classes does not
exceed the one chosen by the researcher.
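The rounding rule can be sketched in Python with the decimal module, which avoids binary floating-point surprises such as 1.7 / 5. The helper name class_width is hypothetical, and passing the data as strings is an assumption made so that trailing zeros (as in "1.0") keep their decimal places.

```python
from decimal import Decimal, ROUND_FLOOR

def class_width(values, num_classes):
    """Class width rounded up to the next higher value (hypothetical helper)."""
    data = [Decimal(v) for v in values]
    # Maximum number of decimal places observed in the data.
    places = max(0, max(-d.as_tuple().exponent for d in data))
    unit = Decimal(1).scaleb(-places)          # 1, 0.1, 0.01, ...
    raw = (max(data) - min(data)) / num_classes
    # Round down to the unit, then step up by one unit, so that an exact
    # quotient such as 0.4 still moves to the next higher value (0.5).
    return raw.quantize(unit, rounding=ROUND_FLOOR) + unit

# Integer data with range 29 - 12 = 17 and five classes.
example2 = "12 13 13 14 15 17 17 18 18 19 20 21 21 22 23 24 26 27 28 29".split()
# One-decimal data with range 3.0 - 1.0 = 2.0 and five classes.
example5 = "1.0 1.2 1.3 1.4 1.5 1.7 1.7 1.8 1.8 1.9 2.0 2.1 2.1 2.2 2.3 2.4 2.6 2.7 2.8 3.0".split()

print(class_width(example2, 5))  # 4
print(class_width(example5, 5))  # 0.5
```

Rounding down and then adding one unit is a design choice that matches the stated rule in both cases: 3.4 goes to 4, and an exact 0.4 still goes to 0.5.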
Example 1
In a survey of 20 patients who smoked, the following data
were obtained. Each value represents the number of
cigarettes the patient smoked per day.
10 8 6 14 22 13 17 19 11 9
18 14 13 12 15 15 5 11 16 11

Construct a grouped frequency distribution for the data
using six classes.



Example 1 – Solution
1. Calculate class width
i. Number of classes = 6
ii. Largest data value (L) = 22
iii. Smallest data value (S) = 5
iv. Range of the data set (R) = L – S = 22 – 5 = 17
v. Class width = 17 / 6 = 2.83, round up to 3
2. Tally the data
3. Find the numerical frequencies from the tallies
4. Find the cumulative frequencies
For more details on class width calculation, refer to Examples 1 to 6.



Example 1 – Solution (continued)
The completed grouped frequency distribution
Class limits   Class boundaries   Class midpoint   Tally   Frequency   Cumulative frequency
5 – 7          4.5 – 7.5          6                        2           2
8 – 10         7.5 – 10.5         9                        3           5
11 – 13        10.5 – 13.5        12                       6           11
14 – 16        13.5 – 16.5        15                       5           16
17 – 19        16.5 – 19.5        18                       3           19
20 – 22        19.5 – 22.5        21                       1           20



Example 2
Data values in the following data set are all integers.
12 13 13 14 15 17 17 18 18 19
20 21 21 22 23 24 26 27 28 29

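If the number of classes is 5, then

$$\text{class width} = \frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}} = \frac{29 - 12}{5} = 3.4, \text{ round up to } 4$$

The class width should be rounded up to 4 (0 decimal places)
because the maximum number of decimal places observed in
the data is 0.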
Example 3
Data values in the following data set are all integers.
10 12 13 14 15 17 17 18 18 19
20 21 21 22 23 24 26 27 28 30

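If the number of classes is 5, then

$$\text{class width} = \frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}} = \frac{30 - 10}{5} = 4.00, \text{ round up to } 5$$

The class width should be rounded up to the next integer, 5 (0
decimal places), because the maximum number of decimal
places observed in the data is 0.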
Example 4
Data values in the following data set have 1 decimal place.
1.2 1.3 1.3 1.4 1.5 1.7 1.7 1.8 1.8 1.9
2.0 2.1 2.1 2.2 2.3 2.4 2.6 2.7 2.8 2.9

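If the number of classes is 5, then

$$\text{class width} = \frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}} = \frac{2.9 - 1.2}{5} = 0.34, \text{ round up to } 0.4$$

The class width should be rounded up to 0.4 (1 decimal place)
because the maximum number of decimal places observed in
the data is 1.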
Example 5
Data values in the following data set have 1 decimal place.
1.0 1.2 1.3 1.4 1.5 1.7 1.7 1.8 1.8 1.9
2.0 2.1 2.1 2.2 2.3 2.4 2.6 2.7 2.8 3.0

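If the number of classes is 5, then

$$\text{class width} = \frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}} = \frac{3.0 - 1.0}{5} = 0.40, \text{ round up to } 0.5$$

The class width should be rounded up to the next higher value,
0.5 (1 decimal place), because the maximum number of
decimal places observed in the data is 1.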
Exercise 1
Data values in the following data set have 2 decimal places.
0.12 0.13 0.13 0.14 0.15 0.17 0.17 0.18 0.18 0.19
0.20 0.21 0.21 0.22 0.23 0.24 0.26 0.27 0.28 0.29

Compute the class width if the number of classes is 5.



Exercise 2
Data values in the following data set have 2 decimal places.
0.10 0.13 0.13 0.14 0.15 0.17 0.17 0.18 0.18 0.19
0.20 0.21 0.21 0.22 0.23 0.24 0.26 0.27 0.28 0.30

Compute the class width if the number of classes is 5.



Example 6: Determine the Number of
Decimal Places in a Data Set
If the data values in a data set have different numbers of
decimal places, we use the maximum number of
decimal places found in the data.
For example, the maximum number of decimal places
found in the following data set is 2.
1.0 1.2 1.3 1.4 1.53 1.7 1.7 1.8 1.8 1.9
2.0 2.1 2.14 2.2 2.3 2.4 2.6 2.7 2.8 3.0
Suppose you want to construct a frequency distribution for the
data using six classes. Calculate the class width.

$$\text{class width} = \frac{3.0 - 1.0}{6} = 0.333, \text{ round up to } 0.34$$



2.1 Organizing Data
Ungrouped Frequency Distribution
Ungrouped frequency distributions are often used for
numerical data when the range of data is small.

In an ungrouped frequency distribution, each class
corresponds to exactly one data value.



Example 7
The following data set shows the mean temperature of
each day in June measured at an observatory.

25 23 23 24 26 27 25 24 22 24
25 27 27 26 26 25 24 24 22 23
25 25 23 23 22 22 22 23 23 22

Construct an ungrouped frequency distribution for the
data set.



Example 7 - Solution
Class limit   Class boundaries   Tally   Frequency   Cumulative frequency
22            21.5 – 22.5                6           6
23            22.5 – 23.5                7           13
24            23.5 – 24.5                5           18
25            24.5 – 25.5                6           24
26            25.5 – 26.5                3           27
27            26.5 – 27.5                3           30

* The class limit is a single value instead of a range.
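An ungrouped frequency distribution can be sketched by giving every distinct value its own class. The snippet below is only an illustration with assumed variable names; the boundaries of value ± 0.5 rely on the data being integers.

```python
from collections import Counter
from itertools import accumulate

# Mean daily temperatures in June, copied from Example 7.
temps = [25, 23, 23, 24, 26, 27, 25, 24, 22, 24,
         25, 27, 27, 26, 26, 25, 24, 24, 22, 23,
         25, 25, 23, 23, 22, 22, 22, 23, 23, 22]

counts = Counter(temps)
values = sorted(counts)                  # one class per distinct value
freqs = [counts[v] for v in values]
cum = list(accumulate(freqs))            # cumulative frequencies

for v, f, c in zip(values, freqs, cum):
    # Integer data: each boundary lies 0.5 below or above the single class value.
    print(f"{v}   {v - 0.5} – {v + 0.5}   {f}   {c}")
```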



2.1 Organizing Data - Summary
Procedure for constructing a grouped frequency distribution:
1. Determine the classes
i. Find the highest and lowest value and the range
ii. Select the number of classes desired
iii. Find the width by dividing the range by the number of classes and
rounding up
iv. Use the lowest data value as the starting point; add the width to get the
lower limits of the other classes
v. Find the upper class limits and the boundaries of the other classes
2. Tally the data
3. Find the numerical frequency from the tallies
4. Find the cumulative frequency
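The whole procedure can be sketched end to end for the cigarette data of Example 1 with six classes. This is one possible implementation under stated assumptions: the data are integers (so widths and limits move in steps of 1), and all variable names are illustrative.

```python
from itertools import accumulate

# Number of cigarettes smoked per day by 20 patients (Example 1).
data = [10, 8, 6, 14, 22, 13, 17, 19, 11, 9,
        18, 14, 13, 12, 15, 15, 5, 11, 16, 11]
num_classes = 6

# Step 1: determine the classes.
low, high = min(data), max(data)
# Range / number of classes, rounded up to the next higher integer.
width = (high - low) // num_classes + 1
lower_limits = [low + i * width for i in range(num_classes)]
limits = [(lo, lo + width - 1) for lo in lower_limits]
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in limits]

# Steps 2 and 3: tally the data and count the frequency of each class.
freqs = [sum(lo <= x <= hi for x in data) for lo, hi in limits]

# Step 4: cumulative frequencies.
cum = list(accumulate(freqs))

for (lo, hi), (bl, bu), f, c in zip(limits, boundaries, freqs, cum):
    print(f"{lo} – {hi}   {bl} – {bu}   {f}   {c}")
```

The limits, frequencies, and cumulative frequencies it prints agree with the completed table in the Example 1 solution.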



2.2 Histogram and Ogives
The histogram is a graph that displays the data by using
vertical bars of various heights to represent the
frequencies.

The ogive (cumulative frequency polygon) is a graph that
represents the cumulative frequencies for the classes in a
frequency distribution. The cumulative frequency is
plotted at each upper class boundary.



Example 8 (Histogram)
In a survey of 20 patients who smoked, the following data
were obtained. Each value represents the number of
cigarettes the patient smoked per day.
10 8 6 14 22 13 17 19 11 9
18 14 13 12 15 15 5 11 16 11

Construct a histogram for the data set.



Example 8 (Histogram)
Grouped frequency distribution of the data set
Class limits   Class boundaries   Frequency   Cumulative frequency
5 – 7          4.5 – 7.5          2           2
8 – 10         7.5 – 10.5         3           5
11 – 13        10.5 – 13.5        6           11
14 – 16        13.5 – 16.5        5           16
17 – 19        16.5 – 19.5        3           19
20 – 22        19.5 – 22.5        1           20



Example 8 (Histogram)
[Figure: histogram titled "The number of cigarettes smoked per day by a sample of 20 patients", drawn from the grouped frequency distribution on the previous slide.]
x-axis: Class boundaries (ticks at 1.5, 4.5, 7.5, 10.5, 13.5, 16.5, 19.5, 22.5), labelled "Number of cigarettes"
y-axis: Frequency (0 to 6)
There should be no gaps between bars.



Exercise 3 (Histogram)
Use the given histogram to answer the following questions.

[Figure: histogram titled "The number of cigarettes smoked per day by a sample of 20 patients". x-axis: Number of cigarettes (class boundaries from 1.5 to 22.5); y-axis: Frequency.]

• Find the lower class boundary, the upper class boundary, and the midpoint for the class with the highest frequency.

• How many patients in the sample smoked more than 13.5 cigarettes per day?

• What percentage of the patients in the sample smoked fewer than 16.5
cigarettes per day? Round your answer to the nearest integer percentage.



Example 9
(Ogive/Cumulative Frequency Polygon)
Construct a cumulative frequency polygon (ogive) for the data set
in Example 8.
To construct a cumulative frequency polygon (ogive), we need the class
boundaries and cumulative frequencies from the grouped
frequency distribution.
Class limits   Class boundaries   Class midpoint   Frequency   Cumulative frequency
5 – 7          4.5 – 7.5          6                2           2
8 – 10         7.5 – 10.5         9                3           5
11 – 13        10.5 – 13.5        12               6           11
14 – 16        13.5 – 16.5        15               5           16
17 – 19        16.5 – 19.5        18               3           19
20 – 22        19.5 – 22.5        21               1           20



Example 9 - Solution
[Figure: ogive titled "The number of cigarettes smoked per day by a sample of 20 patients", drawn from the grouped frequency distribution on the previous slide.]
x-axis: Upper class boundary (ticks at 1.5, 4.5, 7.5, 10.5, 13.5, 16.5, 19.5, 22.5), labelled "Number of cigarettes"
y-axis: Cumulative frequency (0 to 25)
An ogive is anchored on the x-axis at the lower class boundary of the first class.
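The points an ogive passes through can be listed with a short sketch. The variable names are assumptions; the boundaries and frequencies come from the grouped frequency distribution of the cigarette data.

```python
from itertools import accumulate

# Class boundaries (first lower boundary, then every upper boundary).
boundaries = [4.5, 7.5, 10.5, 13.5, 16.5, 19.5, 22.5]
freqs = [2, 3, 6, 5, 3, 1]

# Anchor at the lower boundary of the first class with cumulative frequency 0,
# then place each cumulative frequency at its upper class boundary.
points = list(zip(boundaries, [0, *accumulate(freqs)]))
print(points)
```

Joining these points with straight line segments, in order, gives the ogive.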



Exercise 4 (Ogive)
Use the given ogive to answer the following questions.

[Figure: ogive titled "The number of cigarettes smoked per day by a sample of 20 patients". x-axis: Number of cigarettes (class boundaries from 1.5 to 22.5); y-axis: Cumulative frequency (0 to 25).]

• How many patients in the sample smoked fewer than 10.5 cigarettes per day?

• How many patients in the sample smoked more than 16.5 cigarettes per day?

• What percentage of the patients in the sample smoked more than 10.5
cigarettes per day? Round your answer to the nearest integer percentage.



2.2 Histogram & Ogives
• Key points
– Give your graph a proper title about the data to be presented
– X-axis label should show the unit of the data
– Y-axis label should either be "frequency" or "cumulative frequency"
– Use the correct information from the grouped frequency distribution to
construct your graphs

Graph type                             x-axis                 y-axis
Histogram                              class boundaries       frequency
Cumulative frequency polygon (ogive)   upper class boundary   cumulative frequency

– Pay attention to the anchoring points in ogive



2.3 Other Types of Graphs
• Graphs introduced in section 2.2 are useful for
summarizing a single quantitative variable
• In this section, we will look at graphs for other
situations
– Summarizing a single categorical variable
• Bar graph, pie graph, Pareto chart
– Change of a quantitative variable over time
• Time series graph
– A pair of quantitative variables
• Scatter plot



2.3 Other Types of Graphs
- Bar Graph & Pareto Chart
• A bar graph represents the data by using vertical or
horizontal bars whose heights or lengths represent the
frequencies of the data. The bars are not necessarily
ordered by the frequencies of the categories.
• A Pareto chart is used to represent a frequency
distribution for a categorical variable, and the
frequencies are displayed by the heights of vertical bars
or the lengths of horizontal bars. The bars are arranged
in order from highest to lowest (or vice versa).
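The only difference between the two graphs is the ordering of the bars, which a short sketch can make concrete. It reuses the class rank frequencies from section 2.1, listed here in an arbitrary order; the text bars are just a stand-in for a real plot.

```python
# Class rank frequencies from the scholarship example, in arbitrary order.
freq = {"Se": 4, "Fr": 18, "Jr": 6, "So": 12}

# A Pareto chart arranges the bars from highest to lowest frequency.
pareto = sorted(freq.items(), key=lambda item: item[1], reverse=True)

for category, f in pareto:
    print(f"{category:<3}{'#' * f} {f}")
```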



2.3 Other Types of Graphs
- Bar Graph & Pareto Chart
• A Pareto chart that
summarizes the number of
IT workers in Hong Kong in
different job areas

• This is based on a survey conducted by the Census and
Statistics Department in 2012. The full report can
be found HERE.



2.3 Other Types of Graphs
- Bar Graph & Pareto Chart
• The Pareto chart on the right
summarizes the number of
chickens (in billions) in various
countries.
(1 billion = 1,000,000,000)

• You can read the complete article
on The Economist HERE.



2.3 Other Types of Graphs – Pie Graph
• A pie graph is a circular graph for summarizing
categorical variables. Each category corresponds to a
sector in the graph. The sizes of the sectors are
proportional to the frequencies of the categories.
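Because a full circle is 360 degrees, proportionality means each sector's angle is the category's share of the total frequency times 360. A minimal sketch, reusing the class rank frequencies from section 2.1 (the variable names are assumptions):

```python
# Class rank frequencies from the scholarship example.
freq = {"Fr": 18, "So": 12, "Jr": 6, "Se": 4}
n = sum(freq.values())

# Sector angle in degrees = frequency * 360 / total frequency.
angles = {category: f * 360 / n for category, f in freq.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")
```

The angles add up to 360, just as the percentages in a frequency table add up to 100.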



2.3 Other Types of Graphs – Pie Graph
• Distribution of minorities in
Hong Kong

• This is based on a survey conducted by the Census and
Statistics Department. The
details can be found HERE.



2.3 Other Types of Graphs
– Time Series Graph
• A time series graph represents data that occur over a
specific period of time.

The plot shows the support rate of 6 candidates in an election.


https://www.hkupop.hku.hk/english/resources/lc2016/rolling/graph/hki_eng.png



2.3 Other Types of Graphs
– Time Series Graph
Average temperature in Hong Kong over time

This time series graph is produced by Hong Kong Observatory for illustrating the
climate change in Hong Kong. See more information HERE.
2.4 Paired Data and Scatter Plot
• A scatter plot is a graph of ordered pairs of data values
that is used to determine if a relationship exists
between the two variables.
• Relationships that can be determined using scatter plot
– Positive linear relationship
– Negative linear relationship
– Non-linear relationship
– No relationship



2.4 Paired Data and Scatter Plot
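• Positive linear relationship: data points fall roughly on a straight line with positive slope.
• Negative linear relationship: data points fall roughly on a straight line with negative slope.
• Non-linear relationship: data points fall roughly on a curve.
• No relationship: the data points do not exhibit any obvious pattern.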



2.4 Paired Data and Scatter Plot
This is a scatter plot of
gender inequality index
versus democracy index.

You can see a roughly negative linear relationship
between the two indices.

Further details about the plot can be found on The
Economist HERE.



2.4 Paired Data and Scatter Plot
This is a scatter plot of miles travelled per
driver versus obesity rate.
Note: Although each point corresponds to a
year, it is not really a time series plot!

There is a clear positive linear relationship
between the two variables.

Further details about this plot can be
found on The Economist HERE.



Useful Resources
• Census and Statistics Department
– Link to Survey Reports
– Link to 2006 By-Census
– Link to 2011 Census
– Link to 2016 By-Census
• Public Opinion Programme, Hong Kong University (HKUPOP)
– Link to their website
• The Economist
– Graphic Detail - Charts, Maps and Infographics



Chapter Summary
• Organizing data
– Categorical frequency distribution
– Grouped frequency distribution
– Ungrouped frequency distribution
• Construction of frequency distribution
– Rules for constructing frequency distribution
– How to calculate class width?
– What are class limits, class boundaries, class midpoint, frequency and
cumulative frequency?
• Graphs derived from grouped frequency distribution
– Histogram and cumulative frequency polygon (Ogive)



Chapter Summary
• Other types of graphs
– Pie graph, bar graph, Pareto chart
– Time series graph
– Scatter plot

