Lecture Notes 2

This document discusses frequency distributions and graphs. It defines categorical and grouped frequency distributions. A categorical distribution organizes nominal or ordinal data into categories, while a grouped distribution places continuous data into classes of ranges. Examples show how to construct frequency distributions and represent the data in histograms, frequency polygons, and ogives. Frequency distributions organize raw data, and graphs provide a visual representation of the organized frequency data.

Uploaded by

mi5180907
Copyright
© © All Rights Reserved
Lecture 2

2- Frequency Distributions and Graphs

2-1 Organizing Data

A frequency distribution is the organization of raw data in table form, using classes and frequencies.

Two types of frequency distributions that are most often used are the categorical frequency distribution and the grouped frequency distribution.

[Diagram: frequency distributions, branching into categorical frequency distribution, grouped frequency distribution, and ungrouped frequency distribution]

Categorical Frequency Distributions

The categorical frequency distribution is used for data that can be placed in specific categories, such as nominal-
or ordinal-level data. For example, data such as political affiliation, religious affiliation, or major field of study
would use categorical frequency distributions.

Example 2–1: Twenty-five army inductees were given a blood test to determine their blood type. The data set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the data.

Solution
Since the data are categorical, discrete classes can be used. There are four blood types: A, B, O, and AB. These
types will be used as the classes for the distribution.
The frequency distribution for the data is:

A       B           C           D
Class   Tally       Frequency   Percent
A       ////        5           20
B       //// //     7           28
O       //// ////   9           36
AB      ////        4           16
Total               25          100

For the sample, more people have type O blood than any other type.
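
The tally in Example 2–1 can be reproduced with a short script. This is a minimal sketch, not part of the original notes: the function name categorical_distribution is an illustrative assumption, and the percent is computed as 100 · f / n.

```python
from collections import Counter

# Blood types of the 25 inductees from Example 2-1, row by row.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()


def categorical_distribution(values, classes):
    """Return {class: (frequency, percent)} for the listed classes.

    Hypothetical helper for illustration; percent = 100 * f / n.
    """
    counts = Counter(values)
    n = len(values)
    return {c: (counts[c], 100 * counts[c] / n) for c in classes}


table = categorical_distribution(blood_types, ["A", "B", "O", "AB"])
for cls, (freq, pct) in table.items():
    print(f"{cls:<6}{freq:>3}{pct:>6.0f}")
```

Listing the classes explicitly keeps the rows in the same order as the table above and would also show a class with frequency 0 if one occurred.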

PHM111s - Probability and Statistics


Grouped Frequency Distributions

When the range of the data is large, the data must be grouped into classes that are more than one unit in width, in
what is called a grouped frequency distribution.

Example 2–2: These data represent the record high temperatures in degrees Fahrenheit (°F) for each of the 50
states. Construct a grouped frequency distribution for the data using 7 classes.

112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114

Solution
The grouped frequency distribution for the data is:

Class      Class
limits     boundaries     Tally                  Frequency
100–104    99.5–104.5     //                     2
105–109    104.5–109.5    //// ///               8
110–114    109.5–114.5    //// //// //// ///     18
115–119    114.5–119.5    //// //// ///          13
120–124    119.5–124.5    //// //                7
125–129    124.5–129.5    /                      1
130–134    129.5–134.5    /                      1

n = Σf = 50

The frequency distribution shows that the class 109.5–114.5 contains the largest number of temperatures (18)
followed by the class 114.5–119.5 with 13 temperatures. Hence, most of the temperatures (31) fall between 109.5
and 119.5°F.
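
The class limits, class boundaries, and frequencies of Example 2–2 can be checked with the sketch below. The notes do not state how the class width was chosen; the code assumes the common rule of dividing the range by the number of classes and rounding up, (134 − 100) / 7 ≈ 4.86 → 5, and it assumes whole-number data so that the boundaries sit 0.5 below and above the limits. The function name grouped_distribution is illustrative.

```python
import math

# Record high temperatures (degrees F) for the 50 states, Example 2-2.
temps = [
    112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
    110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
    107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
    116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
    120, 113, 120, 117, 105, 110, 118, 112, 114, 114,
]


def grouped_distribution(data, num_classes):
    """Return rows of (class limits, class boundaries, frequency).

    Assumptions (not stated in the notes): integer data, the first
    class starts at the minimum value, and the class width is the
    range divided by the number of classes, rounded up.
    """
    low, high = min(data), max(data)
    width = math.ceil((high - low) / num_classes)
    rows = []
    for i in range(num_classes):
        lower = low + i * width
        upper = lower + width - 1
        freq = sum(lower <= x <= upper for x in data)
        rows.append(((lower, upper), (lower - 0.5, upper + 0.5), freq))
    return rows


rows = grouped_distribution(temps, 7)
for (lo, hi), (blo, bhi), freq in rows:
    print(f"{lo}-{hi}   {blo}-{bhi}   {freq}")
print("n =", sum(freq for _, _, freq in rows))
```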

Sometimes it is necessary to use a cumulative frequency distribution. A cumulative frequency distribution is a
distribution that shows the number of data values less than or equal to a specific value (usually an upper
boundary). The values are found by adding the frequencies of the classes less than or equal to the upper class
boundary of a specific class. This gives an ascending cumulative frequency.
The cumulative frequency distribution for the data in Example 2–2 is as follows:

                   Cumulative frequency
Less than 99.5     0
Less than 104.5    2
Less than 109.5    10
Less than 114.5    28
Less than 119.5    41
Less than 124.5    48
Less than 129.5    49
Less than 134.5    50

Cumulative frequencies are used to show how many data values are accumulated up to and including a specific
class. In Example 2–2, 28 of the total record high temperatures are less than or equal to 114°F. Forty-eight of the
total record high temperatures are less than or equal to 124°F.
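
The running totals can be computed directly from the class frequencies. The sketch below uses itertools.accumulate from the Python standard library; the variable names are illustrative.

```python
from itertools import accumulate

# Upper class boundaries and frequencies from Example 2-2.
upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]
lowest_boundary = 99.5  # no data values lie below the first lower boundary

# Each cumulative frequency is the sum of all class frequencies
# up to and including the class with that upper boundary.
cumulative = [(lowest_boundary, 0)] + list(
    zip(upper_boundaries, accumulate(frequencies))
)

for boundary, total in cumulative:
    print(f"Less than {boundary:<6} {total}")
```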
After organizing the data into a frequency distribution, they can be presented in graphical form.

The three most commonly used graphs in research are:

1. The histogram is a graph that displays the data by using contiguous
vertical bars (unless the frequency of a class is 0) of various heights to
represent the frequencies of the classes.

2. The frequency polygon is a graph that displays the data by using lines
that connect points plotted for the frequencies at the midpoints of the
classes. The frequencies are represented by the heights of the points.

3. The ogive is a graph that represents the cumulative frequencies for the
classes in a frequency distribution.

Example 2–3: Construct a histogram to represent the data shown for the record high temperatures for each of the
50 states in Example 2–2.

Solution
Class boundaries Frequency

99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1
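
As a rough stand-in for the graph, the sketch below draws the histogram as text, one bar per class, with the bar length equal to the class frequency. This is an illustration only: the bars are drawn sideways rather than vertically, and the function name text_histogram is an assumption.

```python
# Class boundaries and frequencies from Example 2-3.
boundaries = [
    (99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
    (119.5, 124.5), (124.5, 129.5), (129.5, 134.5),
]
frequencies = [2, 8, 18, 13, 7, 1, 1]


def text_histogram(boundaries, frequencies, symbol="#"):
    """Return one text line per class; bar length equals the frequency."""
    lines = []
    for (lo, hi), freq in zip(boundaries, frequencies):
        lines.append(f"{lo:>5}-{hi:<5} | {symbol * freq} {freq}")
    return lines


histogram_lines = text_histogram(boundaries, frequencies)
print("\n".join(histogram_lines))
```

Because each class ends at the boundary where the next begins, the bars of a drawn histogram touch each other; the text version shows the same heights as lengths.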



Example 2–4: Using the frequency distribution given in Example 2–3, construct a frequency polygon.

Solution
Find the midpoints of each class. Recall that midpoints are found by adding the upper and lower boundaries and
dividing by 2:

(99.5 + 104.5) / 2 = 102        (104.5 + 109.5) / 2 = 107

and so on. The midpoints are

Class boundaries Midpoints Frequency

99.5–104.5 102 2
104.5–109.5 107 8
109.5–114.5 112 18
114.5–119.5 117 13
119.5–124.5 122 7
124.5–129.5 127 1
129.5–134.5 132 1
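
The points of the frequency polygon can be generated from the class boundaries. The sketch below computes each midpoint as (lower boundary + upper boundary) / 2 and pairs it with the class frequency. It also adds a zero-frequency point one class width before the first midpoint and one after the last, a common convention so that the polygon starts and ends on the horizontal axis; the notes do not state this convention, so treat it as an assumption.

```python
# Class boundaries and frequencies from Example 2-3.
boundaries = [
    (99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
    (119.5, 124.5), (124.5, 129.5), (129.5, 134.5),
]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Midpoint of a class = (lower boundary + upper boundary) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in boundaries]

# Assumed convention: close the polygon on the x axis with a
# zero-frequency point one class width outside each end.
width = boundaries[0][1] - boundaries[0][0]
polygon_points = (
    [(midpoints[0] - width, 0)]
    + list(zip(midpoints, frequencies))
    + [(midpoints[-1] + width, 0)]
)

for x, y in polygon_points:
    print(x, y)
```

Connecting consecutive points with straight line segments gives the frequency polygon.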

Example 2–5: Construct an ogive for the frequency distribution described in Example 2–3.

Solution
Find the cumulative frequency for each class.

                   Cumulative frequency
Less than 99.5     0
Less than 104.5    2
Less than 109.5    10
Less than 114.5    28
Less than 119.5    41
Less than 124.5    48
Less than 129.5    49
Less than 134.5    50
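
An ogive plots each class boundary against its cumulative frequency and joins the points with straight lines. The sketch below stores those points and reads a value off the graph by linear interpolation between neighboring points, which is what the line segments of the ogive represent. The function name read_ogive is illustrative, and values read between boundaries are estimates based on the grouped table, not exact counts from the raw data.

```python
# (class boundary, cumulative frequency) pairs from Example 2-5.
ogive_points = [
    (99.5, 0), (104.5, 2), (109.5, 10), (114.5, 28),
    (119.5, 41), (124.5, 48), (129.5, 49), (134.5, 50),
]


def read_ogive(points, x):
    """Estimate the cumulative frequency at x along the ogive.

    Hypothetical helper: interpolates linearly between the two
    plotted points on either side of x.
    """
    if x <= points[0][0]:
        return 0
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


print(read_ogive(ogive_points, 114.5))  # at a plotted boundary
print(read_ogive(ogive_points, 112))    # estimate between boundaries
```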



2-3 Other Types of Graphs
1. A bar graph represents the data by using vertical or horizontal bars whose heights or lengths represent
the frequencies of the data.
2. A Pareto chart is used to represent a frequency distribution for a categorical variable, and the
frequencies are displayed by the heights of vertical bars, which are arranged in order from highest to
lowest.
3. A time series graph represents data that occur over a specific period of time.
4. A pie graph is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution.
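
Two of these graphs need a small calculation before they can be drawn. The sketch below reuses the blood type frequencies from Example 2–1 as sample data: for a Pareto chart the categories are sorted from highest to lowest frequency, and for a pie graph each category's share is converted to a percentage (100 · f / n) and to a central angle in degrees (360 · f / n). The variable names are illustrative.

```python
# Blood type frequencies from Example 2-1, used here as sample data.
frequencies = {"A": 5, "B": 7, "O": 9, "AB": 4}
n = sum(frequencies.values())

# Pareto chart: bars arranged from highest to lowest frequency.
pareto_order = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Pie graph: each wedge is sized by the category's share of the total.
pie_wedges = {
    category: (100 * f / n, 360 * f / n)  # (percent, degrees)
    for category, f in frequencies.items()
}

print("Pareto order:", pareto_order)
for category, (percent, degrees) in pie_wedges.items():
    print(f"{category:<3} {percent:5.1f}%  {degrees:6.1f} degrees")
```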



Stem and Leaf Plots
A stem and leaf plot is a data plot that uses part of the data value as the stem and part of the data value as the leaf
to form groups or classes.
Example 2–6: At an outpatient testing center, the number of cardiograms performed each day for 20 days is
shown. Construct a stem and leaf plot for the data.
25 31 20 32 13
14 43 02 57 23
36 32 33 32 44
32 52 44 51 45

Solution

1- Arrange the data in order:
02, 13, 14, 20, 23, 25, 31, 32, 32, 32, 32, 33, 36, 43, 44, 44, 45, 51, 52, 57
2- Separate the data according to the first digit, as shown.
02
13, 14
20, 23, 25
31, 32, 32, 32, 32, 33, 36
43, 44, 44, 45
51, 52, 57
3- A display can be made by using the leading digit as the stem and the trailing digit as the leaf. For example,
for the value 32, the leading digit, 3, is the stem and the trailing digit, 2, is the leaf. For the value 14, the 1
is the stem and the 4 is the leaf. Now a plot can be constructed as shown in the figure:

Leading digit (stem) | Trailing digit (leaf)
0 | 2
1 | 3 4
2 | 0 3 5
3 | 1 2 2 2 2 3 6
4 | 3 4 4 5
5 | 1 2 7
The figure shows that:
• The distribution peaks in the center.
• There are no gaps in the data.
• For 7 of the 20 days, the number of patients receiving cardiograms was between 31 and 36.
• The testing center treated from a minimum of 2 patients to a maximum of 57 patients in any one day.
If there are no data values in a class, you should write the stem number and leave the leaf row blank. Do not put a
zero in the leaf row.
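
The three steps of Example 2–6 can be sketched in code. The function below assumes non-negative two-digit (or smaller) whole numbers, takes the tens digit as the stem and the units digit as the leaf, and keeps a stem with an empty leaf row when a class has no data values. The name stem_and_leaf is illustrative.

```python
from collections import defaultdict

# Number of cardiograms performed each day, Example 2-6.
cardiograms = [
    25, 31, 20, 32, 13,
    14, 43, 2, 57, 23,
    36, 32, 33, 32, 44,
    32, 52, 44, 51, 45,
]


def stem_and_leaf(data):
    """Return {stem: [leaves]} with leaves in ascending order.

    Assumes non-negative integers; stem = tens part, leaf = units digit.
    Stems with no data are kept with an empty leaf list (no zero added).
    """
    leaves = defaultdict(list)
    for value in sorted(data):          # step 1: arrange in order
        stem, leaf = divmod(value, 10)  # step 2: split by leading digit
        leaves[stem].append(leaf)
    return {
        stem: leaves.get(stem, [])
        for stem in range(min(leaves), max(leaves) + 1)
    }


plot = stem_and_leaf(cardiograms)
for stem, leaf_row in plot.items():     # step 3: display
    print(f"{stem} | {' '.join(map(str, leaf_row))}")
```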
Example 2–7: An insurance company researcher conducted a survey on the number of car thefts in a large city for
a period of 30 days last summer. The raw data are shown. Construct a stem and leaf plot by using
classes 50–54, 55–59, 60–64, 65–69, 70–74, and 75–79.
52 62 51 50 69
58 77 66 53 57
75 56 55 67 73
79 59 68 65 72
57 51 63 69 75
65 53 78 66 55



Solution
Step 1 Arrange the data in order.
50, 51, 51, 52, 53, 53, 55, 55, 56, 57, 57, 58, 59, 62, 63, 65, 65, 66, 66, 67, 68, 69, 69, 72, 73, 75, 75, 77,
78, 79

Step 2 Separate the data according to the classes.
50, 51, 51, 52, 53, 53
55, 55, 56, 57, 57, 58, 59
62, 63
65, 65, 66, 66, 67, 68, 69, 69
72, 73
75, 75, 77, 78, 79

Step 3 Plot the data as shown.

Leading digit (stem) | Trailing digit (leaf)
5 | 0 1 1 2 3 3
5 | 5 5 6 7 7 8 9
6 | 2 3
6 | 5 5 6 6 7 8 9 9
7 | 2 3
7 | 5 5 7 8 9
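
When each stem is split into two classes, as in Example 2–7, the leaves 0 to 4 go on the first row for that stem and the leaves 5 to 9 on the second. The sketch below gets the class of a value by integer division by 5 (for example, 52 // 5 = 10 and 57 // 5 = 11 are the two halves of stem 5). It assumes non-negative whole numbers, and the name split_stem_and_leaf is illustrative.

```python
# Number of car thefts per day over 30 days, Example 2-7.
thefts = [
    52, 62, 51, 50, 69,
    58, 77, 66, 53, 57,
    75, 56, 55, 67, 73,
    79, 59, 68, 65, 72,
    57, 51, 63, 69, 75,
    65, 53, 78, 66, 55,
]


def split_stem_and_leaf(data):
    """Return a list of (stem, [leaves]) rows, two rows per stem.

    Assumes non-negative integers and classes of width 5
    (x0-x4 and x5-x9). Classes with no data keep an empty leaf row.
    """
    classes = {k: [] for k in range(min(data) // 5, max(data) // 5 + 1)}
    for value in sorted(data):
        classes[value // 5].append(value % 10)
    # Class index k covers 5k to 5k + 4, so its stem is k // 2.
    return [(k // 2, leaf_row) for k, leaf_row in classes.items()]


split_rows = split_stem_and_leaf(thefts)
for stem, leaf_row in split_rows:
    print(f"{stem} | {' '.join(map(str, leaf_row))}")
```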
