Lecture 3-Exploring and Making Sense of Data-Deriving Information

The document discusses different ways of organizing and presenting data through frequency distributions. It describes how to construct frequency tables for both categorical and quantitative variables, including ungrouped and grouped frequency tables. Examples are provided for each.

Uploaded by

lsejeso15

Exploring and Making Sense of Data-Presenting Information
Lecture 3-Class Discussion Notes
BM & EBL Year 1
Kelebogile Kenalemang
Introduction
• We are now moving on to discuss ways of organising and presenting
data so that relevant information can be derived.
• The ideas and techniques considered here are very simple but they
must be applied appropriately if they are to result in useful and timely
information management.
• In this lecture it is assumed that the required data have been
collected by an appropriate method from an appropriate population
or sample.
• We are now ready to process and organise the data to find the
required information.
Frequency Distributions
• One of the most frequently used ways of organizing data is by means
of frequency distributions.
• In larger data sets, it is likely that values will be repeated. A
frequency is the number of times a data value appears in a data set.
• A frequency distribution is a representation, in either graphical or
tabular format, that displays the number of observations within a
given interval.
Frequency Table
• Frequency tables can show either categorical variables (sometimes
called qualitative variables) or quantitative variables (sometimes
called numeric variables). You can think of categorical variables as
categories (like eye color or brand of dog food) and quantitative
variables as numbers.
• The table summarizes the various frequencies and is known as a
frequency distribution.
Example: Categorical Variable
• The following table shows which family planning methods were
used by teens in Kweneng West, Botswana.
• The left column shows the categorical variable (method used) and
the right column shows the frequency (the number of teens using
that particular method).
Example: Quantitative (Numerical) Variables
• Suppose that 20 statistics students’ marks on an exam are as follows:
97, 92, 88, 75, 83, 67, 89, 55, 72, 78, 81, 91, 57, 63, 67, 74, 87, 84, 98,
46
• We can construct a frequency table with classes 90-99, 80-89, 70-79
etc., by counting the number of grades in each grade range.
Class    Frequency (f)
90-99    4
80-89    6
70-79    4
60-69    3
50-59    2
40-49    1
• A frequency table lists intervals or ranges of data values called data
classes together with the number of data values from the set that are
in each class.
• This number is called the frequency of the class.
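The counting described above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the names `marks`, `classes` and `frequency_table` are chosen here for the example, and the class limits are treated as inclusive.

```python
# Exam marks of the 20 statistics students from the example above.
marks = [97, 92, 88, 75, 83, 67, 89, 55, 72, 78,
         81, 91, 57, 63, 67, 74, 87, 84, 98, 46]

# Classes as (lower limit, upper limit) pairs, highest class first as in the table.
classes = [(90, 99), (80, 89), (70, 79), (60, 69), (50, 59), (40, 49)]

# Count how many marks fall inside each class (both limits inclusive).
frequency_table = {
    f"{low}-{high}": sum(low <= mark <= high for mark in marks)
    for low, high in classes
}

for label, frequency in frequency_table.items():
    print(f"{label:>5}  {frequency}")
```

A useful check is that the frequencies add up to the number of data values, here 20.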
Types of Quantitative Frequency Tables

• Ungrouped Frequency Tables: In an ungrouped frequency
distribution, each data value is listed together with the number of
times that it occurs.
• An ungrouped frequency distribution is used for discrete data with
relatively few distinct values.
• Grouped Frequency Tables: In a grouped frequency distribution, the
data are sorted and separated into groups called classes.
• Grouped frequency tables are used for continuous data, and for
discrete data that take many distinct values (such as the exam marks
above).
Discrete vs Continuous Data
• Continuous Data: This is quantitative data that is measured on a
scale, such as weight or temperature. The data have an infinite number
of possible values within a range and can take on fractions or
decimals, e.g. money, weight, height.

• Discrete Data: This is data that takes on whole values and is
obtained by counting, such as the number of defective items in a
batch or the number of students in a class. The data cannot be
subdivided, e.g. you can't have half a student or 1/3 of a student.
How to Construct an Ungrouped Frequency Table
• In a certain area, 50 households were surveyed.
• The following data give the occupancy of each household.
4 7 4 1 4 2 3 6 3 5
6 3 4 9 12 1 3 4 2 2
1 1 3 8 1 1 4 2 3 4
3 2 1 4 6 5 6 1 2 3
4 4 4 1 4 2 3 5 4 4
• Construct a frequency distribution for these data.
• The discrete variable is 'the number of occupants in the household'.
• The minimum value is 1 and the maximum value is 12, hence, in this
data set, the variable takes values from 1 to 12 inclusive.
• The number of households with a given number of occupants is the
'frequency'.
• Determine the frequency of each value using a tally system. It is
necessary (and desirable) to read through the data set only once, which
reduces the risk of error.
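As a rough sketch of the tally approach, the household data can be counted in a single pass with Python's `collections.Counter`. The variable names (`occupancy`, `tally`, `ungrouped_table`) are illustrative, and listing the unused values 10 and 11 with frequency 0 is a choice made here, not something the lecture prescribes.

```python
from collections import Counter

# Occupancy of the 50 surveyed households, row by row as listed above.
occupancy = [
    4, 7, 4, 1, 4, 2, 3, 6, 3, 5,
    6, 3, 4, 9, 12, 1, 3, 4, 2, 2,
    1, 1, 3, 8, 1, 1, 4, 2, 3, 4,
    3, 2, 1, 4, 6, 5, 6, 1, 2, 3,
    4, 4, 4, 1, 4, 2, 3, 5, 4, 4,
]

# Counter reads through the data once, like working down a tally chart.
tally = Counter(occupancy)

# List every value from the minimum to the maximum, including values
# that never occur (these get a frequency of 0).
ungrouped_table = {
    value: tally.get(value, 0)
    for value in range(min(occupancy), max(occupancy) + 1)
}

for value, frequency in ungrouped_table.items():
    print(f"{value:>2}  {'|' * frequency:<14}  {frequency}")
```

The printed tally marks make it easy to see that 4 occupants is the most common household size in this data set.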
Grouped Frequency Table/Distribution
• When dealing with large amounts of data it is usual to group the data
into classes.
• For example, in a weight loss program, the weights in kilograms of 30
people were measured using a scale.
• We could list all 30 weights here but such a long list of data is
cumbersome.
• Instead, we can group the data into several weight ranges or classes.
• Such a table is called a Grouped frequency distribution.
Steps for constructing a grouped frequency table from a data set
• 1. If the number of classes is not given, decide on a number
of classes to use. This number should be between 5 and 20.
• 2. Find the class width by determining the range (max - min) of the
data and dividing it by the number of classes you chose in step 1.
• 3. Round up to the next convenient number (if the result is already
a whole number, still round up to the next whole number).
• 4. Find the class limits: use the minimum data entry as the lower
limit of the first class. To get the lower limit of the next class,
add the class width. Continue until you reach the last class.
• 5. Then find the upper limit of each class. Since the classes cannot
overlap, and your data will occasionally include decimal numbers,
it is fine for the upper limits to be decimals.
• 6. Count the number of data entries in each class, and record the
number in the row of the table for that class. (The book recommends
using "tally" marks to count.)
• The groups are usually referred to as classes or class intervals.
• The range of values included in a class is referred to as the class
width. The minimum and maximum values included in the class are
referred to as class boundaries.
• The numbers used to specify the class are called the class limits.
Example of a Grouped Frequency Table

• For example, let’s say you have a list of IQ scores for a gifted
classroom in an elementary school.
• The IQ scores are: 118, 123, 124, 125, 127, 128, 129, 130, 130, 133,
136, 138, 141, 142, 149, 150, 154.
• That list doesn’t tell you much about anything. You could draw a
frequency distribution table, which will give a better picture of your
data than a simple list.
• Pick 5 classes for this example.
• Find the class width by calculating the range and dividing it by the
number of classes you picked above. Range = max - min = 154 - 118 = 36
• Class width = range ÷ number of classes = 36 ÷ 5 = 7.2, which is
rounded up to 8.
• Find the class limits. Use the minimum data entry as the lower limit
of the first class, e.g. 118 will be the lower limit of the first class.
• Add the class width 8 to 118 to get the next lower class limit: 118 + 8 =
126
• Keep adding the class width to the previous lower class limit until you
have created the number of classes you chose in Step 1.
• We chose 5 classes, so our 5 lower class limits are:
118
126
134
142
150
• Write down the upper class limits.
• These are the highest values that can be in each class, so in most
cases you can subtract 1 from the class width and add that to the
lower class limit (here 118 + 7 = 125). For example:
118-125
126-133
134-141
142-149
150-157
Finishing up the table
• Add a second column for the frequencies

IQ         Frequency
118-125    4
126-133    6
134-141    3
142-149    2
150-157    2
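The whole IQ example can be retraced with a short Python sketch. It assumes whole-number data (so each upper limit is the lower limit plus the class width minus 1) and applies the "always round up" rule from step 3 as floor + 1. All variable names are illustrative.

```python
import math

iq_scores = [118, 123, 124, 125, 127, 128, 129, 130, 130,
             133, 136, 138, 141, 142, 149, 150, 154]
number_of_classes = 5  # step 1: chosen number of classes

# Step 2: range divided by the number of classes.
data_range = max(iq_scores) - min(iq_scores)   # 154 - 118 = 36
raw_width = data_range / number_of_classes     # 7.2

# Step 3: round up; a result that is already whole still moves up by one.
class_width = math.floor(raw_width) + 1        # 8

# Step 4: lower limits start at the minimum and increase by the class width.
lower_limits = [min(iq_scores) + i * class_width
                for i in range(number_of_classes)]

# Step 5: for whole-number data, upper limit = lower limit + width - 1.
class_limits = [(low, low + class_width - 1) for low in lower_limits]

# Step 6: count the scores falling in each class (limits inclusive).
grouped_table = {
    f"{low}-{high}": sum(low <= score <= high for score in iq_scores)
    for low, high in class_limits
}

for label, frequency in grouped_table.items():
    print(f"{label}  {frequency}")
```

For data containing decimals, the upper limits would need a different rule, as step 5 of the general procedure notes.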
Class exercise
• Construct a frequency table with 6 data classes from the following
data set.
• Amount of gas purchased by 28 drivers:
• 7, 4, 18, 4, 9, 8, 8, 7, 6, 2, 9, 5, 9, 12, 4, 14, 15, 7, 10, 2, 3, 11, 4, 4, 9,
12, 5, 3
Cumulative Frequency Distribution

• The cumulative frequency of a class is the sum of the frequency of
that class and the frequencies of all classes listed before it in the
frequency distribution.
• All that means is you're adding up a value and all of the values that
came before it.
• Cumulative frequency distributions can be summarized in a table.
Example (notice that the last entry in the cumulative frequency column is n = 20):

Class    Frequency (f)    Cumulative Frequency
90-99    4                4
80-89    6                10
70-79    4                14
60-69    3                17
50-59    2                19
40-49    1                20
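The running total in the cumulative frequency column can be sketched with `itertools.accumulate` from Python's standard library. The lists below simply restate the exam-marks table; the variable names are illustrative.

```python
from itertools import accumulate

# Classes and frequencies from the exam-marks table, in table order.
classes = ["90-99", "80-89", "70-79", "60-69", "50-59", "40-49"]
frequencies = [4, 6, 4, 3, 2, 1]

# Each entry is the class frequency plus all the frequencies before it.
cumulative = list(accumulate(frequencies))

for label, f, cf in zip(classes, frequencies, cumulative):
    print(f"{label}  {f:>2}  {cf:>2}")
```

The last cumulative value always equals the total number of data values, which gives a quick check on the arithmetic.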
Class exercise
• Add a cumulative frequency column to the table of gas purchases.
Relative Frequency
• The relative frequency of a data class is the proportion of data
elements in that class. We can calculate the relative frequency for
each class as follows:

• Relative frequency = class frequency (f) ÷ total number of data values (n) = f / n

• Percentage frequency = relative frequency × 100


Example
Class    Frequency (f)    Cumulative Frequency    Relative Frequency (f / n)
90-99    4                4                       0.20
80-89    6                10                      0.30
70-79    4                14                      0.20
60-69    3                17                      0.15
50-59    2                19                      0.10
40-49    1                20                      0.05
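The relative and percentage frequency columns follow directly from f / n, as the sketch below suggests. It reuses the exam-marks frequencies; the dictionary names are illustrative.

```python
# Class frequencies from the exam-marks table.
frequencies = {"90-99": 4, "80-89": 6, "70-79": 4,
               "60-69": 3, "50-59": 2, "40-49": 1}

n = sum(frequencies.values())  # total number of data values (20)

# Relative frequency = f / n; percentage frequency = relative frequency x 100.
relative = {label: f / n for label, f in frequencies.items()}
percentage = {label: 100 * f / n for label, f in frequencies.items()}

for label in frequencies:
    print(f"{label}  {relative[label]:.2f}  {percentage[label]:.0f}%")
```

The relative frequencies should add up to 1 and the percentage frequencies to 100, apart from small rounding differences.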
Class exercise

• Add relative frequency and percentage frequency columns to the
gas example.
Describing Data Sets
• Once we have organized our data in a frequency distribution, we may
want to display the data so as to be more descriptive. We will look at
six ways to describe our data:
1. Bar Graphs
2. Pie Charts
3. Histograms
4. Frequency Polygons/Line charts
5. Stem and leaf plots (own reading)
6. Box plots (own reading)
Bar Charts/Graphs

• Bar charts may be simple, compound or component.
• In a bar chart, information is represented by rectangles or bars. The bars may
be drawn horizontally or vertically. The length of each bar corresponds to the
frequency.
• A bar chart is suitable for categorical data only, e.g. gender, age groups,
education level.
• The scale on the frequency axis must always include zero, i.e. do not suppress
the zero.
• Reading assignment: what are the advantages of using a bar chart to display
data?
Pie Charts
• In a pie chart, a circular 'pie' is divided into a number of portions, with
each portion, or sector, of the pie representing a different category.
• The whole pie represents all categories together.
• The size of each portion must represent the number in its category, and
this is done by dividing the circle proportionately.
• This diagram is suitable only if we have a single variable which is
subdivided to show percentages or proportions. The pie chart will then
illustrate the relative size of the subdivisions.
• Pie charts are good for displaying data for around 6 categories or fewer.
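Dividing the circle proportionately means giving each category a sector angle of (frequency ÷ total) × 360 degrees. The sketch below is only an illustration: it reuses the exam-mark classes from the earlier example as stand-in categories, which is an assumption made here and not the data behind the lecture's own pie chart.

```python
# Stand-in categories: the exam-mark classes and their frequencies.
# (Assumption for illustration only; not the lecture's pie chart data.)
frequencies = {"90-99": 4, "80-89": 6, "70-79": 4,
               "60-69": 3, "50-59": 2, "40-49": 1}

total = sum(frequencies.values())

# Each sector's angle is its share of the full 360-degree circle.
sector_angles = {label: 360 * f / total for label, f in frequencies.items()}

for label, angle in sector_angles.items():
    print(f"{label}  {angle:.0f} degrees")
```

The angles should add up to 360 degrees, since the whole pie represents all categories together.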
Example of a Pie Chart
Diagrams to Display non-categorical data
• If the data are organised into a frequency distribution or a grouped
frequency distribution, then a histogram or a frequency polygon
(essentially the same diagram) may be used to illustrate the
distribution.
• If the data are organized into a cumulative frequency distribution,
then an ogive or a cumulative frequency polygon (same thing) may be
used to illustrate the distribution.
The Histogram
• The histogram is one of the most used diagrams for the illustration of
a frequency distribution.
• A histogram is a graphical representation of the information in a
frequency table using a bar graph with sides touching.
• The histogram should have the variable being measured in the data
set as its horizontal axis, and the class frequency as the vertical axis.
• Each data class will be represented by a vertical bar whose height is
the frequency of the class and whose width is the class width.
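To see the idea without any plotting software, the sketch below builds a rough text-only histogram of the exam-marks table, with classes in ascending order along the horizontal axis and touching bars whose height is the class frequency. The function `text_histogram` is a hypothetical helper written for this illustration; a real chart would normally be drawn with a plotting tool.

```python
# Exam-mark classes in ascending order along the horizontal axis.
classes = ["40-49", "50-59", "60-69", "70-79", "80-89", "90-99"]
frequencies = [1, 2, 3, 4, 6, 4]


def text_histogram(labels, counts):
    """Return the lines of a text histogram with vertical, touching bars."""
    width = max(len(label) for label in labels) + 1  # same width for every class
    lines = []
    # One row per frequency level, from the tallest bar down to 1.
    for level in range(max(counts), 0, -1):
        bars = "".join(("#" if count >= level else " ") * width
                       for count in counts)
        lines.append(f"{level:>2} |{bars}".rstrip())
    # Horizontal axis and class labels underneath the bars.
    lines.append("   +" + "-" * (width * len(labels)))
    lines.append("    " + "".join(label.ljust(width) for label in labels).rstrip())
    return lines


histogram_lines = text_histogram(classes, frequencies)
print("\n".join(histogram_lines))
```

Because every class here has the same width, all bars are equally wide and only their heights differ.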
Example of a histogram with even class intervals
Frequency Polygon
• A frequency polygon is a line graph representation of the information in a frequency
table.
• This is a modification of the histogram and may be used in its place. It is constructed
by plotting the frequency density at the mid-point of each class (with equal class
widths, the frequency itself can be plotted).
• The plotted points are joined, dot-to-dot, by straight lines. The polygon should be a
closed figure, therefore the first and last points are joined to the axis (zero
frequency density) at the adjacent mid-points.

• Like a histogram, the vertical axis represents frequency and the horizontal axis
represents the variable being measured in the data set. To construct the graph, a
point is plotted for each class at its midpoint and with height given by the frequency
of the class. The points are then connected by straight lines.
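The points of a frequency polygon can be worked out before any drawing is done. The sketch below uses the exam-marks classes, assumes equal class widths (so frequency rather than frequency density is plotted), and takes each midpoint as the average of the lower and upper class limits. The variable names are illustrative.

```python
# Exam-mark classes in ascending order, with their frequencies.
classes = [(40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
frequencies = [1, 2, 3, 4, 6, 4]

# Midpoint of each class: average of its lower and upper limits.
midpoints = [(low + high) / 2 for low, high in classes]

# Equal class widths are assumed: distance between consecutive lower limits.
class_width = classes[1][0] - classes[0][0]

# Close the polygon by adding zero-frequency points at the adjacent
# midpoints before the first class and after the last class.
points = [(midpoints[0] - class_width, 0)]
points += list(zip(midpoints, frequencies))
points.append((midpoints[-1] + class_width, 0))

for x, y in points:
    print(f"({x}, {y})")
```

Joining these points in order with straight lines gives the closed figure described above.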
An example of a Frequency Polygon
Interpretation of Charts and Diagrams
• It is very easy to produce diagrams which mislead the user.
• Great care should always be taken to ensure that any diagram which
you draw conveys an accurate impression of the data.
• By the same token, you should also critically examine diagrams
produced by others, in order to avoid being misled yourself.
Reading Assignment
• Study the advantages and disadvantages of using each chart
• Know at least four advantages and disadvantages under each.
