Unit 4 Frequency Distribution and Graphical Presentation: Structure
Structure
4.0 Introduction
4.1 Objectives
4.2 Arrangement of Data
4.2.1 Simple Array
4.2.2 Discrete Frequency Distribution
4.2.3 Grouped Frequency Distribution
4.2.4 Types of Grouped Frequency Distributions
4.3 Tabulation of Data
4.3.1 Components of a Statistical Table
4.3.2 General Rules for Preparing Table
4.3.3 Importance of Tabulation
4.4 Graphical Presentation of Data
4.4.1 Histogram
4.4.2 Frequency Polygon
4.4.3 Frequency Curves
4.4.4 Cumulative Frequency Curves or Ogives
4.4.5 Misuse of Graphical Presentations
4.5 Diagrammatic Presentation of Data
4.5.1 Bar Diagram
4.5.2 Sub-divided Bar Diagram
4.5.3 Multiple Bar Diagram
4.5.4 Pie Diagram
4.5.5 Pictograms
4.6 Let Us Sum Up
4.7 Unit End Questions
4.8 Glossary
4.9 Suggested Readings
4.0 INTRODUCTION
Data collected from either a primary or a secondary source need to be systematically
presented, as they are invariably in an unsystematic or rudimentary form. Such raw
data fail to reveal any meaningful information. The data should be rearranged
and classified in a suitable manner to bring out the trend and message of the
collected information. This unit, therefore, deals with methods of organising
data in tabular form or in graphical presentation.
4.1 OBJECTIVES
After going through this Unit, you will be able to:
Explain the methods of organising and condensing statistical data;
Define the concept of frequency distribution and state its various types;
Analyse the different methods of presenting statistical data;
Explain how to draw tables, graphs, diagrams, pictograms, etc.; and
Describe the uses and misuses of graphical techniques.
On the other hand, variables that cannot take every possible value
within a given range are termed discrete variables. Examples are the
number of children and the marks obtained in an examination (out of 200).
2) What would be the limits of each class interval? Another factor used in
determining the number of classes is the size or width of the class,
which is known as the ‘class interval’ and is denoted by ‘i’. Class intervals should
be of uniform width, so that all classes of the frequency
distribution are of the same size. The width of the class should be a whole number,
conveniently divisible by 2, 3, 5, 10, or 20.
The width of a class interval (i) = (Largest observation (OL) – Smallest observation (OS)) / Number of classes
After deciding the class interval, the range of scores should be determined by
subtracting the lowest value from the highest value of the data array.
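As a rough illustration of the width calculation, the sketch below divides the spread of the data by a chosen number of classes and rounds up to the next whole number. The function name `class_width` and the sample values (largest 49, smallest 3, five classes) are assumptions made only for illustration.

```python
import math

def class_width(largest, smallest, num_classes):
    """Approximate class width: (largest - smallest) / number of classes,
    rounded up to the next whole number."""
    if num_classes <= 0:
        raise ValueError("num_classes must be positive")
    return math.ceil((largest - smallest) / num_classes)

# Hypothetical values chosen only to illustrate the calculation.
width = class_width(largest=49, smallest=3, num_classes=5)
print(width)
```

Rounding up rather than down keeps the chosen number of classes wide enough to cover every observation.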
Now, the next step is to decide where the classes should start. There are
three methods for describing the class limits of a distribution:
Exclusive method
Inclusive method
True or actual class method
Exclusive method: In this method of class formation, the classes are formed so
that the upper limit of one class is also the lower limit of the next class.
The exclusive method of classification ensures continuity between two successive
classes. In this classification, a score equal to the upper limit of
a class is excluded from that class, i.e., a score of 40 will be included in the class of 40 to 50
and not in the class of 30 to 40.
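The exclusive rule can be sketched as a small function that returns the class to which a score belongs. The name `exclusive_class`, the starting point 0 and the width 10 are illustrative assumptions, not part of the original text.

```python
def exclusive_class(score, start, width):
    """Return (lower, upper) limits of the class containing `score`.

    The lower limit is included in the class and the upper limit is
    excluded, so a score equal to an upper limit falls in the next class.
    """
    offset = (score - start) // width
    lower = start + offset * width
    return lower, lower + width

# A score of 40 goes to the class 40 to 50, not to 30 to 40.
print(exclusive_class(40, start=0, width=10))
print(exclusive_class(39, start=0, width=10))
```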
Finally, we count the number of scores falling in each class and record the
appropriate number in the frequency column. The number of scores falling in each
class is termed the class frequency. Tally marks are used to count these frequencies.
Example: Scores of 30 students are given below. Prepare the frequency
distribution by using exclusive method of classification.
3, 30, 14, 30, 27, 11, 25, 16, 18, 33, 49, 35, 18, 10, 25, 20, 14, 18, 9, 39, 14, 29,
20, 25, 29, 15, 22, 20, 29, 29
The ungrouped data above are difficult to interpret and do not convey any
useful information about the observations.
Solution:
Step 1: First of all arrange the raw scores in ascending order of their magnitude.
3,9,10,11,14,14,14,15,16,18,18,18,20,20,20,22,25,25,25,27,29,29,29,29,30,30,33,35,39,49
Step 2: Determine the range of scores by adding 1 to the difference between
the largest and smallest scores in the data array. For the above array of data it
is (49 – 3) + 1 = 47.
Step 3: Decide the number of classes, say 5 for the present array of data.
Step 4: To decide the approximate size of the class interval, divide the range by
the decided number of classes (5 for this example). If the quotient is a
fraction, take the next integer. For example, 47/5 = 9.4; take it as 10.
Step 5: Find the lower class-limit of the lowest class interval and add the width
of the class interval to get the upper class-limit (e.g. 3-12).
Step 6: Find the class-limits for the remaining classes: (13-22), (23-32),
(33-42), (43-52).
Step 7: Pick up each item from the data array and put a tally mark (I) against
the class to which it belongs. Tallies are marked in bunches of five: four
vertical strokes and a fifth drawn across the first four. Count the
number of observations, i.e., the frequency, in each class (an example is
given in Table 4.3).
Table 4.3: Class intervals, tally marks and frequencies for the data
under the exclusive method.
(Each IIII group stands for a bunch of five: four strokes with a fifth drawn across them.)
Class Interval    Tallies           Frequency
40-50             I                 1
30-40             IIII              5
20-30             IIII IIII II      12
10-20             IIII IIII         10
0-10              II                2
Total                               30
Note: The tallying of the observations in a frequency distribution may be checked
for omitted or duplicated scores: the sum of the frequencies should
equal the total number of scores in the array.
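The counting procedure and the check described in the note can be sketched as follows, using the 30 scores of the example and the exclusive classes 0-10 to 40-50. The function name `exclusive_frequencies` and its parameters are assumptions introduced for this illustration.

```python
from collections import Counter

# The 30 scores from the example above.
scores = [3, 30, 14, 30, 27, 11, 25, 16, 18, 33, 49, 35, 18, 10, 25,
          20, 14, 18, 9, 39, 14, 29, 20, 25, 29, 15, 22, 20, 29, 29]

def exclusive_frequencies(data, start, width, num_classes):
    """Count scores per class; each class includes its lower limit
    and excludes its upper limit."""
    counts = Counter((x - start) // width for x in data)
    table = []
    for k in range(num_classes):
        lower = start + k * width
        table.append((lower, lower + width, counts.get(k, 0)))
    return table

table = exclusive_frequencies(scores, start=0, width=10, num_classes=5)
for lower, upper, freq in table:
    print(f"{lower:>2}-{upper:<2}  {freq}")

# Check from the note: the frequencies must add up to the number of scores,
# otherwise a score was omitted, duplicated, or fell outside the classes.
total = sum(freq for _, _, freq in table)
assert total == len(scores)
```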
Inclusive method: In this method, a class includes the scores that are equal
to its upper limit. The inclusive method is preferred when measurements
are given in whole numbers. The above example may be presented in the following
form by using the inclusive method of classification (refer to the table below).
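A minimal sketch of inclusive counting is given here, assuming the whole-number class limits listed in Steps 5 and 6 of the earlier example (3-12, 13-22, and so on). The function name `inclusive_frequencies` is hypothetical.

```python
# The 30 scores from the earlier example.
scores = [3, 30, 14, 30, 27, 11, 25, 16, 18, 33, 49, 35, 18, 10, 25,
          20, 14, 18, 9, 39, 14, 29, 20, 25, 29, 15, 22, 20, 29, 29]

def inclusive_frequencies(data, class_limits):
    """Count scores per class; both the lower and the upper limit
    of each class belong to that class."""
    return [(lo, hi, sum(1 for x in data if lo <= x <= hi))
            for lo, hi in class_limits]

# Class limits taken from Steps 5 and 6 of the example.
limits = [(3, 12), (13, 22), (23, 32), (33, 42), (43, 52)]
inclusive_table = inclusive_frequencies(scores, limits)
for lo, hi, freq in inclusive_table:
    print(f"{lo:>2}-{hi:<2}  {freq}")
```

Because the limits of adjacent classes do not touch (12 and 13, for example), this layout suits whole-number scores only.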
Table number: When there is more than one table in a particular analysis, each
table should be marked with a number for reference and identification. The
number should be written in the centre at the top of the table.
Title of the table: Every table should have an appropriate title, which describes
the content of the table. The title should be clear, brief, and self-explanatory.
Title of the table should be placed either centrally on the top of the table or just
below or after the table number.
Caption: Captions are brief and self-explanatory headings for columns. Captions
may involve headings and sub-headings. The captions should be placed in the
middle of the columns. For example, we can divide students of a class into males
and females, rural and urban, high SES and Low SES etc.
Stub: Stubs are brief and self-explanatory headings for rows. A relatively
more important classification is given in rows. A stub consists of two parts: (i)
Stub head: It describes the nature of the stub entries. (ii) Stub entry: It is the description
of a row entry.
Body of the table: This is the real table and contains numerical information or
data in different cells. This arrangement of data remains according to the
description of captions and stubs.
Head note: This is written at the extreme right hand below the title and explains
the unit of the measurements used in the body of the tables.
Source of data : The source from which data have been taken is to be mentioned
at the end of the table. Reference of the source must be complete so that if the
potential reader wants to consult the original source they may do so.
– Items in the table should be placed logically and related items should be
placed nearby.
– All items should be clearly stated.
– If an item is repeated in the table, its full form should be written.
– The unit of measurement should be explicitly mentioned preferably in the
form of a head note.
– The rules of forming a table are presented diagrammatically below.
                            TITLE
Stub Head      |                  Caption
Stub Entries   |    Column Head I      |    Column Head II
               | Sub Head | Sub Head   | Sub Head | Sub Head
It simplifies complex data: If data are presented in tabular form, these can be
readily understood. Confusions are avoided while going through the data for
further analysis or drawing the conclusions about the observation.
It facilitates comparison: Data in the statistical table are arranged in rows and
columns very systematically. Such an arrangement enables you to compare the
information in an easy and comprehensive manner.
Tabulation presents the data in true perspective: With the help of tabulation, the
repetitions can be dropped out and data can be presented in true perspective
highlighting the relevant information.
Figures can be worked-out more easily: Tabulation also facilitates further analysis
and finalization of figures for understanding the data.
Self Assessment Questions
1) What points are to be kept in mind when deciding, in preparing
a frequency distribution, (a) the number of classes and (b) the
width of the class interval?
Introduction to Statistics
2) Differentiate between following pairs of statistical terms
i) Column and row entry
ii) Caption and stub head
iii) Head note and foot note
3) State briefly the importance of tabulation in statistical analysis.
4.4.1 Histogram
It is one of the most popular methods for presenting a continuous frequency
distribution in the form of a graph. In this type of distribution, the upper limit of a
class is the lower limit of the following class. The histogram consists of a series of
rectangles, each with a width equal to the class interval of the variable on the horizontal
axis and a height equal to the corresponding frequency on the vertical axis. The
steps in constructing a histogram are as follows:
Step 1: Construct a frequency distribution in table form.
Step 2: Before drawing the axes, decide on a suitable scale for the horizontal axis,
then determine the number of squares (on the graph paper) required
for the width of the graph.
Step 3: Draw bars of equal width for each class interval. The height of a bar
corresponds to the frequency in that particular interval. The edge of a
bar represents both the upper real limit for one interval and the lower
real limit for the next higher interval.
Step 4: Identify class intervals along the horizontal axis by using either the real
limits or the midpoints of the class intervals. Real limits are
placed under the edges of each bar, whereas midpoints
are placed under the middle of each bar.
Step 5: Label both axes and give an appropriate title to the histogram.
Table 4.8: Results of 200 students on an academic achievement test.
Class Interval    Frequency
10-20             12
20-30             10
30-40             35
40-50             55
50-60             45
60-70             25
70-80             18
Let us take a simple example to demonstrate the construction of a histogram based
on the above data.
[Figure: histogram; horizontal axis: Achievement Scores (10 to 80); vertical axis: Frequency (0 to 60)]
Fig. 4.1: Histogram
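The construction steps can be sketched in code by turning each class of Table 4.8 into one rectangle. The function name `histogram_bars` and the text rendering (one `#` for roughly five students) are assumptions made for illustration; a plotting library could draw the same rectangles on real axes.

```python
# Class intervals and frequencies from Table 4.8: (lower, upper, frequency).
classes = [(10, 20, 12), (20, 30, 10), (30, 40, 35), (40, 50, 55),
           (50, 60, 45), (60, 70, 25), (70, 80, 18)]

def histogram_bars(rows):
    """Return one bar per class as (left edge, width, height).

    The width is the class interval and the height is the frequency,
    so adjacent bars share an edge at each class limit.
    """
    return [(lower, upper - lower, freq) for lower, upper, freq in rows]

bars = histogram_bars(classes)

# Rough text rendering: one '#' per 5 students (hypothetical scale).
for left, width, height in bars:
    print(f"{left:>2}-{left + width:<2} | {'#' * round(height / 5)} {height}")
```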
4.4.2 Frequency Polygon
Draw the abscissa (X-axis) from ‘O’ to ‘X’ and the ordinate (Y-axis) from ‘O’
to ‘Y’. Label the class intervals on the abscissa, using either the exact
limits or the midpoints of the class intervals. It is also customary to add
one extra class with zero frequency at each end of the class-interval range.
The number of small squares on the graph paper given to each class depends
upon the number of classes to be plotted. The next step is to mark the
frequencies on the ordinate, using the most convenient scale for the range of
the whole distribution. To obtain a well-proportioned figure, a ratio of 3:4
between ordinate and abscissa is recommended, though there is no hard and fast
rule in this regard. To plot a frequency polygon, mark each frequency above its
class at the corresponding height on the ordinate. After marking all the
frequencies, draw a line joining the points. This is the polygon. A polygon is
a many-sided figure, and with a small N or an irregular frequency distribution
its outline can be jagged, so the frequencies are sometimes smoothed. The
common way is to compute the smoothed frequency of a class as the average of
its own frequency and the frequencies of the classes just above and just below
it. For instance, the frequency 4 of class interval 75-79 might be smoothed as
(6 + 4 + 5)/3 = 5.
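The plotting points and the smoothing rule can be sketched as below, using the Table 4.8 frequencies. The function names `polygon_points` and `smooth` are hypothetical, and treating the class beyond either end as having frequency 0 when smoothing is an assumption; the text does not say how the end classes are handled.

```python
# Class intervals and frequencies from Table 4.8: (lower, upper, frequency).
classes = [(10, 20, 12), (20, 30, 10), (30, 40, 35), (40, 50, 55),
           (50, 60, 45), (60, 70, 25), (70, 80, 18)]

def polygon_points(rows):
    """Midpoint/frequency pairs, with one extra zero-frequency class
    added at each end so the polygon closes on the horizontal axis."""
    width = rows[0][1] - rows[0][0]
    points = [(rows[0][0] - width / 2, 0)]
    points += [((lo + hi) / 2, f) for lo, hi, f in rows]
    points.append((rows[-1][1] + width / 2, 0))
    return points

def smooth(freqs):
    """Average each frequency with its two neighbours.
    Assumption: a missing neighbour at either end counts as 0."""
    padded = [0] + list(freqs) + [0]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3
            for i in range(1, len(padded) - 1)]

points = polygon_points(classes)
smoothed = smooth([f for _, _, f in classes])
print(points)
print([round(s, 2) for s in smoothed])

# The worked example from the text: neighbours 6 and 5 around frequency 4.
print(smooth([6, 4, 5])[1])
```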
[Figure: frequency polygon; horizontal axis: Achievement Scores (0 to 90); vertical axis: Frequency (0 to 60)]
4.4.4 Cumulative Frequency Curve or Ogive
The graph of a cumulative frequency distribution is known as a cumulative
frequency curve or ogive. Since there are two types of cumulative frequency
distribution, i.e., ‘less than’ and ‘more than’ cumulative frequencies, we can
have two types of ogives.
i) ‘Less than’ Ogive: In a ‘less than’ ogive, the less than cumulative frequencies
are plotted against the upper class boundaries of the respective classes. It is
an increasing curve that slopes upwards from left to right.
ii) ‘More than’ Ogive: In a ‘more than’ ogive, the more than cumulative frequencies
are plotted against the lower class boundaries of the respective classes. It is
a decreasing curve that slopes downwards from left to right.
Example of ‘less than’ and ‘more than’ cumulative frequencies based on the data
reported in Table 4.8
Class Interval    Frequency    Less than c.f.    More than c.f.
10-20             12           12                200
20-30             10           22                188
30-40             35           57                178
40-50             55           112               143
50-60             45           157               88
60-70             25           182               43
70-80             18           200               18
The ogives for the cumulative frequency distributions given in the above table are
drawn in Fig. 4.3.
[Figure: ogive; curve label: Less than; horizontal axis: Achievement Score (10 to 80); vertical axis ticks: 0 to 175]
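The two kinds of cumulative frequency, and the points at which each ogive is plotted, can be computed with a running total. This is a minimal sketch using the class frequencies of the achievement-test data; the variable names are assumptions.

```python
from itertools import accumulate

# Class intervals and frequencies: (lower, upper, frequency).
classes = [(10, 20, 12), (20, 30, 10), (30, 40, 35), (40, 50, 55),
           (50, 60, 45), (60, 70, 25), (70, 80, 18)]
freqs = [f for _, _, f in classes]

# 'Less than' c.f.: running total from the lowest class upwards.
less_than = list(accumulate(freqs))
# 'More than' c.f.: running total from the highest class downwards.
more_than = list(accumulate(reversed(freqs)))[::-1]

# 'Less than' ogive is plotted against upper class boundaries,
# 'more than' ogive against lower class boundaries.
less_than_points = [(hi, cf) for (_, hi, _), cf in zip(classes, less_than)]
more_than_points = [(lo, cf) for (lo, _, _), cf in zip(classes, more_than)]

print(less_than_points)
print(more_than_points)
```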
A sub-divided bar diagram for the hypothetical data given in Table 4.9 above is
drawn in Fig. 4.5.
[Figure: sub-divided bar diagram; labels: Frequency, Metropolitan City, Psychological Parameters]
Fig. 4.5: Subdivided Bar Diagram
Degree of any component part = (Component Value / Total Value) × 360°
After the angle for each component has been calculated, segments are drawn in
the circle in succession, each with its calculated angle at the centre.
Different segments are distinguished by different colours, shades or numbers.
Table 4.11: Placement of 1000 software engineers who passed out from an
institute X and were placed in four different companies in 2009.
Company    Placement
A          400
B          200
C          300
D          100
Pie diagram representing placement in four different companies.
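Applying the degree formula to the placement figures of Table 4.11 gives the angle of each segment. The sketch below is only an illustration; the function name `pie_angles` is an assumption.

```python
# Placement figures from Table 4.11.
placements = {"A": 400, "B": 200, "C": 300, "D": 100}

def pie_angles(values):
    """Angle of each segment = (component value / total value) x 360 degrees."""
    total = sum(values.values())
    return {name: value * 360 / total for name, value in values.items()}

angles = pie_angles(placements)
for company, angle in angles.items():
    print(f"{company}: {angle:.0f} degrees")
```

The angles of all segments should add up to 360 degrees, which is a quick check on the arithmetic.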
4.5.5 Pictograms
Pictograms are also known as cartographs. In a pictogram, appropriate pictures
are used to represent the data, the number or the size of the pictures being
proportional to the magnitudes to be presented. For showing the
population of human beings, human figures are used; for example, we may represent 1 lakh
people by one human figure. Pictograms present only approximate values.
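Why a pictogram is only approximate can be seen from a small sketch in which one symbol stands for 1 lakh (100,000) people. The function name `pictogram`, the `*` symbol and the two population figures are hypothetical and serve only as an illustration.

```python
def pictogram(value, unit, symbol="*"):
    """Return one symbol per `unit`, rounded to the nearest whole symbol.

    The rounding is the reason a pictogram shows approximate values only.
    """
    return symbol * round(value / unit)

# Hypothetical populations; one symbol represents 1 lakh (100,000) people.
populations = {"Town P": 420_000, "Town Q": 260_000}
for name, people in populations.items():
    print(f"{name}: {pictogram(people, unit=100_000)}")
```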
Self Assessment Questions
1) Explain the following terms:
i) Frequency polygon
ii) Bar diagram
iii) Subdivided bar diagram
iv) Multiple bar diagram
v) Pie diagram
4.8 GLOSSARY
Abscissa (X-axis) : The horizontal axis of a graph.
Array : A rough grouping of data.
Bar diagram : Thick vertical lines corresponding to the values of variables.
Body of the Table : The real table, which contains numerical information or
    data in different cells.
Caption : The part of a table that labels the data presented in a column
    of the table.
Classification : A systematic grouping of data.
Continuous : When data are regular in a classification.
Cumulative frequency distribution : A classification which shows the
    cumulative frequency below the upper real limit of the corresponding
    class interval.
Data : Any sort of information that can be analysed.
Discrete : When data are counted in a classification.
Exclusive classification : The classification system in which the upper
    limit of a class becomes the lower limit of the next class.
Histogram : A set of adjacent rectangles presented vertically with areas
    proportional to the frequencies.
Frequency distribution : Arrangement of data values according to their
    magnitude.
Frequency Polygon : A broken line graph representing a frequency
    distribution.
Inclusive classification : When the lower limit of a class differs from the
    upper limit of the preceding class.
Ogive : The graph of a cumulative frequency distribution.
Open-end distributions : Classifications having no lower or upper
    endpoints.
Ordinate (Y-axis) : The vertical axis of a graph.
Pictogram : A presentation of data in the form of pictures.
Pie diagram : A circle sub-divided into components to present the
    proportions of the different constituent parts of a total.
Primary data : The information gathered directly from the variable.
Qualitative classification : When data are classified on the basis of
    attributes.
Quantitative classification : When data are classified on the basis of
    number or frequency.
Relative frequency distribution : A frequency distribution where the
    frequency of each value is expressed as a fraction or percentage of the
    total number of observations.
Secondary data : Information gathered through already maintained records
    about a variable.
Stub : A part of a table; it stands for brief and self-explanatory
    headings of rows.
Tabulation : A systematic presentation of classified data in rows and
    columns with appropriate headings and sub-headings.
Yule, G. U., and Kendall, M. G. (1991). An Introduction to the Theory of Statistics.
Universal Books, Delhi.
Sani, F., and Todman, J. (2006). Experimental Design and Statistics for
Psychology: A First Course. Blackwell Publishing.