TOPIC 2 Presentation of Data
PRESENTATION OF DATA
Before applying any statistical technique to raw data, we must arrange and
classify the data in a systematic form so that the statistical work becomes simple and easy.
This is called presentation of data.
The following four methods are usually used for the presentation of data.
(i) Classification (ii) Tabulation
(iii) Diagrammatical (iv) Graphical
CLASSIFICATION:
The process of arranging data into classes or categories according to some common
characteristics present in the data is called classification.
OR
The process of arranging a large number of values into homogeneous groups or
classes is called classification.
For example, when letters are sorted in a post office, they are first classified according
to cities and then arranged according to sectors and streets.
The Basis of Classification:
There are four important bases for classification of data.
(i) Qualitative base (ii) Quantitative base
(iii) Geographical base (iv) Chronological base
(i) Qualitative Base:
The classification is called Qualitative when the data are classified by qualities or
attributes such as gender, marital status, employment status, religion, beauty etc.
(ii) Quantitative Base:
The classification is called Quantitative when the data are classified by quantitative
characteristics such as heights, age, weight, distance, length, income etc.
(iii) Geographical Base:
The classification is called Geographical when the data are classified by geographical
regions or locations. For example, the population of a country may be classified by provinces,
divisions, districts, tehsils or towns etc.
(iv) Chronological Base:
The classification is called Chronological when the data are arranged by successive
time periods. For example, the monthly sales of a departmental store, the yearly enrollment of
students in M.A.O. College, the hourly temperature recorded by the weather bureau etc.
Types of Classification:
Some important types of classification are;
(i) One way classification. (ii) Two way classification.
(iii) Three way classification. (iv) Many way classification.
(i) One Way Classification:
When the data are classified by one characteristic, then the classification is said to be
one way.
For example, the population of a country may be classified by religion as Muslims, Christians
and Sikhs.
(ii) Two Way Classification:
When the data are classified by two characteristics simultaneously (at a time), then the
classification is said to be two way.
For example, the students of Punjab University, Lahore may be classified by Age and
Height.
(iii) Three Way Classification:
When the data are classified by three characteristics simultaneously, then the
classification is said to be three way.
For example, the population of the city of Lahore may be classified by Religion, Sex and Literacy
rate.
(iv) Many Way Classification:
When the data are classified by many characteristics simultaneously, then the
classification is said to be many way.
For example, the population of the city of Lahore may be classified by Religion, Sex, Age, Height,
Literacy rate etc.
TABULATION:
The process of systematic arrangement of data into rows and columns is called
tabulation.
Main Parts of Table and its Construction:
A statistical table has at least four major parts:
(i) The title (ii) The box head
(iii) The stub (iv) The body of table
In addition, some tables have other minor parts:
(v) Prefatory Note or Head Note (vi) Foot Note
(vii) Source Note
[Figure: layout of a statistical table, showing the TITLE, Prefatory Notes, BOX HEAD (Column Captions), STUB (Row Captions), Foot note and Source note]
…………………………………………….
FREQUENCY DISTRIBUTION:
A tabular arrangement of data into classes with the corresponding class frequencies is
called a frequency distribution.
Data which have been classified into various categories or groups are called Grouped data, while
data which have not been arranged in a systematic order are called Raw data or Ungrouped
data.
…………………………………………….
Example# 1: The weights, recorded to the nearest gram, of 60 apples picked out at random
from a consignment are given below;
106 107 76 82 109 107 115 93 187 95
123 125 111 92 86 70 126 68 130 129
139 119 115 128 100 186 84 99 113 204
111 141 136 123 90 115 98 110 78 185
162 178 140 152 173 146 158 194 148 90
107 181 131 75 184 104 110 80 118 82
(i) Construct a grouped frequency distribution with suitable size of class interval.
(ii) Also find the class boundaries and class marks.
Solution:
(i)
Step I: Minimum value = 68, Maximum value = 204
Range = Maximum value – Minimum value
Range = 204 – 68 = 136
Step II: Suitable number of classes = 1 + 3.3 log N
= 1 + 3.3 log (60)
= 1 + 3.3 (1.7782)
= 1 + 5.8681
= 6.8681 ≈ 7
Step III: Class interval = h = Range / Number of classes = 136 / 7 = 19.43 ≈ 20
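Steps I to III can also be checked with a short Python sketch. The function names are illustrative, and rounding the class interval up to the next whole number is an assumption made here; the text simply takes the convenient value 20.

```python
import math

def suggested_classes(n):
    """Number of classes from the rule 1 + 3.3 log N, rounded to a whole number."""
    return round(1 + 3.3 * math.log10(n))

def class_interval(minimum, maximum, classes):
    """Range divided by the number of classes, rounded up (assumed rounding rule)."""
    return math.ceil((maximum - minimum) / classes)

data_range = 204 - 68                 # Step I: range of the 60 apple weights
k = suggested_classes(60)             # Step II
h = class_interval(68, 204, k)        # Step III
print(data_range, k, h)
```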
Step IV: FREQUENCY DISTRIBUTION OF WEIGHTS OF 60 APPLES
Weight (grams)   Tally                   Frequency   (ii) Class boundaries   (ii) Class marks
C – I                                    f           C – B                   X
65 – 84          IIII/ IIII              9           64.5 – 84.5             74.5
85 – 104         IIII/ IIII/             10          84.5 – 104.5            94.5
105 – 124        IIII/ IIII/ IIII/ II    17          104.5 – 124.5           114.5
125 – 144        IIII/ IIII/             10          124.5 – 144.5           134.5
145 – 164        IIII/                   5           144.5 – 164.5           154.5
165 – 184        IIII                    4           164.5 – 184.5           174.5
185 – 204        IIII/                   5           184.5 – 204.5           194.5
TOTAL                                    60
NOTE: (i) To find the class boundaries, take half of the difference between the
lower class limit of one class and the upper class limit of the preceding
class, i.e., (85 − 84)/2 = 0.5. This value is then subtracted from each
lower class limit and added to each upper class limit to obtain the
class boundaries.
(ii) To find the class marks, divide the sum of the lower and upper
class boundaries (or limits) by 2, i.e., (65 + 84)/2 = 74.5, and so on.
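The whole of Example# 1 can be reproduced with the following Python sketch. It assumes whole-number data (so the gap between adjacent class limits is 1 and the boundary adjustment is 0.5), and the function and key names are illustrative only.

```python
weights = [
    106, 107, 76, 82, 109, 107, 115, 93, 187, 95,
    123, 125, 111, 92, 86, 70, 126, 68, 130, 129,
    139, 119, 115, 128, 100, 186, 84, 99, 113, 204,
    111, 141, 136, 123, 90, 115, 98, 110, 78, 185,
    162, 178, 140, 152, 173, 146, 158, 194, 148, 90,
    107, 181, 131, 75, 184, 104, 110, 80, 118, 82,
]

def grouped_frequency(data, start, width, classes):
    """Build class limits, frequencies, boundaries and class marks."""
    rows = []
    for i in range(classes):
        lower = start + i * width
        upper = lower + width - 1                  # limits for whole-number data
        f = sum(lower <= x <= upper for x in data)
        rows.append({
            "limits": (lower, upper),
            "f": f,
            "boundaries": (lower - 0.5, upper + 0.5),
            "mark": (lower + upper) / 2,
        })
    return rows

table = grouped_frequency(weights, start=65, width=20, classes=7)
for row in table:
    print(row["limits"], row["f"], row["boundaries"], row["mark"])
```

The starting limit 65 is the one used in the table above; any convenient value at or below the minimum (68) would also work, but would give different frequencies.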
…………………………………………….
Example# 2: Construct a frequency distribution using a class interval of 0.5 from the
following data representing the lives of 40 similar car batteries recorded to the nearest tenth
of a year. The batteries were guaranteed to last three years. Also find the class boundaries
and class marks.
2.6 2.2 4.1 3.5 4.5 3.2 3.7 3.0 3.7 3.4
1.6 3.1 3.3 3.8 3.1 4.7 3.1 2.5 4.3 3.4
3.6 2.9 3.3 3.9 3.4 3.3 3.1 3.7 4.4 3.2
4.1 1.9 3.5 4.7 3.8 3.2 2.6 3.9 3.0 4.2
Solution: Given h = 0.5
Step I: Minimum value = 1.6, Maximum value = 4.7
Example# 5: The following table shows the weights, recorded to the nearest pound, of 40
students at a university.
138 164 150 132 144 125 149 157 161 145
146 158 140 147 136 148 152 144 150 156
168 126 138 176 163 119 154 165 135 142
146 173 142 147 135 153 140 135 145 128
(a) Tabulate the data into a frequency distribution taking a class interval of size 9.
(b) Make a relative frequency (R.f.), percentage relative frequency and cumulative frequency
(c.f.) distribution.
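One possible way to work through both parts is sketched below in Python. The lowest class limit of 118 is an assumption (the question fixes only the class interval size of 9), and the variable names are illustrative.

```python
from itertools import accumulate

weights_lb = [
    138, 164, 150, 132, 144, 125, 149, 157, 161, 145,
    146, 158, 140, 147, 136, 148, 152, 144, 150, 156,
    168, 126, 138, 176, 163, 119, 154, 165, 135, 142,
    146, 173, 142, 147, 135, 153, 140, 135, 145, 128,
]

start, width, classes = 118, 9, 7      # start = 118 is an assumed lower limit
limits = [(start + i * width, start + i * width + width - 1) for i in range(classes)]

freq = [sum(lo <= x <= hi for x in weights_lb) for lo, hi in limits]
n = sum(freq)
rel = [f / n for f in freq]            # relative frequency = f / total
pct = [100 * r for r in rel]           # percentage relative frequency
cum = list(accumulate(freq))           # "less than" cumulative frequency

for (lo, hi), f, r, p, c in zip(limits, freq, rel, pct, cum):
    print(f"{lo} - {hi}  f={f:2d}  R.f.={r:.3f}  %={p:.1f}  c.f.={c}")
```

The relative frequencies add up to 1 and the last cumulative frequency equals the total number of observations, which gives a quick check on the table.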
Step II
GRAPHICAL REPRESENTATION:
Numerical facts and figures as such do not catch our attention unless they are
presented in an interesting way. Graphical representation of data is one of the most
commonly used methods of presentation. Graphical representation of data may be defined as
“A visual display of statistical data in the form of points, lines, areas and other geometrical
forms and symbols”.
Graphs can not only be made attractive, but they are also easy to comprehend and do
not take much time to read.
The advantages of graphical representation of data are that it makes the reading more
interesting, less time consuming and easily understandable. The disadvantage of graphical
representation is that it lacks detail and is less accurate.
Graphical representation can be divided into two main groups: diagrams and
graphs.
DIAGRAMS OR CHARTS:
A diagram is any one-, two- or three-dimensional form of graphical representation. The
commonly used diagrams or charts are:
(i) Simple Bar Chart (ii) Multiple Bar Chart
(iii) Component Bar Chart or Sub-divided Bar Chart
(iv) Percentage Component Bar Chart
(v) Pictogram (vi) Pie chart
(i) Simple Bar Chart or Diagram:
A Simple Bar Chart is used to represent data having a single variable. Vertical or
horizontal bars are drawn to represent the data when the difference between different
quantities is usually small. The width of the bars is always uniform and has no significance. The
length of the bars is proportional to the size of the quantities. The space between the bars should
not be more than the width of a bar and should not be less than half of its width. Vertical
bars are used to represent time series or quantitative data, while horizontal bars are used to
represent qualitative or geographical data. Data which do not relate to time should be
arranged in ascending or descending order before drawing the chart.
Example# 13: The following table shows the production of wheat in Pakistan during the years
2001 to 2006. Represent the data by a Simple Bar Chart.
Years 2001 2002 2003 2004 2005 2006
Production (Lakh tons) 64 68 73 75 71 81
Solution:
SIMPLE BAR CHART SHOWING PRODUCTION OF WHEAT IN PAKISTAN FOR
THE YEARS 2001 TO 2006
[Figure: vertical bar chart; Y-axis: Production, scale 0 to 100; X-axis: Years, 2001 to 2006]
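The rule that bar length is proportional to the size of the quantity can be illustrated with a small text-only Python sketch. It prints horizontal rows of `#` purely as a stand-in for the drawn vertical bars; the 40-character maximum and the function name are arbitrary choices made here.

```python
years = [2001, 2002, 2003, 2004, 2005, 2006]
production = [64, 68, 73, 75, 71, 81]      # lakh tons, from Example# 13

def bar_lengths(values, max_chars=40):
    """Scale each value so the largest bar is max_chars long."""
    largest = max(values)
    return [round(v * max_chars / largest) for v in values]

lengths = bar_lengths(production)
for year, value, length in zip(years, production, lengths):
    # Every bar uses the same character, so only its length carries information.
    print(f"{year} | {'#' * length} {value}")
```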
Solution:
MULTIPLE BAR CHART SHOWING IMPORTS & EXPORTS OF PAKISTAN FROM
1970 TO 1970
…………………………………………….
Example# 16: Draw a Multiple Bar Chart to represent the male and female population from
the following data;
…………………………………………….
Example# 19: The table given below shows the quantity in hundreds of kilograms of Wheat,
Barley and Oats produced on a certain farm during the years 1971 to 1974. Construct a
Component Bar Chart to illustrate these data.
Years Wheat Barley Oats
1971 34 18 27
1972 43 14 24
1973 43 16 27
1974 45 13 34
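In a component bar chart each bar has a length equal to the total, and the components are stacked one after another along it. A minimal Python sketch of where each segment starts and ends for the data of Example# 19 is given below; the stacking order (Wheat, Barley, Oats) and the function name are assumptions made for illustration.

```python
from itertools import accumulate

crops = ["Wheat", "Barley", "Oats"]
production = {
    1971: [34, 18, 27],
    1972: [43, 14, 24],
    1973: [43, 16, 27],
    1974: [45, 13, 34],
}

def segments(components):
    """Return the (start, end) position of each component along one bar."""
    ends = list(accumulate(components))    # running totals mark segment ends
    starts = [0] + ends[:-1]               # each segment starts where the last ended
    return list(zip(starts, ends))

stacked = {year: segments(values) for year, values in production.items()}
for year, parts in stacked.items():
    print(year, dict(zip(crops, parts)), "total =", parts[-1][1])
```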
Example# 20: Draw a Component Bar Chart for the following population (in Lakh) of Male
and Female in different cities of Pakistan.
Division Both Sexes Male Female
Peshawar 64 33 31
Rawalpindi 40 21 19
Sargodha 60 32 28
Lahore 65 35 30
D.Y.S.
…………………………………………….
(iv) Percentage Component Bar Chart or Percentage Sub-divided Bar Chart:
A Component Bar Chart may also be drawn on a percentage basis. The given components
are expressed as percentages of their respective totals. To draw the Percentage Component
Bar Chart, bars of length equal to 100 are first drawn for each class and then sub-divided
in proportion to the percentages of their components. The Percentage Component
Bar Chart is also known as the Percentage Stacked Bar Chart.
…………………………………………….
Example# 21: The prices (in rupees) of different commodities from the Year 2001 to 2004
are given below. Represent the following data by a Percentage Component Bar Chart.
Years Wheat Rice Ghee Total
2001 800 3000 6500 10300
2002 1000 3200 6800 11000
2003 1150 3500 7000 11650
2004 1200 3800 7150 12150
Solution:
Years   Wheat                           Rice     Ghee     Total
2001    (800/10300) × 100 = 7.8%        29.1%    63.1%    100%
2002    (1000/11000) × 100 = 9.1%       29.1%    61.8%    100%
2003    (1150/11650) × 100 = 9.9%       30.0%    60.1%    100%
2004    (1200/12150) × 100 = 9.9%       31.3%    58.8%    100%
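The percentages in the solution of Example# 21 can be recomputed with the Python sketch below. Rounding to one decimal place matches the figures shown; the names used are illustrative.

```python
prices = {
    2001: {"Wheat": 800, "Rice": 3000, "Ghee": 6500},
    2002: {"Wheat": 1000, "Rice": 3200, "Ghee": 6800},
    2003: {"Wheat": 1150, "Rice": 3500, "Ghee": 7000},
    2004: {"Wheat": 1200, "Rice": 3800, "Ghee": 7150},
}

def percentage_components(row):
    """Express each component as a percentage of the row total."""
    total = sum(row.values())
    return {name: round(100 * value / total, 1) for name, value in row.items()}

percentages = {year: percentage_components(row) for year, row in prices.items()}
for year, row in percentages.items():
    print(year, row)
```

Because each value is rounded separately, the rounded percentages of a row need not add up to exactly 100 in general, although they do for these four years.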
Example# 22: The following table gives the value added (in Crore) in the Agriculture Sector
of Pakistan. Draw Percentage Component Bar Chart to represent the data. (D.Y.S.)
Years     Major Crops   Minor Crops   Others   Total
1972-73   1235          283           672      2190
1973-74   1533          378           897      2808
1974-75   1827          490           1047     3364
1975-76   2072          569           1218     3859
…………………………………………….
(v) Pictogram:
A Pictogram is a chart that uses pictures or symbols to represent data so you don’t
have to look at lots of numbers.
You have to read Pictograms carefully so you understand what the symbols mean.
All Pictograms should have a key. A key shows you what each symbol represents.
Example# 23: Take a look at this pictogram. It shows how many cars a salesman sold during
a week.
…………………………………………….
Example# 25: The areas of the various Continents/countries of the world in millions of
square kilometers are given below. Prepare a Pie Chart of the data given below;
Continent/Country: Africa, Asia, Europe, North America, Oceania, South America, U.S.S.R.
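In a pie chart the full circle of 360° is shared among the components in proportion to their sizes, so the angle of a sector is (component / total) × 360°. Since the areas for this example are not reproduced here, the Python sketch below reuses the 2001 prices from Example# 21 purely as sample input; the function name is illustrative.

```python
def sector_angles(components):
    """Angle of each pie sector in degrees: (value / total) * 360."""
    total = sum(components.values())
    return {name: value / total * 360 for name, value in components.items()}

# Sample input borrowed from Example# 21 (year 2001), not the areas of Example# 25.
sample = {"Wheat": 800, "Rice": 3000, "Ghee": 6500}
angles = sector_angles(sample)
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```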
(viii) The graph should not be marked with too many curves
GRAPHS OF FREQUENCY DISTRIBUTION:
The important graphs of frequency distributions are;
(i) Histogram (ii) Frequency Polygon
(iii) Frequency Curve (iv) Cumulative frequency Curve or Ogive.
(i) Histogram:
A Histogram consists of a set of adjacent rectangles in which class boundaries are
marked along X-axis and frequencies are taken on Y-axis. When the class intervals are equal
then the rectangles all have the same width and the heights of rectangles are directly
proportional to the respective class frequencies. If the class intervals are not equal, then the
heights of the rectangles have to be adjusted accordingly. To adjust the heights of the
rectangles in frequency distributions, each class frequency is divided by its class interval size.
…………………………………………….
Example# 26: Construct Histogram for the following frequency distribution.
HISTOGRAM
…………………………………………….
Example# 27: Construct Histogram for the following frequency distribution.
Classes 10-11 12-14 15-19 20-29 30-34 35-39 40-42
f 4 12 25 60 25 15 6
Solution:
C – I      frequency   C – B          Class Interval Size   Adjusted frequency
10 – 11    4           9.5 – 11.5     2                     4/2 = 2
12 – 14    12          11.5 – 14.5    3                     12/3 = 4
15 – 19    25          14.5 – 19.5    5                     25/5 = 5
20 – 29    60          19.5 – 29.5    10                    60/10 = 6
30 – 34    25          29.5 – 34.5    5                     25/5 = 5
35 – 39    15          34.5 – 39.5    5                     15/5 = 3
40 – 42    6           39.5 – 42.5    3                     6/3 = 2
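The adjustment for unequal class intervals can be sketched in Python as follows. The sketch assumes whole-number class limits with a gap of 1 between adjacent classes, so that 0.5 is subtracted and added to obtain the boundaries; the function name is illustrative.

```python
limits = [(10, 11), (12, 14), (15, 19), (20, 29), (30, 34), (35, 39), (40, 42)]
freq = [4, 12, 25, 60, 25, 15, 6]

def histogram_heights(limits, freq, gap=1):
    """Return (lower boundary, upper boundary, size, adjusted frequency) per class."""
    rows = []
    for (lo, hi), f in zip(limits, freq):
        lower_b = lo - gap / 2
        upper_b = hi + gap / 2
        size = upper_b - lower_b           # class interval size
        rows.append((lower_b, upper_b, size, f / size))
    return rows

rows = histogram_heights(limits, freq)
for lower_b, upper_b, size, height in rows:
    print(f"{lower_b} - {upper_b}  size={size}  height={height}")
```

With the heights adjusted in this way, the area of each rectangle (size × height) equals the class frequency.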
…………………………………………….
Example# 28: Draw a Histogram to illustrate the following data;
Age nearest birthday   20 – 24   25 – 29   30 – 39   40 – 44   45 – 49   50 – 54   55 – 64
Number of men          1         2         26        22        20        15        14
Example# 29: Draw a Histogram, frequency polygon and Cumulative frequency polygon of
the following distribution;
Classes     65 – 84   85 – 104   105 – 124   125 – 144   145 – 164   165 – 184   185 – 204
frequency   9         10         17          10          5           4           5
Example# 30: Draw a Histogram and frequency polygon from the following data;
FREQUENCY POLYGON
Alternative Method
FREQUENCY POLYGON
…………………………………………….
(iii) Frequency Curve:
If the frequency polygon is smoothed, it is called a frequency curve; that is, if
the plotted points of the frequency polygon are joined by a freehand curve instead
of by straight lines, we get the frequency curve. A frequency curve should not touch
the X-axis.
Example# 32: Draw a frequency polygon for the following frequency distribution.
Classes 60-62 63-65 66-68 69-71 72-74 75-77 78-80
f 4 9 14 18 12 7 3
Solution:
C – I 60-62 63-65 66-68 69-71 72-74 75-77 78-80
f 4 9 14 18 12 7 3
X 61 64 67 70 73 76 79
FREQUENCY CURVE
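The points to be plotted are the class marks X paired with the frequencies f. A short Python sketch of that step, with illustrative variable names, is:

```python
limits = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74), (75, 77), (78, 80)]
freq = [4, 9, 14, 18, 12, 7, 3]

# Class mark = (lower limit + upper limit) / 2
marks = [(lo + hi) / 2 for lo, hi in limits]

# Each plotted point is (class mark, frequency); joining them by straight lines
# gives the frequency polygon, and by a smooth freehand curve the frequency curve.
points = list(zip(marks, freq))
print(points)
```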
…………………………………………….
Example# 34: Draw a “more than” cumulative frequency polygon from the following data.
Age 20-24 25-29 30-34 35-39 40-44 45-49 50-54
f 1 2 26 22 20 15 14
Solution:
Age       f     C – B          More than Class Boundaries   c.f
-         -     -              More than 19.5               100
20 – 24   1     19.5 – 24.5    More than 24.5               100 - 1 = 99
25 – 29   2     24.5 – 29.5    More than 29.5               99 - 2 = 97
30 – 34   26    29.5 – 34.5    More than 34.5               97 - 26 = 71
35 – 39   22    34.5 – 39.5    More than 39.5               71 - 22 = 49
40 – 44   20    39.5 – 44.5    More than 44.5               49 - 20 = 29
45 – 49   15    44.5 – 49.5    More than 49.5               29 - 15 = 14
50 – 54   14    49.5 – 54.5    More than 54.5               14 - 14 = 0
TOTAL     100   -              -                            -
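The "more than" cumulative frequencies start from the total and drop by one class frequency at each successive boundary. A minimal Python sketch of this subtraction, with illustrative names, is:

```python
boundaries = [19.5, 24.5, 29.5, 34.5, 39.5, 44.5, 49.5, 54.5]
freq = [1, 2, 26, 22, 20, 15, 14]

def more_than_cf(freq):
    """Cumulative frequencies 'more than' each successive class boundary."""
    cf = [sum(freq)]                 # everything lies above the first boundary
    for f in freq:
        cf.append(cf[-1] - f)        # remove the class just passed
    return cf

cf = more_than_cf(freq)
for b, c in zip(boundaries, cf):
    print(f"More than {b}: {c}")
```

Plotting each cumulative frequency against its boundary and joining the points gives the "more than" cumulative frequency polygon.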
…………………………………………….
(xxv) The cumulative frequency of the last class in a "less than" cumulative frequency
distribution is always equal to;
(a) 100 (b) Σf
(c) 1 (d) Mean
(xxvi) The sum of the relative frequency is always equal to;
(a) 1 (b) Σf
(c) 100 (d) None of these
(xxvii) Simple Bar chart is used to represent the data having;
(a) Two variables (b) Single variable
(c) Three variables (d) None of these
(xxviii) A _________ indicates two or more characteristics corresponding to the values of a
common variable in the form of group;
(a) Multiple bar chart (b) Simple bar chart
(c) Component bar chart (d) None of these
(xxix) To draw the ________________, bars of length equal to 100 are drawn for each class
(a) Multiple bar chart
(b) Simple bar chart
(c) Percentage Component bar chart
(d) None of these
(xxx) Total angle of Pie Chart is;
(a) 180° (b) 360°
(c) 300° (d) 90°
(xxxi) The graph of frequency distribution is called as;
(a) Histogram (b) Historigram
(c) Ogive (d) None of these
(xxxii) A graph of the cumulative frequency distribution is called;
(a) Histogram (b) Historigram
(c) Ogive (d) None of these
(xxxiii) The graph obtained by joining the mid-points at the tops of the adjacent rectangles in
the Histogram is called as;
(a) Frequency polygon (b) Historigram
(c) Ogive (d) None of these
(xxxiv) In case of paired observations, the classification of data in tabular form is known as a;
(a) Uni-variate frequency distribution
(b) Multivariate frequency distribution
(c) Bivariate frequency distribution
(d) None of these
(xxxv) If in a frequency distribution, the lower limit of first class or the upper limit of last
class is not fixed, then it is called;
(a) Closed ended frequency distribution
(b) Open ended frequency distribution
(c) Range
(d) None of these
ANSWERS
Q# Answer Q# Answer Q# Answer
i b xiii d xxv b
ii d xiv c xxvi a
iii c xv c xxvii b
iv c xvi a xxviii a
v b xvii b xxix c
vi a xviii a xxx b
vii b xix b xxxi a
viii b xx a xxxii c
ix d xxi d xxxiii a
x b xxii b xxxiv c
xi b xxiii a xxxv b
xii b xxiv b
…………………………………………….