Unit 03 - 04


Unity University College

BLOCK 3: CLASSIFICATION AND PRESENTATION OF STATISTICAL DATA

UNIT 3: CLASSIFICATION OF DATA


UNIT 4: GRAPHICAL PRESENTATION OF DATA

INTRODUCTION

In the preceding block, you have learnt about the collection of data. When conducting a
statistical study, you must gather data for the particular variable under study.

In this block, you will learn about classification and presentation of data. The first part deals with
‘classification of data’ and the following unit deals with ‘presentation of data’.

The purpose of this block is to explain how to organize data by constructing ‘frequency
distribution’ and how to present the data by constructing graphs and charts. The graphs and
charts illustrated in this block include histograms, frequency polygons, ogives, pie charts, bar
charts, time series graphs, and pictographs (pictograms).

Creating Opportunity Through Education 37



UNIT 3: CLASSIFICATION OF DATA

CONTENTS:
3.0. Aims and Objectives
3.1 Introduction
3.2. Definition of Classification of Data
3.3. Types of Classification
3.4. Frequency Distribution
3.5. Common Terminologies in a Grouped Frequency Distribution
3.6. Rules for Forming a Grouped Frequency Distribution
3.7. Cumulative Frequency Distribution (CFD)
3.8. Relative Frequency Distribution (RFD)
3.9. Summary
3.10.Answers to Check Your Progress (CYP) Questions
3.11. Model Examination Questions
3.12. Glossary
3.13. References

3.0. AIMS AND OBJECTIVES

The aim of this unit is to study the collection of data for a statistical study, to discuss the
various types of classification of data, and then to organize these data into a frequency
distribution.

At the end of this unit, you will be able to:

• Define ‘classification of data’
• Identify the types of ‘classification of data’
• Define what a ‘Frequency Distribution (FD)’ is
• Organize raw data into a ‘Frequency Distribution (FD)’
• Define ‘presentation of data’
• Represent the FD in a histogram, frequency polygon or cumulative frequency curve (ogive)
• Present data using such common diagrammatic techniques as bar charts, pie charts, and
  pictograms (pictographs).

3.1. INTRODUCTION

After collecting relevant information (data) for the purpose of statistical investigation, the next
important task is the classification and presentation of these data. It is difficult to grasp the meaning
of any considerable volume of numerical data unless their mass is somehow reduced to
relatively few convenient classes or categories and presented with the help of some kind of
visual aid.


This section discusses classification of data. Presentation of data using graphs and charts will be
seen in the next unit.

3.2. DEFINITION OF CLASSIFICATION OF DATA

Classification: is the process of arranging things in groups or classes according to their
resemblance.

Purposes of Classification:
• To eliminate unnecessary detail
• To bring out clearly points of similarity and dissimilarity
• To enable one to form mental pictures of objects or measurements
• To enable one to make comparisons and draw inferences

3.3. TYPES OF CLASSIFICATION

1. Geographical Classification: Data are arranged according to places like continents,
regions, and countries.

Example
Region Common Language Spoken
1 Tigrigna
2 Afar
3 Amharic
4 Oromifa

2. Chronological Classification:- Data are arranged according to time like year, month.

Example
Year (in EC) Population (in million)
1974 30
1986 52
1991 60

3. Qualitative Classification: - Data are arranged according to attributes like color, religion,
marital-status, sex, educational background, etc.


Example 3. Employees in Factory X

        Educated              Uneducated
    Female    Male        Female    Male

4. Quantitative Classification: In this type of classification, the statistical data are classified
according to some quantitative variable. The variable may be either discrete or continuous.

Example 4.
Mr. x Height (X) in cm
A 160
B 182
C 175
D 178

Note: There are two kinds of variables, which can have values: Discrete Variable and
Continuous Variable.

A. Discrete Variables – are variables that are associated with enumeration or counting
Example
Number of students in a class
Number of children in a family, etc

B. Continuous Variables – are variables associated with measurement.


Example
Weights of 10 students.
Heights of 12 persons.
Distance covered by a car between two stations etc.

3.4. FREQUENCY DISTRIBUTION

When the raw data have been collected, they should be put into an ordered array, in ascending
or descending order, so that they can be looked at more objectively. Then the data must be
organized into a frequency distribution (FD), which simply lists the values or classes with their
corresponding frequencies in tabular form. Here, frequency refers to the number of times a certain
value occurs in the data.


The tabular representation of values of a variable together with the corresponding frequency is
called a Frequency Distribution (FD).

Definition:

A frequency distribution is the organization of raw data in table form, using classes and
frequencies.

Frequency distributions are of two kinds:

A. Ungrouped Frequency Distribution (UFD)

An ungrouped frequency distribution lists each value of a variable together with its frequency.

Example 7. Consider the number of children in 15 families.

1 0 3 2 0
2 4 1 3 1
4 1 2 2 3

Construct an ungrouped FD for the above data.
Solution:
No. of Children (Values)    No. of Families (Tallies)    Frequency
0                           //                           2
1                           ////                         4
2                           ////                         4
3                           ///                          3
4                           //                           2
Total                                                    15
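
The tallying in Example 7 can also be checked mechanically. The following is a minimal Python sketch, not part of the original module; the names `children` and `ufd` are illustrative. It counts how often each value occurs among the 15 families.

```python
from collections import Counter

# Number of children in the 15 families of Example 7
children = [1, 0, 3, 2, 0,
            2, 4, 1, 3, 1,
            4, 1, 2, 2, 3]

# Count how many times each value occurs (the ungrouped FD)
ufd = Counter(children)

print("Value  Frequency")
for value in sorted(ufd):
    print(f"{value:>5}  {ufd[value]:>9}")
print(f"Total  {sum(ufd.values()):>9}")
```

The frequencies must add up to the number of observations, which is a quick check on any hand-made tally.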
CYP 1
Consider the following scores in a statistics test obtained by 20 students in a given class.
10, 4, 4, 7, 5, 7, 7, 8, 5, 7, 8, 5, 10, 8, 7, 5, 7, 8, 7, 4
Prepare an ungrouped FD

B. Grouped Frequency Distribution (GFD)

If the mass of the data is very large, it is necessary to condense the data into an appropriate
number of classes or groups of values of a variable and to indicate the number of observed values
which fall into each class. Therefore, a GFD is a frequency distribution in which the values of a
variable are put into groups, each listed with the number of observations it contains.


Example *
Values (xi) 1 - 25 26 - 50 51 - 75 76 - 100
Frequency (fi) 3 10 18 6

3.5. COMMON TERMINOLOGIES IN A GFD

i. Class:- group of values of a variable between two specified numbers called lower class limit
(LCL) & upper class limit (UCL)

In Example *, the GFD contains four classes: 1 – 25, 26 – 50, 51 – 75, and 76 – 100.
LCL1 = 1, UCL1 = 25      LCL3 = 51, UCL3 = 75
LCL2 = 26, UCL2 = 50     LCL4 = 76, UCL4 = 100

ii. Class Frequency (or Simply Frequency): refers to the number of observations
corresponding to a class.

In Example *, the class frequencies of the 1st, 2nd, 3rd, and 4th classes are 3, 10, 18 and 6,
respectively.

iii. Class Boundaries: are obtained by subtracting half of the unit of measurement (u) from the
lower limit of a class and by adding ½(u) to its upper limit, i.e.
        UCBi = UCLi + ½(u)
        LCBi = LCLi - ½(u)
where UCBi is the upper class boundary and LCBi is the lower class boundary of the ith class.
Remark: The unit of measurement (u) is the gap between any two successive classes, i.e.

        u = lower limit of a class – upper limit of the preceding class.

In Example *, consider the 2nd class, 26 – 50. Since u = 26 – 25 = 1,
        LCL2 = 26                      UCL2 = 50
        LCB2 = 26 - ½(1) = 25.5        UCB2 = 50 + ½(1) = 50.5

iv. Class Width (size of a class or class interval): it is the difference between the upper and
lower class limits or the difference between the upper and lower class boundaries of any class.

Remarks:
1. If both the LCL & UCL are included in a class, it is called an inclusive class. For
inclusive classes,
Class width (cw) = UCBi - LCBi


2. If LCL is included and the UCL is not included in a class, it is called an exclusive class.
For exclusive classes

cw = UCLi – LCLi

To be consistent, we use inclusive classes.

v. Class Mark (cm): it is the midpoint (center) of a class.

        cmi = (UCBi + LCBi) / 2

Note: The difference between any two successive class marks is equal to the width of a class.

vi. Range (R): is the difference between the largest (L) and the smallest (S) values in a
data set.

        R = L – S
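
As an illustration of the terms above, here is a small Python sketch (an addition, with illustrative names such as `describe` and `summary`) that derives the class boundaries, class width and class mark for the four inclusive classes of Example *.

```python
# Classes of Example * given as (lower limit, upper limit) pairs
classes = [(1, 25), (26, 50), (51, 75), (76, 100)]

# Unit of measurement: gap between two successive classes (26 - 25 = 1)
u = classes[1][0] - classes[0][1]

def describe(lcl, ucl, u):
    """Return boundaries, width and class mark of an inclusive class."""
    lcb = lcl - u / 2           # lower class boundary
    ucb = ucl + u / 2           # upper class boundary
    return {"LCB": lcb, "UCB": ucb, "cw": ucb - lcb, "cm": (lcb + ucb) / 2}

summary = [describe(lcl, ucl, u) for lcl, ucl in classes]
for (lcl, ucl), row in zip(classes, summary):
    print(f"{lcl}-{ucl}: {row}")
```

For the second class this reproduces the boundaries 25.5 and 50.5 worked out above, and successive class marks differ by the class width, as the note on class marks states.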

CYP 2 consider the following GFD

Class Frequency (f)


5–9 2
10 – 14 6
15 – 19 12
20 – 24 7
25 – 29 3
Total 30

a. What is the class frequency of the 3rd class?
b. How many observations (items) fall into the last class?
c. Find i. the LCL and UCL of the fourth class
        ii. the UCB and LCB of the third class
        iii. the class interval (class width) of the fifth class
        iv. the class mark (midpoint) of the second class

3.6. RULES FOR FORMING A GROUPED FREQUENCY DISTRIBUTION

To construct a GFD, the following points should be considered:

1) The classes should be clearly defined. That is, each observation should fall into one and
only one class.


2) The number of classes should be neither too large nor too small.
Normally, 5 to 20 classes are recommended.
3) All the classes should be of the same width. An approximate suitable class width can be
obtained as:

        cw ≈ Range / Number of classes,   i.e.   cw ≈ R / n = (L – S) / n

Example 8. Let R / n = 6.8263.
        If all the observations are whole numbers, cw = 7
        If all the observations are to one decimal place, cw = 6.8
        If all the observations are to two decimal places, cw = 6.83, etc.
Note that a suitable number of classes can be obtained by using the formula n ≈ 1 + 3.322 log N,
rounded up or down to the nearest whole number, where N is the total number of observations
and log is the base-10 logarithm.

Remark: Unequal class intervals create problems in graphing and in computing some statistical
measures.

4) Determine the class limits:
        i. determine the lower class limit of the first class (LCL1), then
           LCL2 = LCL1 + cw, LCL3 = LCL2 + cw, …, LCLi+1 = LCLi + cw
        ii. determine the upper class limit of the first class (UCL1), i.e.
           UCL1 = LCL1 + cw – u, where u = the unit of measurement, then
           UCL2 = UCL1 + cw, UCL3 = UCL2 + cw, …, UCLi+1 = UCLi + cw
5) Complete the GFD with the respective class frequencies.

Example 9. The number of customers on 30 consecutive days in a supermarket was listed as
follows:

20 48 65 25 48 49
35 25 72 42 22 58
53 42 23 57 65 37
18 65 37 16 39 42
49 68 69 63 29 67
a. construct a GFD with a suitable number of classes
b. complete the distribution obtained in (a) with class boundaries & class marks

Solution:
i. Range = Largest value – Smallest value
         = 72 – 16 = 56

ii. N = 30 (total number of observations)
    Number of classes, n = 1 + 3.322 log 30
                         = 1 + 3.322 (1.4771)
                         = 5.9
    Hence a suitable number of classes is n = 6.


iii. Class width, cw = Range / n = 56 / 6 = 9.33
     For the sake of convenience, take cw to be 10 (note that it is also possible to
     choose the cw to be 9).
iv. Take the lower limit of the 1st class (LCL1) to be 16 and u = 1,
     i.e. LCL1 = 16 and UCL1 = LCL1 + cw – u = 16 + 10 – 1 = 25
     LCL2 = LCL1 + cw = 16 + 10 = 26        UCL2 = UCL1 + cw = 25 + 10 = 35
     LCL3 = LCL2 + cw = 26 + 10 = 36        UCL3 = UCL2 + cw = 35 + 10 = 45

Therefore, the GFD would be

a)
Class (xi)    Frequency (fi)
16 – 25       7
26 – 35       2
36 – 45       6
46 – 55       5
56 – 65       6
66 – 75       4

b)
Class (xi)    Frequency (fi)    CBi            cmi
16 – 25       7                 15.5 – 25.5    20.5
26 – 35       2                 25.5 – 35.5    30.5
36 – 45       6                 35.5 – 45.5    40.5
46 – 55       5                 45.5 – 55.5    50.5
56 – 65       6                 55.5 – 65.5    60.5
66 – 75       4                 65.5 – 75.5    70.5
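
The steps of Example 9 can be retraced in code. The sketch below is an illustrative addition in Python: the variable names are arbitrary, and the class width is fixed at 10 by hand, following the choice made in the solution, rather than derived automatically.

```python
import math

# Daily customer counts from Example 9
data = [20, 48, 65, 25, 48, 49,
        35, 25, 72, 42, 22, 58,
        53, 42, 23, 57, 65, 37,
        18, 65, 37, 16, 39, 42,
        49, 68, 69, 63, 29, 67]

u = 1                                          # unit of measurement (whole numbers)
data_range = max(data) - min(data)             # R = L - S = 72 - 16
n = round(1 + 3.322 * math.log10(len(data)))   # suggested number of classes
cw = 10                                        # R / n = 9.33, taken as 10 as in the text

gfd = []
lcl = min(data)                                # LCL1 = 16
for _ in range(n):
    ucl = lcl + cw - u                         # UCLi = LCLi + cw - u
    freq = sum(lcl <= x <= ucl for x in data)  # observations falling in the class
    gfd.append((lcl, ucl, freq, lcl - u / 2, ucl + u / 2, (lcl + ucl) / 2))
    lcl += cw                                  # LCLi+1 = LCLi + cw

print("Class    f   Boundaries     cm")
for lcl, ucl, freq, lcb, ucb, cm in gfd:
    print(f"{lcl}-{ucl}  {freq:>2}   {lcb}-{ucb}   {cm}")
```

Because each observation is tested against inclusive limits and the classes do not overlap, every value is counted in one and only one class, as rule 1 requires.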

CYP 3
Construct a grouped frequency distribution for the following ages of 50 persons with 6 classes.
37 40 69 35 36 70 72 62 36 72
65 64 47 59 55 42 45 50 46 65
54 63 51 50 61 60 58 58 56 58
55 45 49 51 50 56 44 60 70 44
52 43 55 46 42 62 57 48 60 55

3.7. CUMULATIVE FREQUENCY DISTRIBUTION (CFD)

It is the collection of values of a variable above or below specified values in a distribution. A CFD
is of two types.
a. ‘Less Than’ Cumulative Frequency Distribution (<CFD): shows the collection of
cases lying below the upper class boundaries of each class.

b. ‘More Than’ Cumulative Frequency Distribution (>CFD): shows the collection of
cases lying above the lower class boundaries of each class.


Remark: The frequency distribution does not tell us directly the number of units above or
below specified values of the classes. This can be determined from a ‘Cumulative Frequency
Distribution’.

Example 11. Consider the following frequency distribution.

Class (xi)    Frequency (fi)    Less than Cumulative    More than Cumulative
                                Frequency (<cfi)        Frequency (>cfi)
3 – 6         4                 4                       30
7 – 10        7                 11                      26
11 – 14       10                21                      19
15 – 18       6                 27                      9
19 – 22       3                 30                      3

This means that, from the ‘less than’ cumulative frequency distribution, there are 4 observations less
than 6.5, 11 observations below 10.5, etc., and, from the ‘more than’ cumulative frequency
distribution, 30 observations are above 2.5, 26 above 6.5, etc.
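
Both cumulative columns of Example 11 are running totals, taken from opposite ends of the frequency column. A minimal Python sketch (an illustrative addition; `less_than` and `more_than` are names chosen here):

```python
from itertools import accumulate

# Class frequencies of Example 11, for classes 3-6, 7-10, 11-14, 15-18, 19-22
freqs = [4, 7, 10, 6, 3]

# 'Less than' cumulative frequencies: running total from the first class
less_than = list(accumulate(freqs))

# 'More than' cumulative frequencies: running total from the last class
more_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)
print(more_than)
```

The last ‘less than’ value and the first ‘more than’ value both equal the total number of observations.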

3.8. RELATIVE FREQUENCY DISTRIBUTION (RFD)

It enables the researcher to know the proportion or percentage of cases in each class. Relative
frequencies can be obtained by dividing the frequency of each class by the total frequency. A relative
frequency can be converted into a percentage frequency by multiplying it by 100%, i.e.

        Rfi = fi / n

where Rfi is the relative frequency of the ith class,
      fi is the frequency of the ith class, and
      n is the total number of observations.
Note: Pfi = Rfi × 100%, where Pfi is the percentage frequency of the ith class.

Example 14: The relative and percentage frequency distributions of the distribution in Example 11 are:


xi         fi    Rfi      % freq. (Pfi)
3 – 6      4     4/30     4/30 × 100
7 – 10     7     7/30     7/30 × 100
11 – 14    10    10/30    10/30 × 100
15 – 18    6     6/30     6/30 × 100
19 – 22    3     3/30     3/30 × 100
Total      30    1        100%
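
The fractions in the table can be evaluated numerically. This is a small Python sketch added for illustration; the rounding to two decimal places is a choice made here, not something the module prescribes.

```python
# Class frequencies for the classes 3-6, 7-10, 11-14, 15-18, 19-22
freqs = [4, 7, 10, 6, 3]
n = sum(freqs)                             # total number of observations

rel = [f / n for f in freqs]               # Rfi = fi / n
pct = [round(100 * r, 2) for r in rel]     # Pfi = Rfi x 100%

for f, r, p in zip(freqs, rel, pct):
    print(f"f = {f:>2}   Rf = {r:.4f}   Pf = {p}%")
```

The relative frequencies add up to 1 and the percentage frequencies to 100%, apart from rounding.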
3.9 SUMMARY

This unit discussed the definitions of classification of data and a frequency distribution. In order
to describe situations, draw conclusions or make inferences about random events, one must
organize the data in some meaningful way. The most convenient method of organizing data is to
construct a frequency distribution.

Therefore, a frequency distribution was seen as a distribution showing the correspondence of
values or classes with their respective frequencies.

3.10 ANSWERS TO CHECK YOUR PROGRESS (CYP) QUESTIONS

CYP 1
Value(xi) Frequency(fi)
4 3
5 4
7 7
8 4
10 2

CYP 2
a) 12
b) 3
c) i) LCL4 = 20 and UCL4 = 24
   ii) Since u = 10 – 9 = 1 (or any gap between two consecutive classes),
       LCB3 = LCL3 – ½(u) = 15 – ½(1) = 14.5
       UCB3 = UCL3 + ½(u) = 19 + ½(1) = 19.5
   iii) class interval = class width = cw = UCB5 – LCB5 = 29.5 – 24.5 = 5
   iv) class mark (cm2) = (UCB2 + LCB2) / 2
                        = (14.5 + 9.5) / 2
                        = 24 / 2
                        = 12


CYP 3

i) Largest Value (L) = 72 and Smallest Value (S) = 35
   Range (R) = L – S = 72 – 35 = 37
ii) N = 50 (total number of observations)
iii) Select the number of classes desired (usually between 5 and 20);
     in this case, n = 6 as given in the question.
iv) Class width (cw) = Range / number of classes = R / n,
    i.e. cw = 37 / 6 = 6.1666… ≈ 7 (round the answer up to the next whole number)
v) Select a starting point as the lower class limit (this is usually the smallest score, i.e.
   LCL1 = 35). Add the class width (cw = 7) to that score to get the lower limit of
   the next class. Keep adding until there are 6 classes, as shown:
   35
   42
   49
   56
   63
   70
vi) Subtract one unit from the lower limit of the second class to get the upper limit of
    the first class; then add the class width (cw) to each upper limit to get all the
    upper limits, i.e. UCL1 = LCL2 – 1 = 42 – 1 = 41. So the first class is 35 – 41.
vii) Tally the data (count the number of observations falling into the respective
     classes) and write the numerical values of the tallies in the frequency column.
Therefore, the frequency distribution would be:

Class Limits    Tally              Frequency (fi)
35 – 41         /////              5
42 – 48         ///// ///// /      11
49 – 55         ///// ///// //     12
56 – 62         ///// ///// ///    13
63 – 69         /////              5
70 – 76         ////               4

3.11. MODEL EXAMINATION QUESTIONS

Direction: Answer each of the following questions.

1. Determine whether each statement is true or false.

a) A frequency distribution is the organization of raw data, in table form, that lists
values or classes with their corresponding frequencies.


b) The mid point of a class is found by adding the upper and lower limits and
dividing by
c) If the gap between any two successive classes is one and the limits of a class are 10-19,
then the width of the class is 9.
d) If the limits of a class in a frequency distribution are 26-30, then the boundaries are
25.5-30.5.
e) When data is first collected, it is called raw data.

f) A frequency distribution should contain between 50 and 100 classes.


g) It is not important to keep the width of each class the same in a frequency distribution.

2. Classify each variable as discrete or continuous.

a) Number of cartons of milk manufactured each day.


b) Temperatures of airplane interiors at a given airport.
c) Lifetimes of transistors in a stereo set.
d) Weights of newborn calves.
3. 100 employees were surveyed in a factory to find out their ages. The result was obtained as
follows.

32 21 28 31 35 46 48 49 49 48
36 37 22 31 28 34 20 45 44 48
38 33 33 23 28 29 33 26 36 30
43 42 32 36 24 27 27 32 45 45
39 39 38 32 33 25 30 28 37 36
42 43 38 40 35 34 20 30 36 32
40 38 38 40 46 36 35 21 31 35
41 42 39 40 46 44 32 37 22 27
41 39 40 38 44 45 48 36 32 23
40 41 40 44 49 49 49 49 37 33

Construct a Grouped Frequency Distribution (GFD) with five classes for the above data.

3.12. GLOSSARY

Raw Data: Data collected in original form.

Frequency: The number of values in a specific class of the distribution or the number of times a
value occurs in the distribution.

Cumulative Frequencies: refer to the total frequency of all values up to and including the upper
boundary of the class interval that is under consideration.


Frequency Distribution: A table showing classes or values with their corresponding
frequencies.

Class: Refers to a group of data considered as one item in a frequency distribution.

Range: Means the difference between the largest and the smallest values in a set of data.

Class Interval: Refers to the difference between the class limits (or boundaries) of a class.

Class Limits: Means the limits of the different classes in a frequency distribution.

Class Boundaries: Boundaries that are obtained by subtracting half of the unit of
measurement from the lower class limit and adding half of it to the upper class limit.
measurement.

3.13. REFERENCES

• Allen G. Bluman, Elementary Statistics: A Step by Step Approach.
• Anderson, Sweeney, Williams, Statistics for Business and Economics, Fifth Edition, 1986.
• Douglas A. Lind, Robert D. Mason, Basic Statistics for Business & Economics, Second Edition.
• Richard I. Levin, Statistics for Management, Third Edition, 1984.
• Stephen A. Book, Essentials of Statistics, 1978.


UNIT 4: PRESENTATION OF DATA

CONTENTS:
4.0. Aims and Objectives
4.1. Introduction
4.2. Histogram
4.3. Frequency Polygon
4.4. Cumulative Frequency Curve (Ogive)
4.5. Line Graph
4.6. Vertical Line Graph
4.7. Bar Chart (Bar Diagram)
4.8. Types of Bar Charts
4.9. Pie Chart
4.10. Pictograph (Pictogram)
4.11. Summary
4.12. Answer to Check Your Progress Questions (CYP)
4.13. Model Examination Questions
4.14. Glossary
4.15. References

4.0. AIMS AND OBJECTIVES

The aim of this unit is to study how to construct and present data using different types of graphs,
charts, and diagrams that facilitate comparisons and, in general, give a good overall
picture of the data.

At the end of this unit, you will be able to:

• Define ‘presentation of data’
• Identify different types of ‘graphs’ and ‘charts’
• Identify the types of ‘bar charts’
• Construct a ‘histogram’, ‘frequency polygon’, ‘ogives’, and other graphs and ‘charts’.

4.1. INTRODUCTION

This unit deals with organizing a set of raw data into a Frequency Distribution (FD)
and describing the distribution graphically in a histogram, a frequency polygon, and a cumulative
frequency curve (ogive). Other types of numerical information will be summarized and
presented in the form of a bar chart, a pie chart or a pictogram.

Definition:


Presentation is a statistical procedure of arranging and putting data in the form of tables,
graphs, charts and/or diagrams.

4.2. HISTOGRAM

After you complete a frequency distribution, your next step will be to construct a “picture” of
these data values using a histogram. A histogram is a graph consisting of a series of adjacent
rectangles whose bases are equal to the class width of the corresponding classes and whose
heights are proportional to the corresponding class frequencies. Here, class boundaries are
marked along the horizontal axis (x-axis) and the class frequencies along the vertical axis
(y-axis) according to a suitable scale. It describes the shape of the data. You can use it to answer
quickly such questions as: Are the data symmetric? Where do most of the data values lie?

Example 1. Consider the following GFD and construct a histogram.

Class (xi) Frequency (fi)


3–6 4
7 – 10 7
11 – 14 10
15 – 18 6
19 - 22 3
Total 30

Solution:

[Figure: Histogram for the above distribution. x-axis: class boundaries (CBi) at 2.5, 6.5, 10.5, 14.5, 18.5, 22.5; y-axis: class frequency (fi).]
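
To make the construction concrete, the sketch below (an illustrative Python addition, using only text output rather than a plotting library) derives the bar edges and heights for the distribution of Example 1. Each bar runs from the lower to the upper class boundary, so adjacent bars share an edge.

```python
# Grouped distribution of Example 1: (lower limit, upper limit) and frequency
classes = [(3, 6), (7, 10), (11, 14), (15, 18), (19, 22)]
freqs = [4, 7, 10, 6, 3]

# Unit of measurement: gap between successive classes (7 - 6 = 1)
u = classes[1][0] - classes[0][1]

# One bar per class: (left edge, right edge, height)
bars = [(lo - u / 2, hi + u / 2, f) for (lo, hi), f in zip(classes, freqs)]

# Rough text rendering, one '#' per observation
for lcb, ucb, f in bars:
    print(f"{lcb:>4} - {ucb:<4} | {'#' * f} ({f})")
```

The text rendering is drawn sideways, but the edges and heights are the ones a drawn histogram would use.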


CYP 1 construct a histogram for the following distribution

Class (xi) Frequency (fi)


5 – 10 4
10 – 15 7
15 – 20 9
20 – 25 12
25 - 30 6
30 – 35 5

4.3. FREQUENCY POLYGON

It is a line graph of a frequency distribution. Although a histogram does demonstrate the shape of
the data, the shape can often be illustrated more clearly by using a frequency polygon. Here,
you merely connect the centers of the tops of the histogram bars (located at the class midpoints)
with a series of straight lines. The resulting figure is a frequency polygon. The class marks
are plotted along the x-axis and the class frequencies along the y-axis. Empty classes are
included at each end so that the curve is anchored to the x-axis.
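
The vertices of a frequency polygon can be listed before drawing it. The following Python sketch is an illustrative addition; it uses the grouped distribution of Example 9 in Unit 3 (classes 16 – 25 to 66 – 75) and adds one empty class at each end.

```python
# Grouped distribution of Example 9 (Unit 3)
classes = [(16, 25), (26, 35), (36, 45), (46, 55), (56, 65), (66, 75)]
freqs = [7, 2, 6, 5, 6, 4]

cw = classes[1][0] - classes[0][0]             # class width = 10
marks = [(lo + hi) / 2 for lo, hi in classes]  # class marks 20.5, 30.5, ...

# Empty classes at both ends anchor the polygon to the x-axis
points = [(marks[0] - cw, 0)] + list(zip(marks, freqs)) + [(marks[-1] + cw, 0)]

for x, y in points:
    print(x, y)
```

Joining these points in order with straight line segments gives the polygon.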

Example 2. Construct a frequency polygon for the frequency distribution given in Example 9.

Solution:

[Figure: A frequency polygon for the distribution in Example 9. x-axis: class marks (cmi); y-axis: frequency (fi).]


CYP 2 construct a frequency polygon for the frequency distribution given under CYP 1

4.4. CUMULATIVE FREQUENCY CURVE (OGIVE)

It is the graphic representation of a cumulative frequency distribution. Ogives are of two kinds:
the ‘less than’ ogive (< ogive) and the ‘more than’ ogive (> ogive).
A) ‘Less than’ ogive: here, the ‘less than’ cumulative frequencies are plotted against the upper
class boundaries of the respective classes, and the points are joined by straight line segments.
Example 3. Draw a ‘less than’ ogive for the frequency distribution in Example 11.

Solution:

[Figure: A ‘less than’ ogive for the above frequency distribution. x-axis: upper class boundaries (UCBi) at 6.5, 10.5, 14.5, 18.5, 22.5; y-axis: ‘less than’ cumulative frequency (<Cfi).]

B) ‘More than’ ogive: here, the ‘more than’ cumulative frequencies are plotted against the lower
class boundaries of the respective classes, and the points are joined by straight line segments.

Example 4. Draw a ‘more than’ ogive for the frequency distribution in Example 11.

Solution:


[Figure: A ‘more than’ ogive for the above frequency distribution. x-axis: lower class boundaries (LCBi) at 2.5, 6.5, 10.5, 14.5, 18.5; y-axis: ‘more than’ cumulative frequency (>Cfi).]

4.5. LINE GRAPH

It represents the relationship between time (on the x-axis) and the values of a variable (on the y-axis).
The values are recorded with respect to the time of occurrence.

Example 5. Draw a line graph for the following time series.

Year 1986 1987 1988 1989 1991


Values 20 10 30 15 1

Solution:

[Figure: A line graph showing the above time series. x-axis: year (1986 to 1991); y-axis: values.]


4.6. VERTICAL LINE GRAPH

A vertical line graph is a graphical representation of discrete data (or characteristics expressed
with whole numbers) with respect to the frequencies. Vertical solid lines are used to indicate the
frequencies.

Example 6. Draw a vertical line graph for the following data

Family A B C D E
Number of children 3 2 7 6 4

Solution:

[Figure: Vertical line graph showing the number of children in families A, B, C, D and E. x-axis: family; y-axis: number of children.]

4.7. BAR CHART (BAR DIAGRAM):

Histograms, frequency polygons, and ogives are used for data having an interval or ratio level of
measurement. Other ways of presenting statistical data, suitable for particular kinds of
situations, are bar charts, pie charts and pictographs.

A bar chart is a series of equally spaced bars of uniform width where the height (length) of a bar
represents the amount (magnitude) of the frequency corresponding to a category. Bars may be
drawn horizontally or vertically. Vertical bar graphs are preferred as they allow comparison with
other bars.

4.8. TYPES OF BAR CHARTS

A. Simple Bar Chart:

It represents a single set of data (variable) classified into different categories. Single bars are
drawn with the respective frequencies.

Example 18: The revenue (in millions of Birr) of company X from 1980 to 1982 is given below.


Year Revenue
1980 50
1981 150
1982 200

Solution:

[Figure: A simple bar chart showing the revenues of company X from 1980 to 1982. x-axis: year; y-axis: revenue.]

B. Multiple Bar Chart:

Here, two or more bars are grouped in each category to represent two or more
interrelated sets of data. The bars of related variables are kept adjacent to each other
for every set of values. These charts can be used if the overall total is not required. Each bar
is shaded or colored separately and a key is given to distinguish them.

Example 19: The following table shows the production of wheat and maize in hundreds of
quintals.

Year    Maize    Wheat
1980    40       80
1981    20       60
1982    60       100

Solution:

[Figure: Multiple bar chart of wheat and maize production. x-axis: year (1980 to 1982); y-axis: number of quintals; key: maize, wheat.]

C. Subdivided Bar Chart:

It is used to present data by subdividing a single bar with respect to the proportional frequency.
Each portion of the bar is then shaded or colored and a key is given to distinguish them.

Example 20: The number of quintals of wheat and maize (in millions of quintals) produced by
country X in the indicated years.

Year Wheat Maize


1980 150 150
1981 300 200
1982 350 100
Solution:


[Figure: Subdivided bar chart of the number of quintals of wheat and maize produced by country X. x-axis: year (1980 to 1982); y-axis: number of quintals; key: maize, wheat.]

D. Percentage Bar Chart:

It is a subdivided bar chart where percentages are used in each classification rather than the actual
frequencies.

Example 21: Construct a percentage bar chart for the data in Example 20.
Solution:
Year    % of Wheat Production    % of Maize Production
1980    150/300 × 100 = 50       150/300 × 100 = 50
1981    300/500 × 100 = 60       200/500 × 100 = 40
1982    350/450 × 100 ≈ 78       100/450 × 100 ≈ 22

[Figure: Percentage bar chart of wheat and maize production from 1980 to 1982. x-axis: year; y-axis: percentage produced; key: wheat, maize.]
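
The percentages used for a percentage bar chart are each crop's share of that year's total. A minimal Python sketch, added for illustration with the wheat and maize figures of Example 20 (the dictionary layout and the rounding to whole percentages are choices made here):

```python
# Production (in millions of quintals) by year, from Example 20
production = {
    1980: {"wheat": 150, "maize": 150},
    1981: {"wheat": 300, "maize": 200},
    1982: {"wheat": 350, "maize": 100},
}

percent = {}
for year, crops in production.items():
    total = sum(crops.values())              # height of the whole bar
    percent[year] = {crop: round(100 * q / total) for crop, q in crops.items()}

for year, shares in percent.items():
    print(year, shares)
```

Within each year the shares add up to 100, so every bar in the chart has the same height.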


4.9. PIE CHART

A pie chart is a circle divided in to various sectors with areas proportional to the value of the
component they represent. It shows the components in terms of percentages not in absolute
magnitude. The degree of the angle formed at the center has to be proportional to the values
represented.

Example 22: The monthly expenditure of a certain family is given below.

Items            Expenditure    % Proportion (Pfi)        Degrees (360° × Rfi)
Clothing         100            100/1000 × 100 = 10       100/1000 × 360° = 36°
Food             350            350/1000 × 100 = 35       350/1000 × 360° = 126°
House Rent       250            250/1000 × 100 = 25       250/1000 × 360° = 90°
Miscellaneous    300            300/1000 × 100 = 30       300/1000 × 360° = 108°
Total            1000           100%                      360°

Solution: The pie chart for the above expenditure is as follows.

[Figure: Pie chart with four sectors: Food (350), House rent (250), Clothing (100) and Misc. (300).]
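
The sector angles of a pie chart follow from the relative frequencies. The Python sketch below is an illustrative addition that recomputes the angles for the expenditure data of Example 22; the names `expenditure` and `angles` are arbitrary.

```python
# Monthly expenditure from Example 22
expenditure = {"Clothing": 100, "Food": 350, "House rent": 250, "Miscellaneous": 300}

total = sum(expenditure.values())

# Central angle of each sector: 360 degrees times the relative frequency
angles = {item: 360 * amount / total for item, amount in expenditure.items()}

for item, angle in angles.items():
    print(f"{item}: {angle} degrees")
```

The angles must add up to 360 degrees, which is a useful check before drawing the sectors.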

4.10. PICTOGRAPH (PICTOGRAM)

A pictograph is a graph that uses symbols or pictures to represent data.

Example 23: In comparing the population of a country from 1990 to 1992, we simply draw
pictures of people, where each picture may represent 1,000,000 people.

[Figure: Pictograph with one row of person symbols for each of the years 1992, 1991 and 1990. Key: one symbol = 1,000,000 people.]


4.11. SUMMARY
This unit discussed how to present organized data. Once a frequency distribution is
constructed, representing the data with graphs is a simple task. The most commonly
used graphs in research statistics are the histogram, the frequency polygon, and the ogive. Other
graphs and diagrams, like bar charts, pie charts, and pictograms, can also be used. Some of
these graphs are seen frequently in newspapers, magazines, and various statistical reports.

4.12. ANSWERS TO CHECK YOUR PROGRESS (CYP) QUESTIONS

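CYP 1
[Figure: Histogram for the distribution in CYP 1. x-axis: class boundaries (CBi) at 5, 10, 15, 20, 25, 30, 35; y-axis: frequency.]

CYP 2
[Figure: Frequency polygon for the distribution in CYP 1. x-axis: class marks (cmi) at 2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5; y-axis: frequency.]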


4.13. MODEL EXAMINATION QUESTION

Direction: Answer each of the following questions.

1. Determine whether each statement is true or false.


a. The ogive uses cumulative frequencies.
b. Histogram can be drawn by using vertical or horizontal bars.
c. In the construction of a frequency polygon, the class limits are used for the x-
axis.
d. Data collected over a period of time can be graphed by using a pie chart.
e. When the data is represented graphically by symbols or pictures, the graph is
called a frequency curve.
2. Construct a histogram, frequency polygon, and both ogives to represent the data shown
below.

Class Boundaries (CBi) Frequency fi


5.5-10.5 1
10.5-15.5 2
15.5-20.5 3
20.5-25.5 5
25.5-30.5 4
30.5-35.5 3
35.5-40.5 2


4.14. GLOSSARY

Histogram: Refers to a statistical graph which represents, by the height of a rectangular column,
the number of times that each class of result occurs in a sample or experiment.

Frequency Polygon: Refers to the graph obtained when the midpoints of the tops of the
rectangles in a histogram having equal class intervals are connected by line segments.

Frequency Curve: Refers to a smooth frequency polygon for data that can take a continuous set
of values.

Ogives: Are cumulative frequency curves.

Bar Chart: Refers to a graph made up of bars whose lengths are proportional to quantities in a
set of data.

Pie Chart: Refers to a diagram wherein proportions are shown as sectors of a circle.

Pictogram: Refers to a diagram that shows statistical data in a pictorial form.

4.15 REFERENCES

Refer to the list of books in Unit 1.
