
Data Collection + Data Presentation

Uploaded by

Orbora Merz B KE

DATA COLLECTION

Statistical data is collected from different sources and different methods are adopted to collect
adequate and reliable data. The data is collected in order to conduct some inquiries and to
analyse some problems.

STATISTICAL INQUIRIES
In order to collect data for a particular investigation it is important to keep in mind the
following:
i. OBJECT AND SCOPE OF INQUIRY
Every investigation has objectives to achieve. Therefore, the objective and scope must be
determined beforehand, i.e. what the objective of the inquiry is, and from where, from whom
and when the data should be collected.

ii. NATURE AND TYPE OF INQUIRY


There are several types of statistical inquiries. They include:
a. Primary/ Secondary inquiry
A primary inquiry is one in which the data is collected for the first time by the
investigator, while a secondary inquiry is one that uses data which has already been collected
by someone else and whose record is available in research papers, magazines and journals.

b. Census and sample inquiry


A census inquiry is one in which complete enumeration of each and every individual of the
population is made.
A sample inquiry is one in which some units are selected to represent the whole population.

c. Open or confidential inquiry


An open inquiry is one in which the results are not kept in secret.
A confidential inquiry is one in which the results are kept in secret and are not known to the
public.

d. Direct/ indirect inquiry


A direct inquiry is one in which the subject matter can be measured directly e.g. ages, wages
of workers etc.
An indirect inquiry is one in which direct measurement cannot be made and indirect methods
are applied e.g. intelligence of a class.

e. Regular/ adhoc inquiry


A regular inquiry is one in which the data is collected on a regular periodical basis over a
period of time e.g. census of population in Kenya is conducted after every 10 years. An
adhoc inquiry is one in which the data is collected once in a while.

f. Initial/ repetitive inquiry


An initial inquiry is the one being conducted for the first time while a repetitive inquiry is the
one which is a repetition of the previous inquiry.

g. Official/ Semi-official/ Non official inquiry


Official inquiry is one which is made by the government and therefore is conducted through
regulation or legislation.

Semi-official inquiry is one which is made by bodies which are supported by the government.
Non official inquiry is conducted by private bodies.

iii. STATISTICAL UNITS


They refer to units that are used in collection of data. They include:

a. Physical units
These are units which are used in day-to-day life e.g. kg, metres, litres, centimetres etc.

b. Arbitrary units
These are units adopted by statisticians for their own use in statistics e.g. salaries, wages,
workers, sales etc.

CLASSIFICATION OF DATA
Depending on the source, data can be classified as primary data or secondary data.

Primary Data
It is data collected for the first time, whether directly or indirectly. It is original in
character and shape e.g. census in Kenya. The following methods are used to collect primary data:

1. Questionnaire:
It is the most commonly used method in surveys. A questionnaire is a list of questions, either
open-ended or close-ended, to which the respondents give answers. A questionnaire can be
administered by telephone, by mail, in person in a public area or in an institution, by
electronic mail, by fax and by other methods.

Types of Questionnaires
a. Open ended or unstructured Questionnaire
These are questions that allow the respondents to express their views in a free-flowing
manner.
With such questions, respondents do not have to follow set criteria for answering and can
truly express their beliefs and suggestions.
An ideal questionnaire includes open-ended questions and also invites feedback and
suggestions for future improvements.

E.g. What, according to you, is the biggest challenge of academic libraries?
...............................................................................

Advantages of Open Ended Questions


i) Freedom and spontaneity of the answer
ii) Opportunity to probe
iii) Useful for testing hypotheses about ideas

Disadvantages
i. Time-consuming
ii. Coding: very costly and slow to process

b. Closed ended or structured Questionnaire

The respondent is restricted to expressing their opinions through the options set by the
surveyor. The respondent therefore selects one or more options from a pre-determined set of
responses.

Types of Closed ended or structured Questionnaire


i. Simple dichotomy
Closed ended question with only two response alternatives. E.g.
How do you rate the services of your library?
Good …. Bad……

ii. Multiple Choice


Closed ended question with more than two response alternatives. E.g.
How much satisfaction do you get from your job?
(a) A great deal (b) Quite a bit (c) A little (d) not at all

iii. Determinant choice


Multiple choice question in which respondent must select only one of the response
alternatives.
E.g. How much satisfaction do you get from your library job?
a) A great deal b) Very little c) Fairly well d) Not at all

iv. Checklist question


Multiple choice question in which the respondent can select more than one of the response
alternatives.
E.g. Please indicate the purpose of your visit to the library (you may tick (√) as many
responses as are applicable).
a) To consult text books b) To read comic books c) To consult journals
d) To access the Internet e) To read newspapers f) To listen to music

Advantages of Closed Ended Questions

i) No extended writing
ii) Easy to process
iii) Useful for testing specific hypotheses

Disadvantages:
i) Loss of spontaneous response
ii) Bias in answer categories
iii) May irritate respondents

Features of a good questionnaire


i) Short and clear: The questions should be short and clear so that they can be
understood easily by informants.
ii) Questions should be few in number: A large number of questions would irritate
the informants and they would hesitate to answer the questions
iii) Logical sequence: Questions should be put in some logical order. Their replies
should also be put in some order. This will enable the investigator to analyse the
replies easily and quickly.

iv) Non-confidential: Questions used should be non-confidential in nature because no
one would like to answer personal questions.
v) Relevant questions: The questions should be relevant to the problem under
investigation
vi) Definiteness: Questions should be framed in such a way that the answers to them
are perfectly definite i.e. in form of ‘yes’ or ‘no’

Advantages of using Questionnaire


i) It saves time
ii) They can be kept for future reference
iii) It can cover a wide area and therefore it is able to reach many people
iv) Answers can be carefully given since the duration of time given is enough

Disadvantages of using Questionnaire


i. One may give false information, especially when some questions are misinterpreted
ii. It is not suitable for blind respondents
iii. It is not suitable for illiterate respondents
iv. No room for correction once the questionnaires are submitted
v. No assurance of feedback after posting the questionnaires

2. Interview
An interview is a face-to-face conversation with the respondent. The main problem arises when
the respondent deliberately hides information; otherwise it is an in-depth source of
information. The interviewer can not only record what the interviewee says but can also
observe body language, expressions and other reactions to the questions. This enables the
interviewer to draw conclusions easily.

Advantages
i) Information collected by this method is reliable and accurate
ii) It is a good method for intensive investigation
iii) The interviewer can explain part of the questions not understood
by the respondents
iv) It is a quick method of obtaining information
v) It gives satisfactory results provided the scope of inquiry is narrow

Disadvantages
i) It requires a lot of expense and time
ii) Sometimes the respondent may not be willing to answer the questions
iii) The method is not suitable for extensive inquiry

3. Observation
Observation can be done with the observed person knowing that he or she is being observed, or
without their knowledge. Observations can be made in natural settings as well as in
artificially created environments.

Advantages
i) Data collected is highly reliable
ii) It is a cheap method
iii) It gives more relevant and accurate information

Disadvantages
i) The presence of an investigator may make the performer work in a different
manner
ii) The results may be different under different conditions

4. Sampling
In this method each unit of the population has an equal chance of being selected
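
The equal-chance idea can be illustrated with a short Python sketch. This is only an illustration of simple random sampling without replacement; the function name, the population of employee numbers 1 to 50 and the seed are assumptions made for the example and do not come from the notes.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every unit in `population` has the same chance of being selected.
    `seed` is optional and only makes the draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), sample_size)


# Hypothetical population: employee numbers 1 to 50
employees = range(1, 51)
sample = simple_random_sample(employees, 10, seed=1)
print(sample)
```

Each run with the same seed returns the same ten employee numbers; leaving the seed out gives a fresh draw every time.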

Advantages of Using Primary Data


1. The investigator collects data specific to the problem under study.
2. There is no doubt about the quality of the data collected (for the investigator).
3. If required, it may be possible to obtain additional data during the study period.

Disadvantages of Using Primary Data


1. The investigator has to contend with all the hassles of data collection e.g.
• deciding why, what, how and when to collect data;
• getting the data collected (personally or through others);
• getting funding and dealing with funding agencies;
• ethical considerations (consent, permissions, etc.).

2. Ensuring the data collected is of a high standard:
• all desired data is obtained accurately, and in the format it is required in;
• there is no fake/ cooked up data;
• unnecessary/ useless data has not been included.

3. Cost of obtaining the data is often the major expense in studies.

SECONDARY DATA
Data collected from a source that has already been published in any form is called secondary
data. The review of literature in any research is based on secondary data. It is collected by
someone else for some other purpose (but is utilized by the investigator for another
purpose). For example, census data may be used to analyse the impact of education on career
choice and earnings.
Common sources of secondary data for social science include censuses, organizational records
and data collected through qualitative methodologies or qualitative research.

Sources of Secondary Data:

The following are some ways of collecting secondary data:
• Books
• Records
• Biographies
• Newspapers
• Published censuses or other statistical data
• Data archives
• Internet articles
• Research articles by other researchers (journals)
• Databases, etc.

Importance of Secondary Data:
• Secondary data can be less valid, but it is still important.
• Sometimes it is difficult to obtain primary data; in these cases getting information
from secondary sources is easier and possible.
• Sometimes primary data does not exist; in such a situation one has to confine the
research to secondary data.
• Sometimes primary data exists but the respondents are not willing to reveal it; in such
a case too, secondary data can suffice. For example, if the research is on the
psychology of transsexuals, first it is difficult to find respondents and second they
may not be willing to give the information you want for your research, so you can collect
data from books or other published sources.
• A clear benefit of using secondary data is that much of the background work needed
has already been carried out. For example, literature reviews and case studies might have
been carried out, published texts and statistics could have been already used
elsewhere, and media promotion and personal contacts may also have been utilized. This
wealth of background work means that secondary data generally have a pre-established
degree of validity and reliability which need not be re-examined by the
researcher who is re-using such data.
• Furthermore, secondary data can also be helpful in the research design of subsequent
primary research and can provide a baseline with which the collected primary data
results can be compared. Therefore, it is always wise to begin any research activity
with a review of the secondary data.

Advantages of Using Secondary Data


i) No hassles of data collection.
ii) It is less expensive.
iii) It provides most or all of the information required.
iv) It helps to improve understanding of the problem at hand.
v) It provides some basis of comparison for data collected by the researcher.
vi) The investigator is not personally responsible for the quality of data (‘I didn’t do it’).

Disadvantages of Using Secondary Data


i) The accuracy of secondary data is not known.
ii) Data collected in one location may not be suitable for another location due to
differing environmental factors.
iii) With the passage of time the data becomes obsolete and very old.
iv) Secondary data collected can distort the results of the research.
v) Secondary data can also raise issues of authenticity and copyright.

Errors in statistics
A mistake in statistics refers to incorrect presentation or calculation due to human factors.
Mistakes may occur in the collection of data e.g. a respondent might mistakenly tick the
yes box instead of the no box. Mistakes can also occur when the collected data is being
plotted on a graph.

An error refers to the difference between the actual figure and the estimated figure. The
deviation is just by chance and not due to carelessness of human beings. Normally errors arise
due to approximation or rounding off of figures.

Types of errors
1. Sampling error
It refers to the difference between the actual value and the estimated value as obtained from
the sample. The amount of the sampling error will depend on the size of the sample.
Therefore the greater the sample size the smaller the size of sampling error and vice versa.

2. Non sampling error


Errors that arise for reasons other than sampling are known as non-sampling errors. They
may arise due to the following reasons:
• Failure to cover all the items
• Biased behaviour of the investigator
• Use of estimated values

3. Biased errors
These are errors which arise due to the bias on the part of the investigator, enumerator or the
instrument.

4. Unbiased errors
These are errors which arise by chance in the usual course of the investigation. These errors
are compensatory in nature since both negative and positive errors cancel each other and
mostly the estimated value is equal to the actual value.

5. Positive/ Negative errors


If the actual value is greater than the estimated value, the error is said to be positive while if
the actual value is less than the estimated value, the error is said to be negative.

Measurement of Errors
Errors are measured absolutely or relatively.

Absolute error
It is the difference between the actual value and the estimated value.

Ae = A – E

Where Ae = absolute error
A = actual value
E = estimated value

Example
The actual sales of an enterprise amounted to Ksh.987,500 but the estimated sales were
Ksh.1,000,000. Find the absolute error.

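Ae = A – E

= Ksh.987,500 – Ksh.1,000,000 = –Ksh.12,500, written as (Ksh.12,500)

The error is negative because the actual value is less than the estimated value.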

Relative error (Re)


It refers to the ratio between absolute error and actual value. It can be expressed as follows:

Re = Ae / A = Absolute error / Actual value

Using the above example calculate the relative error

Re = Ae / A = -12,500 / 987,500

= -0.013

NB: Relative error can be expressed as a percentage by multiplying with 100. When relative
error is expressed as percentage it is known as percentage error.

Percentage error = (A – E) / A × 100

Example
Assume that the population of a town was estimated as 1,424,880 whereas the actual
population was 1,578,620. Find the following:
a) Absolute error
b) Relative error
c) Percentage error

Solution
Absolute error (Ae) = A – E

= 1,578,620 – 1,424,880
= 153,740

Relative error (Re) = Ae / A = 153,740 / 1,578,620

= 0.0974 (approximately 0.1)

Percentage error = (A – E) / A × 100

= 0.0974 × 100 = 9.74% (approximately 10%)
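
The three measures can be checked with a few lines of Python. This is a minimal sketch of the formulas above; the function names are chosen for this illustration only.

```python
def absolute_error(actual, estimated):
    """Ae = A - E"""
    return actual - estimated


def relative_error(actual, estimated):
    """Re = Ae / A"""
    return absolute_error(actual, estimated) / actual


def percentage_error(actual, estimated):
    """Percentage error = (A - E) / A x 100"""
    return relative_error(actual, estimated) * 100


# Population example: actual 1,578,620 and estimated 1,424,880
actual, estimated = 1_578_620, 1_424_880
print(absolute_error(actual, estimated))               # 153740
print(round(relative_error(actual, estimated), 4))     # 0.0974
print(round(percentage_error(actual, estimated), 2))   # 9.74
```

Rounded to one decimal place, the relative error is 0.1 and the percentage error is about 10%.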

ORGANISATION OF DATA
Data organisation refers to classification and tabulation of data. The collected data is mostly
large in quantity and it is necessary to organise data in such a way that further analysis and
interpretation of data is made easily and correctly.

Classification of Data

It refers to arranging of data in groups or classes according to some resemblance of the data in
each group or class. In data classification the elements which possess the same characteristics
are grouped in one class and therefore the whole data is divided into a number of classes.

Advantages/ objectives of classification of data


i) To eliminate unnecessary details
ii) To bring out clearly points of similarity and dissimilarity
iii) To enable one to make comparison and draw inferences
iv) The required data can be located easily
v) The classified data takes less space as compared to narrative data.

Tabulation of data
It refers to systematic arrangement of the statistical data in columns and rows

Advantages/ objectives of tabulation of data


i) Unnecessary details are avoided
ii) Tabulated data can be understood easily compared to data given in narrative
form
iii) The comparison between different classes of data can be made easily
iv) The required data can be located easily
v) The tabulated data takes less space as compared to narrative data.

Principles/ Rules of table construction


1. Each table should have a title
2. The table should be self-explanatory and easy to understand
3. The size of the table should be suitable
4. Source of data must be stated
5. The headings to columns and rows should be clear
6. The sub totals for each separate class of data and a grand total for all the combined
classes must be given where appropriate.
7. The total of rows should be given in the extreme right column and the total of the
columns should be given at the foot.
8. If footnotes are necessary in a table, they should be short, clear and precise
9. Thick lines should be drawn to separate one class of data from the other and thin lines
to separate subdivision of classes.
10. The units of measurement should be clearly mentioned

Example
Out of the total number of 2,500 women who were interviewed for employment in a factory,
1,500 were married and the rest unmarried. Amongst the married women 900 were
experienced and the rest inexperienced while from the unmarried 300 were experienced.
Present the information in tabular form.

Solution
Job interview

             Experienced   Inexperienced   Total
Married          900            600        1,500
Unmarried        300            700        1,000
Total          1,200          1,300        2,500

Source: Human resource department
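
The missing cells in the table are found by subtraction from the totals given in the question. The Python sketch below shows that step; the function name and argument names are assumptions made for this illustration.

```python
def two_way_table(total, row1_total, row1_col1, row2_col1):
    """Fill in a 2 x 2 table (with totals) from the four given figures."""
    row2_total = total - row1_total
    row1_col2 = row1_total - row1_col1
    row2_col2 = row2_total - row2_col1
    return [
        [row1_col1, row1_col2, row1_total],
        [row2_col1, row2_col2, row2_total],
        [row1_col1 + row2_col1, row1_col2 + row2_col2, total],
    ]


# 2,500 women interviewed, 1,500 married; 900 married and 300 unmarried were experienced
table = two_way_table(total=2500, row1_total=1500, row1_col1=900, row2_col1=300)

labels = ["Married", "Unmarried", "Total"]
print(f"{'':<10}{'Experienced':>12}{'Inexperienced':>14}{'Total':>8}")
for label, row in zip(labels, table):
    print(f"{label:<10}{row[0]:>12,}{row[1]:>14,}{row[2]:>8,}")
```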

Example
The following report was prepared by an examination officer on the performance of Meru
Central district in a national examination. Out of 4,000 male candidates below 20 years of age,
3,000 passed and 1,000 failed. Of the 1,100 male candidates aged 20 years and over, 500 passed
and 600 failed. As regards the female candidates, out of 600 below 20 years of age, 400 passed
and 200 failed. Of the 350 females aged 20 years and over, 100 passed and 250 failed. Present
the information in a tabular form.

Solution

Meru Central District national examination results

Age group            Sex      Pass    Fail    Total
Below 20 years       Male     3,000   1,000   4,000
                     Female     400     200     600
20 years and over    Male       500     600   1,100
                     Female     100     250     350
Total                         4,000   2,050   6,050

Source: Examination officer

Question
On 1st January 2010, a company had 50 employees. Among them 18 were women. During the
year 9 employees left and 5 of those were men. The total of new employees in the year was 11
out of whom 4 were women. During the year 2011, 2 men left employment and 14 men and 4
women joined the work force. Present the information in a tabular form.

DISCRETE AND CONTINUOUS VARIABLE


A variable refers to a measurable quantity which varies from one value to another e.g. sales,
temperature, prices etc.
A discrete variable is a variable which can assume only whole-number values. These variables
only take exact values e.g. number of students in a college, number of books in a library,
number of wild animals in a national park etc.
A continuous variable is a variable which can assume any value in a specific range. These
variables change continuously e.g. height, weight, temperature, litres etc.

Statistical series
It refers to arrangement of statistical data in a systematic manner.

Types of statistical series
1. Spatial series
It refers to the data that is arranged in relation to geographical location. e.g.
Town profit (sh.’000’)
Nairobi 42,590
Mombasa 11,243
Meru 2,348

2. Time series
It is data arranged with respect to time e.g.
Year Sales (sh.million)
2011 324
2012 489
2013 1,128
2014 2,056

3. Condition series
It is data arranged with respect to a specific condition such as examination marks, height,
weight, expenditure etc e.g.

Department Exam mean mark


Agriculture 92.5
Business 68
ICT 72
Engineering 53

FREQUENCY DISTRIBUTION
It refers to grouping of statistical data according to size or magnitude. A frequency
distribution will consist of class intervals and their corresponding frequencies.

Construction of frequency distribution tables


Number of classes: It depends upon the number of items in the series. If there are few
items, there will be few classes, and vice versa.
As a rule of thumb, a frequency distribution should have no fewer than 6 to 8 classes and no
more than 20 to 25 classes.

Class interval: It determines how large a class is. In order to find the class interval we find
the range and divide it by the number of classes.
Class interval = Range / Number of classes

Range = Highest value – Lowest value


Class limit: These are the numerical values that define a specific class in a frequency
distribution table. E.g.
5 - 9
10 - 14
15 - 19

In each class the first value (5, 10, 15) is the lower limit and the second value (9, 14, 19)
is the upper limit.
Frequency in each class: the number of values falling in a particular class is called its
frequency. It is counted using strokes or tally sheets.

TYPES OF FREQUENCY DISTRIBUTION TABLES


1. Simple frequency distribution table (ungrouped)
It is a tabular layout of data into classes alongside their corresponding frequencies. It is
suitable for tabulation of discrete data e.g. number of cars, number of children etc

Example
The following data shows the number of children in the families of 32 employees:
5, 8, 3, 4, 2, 1, 4, 3, 3, 4, 1, 2, 7, 5, 6, 4, 5, 5, 4, 5, 8, 2, 1, 2, 2, 4, 3, 6, 0, 4, 7, 6
Construct a frequency distribution table from the data.

Class                  Tally column   Frequency
(Number of children)                  (Number of families)
0                      |              1
1                      |||            3
2                      |||||          5
3                      ||||           4
4                      ||||| ||       7
5                      |||||          5
6                      |||            3
7                      ||             2
8                      ||             2
Total                                 32
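
The same tally can be produced with Python's standard `collections.Counter`. This is a sketch for checking the table, not part of the original solution; the variable names are chosen for the illustration.

```python
from collections import Counter

# Number of children in the families of the 32 employees
children = [5, 8, 3, 4, 2, 1, 4, 3, 3, 4, 1, 2, 7, 5, 6, 4,
            5, 5, 4, 5, 8, 2, 1, 2, 2, 4, 3, 6, 0, 4, 7, 6]

# Counter counts how many times each value occurs
frequency = Counter(children)

print("Children  Tally      Families")
for value in sorted(frequency):
    count = frequency[value]
    print(f"{value:<10}{'|' * count:<11}{count}")
print(f"{'Total':<21}{sum(frequency.values())}")
```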

Question
The following data represents the number of refrigerators sold on 22 working days by a
leading company.
23, 30, 40, 23, 23, 28, 30, 30, 40, 40, 30, 30, 20, 20, 26, 28, 40, 26, 23, 20, 20, 20
Construct a frequency distribution table for the data.

2. Grouped frequency distribution table


It is a tabular layout of data into classes alongside their corresponding frequencies. In this case
the classes will be represented as a group of values. It is suitable for tabulation of continuous
data. The grouping can be done in two ways:

a. Inclusive form of grouping/ Inclusive method
This is whereby both the lower limit and upper limit are included while taking the items in a
group e.g. if the first class is 1 -9 then both 1 and 9 values will be included.

b. Exclusive form of grouping/ Exclusive method


In this method the lower limit is included but the upper limit is excluded while taking items in
a group e.g. in a class of 5 -10, 5 and all the values below 10 are included but 10 is excluded
while taking items in a group.

Example
The following data relates to marks of 30 students in a statistics test.
10, 36, 40, 30, 26, 20, 19, 10, 10, 16, 19, 27, 15, 26, 20, 19, 7, 44, 33, 21, 26, 27, 6, 20, 11, 37,
37, 30, 20, 5
Construct a frequency distribution table with 8 classes using exclusive method.

Solution
Number of classes = 8

Class interval/ size = Range / Number of classes

Range = Highest value – Lowest value

= 44 – 5 = 39

Class interval = 39/8 = 4.875, rounded up to 5
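
The grouping itself can be sketched in Python as follows. This is an illustration under stated assumptions: the first class is assumed to start at the lowest mark (5), and the variable names are chosen for this example. Under the exclusive method a mark equal to an upper limit is counted in the next class.

```python
import math

marks = [10, 36, 40, 30, 26, 20, 19, 10, 10, 16, 19, 27, 15, 26, 20,
         19, 7, 44, 33, 21, 26, 27, 6, 20, 11, 37, 37, 30, 20, 5]

number_of_classes = 8
value_range = max(marks) - min(marks)                # 44 - 5 = 39
width = math.ceil(value_range / number_of_classes)   # 4.875 rounded up to 5

# Assumption: the first class starts at the lowest mark
start = min(marks)
classes = [(start + i * width, start + (i + 1) * width)
           for i in range(number_of_classes)]

# Exclusive method: lower limit included, upper limit excluded
exclusive_table = [
    (lower, upper, sum(1 for m in marks if lower <= m < upper))
    for lower, upper in classes
]

for lower, upper, freq in exclusive_table:
    print(f"{lower} - {upper}: {freq}")
print("Total:", sum(freq for _, _, freq in exclusive_table))
```

With these assumptions the classes run from 5 - 10 to 40 - 45 and the frequencies add up to the 30 students.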

Example
The following data relate to marks of 60 applicants who were given a certain test for the
purpose of selection to a particular post.
41, 17, 83, 63, 58, 92, 60, 58, 70, 57, 67, 82, 33, 44, 51, 49, 34, 73, 54, 6, 36, 52, 32, 75, 60,
33, 9, 79, 28, 63, 42, 93, 43, 80, 3, 32, 57, 67, 84, 30, 63, 11, 35, 28, 10, 23, 8, 41, 60, 64, 72,
53, 92, 88, 62, 55, 60, 33, 40, 32
Construct a frequency distribution table using inclusive method. Take the first class as 0-9

Solution
Class     Tally column     Frequency
0-9       ||||             4
10-19     |||              3
20-29     |||              3
30-39     ||||| |||||      10
40-49     ||||| ||         7
50-59     ||||| ||||       9
60-69     ||||| ||||| |    11
70-79     |||||            5
80-89     |||||            5
90-99     |||              3

Total                      60
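
The inclusive grouping can be checked with a short Python sketch. It relies on the marks being whole numbers, so integer division by the class width of 10 gives the lower limit of each class; the variable names are chosen for this illustration.

```python
marks = [41, 17, 83, 63, 58, 92, 60, 58, 70, 57, 67, 82, 33, 44, 51,
         49, 34, 73, 54, 6, 36, 52, 32, 75, 60, 33, 9, 79, 28, 63,
         42, 93, 43, 80, 3, 32, 57, 67, 84, 30, 63, 11, 35, 28, 10,
         23, 8, 41, 60, 64, 72, 53, 92, 88, 62, 55, 60, 33, 40, 32]

width = 10  # classes 0-9, 10-19, ..., 90-99

# Count the marks falling in each class, keyed by the lower limit
counts = {}
for mark in marks:
    lower = (mark // width) * width
    counts[lower] = counts.get(lower, 0) + 1

# Inclusive method: both the lower and the upper limit belong to the class
inclusive_table = [
    (lower, lower + width - 1, counts.get(lower, 0))
    for lower in range(0, 100, width)
]

for lower, upper, freq in inclusive_table:
    print(f"{lower}-{upper}: {freq}")
print("Total:", sum(freq for _, _, freq in inclusive_table))
```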

Question
Group the following data taking the class interval of 5 in:
a. Exclusive form of grouping
b. Inclusive form of grouping

2, 4, 1, 3, 5, 7, 9, 2, 13, 15, 18, 11, 14, 10, 12, 16, 7, 6, 19, 22, 11, 23, 22, 24, 2, 5, 3, 4, 3, 2

Question
The following data shows the amount of unsecured personal loans in thousands of shillings
from a commercial bank.
700 450 725 1,125 625 1,650 750 400 1,050
500 750 850 1,250 725 475 925 1,050 925
850 625 900 1,750 700 825 550 925 850
475 750 550 725 575 575 1,450 700 450
700 1,650 925 500 675 1,300 1,125 775 850

Required:
i) Frequency distribution table of the data with Sh.200 thousand class interval
ii) The mean of the personal loans
iii) The standard deviation of the personal loans
iv) An ogive for the personal loans data.

DATA PRESENTATION IN DIAGRAMS


The main objective of statistics is to simplify the complexity of quantitative data and make it
easy to understand. Diagrams therefore help us to understand the information in an easy and
comprehensive form.

Importance/ advantages of diagrams


1. Provide an easy and attractive means of representing data.
2. Facilitate comparison
3. Save time and labour
4. Give an effective impression
5. Make values easier to memorise than mere figures.

Limitations/ Disadvantages
1. They do not give an accurate result, only a rough idea
2. A technically skilled person can construct a diagram, but an untrained person may not
construct it correctly.
3. Diagrams take more time to construct than tables
4. The method of data presentation is very expensive
5. Many people are not accustomed to diagrams and therefore they do not attach much
importance to them.
6. Comparison of diagrams may not be possible unless the units used are the same.

Rules for construction of diagrams


1. Diagrams should be neat and clean so that they have an attractive impression on the
reader’s mind.
2. Diagrams should always have a title. This enables the reader to create an idea about
the diagram before studying it.
3. The scale to be used should be suitable and mentioned on the right hand top or left
hand bottom.
4. Sometimes a key is necessary
5. It should have a source
6. All types of symbols used should be explained
7. The related data should be given near the diagram so that it can give a correct view.

Types of diagrams
1. Bar charts
2. Pie chart
3. Pictogram

1. Bar charts/ Bar graph


In bar chart, data is represented by a series of bars

Types of Bar Graphs

i. Simple bar chart


It is a chart that consists of rectangular bars which are not joined together. The height or
length of each bar indicates the size of the figure represented. The width of the bar is not
taken into account and it should be uniform for all the bars.

ii. Component/ Subdivided bar chart


In this type of bar chart, the total figure is built up from two or more component figures. It is
just like simple bar chart except that the bars are subdivided into component parts. It is of
two kinds:
a) Actual component bar chart
b) Percentage component bar chart

iii. Multiple bar chart

In this bar chart, the component figures are shown as separate bars adjoining each other.
The height of each bar represents the actual value of the component figure. They are suitable
when the totals of the components are not required.

Advantages of bar charts


a) Easy to construct
b) They can be used for comparison
c) They can be used to indicate the size of the component figures

Disadvantages
a) Not very informative
b) They are restricted to three or four components figures only.

Example
The national income statistics of a country for 3 years are given in the following table:

National Income (sh. Million)


Year Agriculture Industry Other sectors
2015 170 90 80
2016 190 120 90
2017 200 140 120

From the above information;


i. Draw a simple bar chart
ii. Draw a component bar chart (actual)
iii. Draw a component bar chart (percentage)
iv. Draw a multiple bar chart

Solution

National Income (sh. Million)


Year Agriculture Industry Other sectors Total
2015 170 90 80 340
2016 190 120 90 400
2017 200 140 120 460

i. Simple bar chart

[Figure: A simple bar chart showing National Income Statistics. Y axis: National Income (Sh. million), 0 to 500; X axis: Years (2015, 2016, 2017).]

Scale
Y axis: 1 cm represents sh. 50 million

ii. Component bar chart

National Income (sh. Million)


Year Agriculture Industry Other sectors Total
2015 170 90 80 340
2016 190 120 90 400
2017 200 140 120 460

[Figure: A component bar chart showing National Income Statistics. Y axis: National Income (Sh. million), 0 to 500; X axis: Years (2015, 2016, 2017). Each bar is subdivided into Agriculture, Industry and Other sectors.]

Scale
Y axis: 1 cm represents sh. 50 million

iii. Percentage component bar chart


Year   Agriculture             Industry                Other sectors           Total
2015   170/340 ×100 = 50       90/340 ×100 = 26.5      80/340 ×100 = 23.5      100
2016   190/400 ×100 = 47.5     120/400 ×100 = 30       90/400 ×100 = 22.5      100
2017   200/460 ×100 = 43.5     140/460 ×100 = 30.4     120/460 ×100 = 26.1     100
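
The percentage components can be recomputed with a short Python sketch. The dictionary layout and function name are assumptions made for this illustration; each figure is the component divided by the year's total, multiplied by 100 and rounded to one decimal place.

```python
national_income = {
    2015: {"Agriculture": 170, "Industry": 90, "Other sectors": 80},
    2016: {"Agriculture": 190, "Industry": 120, "Other sectors": 90},
    2017: {"Agriculture": 200, "Industry": 140, "Other sectors": 120},
}


def percentage_components(components):
    """Express each component as a percentage of the year's total."""
    total = sum(components.values())
    return {name: round(value / total * 100, 1)
            for name, value in components.items()}


percentages = {year: percentage_components(components)
               for year, components in national_income.items()}

for year, shares in percentages.items():
    print(year, shares)
```

Rounded to one decimal place, the 2017 shares are 43.5, 30.4 and 26.1, which add up to 100.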

[Figure: A percentage component bar chart showing National Income Statistics. Y axis: National Income Statistics (%), 0 to 120; X axis: Years (2015, 2016, 2017). Each bar is subdivided into Agriculture, Industry and Other sectors.]

Scale
Y axis: 1 cm represents 20%

iv. Multiple bar chart

National Income (sh. Million)


Year Agriculture Industry Other sectors
2015 170 90 80
2016 190 120 90
2017 200 140 120

[Figure: A multiple bar chart showing National Income Statistics. Y axis: National Income (Sh. million), 0 to 250; X axis: Years (2015, 2016, 2017). Separate bars for Agriculture, Industry and Other sectors in each year.]

Scale
Y axis: 1 cm represents sh. 50 million

Question
ABC Limited are manufacturers of biscuits, bread and cakes. Their sales for a period of four
years were as follows:

Sales (sh.’000’)
Year Biscuit Bread Cakes
2016 50 80 40
2017 60 100 50
2018 70 110 30
2019 90 120 50

From the above information;


i) Draw a simple bar chart
ii) Draw a component bar chart (actual)
iii) Draw a component bar chart (percentage)
iv) Draw a multiple bar chart

Question
The following data shows the number of different types of insurance policies issued in the
month of September 2019 by four insurance companies: Wyed Ltd., Xed Ltd., Yed Ltd., and
Zed Ltd.:

                        Insurance Company
Types of policy    Wyed Ltd.   Xed Ltd.   Yed Ltd.   Zed Ltd.
Life                   20          5         35         40
Accident              150        120        220        100
Fire                  200         80        180        150
Maritime                5          2          8          5
Burglary              120        100        250        200

Required:
Present the above data using a component bar chart

2. Pie chart
It is a circle divided by lines into sections so that the area of each section is proportional to the
size of the figure represented. To find the angle of each component we use the following
formula:

Angle of each component = (Component value / Total value of components) × 360º

When the angles of the various sectors are known, arrange them in ascending order of
magnitude and then draw the circle.

Advantages of pie chart


i. It provides an attractive form of data presentation
ii. It is not restricted to three or four component figures only

Disadvantages
Changes in the overall total cannot be shown by changing the size of the pie chart.

Example
Draw a pie chart from the following data
Country Production (units)
A 30,000
B 25,000
C 24,000
D 23,000
E 20,000
F 10,000

Solution
Country Production (units)
A 30,000
B 25,000
C 24,000
D 23,000
E 20,000
F 10,000
Total 132,000

Angle of each country = (Component value ÷ Total value of components) × 360º

A = (30,000 ÷ 132,000) × 360º = 82º
B = (25,000 ÷ 132,000) × 360º = 68º
C = (24,000 ÷ 132,000) × 360º = 65º
D = (23,000 ÷ 132,000) × 360º = 63º
E = (20,000 ÷ 132,000) × 360º = 55º
F = (10,000 ÷ 132,000) × 360º = 27º
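
The working above can be checked with a short Python sketch. The function name `pie_angles` is a hypothetical helper chosen for illustration; it applies the angle formula to the production figures and rounds each angle to the nearest degree.

```python
# Production (units) for countries A to F, from the example.
production = {"A": 30_000, "B": 25_000, "C": 24_000,
              "D": 23_000, "E": 20_000, "F": 10_000}


def pie_angles(values):
    """Angle of each component = component value / total value x 360 degrees."""
    total = sum(values.values())
    return {name: value / total * 360 for name, value in values.items()}


angles = pie_angles(production)
rounded = {name: round(angle) for name, angle in angles.items()}

# Ascending order of magnitude, the order suggested for drawing the sectors.
ascending = sorted(rounded.items(), key=lambda item: item[1])

if __name__ == "__main__":
    for name, angle in ascending:
        print(f"{name}: {angle} degrees")
```

For this data the rounded angles add up to exactly 360º. With other data, rounding may leave the total a degree or so off, in which case one angle has to be adjusted by hand.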

A Pie Chart showing Production in units of countries
[Pie chart: sectors A = 82º, B = 68º, C = 65º, D = 63º, E = 55º, F = 27º]

Question
The data below shows the number of households in a village using electricity over the past 5
years
Year Number of households
2010 6
2011 9
2012 12
2013 15
2014 18

i. Draw a simple bar chart
ii. Draw a pie chart

DATA PRESENTATION IN GRAPHS


A graph is a pictorial presentation of data using a continuous curve. The relationship
between two quantities can be shown with the help of a graph.
The data is plotted on a graph as a series of points, and these points are joined with a
line or a curve.

Characteristics of a graph
1. A graph must be neat and clean
2. The graph must have a clear title
3. It must not be overcrowded with curves
4. The scale chosen along the x axis and y axis must be suitable according to the given
data.
5. The graph must give the correct impression

Principles/ Rules for construction of graphs
1. The independent variable should always be taken along the x-axis and the dependent
variable along the y-axis
2. The vertical scale should always start at zero and if not possible a break should be
shown in the scale between zero and the next number.
3. The scale chosen must be one which can easily accommodate the whole data.
4. Against each value of independent variable given, there is a corresponding value of
dependent variable.
5. A graph must have a clear and comprehensive title
6. The source of data must be given
7. If more than one graph is plotted on the same Cartesian plane, then a different type of
line should be used for each curve, e.g. a dotted line, a solid line, etc.
8. The scale caption for x-axis is placed under the centre of the horizontal axis while the
scale caption for y-axis is placed at the top of y scale.

Types of Graphs

Graphs of frequency distribution such as:
1. Cumulative frequency (ogive) curve
2. Histogram
3. Frequency polygon
4. Frequency curve

1. CUMULATIVE FREQUENCY (OGIVE) CURVE

An ogive curve is the name given to the curve obtained when the cumulative frequency of a
distribution is graphed.
The following steps are followed when drawing an ogive curve:
1. Compute the cumulative frequency (CF) of the distribution
2. Prepare a graph with cumulative frequency on the y axis and class intervals on the x
axis
3. Plot a starting point at zero (0) on the y axis and the lower class limit of the first class
on the x axis.
4. Plot each cumulative frequency on the graph at the upper class limit of the class to
which it refers. This cumulative frequency is called the "less than" cumulative
frequency.
5. Join the points with a smooth curve.

NB: An ogive curve is used to find the values of median, quartiles, deciles and percentiles
graphically.

Example

The following distribution shows the daily wages of 100 employees.
Wages (sh.) Number of employees
0-30 20
30-60 35
60-90 30
90-120 15

Plot a cumulative frequency curve

Solution
Wages (sh.) Number of employees Cumulative frequency
0-30 20 20
30-60 35 55
60-90 30 85
90-120 15 100
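
The plotting points for this ogive can be derived with a short Python sketch that follows the steps listed earlier. The helper name `ogive_points` is an assumption made for illustration only.

```python
from itertools import accumulate

# Daily wages (sh.) as (lower limit, upper limit) and the number of employees.
classes = [(0, 30), (30, 60), (60, 90), (90, 120)]
frequencies = [20, 35, 30, 15]


def ogive_points(classes, frequencies):
    """Return (x, cumulative frequency) points for a 'less than' ogive."""
    cumulative = list(accumulate(frequencies))
    # Starting point: zero frequency at the lower limit of the first class.
    start = (classes[0][0], 0)
    # Each cumulative frequency is plotted against its upper class limit.
    return [start] + [(upper, cf) for (_, upper), cf in zip(classes, cumulative)]


points = ogive_points(classes, frequencies)

if __name__ == "__main__":
    print(points)  # [(0, 0), (30, 20), (60, 55), (90, 85), (120, 100)]
```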

A Cumulative Frequency Curve showing Wages of 100 Employees
[Graph: y axis Cumulative Frequency, 0 to 120; x axis Wages (Sh.), 30 to 120]

Scale
Y axis: 1 cm represents 20 CF
X axis: 1 cm represents sh. 30
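
Reading the median and quartiles off an ogive can be imitated numerically. The sketch below is an approximation under two assumptions that the notes do not state: the points are joined by straight lines rather than a smooth curve, and the median, lower quartile and upper quartile are read at cumulative frequencies N/2, N/4 and 3N/4. The function name `read_off` is hypothetical.

```python
def read_off(points, target_cf):
    """Return the x value at which the ogive reaches target_cf.

    Assumes the ogive points are joined by straight-line segments.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf1 > cf0 and cf0 <= target_cf <= cf1:
            return x0 + (target_cf - cf0) / (cf1 - cf0) * (x1 - x0)
    raise ValueError("target is outside the range of the ogive")


# Ogive points for the wage distribution of 100 employees.
points = [(0, 0), (30, 20), (60, 55), (90, 85), (120, 100)]
total = points[-1][1]

median = read_off(points, total / 2)
lower_quartile = read_off(points, total / 4)
upper_quartile = read_off(points, 3 * total / 4)

if __name__ == "__main__":
    print(round(lower_quartile, 1), round(median, 1), round(upper_quartile, 1))
```

Values read from a hand-drawn smooth curve will differ slightly from these straight-line estimates.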

Example

The table below shows the age distribution of employees of XYZ Limited
Age group Frequency
21-25 5
26-30 12
31-35 23
36-40 39
41-45 32
46-50 21
51-55 9
56-60 2
Draw an ogive curve using the above data.
Solution
Age group Frequency Class boundaries Cumulative Frequency
21-25 5 20.5-25.5 5
26-30 12 25.5-30.5 17
31-35 23 30.5-35.5 40
36-40 39 35.5-40.5 79
41-45 32 40.5-45.5 111
46-50 21 45.5-50.5 132
51-55 9 50.5-55.5 141
56-60 2 55.5-60.5 143
Total 143
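
The class boundaries and cumulative frequencies in this table can be reproduced with a short Python sketch. It assumes the usual rule that each boundary lies halfway across the gap between one class's upper limit and the next class's lower limit, and that this gap is the same for every class; the helper name `class_boundaries` is hypothetical.

```python
from itertools import accumulate

# Age groups as (lower limit, upper limit) and their frequencies.
limits = [(21, 25), (26, 30), (31, 35), (36, 40),
          (41, 45), (46, 50), (51, 55), (56, 60)]
frequencies = [5, 12, 23, 39, 32, 21, 9, 2]


def class_boundaries(limits):
    """Widen each class by half the gap between consecutive classes."""
    gap = limits[1][0] - limits[0][1]  # assumed equal for all classes
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


boundaries = class_boundaries(limits)
cumulative = list(accumulate(frequencies))

# Ogive points: start at the first lower boundary, then one point per upper boundary.
points = [(boundaries[0][0], 0)] + [
    (upper, cf) for (_, upper), cf in zip(boundaries, cumulative)
]

if __name__ == "__main__":
    for (lower, upper), cf in zip(boundaries, cumulative):
        print(f"{lower}-{upper}: {cf}")
```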

An Ogive Curve showing Age Distribution of Employees
[Graph: y axis Cumulative Frequency, 0 to 160; x axis Age Group, class boundaries 20.5 to 60.5]

Scale
Y axis: 1 cm represents 20 CF
X axis: 1 cm represents 5

Question
The table below shows the profit made by 160 companies in the manufacturing industry for
the year ended December 2014.

Profit (sh.million) Number of companies


0-10 2
10-20 7
20-30 21
30-40 25
40-50 30
50-60 35
60-70 28
70-80 12

Plot a cumulative frequency curve

PERCENTAGE CUMULATIVE FREQUENCY CURVE


In this curve the cumulative frequencies are shown as percentages.
The following steps are followed when drawing percentage cumulative frequency curve:
1. Express each class frequency as a percentage of the total frequency
2. Find the cumulative percentage frequencies
3. Take the percentage cumulative frequencies along the y axis and the class intervals
along the x axis
4. Mark each percentage cumulative frequency against its upper class limit.
5. Join the points with a smooth curve.
Example
The following is the age distribution of employees in ABC Limited
Age group (years) Number of employees
Less than 15 20
15-25 80
25-35 200
35-45 120
45-55 60
Over 55 20
Total 500
Draw a percentage cumulative frequency curve
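Solution
The open-ended classes "Less than 15" and "Over 55" are taken as 5-15 and 55-65, so that every class has the same width of 10 years.
Age group (years)    Number of employees (f)    %f                     %CF
5-15                 20                         20/500 × 100 = 4       4
15-25                80                         80/500 × 100 = 16      20
25-35                200                        200/500 × 100 = 40     60
35-45                120                        120/500 × 100 = 24     84
45-55                60                         60/500 × 100 = 12      96
55-65                20                         20/500 × 100 = 4       100
Total                500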
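
The %f and %CF columns can be checked with a short Python sketch. The function name `percentage_cumulative` is a hypothetical helper used only for illustration.

```python
from itertools import accumulate

# Number of employees in the classes 5-15, 15-25, ..., 55-65.
frequencies = [20, 80, 200, 120, 60, 20]


def percentage_cumulative(frequencies):
    """Return (percentage frequencies, cumulative percentage frequencies)."""
    total = sum(frequencies)
    percentages = [f * 100 / total for f in frequencies]
    return percentages, list(accumulate(percentages))


percentages, cumulative_percentages = percentage_cumulative(frequencies)

if __name__ == "__main__":
    for pct, cum in zip(percentages, cumulative_percentages):
        print(pct, cum)
```

The last cumulative percentage should always be 100, which is a quick check on the arithmetic.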

A Percentage Cumulative Frequency Curve showing Age Distribution of 500 Employees
[Graph: y axis % Cumulative Frequency, 0 to 120; x axis Age Group (Years), 5 to 65]

Scale
Y axis: 1 cm represents 20%
X axis: 1 cm represents 5 years

Question
Using the following data, draw a percentage cumulative frequency curve

Age group Number of employees


Less than 20 8
20-25 48
25-30 120
30-35 80
35-40 40
40-45 24

2. HISTOGRAM
It is a graph that represents the class frequencies in a frequency distribution by vertical
rectangles. It consists of a series of rectangles, each having a base measured along the x axis
that is proportional to the class interval and a height measured along the y axis that represents
the class frequency.

NB: A histogram is used to find the value of the mode graphically.

Example

Present the following data by means of a frequency histogram
Class Frequency
0-10 5
10-20 11
20-30 19
30-40 21
40-50 16
50-60 8
60-70 10
70-80 6
80-90 3
90-100 1

Solution

A Frequency Histogram
[Histogram: y axis Frequency, 0 to 25; x axis Class Interval, 0-10 to 90-100]

Scale
Y axis: 1 cm represents 2
X axis: 1 cm represents 10
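
The rectangles of this histogram can be described numerically with a short Python sketch. The helper name `histogram_rectangles` is hypothetical, and the sketch assumes equal class widths, so that the height of each rectangle is simply the class frequency. The tallest rectangle marks the modal class, which is where the mode is located graphically.

```python
# Classes as (lower limit, upper limit) and their frequencies, from the example.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]
frequencies = [5, 11, 19, 21, 16, 8, 10, 6, 3, 1]


def histogram_rectangles(classes, frequencies):
    """Return one (left edge, base width, height) triple per class."""
    return [(lower, upper - lower, f)
            for (lower, upper), f in zip(classes, frequencies)]


rectangles = histogram_rectangles(classes, frequencies)

# The modal class is the class with the tallest rectangle.
left, width, height = max(rectangles, key=lambda rect: rect[2])
modal_class = (left, left + width)

if __name__ == "__main__":
    print(modal_class, height)  # (30, 40) 21
```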

Question
The following table gives a frequency distribution of the mass of 40 objects
Mass Frequency
10-14 3
15-19 4
20-24 10
25-29 12
30-34 6
35-39 3
40-44 2

Draw a histogram

3. FREQUENCY POLYGON
It is a line graph drawn from a histogram by joining the midpoints of the tops of the class
interval rectangles. The points of the frequency polygon are joined with straight
lines.
The area under the frequency polygon equals the area of the histogram because the polygon
includes as much area from outside the histogram as it leaves out from inside, provided the
polygon is closed on the x axis at the midpoints of the empty classes on either side.
Example
Use the following data to draw a frequency polygon
Class interval Frequency
0-10 3
10-20 5
20-30 7
30-40 12
40-50 15
50-60 8
60-70 4

Solution
Class interval Frequency Midpoint
0-10 3 5
10-20 5 15
20-30 7 25
30-40 12 35
40-50 15 45
50-60 8 55
60-70 4 65
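
The midpoint column can be reproduced with a short Python sketch. The function name `polygon_points` is a hypothetical helper; each point it returns is a (midpoint, frequency) pair to be joined with straight lines.

```python
# Class intervals as (lower limit, upper limit) and their frequencies.
classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70)]
frequencies = [3, 5, 7, 12, 15, 8, 4]


def polygon_points(classes, frequencies):
    """Return (class midpoint, frequency) points for a frequency polygon."""
    return [((lower + upper) / 2, f)
            for (lower, upper), f in zip(classes, frequencies)]


points = polygon_points(classes, frequencies)

if __name__ == "__main__":
    print(points)
```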

Direct construction

A Frequency Polygon showing Class Interval and Frequency
[Graph: y axis Frequency, 0 to 16; x axis Midpoints, 5 to 65]

Scale
Y axis: 1 cm represents 2
X axis: 1 cm represents 5

Histogram and frequency polygon

Class interval Frequency Midpoint


0-10 3 5
10-20 5 15
20-30 7 25
30-40 12 35
40-50 15 45
50-60 8 55
60-70 4 65

A Histogram and Frequency Polygon showing Class Interval and Frequency
[Graph: y axis Frequency, 0 to 16; x axis Class Interval, midpoints 5 to 65]

Scale
Y axis: 1 cm represents 2
X axis: 1 cm represents 5
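
The statement that the frequency polygon gives the area of the histogram can be checked numerically for this data. The sketch below rests on assumptions of its own: the classes have equal width, and the polygon is closed on the x axis at the midpoints of one empty class on either side (here -5 and 75). Both function names are hypothetical.

```python
classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70)]
frequencies = [3, 5, 7, 12, 15, 8, 4]


def histogram_area(classes, frequencies):
    """Sum of base x height over all rectangles."""
    return sum((upper - lower) * f
               for (lower, upper), f in zip(classes, frequencies))


def closed_polygon_area(classes, frequencies):
    """Area under the frequency polygon after closing it on the x axis."""
    width = classes[0][1] - classes[0][0]  # assumes equal class widths
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
    ys = [0] + list(frequencies) + [0]
    points = list(zip(xs, ys))
    # Add up the trapezium under each straight-line segment.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    print(histogram_area(classes, frequencies))       # 540
    print(closed_polygon_area(classes, frequencies))  # 540.0
```

Both areas come to 540 square units, so the triangles the polygon cuts off inside the rectangles are balanced by the triangles it adds outside them.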

Question
Use the following data to draw a histogram and superimpose a frequency polygon
Class interval Frequency
0-10 3
10-20 5
20-30 7
30-40 12
40-50 15
50-60 8
60-70 4

4. FREQUENCY CURVE
It has the same structure as a frequency polygon, except that the midpoints are joined with a
smooth curve that rounds off the top instead of with straight lines.

Example
Construct a frequency histogram for the data below and superimpose a frequency curve on the
same graph
Daily profit (sh. '000) Frequency
0-50 12
50-100 18
100-150 27
150-200 20
200-250 17
250-300 6

A Frequency Histogram and Frequency Curve showing Daily Profit
[Graph: y axis Frequency, 0 to 30; x axis Daily Profit (Sh. '000), 0-50 to 250-300]
