02 Data Collection & Classification
Some Definitions
• What is data?
Data are collections of any number of
related observations.
• What is a data set?
A collection of data is called a data set.
• What is the singular of data?
Datum. A single observation is a datum.
Sources of Data
• Primary Sources
A primary source is one that itself
collects the data. Such data are
original or first-hand data.
• Secondary Sources
A secondary source is one that makes
available data which were collected
by some other agency.
Primary Sources
• Direct Personal Interviews
• Indirect Oral Interviews
• Information from Correspondents
• Mailed Questionnaire Method
• Schedules sent through Enumerators.
Secondary Sources
• Published Sources
  • Reports and publications of
    • International Bodies, e.g. World Bank, WTO, ILO, UNO etc.
    • Central and State Governments
    • Ad-hoc Committees and Commissions etc.
  • Semi-official Publications of Various Local Bodies.
  • Publications of autonomous and private Institutes etc.
• Unpublished Sources
  • Records maintained by Government and Private Bodies.
Classification of Data
• Geographical Classification
• Chronological Classification
• Qualitative Classification
• Quantitative Classification
Geographical Classification
In this type of classification, data are
classified by the geographical location
of the various items, e.g. states, cities,
regions etc.
Country       Human development index (Year 2002)
Norway        0.956
Australia     0.946
China         0.745
Sri Lanka     0.740
Indonesia     0.692
India         0.595
Bangladesh    0.509
Pakistan      0.497
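As a small illustration, geographically classified data can be held as a mapping from location to value. The sketch below uses the HDI figures from the table above; the names `hdi_2002` and `ranked` are hypothetical and chosen only for this example.

```python
# Geographical classification: each value is keyed by its location.
# Figures are the Human Development Index values (Year 2002) from the table.
hdi_2002 = {
    "Norway": 0.956,
    "Australia": 0.946,
    "China": 0.745,
    "Sri Lanka": 0.740,
    "Indonesia": 0.692,
    "India": 0.595,
    "Bangladesh": 0.509,
    "Pakistan": 0.497,
}

# Order the locations from highest to lowest index value.
ranked = sorted(hdi_2002.items(), key=lambda item: item[1], reverse=True)

for country, hdi in ranked:
    print(f"{country:<12}{hdi:.3f}")
```

Keying by country keeps the geographical unit as the basis of classification; any ordering (alphabetical or by value) is a presentation choice.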
Chronological Classification
In this type of classification, data are
arranged according to the time at which
they were observed.
Males Females
1 0 2 3 4 5 6
7 2 3 4 0 2 5
8 4 5 12 6 3 2
7 6 5 3 3 7 8
9 7 9 4 5 4 3
Elements of Continuous Frequency
Distribution
• Class Limits
Class limits are the lowest and highest values that
can be included in the class.
• Class Intervals
The difference between the upper and lower limits of
a class is known as the class interval of that class.
i = U - L
Where, i = class interval, U = upper limit, L = lower limit
• Class Frequency
The number of observations corresponding to a
particular class is known as the frequency of that
class or the class frequency.
• Class Mid-Point
It is the value lying half-way between the lower and
upper class limits of a class-interval.
M = (U + L) / 2
Where, M = Mid-Point, U = upper class limit, L = lower class limit
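The elements defined above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names, the class 10-20, and the observations are all hypothetical, and the frequency count assumes that a value equal to the upper limit is not counted in the class.

```python
def class_interval(lower, upper):
    """Class interval i = U - L."""
    return upper - lower


def class_mid_point(lower, upper):
    """Mid-point M = (U + L) / 2."""
    return (upper + lower) / 2


def class_frequency(observations, lower, upper):
    """Number of observations in the class.

    Assumption: a value equal to the upper limit is NOT counted,
    i.e. the class covers lower <= x < upper.
    """
    return sum(1 for x in observations if lower <= x < upper)


# Hypothetical observations and a hypothetical class 10-20.
observations = [12, 15, 18, 20, 25, 27, 30]
lower, upper = 10, 20

interval = class_interval(lower, upper)                    # 10
mid_point = class_mid_point(lower, upper)                  # 15.0
frequency = class_frequency(observations, lower, upper)    # 3 (12, 15, 18)

print(interval, mid_point, frequency)
```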
Methods of Classification of Data
• Exclusive Method
In this method the upper limit of one class is the
lower limit of the next class, and an observation
equal to the upper limit is excluded from that class
and counted in the next one.
• Inclusive Method
In this method an observation equal to the upper
limit is included in that class.
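The difference between the two methods can be sketched as follows. The class limits (0-10, 10-20, 20-30 versus 0-9, 10-19, 20-29), the observations, and the function names are hypothetical and serve only to show how a value equal to an upper limit is treated.

```python
from collections import Counter

# Hypothetical class limits as (lower, upper) pairs.
EXCLUSIVE_CLASSES = [(0, 10), (10, 20), (20, 30)]
INCLUSIVE_CLASSES = [(0, 9), (10, 19), (20, 29)]


def exclusive_class(x, classes):
    """Exclusive method: a value equal to the upper limit is excluded."""
    for lower, upper in classes:
        if lower <= x < upper:
            return (lower, upper)
    return None  # value falls outside every class


def inclusive_class(x, classes):
    """Inclusive method: a value equal to the upper limit is included."""
    for lower, upper in classes:
        if lower <= x <= upper:
            return (lower, upper)
    return None  # value falls outside every class


# Hypothetical whole-number observations.
observations = [5, 9, 10, 19, 20, 29]

exclusive_counts = Counter(exclusive_class(x, EXCLUSIVE_CLASSES) for x in observations)
inclusive_counts = Counter(inclusive_class(x, INCLUSIVE_CLASSES) for x in observations)

# 10 equals the upper limit of 0-10, so the exclusive method puts it in 10-20.
print(exclusive_class(10, EXCLUSIVE_CLASSES))
# 19 equals the upper limit of 10-19, so the inclusive method keeps it there.
print(inclusive_class(19, INCLUSIVE_CLASSES))
```

With whole-number data both schemes give the same class frequencies; the exclusive form also leaves no gap between classes, so it can hold fractional values such as 9.5.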
Example:
Exclusive Method Inclusive Method