Data Collection & Processing
4/9/2023 1
Objectives
At the end of this session the student will be able
to:
Types of Data based on Data source
Primary Data
Data that are collected fresh, for the
first time, and are therefore original
in character.
They are generally more reliable and accurate than secondary data.
High response rates might be
obtained
It permits explanation of questions
Types of Data based on Data
source …
Secondary data
Data that have already been collected by
someone else for another purpose.
They are less expensive to obtain
Must be used with great care
Example
Patient chart review
Use of EDHS Data
Types of data based on scale of
measurement
1. Qualitative data
Qualitative data is categorical
measurement expressed not in terms of
numbers, but rather by means of a
natural language description.
Nominal e.g. gender, race, religion
Ordinal e.g. Size (small, medium,
large)
Types of data based on scale of
measurement...
2. Quantitative data
Quantitative data is a numerical
measurement expressed in terms of
numbers.
It can be continuous or discrete.
e.g. age, weight, height, temperature…
Methods of Data collection
The choice of methods of data collection
is largely based on the accuracy/precision
of the information they yield:
Correspondence between the
information and objective reality
The extent to which the method will
provide a precise measure of the
variable the investigator wishes to
study
Methods of Data collection
Selection of a data collection method is also based on
practical considerations, such as:
The need for personnel, skills, equipment, etc.
The acceptability of the procedures to the subjects
The probability that the method will provide good coverage
The investigator’s familiarity with a study procedure,
which may be a valid consideration
In general selection of data collection method
in relation to variables type and objectives is
based on:
Data collection techniques
Using available Sources
Published
Unpublished Records like
Empirical
Observation
Interview method
Administering printed questionnaires
Focus group discussions (FGD)
1. Using available sources
Disadvantages of Available documentary
sources
Problems of reliability and validity
Data are collected by a number of different
persons, who may use different definitions or
methods of obtaining data.
There is a possibility that errors may occur
The records are not maintained for research
purposes
The information required may not be
recorded at all, or only partly recorded.
Empirical Sources
Observation
Interview method
Administering printed questionnaires
Focus group discussions (FGDs)
Observation
It is a technique that involves
systematically selecting, watching and
recording behavior and characteristics.
2. Interviewing Method
Involves oral questioning of respondents,
either individually or as a group
Answers can be recorded by:
Writing down
Tape-recording
Combination of them
Interviews can be conducted with varying
degrees of flexibility (high vs. low degree
of flexibility)
A) High degree of flexibility
When the researcher has little understanding of the
problem
Is frequently applied in exploratory studies
When studying sensitive issues (e.g. teenage
pregnancy) the investigator may use a list of topics
rather than fixed questions
The sequence of topics should be determined by the
flow of discussion
It is often possible to come back to a topic discussed
earlier in a later stage of the interview
Interviewing cont'd…
B) Low degree of flexibility
Useful when:
Researcher is relatively knowledgeable about
expected answers or
When the number of respondents being
interviewed is relatively large
Questionnaires may be used with a fixed
list of questions in a standard sequence,
which have mainly fixed answers
Interviewing cont'd…
Interviews can be conducted through:
• Face-to-face interview
• Telephone interview
• Computer-based interview, email, or other online methods
In a face-to-face interview, the interviewer and the
respondent come together and the interviewer
directly asks the designed questions.
Face to face interview
ADVANTAGES
• Good response rate
• The data is complete and immediate
• In-depth questions are possible
• Interviewer is in control and can give help if there is a problem
DISADVANTAGES
• Time consuming
• Need to set up interviews
• Geographic limitations
• Can be expensive
Telephone interview
This method of collecting information
consists of contacting respondents by
telephone.
Telephone interview cont'd…
ADVANTAGES DISADVANTAGES
ADMINISTERING WRITTEN
QUESTIONNAIRES
4. Focus group discussions (FGDs)
FGDs allow a group of 8-12 informants to freely
discuss a certain subject with the guidance of a
facilitator or reporter
1. DEFECTIVE INSTRUMENTS
Questionnaires with:
fixed or closed questions on unknown topics
leading questions.
2. OBSERVER BIAS
Observer Bias Can Be Minimized By
Tips to minimize such mistakes
1. Drafting the content:-
• This involves outlining the questions and variables
based on the set objectives.
2. Formulating questions:-
• Formulate one or more questions that will provide
the information needed for each variable.
• The questions should be specific and precise
enough so that different respondents do not
interpret the question differently.
• Each question has to measure only one thing at a
time and leading questions must be avoided.
3. Sequencing of questions:-
• Place questions about background information or
variables at the beginning of the interview (such as age,
religion, education, marital status, occupation).
• Place sensitive questions towards the end, e.g. questions
regarding sexual behavior or diseases with stigma
attached to them.
• Use simple, everyday language.
• Make the questionnaire as short as possible, or conduct
the interview in two parts if it takes more than an hour.
4. Format the questionnaire:-
• Check that each question has adequate space for
responses. Include the name of the informant only when
it is necessary. The name of the interviewer should
preferably be included for quality control.
• Related questions should appear together.
• Place boxes for pre-categorized answers in a consistent
manner.
6. Translation: Translate into the local language and
retranslate into the original language.
7. Pre-testing
8. Conducting the actual interview
Five stages:
• preparation
• introduction
• uneven conversation
• ending
• post interview
STAGES IN DATA COLLECTION
a) Logistics of data collection
HOW LONG will it take to collect the data for
each component of the study?
Step 1: Consider:
• The time required to reach the study
area(s);
• The time required to locate the study
units (persons, groups, records), especially
if you have to search for specific informants
(e.g., users or defaulters of a specific
service);
• The number of visits required per study
unit.
Step 2:Calculate the number of interviews that can be
the interviews.
WHEN should the data be collected?
The type of data to be collected and the demands
of the project will determine the actual time
needed for the data to be collected. Consideration
should be given to:
Availability of team members & assistants,
The appropriate season(s) to conduct the field
work
Accessibility and availability of the sampled
population, and
Public holidays and vacation periods.
b) Ensuring quality
It is extremely important that the data we collect are of good
quality, that is, reliable and valid.
DATA PROCESSING
PROCESSING OPERATIONS
Data editing can be of two types:
a. Field Editing
b. Central editing
Field editing
Central editing
2. Coding: Coding refers to the process of assigning
numerals or other symbols to answers so that
responses can be put into a limited number of
categories or classes.
E.g. instead of using “Male” and “Female” for the
variable sex, it can be indicated as:
1= Male, 2= Female
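The coding step above can be sketched in a few lines of Python. This is only an illustration: the names SEX_CODES, MISSING and code_sex are hypothetical, and the use of 9 for blank or unrecognised answers is an assumption, not something the slides specify.

```python
# Code map following the slide's example: 1 = Male, 2 = Female.
SEX_CODES = {"Male": 1, "Female": 2}
MISSING = 9  # assumed code for blank or unrecognised answers


def code_sex(answer):
    """Return the numeric code for a raw answer, or MISSING if it is not in the map."""
    cleaned = answer.strip().capitalize()  # tolerate stray spaces and letter case
    return SEX_CODES.get(cleaned, MISSING)


raw_answers = ["Male", "female", " FEMALE ", "", "male"]
coded = [code_sex(a) for a in raw_answers]
print(coded)
```

Giving unexpected answers their own code, rather than dropping them, keeps every response in some category.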
• They must possess the characteristic of exhaustiveness
(i.e., there must be a class for every data item) and
Data can be pre-coded, post-coded, and recoded
Code book:
A document that describes the location of variables and
lists the code assignments for the values of each variable.
It provides:
• a guide used in the coding process
• the location of the variables
• lists of the code assignments for the values of each variable
• a means of decoding back to the original values when reporting.
Sample Code book
Question/statement | Column location | Variable name | Codes
Respondent ID No. | | respid | 1-100
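A code book can also be kept in machine-readable form so that coded values can be checked and decoded when reporting. The sketch below is a minimal illustration, not a standard format: the dictionary layout, the column numbers, the "sex" entry and the decode helper are all assumptions. Only the variable name respid and its range 1-100 come from the sample above.

```python
# Hypothetical code book: one entry per variable.
CODEBOOK = {
    "respid": {
        "question": "Respondent ID No.",
        "column": 1,                 # assumed column location
        "valid": range(1, 101),      # IDs 1-100, as in the sample code book
    },
    "sex": {
        "question": "Sex of respondent",
        "column": 2,                 # assumed column location
        "codes": {1: "Male", 2: "Female"},
    },
}


def decode(variable, value):
    """Translate a coded value back to its meaning, or flag it as invalid."""
    entry = CODEBOOK[variable]
    if "codes" in entry:
        return entry["codes"].get(value, "Invalid code")
    return value if value in entry["valid"] else "Invalid code"


print(decode("sex", 2))
print(decode("respid", 101))
```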
3. Classification:
Classification according to attributes:
Classification according to class-intervals:
• Unlike descriptive characteristics, numerical
characteristics refer to quantitative phenomena
that can be measured in statistical units.
• Data relating to income, production, age, weight, etc.
come under this category.
• For instance, persons whose incomes, say, are within
Birr 201 to Birr 400 can form one group, those whose
incomes are within Birr 401 to Birr 600 can form
another group and so on
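The grouping into class intervals described above can be sketched as follows. The function name income_class and the sample incomes are hypothetical, and the classes beyond Birr 401-600 (601-800 and 801-1000) are an assumed continuation of the "and so on" in the example. Incomes are assumed to be whole Birr.

```python
from collections import Counter


def income_class(income, lower=201, width=200, n_classes=4):
    """Return the class-interval label for a whole-Birr income."""
    for k in range(n_classes):
        low = lower + k * width      # 201, 401, 601, 801
        high = low + width - 1       # 400, 600, 800, 1000
        if low <= income <= high:
            return f"Birr {low}-{high}"
    return "Outside listed classes"


incomes = [250, 400, 401, 599, 950, 150]   # made-up sample data
labels = [income_class(x) for x in incomes]
frequency = Counter(labels)

for label, count in frequency.items():
    print(label, count)
```

Because each class starts one Birr above the upper limit of the previous one, every whole-Birr income falls into exactly one class.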
4. Tabulation:
• Tabulation is an orderly arrangement of data in columns
and rows.
• When a mass of data has been assembled, it becomes
necessary for the researcher to arrange the same in some
kind of concise and logical order.
• Tabulation is the process of summarizing raw data and
displaying the same in compact form (i.e., in the form of
statistical tables) for further analysis.
Tabulation is essential for the following reasons:
1. It conserves space and reduces explanatory and descriptive
statements to a minimum.
2. It facilitates the process of comparison.
3. It facilitates the summation of items and the detection of
errors and omissions.
4. It provides a basis for various statistical computations.
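As a small illustration of tabulation, the sketch below arranges raw records into rows and columns with totals. The records, the variable names and the build_table helper are hypothetical; a statistical package would normally do this step.

```python
from collections import Counter

# Made-up coded records: (sex, age group).
records = [
    ("Male", "15-24"), ("Female", "15-24"), ("Female", "25-34"),
    ("Male", "25-34"), ("Female", "25-34"), ("Female", "15-24"),
]

counts = Counter(records)
rows = sorted({sex for sex, _ in records})
cols = sorted({age for _, age in records})


def build_table(counts, rows, cols):
    """Arrange counts in rows and columns, adding row and column totals."""
    table = []
    for r in rows:
        cells = [counts[(r, c)] for c in cols]
        table.append([r] + cells + [sum(cells)])
    col_totals = [sum(counts[(r, c)] for r in rows) for c in cols]
    table.append(["Total"] + col_totals + [sum(col_totals)])
    return table


table = build_table(counts, rows, cols)
header = ["Sex"] + cols + ["Total"]
for line in [header] + table:
    print("".join(f"{str(cell):<10}" for cell in line))
```

The row and column totals make comparison easy and help detect omissions: the grand total should equal the number of records.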