Data Collection & Processing

This document discusses data collection and processing. It begins by outlining the objectives of describing various data collection techniques, identifying different data sources and types of bias, and describing data processing. It then defines data and data collection. The document describes primary and secondary data sources, as well as qualitative and quantitative data scales. Various data collection techniques are outlined, including using available sources, observation, interviews, questionnaires, and focus groups. Advantages and disadvantages of these methods are also discussed.

Uploaded by

Muluken Assefa
Data collection and processing

4/9/2023 1
Objectives
At the end of this session the student will be able to:
• Describe data and various data collection techniques
• Identify different sources of data
• Identify various sources of bias in data collection and ways of preventing bias
• Describe data processing

Definitions
Data
A collection of information, facts and evidence from which you can draw conclusions.
Data collection
Data collection is the stage in the research process when information is gathered through surveys, experiments, fieldwork, or indirect methods to generate data. (Kultar Singh, 2007)

Types of Data based on Data source

Primary Data
• Data that are collected fresh and for the first time, and are thus original in character.
• Generally more reliable and accurate than secondary data.
• High response rates might be obtained
• Permits explanation of questions

Types of Data based on Data source (cont.)
Secondary data
• Data that have already been collected by someone else for another purpose.
• Are less expensive to collect
• Must be used with great care
Examples
• Patient chart review
• Use of EDHS data

Types of data based on scale of measurement
1. Qualitative data
• Qualitative data are categorical measurements expressed not in terms of numbers, but by means of a natural language description.
• Nominal, e.g. gender, race, religion
• Ordinal, e.g. size (small, medium, large)

Types of data based on scale of measurement (cont.)
2. Quantitative data
• Quantitative data are numerical measurements expressed in terms of numbers.
• They can be continuous or discrete.
• e.g. age, weight, height, temperature…

Methods of Data collection
The choice of methods of data collection is largely based on the accuracy/precision of the information they yield:
• Correspondence between the information and objective reality
• The extent to which the method will provide a precise measure of the variable the investigator wishes to study

Methods of Data collection (cont.)
• Selection of a data collection method is also based on practical considerations, such as:
  • The need for personnel, skills, equipment, etc.
  • The acceptability of the procedures to the subjects
  • The probability that the method will provide good coverage
  • The investigator’s familiarity with a study procedure, which may be a valid consideration

In general, selection of a data collection method in relation to variable types and objectives is based on:
• Resources required
• Acceptability of the method
• Coverage of the method
• Familiarity with the procedure
• Accuracy
• Relevance
• Timeliness
• Cost-effectiveness

Data collection techniques
Using available sources
• Published documents
• Unpublished records
Empirical
• Observation
• Interview method
• Administering printed questionnaires
• Focus group discussions (FGD)

1. Using available sources
These sources can be either published documents or unpublished records.
Published
• Demographic data
• Mortality statistics
• Census publications, etc.

1. Using available sources (cont.)
Unpublished records, such as:
• Reports on mortality and morbidity rates
• Epidemic reports
• Reports of laboratory utilization and test results
• Social surveys, such as:
  • hospital admissions,
  • disease registers, and
  • serologic surveys

Advantages vs disadvantages of available sources
Advantages of available documentary sources
• Documents can provide ready-made information
• They are relatively easy to obtain
• They are the best means of studying past events

Disadvantages of available documentary sources
• Problems of reliability and validity
• Data are collected by a number of different persons, who may use different definitions or methods of obtaining data
• There is a possibility that errors may occur
• The records are not maintained for research purposes
• The information required may not be recorded at all, or only partly recorded

Empirical Sources
• Observation
• Interview method
• Administering printed questionnaires
• Focus group discussions (FGDs)

Observation
• It is a technique that involves systematically selecting, watching and recording behavior and characteristics.
• It can be used when data collected through other means are of limited value or are difficult to validate.

Observation can be categorized as:
Participant observation
• The observer observes by making himself, more or less, a member of the group
• He observes so that he can experience what the members of the group experience
Non-participant observation
• The observer observes while separating himself from the group
• He makes no attempt to experience through participation what others feel

Advantages of the observational method
• Respondents’ subjective bias is eliminated
• The information obtained relates to what is currently happening; it is not complicated by past behavior or by future intentions or attitudes
• It is independent of respondents’ willingness to respond and as such is relatively less demanding of active cooperation on the part of respondents

Disadvantages of the observational method
• It is an expensive method.
• Sometimes unexpected factors may interfere with the observational task.
• Some people are rarely accessible to direct observation, which creates an obstacle to collecting data effectively with this method.

2. Interviewing Method
• Involves oral questioning of respondents, either individually or as a group
• Answers can be recorded by:
  • Writing them down
  • Tape-recording
  • A combination of both
• Interviews can be conducted with varying degrees of flexibility (high vs low degree of flexibility)

A) High degree of flexibility
• Used when the researcher has little understanding of the problem
• Is frequently applied in exploratory studies
• When studying sensitive issues (e.g. teenage pregnancy) the investigator may use a list of topics rather than fixed questions
• The sequence of topics should be determined by the flow of discussion
• It is often possible to come back to a topic discussed earlier at a later stage of the interview

Interviewing (cont.)
B) Low degree of flexibility
• Useful when:
  • The researcher is relatively knowledgeable about expected answers, or
  • The number of respondents being interviewed is relatively large
• Questionnaires may be used with a fixed list of questions in a standard sequence, which have mainly fixed answers

Interviewing (cont.)
• Interviews can be done through:
  • Face-to-face interview
  • Telephone interview
  • Computer-based interview, email, or other online methods
• In a face-to-face interview, the interviewer and the respondent come together and the interviewer directly asks the designed questions.

Face-to-face interview
Advantages
• Good response rate
• The data are complete and immediate
• In-depth questions are possible
• Interviewer is in control and can give help if there is a problem
Disadvantages
• Time consuming
• Need to set up interviews
• Geographic limitations
• Can be expensive

Telephone interview
• This method of collecting information consists of contacting respondents by telephone.
• It is not a very widely used method, but plays an important part in industrial surveys, particularly in developed regions.

Telephone interview (cont.)
Advantages
• Faster than other methods
• Can cover a large number of people or organizations
• Recall is easy
• High response rate (the non-response rate is low)
• Can tape answers
Disadvantages
• Not everyone has a telephone
• Respondent has little time to think
• Cannot use visual aids
• Requires telephone access
• Good telephone manner is required

ADMINISTERING WRITTEN QUESTIONNAIRES
• Self-administered written questions are provided to the respondent to be answered in written form. This can be done by:
  • Sending questionnaires through different channels, such as postal mail or online channels
  • Gathering all or part of the respondents in one place at one time, giving oral or written instructions, and letting them fill out the questionnaires
  • Hand-delivering questionnaires to respondents and collecting them later

WRITTEN QUESTIONNAIRES (cont.)
A questionnaire may contain:
• Open- vs closed-ended questions
• Two-way questions (yes or no)
• Multiple-choice questions
• Ranking scales

4. Focus group discussions (FGDs)
• FGDs allow a group of 8-12 informants to freely discuss a certain subject with the guidance of a facilitator or reporter
• They can be used to focus research and develop relevant research hypotheses by exploring in greater depth the problem to be investigated and its possible causes

Advantages of FGDs
• Quick results and cost-effective
• Groups may generate important issues
• Ideas on how to proceed with the study may be generated
• People usually feel comfortable
Disadvantages of FGDs
• The topic of discussion may be missed
• The discussion may be manipulated by the moderator
• Needs well-trained professionals

Advantages and disadvantages
Technique: Observing
Advantages:
• Gives more detailed and context-related information
• Permits collection of information on facts not mentioned in an interview
• Permits tests of reliability of responses to questionnaires
Disadvantages (constraints):
• Ethical issues concerning confidentiality or privacy may arise
• Observer bias may occur
• The presence of the data collector can influence the situation observed
• Thorough training of research assistants is required
Technique: Interviewing
Advantages:
• Is suitable for use with both literates and illiterates
• Permits clarification of questions
• Has a higher response rate than written questionnaires
Disadvantages (constraints):
• Presence of the interviewer can influence responses
• Reports of events may be less complete than information gained through observations

Advantages and disadvantages (cont.)
Technique: Flexible interview
Advantages:
• Permits collection of in-depth information and exploration of spontaneous remarks by respondents
Disadvantages (constraints):
• Interviewer may inadvertently influence the respondents
• Analysis of open-ended data is more difficult and time-consuming
Technique: Fixed interview
Advantages:
• Is easy to analyze
Disadvantages (constraints):
• Important information may be missed because spontaneous remarks by respondents are usually not recorded or explored
Technique: Administering written questionnaires
Advantages:
• Is less expensive
• Permits anonymity and may result in more honest responses
• Does not require research assistants
• Eliminates bias due to phrasing questions differently
Disadvantages (constraints):
• Cannot be used with illiterates
• There is often a low rate of response
• Questions may be misunderstood

Bias in data collection
• Bias is distortion that results from inaccuracies in the measurement of subject characteristics and from incorrect classification.
• Sources of bias:
  1. Defective instruments
  2. Observer bias
  3. Effect of the interview on the informant
  4. Informational bias

1. DEFECTIVE INSTRUMENTS
• Questionnaires with:
  • fixed or closed questions on unknown topics
  • open-ended questions without guidelines
  • vaguely phrased questions
  • leading questions
  • questions placed in an illogical order
• Weighing scales or other measuring equipment that are not standardized

How to avoid defective instrument bias
These sources of bias can be prevented by carefully planning the data collection process and by pre-testing the data collection tools.

2. OBSERVER BIAS
• Observer bias can easily occur when conducting observations or using loosely structured group or individual interviews.
• There is a risk that the data collector will only see or hear things in which (s)he is interested or will miss information that is critical to the research.

Observer bias can be minimized by
• Preparing observation protocols and guidelines for conducting loosely structured interviews
• Providing practice to data collectors
• Having data collectors work in pairs when using flexible research techniques, and discuss and interpret the data immediately after collecting it
• Using a tape recorder and transcribing the tape word by word

Informational bias
• Sometimes bias arises from weaknesses in the information itself.
  e.g. Medical records may have many blanks or be unreadable. This tells something about the quality of the data and has to be recorded.
• Gaps in people’s memory lead to what is called memory or recall bias.
  e.g. A mother may not remember all details of her child’s last diarrhea episode and of the treatment she gave two or three months before.
• For such common diseases it is advisable to limit the period of recall.

EFFECT OF THE INTERVIEW ON THE INFORMANT
• This is a possible factor in all interview situations.
• The informant may mistrust the intention of the interview and dodge certain questions or give misleading answers.
  e.g. In a survey on smoking you ask school children: ‘Does your father sometimes smoke cigarettes?’

Tips to minimize such bias
Such bias can be reduced by:
• Introducing the purpose of the study to informants
• Phrasing questions on sensitive issues in a positive way
• Taking sufficient time for the interview
• Assuring informants that the data collected will be confidential
• Being careful in the selection of interviewers

Steps in designing a questionnaire

1. Drafting the content:
• This involves outlining questions and variables based on the set objectives.
2. Formulating questions:
• Formulate one or more questions that will provide the information needed for each variable.
• The questions should be specific and precise enough that different respondents do not interpret the question differently.
• Each question has to measure only one thing at a time, and leading questions must be avoided.

3. Sequencing of questions:
• Place questions about background information or variables at the beginning of the interview (such as age, religion, education, marital status, occupation).
• Place sensitive questions towards the end, e.g. questions regarding income, sexual behavior, or diseases with stigma attached to them.
• Use simple, everyday language.
• Make the questionnaire as short as possible, or conduct the interview in two parts if it takes more than an hour.

4. Formatting the questionnaire:
• Check that each question has adequate space for responses. Include the name of the informant only when it is necessary. The name of the interviewer should preferably be included for quality control.
• Related questions should appear together.
• Place boxes for pre-categorized answers in a consistent manner.

5. Guideline for interview

6. Translation: Translate into the local language and translate back into the original language.

7. Pre-testing
8. Conducting the actual interview, in five stages:
• preparation
• introduction
• uneven conversation
• ending
• post interview

STAGES IN DATA COLLECTION

The process of data collection may involve the following stages:
Stage 1. Permission to proceed
Stage 2. Data collection
  a) Logistics of data collection
  b) Ensuring quality
Stage 3. Data handling

Stage 1: Permission to proceed
• Consent must be obtained from the relevant authorities, individuals and the community in which the project is to be carried out.
• For clinical studies this may also involve obtaining written informed consent.
Stage 2: Data collection
When collecting our data, we have to consider:
• Logistics: who will collect what, when and with what resources
• Quality control

a) Logistics of data collection

WHO will collect WHAT data?
• When allocating tasks for data collection, it is recommended that you first list them.
• Then you may identify who could best implement each of the tasks.
• If it is clear beforehand that your research team will not be able to carry out the entire study by itself, you might plan to look for research assistants.

HOW LONG will it take to collect the data for each component of the study?

Step 1: Consider:
• The time required to reach the study area(s);
• The time required to locate the study units (persons, groups, records), especially if you have to search for specific informants (e.g., users or defaulters of a specific service);
• The number of visits required per study unit.

Step 2: Calculate the number of interviews that can be carried out per person per day.

Step 3: Calculate the number of days needed to carry out the interviews.
e.g.
• you need to do 200 interviews,
• your research team of 5 people can do 5 x 4 = 20 interviews per day,
• you will need 200 / 20 = 10 days for the interviews.

Step 4: Calculate the time needed for the other parts of the study.

Step 5: Determine how much time you can devote to the study. If the team has fewer days for fieldwork than required, they would need additional research assistants to help complete this part of the study.

It is always advisable to slightly overestimate the period needed for data collection to allow for unforeseen delays.
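
The arithmetic in Steps 2 and 3 can be sketched as a small Python function. This is only an illustration: the function name `fieldwork_days` and the `buffer_percent` parameter (an allowance for unforeseen delays) are assumptions made for this example, not part of the original method.

```python
import math

def fieldwork_days(total_interviews, team_size, per_person_per_day, buffer_percent=0):
    """Return the number of fieldwork days, rounded up to whole days."""
    # Step 2, scaled to the whole team: interviews the team completes per day.
    daily_capacity = team_size * per_person_per_day
    if daily_capacity <= 0:
        raise ValueError("daily capacity must be positive")
    # Step 3: days needed for the interviews alone.
    base_days = math.ceil(total_interviews / daily_capacity)
    # Hypothetical allowance for unforeseen delays, given in percent.
    return math.ceil(base_days * (100 + buffer_percent) / 100)

print(fieldwork_days(200, 5, 4))                     # worked example: 10
print(fieldwork_days(200, 5, 4, buffer_percent=20))  # with a 20% allowance: 12
```

Rounding up matters because a partly used day still has to be scheduled; for instance, 201 interviews at 20 per day would need 11 days, not 10.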

WHEN should the data be collected?
The type of data to be collected and the demands of the project will determine the actual time needed for the data to be collected. Consideration should be given to:
• Availability of team members and assistants
• The appropriate season(s) to conduct the field work
• Accessibility and availability of the sampled population
• Public holidays and vacation periods

b) Ensuring quality
It is extremely important that the data we collect are of good quality, that is, reliable and valid. If they are not, the study will come up with false or misleading conclusions.

Measures taken to ensure good quality of data
• Prepare a field work manual for the research team as a whole
• Select your research assistants with care (if required)
• Train research assistants carefully in all topics
• Pre-test research instruments and research procedures
• Arrange for ongoing supervision

Stage 3: DATA HANDLING
• Once the data have been collected and checked for completeness and accuracy, a clear procedure should be developed for handling and storing them.
• Decide if the questionnaires are to be numbered.
• Identify the person who will be responsible for storing the data, and decide how they are going to be stored.

DATA PROCESSING

Data Processing
• The data, after collection, have to be processed and analyzed in accordance with the outline laid down for the purpose at the time of developing the research plan (Kothari, 2004).
• Technically speaking, processing implies the following operations on collected data, so that they are amenable to analysis:
  • Editing
  • Coding
  • Entry
  • Cleaning
  • Classification
  • Tabulation

PROCESSING OPERATIONS

1. Editing: Editing of data is a process of examining the collected raw data to detect errors and omissions and to correct these when possible.
• Editing is done to assure that the data are accurate, consistent with other facts gathered, uniformly entered, as complete as possible, and well arranged to facilitate coding and tabulation.
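
As a rough illustration of detecting errors and omissions, the Python sketch below flags missing answers and out-of-range values in a single record. The function name `edit_check`, the field names and the age range 15-99 are hypothetical choices made for this example.

```python
def edit_check(record, required_fields, valid_ranges):
    """Return a list of problems (omissions and out-of-range values) in one record."""
    problems = []
    # Omissions: required fields that are absent or left blank.
    for field in required_fields:
        if record.get(field) in (None, ""):
            problems.append(f"{field}: missing")
    # Errors: numeric values outside the plausible range for the field.
    for field, (low, high) in valid_ranges.items():
        value = record.get(field)
        if value not in (None, "") and not (low <= value <= high):
            problems.append(f"{field}: {value} outside {low}-{high}")
    return problems

record = {"respid": 7, "age": 150, "sex": ""}   # made-up record
print(edit_check(record, ["respid", "age", "sex"], {"age": (15, 99)}))
# ['sex: missing', 'age: 150 outside 15-99']
```

A check like this only detects problems; whether and how to correct them still has to be decided by the editor, ideally by going back to the original form.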

Data editing can be of two types:
a. Field editing
b. Central editing

Field editing
• Consists of the review of the reporting forms by the investigator, completing (translating or rewriting) what has been written in abbreviated and/or illegible form at the time of recording the respondents’ responses.

Central editing
• Should take place when all forms or schedules have been completed and returned to the office. This type of editing implies that all forms should get a thorough editing by a single editor in a small study and by a team of editors in the case of a large inquiry.

2. Coding: Coding refers to the process of assigning
numerals or other symbols to answers so that
responses can be put into a limited number of
categories or classes.
E.g. instead of using “Male” and “Female” for the
variable sex, it can be indicated as:
1= Male, 2= Female
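
A minimal Python sketch of this coding step is shown below. The names `SEX_CODES` and `code_responses` are made up for illustration, and the choice to return `None` for an answer with no code is an assumption.

```python
# Code assignments taken from the example above: 1 = Male, 2 = Female.
SEX_CODES = {"Male": 1, "Female": 2}

def code_responses(answers, code_map):
    """Replace each text answer with its numeric code; unknown answers become None."""
    # Strip spaces and normalize capitalization so "female" and " Female " match.
    return [code_map.get(answer.strip().title()) for answer in answers]

print(code_responses(["Male", "female", " Female ", "Male"], SEX_CODES))
# [1, 2, 2, 1]
```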

• Coding classes must possess the characteristic of exhaustiveness (i.e., there must be a class for every data item), and
• Mutual exclusivity, which means that a specific answer can be placed in one and only one cell in a given category set.
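
Both properties can be checked mechanically for numeric classes. The Python sketch below is one possible way to do it; the function names, the example class limits and the upper age limit of 120 are assumptions made for illustration.

```python
def matching_classes(value, classes):
    """Return the codes of every class whose (low, high) interval contains value."""
    return [code for code, (low, high) in classes.items() if low <= value <= high]

def check_category_set(values, classes):
    """Split values into those no class covers and those more than one class covers."""
    uncovered = [v for v in values if len(matching_classes(v, classes)) == 0]
    overlapping = [v for v in values if len(matching_classes(v, classes)) > 1]
    return uncovered, overlapping

good = {1: (15, 24), 2: (25, 34), 3: (35, 120)}
bad = {1: (15, 25), 2: (25, 34)}   # 25 falls in two classes, 35 and above in none
ages = [15, 25, 34, 40]
print(check_category_set(ages, good))  # ([], [])
print(check_category_set(ages, bad))   # ([40], [25])
```

An empty first list means the set was exhaustive for the values tested, and an empty second list means it was mutually exclusive for them.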

Data can be pre-coded, post-coded and recoded
• Pre-coding: coding categories and column locations appear in the questionnaire itself.
• Post-coding: coding is done after respondents have answered the questions, mainly for open-ended questions.
• Recoding: adding, combining or removing earlier codes after data have been collected.
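
As an illustration of recoding by combining categories, the Python sketch below merges two marital status codes into one. It assumes the codes 1 = Married, 2 = Single, 3 = Divorced, 4 = Widowed; the merged category and the names `RECODE_MARITAL` and `recode` are hypothetical.

```python
# Old code -> new code. Codes 3 (Divorced) and 4 (Widowed) are combined,
# so that 3 now means "divorced or widowed" (a hypothetical merged category).
RECODE_MARITAL = {1: 1, 2: 2, 3: 3, 4: 3}

def recode(values, recode_map):
    """Apply a recoding map to a list of already coded values."""
    return [recode_map[v] for v in values]

print(recode([1, 4, 3, 2, 4], RECODE_MARITAL))  # [1, 3, 3, 2, 3]
```

Any recoding of this kind should also be recorded in the code book, otherwise the new codes cannot be decoded correctly later.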

Code book
A code book is a document that describes the location of variables and lists the code assignments of the values of each variable.
It provides:
• a guide used in the coding process
• the location of the variables
• lists of the codes
• the assignments of codes to the values of each variable
• a means of decoding back to the original values when reporting

Sample code book
Question/statement    Column location    Variable name     Codes
Respondent ID no.                        respid            1-100
1. Age (in years)                        age               1. 15-24; 2. 25-34; 3. >34
2. Sex                                   sex               1. Male; 2. Female
3. Marital status                        Marital status    1. Married; 2. Single; 3. Divorced; 4. Widowed
4. Religion                              Religion          1. Orthodox; 2. Muslim; 3. Protestant; 4. Catholic; 5. Others
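
A code book like this can also be kept in machine-readable form so that coded records can be decoded when reporting. The Python sketch below is one possible representation; the dictionary layout, the function name `decode` and the variable names `marital_status` and `religion` (adapted spellings of the names in the table) are assumptions for illustration.

```python
# Code assignments copied from the sample code book above.
CODE_BOOK = {
    "age": {1: "15-24", 2: "25-34", 3: ">34"},
    "sex": {1: "Male", 2: "Female"},
    "marital_status": {1: "Married", 2: "Single", 3: "Divorced", 4: "Widowed"},
    "religion": {1: "Orthodox", 2: "Muslim", 3: "Protestant", 4: "Catholic", 5: "Others"},
}

def decode(record, code_book):
    """Translate coded values back to labels; fields without codes are left unchanged."""
    return {
        field: code_book.get(field, {}).get(value, value)
        for field, value in record.items()
    }

coded = {"respid": 12, "age": 2, "sex": 2, "marital_status": 1, "religion": 5}  # made-up record
print(decode(coded, CODE_BOOK))
# {'respid': 12, 'age': '25-34', 'sex': 'Female', 'marital_status': 'Married', 'religion': 'Others'}
```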

3. Classification:
• Classification of data is the process of arranging data in groups or classes on the basis of common characteristics.
• Classification can be one of the following two types:
  a) Classification according to attributes
  b) Classification according to class intervals

Classification according to attributes:
• Data are classified on the basis of common characteristics, especially descriptive ones (such as literacy, sex, honesty, marital status, etc.)
• Example: marital status can be classified as single, married, divorced and widowed.

Classification according to class intervals:
• Unlike descriptive characteristics, numerical characteristics refer to quantitative phenomena which can be measured through some statistical units.
• Data relating to income, production, age, weight, etc. come under this category.
• For instance, persons whose incomes are within Birr 201 to Birr 400 can form one group, those whose incomes are within Birr 401 to Birr 600 can form another group, and so on.
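
The income example can be sketched in Python as follows. The function name `income_class` is made up, and the sketch assumes incomes in whole Birr and classes of equal width (200) starting at Birr 201, as in the example.

```python
def income_class(income, lower=201, width=200):
    """Return the (low, high) class interval containing income, or None if below lower."""
    if income < lower:
        return None
    # Number of whole class widths between the first lower limit and the income.
    index = (income - lower) // width
    low = lower + index * width
    return (low, low + width - 1)

print(income_class(250))  # (201, 400)
print(income_class(401))  # (401, 600)
print(income_class(600))  # (401, 600)
```

Because each class ends one Birr before the next begins, every whole-Birr income of 201 or more falls into exactly one class.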

4. Tabulation:
• Tabulation is an orderly arrangement of data in columns
and rows.
• When a mass of data has been assembled, it becomes
necessary for the researcher to arrange the same in some
kind of concise and logical order.
• Tabulation is the process of summarizing raw data and
displaying the same in compact form (i.e., in the form of
statistical tables) for further analysis.

Tabulation is essential because of the following reasons.
1. It conserves space and reduces explanatory and descriptive
statement to a minimum.
2. It facilitates the process of comparison.
3. It facilitates the summation of items and the detection of
errors and omissions.
4. It provides a basis for various statistical computations.
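
A simple frequency table shows how tabulation summarizes raw data in compact form. The Python sketch below uses only the standard library; the function name `frequency_table` and the marital status values are made up for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (category, count, percent) followed by a total row."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = [(cat, n, round(100 * n / total, 1)) for cat, n in sorted(counts.items())]
    rows.append(("Total", total, 100.0))
    return rows

# Hypothetical raw answers to a marital status question.
marital = ["Married", "Single", "Married", "Widowed", "Single", "Married"]
for category, count, percent in frequency_table(marital):
    print(f"{category:<10}{count:>6}{percent:>8}")
```

The total row supports the third reason listed above: if the total does not match the number of questionnaires, an error or omission has occurred somewhere.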

References
• Kothari, Research Methodology: Methods and Techniques, second edition
• John G., Research Method for Post Graduate, 2nd edition
• Getu Degu & Tegbar Yigzaw, Research Methodology lecture note for Health Science Students in Ethiopia
• WHO Regional Publications, Eastern Mediterranean Series 30, A Practical Guide for Health Researchers
