Data Processing: by Mrs P. K. Arunga 15. 06. 2021

This document discusses the process of organizing and preparing raw data for analysis. It describes data organization, processing, validation, editing, coding, tabulation, and analysis. The key steps are validating the data, editing it to correct errors and inconsistencies, coding the data by assigning numeric values to variables and responses, and tabulating the data into a table to facilitate analysis. The overall goal of data processing is to organize the raw data into a usable form for examining relationships and drawing inferences.

DATA PROCESSING

BY MRS P. K. ARUNGA
15. 06. 2021
Data Organization
Refers to orderliness in the research
data, i.e. putting the data into some
systematic form.
The raw data collected needs to be
processed before it can be subjected
to any useful analysis.
Data Organization
Organization includes identifying
and correcting errors in the data,
coding the data and storing it in
an appropriate form.
Data analysis refers to examining
the coded data critically and making
inferences.
Data processing
This is a procedure for:
• Validating,
• Editing,
• Coding and
• Tabulating raw data from the instruments
It will enable you to manage the collected data.
Data Processing
You will then be able to decide on the
best way of analyzing the data to
answer your research questions well.
Also, it will be easier for you to separate
the data you need from the data that
you do not need.
Five steps in data processing
• Data Validation: Is the data relevant?
• Data Editing: Are there irregularities in the data?
• Data Coding: What will each category of data represent? (Variables)
• Data Tabulation: What types of data represent each variable?
• Data Analysis
Data validation
• Ascertain whether or not the data you
collected is right for your study.
• Attempt to determine if the
questionnaires, interviews and
observations were conducted correctly
and were free from error.
Data validation
Establish the following:
• Was the data collection process
falsified?
• Was the data collected from the right
respondents?
• Was the data collection process carried
out in the proper procedural setting?
Data Validation
How accurately was the process
completed?
Was the process done in a courteous
manner?
If the answers to most of these questions
are favourable (for example, the process was
not falsified and the right respondents were
reached), it means that you managed to
secure good data for your study.
Data editing
To edit means “to remove chaff from wheat
after harvesting” or “to remove weeds if you
want your plants to grow well”.
What is data editing?
It is a procedure for manual scanning and
cleaning of data to reduce inconsistencies in
the questionnaires or interview responses.
Editing
Sometimes you may wonder what to do with so
many of the questionnaires you have received
from respondents. Also, the interview schedules
you completed or filled after fieldwork could be
bulky. It is a tedious task to begin analyzing your
data straight from the questionnaires or
interview schedules. You could end up getting
disorganized or confused.
Editing
The following are items in selected questionnaires,
which were returned by respondents to a
researcher after data collection. Examine each
question and identify any irregularities about the
responses given.
Question 10:
To what extent do you feel satisfied with your
current working conditions?
Editing
Respondent 1 (two responses ticked)
COMPLETELY SATISFIED | SATISFIED | INDIFFERENT | DISSATISFIED | COMPLETELY DISSATISFIED
5 | 4 | 3 | 2 | 1

Respondent 2 (no response ticked)
COMPLETELY SATISFIED | SATISFIED | INDIFFERENT | DISSATISFIED | COMPLETELY DISSATISFIED
5 | 4 | 3 | 2 | 1
Editing
Question 12:
How long have you worked in your current
teaching position before getting promoted
to the next job group?

Respondent I: Promotions are very rare.
Respondent II: Six years
Editing
In question 10, respondent I provided
double responses where only one was
required. Respondent II, in contrast, did
not give any response.
In question 12, respondent I gave an
irrelevant answer, while respondent II
provided a relevant answer.
Editing
How would the responses in the above
Activity affect your data analysis results?
Some responses could be misleading and can
negatively affect the accuracy of your data
analysis results. It is advisable, therefore,
to exclude such data from your analysis, unless
you are sure of how to correct them.
The importance of data editing

Data editing will assist you in ensuring that:
• Proper questions were asked.
• The respondent recorded proper answers.
• Proper screening questions were
employed, to prompt respondents to
clarify their answers.
• Open-ended responses were recorded
accurately.
Editing
Editing checks on and corrects the following:
• Wrong entries/responses
• Errors in response
• Omissions in response
• Possible outliers (extreme values)
• Any other inconsistency
Editing
Wrong entries occur when
a respondent records an answer in a
space meant for a different question.
This can easily be corrected by
checking which question/item the
answer given belongs to. The
mismatch can then be corrected.
Editing
Errors in responses:
Are made when numerical answers are
either exaggerated or understated. For
example, a respondent may record work
experience as 400 years when the correct
value is 40 years. Most of the errors can be
corrected through common sense.
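The "common sense" check described above can be sketched as a simple range check. This is only an illustration: the function name `flag_implausible` and the limits of 0 and 50 years are assumptions made for this example, not values from the lecture, and a flagged value still needs your own judgement before it is corrected.

```python
# Hypothetical plausibility limits for "years of work experience".
# Choose limits that suit your own study.
MIN_YEARS = 0
MAX_YEARS = 50


def flag_implausible(values, low=MIN_YEARS, high=MAX_YEARS):
    """Return (questionnaire number, value) pairs that fall outside [low, high]."""
    return [
        (number, value)
        for number, value in enumerate(values, start=1)
        if not low <= value <= high
    ]


# Illustrative responses; the second one repeats the 400-year example.
experience = [12, 400, 7, 25]
print(flag_implausible(experience))
```

The check only points you to the questionnaire to re-examine; it does not decide what the correct value is.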
Editing
Omissions in responses affect the accuracy of statistical
analysis. E.g., if the total number of respondents for a study
was 200 and only 150 of them responded to a question/item,
this could affect results.

Suppose the question required a YES or NO response, and 50
respondents indicated YES while 100 indicated NO. The
percentage for YES may be computed erroneously as (50/200
x 100) instead of (50/150 x 100). The number of omissions
should, therefore, be checked to avoid inaccurate data
analysis results.
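The YES/NO arithmetic above can be checked with a short sketch. The function name `percentage` is chosen for this illustration only; the figures are the ones given in the example.

```python
def percentage(count, base):
    """Percentage of `count` out of `base`, rounded to one decimal place."""
    return round(count * 100 / base, 1)


total_respondents = 200      # questionnaires returned
yes, no = 50, 100            # answers actually given to the item
answered = yes + no          # 150 respondents answered the item
omitted = total_respondents - answered

# Erroneous: omissions are left in the base.
pct_yes_all = percentage(yes, total_respondents)
# Correct: only those who answered form the base.
pct_yes_answered = percentage(yes, answered)

print(f"Omissions: {omitted}")
print(f"YES out of all respondents: {pct_yes_all}%")
print(f"YES out of those who answered: {pct_yes_answered}%")
```

The two results differ by more than eight percentage points, which is why the number of omissions has to be counted before any percentage is reported.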
Editing
Outliers:
They are extreme values, which differ
significantly from the other responses
given for an item/question. They
often distort the true picture that the
statistical analysis results would
otherwise give. For example:
Editing
Question: What is your monthly average income after
taxation?
Five responses were given as follows.
Respondent | Income (in Kshs.)
Respondent 1 | 25,000
Respondent 2 | 30,000
Respondent 3 | 15,000
Respondent 4 | 2,500,000
Respondent 5 | 18,000
Compute the mean income for the five respondents.
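A minimal sketch of the computation, using the five incomes listed above, shows how one extreme value pulls the mean away from the typical response. Treating the largest value as the possible outlier, and comparing against the median, are choices made for this illustration rather than rules from the lecture.

```python
from statistics import mean, median

# Monthly incomes (Kshs.) of Respondents 1 to 5.
incomes = [25_000, 30_000, 15_000, 2_500_000, 18_000]

# Assumption for this sketch: the largest value is the suspected outlier.
suspect = max(incomes)
without_suspect = [value for value in incomes if value != suspect]

mean_all = mean(incomes)
mean_without = mean(without_suspect)
median_all = median(incomes)

print(f"Mean of all five responses:     {mean_all:,.0f}")
print(f"Mean without the extreme value: {mean_without:,.0f}")
print(f"Median of all five responses:   {median_all:,.0f}")
```

The mean of all five responses is far above what four of the five respondents earn, while the median stays close to the typical income.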
Data Coding
It is a procedure for assigning numeric
values or symbols to all the variables
(or items) and to the responses given,
so that the data can be tabulated for
subsequent statistical analysis.
Data Coding
Imagine you are developing a
questionnaire to be administered
in collecting data from your
students in a class. Questions
you included in the instruments
are shown below:
Coding
Questionnaire Item 1:
What is your gender?
Male
Female
Coding
Questionnaire Item 2:
What is your age-group?
Less than 15 years
16 – 20 years
21 – 25 years
Over 25 years
Coding
Questionnaire Item 3:
How was your performance in mathematics
in the last end term examinations?
Excellent
Very Good
Good
Fair
Poor
Coding
In coding the data for the items, you
need to assign numeric values or
symbols to each question as well as to
each response category provided.
Compile a coding scheme as shown;
Coding
Gender of students – 01
Male – 1
Female – 2
Age-group – 02
Less than 15 years – 1
16 – 20 years – 2
21 – 25 years – 3
Over 25 years – 4
Data Coding scheme
Description of Variable Response Labels Labels Type of
variable Code Codes Variable/Dat
a
Var. 1 GENDER OF 01 Male 1 NOMINAL
STUDENTS Female 2
Var. 2 AGE OF STUDENT 02  Less than 15 years 1
 16 – 20 years 2 INTERVAL
 21 – 25 years 3
 Over 25 years 4
Var. 3 STUDENTS 03  Poor 1
PERFORMANCE  Fair 2 ORDINAL
IN  Good 3
MATHEMATICS  Very good 4
 Excellent 5
Coding
• The process of coding will make it easy for
you to tabulate the data in preparation
for data analysis.
• It will also make it easy for you to store
the data in a computer using spreadsheet
software or data analysis software such
as SPSS (Statistical Package for the Social
Sciences).
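As an illustration of how the coding scheme might be held on a computer, the sketch below stores the three variables and their label codes in a dictionary and looks up the code for a response. The names `CODING_SCHEME` and `encode` are hypothetical, chosen for this example only; SPSS and spreadsheet software have their own ways of recording value labels.

```python
# Coding scheme for the three questionnaire items, keyed by variable code.
CODING_SCHEME = {
    "01": {
        "description": "Gender of students",
        "codes": {"Male": 1, "Female": 2},
    },
    "02": {
        "description": "Age of student",
        "codes": {
            "Less than 15 years": 1,
            "16-20 years": 2,
            "21-25 years": 3,
            "Over 25 years": 4,
        },
    },
    "03": {
        "description": "Students performance in mathematics",
        "codes": {"Poor": 1, "Fair": 2, "Good": 3, "Very good": 4, "Excellent": 5},
    },
}


def encode(variable_code, response):
    """Return the numeric code for a response; raises KeyError if it is not in the scheme."""
    return CODING_SCHEME[variable_code]["codes"][response]


# A male student aged 21-25 who reported very good performance.
row = [encode("01", "Male"), encode("02", "21-25 years"), encode("03", "Very good")]
print(row)
```

A response that is not in the scheme raises an error instead of being coded silently, which gives you one more chance to catch a wrong entry.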
Data Tabulation
Data tabulation is a process for organising data to fit
into a tabular framework to facilitate subsequent
analysis.
Data tabulation indicates the number of
respondents who gave each possible answer to
each question on a questionnaire or interview
schedule.
It can also generate a cross tabulation, which
categorizes respondents by treating several
variables simultaneously.
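A cross tabulation can be sketched as a count of respondents for every combination of two coded variables. The records below are illustrative (gender code, performance code) pairs, and the function name `cross_tabulate` is an assumption made for this example.

```python
from collections import Counter

# Illustrative coded records: (gender code, performance code).
records = [(1, 4), (2, 4), (2, 5), (2, 3), (1, 3)]


def cross_tabulate(pairs):
    """Count respondents for each (row code, column code) combination."""
    row_codes = sorted({row for row, _ in pairs})
    column_codes = sorted({column for _, column in pairs})
    counts = Counter(pairs)
    return {
        row: {column: counts[(row, column)] for column in column_codes}
        for row in row_codes
    }


table = cross_tabulate(records)
for gender_code, performance_counts in table.items():
    print(gender_code, performance_counts)
```

Each row of the result is one gender code, and each entry in the row is the number of respondents of that gender who gave a particular performance code.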
Data Tabulation
It organises data into Columns and Rows
to create an array of Variable Values
against the units of analysis, thus
generating a data matrix.
The columns represent the coded
variables, while the rows represent the
units of analysis, who are the respondents.
Tabulation
Based on the coding scheme developed
earlier, you need to specify the following:
(1) The Units of Analysis:
• Those who responded by filling in the
questionnaires. Each questionnaire represents a
unit of analysis.
• Provide a number for each questionnaire as they
come back to you. Indicate on top of the first
questionnaire, the number 1, second
questionnaire, the number 2, and so on.
Tabulation
(2) The variable codes as shown in
the data-coding scheme. These are:
01 for “Gender of Students”, 02 for
“Age of students” and 03 for “Students
performance in mathematics”.
Draw up the table as shown below:
Data matrix
Units of Analysis (Respondents) | Variable codes: 01 | 02 | 03 | 04 | 05 | 06 | 07 | ---
Respondent No 1 | 1 | 3 | 4 | | | | |
Respondent No 2 | 2 | 2 | 4 | | | | |
Respondent No 3 | 2 | 2 | 5 | | | | |
Respondent No 4 | 2 | 4 | 3 | | | | |
Respondent No 5 | 1 | 3 | 3 | | | | |
Tabulation
Respondent No. 1, represented by the first
questionnaire (1), is:
• “Male”, represented by 1
• In the age-group “21 – 25 years”, represented by 3
• Recorded “very good performance”, represented
by 4.
Explain the entries for respondent No. 2 and No. 3
as indicated in the data matrix.
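Reading a row of the data matrix back into words can be sketched as a lookup from codes to labels. The names `LABELS`, `DATA_MATRIX` and `decode` are hypothetical, chosen for this illustration; the codes and the five rows are those of the coding scheme and data matrix above.

```python
# Label for each code, keyed by variable code (from the coding scheme).
LABELS = {
    "01": {1: "Male", 2: "Female"},
    "02": {1: "Less than 15 years", 2: "16-20 years", 3: "21-25 years", 4: "Over 25 years"},
    "03": {1: "Poor", 2: "Fair", 3: "Good", 4: "Very good", 5: "Excellent"},
}

# Data matrix: respondent number -> codes for variables 01, 02 and 03.
DATA_MATRIX = {
    1: (1, 3, 4),
    2: (2, 2, 4),
    3: (2, 2, 5),
    4: (2, 4, 3),
    5: (1, 3, 3),
}


def decode(respondent_no):
    """Translate one respondent's codes back into response labels."""
    codes = DATA_MATRIX[respondent_no]
    return {variable: LABELS[variable][code] for variable, code in zip(("01", "02", "03"), codes)}


print(decode(1))
```

Calling `decode` with another respondent number is one way to check your own reading of the remaining rows.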
Tabulated data
Table 4.1 Analysis of students by gender
Gender of students | Frequency | Percentage
Male | 2 | 40
Female | 3 | 60
Total | 5 | 100
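A frequency table of this kind can be produced by tallying one column of the data matrix. The sketch below uses the gender column (variable 01) of the five-respondent data matrix; the function name `frequency_table` is an assumption made for this illustration.

```python
from collections import Counter

GENDER_LABELS = {1: "Male", 2: "Female"}

# Variable 01 (gender) for Respondents 1 to 5, taken from the data matrix.
gender_column = [1, 2, 2, 2, 1]


def frequency_table(codes, labels):
    """Return (label, frequency, percentage) for every code in the scheme."""
    counts = Counter(codes)
    total = len(codes)
    return [(labels[code], counts[code], counts[code] * 100 / total) for code in sorted(labels)]


rows = frequency_table(gender_column, GENDER_LABELS)
for label, frequency, percent in rows:
    print(f"{label:<8}{frequency:>3}{percent:>8.0f}")
print(f"{'Total':<8}{len(gender_column):>3}{100:>8}")
```

The same function can be reused for the age-group and performance columns by passing their codes and labels.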
Analysis
Analysis is the process of categorizing, classifying, manipulating
and summarizing data in order to obtain answers to
research questions.
Remember the steps of scientific research;
Problem identification
Literature review
Design and methodology
Data collection
Data analysis
Generalization and dissemination (report the results)
Finally
Derive joy as you
venture into your data processing

BLESSINGS AND GOOD DAY

THANK YOU
