Processing and Analysis of Data
Introduction
• After data collection, the next step is the processing and analysis of
data.
• Technically speaking, processing implies editing, coding,
classification and tabulation of collected data so that they are
amenable to analysis.
• The term analysis refers to the computation of certain measures
along with searching for patterns of relationship that exist among
data-groups.
PROCESSING OPERATIONS
1. Editing:
• Editing of data is a process of examining the collected raw data
(especially in surveys) to detect errors and omissions and to correct
these when possible.
• As a matter of fact, editing involves a careful scrutiny of the completed
questionnaires and/or schedules.
• Editing is done to ensure that the data are accurate, consistent with
other facts gathered, uniformly entered, as complete as possible and
well arranged to facilitate coding and tabulation.
• Editing may be of two types:
» Field Editing
» Central Editing
Field Editing
• Field editing consists of the investigator reviewing the reporting
forms to complete (translate or rewrite) what he or she wrote in
abbreviated and/or illegible form at the time of recording the
respondents’ responses.
• This type of editing is necessary in view of the fact that individual
writing styles often can be difficult for others to decipher.
• This sort of editing should be done as soon as possible after the
interview, preferably on the very day or on the next day.
Central Editing
• Central editing should take place when all forms or schedules have
been completed and returned to the office.
• This type of editing implies that all forms should get a thorough
editing by a single editor in a small study and by a team of editors in
case of a large inquiry.
• Editor(s) may correct the obvious errors such as an entry in the
wrong place, entry recorded in months when it should have been
recorded in weeks, and the like.
• In case of inappropriate or missing replies, the editor can sometimes
determine the proper answer by reviewing the other information in
the schedule.
• At times, the respondent can be contacted for clarification.
• The editor must strike out an answer if it is inappropriate and
there is no basis for determining the correct answer or response.
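The central-editing rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (age, employed, occupation), the age range and the inference rule are hypothetical and are not taken from any actual schedule.

```python
# Hypothetical schedule record; field names and rules are illustrative only.
def edit_record(record):
    """Return an edited copy of one schedule plus notes on what was changed."""
    edited = dict(record)
    notes = []

    # Missing reply: infer it from other information in the same schedule.
    if edited.get("employed") is None and edited.get("occupation"):
        edited["employed"] = "yes"
        notes.append("employed inferred from occupation")

    # Inappropriate reply with no basis for correction: strike it out.
    age = edited.get("age")
    if age is not None and not (0 <= age <= 120):
        edited["age"] = None
        notes.append("age struck out")

    return edited, notes


raw = {"age": 430, "employed": None, "occupation": "teacher"}
edited, notes = edit_record(raw)
print(edited)
print(notes)
```

Keeping the notes alongside the edited record leaves a trail of every change the editor made, so a struck-out answer can still be followed up with the respondent.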
PROCESSING OPERATIONS contd..
2. Coding:
• Coding refers to the process of assigning numerals or other symbols
to answers so that responses can be put into a limited number of
categories or classes.
• Such classes should be appropriate to the research problem under
consideration.
• They must also possess the characteristic of exhaustiveness (i.e.,
there must be a class for every data item) and also that of mutual
exclusivity, which means that a specific answer can be placed in one
and only one cell in a given category.
• Coding is necessary for efficient analysis and through it the several
replies may be reduced to a small number of classes which contain
the critical information required for analysis.
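As a minimal sketch of coding, the Python snippet below assigns a numeral to each answer. The code frame, the labels and the catch-all code 9 are assumptions made for illustration. A dictionary lookup gives each answer exactly one code (mutual exclusivity), and the catch-all class ensures every answer receives some code (exhaustiveness).

```python
# Hypothetical code frame for one survey question; labels and codes are illustrative.
CODE_FRAME = {
    "strongly agree": 1,
    "agree": 2,
    "neutral": 3,
    "disagree": 4,
    "strongly disagree": 5,
}
OTHER = 9  # catch-all class so that every answer falls into some class


def code_response(answer):
    """Map a free-text answer to exactly one numeric code."""
    key = answer.strip().lower()
    return CODE_FRAME.get(key, OTHER)


responses = ["Agree", " neutral ", "Strongly Disagree", "no opinion"]
codes = [code_response(r) for r in responses]
print(codes)  # [2, 3, 5, 9]
```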
PROCESSING OPERATIONS contd..
3. Classification:
• Most research studies result in a large volume of raw data which must
be reduced into homogeneous groups if we are to get meaningful
relationships.
• This fact necessitates classification of data which happens to be the
process of arranging data in groups or classes on the basis of common
characteristics.
• Data having a common characteristic are placed in one class and in
this way the entire data get divided into a number of groups or classes.
Types of Classification
Classification according to attributes:
• Data are classified on the basis of common characteristics which can
either be descriptive (such as literacy, sex, honesty, etc.) or
numerical (such as weight, height, income, etc.).
• Classification can be simple classification or manifold classification.
• In simple classification we consider only one attribute and divide the
universe into two classes—one class consisting of items possessing
the given attribute and the other class consisting of items which do
not possess the given attribute.
• But in manifold classification we consider two or more attributes
simultaneously, and divide the data into a number of classes (the total
number of classes of final order is given by 2^n, where n = number of
attributes considered).
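To see where the count comes from: each of the n attributes is either present or absent, so there are 2 × 2 × ... × 2 = 2^n classes of final order. The Python sketch below enumerates them for three attributes; the attribute names are hypothetical.

```python
from itertools import product

attributes = ["literate", "employed", "urban"]  # hypothetical attributes

# Each final-order class records presence (True) or absence (False)
# of every attribute, so there are 2 ** n of them.
classes = list(product([True, False], repeat=len(attributes)))


def classify(item):
    """Return the final-order class that an item belongs to."""
    return tuple(bool(item[a]) for a in attributes)


person = {"literate": True, "employed": False, "urban": True}
print(len(classes))      # 8
print(classify(person))  # (True, False, True)
```

With a single attribute the same code yields two classes, which is the simple classification described above.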
Types of Classification Contd...
Classification according to class-intervals:
• Unlike descriptive characteristics, the numerical characteristics refer
to quantitative phenomena which can be measured through some
statistical units.
• Data relating to income, production, age, weight, etc. come under
this category. Such data are known as statistics of variables and are
classified on the basis of class intervals.
• For instance, persons whose incomes, say, are within Rs 201 to Rs
400 can form one group, those whose incomes are within Rs 401 to
Rs 600 can form another group and so on.
• In this way the entire data may be divided into a number of groups
or classes or what are usually called, ‘class-intervals.’
• Each group or class-interval, thus, has an upper limit as well as a
lower limit, which are known as class limits.
• The difference between the two class limits is known as class
magnitude.
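The income example can be sketched in Python as follows. The incomes are made-up whole-rupee figures, and the starting limit of Rs 201 and the spacing of 200 between successive lower limits are read off the Rs 201 to Rs 400 and Rs 401 to Rs 600 classes above. The sketch assumes no income falls below Rs 201.

```python
from collections import Counter

START = 201  # lower limit of the first class (Rs 201 to Rs 400)
WIDTH = 200  # assumed spacing between successive lower limits


def class_interval(income):
    """Return (lower limit, upper limit) of the class holding a whole-rupee income."""
    index = (income - START) // WIDTH
    lower = START + index * WIDTH
    return lower, lower + WIDTH - 1


incomes = [250, 400, 401, 575, 600, 730]  # made-up incomes in rupees
frequency = Counter(class_interval(x) for x in incomes)
for (lower, upper), count in sorted(frequency.items()):
    print(f"Rs {lower} to Rs {upper}: {count}")
```

Counting how many items fall in each class, as done here with `Counter`, is the first step towards a frequency table.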
Types of Classification Contd...
Classification according to class-intervals: (contd...)
• Class intervals are mainly of two types:
» Exclusive type class Interval
» Inclusive type class interval
Such a curve is technically described as a normal curve and the related distribution as a
normal distribution. Such a curve is perfectly bell-shaped, in which case the values of the
mean (X̄), the median (M) and the mode (Z) are the same and skewness is altogether absent.
MEASURES OF ASYMMETRY (SKEWNESS) contd..