CHAPTER 7

Data Analysis & Interpretation


The data, after collection, has to be
processed and analyzed in accordance
with the outline laid down for the
purpose at the time of developing the
research plan. This is essential for a
scientific study and for ensuring that we
have all relevant data for making
contemplated comparisons and analysis.
Technically speaking, processing implies
editing, coding, classification and
tabulation of collected data so that they
are amenable to analysis. The term
analysis refers to the computation of
certain measures along with searching
for patterns of relationship that exist
among data-groups. Thus, “in the process
of analysis, relationships or differences
supporting or conflicting with original or
new hypotheses should be subjected to
statistical tests of significance to
determine with what validity data can be
said to indicate any conclusions.” But
there are persons (Selltiz, Jahoda and
others) who do not like to make
difference between processing and
analysis. They opine that analysis of data
in a general way involves a number of
closely related operations which are
performed with the purpose of
summarizing the collected data and
organizing these in such a manner that
they answer the research question(s).
We, however, shall prefer to observe the
difference between the two terms as
stated here in order to understand their
implications more clearly.
7.1. Processing Operations
With this brief introduction concerning
the concepts of processing and analysis,
we can now proceed with the explanation
of all the processing operations.
1. Editing: Editing of data is a process of
examining the collected raw data
(especially in surveys) to detect errors
and omissions and to correct these when
possible. As a matter of fact, editing
involves a careful scrutiny of the
completed questionnaires and/or
schedules. Editing is done to assure that
the data are accurate, consistent with
other facts gathered, uniformly entered,
as complete as possible and have been
well arranged to facilitate coding and
tabulation.
With regard to points or stages at which
editing should be done, one can talk of
field editing and central editing. Field
editing consists in the review of the
reporting forms by the investigator for
completing (translating or rewriting)
what the latter has written in abbreviated
and/or in illegible form at the time of
recording the respondents’ responses.
This type of editing is necessary in view
of the fact that individual writing styles
often can be difficult for others to
decipher. Central editing should take place when all forms
or schedules have been completed and
returned to the office. This type of
editing implies that all forms should get a
thorough editing by a single editor in a
small study and by a team of editors in
case of a large inquiry. Editor(s) may
correct the obvious errors such as an
entry in the wrong place, entry recorded
in months when it should have been
recorded in weeks, and the like. In case
of inappropriate or missing replies, the
editor can sometimes determine the
proper answer by reviewing the other
information in the schedule. At times, the
respondent can be contacted for
clarification. The editor must strike out
the answer if the same is inappropriate
and he has no basis for determining the
correct answer or the response. In such a
case an editing entry of ‘no answer’ is
called for. All the wrong replies, which
are quite obvious, must be dropped from
the final results, especially in the context
of mail surveys.
Editors must keep in view several points
while performing their work: (a) they
should be familiar with instructions
given to the interviewers and coders as
well as with the editing instructions
supplied to them for the purpose. (b)
While crossing out an original entry for
one reason or another, they should just
draw a single line on it so that the same
may remain legible. (c) They must make
entries (if any) on the form in some
distinctive color and that too in a
standardized form. (d) They should
initial all answers which they change or
supply. (e) Editor’s initials and the date
of editing should be placed on each
completed form or schedule.
2. Coding: Coding refers to the process
of assigning numerals or other symbols
to answers so that responses can be put
into a limited number of categories or
classes. Such classes should be
appropriate to the research problem
under consideration. They must also
possess the characteristic of
exhaustiveness (i.e., there must be a class
for every data item) and also that of
mutual exclusiveness, which means that a
specific answer can be placed in one and
only one cell in a given category set.
Another rule to be observed is that of
unidimensionality by which is meant that
every class is defined in terms of only
one concept. Coding is necessary for
efficient analysis and through it the
several replies may be reduced to a small
number of classes which contain the
critical information required for analysis.
Coding decisions should usually be taken
at the designing stage of the
questionnaire. This makes it possible to
pre-code the questionnaire choices, which in turn is helpful for computer tabulation, as one can key punch straightaway from the original
questionnaires. But in case of hand
coding some standard method may be
used. One such standard method is to
code in the margin with a colored pencil.
The other method can be to transcribe the
data from the questionnaire to a coding
sheet. Whatever method is adopted, one
should see that coding errors are
altogether eliminated or reduced to the
minimum level.
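As a minimal sketch (the answer categories and the numeric codes below are illustrative assumptions, not drawn from the text), coding a batch of responses might be carried out in Python as follows:

    # Illustrative coding scheme: map raw survey answers to numeric codes.
    # The classes must be exhaustive and mutually exclusive, so an explicit
    # "no answer / other" code catches anything that fits no named class.
    RESPONSE_CODES = {
        "yes": 1,
        "no": 2,
        "don't know": 8,
    }
    NO_ANSWER_CODE = 9  # keeps the scheme exhaustive

    def code_response(answer: str) -> int:
        """Return exactly one numeric code for a raw answer (mutual exclusiveness)."""
        return RESPONSE_CODES.get(answer.strip().lower(), NO_ANSWER_CODE)

    raw_answers = ["Yes", "no", "Don't know", "maybe"]
    print([code_response(a) for a in raw_answers])  # [1, 2, 8, 9]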
3. Classification: Most research studies
result in a large volume of raw data
which must be reduced into
homogeneous groups if we are to get
meaningful relationships. This fact
necessitates classification of data which
happens to be the process of arranging
data in groups or classes on the basis of
common characteristics. Data having a
common characteristic are placed in one
class and in this way the entire data get
divided into a number of groups or
classes. Classification can be one of the
following two types, depending upon the
nature of the phenomenon involved:
(a) Classification according to
attributes: As stated above, data are
classified on the basis of common
characteristics which can either be
descriptive (such as literacy, sex,
honesty, etc.) or numerical (such as
weight, height, income, etc.). Descriptive
characteristics refer to qualitative
phenomena which cannot be measured
quantitatively; only their presence or
absence in an individual item can be
noticed. Data obtained this way on the
basis of certain attributes are known as
statistics of attributes and their
classification is said to be classification
according to attributes.
Such classification can be simple
classification or manifold classification.
In simple classification we consider only
one attribute and divide the universe into
two classes—one class consisting of
items possessing the given attribute and
the other class consisting of items which
do not possess the given attribute. But in
manifold classification we consider two
or more attributes simultaneously, and
divide the data into a number of classes (the total number of classes of the final order is given by 2^n, where n = number of
attributes considered). Whenever data are
classified according to attributes, the
researcher must see that the attributes are
defined in such a manner that there is
least possibility of any doubt/ambiguity
concerning the said attributes.
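As a minimal sketch of manifold classification, assuming two purely illustrative attributes ("literate" and "employed"): with n attributes considered simultaneously, the cross-classification yields 2^n classes of the final order.

    from itertools import product

    # Illustrative attributes; each item either possesses an attribute or not.
    attributes = ["literate", "employed"]        # n = 2
    classes = list(product([True, False], repeat=len(attributes)))
    print(len(classes))                          # 2**2 = 4 classes of the final order
    for cls in classes:
        print(dict(zip(attributes, cls)))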
(b) Classification according to class-
intervals: Unlike descriptive
characteristics, the numerical
characteristics refer to quantitative
phenomena which can be measured
through some statistical units. Data
relating to income, production, age,
weight, etc. come under this category.
Such data are known as statistics of
variables and are classified on the basis
of class intervals. For instance, persons
whose incomes, say, are within Rs 201 to
Rs 400 can form one group; those whose
incomes are within Rs 401 to Rs 600 can
form another group and so on. In this
way the entire data may be divided into a
number of groups or classes or what are
usually called, ‘class-intervals.’ Each
group or class-interval, thus, has an
upper limit as well as a lower limit which
are known as class limits. The difference
between the two class limits is known as
class magnitude. We may have classes
with equal class magnitudes or with
unequal class magnitudes. The number of
items which fall in a given class is
known as the frequency of the given
class. All the classes or groups, with their
respective frequencies taken together and
put in the form of a table, are described
as group frequency distribution or simply
frequency distribution. Classification
according to class intervals usually
involves the following three main
problems:
(i) How many classes should there be?
(ii) How to choose class limits?
(iii) How to determine the frequency of each class? (A short sketch of this last step follows.)
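As a minimal sketch of the third problem, assuming illustrative income figures and class intervals of the kind mentioned above (Rs 201-400, Rs 401-600, and so on), the frequency of each class can be counted as follows:

    # Count how many items fall within each class interval (the class frequency).
    # The incomes and class limits below are illustrative.
    incomes = [250, 310, 405, 590, 615, 720, 399, 480]
    class_intervals = [(201, 400), (401, 600), (601, 800)]  # (lower limit, upper limit)

    frequency = {
        f"Rs {lo}-{hi}": sum(lo <= x <= hi for x in incomes)
        for lo, hi in class_intervals
    }
    print(frequency)  # {'Rs 201-400': 3, 'Rs 401-600': 3, 'Rs 601-800': 2}

Taken together, such class frequencies form the frequency distribution described earlier.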
4. Tabulation: When a mass of data has
been assembled, it becomes necessary for
the researcher to arrange the same in
some kind of concise and logical order.
This procedure is referred to as
tabulation. Thus, tabulation is the process
of summarising raw data and displaying
the same in compact form (i.e., in the
form of statistical tables) for further
analysis. In a broader sense, tabulation is
an orderly arrangement of data in
columns and rows. Tabulation is
essential because of the following
reasons.
1. It conserves space and reduces
explanatory and descriptive statement to
a minimum.
2. It facilitates the process of
comparison.
3. It facilitates the summation of items
and the detection of errors and omissions.
4. It provides a basis for various
statistical computations.
Tabulation can be done by hand or by
mechanical or electronic devices. The
choice depends on the size and type of
study, cost considerations, time pressures
and the availability of tabulating
machines or computers. In relatively
large inquiries, we may use mechanical
or computer tabulation if other factors
are favourable and necessary facilities
are available. Hand tabulation is usually
preferred in case of small inquiries where
the number of questionnaires is small and
they are of relatively short length. Hand
tabulation may be done using the direct
tally, the list and tally or the card sort
and count methods. When there are
simple codes, it is feasible to tally
directly from the questionnaire. Under
this method, the codes are written on a
sheet of paper, called a tally sheet, and for
each response a stroke is marked against
the code in which it falls. Usually after
every four strokes against a particular
code, the fifth response is indicated by
drawing a diagonal or horizontal line
through the strokes. These groups of five
are easy to count and the data are sorted
against each code conveniently. In the
listing method, the code responses may
be transcribed onto a large work-sheet,
allowing a line for each questionnaire.
This way a large number of
questionnaires can be listed on one work
sheet. Tallies are then made for each
question. The card sorting method is the
most flexible method of hand tabulation. In this
method the data are recorded on special
cards of convenient size and shape with a
series of holes. Each hole stands for a
code and when cards are stacked, a
needle passes through the particular hole
representing a particular code. These
cards are then separated and counted.

In this way frequencies of various codes can be found out by the repetition of this
technique. We can as well use the
mechanical devices or the computer
facility for tabulation purposes in case we
want quick results, our budget permits
their use and we have a large volume of
straightforward tabulation involving a
number of cross-breaks. Tabulation may
also be classified as simple and complex
tabulation. The former type of tabulation
gives information about one or more
groups of independent questions,
whereas the latter type of tabulation
shows the division of data in two or more
categories and as such is designed to give
information concerning one or more sets
of inter-related questions. Simple
tabulation generally results in one-way
tables which supply answers to questions
about one characteristic of data only. As
against this, complex tabulation usually
results in two-way tables (which give
information about two inter-related
characteristics of data), three-way tables
(giving information about three
interrelated characteristics of data) or still
higher order tables, also known as
manifold tables, which supply
information about several interrelated
characteristics of data. Two-way tables,
three-way tables or manifold tables are
all examples of what is sometimes
described as cross tabulation.
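As a hedged illustration of this distinction, the sketch below builds a one-way (simple) table and a two-way (cross) table with pandas; the variables "sex" and "response" and the data are purely illustrative:

    import pandas as pd

    # Illustrative coded survey data.
    df = pd.DataFrame({
        "sex":      ["M", "F", "F", "M", "F", "M"],
        "response": ["yes", "yes", "no", "no", "yes", "yes"],
    })

    # Simple tabulation: a one-way table about a single characteristic of the data.
    print(df["response"].value_counts())

    # Complex (cross) tabulation: a two-way table of two inter-related characteristics.
    print(pd.crosstab(df["sex"], df["response"], margins=True))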
Generally accepted principles of
tabulation: Such principles of tabulation,
particularly of constructing statistical
tables, can be briefly stated as follows:
1. Every table should have a clear,
concise and adequate title so as to make
the table intelligible without reference to
the text and this title should always be
placed just above the body of the table.
2. Every table should be given a distinct
number to facilitate easy reference.
3. The column headings (captions) and
the row headings (stubs) of the table
should be clear and brief.
4. The units of measurement under each
heading or sub-heading must always be
indicated.
5. Explanatory footnotes, if any,
concerning the table should be placed
directly beneath the table, along with
the reference symbols used in the table.
6. Source or sources from where the
data in the table have been obtained
must be indicated just below the table.
7. Usually the columns are separated
from one another by lines which make
the table more readable and attractive.
Lines are always drawn at the top and
bottom of the table and below the
captions.
8. There should be thick lines to separate
the data under one class from the data
under another class and the lines
separating the sub-divisions of the
classes should be comparatively thin
lines.
9. The columns may be numbered to
facilitate reference.
10. Those columns whose data are to be
compared should be kept side by side.
Similarly, percentages and/or averages
must also be kept close to the data.
11. It is generally considered better to
approximate figures before tabulation as
the same would reduce unnecessary
details in the table itself.
12. In order to emphasise the relative
significance of certain categories,
different kinds of type, spacing and
indentations may be used.
13. It is important that all column figures
be properly aligned. Decimal points and
(+) or (–) signs should be in perfect
alignment.
14. Abbreviations should be avoided to
the extent possible and ditto marks
should not be used in the table.
15. Miscellaneous and exceptional items,
if any, should be usually placed in the
last row of the table.
16. Table should be made as logical,
clear, accurate and simple as possible. If
the data happen to be very large, they
should not be crowded in a single table
for that would make the table unwieldy
and inconvenient.
17. Total of rows should normally be
placed in the extreme right column and
that of columns should be placed at the
bottom.
18. The arrangement of the categories in
a table may be chronological,
geographical, alphabetical or according
to magnitude to facilitate comparison.
Above all, the table must suit the needs
and requirements of an investigation.
7.2. Some Problems in Processing
We can take up the following two
problems of processing the data for
analytical purposes:
(a) The problem concerning “Don’t
know” (or DK) responses: While
processing the data, the researcher
often comes across some responses
that are difficult to handle. One
category of such responses may be
‘Don’t Know Response’ or simply
DK response. When the DK response
group is small, it is of little
significance. But when it is relatively
big, it becomes a matter of major
concern in which case the question
arises: Is the question which elicited
DK response useless? The answer
depends on two points viz., the
respondent actually may not know the
answer or the researcher may fail in
obtaining the appropriate information.
In the first case the concerned
question is said to be alright and DK
response is taken as legitimate DK
response. But in the second case, DK
response is more likely to be a failure
of the questioning process.
(b) Percentages: Percentages are
often used in data presentation for
they simplify numbers, reducing all of
them to a 0 to 100 range. Through the
use of percentages, the data are
reduced to a standard form with base equal to 100, which facilitates relative comparisons. While
using percentages, the following rules
should be kept in view by researchers:
1. Two or more percentages must not be
averaged unless each is weighted by
the group size from which it has been
derived.
2. Use of too large percentages should
be avoided, since a large percentage is
difficult to understand and tends to
confuse, defeating the very purpose for
which percentages are used.
3. Percentages hide the base from which
they have been computed. If this is not
kept in view, the real differences may
not be correctly read.
4. Percentage decreases can never
exceed 100 per cent and as such for
calculating the percentage of decrease,
the higher figure should invariably be
taken as the base.
5. Percentages should generally be
worked out in the direction of the
causal-factor in case of two-dimension
tables and for this purpose we must
select the more significant factor out of
the two given factors as the causal
factor.
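As a minimal arithmetic sketch of rules 1 and 4 above, with purely illustrative figures:

    # Rule 1: average percentages only after weighting each by its group size.
    percentages = [60.0, 20.0]      # e.g. 60% of group A, 20% of group B (illustrative)
    group_sizes = [50, 200]
    weighted_avg = sum(p * n for p, n in zip(percentages, group_sizes)) / sum(group_sizes)
    print(weighted_avg)             # 28.0, not the misleading unweighted 40.0

    # Rule 4: a percentage of decrease takes the higher figure as its base,
    # so it can never exceed 100 per cent.
    old_value, new_value = 400, 300
    print((old_value - new_value) / old_value * 100)   # 25.0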

7.3. Elements/Types of Analysis


As stated earlier, by analysis we mean
the computation of certain indices or
measures along with searching for
patterns of relationship that exist among
the data groups. Analysis, particularly in
case of survey or experimental data,
involves estimating the values of
unknown parameters of the population
and testing of hypotheses for drawing
inferences. Analysis may, therefore, be
categorised as descriptive analysis and
inferential analysis (Inferential analysis
is often known as statistical analysis).
Descriptive analysis is largely the study
of distributions of one variable. This
study provides us with profiles of
companies, work groups, persons and
other subjects on any of a multitude of characteristics such as size, composition, efficiency, preferences, etc. This sort of
analysis may be in respect of one
variable (described as unidimensional
analysis), or in respect of two variables
(described as bivariate analysis) or in
respect of more than two variables
(described as multivariate analysis). In
this context we work out various
measures that show the size and shape of
a distribution(s) along with the study of
measuring relationships between two or
more variables.
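As a minimal sketch of unidimensional (descriptive) analysis, the lines below work out a few common measures of the size and shape of a single distribution; the figures are illustrative:

    import statistics

    # Illustrative single-variable data (e.g. weekly output of a work group).
    values = [12, 15, 14, 10, 18, 20, 15, 13]

    print("mean:", statistics.mean(values))        # central tendency
    print("median:", statistics.median(values))
    print("std dev:", statistics.stdev(values))    # dispersion (spread of the distribution)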
We may as well talk of correlation
analysis and causal analysis. Correlation
analysis studies the joint variation of two
or more variables for determining the
amount of correlation between them. Causal analysis is
concerned with the study of how one or
more variables affect changes in another
variable. It is thus a study of functional
relationships existing between two or
more variables. This analysis can be
termed as regression analysis. Causal
analysis is considered relatively more
important in experimental researches,
whereas in most social and business
researches our interest lies more in understanding and controlling relationships between variables than in determining causes per se, and as such we consider correlation analysis relatively more important.
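As a minimal sketch, on purely illustrative data, of correlation analysis and of causal (regression) analysis:

    import numpy as np

    # Illustrative paired observations, e.g. advertising spend (x) and sales (y).
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Correlation analysis: the amount of joint variation between the two variables.
    r = np.corrcoef(x, y)[0, 1]
    print("correlation coefficient:", round(r, 3))

    # Causal (regression) analysis: fit y = a + b*x to study the functional relationship.
    b, a = np.polyfit(x, y, 1)
    print("regression line: y = %.2f + %.2f x" % (a, b))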
Inferential analysis is concerned with
the various tests of significance for
testing hypotheses in order to determine
with what validity data can be said to
indicate some conclusion or conclusions.
It is also concerned with the estimation
of population values. It is mainly on the
basis of inferential analysis that the task
of interpretation (i.e., the task of drawing
inferences and conclusions) is
performed.
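As a hedged sketch of one such test of significance, the lines below run a two-sample t-test on two illustrative groups (scipy is only one of several ways of doing this):

    from scipy import stats

    # Illustrative samples from two groups (e.g. responses under two treatments).
    group_a = [23, 25, 28, 30, 27, 26]
    group_b = [31, 29, 33, 35, 30, 32]

    # Two-sample t-test: can the observed difference be attributed to chance alone?
    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    print("t =", round(t_stat, 3), "p =", round(p_value, 4))
    # A small p-value (e.g. below 0.05) would support rejecting the null hypothesis
    # of equal population means.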
