
CHAPTER 7

Data Analysis and Interpretation

The data, after collection, have to be processed and analysed in accordance with the outline laid down for the purpose at the time of developing the research plan. This is essential for a scientific study and for ensuring that we have all the relevant data for making the contemplated comparisons and analyses. Technically speaking, processing implies editing, coding, classification and tabulation of collected data so that they are amenable to analysis. The term analysis refers to the computation of certain measures along with searching for patterns of relationship that exist among data-groups. Thus, “in the process of analysis, relationships or differences supporting or conflicting with original or new hypotheses should be subjected to statistical tests of significance to determine with what validity data can be said to indicate any conclusions.” There are, however, writers (Selltiz, Jahoda and others) who do not distinguish between processing and analysis. They hold that analysis of data in a general way involves a number of closely related operations performed with the purpose of summarising the collected data and organising them in such a manner that they answer the research question(s). We shall, however, maintain the distinction between the two terms as stated here in order to understand their implications more clearly.
7.1. Processing Operations
With this brief introduction concerning the concepts of processing and analysis, we can now
proceed with the explanation of all the processing operations.
1. Editing: Editing of data is the process of examining the collected raw data (especially in surveys) to detect errors and omissions and to correct these where possible. As a matter of fact, editing involves a careful scrutiny of the completed questionnaires and/or schedules. Editing is done to assure that the data are accurate, consistent with other facts gathered, uniformly entered, as complete as possible, and well arranged to facilitate coding and tabulation.
With regard to the points or stages at which editing should be done, one can talk of field editing and central editing. Field editing consists in the review of the reporting forms by the investigator for completing (translating or rewriting) what the interviewer has written in abbreviated and/or illegible form at the time of recording the respondents’ responses. This type of editing is necessary in view of the fact that individual writing styles can often be difficult for others to decipher.
Central editing, by contrast, should take place when all forms or schedules have been completed and returned to the office. This type of editing implies that all forms should get a thorough editing by a single editor in a small study, and by a team of editors in the case of a large inquiry. The editor(s) may correct obvious errors such as an entry in the wrong place, or an entry recorded in months when it should have been recorded in weeks, and the like. In the case of inappropriate or missing replies, the editor can sometimes determine the proper answer by reviewing the other information in the schedule. At times the respondent can be contacted for clarification. The editor must strike out an answer if it is inappropriate and he has no basis for determining the correct response; in such a case an editing entry of ‘no answer’ is called for. All wrong replies, which are quite obvious, must be dropped from the final results, especially in the context of mail surveys.
Editors must keep several points in view while performing their work:
(a) They should be familiar with the instructions given to the interviewers and coders, as well as with the editing instructions supplied to them for the purpose.
(b) While crossing out an original entry for one reason or another, they should draw just a single line through it so that it remains legible.
(c) They must make entries (if any) on the form in some distinctive colour and in a standardised form.
(d) They should initial all answers which they change or supply.
(e) The editor’s initials and the date of editing should be placed on each completed form or schedule.
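To make the central-editing checks concrete, here is a minimal Python sketch of the kind of rule-based scrutiny described above; the field names and admissible ranges are hypothetical, chosen only for illustration.

```python
# A minimal sketch of automated central-editing checks on survey records.
# The field names ("age", "income", "duration_weeks") and the ranges are
# hypothetical, illustrating the kinds of rules an editor applies by hand.

records = [
    {"id": 1, "age": 34, "income": 520, "duration_weeks": 6},
    {"id": 2, "age": -3, "income": 410, "duration_weeks": 6},    # impossible age
    {"id": 3, "age": 29, "income": None, "duration_weeks": 72},  # missing + suspect unit
]

def edit_record(rec):
    """Return a list of editing remarks; an empty list means the record passes."""
    remarks = []
    if rec["age"] is None or not (0 <= rec["age"] <= 110):
        remarks.append("age out of range or missing")
    if rec["income"] is None:
        remarks.append("no answer: income")  # the 'no answer' editing entry
    if rec["duration_weeks"] is not None and rec["duration_weeks"] > 52:
        remarks.append("duration possibly recorded in days/months, not weeks")
    return remarks

for rec in records:
    print(rec["id"], edit_record(rec))
```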
2. Coding: Coding refers to the process of assigning numerals or other symbols to answers so
that responses can be put into a limited number of categories or classes. Such classes should be
appropriate to the research problem under consideration. They must also possess the characteristic of exhaustiveness (i.e., there must be a class for every data item) and that of mutual exclusivity, which means that a specific answer can be placed in one and only one cell in a given category set. Another rule to be observed is that of unidimensionality, by which is meant that every class is defined in terms of only one concept. Coding is necessary for efficient analysis; through it the several replies may be reduced to a small number of classes which contain the critical information required for analysis. Coding decisions should usually be taken at the designing stage of the questionnaire. This makes it possible to pre-code the questionnaire choices, which in turn is helpful for computer tabulation as one can key the data directly from the original questionnaires. In the case of hand coding, some standard method may be used. One such method is to code in the margin with a coloured pencil. The other
method can be to transcribe the data from the questionnaire to a coding sheet. Whatever method is adopted, one should see that coding errors are eliminated altogether or at least reduced to a minimum.
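As a rough illustration of coding, the following Python sketch assigns numerals to answers through a codebook; the categories and codes are hypothetical. Note the catch-all class that keeps the scheme exhaustive, and that each answer falls into exactly one class (mutual exclusivity).

```python
# A minimal sketch of coding: assigning numerals to answers via a codebook.
# The five categories and the catch-all code are hypothetical.

CODEBOOK = {
    "strongly agree": 1,
    "agree": 2,
    "neutral": 3,
    "disagree": 4,
    "strongly disagree": 5,
}
OTHER = 9  # catch-all code, so every response falls into some class

answers = ["Agree", "neutral", "no opinion", "Strongly Disagree"]
codes = [CODEBOOK.get(a.strip().lower(), OTHER) for a in answers]
print(codes)  # [2, 3, 9, 5]
```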
3. Classification: Most research studies result in a large volume of raw data which must be
reduced into homogeneous groups if we are to get meaningful relationships. This fact
necessitates classification of data which happens to be the process of arranging data in groups or
classes on the basis of common characteristics. Data having a common characteristic are placed
in one class and in this way the entire data get divided into a number of groups or classes.
Classification can be one of the following two types, depending upon the nature of the
phenomenon involved:
(a) Classification according to attributes: As stated above, data are classified on the basis of
common characteristics which can either be descriptive (such as literacy, sex, honesty, etc.) or
numerical (such as weight, height, income, etc.). Descriptive characteristics refer to qualitative phenomena which cannot be measured quantitatively; only their presence or absence in an
individual item can be noticed. Data obtained this way on the basis of certain attributes are
known as statistics of attributes and their classification is said to be classification according to
attributes.
Such classification can be simple classification or manifold classification. In simple
classification we consider only one attribute and divide the universe into two classes—one class
consisting of items possessing the given attribute and the other class consisting of items which
do not possess the given attribute. But in manifold classification we consider two or more attributes simultaneously and divide the data into a number of classes (the total number of classes of the final order is given by 2^n, where n = the number of attributes considered). Whenever data are
classified according to attributes, the researcher must see that the attributes are defined in such a
manner that there is least possibility of any doubt/ambiguity concerning the said attributes.
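The 2^n rule for manifold classification can be illustrated with a short Python sketch; the three attributes used here are hypothetical.

```python
# A small sketch of manifold classification: with n dichotomous attributes,
# the classes of the final order number 2**n. Attribute names are hypothetical.
from itertools import product

attributes = ["literate", "employed", "urban"]  # n = 3 attributes
classes = list(product([True, False], repeat=len(attributes)))

print(len(classes))  # 2**3 = 8 classes of the final order
for cls in classes:
    print({attr: flag for attr, flag in zip(attributes, cls)})
```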
(b) Classification according to class-intervals: Unlike descriptive characteristics, numerical characteristics refer to quantitative phenomena which can be measured through some statistical units. Data relating to income, production, age, weight, etc. come under this category. Such data
are known as statistics of variables and are classified on the basis of class intervals. For instance,
persons whose incomes, say, are within Rs 201 to Rs 400 can form one group; those whose
incomes are within Rs 401 to Rs 600 can form another group and so on. In this way the entire
data may be divided into a number of groups or classes, or what are usually called ‘class-intervals’. Each group or class-interval, thus, has an upper limit as well as a lower limit, which are known as class limits. The difference between the two class limits is known as the class magnitude. We may have classes with equal class magnitudes or with unequal class magnitudes. The number of items which fall in a given class is known as the frequency of the given class. All the classes or groups, with their respective frequencies taken together and put in the form of a table, are described as a grouped frequency distribution or simply a frequency distribution.
Classification according to class intervals usually involves the following three main problems:
(i) How many classes should there be?
(ii) How should the class limits be chosen?
(iii) How should the frequency of each class be determined?
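The following Python sketch shows one way these three questions may be settled in practice, using hypothetical income figures (in Rs) and equal class magnitudes of 200, echoing the Rs 201–400, Rs 401–600 example above.

```python
# A minimal sketch of classification by class intervals: hypothetical incomes
# grouped into classes of equal magnitude 200, then counted to give a
# frequency distribution.
from collections import Counter

incomes = [250, 390, 405, 560, 610, 220, 480, 530, 470, 615]

def class_of(x, lower=201, width=200):
    """Return the (lower, upper) class limits of the interval containing x."""
    k = (x - lower) // width
    lo = lower + k * width
    return (lo, lo + width - 1)

freq = Counter(class_of(x) for x in incomes)
for limits, f in sorted(freq.items()):
    print(f"Rs {limits[0]}-{limits[1]}: frequency {f}")
```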
4. Tabulation: When a mass of data has been assembled, it becomes necessary for the researcher
to arrange the same in some kind of concise and logical order. This procedure is referred to as
tabulation. Thus, tabulation is the process of summarising raw data and displaying the same in
compact form (i.e., in the form of statistical tables) for further analysis. In a broader sense,
tabulation is an orderly arrangement of data in columns and rows. Tabulation is essential
because of the following reasons.
1. It conserves space and reduces explanatory and descriptive statements to a minimum.
2. It facilitates the process of comparison.
3. It facilitates the summation of items and the detection of errors and omissions.
4. It provides a basis for various statistical computations.
Tabulation can be done by hand or by mechanical or electronic devices. The choice depends on
the size and type of study, cost considerations, time pressures and the availability of tabulating
machines or computers. In relatively large inquiries, we may use mechanical or computer
tabulation if other factors are favourable and necessary facilities are available. Hand tabulation is
usually preferred in case of small inquiries where the number of questionnaires is small and they
are of relatively short length. Hand tabulation may be done using the direct tally, the list and
tally or the card sort and count methods. When there are simple codes, it is feasible to tally
directly from the questionnaire. Under this method, the codes are written on a sheet of paper,
called tally sheet, and for each response a stroke is marked against the code in which it falls.
Usually after every four strokes against a particular code, the fifth response is indicated by
drawing a diagonal or horizontal line through the strokes. These groups of five are easy to count
and the data are sorted against each code conveniently. In the listing method, the code responses
may be transcribed onto a large work-sheet, allowing a line for each questionnaire. This way a
large number of questionnaires can be listed on one work sheet. Tallies are then made for each
question. The card-sort-and-count method is the most flexible form of hand tabulation. In this method the data are recorded on special cards of convenient size and shape with a series of holes. Each hole stands for a code, and when the cards are stacked, a needle passed through a particular hole representing a particular code picks out the cards so coded; these cards are then separated and counted.

In this way the frequencies of the various codes can be found by repeating this technique. We may as well use mechanical devices or the computer facility for tabulation purposes when we want quick results, our budget permits their use, and we have a large volume of straightforward tabulation involving a number of cross-breaks. Tabulation may also be classified as simple and
complex tabulation. The former type of tabulation gives information about one or more groups of
independent questions, whereas the latter type of tabulation shows the division of data in two or
more categories and as such is designed to give information concerning one or more sets of inter-
related questions. Simple tabulation generally results in one-way tables which supply answers to
questions about one characteristic of data only. As against this, complex tabulation usually
results in two-way tables (which give information about two inter-related characteristics of data),
three-way tables (giving information about three interrelated characteristics of data) or still
higher order tables, also known as manifold tables, which supply information about several
interrelated characteristics of data. Two-way tables, three-way tables or manifold tables are all examples
of what is sometimes described as cross tabulation.
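The following Python sketch builds a simple two-way (cross-tabulation) table from hypothetical coded responses, placing row totals in the extreme right column and column totals at the bottom, in line with the tabulation principles listed next.

```python
# A minimal sketch of complex (cross) tabulation: a two-way table counting
# respondents over two inter-related characteristics. Data are hypothetical.
from collections import Counter

responses = [("male", "yes"), ("female", "yes"), ("male", "no"),
             ("female", "yes"), ("male", "yes"), ("female", "no")]

table = Counter(responses)
rows, cols = ["male", "female"], ["yes", "no"]

# header row, body rows with row totals, then column totals at the bottom
print(f"{'':8}" + "".join(f"{c:>6}" for c in cols) + f"{'Total':>8}")
for r in rows:
    cells = [table[(r, c)] for c in cols]
    print(f"{r:8}" + "".join(f"{v:6d}" for v in cells) + f"{sum(cells):8d}")
col_totals = [sum(table[(r, c)] for r in rows) for c in cols]
print(f"{'Total':8}" + "".join(f"{v:6d}" for v in col_totals)
      + f"{sum(col_totals):8d}")
```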
Generally accepted principles of tabulation: Such principles of tabulation, particularly for constructing statistical tables, can be briefly stated as follows:
1. Every table should have a clear, concise and adequate title so as to make the table intelligible
without reference to the text and this title should always be placed just above the body of the
table.
2. Every table should be given a distinct number to facilitate easy reference.
3. The column headings (captions) and the row headings (stubs) of the table should be clear and
brief.

Chapter seven Page 5


4. The units of measurement under each heading or sub-heading must always be indicated.
5. Explanatory footnotes, if any, concerning the table should be placed directly beneath the
table, along with the reference symbols used in the table.
6. Source or sources from where the data in the table have been obtained must be indicated just
below the table.
7. Usually the columns are separated from one another by lines which make the table more
readable and attractive. Lines are always drawn at the top and bottom of the table and below
the captions.
8. There should be thick lines to separate the data under one class from the data under another
class and the lines separating the sub-divisions of the classes should be comparatively thin
lines.
9. The columns may be numbered to facilitate reference.
10. Those columns whose data are to be compared should be kept side by side. Similarly,
percentages and/or averages must also be kept close to the data.
11. It is generally considered better to approximate figures before tabulation as the same would
reduce unnecessary details in the table itself.
12. In order to emphasise the relative significance of certain categories, different kinds of type,
spacing and indentations may be used.
13. It is important that all column figures be properly aligned. Decimal points and (+) or (–)
signs should be in perfect alignment.
14. Abbreviations should be avoided to the extent possible and ditto marks should not be used in
the table.
15. Miscellaneous and exceptional items, if any, should be usually placed in the last row of the
table.
16. A table should be made as logical, clear, accurate and simple as possible. If the data happen to be very large, they should not be crowded into a single table, for that would make the table unwieldy and inconvenient.
17. Totals of rows should normally be placed in the extreme right-hand column, and totals of columns at the bottom.
18. The arrangement of the categories in a table may be chronological, geographical, alphabetical or
according to magnitude to facilitate comparison. Above all, the table must suit the needs and
requirements of an investigation.
7.2. Some Problems in Processing
We can take up the following two problems of processing the data for analytical purposes:
(a) The problem concerning “Don’t know” (or DK) responses: While processing the data,
the researcher often comes across some responses that are difficult to handle. One
category of such responses may be ‘Don’t Know Response’ or simply DK response.
When the DK response group is small, it is of little significance. But when it is relatively
big, it becomes a matter of major concern in which case the question arises: Is the
question which elicited the DK response useless? The answer depends on two possibilities: the respondent may actually not know the answer, or the researcher may have failed to obtain the appropriate information. In the first case the question concerned is regarded as all right, and the DK response is taken as a legitimate DK response. In the second case, the DK response is more likely to reflect a failure of the questioning process.
(b) Percentages: Percentages are often used in data presentation for they simplify numbers,
reducing all of them to a 0 to 100 range. Through the use of percentages, the data are reduced to a standard form with a base equal to 100, which facilitates relative comparisons. While using percentages, the following rules should be kept in view by
researchers:
1. Two or more percentages must not be averaged unless each is weighted by the size of the group from which it has been derived (see the sketch after this list).
2. Use of too large percentages should be avoided, since a large percentage is difficult to
understand and tends to confuse, defeating the very purpose for which percentages are used.
3. Percentages hide the base from which they have been computed. If this is not kept in view,
the real differences may not be correctly read.
4. Percentage decreases can never exceed 100 per cent and as such for calculating the
percentage of decrease, the higher figure should invariably be taken as the base.
5. Percentages should generally be worked out in the direction of the causal factor in the case of two-dimensional tables, and for this purpose we must select the more significant of the two given factors as the causal factor.
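A short Python sketch of rule 1 above, using hypothetical figures, shows how an unweighted average of percentages can mislead when group sizes differ.

```python
# A minimal sketch of rule 1: percentages from groups of different sizes must
# be weighted by group size before averaging. Figures are hypothetical.
groups = [(80.0, 50), (20.0, 450)]  # (percentage, group size)

naive = sum(p for p, _ in groups) / len(groups)
weighted = sum(p * n for p, n in groups) / sum(n for _, n in groups)

print(f"unweighted average: {naive:.1f}%")    # 50.0% -- misleading
print(f"weighted average:   {weighted:.1f}%")  # 26.0% -- the true overall rate
```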
7.3. Elements/Types of Analysis
As stated earlier, by analysis we mean the computation of certain indices or measures along with
searching for patterns of relationship that exist among the data groups. Analysis, particularly in
case of survey or experimental data, involves estimating the values of unknown parameters of
the population and testing of hypotheses for drawing inferences. Analysis may, therefore, be
categorised as descriptive analysis and inferential analysis (inferential analysis is often known as statistical analysis).
Descriptive analysis is largely the study of distributions of one variable. This study provides us with profiles of companies, work groups, persons and other subjects on any of a multitude of characteristics such as size, composition, efficiency, preferences, etc. This sort of analysis may be in respect of one variable (described as unidimensional analysis), in respect of two variables (described as bivariate analysis), or in respect of more than two variables (described as multivariate analysis). In this context we work out various measures that show the size and shape of a distribution (or distributions), along with measures of the relationships between two or more variables.
We may as well talk of correlation analysis and causal analysis. Correlation analysis studies the
joint variation of two or more variables for determining the amount of correlation between two
or more variables. Causal analysis is concerned with the study of how one or more variables
affect changes in another variable. It is thus a study of functional relationships existing between
two or more variables. This analysis can be termed regression analysis. Causal analysis is considered relatively more important in experimental researches, whereas in most social and business researches our interest lies more in understanding and controlling relationships between variables than in determining causes per se, and as such we consider correlation analysis relatively more important.
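The contrast between correlation analysis and causal (regression) analysis can be illustrated with a brief Python sketch (Python 3.10+ for statistics.correlation and statistics.linear_regression); the data and variable meanings are hypothetical.

```python
# A minimal sketch: Pearson's r measures the joint variation of two variables
# (correlation analysis); the fitted slope expresses the functional
# relationship of y on x (regression analysis). Data are hypothetical.
from statistics import correlation, linear_regression  # Python 3.10+

x = [2.0, 4.0, 6.0, 8.0, 10.0]   # e.g., advertising spend
y = [3.1, 4.9, 7.2, 8.8, 11.1]   # e.g., sales

r = correlation(x, y)                        # correlation analysis
slope, intercept = linear_regression(x, y)   # regression (causal) analysis

print(f"Pearson r = {r:.3f}")
print(f"y = {slope:.2f} x + {intercept:.2f}")
```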
Inferential analysis is concerned with the various tests of significance for testing hypotheses in
order to determine with what validity data can be said to indicate some conclusion or
conclusions. It is also concerned with the estimation of population values. It is mainly on the
basis of inferential analysis that the task of interpretation (i.e., the task of drawing inferences and
conclusions) is performed.
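By way of illustration of inferential analysis, the following Python sketch computes a two-sample t statistic, one common test of significance; the data are hypothetical, and in practice the statistic would be compared against tabled critical values (or a library routine such as scipy.stats.ttest_ind would also supply a p-value).

```python
# A minimal sketch of inferential analysis: a two-sample t statistic for
# testing whether two group means differ. Data are hypothetical.
from statistics import mean, variance
from math import sqrt

group_a = [12.1, 13.4, 11.8, 12.9, 13.0]
group_b = [10.9, 11.5, 12.0, 10.7, 11.3]

na, nb = len(group_a), len(group_b)
# pooled sample variance (assumes equal population variances)
sp2 = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
t = (mean(group_a) - mean(group_b)) / sqrt(sp2 * (1 / na + 1 / nb))

print(f"t = {t:.2f} on {na + nb - 2} degrees of freedom")
```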