Research Methods Lectures Note
Research in common parlance refers to a search for knowledge. One can also define research as a
scientific and systematic search for pertinent information on a specific topic. In fact, research is an
art of scientific investigation. Research is defined as human activity based on intellectual application
in the investigation of matter. In other words, research is the systematic process of collecting and
analyzing information to increase our understanding of the phenomenon under study. Some people
consider research as a movement, a movement from the known to the unknown. It is actually a
voyage of discovery.
Research is an academic activity and as such the term should be used in a technical sense. Research
comprises defining and redefining problems; formulating hypotheses or suggested solutions;
collecting, organizing and evaluating data; making deductions and reaching conclusions; and, finally,
carefully testing the conclusions to determine whether they fit the formulated hypotheses. It is the
pursuit of truth with the help of study, observation, comparison and experiment.
In short, the search for knowledge through objective and systematic method of finding solution to a
problem is research. The systematic approach concerning generalization and the formulation of a
theory is also research. As such the term ‘research’ refers to the systematic method consisting of
enunciating the problem, formulating a hypothesis, collecting the facts or data, analyzing the facts
and reaching certain conclusions either in the form of solution(s) towards the concerned problem
or in certain generalizations for some theoretical formulation.
It may be said that the general aims of research are to observe and describe, to predict, to
determine causes and explain.
Generally research is:
• Systematic - so ordered, planned and disciplined
• Controlled - the researcher can have confidence in his/her research outcomes;
• Empirical - putting beliefs, ideas, or assumptions to a test.
• Critical - many truths are tentative and are subject to change as a result of subsequent research.
OBJECTIVES OF RESEARCH
The purpose of research is to discover answers to questions through the application of scientific
procedures. The main aim of research is to find out the truth which is hidden and which has not
been discovered as yet.
Though each research study has its own specific purpose, we may think of research objectives as
falling into the following broad groupings:
1. To gain familiarity with a phenomenon or to achieve new insights into it (exploratory or
formulative research studies);
2. To portray accurately the characteristics of a particular individual, situation or a group (descriptive
research studies);
3. To determine the frequency with which something occurs or with which it is associated with
something else (diagnostic research studies);
4. To test a hypothesis of a causal relationship between variables (hypothesis-testing research studies).
TYPES OF RESEARCH
The basic types of research are as follows:
i) Descriptive vs. Analytical: Descriptive research includes surveys and fact-finding enquiries
of different kinds. The major purpose of descriptive research is description of the state of affairs
as it exists at present. The main characteristic of this method is that the researcher has no
control over the variables; he can only report what has happened or what is happening. Most ex
post facto research projects are used for descriptive studies in which the researcher seeks to
measure such items as, for example, frequency of shopping, preferences of people, or similar
data. The methods of research utilized in descriptive research are survey methods of all kinds,
including comparative and correlational methods. In analytical research, on the other hand, the
researcher has to use facts or information already available, and analyze these to make a critical
evaluation of the material.
ii) Applied vs. Fundamental: Research can either be applied (or action) research or
fundamental (basic or pure) research. Applied research aims at finding a solution for an
immediate problem facing a society or an industrial/business organization, whereas
fundamental research is mainly concerned with generalisations and with the formulation of a
theory. “Gathering knowledge for knowledge’s sake is termed ‘pure’ or ‘basic’ research.”
Research concerning some natural phenomenon or relating to pure mathematics are examples
of fundamental research. Similarly, research studies, concerning human behaviour carried on
with a view to make generalisations about human behaviour, are also examples of fundamental
research, but research aimed at certain conclusions (say, a solution) facing a concrete social or
business problem is an example of applied research. Research to identify social, economic or
political trends that may affect a particular institution or the marketing research or evaluation
research are examples of applied research. Thus, the central aim of applied research is to
discover a solution for some pressing practical problem, whereas basic research is directed
towards finding information that has a broad base of applications and thus, adds to the already
existing organized body of scientific knowledge.
iii) Quantitative vs. Qualitative: Quantitative research is based on the measurement of quantity
or amount. It is applicable to phenomena that can be expressed in terms of quantity. Qualitative
research, on the other hand, is concerned with qualitative phenomena, i.e., phenomena
relating to or involving quality or kind. For instance, when we are interested in investigating the
reasons for human behaviour (i.e., why people think or do certain things), we quite often talk
of ‘Motivation Research’, an important type of qualitative research. This type of research aims
at discovering the underlying motives and desires, using in depth interviews for the purpose.
Other techniques of such research are word association tests, sentence completion tests, story
completion tests and similar other projective techniques. Attitude or opinion research i.e.,
research designed to find out how people feel or what they think about a particular subject or
institution is also qualitative research. Qualitative research is especially important in the
behavioural sciences where the aim is to discover the underlying motives of human behaviour.
Through such research we can analyze the various factors which motivate people to behave in a
particular manner or which make people like or dislike a particular thing.
iv) Conceptual vs. Empirical: Conceptual research is that related to some abstract idea(s) or
theory. It is generally used by philosophers and thinkers to develop new concepts or to
reinterpret existing ones. On the other hand, empirical research relies on experience or
observation alone, often without due regard for system and theory. It is data-based research,
coming up with conclusions which are capable of being verified by observation or experiment.
We can also call it experimental research. In such research it is necessary to get at
facts firsthand, at their source, and actively to go about doing certain things to stimulate the
production of desired information. In such research, the researcher must first provide himself
with a working hypothesis or guess as to the probable results. He then works to get enough facts
(data) to prove or disprove his hypothesis. He then sets up experimental designs which he thinks
will manipulate the persons or the materials concerned so as to bring forth the desired
information. Such research is thus characterized by the experimenter’s control over the
variables under study and his deliberate manipulation of one of them to study its effects.
Empirical research is appropriate when proof is sought that certain variables affect other
variables in some way. Evidence gathered through experiments or empirical studies is today
considered to be the most powerful support possible for a given hypothesis.
v) Some Other Types of Research: All other types of research are variations of one or more of
the above stated approaches, based on either the purpose of research, or the time required to
accomplish research, on the environment in which research is done, or on the basis of some
other similar factor. From the point of view of time, we can think of research either as one-time
research or longitudinal research. In the former case the research is confined to a single time-
period, whereas in the latter case the research is carried on over several time-periods. Research
can be field-setting research or laboratory research or simulation research, depending upon the
environment in which it is to be carried out. Research can as well be understood as clinical or
diagnostic research. Such research follows case-study methods or in-depth approaches to reach
the basic causal relations. Such studies usually go deep into the causes of things or events that
interest us, using very small samples and very deep probing data gathering devices. The research
may be exploratory or it may be formalized. The objective of exploratory research is the
development of hypotheses rather than their testing, whereas formalized research studies are
those with substantial structure and with specific hypotheses to be tested. Historical research is
that which utilizes historical sources like documents, remains, etc. to study events or ideas of
the past, including the philosophy of persons and groups at any remote point of time. Research
can also be classified as conclusion-oriented and decision-oriented. While doing conclusion-oriented
research, a researcher is free to pick up a problem, redesign the enquiry as he proceeds
and is prepared to conceptualize as he wishes. Decision-oriented research is always for the need
of a decision maker and the researcher in this case is not free to embark upon research according
to his own inclination. Operations research is an example of decision-oriented research since it
is a scientific method of providing executive departments with a quantitative basis for decisions
regarding operations under their control.
When you’re thinking about your research, ask yourself the five ‘WH’ questions:
– What is my research?
Before an attempt is made to start with a research project, a research proposal should be compiled.
Defining the problem is the first step and one of the most difficult in research undertaking. But a
proposal cannot and should not attempt to specify exactly everything that will be done or what is
expected. The basic components of a research proposal are the same in many fields; however, how
they are phrased and staged may vary by discipline.
The following components can be regarded as steps in the writing of the research proposal.
Generally, the basic components of a proposal are described in the order in which they most
logically appear in a proposal.
1. Title page
2. Summary/Abstract
3. Introduction/Background
5. Literature review
6. Hypotheses /Questions
7. Conceptual framework
Study area
Study design
Study subjects
Sample size
Sampling methods
Description of variables
Data quality assurance
Operational definitions
11. Budget
12. References
13. Appendices/Annexes
1. Title Page
The title of the proposal and, later, of the thesis should be a succinct summary of the topic and
generally should not exceed 15 words.
The title should include key terms that readily identify the scope and nature of the study and should
be typed using all capital letters.
A title ought to be well studied, and to give, so far as its limits permit, a definite and concise
indication of what is to come. The title of your research proposal should state your topic exactly in
the smallest possible number of words. Put your name, the name of your
department/faculty/college, the name of your advisor(s) and date of delivery under the title.
2. Summary/Abstract
The abstract is a one page brief summary of the thesis proposal.
It needs to show how your work fits into what is already known about the topic and what new
contribution your work will make.
Specify the question that your research will answer, and establish why it is a significant question.
Do not put information in the abstract that is not in the main text of your research proposal.
Do not put references, figures, or tables in the abstract.
Though it appears at the front of the proposal, it is written last.
Its purpose is to establish a framework for the research, so that readers can understand how
it is related to other research.
The introduction should cite those who had the idea or ideas first, and those who have done
the most recent and relevant work. You should then go on to explain why more work is
necessary.
Provide sufficient references so that a reader could, by going to the library, achieve a
sophisticated understanding of the context and significance of the question.
All cited work should be directly relevant to the goals of the research.
Explain the scope of your work, what will and will not be included.
The most important aspect of a research proposal is the clarity of the research problem. The
statement of the problem is the focal point of your research. It should state what you will be
studying, and what the purpose of your findings will be.
A problem might be defined as the issue that exists in the literature, theory, or practice that leads to
a need for the study.
The problem statement describes the context for the study and it also identifies the general analysis
approach.
Effective statements of the problem answer the question “Why does this research need to be
conducted”.
This section should also include a clear and concise statement of the purpose or goal of the study/
project.
It consists of:-
The specific question(s) to be answered,
An explanation of how the results will contribute to the existing body of knowledge.
5. Literature review
To conduct research regarding a topic means that the researcher has obtained sound knowledge
with regard to the research topic.
It gives an overview of what has been said, who the key writers are, what are the main theories and
hypotheses, what questions are being asked, and what methods and methodologies are appropriate
and useful.
This section need not be lengthy; however, it should be comprehensive.
It should suggest the central subjects in the literature, highlight major areas of difference,
and reflect a critical position towards the materials reviewed.
The literature review develops broad ideas of what is already known in a field, and what
questions are still unanswered.
This process will assist you in further narrowing the problem for investigation, and will
highlight any theories that may exist to support developing hypotheses.
You must show that you have looked through the literature and have found the latest
updates in your field of study in order for a proposal to be convincing to an audience.
Questions and hypotheses are testable explanations that are proposed before the methodology
of a study is carried out, but after the researcher has had an opportunity to develop background
knowledge, for example through the literature review.
Although research questions and hypotheses are different in their sentence structure and
purpose, both seek to predict relationships.
A differences research question asks if there are differences between groups on some
phenomenon.
A relationship question asks if two or more phenomena are related in some systematic
manner.
7. Conceptual framework
This section also includes the specific definitions of the unit of analysis.
This section should describe what the investigator hopes to accomplish with the research.
After reading this section, the reader should be clear about the questions to be asked, the kinds of
answers expected, and the nature of the information to be provided by the proposed research.
o Stated using “action verbs” that are specific enough to be measured.
Commonly, research objectives are classified into general objectives and specific
objectives. The general and specific objectives are logically connected to each other and the
specific objectives are commonly considered as smaller portions of the general objectives. It is
important to determine that the general objective is closely related to the statement of the
problem.
This is also referred to as the strategy for research and it is really the heart of the research
proposal.
You must decide exactly how you are going to achieve your stated objectives: i.e., what new
data you need in order to handle the problem you have selected and how you are going to
collect and process this data.
Indicate the methodological steps you will take to answer every question, to test every
hypothesis illustrated in the Questions/Hypotheses section or address the objectives you set.
• Description of your analytical methods, including reference to any specialized statistical software.
10. Work plan
Work plan is a schedule, chart or graph that summarizes the different components of a research
proposal and how they will be implemented in a clear way within a specific time-span.
A good work plan enables both the investigators and the advisors to monitor project progress
and provide timely feedback for research modification or adjustments.
More often than not, you will need to secure funds from a funding organization to cover the cost
of conducting a research project.
• There might be a need for budget justification of certain costs whose requirement is not obvious.
12. References
• You must give references to all the information that you obtain from books, papers in journals,
and other sources.
• References may be made in the main text using index numbers in brackets (Vancouver style) or
authors’ names (Harvard style).
• You will also need to place a list of references, numbered as in the main text (or alphabetically
ordered), at the end of your research proposal.
• The exact format for showing references within the body of the text as well as at the end of
the proposal varies from one discipline to another.
13. Appendices/Annexes
Include in the appendices of your proposal any additional information you think might be helpful to
a proposal reviewer.
• Dummy tables
Research Design
A research design is the arrangement of conditions for collection and analysis of data in a
manner that aims to combine relevance to the research purpose with economy in procedure.
In fact, the research design is the conceptual structure within which research is conducted; it
constitutes the blueprint for the collection, measurement and analysis of data. As such the
design includes an outline of what the researcher will do from writing the hypothesis and its
operational implications to the final analysis of data. More explicitly, the design decisions
happen to be in respect of:
i) What is the study about?
ii) Why is the study being made?
iii) Where will the study be carried out?
iv) What type of data is required?
v) Where can the required data be found?
vi) What periods of time will the study include?
vii) What will be the sample design?
viii) What techniques of data collection will be used?
ix) How will the data be analyzed?
x) In what style will the report be prepared?
Keeping in view the above stated design decisions, one may split the overall research design into the
following parts:
the sampling design, which deals with the method of selecting items to be observed for the
given study;
the observational design, which relates to the conditions under which the observations are
to be made;
the statistical design, which concerns the question of how many items are to be
observed and how the information and data gathered are to be analysed; and
the operational design, which deals with the techniques by which the procedures specified in
the sampling, statistical and observational designs can be carried out.
CHAPTER 2: PLANNING FOR SAMPLE SURVEY
Introduction
Statistics defined: statistics is the science of data. It involves collecting, organizing, presenting,
analyzing and interpreting numerical information.
Classification of statistics
Descriptive statistics: deals with the methods or procedures used for the collection, organization,
analysis and presentation of data (mode, mean, median, range, variance, standard deviation,
frequency distribution).
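The descriptive measures listed above can be sketched with Python's standard library. This is a minimal illustration, not a prescribed procedure: the list of ages is a made-up sample, and the variance and standard deviation shown are the sample versions (n - 1 denominator).

```python
import statistics
from collections import Counter

# Hypothetical sample: ages of ten survey respondents (illustrative values only).
ages = [23, 25, 25, 29, 31, 31, 31, 35, 40, 50]

summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "range": max(ages) - min(ages),
    "variance": statistics.variance(ages),  # sample variance (n - 1 denominator)
    "std_dev": statistics.stdev(ages),      # sample standard deviation
}

# Frequency distribution: how many times each value occurs.
frequency = Counter(ages)

for name, value in summary.items():
    print(f"{name}: {value}")
print("frequency distribution:", dict(sorted(frequency.items())))
```

Each measure condenses the same ten observations in a different way: the mean, median and mode describe the centre, while the range, variance and standard deviation describe the spread.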
Inferential statistics: it is the set of methods used to generalize from sample to population by
performing hypothesis-testing, determining relationships among estimates of variables and making
predictions.
- Sampling units: for the purpose of sample selection, the population is divided into a finite
number of distinct, non-overlapping and identifiable units called sampling units. For example, in
cluster sampling, clusters are sampling units and subjects in the cluster are elementary units.
- Frame: once a population has been defined, the next step is to establish a means to access it. A
frame provides this means to access it. In its simplest form, a frame is a list of elements covering
the survey population, and serves as a base for sample selection.
- Data: These are measurements or observations (values) recorded for each element.
- Variable: is a characteristic or attribute that can assume different values. For example age,
weight, height, sex, income etc, each of these varies from one element to another.
- Qualitative variables: these are variables that assume values that are not numeric but can be
categorized. Observations obtained from these variables are called categorical data. Examples are
data on gender, religious affiliation, type of occupation, political affiliation, etc.
- Quantitative variables: These are variables that assume values measuring the quantity or
amount of something. Observations obtained through such a process are termed quantitative
data. Observations obtained for weight, height, speed, distance, lifetime of an item, etc. are
examples of quantitative data. These variables can be discrete or continuous. Discrete variables
assume only certain countable values (0, 1, 2 ...) and continuous variables can assume any
measurable value within a given range.
- Population parameters: These are facts about the population. Since parameters are
descriptions of the population, a population can have many parameters.
- Statistic: it is a characteristic or a fact about a sample.
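To make the terms above concrete, the following sketch builds a small frame, attaches one quantitative variable to each element, and compares a parameter with the corresponding statistic. All names and values (the household identifiers, the income figures, the sample size of 20) are hypothetical and chosen only for illustration.

```python
import random
import statistics

# Hypothetical frame: a list of element identifiers covering the survey population.
frame = [f"household_{i:03d}" for i in range(1, 101)]

# Hypothetical variable: monthly income per element (made-up values).
rng = random.Random(42)  # fixed seed so the sketch is reproducible
income = {unit: rng.randint(1000, 5000) for unit in frame}

# Parameter: a fact about the whole population (here, the population mean income).
population_mean = statistics.mean(income.values())

# Statistic: the same fact computed from a sample drawn from the frame.
sample_units = rng.sample(frame, 20)
sample_mean = statistics.mean(income[unit] for unit in sample_units)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```

The parameter is fixed for a given population, whereas the statistic changes with the sample that happens to be drawn.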
Survey is a scientific study that deals with an existing population of units typified by persons,
institutions, or physical objects. A survey attempts to acquire knowledge by observing the
population as it naturally exists and making quantitative statements about aggregate population
characteristics. When we say a study of a population as it naturally exists, it is to mean that we
exclude experimental studies in which the material to be studied is manipulated by the researcher
and the result is observed.
Census Survey: Survey which considers all members of the population in the study.
Sample Survey: A survey that considers a specific portion of the population in the study.
In undertaking surveys, it is difficult or even impossible for researchers to study very large
populations. Hence, they select a smaller portion, a sample of the population, for study. Researchers
who apply sample surveys use sampling techniques and use the information they collect from the
sample to make inferences about the population as a whole. When sampling is done properly, the
inferences that we make concerning the population can be quite reliable.
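The reliability of sample-based inference can be illustrated by simulation. The sketch below assumes a made-up population of 10,000 values and uses simple random sampling without replacement; the sample sizes (25 and 400) and the number of repetitions (200) are arbitrary choices for the illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 values (e.g. household sizes), made up for illustration.
population = [rng.randint(1, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

def sample_mean(size):
    """Mean of one simple random sample drawn without replacement."""
    return statistics.mean(rng.sample(population, size))

# Draw many independent samples at two sample sizes and record, for each size,
# the average of the estimates and how much the estimates vary.
results = {}
for size in (25, 400):
    estimates = [sample_mean(size) for _ in range(200)]
    results[size] = (statistics.mean(estimates), statistics.stdev(estimates))

print(f"population mean: {true_mean:.2f}")
for size, (centre, spread) in results.items():
    print(f"n={size}: average estimate={centre:.2f}, spread of estimates={spread:.3f}")
```

In this sketch the sample means cluster around the population mean, and the spread of the estimates shrinks as the sample size grows, which is the sense in which a well-drawn sample supports inference about the whole population.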
Sample surveys are used to develop, test, and refine research hypotheses in different disciplines such
as sociology, social psychology, demography, political science, economics, education, and public
health. Central government makes considerable use of surveys to inform citizens about
employment, unemployment, income and expenditure, housing conditions, education, nutrition,
health, travel patterns, etc.
For a survey to yield desired results there is need to pay particular attention to the preparations that
precede the field work. In this regard all surveys require careful and judicious preparations if they
have to be successful. However, the amount of planning will vary depending on the type of survey,
materials and information required. The development of an adequate survey plan requires sufficient
time and resources and a planning cycle of two years is common for a complex survey.
Conducting a good sample survey requires careful planning, implementation and analysis if it is to
yield reliable and valid information. The planning of a sample survey has three major steps.
Preparation of sampling frames
Decision on types of survey
Preparing survey instruments
Planning time table
Survey budget proposal
Conducting pilot survey
b) Data collection
Organization of field work
Locating respondents
Collecting information
c) Survey analysis
Data processing (data files, structures, checking, coding, entry, etc.)
Performing statistical analysis (descriptive or inferential statistics)
Presenting methods and findings in study report
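The checking and coding steps of data processing listed above might look like the following sketch. The codebook, the plausible age range and the raw records are all hypothetical; a real survey would define them in its own data-processing plan.

```python
# Hypothetical codebook mapping questionnaire answers to numeric codes.
CODEBOOK = {"sex": {"male": 1, "female": 2}}
VALID_AGE = range(0, 121)  # assumed plausible age range for the check

# Hypothetical raw records as they might arrive from the field.
raw_records = [
    {"id": 1, "sex": "Female", "age": "34"},
    {"id": 2, "sex": "male", "age": "210"},    # implausible age
    {"id": 3, "sex": "unknown", "age": "27"},  # answer not in the codebook
]

def process(record):
    """Return (coded_record, problems) for one raw questionnaire record."""
    problems = []
    # Coding: translate the text answer into its numeric code.
    sex_code = CODEBOOK["sex"].get(record["sex"].strip().lower())
    if sex_code is None:
        problems.append("sex: value not in codebook")
    # Checking: the age must be a whole number inside the plausible range.
    age = int(record["age"]) if record["age"].isdigit() else None
    if age is None or age not in VALID_AGE:
        problems.append("age: missing or out of range")
        age = None
    return {"id": record["id"], "sex": sex_code, "age": age}, problems

coded, errors = [], {}
for record in raw_records:
    row, problems = process(record)
    coded.append(row)
    if problems:
        errors[record["id"]] = problems

print(coded)
print(errors)
```

Keeping the problems in a separate structure, keyed by record identifier, lets questionable records be referred back for correction instead of being silently dropped.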
Source of Data
Statistical data may be classified as two basic types: primary and secondary data.
Primary (original) data refers to those data which are collected to meet the specific problem
needs at hand. These are data collected by immediate user(s). It is these data that we will normally
be referring to when we talk about “data collection”. The nature and type of primary data required
largely depend on the study objective and vary from one field to another.
Secondary data refers to already existing information which has previously been collected and
reported by some other individual or organization for their own purposes, and which at a later stage
is, at least in part, made available to other individuals and organizations. It can be obtained
rapidly and inexpensively.
Sometimes data requirements can be met from available secondary sources in which case there is no
need to execute survey and generate primary data. The general rule is to exhaust all possible means
to explore secondary data before deciding on primary data. In particular, secondary data may
provide a context (geographical, temporal, social) for primary data, provide validation that allows us to
assess the quality and consistency of the primary data, or act as a substitute for primary data.
Administrative records are the main source of secondary data, but there are various other internal and
external sources, such as records, reports, books, periodicals, newspapers, and academic studies.
Apart from saving time and cost, secondary data is less subject to intentional bias and may be the only
alternative when the information is inaccessible, that is, impossible to gather through a primary data
approach.
One of the subtle skills of information management is to know what secondary data is available in a
given field. Expert researchers develop a comprehensive knowledge of secondary data sources.
Willingness to use such data appropriately is also a hallmark of good research. One can group
secondary data sources into two categories: official and unofficial. The former comprises all
information collected, processed and made available by legally constituted organizations, primarily
by government departments and statutory authorities, and the latter comprises all other forms of
secondary data.
Official statistics abound in all developed societies and some developing countries. Governments
collect data about their operations usually for valid reasons such as for administrative purposes and
efficient short and long-term planning.
Government Organization
Individual government departments, both state and federal, collect statistics about their
respective responsibilities, and many are published in some form. This may include annual reports,
regular annual statistical analyses or occasional reports. Similarly, statutory authorities are also
required to report to state and federal parliament, but increasingly present data about their
operation as a component of public relations.
In most countries government bodies collect administrative data which can be used for production
of statistics needed for their own use and for incorporation in the system of official statistics. In
developed countries, a great part of demographic and social statistics is derived from such data for
example statistics on vital events, education, health, criminality, transport, and communication,
etc. Essential parts of economic statistics are also based on administrative data, for instance,
foreign trade statistics and data on the production or sale of commodities subject to excise taxes.
The premier generator and collector of publicly available secondary data is the national statistical
office. The federal statutory authority is responsible for the independent collection and analysis of
statistics, particularly in the economic and social fields. The principal output is a large number of
periodic surveys, issued in various forms on a weekly, monthly, quarterly, or annual basis. The
following table is a summary of the major types of periodic data collected by statistical offices
Census
The national statistical office is also responsible for the largest regular exercise, undertaking the
population and housing census or other censuses. This provides such a wealth of social and
economic data and is so widely used as a source of background data in geography, sociology, and
economics that all researchers should have a solid understanding of its organization and content.
Private research results: many corporations and private consultants generate potentially useful
data. The major problem is access; the commercial confidentiality that is usually thought to attach
to such materials means that they are rarely available for public scrutiny.
Research reports, research papers, textbooks: the whole gamut of academic and public publications forms a
major body of secondary data sources; we might even see textbooks as ‘tertiary’ sources. The
expectation that research will be freely published means that most research is made available,
although not always readily accessible.
Opinion polls: the process of collecting public opinion via surveys and questionnaires has been
developed to a high level of sophistication in recent decades, and the results of such surveys, if
made public, can be important sources.
Market research: like opinion polls, most market research is carried out by private organizations
on behalf of specific clients. Depending on the requirements of the client, the results of such
surveys may or may not be made public. The commercial orientation of most surveys tends to
limit their general applicability.
Online databases: increasingly, secondary sources are being made available in electronic form.
These data may range from bibliographic information in online reference databases to census data.
Allows the researcher to extend the ‘time base’ of his/ her study by providing data about
the earlier state of system being studied
Eliminates the time consuming analysis stage
Disadvantages
The method used in the collection of the data is often unknown to the user of the data.
We may have little or no direct knowledge of the processing methods employed i.e. it
could result in lack of accuracy
It quickly becomes outdated in an ever changing environment
Differences in classification or measurement
Several types of inquiry for the gathering of statistical data on topics can be distinguished.
Census: this is an investigation that covers every individual in the population being studied.
Census is characterized by four essential features:
Individual enumeration of all units, which means it counts each and every unit in the
designated territory
Universality within a defined territory, which implies every unit is required to be included in
the census
Simultaneity, to express the population with reference to a point of time
Defined periodicity to assess changes in population, i.e. a census is taken at regular intervals
so that comparable information is made available in a fixed sequence.
The best known examples are national censuses of population and housing, agriculture, and industry,
which are conducted by many countries on a regular basis. Since these censuses aim at exhaustive
coverage of all units of the population of interest, they are usually massive operations and are
therefore conducted at regular intervals of five or ten years. Although a census aims at enumerating
all units in the country, this is seldom fully achieved in practice. Units may be accidentally or
erroneously omitted from a census for various reasons, and certain well-defined categories of units
may be deliberately excluded from the scope of the census. The existence of either omissions or
exclusions does not turn a census into a sample survey.
A census can also relate to a much smaller and more specific population for example
Sample survey: this is an investigation in which only part of the population is studied. It is
appropriate mainly when resources are not sufficient or when it is not feasible to consider the whole
population in terms of time and level of treatment. The information is gathered through a sample
survey and the result is generalized to the population.
A sample survey can be of any size. It can be a large-scale operation such as a national demographic
survey for general planning purposes, or it can be a very small-scale investigation such as a sample
survey of farmers in a project area. Usually the selection is made at random, so that the sample can
be expected to be representative of the population.
In sample surveys, it is useful to distinguish between the target population and the survey
population. The former is the population defined at the planning stage, for which results are
expected; the latter is the population actually covered during implementation. The difference arises
from the exclusion of some units from the survey because of non-coverage and non-response.
Rapid methods: these include a variety of investigation techniques used to obtain rapid and
sometimes qualitative information. They are used either when little time and few resources
are available and limited information is still useful, or when no data are initially available and a
quick preliminary inquiry is necessary to provide direction for further study. It is also feasible to
use these methods when other methods are not technically appropriate.
In most social sciences, five commonly used rapid methods are key informant interviews, focus
group discussions, community/group interviews, direct observation and informal surveys.
These methods typically involve an investigator or team working in the study area, observing,
measuring or asking about the characteristics of interest. The observation may be direct or indirect,
for instance, quantification of the crop mixtures found in an area, or of the incidence of crop failure
due to pests or drought. It may alternatively be by interviewing informants, either selected at random
or chosen as being knowledgeable about the subject of the study. Local leaders, prominent farmers,
or the elderly of the community are possible examples.
Though these methods are applied for different purposes, it is important to know their
limitations. The main limitations include:
The reliability and validity of the information are questionable in many instances due to factors
such as informal sampling, individual bias of the investigator/interviewer, and the difficulty of
recording, coding and analyzing qualitative data.
Qualitative methods do not generate quantitative data from which generalizations can be made
for the whole population.
The general credibility of these methods is low compared to formal survey methods.
Case study: this is an inquiry in which a small number of study units are investigated in great
detail. Selection of units is not necessarily on a random basis. The focus of a case study is on the
detailed structures and patterns of inter-relationships observed within each individual case included
in the study. For example, an investigation of the allocation of rural women’s time to different
activities may use a detailed time budget approach in order to arrive at an appropriate definition and
classification of economic activities.
Experimentation: this is a controlled method of observation in which the value of one or more
independent variables is changed in order to assess its causal effect on one or more dependent
variables. That is, a stimulus is applied to a subject, and the effect is observed. Experiments can be
conducted in many settings, such as laboratory experiments and field experiments.
CHAPTER 3: PREPARATION OF SAMPLING FRAMES
3.1 Definition
In its simplest form, a sampling frame is a listing of the units from which the sample selection is to
be made at any stage of sampling. The units in the frame may be either area units or units of objects
covering the items being studied in the survey. They may be large or small areas, households,
persons, farmers, or any other identifiable items, and the corresponding frames are generally known
as area frames or list frames.
The frame consists of materials, procedures, and devices that identify, distinguish, and allow access
to the elements of the target population. The frame is composed of a finite set of units to which the
probability sampling scheme is applied. Rules or mechanisms for linking the frame unit to the
population elements are an integral part of the frame. The frame also includes auxiliary information
(measures of size, demographic information) used for special sampling techniques, such as
stratification and probability proportional to size sample selection, or special estimation techniques
such as ratio and regression estimation.
In multistage sampling, the sampling units used at the first stage of sampling are called primary
sampling units (PSUs). Those used at the final (ultimate) stage are called ultimate sampling units
(USUs). In designs with three or more stages, the units used at the intermediate stages are called
secondary or second stage sampling units (SSUs), third stage sampling units, and so on. Therefore,
for multistage sample designs, a frame is needed for each stage of selection. For
example, in a three stage design for a household survey the sampling units are PSUs: districts
(weredas), SSUs: EAs (kebeles), and USUs: housing units (households).
Any sampling frame used for the first stage of selection must cover the entire survey population
(the designated PSUs). At subsequent stages of selection, frames are needed only for the sample units
selected at the preceding stage. In the above case, a list of districts (weredas) would be needed for
first stage sample selection. Lists of EAs would be needed for the second stage, but only for the
sampled districts. For the final stage, lists of housing units (households) are needed only for the
sampled EAs (kebeles). In this study the term secondary sampling frame will be used for frames that
are developed specifically for the second and subsequent stages of sample selection.
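The stage-by-stage need for frames can be illustrated with a minimal Python sketch. It assumes a hypothetical three stage hierarchy of districts, EAs and housing units; the unit names, the counts and the `list_units` helper are invented for illustration and do not come from any real survey.

```python
import random

# Hypothetical first-stage frame: it must cover every PSU (district).
district_frame = ["D1", "D2", "D3", "D4", "D5"]

def list_units(parent, count):
    """Stand-in for field listing: builds a secondary frame for one sampled parent unit."""
    return [f"{parent}-{i}" for i in range(1, count + 1)]

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Stage 1: sample districts from the complete first-stage frame.
sampled_districts = rng.sample(district_frame, 2)

# Stage 2: EA frames are prepared only for the sampled districts.
ea_frames = {d: list_units(d, 4) for d in sampled_districts}
sampled_eas = [ea for d in sampled_districts for ea in rng.sample(ea_frames[d], 2)]

# Stage 3: household frames are prepared only for the sampled EAs.
household_frames = {ea: list_units(ea, 10) for ea in sampled_eas}
sampled_households = [h for ea in sampled_eas for h in rng.sample(household_frames[ea], 3)]

print(len(ea_frames), len(household_frames), len(sampled_households))  # 2 4 12
```

Only two EA frames and four household frames are ever built, although the population in this sketch has five districts; this is the saving that secondary sampling frames provide.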
The choice of suitable frames for all stages of sample selection is a crucial aspect of the design for
surveys. The population coverage, the stages of sampling, the stratification used, the process of
selection itself: every aspect of the design is influenced by the sampling frames. Key considerations
in the choice of sampling frames, regardless of the stage of sampling for which they are used, include
the following: intended use, frame units, coverage, media, content, and additional information.
Intended use: the sampling frames are used for sample selection and for making estimates based
on sample data. The choice of the sampling method to be used at each stage of selection is limited
by the information available for each frame unit at that stage. If the information consists only of
attributes (e.g. urban/rural classification, identification of higher level units), it is necessary to use
an equal probability selection method with or without stratification. However, if quantitative
information or measure of size (e.g. counts of persons or households from a recent census) is
available for all or virtually all frame units, this information can be used in connection with sample
selection or estimation, or both.
Frame units: frame units are sampling units included in the frame. The kinds of units in frames
used for surveys include:
Area units such as administrative subdivisions, census enumeration areas, land areas
(segments), and others. Area units cover specified land areas with defined boundaries.
Non-area units include housing units, households, persons, nomadic tribes, institutions,
construction camps, and other items, and these units must have a clear definition.
Coverage: the coverage objective of the frames used for a survey is to provide access to the
elementary units in the survey population and to do so in such a way that every one of those units
has a known (or knowable) probability of selection in the sample for the survey. Access is achieved
by sampling from the frames, usually through two or more stages of selection, and by the use of
rules of association that link the elementary units to the units that were selected at the final stage
of selection, i.e., the USUs.
Media: sampling frames may be stored on either print or electronic media. For a frame stored on an
electronic medium, it is relatively easy to produce a printout of the entire frame or any portion
desired, and to organize it in any desired format.
Content: the frame contains a record for each frame unit. The only item that is absolutely
indispensable is a unique identifier for each unit. If a unit is selected, the numerical identifier
provides a means of access to the unit in order to perform subsequent sampling operations or to
collect survey data. The numerical identifier will be linked with other identifiers, such as place
names or addresses of housing units, either in the frame itself or in maps or other auxiliary materials.
Additional information: there are a number of possible reasons for collecting additional
information during the construction of a sampling frame. One occurs when the definition of the
universe or the sampling unit to be covered is rather complicated to apply under field conditions.
In that case, classificatory information is gathered during the frame listing, and the final decision
on which units are to be excluded or included can be made at a later stage. Another common reason
is stratification and allocation, for which the stratifying information must be gathered
and recorded during the frame listing.
The properties of a frame can be grouped into three major categories: those related to quality, those
related to efficiency and those related to cost.
Other properties of a frame that facilitate the use of efficient sample designs include:
• Choice of sampling units available - organizing the frame units in a hierarchical structure and
assigning identifiers to sampling units
• Good quality maps of units available - showing the boundaries of each unit
• Easy to manipulate/process - computerization of the frame
Above all, the cost of frame preparation must be considered at the planning stage and must be
budgeted for. These costs are likely to be a significant proportion of the total cost and relate to an
element of the survey work which is critically important in determining the eventual quality of the
survey results.
In summary, the sampling frame plays a central role in the design of a sample survey. It
determines how well a population is covered, affects the method of enumeration and influences
the efficiency with which a sample is designed. A frame becomes more valuable if it contains some
supplementary information, which can be used to improve sampling and estimation procedures.
The structure of the frame, the information it contains, and the quality of that information will
determine the type of sample designs and estimation procedures that can be used in a survey.
Simple frames lacking auxiliary information support simple sample designs. For example, if the
list contains no information other than the identity of the element, typically very simple sample
designs are used for selecting the sample. A simple random sample may be selected, or if the list
is large, a systematic sample or a systematic sample of clusters may be used.
Many sample designs use auxiliary data to produce more efficient samples. Complex sample
designs that are more efficient than simple random sampling, such as those employing
stratification, probability proportional to size selection, or special estimation techniques
such as ratio and regression estimators, require additional information beyond the identity of the
target element.
The sampling frame must be accurate and free from defects. It should be exhaustive (no units
omitted), non-repetitive and current; the units should be identifiable without ambiguity, and the
listed units must be traceable in the field.
CHAPTER 4: SAMPLE DESIGN
The general aim of all sampling methods is to obtain a sample that is representative of the target
population. By this we mean that, as much as possible, the information derived from the sample
survey is the same as we would find if we carried out a census of the target population, allowing for
inevitable variation in the estimates due to imprecision.
When selecting a sampling method we need some minimal prior knowledge of the target population;
with this and some reasonable assumptions we can estimate the sample size required to achieve a
reasonable estimate, with acceptable precision and accuracy, of population characteristics.
How we actually decide which sampling units will be chosen makes up the sampling method. Sampling
methods can be categorized according to the approach they take to the probability of a particular
unit being included. Most sampling methods attempt to select units such that each has a definable
probability of being chosen. Moreover, most of the methods also attempt to ensure that each unit
has the same chance of being included as every other unit in the sample frame. All methods that
adopt this general approach are called probability sampling methods.
The basis of probability sampling is the selection of sampling units to make up the sample based on
the chance that each unit in the sample frame will be included. If we have 100 units in the frame
and decide on a sample of size 10, the probability of each unit being selected is one in ten,
assuming each unit has the same chance. As we shall see next, there are
various methods that we can use to select the units.
It is an important feature of probability sampling that each time we apply the same method to the
same sample frame we will generally obtain a different sample. For a finite population we can use
simple combinatorial arithmetic to calculate how many samples we can draw from a particular sample
frame such that no two samples are identical. It turns out that from any population of $N$ objects
we can draw $\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$ different samples, each containing $n$ sampling
units. In fact, in probability sampling we are concerned with the probability of each sample being
chosen rather than with the probability of choosing individual units. If each sample is equally likely
to be selected, then each sampling unit automatically has the same chance of being included as every
other sampling unit.
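The counting argument can be checked with a short Python sketch. It uses the frame of 100 units and the sample of 10 from the earlier example; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

N, n = 100, 10  # frame size and sample size from the example in the text

# Number of distinct samples of size n (without replacement, order ignored).
num_samples = comb(N, n)

# Samples containing one particular unit: fix that unit, choose the other n-1 from N-1.
samples_with_unit = comb(N - 1, n - 1)

# If every sample is equally likely, a unit's inclusion probability is the ratio.
inclusion_probability = Fraction(samples_with_unit, num_samples)

print(num_samples)
print(inclusion_probability)  # 1/10
```

The ratio reduces to n/N, which is the "one in ten" chance quoted above.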
4.1 Choice of Sample Design
A sample design is a joint effort of the survey statistician and other experts such as subject matter
specialists, data users, and survey executing agencies. Most statisticians require information from
other experts in order to propose a sample design that will meet the required specification of the
users at the lowest possible cost. The issues on which they should reach agreement include the
objectives of the survey, the variables to be measured, the type of estimates required, the levels of
reliability and validity needed for the estimates, and any restrictions placed on the survey with
respect to timeliness and cost.
b) Sampling Plan
There are different ways of designing a sample survey, but the idea of an optimum design starts with
the features of the sampling plan, namely the selection process and the estimation procedure. The
selection process deals with the preparation of the sampling frame, sample size determination, the
choice of design to be used, and the sample selection method. The estimation procedure involves
computing sample statistics and calculating the reliability of these estimates. The purpose is to
develop a sample design that would meet reliability requirements at the lowest possible cost, or
alternatively, to produce the most reliable estimates for a fixed expenditure of resources.
i) Selection Process
After making an assessment of survey objectives, the kind of topics to be covered, description of
coverage, reporting levels, and other issues as discussed above, the next step in the selection
process is to make a choice of design.
Choice of sampling method: there are different types of sampling methods, which are likely to be
appropriate for different types of survey and in different circumstances. They vary from the simplest
kind of sample survey (simple random sampling, SRS) to more complex large-scale sample survey
designs (multi-stage sample designs). In general, there are two approaches to sampling stages: single
stage sample design (un-stratified/stratified) and multi-stage sample design (un-stratified/stratified).
Un-stratified single stage sample design involves sampling techniques such as simple random sampling,
systematic sampling, varying probability sampling (probability proportional to size PPS) and cluster
sampling. Estimates of the sample size required to obtain measures with a given precision will often
be determined for each design by considering the objective of the inquiry and the permissible
margin of error in the estimates.
Stratified single stage sampling deals with stratified simple random sampling and stratified varying
probability sampling. In stratified sampling the population is sub-divided into a number of groups
called strata, and sampling is carried out independently in each stratum. One can then use
different selection schemes such as SRS, PPS, etc. in different strata. In determining the sample size
and allocating samples to different strata one should take into account the size of the strata, the
total cost of the survey, and the variability within the strata.
The following are the most common designs used at a single stage or at any sampling stage of a
survey.
Simple random sampling: this is the simplest kind of sampling method. It requires as a sampling
frame a list of sampling units (households, farmers, institutions, etc.) in any convenient order. The
items listed must be numbered in sequence, starting with one for the first item at the head of the
list and continuing up to as many as there are items listed. A table of random numbers is needed to
obtain a random selection of these items; the items whose numbers are selected form the
sample chosen for the survey. The use of random numbers ensures that the sample units are chosen
entirely by chance, without being influenced by any person’s unconscious preferences. In a table of
random numbers, each number within the chosen range has an equal chance or probability of
selection. Since each element in the sample frame is given one number, each unit has an equal chance
of selection for the sample. The sampling could be performed with or without replacement.
Provided that certain conditions are met, a linear systematic sample (LSS) can be treated just like
an SRS for the purpose of analysis. The basic requirement of this method is that the list used as the
sampling frame must not have any intrinsic regularity or periodicity in its units. It is therefore
necessary to check whether the frame to be used has built-in regularity of this kind.
It is, however, perfectly acceptable for the frame to be used for an LSS to be ordered or ranked
overall in some way. Ordering is indeed advantageous as this can have the effect of making the
sample more efficient.
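As a rough illustration, the sketch below assumes the usual textbook definition of linear systematic sampling, which these notes do not spell out: with a frame of N units and a sample of n, take the interval k = N // n, choose a random start r between 1 and k, and select units r, r+k, r+2k, and so on. The function name and the frame are hypothetical.

```python
import random

def linear_systematic_sample(frame, n, seed=None):
    """Select every k-th unit after a random start, with k = N // n."""
    N = len(frame)
    if n <= 0 or n > N:
        raise ValueError("n must be between 1 and the frame size")
    k = N // n                  # sampling interval
    rng = random.Random(seed)
    r = rng.randint(1, k)       # random start between 1 and k (1-based)
    # Units r, r+k, r+2k, ..., r+(n-1)k in 1-based numbering.
    return [frame[r - 1 + j * k] for j in range(n)]

# Hypothetical frame of 100 numbered units, sample of 10.
frame = list(range(1, 101))
sample = linear_systematic_sample(frame, 10, seed=3)
print(sample)
```

Because the selected positions are exactly k apart, a frame whose units repeat with a period related to k would give a sample made up of the same kind of unit throughout, which is why the frame must be checked for periodicity.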
Varying probability sampling: this method utilizes the values of an auxiliary variable, such as a
measure of size, which varies from unit to unit. Using this measure of size, the selection is easily
performed with PPS. A list of units with their estimated sizes, say $M_i$, is required, and we cumulate
the values against each unit. Then a sample of predetermined size $n$ is selected by using SRS or
systematic sampling. If SRS is to be used, we select random numbers between 1 and
$\sum_{i=1}^{N} M_i$ until we obtain the required sample. To apply systematic sampling, we select a
random number between 1 and $I = \frac{\sum_{i=1}^{N} M_i}{n}$. If the selected random number is
$r$, then the sampling numbers will be $r, r+I, r+2I, \ldots, r+(n-1)I$. Note that the probability of
selecting unit $i$ at a single draw is $\pi_i = \frac{M_i}{\sum_{i=1}^{N} M_i}$.
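The systematic variant of PPS selection can be sketched in Python as follows. This is a minimal illustration with made-up sizes; the function name `pps_systematic` is hypothetical, and taking the random start as a whole number between 1 and the integer part of I is an assumption made to mirror the description above.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_systematic(sizes, n, seed=None):
    """Return the indices of units selected by systematic PPS on cumulated sizes."""
    cumulative = list(accumulate(sizes))   # running totals of M_i
    total = cumulative[-1]
    interval = total / n                   # I = sum(M_i) / n
    if interval < 1:
        raise ValueError("n is too large for the total measure of size")
    rng = random.Random(seed)
    r = rng.randint(1, int(interval))      # random start between 1 and I
    points = [r + j * interval for j in range(n)]
    # A unit is selected when a sampling number falls in its cumulative range.
    return [bisect_left(cumulative, p) for p in points]

# Hypothetical measures of size for six units (total 300, so I = 100 for n = 3).
sizes = [20, 50, 30, 100, 40, 60]
selected = pps_systematic(sizes, 3, seed=11)
print(selected)
```

In this sketch the fourth unit (index 3) has a size equal to the interval, so it is selected whatever the random start; a unit larger than the interval could be selected more than once.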
Cluster sampling: clusters can be defined as sampling units containing several elements that occur
in groups naturally or are formed artificially. A cluster has listing units associated with it, in
which the units can be geographical, temporal, or spatial in nature. Thus, cluster sampling can be
defined as any sampling plan that uses a frame consisting of clusters of listing units. In single
stage sampling, we select a sample of clusters and completely cover all units within the selected
clusters. Clusters can be selected by a variety of sampling techniques. For example, we can select a
sample of clusters by SRS, systematic sampling or sampling with PPS.
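Single stage cluster sampling with SRS of clusters can be sketched as below. The cluster names, the household labels and the function name are hypothetical and serve only to show that every listing unit in a selected cluster enters the sample.

```python
import random

# Hypothetical frame of clusters (e.g. villages), each with its listing units.
clusters = {
    "V1": ["V1-h1", "V1-h2", "V1-h3"],
    "V2": ["V2-h1", "V2-h2"],
    "V3": ["V3-h1", "V3-h2", "V3-h3", "V3-h4"],
    "V4": ["V4-h1"],
    "V5": ["V5-h1", "V5-h2", "V5-h3"],
}

def single_stage_cluster_sample(clusters, m, seed=None):
    """Select m clusters by SRS and take every listing unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)
    units = [u for c in chosen for u in clusters[c]]
    return chosen, units

chosen, units = single_stage_cluster_sample(clusters, 2, seed=7)
print(chosen, len(units))
```

Only the list of clusters is needed as a frame; the households inside a cluster have to be listed only once that cluster has been chosen.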
The important reasons for cluster sampling are feasibility and economy. If the only sampling frame
readily available for the target population is a list of clusters, then the only feasible method of
sampling is cluster sampling. For surveys of human populations, compiling a complete list of
households for the purpose of the survey rarely seems feasible in terms of time and resources.
Listing costs and travelling costs are generally lowest for cluster sampling.
The disadvantage of cluster sampling is that the standard errors of estimates obtained from this
design are often higher than those obtained from samples of the same number of listing
units chosen by other sampling designs. Therefore, one should choose the sampling design that gives
the lowest possible standard error at a specified cost or, conversely, the sampling design that
yields estimates with a specified error at the lowest cost.
Stratified random sampling: on occasion we may suspect that the target population actually
consists of separate (non-overlapping) sub-populations, each of which may have, on average,
different values for the properties we are studying. If we ignore this possibility, the population
estimates we derive will be a sort of average over the sub-populations and may therefore be
meaningless. Thus, there are various reasons for stratification, and one must investigate these issues
in detail before resorting to it. The reasons could be to increase precision, because separate
estimates are required, for administrative convenience, or because the nature of the population
forces its use.
In these circumstances, we should apply sampling methods that take such sub-populations into
account. It may turn out, when we analyze the results, that the sub-populations do not exist, or
that they exist but the difference between them is not significant, in which case we will have wasted
a certain (minimal) amount of time during the sampling process. If, on the other hand, we do not
take this possibility into account, we will have reduced confidence in the accuracy of our population
estimates.
The process of splitting the population into sub-populations is termed stratification, and such
techniques are called stratified sampling methods. In all stratified methods, the population ($N$) is
first divided into $L$ mutually exclusive and exhaustive sub-populations (strata) with sizes
$N_1, N_2, \ldots, N_L$, where $N_1 + N_2 + \cdots + N_L = N$. Usually the strata are of equal size
($N_1 = N_2 = \cdots = N_L$), but we may decide to use strata whose relative size reflects the
estimated proportions of the sub-populations within the whole population.
Within each stratum we select a sample ($n_1, n_2, \ldots, n_L$), usually ensuring that the
probability of selection is the same for each unit within a stratum. This generates a stratified
random sample. For this design, one should consider the determination of the overall sample size
($n$) and the allocation of the sample to different strata by taking into account the size of the
strata, the total cost of the survey, and the variability of the strata.
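As one possible illustration of allocation, the sketch below uses proportional allocation, $n_h = n N_h / N$. This is only one common rule and is an assumption here: it takes account of stratum size but ignores cost and variability, which the text also lists as considerations. The function name and the stratum sizes are hypothetical.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample n to strata in proportion to stratum size N_h."""
    N = sum(stratum_sizes)
    exact = [n * N_h / N for N_h in stratum_sizes]
    alloc = [int(x) for x in exact]    # round down first
    # Give the remaining units to the strata with the largest fractional parts.
    leftover = n - sum(alloc)
    order = sorted(range(len(exact)), key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in order[:leftover]:
        alloc[h] += 1
    return alloc

print(proportional_allocation([500, 300, 200], 50))  # [25, 15, 10]
```

Rounding down and then distributing the remainder keeps the stratum samples adding up to exactly n.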
The sample size for a survey must be decided at the planning stage, together with the sample
design. If done properly, the correct estimation of sample size is a significant statistical exercise.
The sample size required depends upon three factors: the level of precision required in the
estimates, which requires specifying the acceptable margin of error and confidence level; the level
of variability of the variables to be estimated, which could be measured by standard errors or
coefficients of variation; and the sample design used, since different designs will produce
different levels of precision for the same sample size or, conversely, require different sample sizes
for the same level of precision.
Sometimes we bypass the statistical process by adopting an ad hoc approach of using a fixed sample
proportion (such as 10% of the population size) or sample size (such as 100). In a relatively large
population (say at least 2000) this will normally produce results that are no worse than those
produced by a sample based on a carefully calculated sample size (provided the sample units that
make up the 10% sample are properly selected so that they are representative of the population).
The basis for calculating sample size is that there is a minimum sample size required for a given
population to provide estimates with an acceptable level of precision. Any sample larger than this
minimum requirement (if chosen properly) should produce results that are no less precise, but not
necessarily more precise, than those from the minimum sample. This means that although we may choose
a large sample for other reasons, there is no statistical basis for thinking that it will certainly
produce better results. On the other hand, a sample size less than the minimum requirement will
almost certainly produce results with a lower level of precision. Again, there may be other external
factors that make it necessary to use a sample below this minimum. If the sample size is too small,
the estimates will be too imprecise. And if the sample size is too large, there will be more work,
but no necessary increase in precision.
But remember that we are primarily interested in accuracy. Our aim in sampling is to get accurate
estimates of the population characteristics from measuring sample characteristics. The main
controlling factor in deciding whether the estimate will be accurate is how representative the
sample is. Using a small sample increases the possibility that the sample will not be representative,
but a sample that is larger than the minimum sample size requirement does not necessarily increase
the probability of getting a representative sample. As with precision, a larger than necessary sample
may be used, but is not justified on statistical grounds. Both an appropriate sample size and a
proper sampling technique are required. If the sampling process is carried out correctly using an
effective sample size, the sample will be representative and the estimates it generates will be useful.
Assumptions
Estimates produced by a set of samples from the same population are normally distributed. A well
designed random sample is the sampling method that will most usually produce such a distribution.
We can decide on the required accuracy of the sample estimates. For example, if we decide the
accuracy to be ±5%, the estimated value must be within five percent either way of the true value,
i.e. within the margin of error defined.
We can provide a value for the population variance ($\sigma^2$) of the variable being estimated.
This is a measure of how much variation there is within the population in the value of the property
we are trying to estimate. In general, we need a large sample to accurately estimate something very
variable, whilst something that has a similar value for all members of the population will require
a considerably smaller sample. As we shall discuss shortly, although we almost never have a value
for the population variance, there are various ways of obtaining an estimate for use in
calculating sample size.
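To show how the three assumptions combine, the sketch below uses one widely taught formula for estimating a mean under simple random sampling from a large population, $n = (z\sigma/e)^2$. It is offered as an assumption for illustration, not necessarily the formula these notes go on to use: $z$ is the normal critical value for the chosen confidence level, $\sigma$ the assumed population standard deviation, and $e$ the margin of error in the same units as the variable. The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def min_sample_size(sigma, margin, confidence=0.95):
    """n = (z * sigma / margin)^2, rounded up, for estimating a mean under SRS."""
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

# Assumed standard deviation of 20 units and a margin of error of 5 units.
print(min_sample_size(sigma=20, margin=5))  # 62
```

Halving the margin of error roughly quadruples the required sample, while a less variable population (smaller sigma) needs a much smaller one, in line with the third assumption above.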
Based on these assumptions, several formulae have been developed for estimating the
minimum sample size. The other thing that needs attention in sample size determination is that
several variables may be equally important in a particular survey, and the precision requirements for
each of these will then produce a different estimate of the sample size needed. In this case, one
should make an assessment to come up with a single estimated sample size. This includes:
For studies of large and geographically dispersed populations it is more convenient to use multi-stage
sampling designs. They are particularly appropriate where a large scale survey is to be conducted,
and where for logistic and organizational reasons it is convenient for the sample to be grouped
together in a more limited number of geographical areas, rather than being spread thinly
across the whole country.
Sampling frames may not be available for all the ultimate observational units in the entire population.
A multistage sampling plan may be more convenient than single stage sampling of the
ultimate units, as the cost of surveying and supervision in a large scale survey can be very
high due to travel, identification and contact.
It can be a convenient means of reducing response errors and improving sampling efficiency
by reducing the intra-class correlation coefficient observed in natural sampling units.
In un-stratified multistage sampling, the sample is selected in stages, i.e., the population is divided
into a number of PSUs, which are sampled; then the selected PSUs are sub-divided into a number
of smaller second stage units, which are again sampled; the process is continued until the
ultimate sampling units are reached.
For multistage simple random sampling, the selection design at each stage is SRS, with equal
selection probability at each stage. For example, for a two stage simple random sample the selection
method is SRS at the first stage with equal probability (1/total PSUs) and SRS at the second stage,
again with equal probability (1/total SSUs); the method is described in short as SRS/SRS. For
multistage varying probability sampling with a two stage design, the selection method would be
probability proportional to size either at both sampling stages (PPS/PPS) or PPS at the first stage
and SRS at the second stage (PPS/SRS), and a similar procedure can be followed for more than two
stages.
A basic principle of scientific sampling is that every sampling unit must have a known, positive
probability of being selected. Where the probabilities are equal, the sample design is known as
self-weighting and the formulae for calculating estimates are relatively straightforward. Where the
sample design is not self-weighting, the data relating to the different sample units have to be
weighted.
For a two stage sample with constant sampling fraction, an appropriate sampling design would be
SRS or LSS at the first stage for the selection of PSUs and again SRS or LSS at the second stage to
select the second stage units, i.e., SRS/SRS.
Example: let us assume that the two kebeles A and B were selected by SRS from a population of
100 kebeles. Assume that kebele A has 500 households and kebele B has 50 households. If the
42
sampling fraction is to be 4% then for kebele B with 50 households the sample size would be 2
households and for that of kebele A it would be 20 households.
The probability of selecting an individual household can be calculated as the joint probability of
selecting the kebele and then selecting the household within the kebele, i.e.,
\[
\text{probability} = \frac{\text{number of kebeles in sample}}{\text{number of kebeles in population}}
\times \frac{\text{number of households in sample}}{\text{number of households in kebele}}
\]
Accordingly, the probability of selecting a household in kebele A is
\[
\frac{2}{100} \times \frac{20}{500} = 0.0008
\]
and that of selecting a household in kebele B is
\[
\frac{2}{100} \times \frac{2}{50} = 0.0008
\]
Since the number of households selected in each kebele is proportional to the size of the kebele,
every household has the same selection probability. No weights are needed and the design is
self-weighting, so the estimation of means, totals, ratios, and proportions is straightforward.
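The self-weighting check in this example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above; the function name and argument names are illustrative and do not come from the notes.

```python
from fractions import Fraction


def household_selection_probability(kebeles_in_sample, kebeles_in_population,
                                    households_in_sample, households_in_kebele):
    """Joint probability: P(kebele selected) * P(household selected | kebele)."""
    first_stage = Fraction(kebeles_in_sample, kebeles_in_population)
    second_stage = Fraction(households_in_sample, households_in_kebele)
    return first_stage * second_stage


# Figures from the example: 2 of 100 kebeles, 4% of the households in each kebele.
p_a = household_selection_probability(2, 100, 20, 500)
p_b = household_selection_probability(2, 100, 2, 50)

# The design is self-weighting when every household has the same probability.
is_self_weighting = p_a == p_b
print(float(p_a), float(p_b), is_self_weighting)
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.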
Another possible choice of design is a two-stage sample with probability proportional to size (PPS),
where the PSUs are selected with PPS at the first stage and a fixed number of SSUs is then
selected within each PSU by SRS or LSS. Consider the selection of kebeles at the first stage and of
households at the second stage. The procedure for PPS is to make a list of the kebeles, in any
order, together with the total number of households in each. Then, working down the list, write the
cumulative sum of households next to each kebele. Finally, select the required number of kebeles
from the cumulative totals by using SRS or systematic sampling.
For example, suppose we have the following list of kebeles together with their individual and
cumulative numbers of households. We wish to select a total of 15 households from three kebeles.
The first step is to select three kebeles, using SRS on the cumulative totals.

Kebele   Households   Cumulative households   Random number selected
A        224          224
B        573          797
C        1140         1937                    1163
D        253          2190
E        720          2910
F        654          3564
G        310          3874
H        270          4144                    4010
I        379          4523
J        411          4934
K        217          5151                    5094
L        399          5550
M        281          5831
For the selection of the sample we need four-digit random numbers, because the cumulative total,
5831, has four digits. Using SRS and an appropriate random number table, the first random number
is 4010, which falls within the cumulative range of kebele H. The second random number is 1163,
which falls within the range of kebele C. The third random number is 5094, which corresponds to
kebele K. The final sample therefore consists of the three kebeles H, C and K. A fixed sample of 5
households is then selected from each of the three kebeles using SRS or LSS. The probability of
selecting the kebele multiplied by the probability of selecting a household within it is:
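\[
\text{Kebele C: } \frac{1140}{5831} \times \frac{5}{1140} \approx 0.0009, \quad
\text{kebele H: } \frac{270}{5831} \times \frac{5}{270} \approx 0.0009, \quad
\text{kebele K: } \frac{217}{5831} \times \frac{5}{217} \approx 0.0009
\]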
The selection probabilities are the same in each kebele, and hence the design is self-weighting. The
estimation formulas appropriate to this sample design can then be used.
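The cumulative-total lookup used in this PPS example can be sketched in Python as follows. The household counts for kebeles H (270) and K (217) are taken from the probability calculation in the example; the function name `pps_select` is an illustrative choice, not something defined in the notes.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate

# Households per kebele, in list order.
SIZES = {"A": 224, "B": 573, "C": 1140, "D": 253, "E": 720, "F": 654, "G": 310,
         "H": 270, "I": 379, "J": 411, "K": 217, "L": 399, "M": 281}
NAMES = list(SIZES)
CUMULATIVE = list(accumulate(SIZES.values()))  # running totals, ending at 5831


def pps_select(random_number):
    """Return the first kebele whose cumulative total is >= the random number."""
    return NAMES[bisect_left(CUMULATIVE, random_number)]


selected = [pps_select(r) for r in (4010, 1163, 5094)]

# With 5 households taken per selected kebele, the kebele size cancels out,
# so each household has the same probability per draw.
per_draw = {k: Fraction(SIZES[k], CUMULATIVE[-1]) * Fraction(5, SIZES[k])
            for k in selected}
print(selected, {k: round(float(p), 4) for k, p in per_draw.items()})
```

A larger kebele covers a wider range of cumulative values, so a random number is more likely to land in it; that is what makes the first-stage selection proportional to size.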
Estimating population characteristics is a major objective of a survey. Population estimates are
calculated from the sample data and reported together with an indication of the precision of each
estimate, obtained from the sample variance. Typical estimates are totals, means, ratios, and
proportions. Their standard errors are required so that confidence intervals can be placed on the
estimates and tests of significance can be carried out.
The calculation of population estimates depends on the type of sampling design used for the
survey. In general, an estimate from the sample is raised to an estimate for the population by
multiplying by the inverse of the sampling fraction. The more complicated the design in terms of
the number of stages, variation in the sampling fraction and sampling with or without
replacement, the more complicated the algebra for the calculations will be.
For example, if the design is single-stage simple random sampling, then the estimators and their
variances are:
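\[
\text{Mean: } \bar{y} = \frac{\sum_{j=1}^{n} y_j}{n}, \qquad
\mathrm{var}(\bar{y}) = \frac{S^2}{n}(1 - f), \ \text{or} \ \mathrm{var}(\bar{y}) = \frac{\sigma^2}{n}
\]
\[
\text{Total: } \hat{Y} = N\bar{y}, \qquad
\mathrm{var}(\hat{Y}) = N^2 \frac{S^2}{n}(1 - f), \ \text{or} \
\mathrm{var}(\hat{Y}) = N^2\,\mathrm{var}(\bar{y}) = N^2 \frac{\sigma^2}{n}
\]
\[
\text{Proportion: } p = \frac{a}{n}, \qquad
\mathrm{var}(p) = \frac{pq}{n-1}\left(\frac{N-n}{N}\right) = \frac{pq}{n-1}(1 - f), \ \text{or} \
\mathrm{var}(p) = \frac{pq}{n-1}
\]
\[
\text{Ratio: } \hat{R} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i} = \frac{y}{x} = \frac{\bar{y}}{\bar{x}}, \qquad
\mathrm{var}(\hat{R}) = \frac{1-f}{n\bar{x}^2}\, s_d^2
\]
\[
\text{where } \frac{s_d^2}{n\bar{x}^2} = \frac{\sum_i (y_i - \hat{R}x_i)^2}{n\bar{x}^2 (n-1)}
= \frac{\sum_i y_i^2 + \hat{R}^2 \sum_i x_i^2 - 2\hat{R}\sum_i y_i x_i}{\bar{x}^2\, n(n-1)}
\]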
The standard errors can be obtained by taking the square root of each variance.
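As a rough illustration, the single-stage SRS estimators can be written as small Python functions. This is a sketch under stated assumptions: f is taken to be the sampling fraction n/N, S² is estimated by the sample variance with divisor n − 1, and the data values are invented purely for illustration.

```python
import math


def srs_mean_total(y, N):
    """Mean and total with their variances under SRS without replacement."""
    n = len(y)
    f = n / N                                  # sampling fraction (assumed n/N)
    mean = sum(y) / n
    s2 = sum((v - mean) ** 2 for v in y) / (n - 1)
    var_mean = s2 / n * (1 - f)
    return mean, var_mean, N * mean, N ** 2 * var_mean


def srs_proportion(a, n, N):
    """Proportion p = a/n with variance pq/(n-1) * (1-f)."""
    p = a / n
    return p, p * (1 - p) / (n - 1) * (1 - n / N)


def srs_ratio(y, x, N):
    """Ratio R = sum(y)/sum(x) with variance (1-f)/(n*xbar^2) * s_d^2."""
    n = len(y)
    r = sum(y) / sum(x)
    x_bar = sum(x) / n
    sd2 = sum((yi - r * xi) ** 2 for yi, xi in zip(y, x)) / (n - 1)
    return r, (1 - n / N) / (n * x_bar ** 2) * sd2


# Hypothetical sample of n = 4 units from a population of N = 40.
mean, var_mean, total, var_total = srs_mean_total([2, 4, 6, 8], N=40)
p, var_p = srs_proportion(a=3, n=10, N=100)
r, var_r = srs_ratio([2, 4, 6, 8], [1, 2, 4, 3], N=40)
se_mean = math.sqrt(var_mean)                  # standard error = sqrt(variance)
print(mean, var_mean, total, var_total, se_mean)
```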
If a two-stage design is used with PPS/SRS, the estimation procedure is as follows. At the first
stage, kebeles are selected with PPS (the size being the number of households). At the second
stage, a fixed sample of households is drawn from each sampled kebele by SRS or systematic
sampling, i.e., m_i = m is the same for all kebeles (constant sample size). The estimators are then:
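\[
\text{Mean: } \bar{y} = \frac{\sum_i \sum_j y_{ij}}{nm}, \qquad
\text{Total: } \hat{Y} = \frac{H}{nm} \sum_i \sum_j y_{ij}, \qquad
\text{Ratio: } \hat{R} = \frac{\bar{y}}{\bar{x}} = \frac{\sum_i \sum_j y_{ij}}{\sum_i \sum_j x_{ij}}
\]
\[
\text{For the total: } \mathrm{var}(\hat{Y}) = \frac{H^2}{n(n-1)m^2}
\left[ \sum_i \Big( \sum_j y_{ij} \Big)^2 - \frac{1}{n} \Big( \sum_i \sum_j y_{ij} \Big)^2 \right]
\]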
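\[
\text{For the ratio: } \mathrm{var}(\hat{R}) = \frac{H^2}{n(n-1)m^2 \hat{X}^2}
\left[ \sum_i \Big( \sum_j y_{ij} \Big)^2 + \hat{R}^2 \sum_i \Big( \sum_j x_{ij} \Big)^2
- 2\hat{R} \sum_i \Big( \sum_j y_{ij} \Big) \Big( \sum_j x_{ij} \Big) \right]
\]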
Standard errors, which are obtained by taking the square root of the variance, can be used for
further estimation and evaluation.
Here m_i is the number of second-stage units (households) within the ith first-stage unit in the
sample.
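The two-stage PPS/SRS estimator of a total and its variance can be sketched in Python. This is an illustration only: H is assumed here to be the total number of households in the population (the size measure used for PPS), n is the number of sampled kebeles, m is the fixed number of households per kebele, and the data are invented.

```python
import math


def pps_srs_total(sample, H):
    """Mean, total and variance of the total for a PPS/SRS two-stage sample.

    sample: list of n lists, each holding the m household values of one kebele.
    H: assumed total number of households in the population.
    """
    n = len(sample)
    m = len(sample[0])
    if any(len(kebele) != m for kebele in sample):
        raise ValueError("every kebele must contribute the same number m")
    kebele_sums = [sum(kebele) for kebele in sample]
    grand_sum = sum(kebele_sums)
    mean = grand_sum / (n * m)
    total = H * mean
    spread = sum(s ** 2 for s in kebele_sums) - grand_sum ** 2 / n
    var_total = H ** 2 / (n * (n - 1) * m ** 2) * spread
    return mean, total, var_total


# Hypothetical data: 3 kebeles, 2 households each, 100 households in total.
mean, total, var_total = pps_srs_total([[1, 2], [3, 4], [5, 6]], H=100)
se_total = math.sqrt(var_total)
print(mean, total, var_total, se_total)
```

Only the kebele-level sums enter the variance, which reflects that between-kebele variation drives the precision of this design.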
CHAPTER 5: METHODS OF DATA COLLECTION
Surveys are classified into two types according to the time of data collection: longitudinal
surveys and cross-sectional surveys.
Longitudinal surveys: gather information at different points in time in order to study changes over
an extended period. Three different designs are used in longitudinal surveys: panel studies,
trend studies, and cohort studies.
Panel studies are studies in which the same subjects are surveyed at different times over an
extended period. The investigator observes exactly the same people, group, or organization across
time periods.
In a trend study, different people from the same general population are surveyed at different times.
In a cohort study, a specific population is followed over a length of time.
Cross-sectional surveys: study a cross section (sample) of a population at a single point in time.
This is usually the simplest and least costly alternative. Its disadvantage is that it cannot
capture social processes or change.
The objective of the survey, the nature of the information required, operational feasibility and
cost will often determine the method of data collection. Of the various methods of collecting
data, just a few are outlined below:
The data collection method will be determined by the nature of the required information, and our
first step is to decide which of these three methods to use.
5.2.1 Extraction of Data from Records
It is usually possible to answer some of the questions a survey is intended to cover from available
data. For example, a mass of information about the populations studied by social surveys is
available in historical documents, statistical reports, records of institutions and other sources; it
is up to the surveyor to derive what help he or she can from it. However, one must first consider
carefully its suitability for the purpose. One must critically assess its population coverage and
definitions, how accurate the information is, and whether it is sufficiently up to date.
Information from records may serve as complement for analysis and can be used as a base for
preliminary investigation. Therefore, it is advisable to examine exhaustively what is available in
records before launching any survey.
The mail or self-administered questionnaire is a method of data collection in which researchers
give questionnaires with instructions directly to respondents, or mail them to respondents, who
read the instructions and questions, record their answers, and hand the questionnaire back or
return it by mail. This type of survey has many advantages, which include:
• They are very effective, and response rates may be high, for a target population that is
well educated or has a strong interest in the topic or the survey organization.
Its disadvantages include:
• A low response rate is the biggest problem, and the process of returning the questionnaires
may be unnecessarily extended. The researcher can raise response rates by sending
non-respondents reminder letters, but this adds to the time and cost of the data collection.
• The researcher cannot control the conditions under which a mail questionnaire is completed.
• Researchers cannot usually observe the respondent's reaction to questions, physical
characteristics, or setting.
• The mail questionnaire is not suitable for an illiterate community.
Therefore, the use of this method is limited to predominantly literate societies, as it requires
respondents to understand the survey concepts clearly through reading and to answer in writing.
Because of this, its use is largely limited to developed countries with a high percentage of
literacy.
Measurement or observation of the subject, and interviewing a respondent to obtain a report on
the matter, are two approaches which are by no means exclusive. It is very common indeed to find
both being used in the same survey. Some topics can only be investigated by one or the other
approach, but many can be investigated using either, and in such cases it is necessary to assess
which is more suitable in the circumstances of the particular study. Therefore, the type of question
and the nature and status of the topic will determine whether a required piece of information
should be gathered by the measurement or the interview approach.
Measurement or observation
The face-to-face interview is a social process that involves the interviewer and the respondent.
The interviewer meets the respondent, explains the purpose of the study, puts a set of questions,
and records the answers. It is widely used in economic and social surveys.
Information may be collected by interview for various reasons. It may be information which could
be measured directly but would require too much time or too great a use of manpower or funds to
do so, in which case the probably less accurate interview method is used instead. It may be
information that cannot be directly measured or observed because it relates to the past. It may be
information about the respondent’s own knowledge, opinions, perceptions or attitudes.
The advantages of the face-to-face interview include:
• It has the highest response rates and permits the longest questionnaires.
• Interviewers control the sequence of questions and can use some probes.
• The respondent is likely to answer all the questions alone.
• Interviewers can observe the surroundings and can use non-verbal communication and visual
aids.
• Well-trained interviewers can ask all types of questions, including complex questions.
The disadvantages include:
• Cost is high: travel, training, supervision, and personnel costs for interviewers can be high.
• Interviewer bias is also high in this method.
• The appearance, tone of voice, question wording, and so forth of the interviewer may affect
the respondent.
The use of telephone interviewing for social surveys has increased in developed countries
substantially in recent years because of the high penetration of telephones. Its major advantages are
lower cost and faster completion, with relatively high response rate. The phone permits the survey
to reach people who would not open their doors to an interviewer, but who might be willing to
talk on the phone. There may be less interviewer bias and less social desirability bias than with
personal interviews.
The main disadvantage of this method is that there is less opportunity for establishing rapport with
the respondent than in a face-to-face situation. Another disadvantage is that households without
telephones and those with unlisted numbers are automatically excluded from the survey, which
may bias results. Those who screen their calls may simply ignore calls from unfamiliar numbers,
including survey calls. It should also be noted that telephone interviewing is unpopular and of
limited use in developing countries.
CHAPTER 6: INSTRUMENTS OF DATA COLLECTION
6.1 Types of instruments
A data collection instrument is a document used for gathering and recording of data in a survey.
Basically there are two types of instruments to collect data: Structured questionnaire and
unstructured questionnaire.
The first type of instrument, the structured questionnaire, used mostly in formal sample surveys,
is a formalized schedule or form containing a carefully formulated set of questions for
information gathering. In other words, a structured questionnaire is a data collection instrument
containing written questions that people respond to directly on the questionnaire form itself,
with or without the aid of an interviewer. In a structured questionnaire, all questions are
prearranged in some specified order and the range of possible responses for each question is
provided.
The second type is a checklist of topics (unstructured questionnaire), used mostly in qualitative
surveys, when enquiries are not appropriate for structured questionnaires. An unstructured
questionnaire contains mostly open-ended questions. This type of instrument is used in informal or
exploratory surveys and is designed in the form of survey guides, tally sheets, observational forms,
field notes, outlines of questions, etc.
Most questionnaires used in sample surveys combine structured and unstructured questions. Since
the questionnaire is the main data collection instrument in formal sample surveys, this chapter
discusses the issues involved in questionnaire design and other activities related to it.
All surveys involve presenting respondents with a series of questions to be answered. The questions
may be simple single-item measures or complex multiple-item scales. Whatever form they take,
survey data, especially socioeconomic survey data, are basically what people say to the
investigator in response to a question.
One major contributory element in maintaining data quality in a formal sample survey is the
questionnaire design. In this approach, questionnaires need to be structured, and their design is
critical because survey analysis depends on the completeness of the topics covered. A
well-designed questionnaire enables us to ask the respondents the same questions in the same
way; their answers must be recorded and coded uniformly so that data can be aggregated across
the sample.
Error-free data transfer requires clear, comprehensive questions, good enumeration, and clearly set
out answers. Much of this process depends on good questionnaire design. The form must cater for
coding and subsequent data entry for processing. In this respect there are some questionnaire
design principles which link the interview and data processing.
• Regarding the content, one must include the minimum number of topics needed to meet the
objectives. Because of resource and time constraints we should focus on items of direct and major
interest and avoid collecting any non-essential information.
• Time for the interview is another factor that must be kept reasonable, and this limits the number
of questions.
• The questions must be easy for the respondents to understand and to answer accurately and
clearly.
• The questionnaire should be easy to use as an interview guide for the enumerator and as an
instrument for recording answers.
• It should be designed in such a way that the recorded answers can easily be edited, coded and
transferred onto a computer file for data processing, tabulation and statistical analysis.
• The flow, structure and length of the questionnaire should encourage and keep the interest of the
respondent.
• Careful thought should be given to the quality of presentation material such as paper, the size of
the sheets used, the clarity of printing and the spaces provided for recording answers.
The process of design is creative, and one tends to develop strong preferences for particular
styles of layout and phraseology, since there is no single prescription around which a form can be
modeled. A typical sequence of activities to design a form would have the following pattern.
• Draw up a list of question topics from a mixture of theoretical models, empirical information,
research evidence and the terms of reference for the study;
Form design is largely a compromise between opposing criteria: layout for collection versus layout
for data processing. Layout for collection concerns the ease, speed and accuracy with which the
questionnaire can be completed in the field, while layout for data processing concerns the ease,
speed and accuracy with which information from the questionnaire can be processed for analysis.
One should give equal attention to both aspects, collection and processing.
6.3 Types of question
Two types of questions can be used in questionnaires, open-ended questions and closed-ended
questions, depending on the amount of freedom given to the respondent in offering
responses. The type of question to use will be determined by the form of response sought, the
nature of the respondents and their ability to answer the questions.
Open-ended question
An open-ended (unstructured, free response) question is one which allows the respondent to
answer freely in his or her own words, and to express any ideas generated by the question
itself. Open implies that the respondent is permitted to answer in any form and at any length,
without limitation on the range or complexity of the answer to the question asked.
Open responses are most often associated with exploratory or informal surveys, in which the
investigator does not know the likely responses from the units of study. Such a survey needs a
checklist of topics, guidelines or unstructured questions.
For example: ‘Which crops do you grow?’ The question does not specify any particular season,
crops or plots, and hence many answers are possible. It is open for discussion.
‘Why did you say you would not buy imported cooking oil when it is available in the market?’
Again this could be discussed, since the reasons could be quality, taste, price, etc.
The advantages of open-ended questions include:
• They permit an unlimited number of possible answers, including ones which may not have been
considered at the initial stage of question design.
• Respondents can answer in detail and can qualify and clarify responses by expressing them in
their own words.
• They may be used when there are too many categories to list on a questionnaire.
• They are useful when the questions are too complex to reduce to a few standard responses.
The disadvantages include:
• The answers are not standardized and are therefore difficult to compare and to analyze
statistically.
• They require a higher level of skill on the part of the data collector, since responses are
written verbatim.
• The forms are often bulky because answers take up a lot of space in the questionnaire.
Closed-ended Question
Example:
1 = 1-2, 2 = 3-4, 3 = 5-6, 4 = 7-8, 5 = More than 8
d) Has the road construction activity had impact on your access to public services (health,
education, market, etc)? Yes =1, No =2
The choice can be made by making a mark alongside a category; by entering a numeric value; or by
selecting a code from a code list. Setting categories of responses requires skill and experience in the
areas of studies and suits computer processing.
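Because closed-ended answers are recorded as codes, they can be checked and tabulated mechanically. The Python sketch below codes answers using the two code lists in the example above; the function names, and the idea that the first code list refers to a count of some kind, are assumptions made for illustration.

```python
from collections import Counter

# Code list from the example: 1 = 1-2, 2 = 3-4, 3 = 5-6, 4 = 7-8, 5 = more than 8.
COUNT_BANDS = [(1, 2, 1), (3, 4, 2), (5, 6, 3), (7, 8, 4)]
YES_NO_CODES = {"yes": 1, "no": 2}


def code_count(value):
    """Return the category code for a reported count of 1 or more."""
    if value < 1:
        raise ValueError("the code list starts at 1")
    for low, high, code in COUNT_BANDS:
        if low <= value <= high:
            return code
    return 5  # more than 8


def code_yes_no(answer):
    """Return 1 for yes and 2 for no; anything else is rejected."""
    return YES_NO_CODES[answer.strip().lower()]


# Hypothetical raw answers from five respondents.
coded_counts = [code_count(v) for v in (1, 4, 6, 9, 3)]
coded_access = [code_yes_no(a) for a in ("Yes", "no", "yes", "No", "yes")]
tally = Counter(coded_access)
print(coded_counts, coded_access, tally)
```

Rejecting answers that fall outside the code list is one simple way to catch recording errors at data entry.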
• The question's meaning is often made clearer by the response categories.
• The answers are relatively complete as long as all relevant categories are specified.
• The respondents can guess at answers when they don’t know, since they have the
categories to guide them.
• Failure to understand the question is less easily detected than with an open-ended question.
• A poorly planned list may act as a constraint, with correct answers not catered for.
• Too few categories may fail to differentiate between important groups, and enumerator
error (placing the tick in the wrong box by accident) will be more common.
• A verbatim listing of every question, with complete wording and instructions on the progression
of the respondent through the form. It is commonly found in forms that are designed for
self-enumeration or where it is critical to the study that precise wording is used at every
interview. It can lead to lengthy and complex questionnaires and is rarely found.
• A listing of questions in a specific order, but without full or precise wording of the questions or
instructions for progression through the form. The form is normally accompanied by a
detailed reference manual, in which questions are specified in full and examples given. The
form is completed by a trained enumerator, and hence careful training is necessary to
ensure that enumerators follow the guidelines when they interview respondents.
• A tabular row and column format in which spaces are indicated for responses, usually in coded
form, without any specification of questions. Question order is indicated by the sequence of
response categories. It is designed specifically to accommodate the needs of data processing.
In this case a reference manual is very important, and comprehensive training and
experienced enumerators are essential if it is to produce satisfactory results.
• A checklist of topics, indicating key facts to be covered, but with answers recorded either in an
unstructured way in a field notebook or in a simplified row/column table. The checklist
approach can be used in an informal study and requires experienced workers or
professionals.
6.5 Question Phrasing and Common Problems Which Arise with Question Phrasing
Another aspect of questionnaire design that needs serious consideration is the phrasing of
questions. The information required should be well and clearly defined at each stage at which a
question is posed: the initial definition and explanation in the survey manual; the text in the
questionnaire; the precise units for physical measurement; and the verbal phraseology used by the
enumerator.
• A clear meaning,
a) Leading question
A leading question is one whose wording leads the respondent to choose one response over
another. The presentation of a question should be neutral: the form of the question should not
indicate a preferred or ‘correct’ answer. For example, the questions ‘You don’t smoke, do you?’
and ‘Do you buy the fertilizer recommended by the extension worker?’ lead respondents to
state that they do not smoke in the first case, and suggest in the second case that they should
buy the fertilizer recommended by the extension worker and are wrong if they fail to do so.
b) Multiple questions
Multiple (double-barreled) questions are questions which combine two or more distinct
questions into one single question. For example: ‘Do you like listening to the radio and watching
television?’ ‘Do you have a tractor or plough?’ ‘Does this company have pension and health
insurance benefits?’ In such cases the respondent would be confused and undecided as to which
answer to offer. The best way to avoid confusion is to replace double questions with two or more
single questions and to ask only one question at a time.
c) Ambiguous question
Ambiguity, confusion, and vagueness must be avoided in a question, since different people will
understand the question differently and in effect its interpretation will depend on the
individual respondent. The question ‘What is your income?’ could mean weekly, monthly, or
annual; family or personal; from salary or from all sources; for this year or last year. The
question ‘Do you drink beer frequently?’ is ambiguous because the word ‘frequently’ does not
specify a fixed time reference. Vague words and phrases like ‘kind of’, ‘fairly’, ‘generally’,
‘often’, ‘regularly’, etc., should be avoided.
d) Probing Questions
Probing is not easy. A delicate balance has to be struck between persistence and rudeness. Very
often the respondent does not want to tell the truth. In some cultures it is socially acceptable to
tell lies to close friends, never mind strangers. The enumerator working on a repeated-visit
survey has to maintain a working relationship with the respondent and cannot permit the need to
resolve minor contradictions on a few questions to disrupt the relationship. In some cases
unbelievable data have to be accepted, and it is helpful if some method is agreed for the
enumerator to draw attention to this on the form.
The language of a question should be simple. The aim in question wording is to communicate
with respondents as nearly as possible in their own language. Thus the wording of the question
must be appropriate to the respondent. Questions should avoid technical terms and
jargon which the respondent may not understand. Where it is necessary to use technical or legal
terms, one should provide definitions and explanations.
For example: ‘Do you use inorganic fertilizer?’ It is better to specify types, brand names or
colloquial terms with which the respondent will be familiar. Also use terms which the
respondent will understand and which will not cause offence. For example, terms such as
‘peasant’, ‘tribe’, or ‘witchdoctor’ may cause offence.
f) Sensitive topics
In some cultures people do not like to discuss private matters openly. Sensitive questions are apt
to be irritating, threatening, or embarrassing to the respondent. Such questions are prone to
normative answers, that is, answers which confirm that the respondent acts within the social rules
of society even if that particular individual sometimes acts outside these rules. In a society which
generally condemns drunkenness, a question about drunkenness might generate denial even if
drunkenness sometimes does occur. In this circumstance it may be useful to word the
question so that there is some assumption that the activity does take place. Thus, rather than ask
‘Do you ever get drunk?’ we might ask ‘How often do you get drunk?’ The assumption in the
question that the respondent might sometimes get drunk may ease the respondent's guilt and
generate a more truthful answer.
Questions on age, physical or mental disability, deaths in the household, income, sexual behavior,
and family planning are generally regarded as sensitive.
Special attention should be given during field testing of the questionnaire to identifying
particularly sensitive questions and how they can be improved by rewording or by better
interviewing procedure.
CHAPTER 7: PRE-TESTS AND PILOT SURVEY
7.1 Pre-tests
It is difficult to plan a survey without a good deal of knowledge of its subject matter, the population
it is to cover, the way people will react to questions and even the answers they are likely to
give. Particularly for a large-scale survey, it should be the general rule to conduct pretests and a
pilot survey in order to answer the following questions.
• How is one to estimate how long the survey will take, how many interviews will be needed, and
how much money it will cost?
• How, without trial interviews, can one be sure that the questions will be as meaningful to
the average respondent as to the survey expert?
Pretests and pilot surveys are standard practice with professional survey bodies and are widely used
in research surveys.
The pretest is a preliminary application of the data gathering techniques for the purpose of
determining their adequacy. This may take the form of a series of small pre-tests on isolated
problems of the design. In testing questionnaires, pre-testing refers to one or more series of
tests conducted on successive drafts of the questionnaire for the purpose of identifying and
correcting errors and shortcomings. Its objective is to evaluate the general receptivity and
feasibility of the questionnaire and to identify specific problems of communication between the
interviewer and the respondent in terms of specific questions or items of information sought.
A pilot survey or pilot study is generally a full-scale dress rehearsal of the survey. A major purpose
of the pilot study is to check whether the organization and arrangements of the survey actually work
satisfactorily. The whole of the survey operation in all its aspects must be tested out on a small
scale. This approach thus checks the administrative and organizational arrangements in general,
the arrangements for the supply and distribution of all the resources and equipment needed for the
survey, as well as the fieldwork operations, the survey forms and manual, sample size
determination and the data processing.
It should proceed through all the stages and operations of the survey proper, but on a small scale in
a few selected localities. These localities should be chosen to cover as complete a range as possible
of the types of area and of the population characteristics to be covered by the survey. There may
not be enough resources or time to cover everything needed, in which case the priority is to
cover a few areas but over a broad range of characteristics. Thus, the size and design of the pilot
survey are a matter of convenience, time and money. It should be large enough to fulfill the above
functions.
Since the purpose of the pilot study is to identify weaknesses and problems with the survey
materials, procedures and arrangements, the senior technical staff should be closely involved in
the pilot study to observe all stages of the work as it is being done under field conditions. In
other words, the survey forms and procedures must be observed under operational conditions in
the field if problems are to be correctly identified and appropriate solutions found.
If it is properly done, it is likely to lead to changes to the survey forms and manuals, and to the
procedures and organizational arrangements. It is therefore necessary to allow enough time to
analyze the results and observations from it, and produce revised materials and arrangements in
good time for the start of the main survey operations.
The pilot survey has many benefits, in particular if the survey is to be conducted for the first time.
In general it provides guidance on:
• The adequacy of the sampling frame from which it is proposed to select the sample.
• The estimates necessary for determining the size of sample needed in the actual survey so
that the final estimates may be made with stated precision.
• The non-response rate to be expected, i.e., the probable numbers of refusals and
non-contacts can be roughly estimated from the pilot survey or pretests, and ways of reducing
non-response can be sought.
• Making a sensible choice from alternative methods of collecting the data (observation, mail
questionnaires, personal interviews, etc.).
• The adequacy of the questionnaire, which is probably the most valuable function of the pilot
survey.
• The probable cost and duration of the main survey and of its various stages.
• Deficiencies in the organization in the field, in the office, and in the communication
between the two.
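As a rough illustration of how pilot results feed into planning, the Python sketch below turns a pilot non-response rate into the number of households to approach and the interviewer-days needed in the main survey. The function, its parameter names and all the figures are hypothetical; the notes do not prescribe this calculation.

```python
from fractions import Fraction


def plan_from_pilot(attempted, completed, target_completed, interviews_per_day):
    """Scale pilot response figures up to the main survey.

    attempted / completed: interviews attempted and completed in the pilot.
    target_completed: completed interviews wanted in the main survey.
    interviews_per_day: interviews one enumerator can attempt per day.
    """
    if not 0 < completed <= attempted:
        raise ValueError("completed must be between 1 and attempted")
    nonresponse_rate = Fraction(attempted - completed, attempted)
    # Integer ceiling division avoids floating-point rounding surprises.
    to_attempt = -(-target_completed * attempted // completed)
    interviewer_days = -(-to_attempt // interviews_per_day)
    return {"nonresponse_rate": nonresponse_rate,
            "to_attempt": to_attempt,
            "interviewer_days": interviewer_days}


# Hypothetical pilot: 50 households approached, 40 interviews completed.
plan = plan_from_pilot(attempted=50, completed=40,
                       target_completed=400, interviews_per_day=5)
print(float(plan["nonresponse_rate"]), plan["to_attempt"], plan["interviewer_days"])
```

The interviewer-days figure can then be multiplied by daily salary and allowance rates to give a first cost estimate for the fieldwork.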
CHAPTER 8: SURVEY COST ESTIMATION
Once there is an agreement to proceed with the survey, a planning timetable is drawn
up in order to facilitate planning and budgeting. Scheduling for field operations must take
into account two key aspects:
The important activities to be carried out from beginning to end must be listed at the
planning phase to ensure that none is overlooked. Each activity should be listed
against the approximate time needed to perform it. Be realistic about the time
necessary to complete each stage of the work. The following time schedule is an example for a
personal interview study.
Some of the activities are performed simultaneously while others need the completion of other
activities, so they should be presented in the form of a chart from which the required time can
easily be estimated.
The bar chart approach to presenting a time schedule was developed by Henry L. Gantt. For the
activities indicated above it shows a more realistic time span, in which 21 weeks are required
instead of 32 weeks, as illustrated below.
The following chart is an example of Gantt chart of study activities.
[Gantt chart: study activities, including "Pretest questionnaire" and "Finalize questionnaire",
plotted against time in weeks (2 to 22).]
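The saving that a Gantt chart makes visible comes from running independent activities in parallel. The Python sketch below compares the sum of all durations with the elapsed time once overlaps are allowed. The activity list, durations and dependencies are hypothetical and are not the schedule from these notes.

```python
from functools import lru_cache

# Hypothetical activities: name -> (duration in weeks, activities that must finish first).
ACTIVITIES = {
    "design questionnaire": (4, []),
    "select sample": (3, []),
    "pretest questionnaire": (2, ["design questionnaire"]),
    "finalize questionnaire": (1, ["pretest questionnaire"]),
    "train interviewers": (2, ["finalize questionnaire"]),
    "fieldwork": (6, ["train interviewers", "select sample"]),
}


@lru_cache(maxsize=None)
def earliest_finish(name):
    """Week in which an activity can finish if it starts as early as possible."""
    duration, predecessors = ACTIVITIES[name]
    start = max((earliest_finish(p) for p in predecessors), default=0)
    return start + duration


sequential_weeks = sum(duration for duration, _ in ACTIVITIES.values())
elapsed_weeks = max(earliest_finish(name) for name in ACTIVITIES)
print(sequential_weeks, elapsed_weeks)
```

In this made-up schedule the sample is selected while the questionnaire is being designed, so the elapsed time is shorter than the simple sum of durations.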
Budget preparation involves the assignment of a cost to each survey activity. The main expenditure
items include:
• Office wages and salaries (administration, executive personnel, quality control, data
processing);
• Survey materials;
• Supervisory and interviewing costs (enumerators’, supervisors’ and field officers’ salaries
and allowances);
• Supplies for the reproduction of questionnaires, forms and manuals, and other stationery;
• Transport cost;
• Computer services;
• Sampling design cost;
• Other administrative costs (office rentals, overheads recovery), etc.
Preparation of a preliminary budget estimate is a priority activity that should be planned and
executed at an early stage. The budget will depend on the survey design, including the levels of
precision desired for the various estimates, as well as on the geographical and other
classifications for the presentation of the results, and on the operational conditions prevailing
in the region.
Example of Budget Preparation for Survey (amounts in Birr):
1. Office Experts
   1 Survey director for 1 month at Birr 10,000 per month        10,000
   1 Field organizer for 1 month at Birr 6,000 per month          6,000
   1 Survey statistician for 1 month at Birr 6,000 per month      6,000
   sub-total                                                     22,000
2. Field Personnel
a) Salaries
   50 Enumerators for 2 months at Birr 400 per month             40,000
   10 Field supervisors for 3 months at Birr 600 per month       18,000
   10 Drivers for 3 months at Birr 350 per month                 10,500
   sub-total salaries                                            68,500
b) Allowances
   50 Enumerators for 1.5 months at Birr 25 per day              56,250
   10 Field supervisors for 2 months at Birr 30 per day          18,000
   10 Drivers for 2 months at Birr 25 per day                    15,000
   50 Guides for 2 days at Birr 10 per day                        1,000
   sub-total allowances                                          90,250
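The arithmetic in the budget example can be checked with a short script. This is only a sketch under stated assumptions: a 30-day month is assumed for daily allowances, the statistician's rate is taken as Birr 6,000 per month, and the guides are assumed to work 2 days, because these are the values that reproduce the printed amounts and sub-totals. The function and variable names are illustrative.

```python
DAYS_PER_MONTH = 30  # assumption: inferred from the printed allowance amounts


def salary_cost(count, months, rate_per_month):
    """Cost of `count` staff paid a monthly rate for `months` months."""
    return count * months * rate_per_month


def allowance_cost(count, days, rate_per_day):
    """Cost of `count` staff paid a daily allowance for `days` days."""
    return count * days * rate_per_day


office = [
    salary_cost(1, 1, 10_000),  # survey director
    salary_cost(1, 1, 6_000),   # field organizer
    salary_cost(1, 1, 6_000),   # survey statistician (rate assumed to be 6,000)
]
salaries = [
    salary_cost(50, 2, 400),    # enumerators
    salary_cost(10, 3, 600),    # field supervisors
    salary_cost(10, 3, 350),    # drivers
]
allowances = [
    allowance_cost(50, 45, 25),                   # enumerators, 1.5 months = 45 days
    allowance_cost(10, 2 * DAYS_PER_MONTH, 30),   # field supervisors
    allowance_cost(10, 2 * DAYS_PER_MONTH, 25),   # drivers
    allowance_cost(50, 2, 10),                    # guides (2 days assumed)
]

office_total = sum(office)
salary_total = sum(salaries)
allowance_total = sum(allowances)

print("Office experts:", office_total)
print("Field salaries:", salary_total)
print("Field allowances:", allowance_total)
```

Keeping each line item as a separate entry makes it easy to re-cost the survey when the number of enumerators or the length of fieldwork changes.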
CHAPTER 9: FIELD WORK
After the preparatory activities are completed, the actual fieldwork is carried out. An investigator
responsible for providing statistics for many subject matter areas must organize the fieldwork
carefully, because it is the crucial, fundamental stage on which everything else depends. It involves
recruiting and training field staff, actual data collection, supervision, quality checks, field
administration and co-ordination. The survey organization must establish its policy on these
matters and on some of the common topics and problems of fieldwork.
For formal surveys of relatively large scale it is common to have to use lower-level field
workers. Field workers are required wherever personal interview or measurement is the method of
collecting data. The quality of these workers is one of the most crucial factors in determining the
quality of the information. It affects the quality of the data at the point they are collected, and
errors at this point are the most difficult to detect and the most difficult to correct or salvage
at any later stage of the data collection process. Therefore, extra care and attention at the stage
of recruiting field workers is essential.
One of the problems bound up with fieldwork is whether to employ full-time or part-time field
workers. Depending on the scale and type of survey, there are three possible approaches to the
recruitment of field workers:
• Recruit field workers for a particular survey for a limited period only; or
• Establish a permanent body of field workers to conduct a continuous survey program; or
• Use an existing group of people from an established data collection organization of some
kind, such as a national (regional) statistical office, a development agency or ministry, or a
university.
Each approach has its advantages and disadvantages. Some points to consider are outlined below.
• When resources (funds or personnel) are very limited, using the staff of other organizations
is sometimes convenient.
• For ease of management and quality of data collection, it is better to have one’s own permanent
staff, which has the advantage of a permanent system of administrative and logistic support.
• Staff from other organizations may bias the results towards their preconceptions if the
function of the field workers involves teaching or motivating the population in the subject of
the survey.
• It is difficult to control a piece of work when operating through another organization.
Responsibility in the supervisory process is loose, so ensuring and assessing data quality
becomes difficult.
The duties of an interviewer include searching for obscure addresses, securing co-operation
from respondents and administering questionnaires. Interviewers thus play a crucial role in
determining the quality of fieldwork. The value of the information obtained depends on their
skill, good sense and accuracy. The attributes of a “good” field worker will vary depending on the
cultural and physical environment and, to some extent, on the content of the survey.
• The primary role of an interviewer is to gather data upon which major decisions are based.
• The interviewer must be well informed about the survey and its objectives.
• The interviewer must establish good relations with the respondent, avoid rousing
unnecessary prejudice, confusion or resentment, and always respect the confidence in
which the respondent has given information.
• The interviewer should make the respondent aware that all information collected will be
treated as strictly confidential.
• The interviewer must motivate the respondent to supply comprehensive and accurate
answers.
• Establishing minimum requirements with respect to general education and physical fitness, which
include level of education, age limit, health condition, language, sex, etc. For example, regarding
age, persons outside the range 20 to 45 may not be appropriate for fieldwork, and in some cases the
upper age limit may go down to 35.
• Testing enumerators on the ability to read maps and to make changes, and on clerical duties such
as handwriting, form filling and the ability to follow instructions;
• Conducting dummy interviews with field personnel and/or observing enumerators working in
the field; and
• Assessing interviewers’ qualities, which include the following: being confident, appearing
relaxed, being neutral, conscientious regard for detail, absolute honesty and integrity, ability to
work under difficult conditions, abiding by instructions, being able to write legibly and accurately,
pleasant appearance and manner, and tact (intelligence and education).
No matter how thorough and searching recruitment is, there will always be some chosen candidates
who prove in practice to be unsuitable, or who find the work unsatisfactory. It is therefore
necessary to make an allowance for wastage when making the initial selection, and to include a
period of trial employment upon first recruitment of enumerators.
The functions of the field staff mainly fall into three categories: data collection; supervision and
quality control of data; and administrative or clerical duties and co-ordination. A duty statement
must be prepared in detail according to these activities. To carry out such functions, all field staff
must have appropriate qualifications and be sufficiently trained.
A large-scale survey requires hiring several field workers. Good training, adequate pay and good
supervision are important for consistent high-quality performance. An effective training program is
essential to complete field operations efficiently and on schedule. Before going into the actual
fieldwork, all the field workers should be trained on some important general introductory
procedures and on specific aspects of the survey.
There are many satisfactory ways of framing training, but a few general principles can be applied.
Training should give field workers some insight into the overall work of the organization. This
includes general issues such as why the survey is being done, its relevance to national and local
development, the rationale of sampling (if applicable), the purpose of particular measurement
techniques, by whom and how the results are to be used, the confidentiality of the information
gathered and how to handle the survey in the field.
There are also several specific elements to field workers’ training. These are usually set out in a
survey manual for regular use, and include a description of the survey’s work, methods of data
collection, interviewing techniques, how to check and handle completed questionnaires, what to do
about non-response, the standard definitions used in the questionnaires and the content of the
questionnaires. For example, each individual question or item needs to be discussed, and any issues
of concepts, definitions, coverage, reference periods and inter-relationships between different
questions need to be fully explained.
The training procedure could be a formal course that uses lectures, demonstrations and
discussions in class, practice through mock interviews in class, trial interviews or practice in the
field, discussion of the results of the field practice, and an evaluation before deployment.
A survey manual is needed for immediate use during the initial training period and for reference
during subsequent fieldwork and refresher training. It must be fully comprehensive, and it
must be easy to use both for obtaining a general grounding and for referring to specific points.
The full manual needs to cover topics that include the following.
• Context of the survey: should at least treat the reasons for conducting it, its relevance to
development, the purpose of particular questions and its place within the survey program.
• Content of the survey: may cover the relationships between the questions and, for each
question, indicate the concepts, definitions, coverage and reference period.
Moreover, supervisory staff need additional training in supervision. This should include training
in the personnel management skills of motivating and leading enumerators, in the specific skills of
supervising and checking fieldwork, in organization and record keeping, and in the training of
enumerators.
Specific arrangements are very much dependent on local conditions. Arrangements for a training
centre and transport facilities must be made in advance, taking into account the size of the survey
team and the local transport conditions.
Apart from transport, the largest items of equipment needed are likely to be measuring instruments
of various kinds. The need for such items is survey-specific and depends on the type of survey to
be conducted.
Next, the most survey-specific items of all are the survey forms and questionnaires, survey
manuals and other documents. These must all be planned, designed, printed and distributed in
good time. A document control system for the completed survey forms must also be established,
and any related summary data forms must also be designed, produced and distributed.
In addition, the question of stationery supplies is easy to forget, but vital to remember. The
supplies may include pens, pencils, erasers and pencil sharpeners, clip-boards, bags, etc.
b) Public relations: It is important to publicize the survey by informing and involving the
responsible local government and administrative personnel and, in developing countries, the
traditional community hierarchy. Their cooperation is essential in giving local credibility to the
survey work. They can be helpful in providing facilities or giving information on local conditions,
but may be obstructive if not kept fully informed. In addition, channels of communication through
local organizations are useful for publicity purposes. Local social and community groups,
farmers’ associations and cooperatives may all be appropriate for communication.
Local leaders and other prominent people can be particularly useful in spreading information, and
can even assist in persuading reluctant respondents once their own cooperation is gained. For
publicity, depending on the nature of the survey, other channels such as radio, television,
newspapers and magazines, posters and leaflets can be used in all locally important languages.
Finally, the population affected by the survey must be given information about it. It is also
important to give people the opportunity to ask questions about the survey, rather than simply
telling them about it.
Supervision and quality checks are part of field management. In fieldwork, supervision is an
important duty of supervisors. They should play a major role in controlling, coordinating and
checking the quality of fieldwork and in leading field administration. The fieldwork of supervisors
mostly consists of the following activities:
• Monitoring the progress of the fieldwork and taking remedial action, if necessary,
• Coordinating the full range of administrative support services required at field level and
Since interviewers are human beings, their work is liable to contain mistakes. It is therefore
advisable that supervisors observe the enumerators at work and check whether they are following
the correct procedures of data collection, whether they have made all the measurements or
interviews expected, whether the response rate is satisfactory at different levels, and whether
they are asking questions and interpreting and recording answers in accordance with
instructions.
There are several ways of checking the quality of interviewing, but the three major ones are
observation of measurements or interviews at work, review and editing of completed
questionnaires, and re-interviews of a sub-sample of assigned units.
For a large-scale survey, it may be necessary to prepare manuals for supervisors and office staff.
Their content must include administration, coordination, supervision and quality checks.
Chapter 10: Survey Analysis
10.1. Data Processing
Data collected using statistical techniques are in the form of numbers, and these numbers
represent values of variables, which measure characteristics of subjects, respondents, or other
cases. The numbers are in a raw form (raw data), on questionnaires, note pads, recording sheets,
or paper. The raw data need to be converted into a form suitable for analysis and interpretation.
Data processing is, therefore, the link between data collection and data analysis. This is
achieved through a sequence of activities, which include editing, coding, entry and tabulation.
a. Editing
Editing refers to the checking and correction of data, manually or by computer. Checking
establishes whether the information contained in the questionnaire is complete, recorded in the
prescribed manner, accurate, internally consistent and obtained from an eligible respondent.
Checking for completeness involves ensuring that no section or page of the questionnaire is
missing and no answer to any question is omitted. It is important to check whether the
respondents refused to give an answer, the interviewer forgot to ask the question or record the
answer, or the question was not applicable to the respondent.
Checking for consistency can also contribute to data quality. This implies correction on the basis of
logical and substantive criteria, internal consistency and other information available within the
questionnaire. One must also try to check whether the answers are accurate. Inaccuracy may occur
due to carelessness or to a conscious attempt to give misleading answers, and it may arise from
either the respondent or the interviewer. An important task here is to spot possible interviewer
bias, for example when a fixed pattern of response consistently appears in the questionnaires. In
addition, one must check the respondents’ eligibility, that is, whether the appropriate respondents
were contacted or not.
There are two kinds of editing, namely field editing and central editing. Field editing is intended to
uncover errors in recording responses during the data collection stage. It allows access to the
respondent for correction and additional information. Central editing is performed when the
completed questionnaires are returned to the office. The objectives are to correct major errors,
such as those related to questionnaire identification, and to prepare the questionnaires for coding
and data entry so as to minimize the possibility of error in these later operations.
The modes of editing include manual editing and computer editing. Manual editing is performed
by a group of editors, usually the field supervisors or trained editors. These editors are given a set
of editing instructions specifying in detail the rules and guidelines to be followed in editing. For
example, when an error is detected, the editor inserts the correction alongside the original entry,
which should never be erased. The problem with manual editing is that it is a time-consuming
and costly exercise. The editor is also liable to make mistakes, and there is no guarantee that all
erroneous responses will be detected.
Computer editing involves the use of computer facilities to detect inconsistencies in the
questionnaires. It allows a large number of editing (cleaning and validation) instructions to be
executed simultaneously, and hence improves speed and accuracy.
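A computer edit is essentially a set of rules applied to every record. The sketch below illustrates completeness, range, validity and internal consistency checks. The field names, the permitted codes and the limits are hypothetical and would in practice come from the survey's own editing instructions.

```python
def edit_checks(record):
    """Return a list of problems found in one questionnaire record."""
    problems = []

    # Completeness: every required answer must be present.
    for field in ("age", "sex", "years_of_schooling"):
        if record.get(field) is None:
            problems.append(f"missing {field}")

    age = record.get("age")
    schooling = record.get("years_of_schooling")

    # Range check (the limits 0 to 110 are illustrative).
    if age is not None and not 0 <= age <= 110:
        problems.append("age out of range")

    # Validity check: only the prescribed codes are allowed.
    if record.get("sex") not in (None, "M", "F"):
        problems.append("invalid sex code")

    # Internal consistency between two answers on the same form.
    if age is not None and schooling is not None and schooling > age:
        problems.append("years of schooling exceed age")

    return problems


records = [
    {"id": 1, "age": 34, "sex": "F", "years_of_schooling": 8},
    {"id": 2, "age": 12, "sex": "M", "years_of_schooling": 15},
    {"id": 3, "age": None, "sex": "X", "years_of_schooling": 4},
]

flagged = {r["id"]: edit_checks(r) for r in records}
for record_id, problems in flagged.items():
    print(record_id, problems or "OK")
```

The function only flags problems; it does not alter the record, in keeping with the rule that the original entry is never erased.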
On the whole, the important thing to remember in editing, regardless of the approach taken, is
that the objective is to present the true picture of the universe represented by the survey and not
to hide deficiencies in the data collection operation.
b. Coding
Coding refers to the process of identifying and assigning a numeric or character symbol to
questionnaire entries, with the objective of preparing the data in a form suitable for entry into the
computer. The coding procedure is a set of rules stating that certain numbers are assigned to
variable attributes. Researchers begin thinking about a coding procedure and code book before
they collect data. For example, a survey researcher pre-codes a structured questionnaire before
collecting data. With an unstructured questionnaire, the survey responses need to be classified
first, and then post-coding starts.
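A code book can be represented as a simple lookup table. The following is a minimal sketch: the variables, the categories, the numeric codes and the code 9 for a missing or unclassifiable answer are all hypothetical choices made for illustration.

```python
# Hypothetical code book: variable -> {answer category: numeric code}
CODEBOOK = {
    "sex": {"male": 1, "female": 2},
    "marital_status": {"single": 1, "married": 2, "divorced": 3, "widowed": 4},
}
MISSING = 9  # assumed code for a missing or unclassifiable answer


def code_response(variable, answer):
    """Translate one questionnaire entry into its numeric code."""
    key = answer.strip().lower() if isinstance(answer, str) else None
    return CODEBOOK[variable].get(key, MISSING)


raw = {"sex": "Female", "marital_status": " married "}
coded = {variable: code_response(variable, answer) for variable, answer in raw.items()}
print(coded)
```

Fixing the code book before data collection corresponds to pre-coding; building it from the classified answers afterwards corresponds to post-coding.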
c. Data entry
The data must be transferred from the raw data forms into a format for computers. The aim is to
store the data in a machine-readable format, and then to use it for cleaning, validation and
statistical analysis. A subsequent transfer of part or all of the data from one sheet or file to another
is also possible. There are different ways of transferring data, of which direct entry and optical
scan sheets are the most common. Direct data entry is the more common and generally more
appropriate approach in developing countries.
Transferring data from one medium to another gives rise to a number of possible errors: some
data may be lost, some data may be repeated, and the value of some data items may be changed.
All are due to entry clerks and can be spotted, but only through the time-consuming process of
checking the entered data against the original forms. Methods of checking include:
• The double entry approach, in which every data item is entered twice, independently by
different people, and the two versions are compared for inconsistencies; or
• Visual comparison of a data listing with the original forms. This requires people working
in pairs, one reading the correct data from the survey form while the other checks the
entries in the listing.
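The double entry approach lends itself to automation. The sketch below compares two independent keyings of the same forms and lists every disagreement. The data layout (a form ID mapped to a dictionary of field values) and the sample values are assumptions made for illustration.

```python
def compare_entries(first, second):
    """Compare two independent keyings of the same forms.

    Each keying maps a form ID to a dict of field values. Returns a list of
    (form_id, field, value_in_first, value_in_second) tuples. A form that is
    present in only one keying is reported with field set to None.
    """
    discrepancies = []
    for form_id in sorted(set(first) | set(second)):
        a, b = first.get(form_id), second.get(form_id)
        if a is None or b is None:
            discrepancies.append((form_id, None, a, b))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append((form_id, field, a.get(field), b.get(field)))
    return discrepancies


clerk_a = {101: {"age": 34, "sex": 2}, 102: {"age": 51, "sex": 1}}
clerk_b = {
    101: {"age": 43, "sex": 2},   # digits transposed by one of the clerks
    102: {"age": 51, "sex": 1},
    103: {"age": 27, "sex": 2},   # form keyed by only one clerk
}

issues = compare_entries(clerk_a, clerk_b)
for issue in issues:
    print(issue)
```

A disagreement shows only that at least one keying is wrong; the original form still has to be consulted to decide which value is correct.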
In undertaking the data processing activities mentioned above, the manpower required (editors,
data entry clerks, etc) must be recruited and trained.
d. Tabulation
Tabulation refers simply to counting the number of elements or cases that fall into each coded
category. Its primary objective is to organize data by groups so as to present information in a
quantifiable and readily understandable format. Data tabulation may take the form of a simple
tabulation or a cross tabulation.
Simple tabulation involves counting a single variable, and presents an empirical distribution of the
number of observations that fall into each category of response. For example, data on gender can
be tabulated for male and female.
Simple tabulation by sex:
Class      Frequency
Male          500
Female        450
Total         950
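A simple tabulation is a frequency count of one variable. As a minimal sketch, the counts in the table above can be reproduced with Python's standard `collections.Counter`; the list of responses is constructed artificially to match the table.

```python
from collections import Counter

# Artificial list of coded responses matching the table above.
responses = ["Male"] * 500 + ["Female"] * 450

table = Counter(responses)
total = sum(table.values())

print(f"{'Class':<8}{'Frequency':>10}")
for category, frequency in table.items():
    print(f"{category:<8}{frequency:>10}")
print(f"{'Total':<8}{total:>10}")
```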
Cross tabulation is a technique for organizing data by specific groups, categories or classes to
facilitate comparison. In cross tabulation, two or more of the variables are treated simultaneously,
and the number of cases that have the joint characteristics is counted. An example of three-way
cross-classification is shown below.
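In code, a cross tabulation amounts to counting combinations of values rather than single values. The sketch below builds a three-way classification and then collapses it into a two-way table. The variables (sex, residence, literacy) and the eight records are hypothetical and are not data from any survey.

```python
from collections import Counter

# Hypothetical coded records: (sex, residence, literate)
records = [
    ("Male", "Urban", "Yes"), ("Male", "Urban", "No"),
    ("Male", "Rural", "No"), ("Male", "Rural", "Yes"),
    ("Female", "Urban", "Yes"), ("Female", "Rural", "No"),
    ("Female", "Rural", "No"), ("Female", "Rural", "Yes"),
]

# Three-way cross-classification: one cell per joint combination.
cells = Counter(records)

# Collapsing over literacy gives the two-way table sex by residence.
by_sex_residence = Counter((sex, residence) for sex, residence, _ in records)

for cell, count in sorted(cells.items()):
    print(cell, count)
```

Each added variable multiplies the number of cells, so cell counts become small quickly unless the sample is large.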
Ideally, the general tabulation plan should be devised at the questionnaire design stage.
However, at the stage of implementation, the data processor requires from the subject matter
specialists a detailed and unambiguous specification of exactly how each table is constructed and
what its layout is.
Before tabulation it is necessary to decide whether to use the weighted or the unweighted
data. If the weighted data are used, it is important to weight the data appropriately depending on
the type of sampling design used, and to adjust for non-coverage and non-response.
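The difference between weighted and unweighted results can be illustrated with a small example. The sketch below assumes one common approach that the text does not prescribe: the design weight is the inverse of the selection probability, and non-response is adjusted for within each stratum by the ratio of selected to responding units. The strata and all figures are hypothetical.

```python
def design_weight(selection_probability):
    """Design weight: inverse of the unit's probability of selection."""
    return 1.0 / selection_probability


def nonresponse_adjusted(weight, n_selected, n_responded):
    """Inflate the weight so respondents also represent non-respondents."""
    return weight * n_selected / n_responded


# Hypothetical strata sampled at different rates.
strata = {
    "urban": {"prob": 0.02, "selected": 100, "responded": 80, "yes": 40},
    "rural": {"prob": 0.01, "selected": 100, "responded": 100, "yes": 30},
}

weighted_yes = 0.0
weighted_total = 0.0
for stratum in strata.values():
    w = nonresponse_adjusted(
        design_weight(stratum["prob"]), stratum["selected"], stratum["responded"]
    )
    weighted_yes += w * stratum["yes"]
    weighted_total += w * stratum["responded"]

unweighted = sum(s["yes"] for s in strata.values()) / sum(
    s["responded"] for s in strata.values()
)
weighted = weighted_yes / weighted_total

print(f"unweighted proportion: {unweighted:.3f}")
print(f"weighted proportion:   {weighted:.3f}")
```

Here the urban stratum was sampled at twice the rural rate, so the unweighted proportion gives urban respondents too much influence; weighting restores each stratum to its share of the population.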
Another way of presenting the same tabulated data is in graphic form. Some common types of
graphic presentation are the histogram, bar chart, pie chart, line graph and frequency polygon.
The main divisions of data analysis and interpretation, together with the respective statistical tools
and techniques adopted are outlined as follows:
a. Describing data
• Measures of central tendency (arithmetic mean, median, mode)
• Measures of dispersion (range, quartile deviation, standard deviation)
• Statistical estimation (point and interval estimation, assessing differences)
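The descriptive measures listed above can be computed with Python's standard `statistics` module. The sketch below uses a small hypothetical sample of ages. The interval estimate uses a normal approximation purely for illustration; for a sample this small a t critical value would be more appropriate.

```python
import statistics
from math import sqrt
from statistics import NormalDist

ages = [23, 25, 25, 31, 38, 42, 47]  # hypothetical sample

# Central tendency
mean = statistics.mean(ages)
median = statistics.median(ages)
mode = statistics.mode(ages)

# Dispersion
value_range = max(ages) - min(ages)
q1, _, q3 = statistics.quantiles(ages, n=4)  # quartiles (default 'exclusive' method)
quartile_deviation = (q3 - q1) / 2
stdev = statistics.stdev(ages)               # sample standard deviation

# Point estimate (the mean) and an approximate 95% interval estimate
z = NormalDist().inv_cdf(0.975)
half_width = z * stdev / sqrt(len(ages))
interval = (mean - half_width, mean + half_width)

print(mean, median, mode, value_range, quartile_deviation)
print(round(stdev, 2), tuple(round(x, 1) for x in interval))
```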
b. Testing hypotheses
• Formulate the null and alternative hypotheses.
• Specify the level of significance (α).
• Select the appropriate test statistic.
• Determine the critical value for the chosen level.
• Compute the value of the test statistic using the sample data.
• Compare the computed value of the test statistic with the critical value.
• Reject the null hypothesis H0 if the computed test statistic falls outside the acceptance
region.
• Accept (that is, do not reject) the null hypothesis H0 if the computed test statistic falls within
the acceptance region.
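The steps of a hypothesis test can be traced in a few lines of code. The sketch below runs a two-sided one-sample Z-test with a known population standard deviation. All the figures (hypothesized mean 50, standard deviation 8, sample of 64 with mean 52) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses. H0: mu = 50 against H1: mu != 50 (two-sided).
mu_0 = 50.0

# Step 2: level of significance.
alpha = 0.05

# Step 3: test statistic. Z is used, assuming the population sigma is known.
sigma = 8.0

# Step 4: critical value for the chosen level (two-sided).
critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: compute the statistic from the (hypothetical) sample data.
n = 64
sample_mean = 52.0
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Steps 6 and 7: compare and decide.
reject_h0 = abs(z) > critical

print(f"z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject_h0}")
```

The acceptance region here is the interval from minus the critical value to plus the critical value; the computed statistic falls just outside it.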
In significance testing, a large number of statistical techniques are available, and one has to select
the most appropriate one for use in a particular situation. The appropriate test will vary according
to the scale level of the data.
Statistical tests for Nominal Data include chi-square analysis (chi-square goodness-of-fit test,
chi-square test of independence), the McNemar test and the Cochran Q test.
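As an illustration of a test for nominal data, the chi-square test of independence can be computed by hand for a small contingency table. This is a sketch with hypothetical counts and no continuity correction. The critical value 3.841 is the standard tabulated chi-square value for 1 degree of freedom at the 5% level.

```python
# Hypothetical 2 x 2 table of observed counts:
# rows = sex (male, female), columns = answer (yes, no)
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count if the two variables were independent.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_5_PERCENT_DF1 = 3.841  # tabulated value for df = 1, alpha = 0.05
reject_h0 = chi_square > CRITICAL_5_PERCENT_DF1

print(f"chi-square = {chi_square:.2f}, df = {df}, reject independence: {reject_h0}")
```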
Statistical tests for Ordinal Data include the Kolmogorov-Smirnov test, median test, Mann-
Whitney U test, Kruskal-Wallis test, Wilcoxon T test and Friedman 2-way analysis of variance.
Statistical tests for Interval and Ratio Data: The Z-test and t-test are more powerful statistical
tests designed specifically for interval and ratio data.
c. Measuring Association
Many statistical techniques are available to measure the intensity of association between variables.
This part requires relatively more sophisticated techniques. Association can be measured by
simple and multiple regression and correlation, multiple discriminant analysis, multivariate
analysis of variance, canonical analysis, factor analysis, cluster analysis and multidimensional
scaling. The choice of technique depends on the number of variables involved, the level of
measurement, and whether the variables are dichotomous or multichotomous.
A popular report is intended for a more general audience who is interested in reading the survey
findings but would not be particularly bothered about the survey techniques adopted. It makes less
use of detailed, complex statistical tables. It is designed for rapid reading and easy comprehension
of the main findings of the survey. With these objectives in mind, the report will normally make
more use of flow diagrams, pictures, charts and graphs.
There is no standard style or format for a survey report. The form, length, style and degree of
technicality of a survey report will depend on the subject, the size of the study, the type of reader
for whom it is intended and, to a lesser extent, the relationship between the survey sponsor and
the investigator. The major components of an actual report appear below; this is merely a
suggested format.
Title page
Content page
Executive summary
Background/ Introduction
Survey objectives/ Describing the methods
Detailed findings/ Results and tables
Conclusions and recommendations
Appendix and reference
Chapter 11: Non-Sampling Error
11.1. The Nature of Survey and Error
The ultimate objective of any statistical survey is to produce estimates for specified characteristics,
applicable to a specified population at a given time. The estimates are the results of the survey,
which may require a series of steps to produce. These results are used to make quantitative
statements about the population studied. These may be descriptive statements about the aggregate
population, analytical statements about the relationships among subgroups of the population, or
interpretive statements about the nature of social or economic processes.
A survey error occurs when there is a discrepancy between the statements and reality. The error of
a particular survey estimate is the difference between that estimate and the true value of the
characteristic being estimated. Survey errors are generally divided into two major types: sampling
errors and non-sampling errors.
Sampling errors are present by design and result from the conscious choice to obtain an estimate
from a sample rather than from the whole population. Efforts to control sampling error are
grounded in a well-developed theory, so that in designing a sample a researcher focuses on the
development of estimation formulas and random selection techniques. Sampling errors are not
the result of mistakes per se, although mistakes in judgment when designing a sample may cause
larger errors than necessary.
Non-sampling errors comprise all errors other than sampling error that contribute to survey error.
They would arise even if every unit in the population were investigated. Non-sampling errors are
often thought of as being due entirely to mistakes and deficiencies during the development,
execution and analysis of survey procedures. They are said to arise from wrongly conceived
definitions, imperfections in the tabulation plans, failure to obtain responses from all sample
members, and so on. There is no comprehensive theory for assessing the impact of non-sampling
errors because of the complex nature of surveys and the multiple opportunities for error. Some
common types of non-sampling error are response errors, measurement errors, recording and
transcription errors, processing errors, selection bias, etc. Unlike sampling error, these errors
are not random in nature. They usually tend to bias the estimate in one direction.
The effect is summarized as follows:
    Total error = Sampling error + Non-sampling error
It can be measured by:
    Mean Square Error = Sampling Variance + Square of the bias
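The decomposition of the mean square error can be verified numerically. The sketch below imagines the same survey repeated five times, each time producing a different estimate of a known true value. The estimates and the true value are hypothetical; in a real survey the true value is unknown, which is why the bias usually cannot be computed directly.

```python
import statistics

true_value = 100.0

# Hypothetical estimates from repeated surveys with a systematic upward bias.
estimates = [103.0, 101.5, 104.2, 102.8, 103.5]

# Mean square error: average squared distance from the true value.
mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])

# Bias: how far the average estimate lies from the true value.
bias = statistics.fmean(estimates) - true_value

# Sampling variance: spread of the estimates around their own average.
variance = statistics.pvariance(estimates)

print(f"MSE                = {mse:.4f}")
print(f"variance + bias^2  = {variance + bias ** 2:.4f}")
```

Increasing the sample size reduces the variance term but leaves the bias term untouched, which is why non-sampling errors can dominate the total error of a large survey.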
A detailed treatment of the sources, measurement and control of non-sampling errors requires that
they be further broken down and categorized in ways that facilitate understanding of their nature.
Several schemes for classifying non-sampling errors are possible. One approach is to classify non-
sampling errors by the stage of the survey in which they occur. The three major stages of a survey
are survey design and preparation, data collection, and data processing and analysis. Each of these
stages can be subdivided, and the subdivisions are useful in discussing the control of non-sampling
error.
A second method of approaching non-sampling error is on the basis of observational and non-
observational errors. Observational errors include questionnaire error, data processing error, and
analysis (reporting) error, while non-observational error consists of interviewer error, respondent
error, and coverage error. The underlying measurement and control of non-sampling errors of
these types are as follows.
11.2.1. Non-observational
a) Coverage error
Coverage errors include non-coverage and over-coverage of the survey units. Non-coverage is
failure to include some units of observation, either directly or implicitly, in the operational
sampling frame. Failure to cover all units will result in an undercount of the total population. In
such cases, non-coverage will lead to error in the sample results if the missed units differ in
characteristics from the units covered.
Errors of non-coverage should be distinguished from the deliberate and explicit exclusion of a
section of the population from the defined target population. Survey objectives and practical
difficulties determine such deliberate exclusions. Such explicit exclusions from the study
population are not errors of non-coverage.
Over-coverage or duplication is the inclusion of some units in the frame more than once, giving
them a larger than intended chance of selection into the sample. The sum of the absolute values of
non-coverage and over-coverage errors gives the gross coverage error.
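Gross coverage error can be illustrated by comparing a frame with the target population it is meant to list. The household identifiers below are hypothetical. The net figure is not defined in the text and is shown only for contrast, to indicate why the two kinds of error are added in absolute value rather than allowed to offset each other.

```python
from collections import Counter

# Hypothetical target population and sampling frame (household IDs).
target_population = {"H01", "H02", "H03", "H04", "H05", "H06"}
frame = ["H01", "H02", "H02", "H04", "H05", "H05", "H05"]

frame_counts = Counter(frame)

# Non-coverage: target units that never appear in the frame.
non_coverage = len(target_population - set(frame))

# Over-coverage: extra listings of units that appear more than once.
over_coverage = sum(count - 1 for count in frame_counts.values() if count > 1)

gross_coverage_error = abs(non_coverage) + abs(over_coverage)

# Shown for contrast only: omissions and duplications partly cancel.
net_difference = len(frame) - len(target_population)

print("non-coverage:", non_coverage)
print("over-coverage:", over_coverage)
print("gross coverage error:", gross_coverage_error)
print("net difference in size:", net_difference)
```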
Frame: It is necessary that the units at each stage of the sampling frame be exhaustive, non-
overlapping and uniquely identifiable in the field. Coverage errors arise because one or more of
these conditions are violated. Exhaustive means that all elements in the target population are
accounted for. Non-overlapping means that each lower stage unit belongs to one and only one unit
at the next higher stage. Unique identification refers to an unambiguous description of each unit,
and correspondence between the frame and the actual situation in the field.
Use of inappropriate sampling frame: The use of fixed area sampling frames is not generally applicable
to nomadic, semi-nomadic or other highly mobile populations. For such populations, special
methods such as sampling through tribal structures or enumeration at water points have to be tried.
Whatever the method used, serious under-coverage of such populations can be expected in many
circumstances.
Incorrect application of sampling procedure: When field workers are asked to select the sample
themselves, errors are more likely to occur and more difficult to control. The reasons are that:
• Some of them will tend to favor smaller households to keep their work load small.
• Others, with good intentions, may substitute larger neighboring households for small
households.
• Some field workers may favor more accessible, centrally located units over those at the
boundaries of the sample area.
Clearly, asking enumerators to select the sample is bad practice and should be avoided.
b) Interviewer error
The interviewer is responsible for collecting data from the respondent in the most accurate and
efficient manner using proper interviewing techniques. A skilled interviewer can help the
respondent to provide accurate responses. At the same time, interviewers can be a source of error
by failing to put the question clearly, by influencing respondents to answer incorrectly, by
misrecording correct responses, by cheating, through interviewer variability, and through errors in
respondent selection.
c) Respondent error
Respondent error can be broadly classified into two categories: response error and non-response
error.
i) Non-response error
Non-response error arises from failure to obtain information from a designated sampling unit or
population element, such as a household or other unit of observation that has been selected for
inclusion in the survey, or from failure to obtain all or some of the data items that were to be
collected. There can be several reasons for failing to obtain complete results from all the selected
units, and they arise from different sources depending upon the survey situation. For example:
• The field workers may fail to locate a selected household or sampling unit because some of
the sample areas are inaccessible or because of civil disturbance, security problems, floods
and other natural calamities.
• The interviewer may fail to contact respondents because they are away when the interviewer
calls, possibly for the entire survey period.
• Respondents may be unable or unsuitable for interview because of physical, mental,
emotional, or language problems.
• Failure to gain cooperation. The respondent may be unwilling or unable to respond due to
factors such as the reputation of the organization conducting the survey, the nature of the
questions to be asked, the length of the interview, etc.
• The completed questionnaires may be lost or damaged after the interview because
field workers could not protect them properly.
• Some of the data cannot be used because of poor quality or cheating.
ii) Response error
Response error occurs in the data collection phase of a survey, and is distinguished from errors
that occur in the data processing phase. Response error arises when information is obtained from
the respondent but is incorrect. It may be unintentional or deliberate on the part of the
respondent. A person may not know his exact age, or he may report his age wrongly even when he
knows it. In some cultures, fear of the “evil eye” is known to affect the reporting of births, farm
produce, etc.
There are two basic sources of response error: respondents and interviewers.
For example, the inability of respondents to provide the desired information is a common source of
response error. This may arise from lack of knowledge, difficulty in recalling facts from the
distant past, misunderstanding of the questions, unwillingness to give the correct answer, etc.
Respondents sometimes purposely report certain information incorrectly to protect their dignity or
prestige, or simply to conform to what they think is appropriate. For example:
• Illiterate people report that they are able to read or write.
• Some people raise the level of their education, others the grade of their occupation.
• Some people exaggerate the salaries they receive and the rent they pay.
11.2.2. Observational
a) Questionnaire error: the sources of questionnaire error include poor design, the type of
questions used, an excessively long questionnaire, inadequate interviewer instructions, or wrong
measurement/attitudinal scales.
b) Data processing error may be caused by errors in editing data, in coding, in computer data
entry, and in tabulation.
c) Analysis (reporting) error refers to the use of inappropriate statistical methods in the
interpretation of the data.
Chapter 12: Measurements and Scaling
12.1. Levels of Measurement
A fundamental step in the conduct of research is measurement: the process through which
observations are translated into numbers. The nature of the measurement process that produces the
numbers determines the interpretation that can be made from them and the statistical procedures
that can be meaningfully used with them.
The four levels, from lowest to highest precision, are nominal, ordinal, interval, and
ratio. Each level gives a different type of information. Nominal measures indicate only that there is
a qualitative difference among categories (e.g. Religion: Protestant, Catholic, Jew, Muslim).
Ordinal measures indicate a difference, plus the categories can be ordered or ranked (e.g. Letter
grades: A, B, C, D). Interval measures do everything the first two do, plus they can specify the
amount of distance between categories (e.g. Temperature: 10° Celsius, 15° Celsius, 45° Celsius; IQ
scores: 95, 110, 125). Any zero on an interval scale is arbitrary; it is there just to help keep
score. Ratio measures do everything all the other levels do, plus there is a true zero, which makes
it possible to state relations in terms of proportions or ratios (e.g. Money income: $100, $500;
years of schooling: 1 year, 10 years, 13 years).
Discrete variables are typically measured at the nominal or ordinal level, whereas continuous
variables can be measured at the interval or ratio level. A ratio-level measure can be turned into
an interval, ordinal, or nominal measure. An interval-level measure can always be turned into an
ordinal or nominal measure, but the process does not work in the opposite direction!
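The one-way conversion between levels described above can be sketched in Python. This is only an
illustration: the income cut-offs (200 and 400) and the category labels are hypothetical and do not
come from these notes.

```python
# Minimal sketch: collapsing a ratio-level measure into lower levels.
# The cut-offs and labels below are hypothetical, chosen only for illustration.

def income_to_ordinal(income):
    """Collapse a ratio-level income into an ordered category."""
    if income < 200:
        return "low"
    if income < 400:
        return "medium"
    return "high"

def ordinal_to_nominal(category):
    """Collapse an ordered category into an unordered label."""
    return "high earner" if category == "high" else "not high earner"

incomes = [100, 250, 500]  # ratio level: "500 is five times 100" is meaningful
ordinal = [income_to_ordinal(x) for x in incomes]    # order is kept, distances are lost
nominal = [ordinal_to_nominal(c) for c in ordinal]   # only a qualitative difference is kept

print(ordinal)
print(nominal)
```

Once the incomes have been collapsed to "low", "medium" and "high", the original dollar amounts
cannot be recovered from the categories, which is why the conversion only works downward.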
12.2. Rating Scales
Scaling creates an ordinal, interval, or ratio measure of a variable expressed as a numerical score.
Scales are common in situations where a researcher wants to measure how an individual feels or
thinks about something. The respondent places a mark at a specific point along a numerically valued
continuum, or ticks off the response(s) from among the numerically ordered series of categories
provided, to express his or her attitude towards the object under investigation.
Scaling is based on the idea of measuring the intensity, hardness, or potency of a variable. For
example, graphic rating scales are an elementary form of scaling. People indicate a rating by checking a
point on a line that runs from one extreme to another. This type of scale is easy to construct and
use. It conveys the idea of a continuum, and assigning numbers helps people think about quantities.
Scales assume that people with the same subjective feeling mark the graphic scale at the same place.
Illustrative example: In a market survey, scale points are provided for a question such as “Can you
please rate this hair conditioner’s performance in terms of making your hair soft and manageable?”
Least reliable                                        Most reliable
     1     2     3     4     5     6     7     8     9     10
There are various types of rating scales employed in different kinds of research. Our primary focus
is only on the commonly used scales.
Semantic Differential Scale:
This technique measures the difference between words. It combines verbal and diagrammatic
techniques using a seven-point, neutral-centered, bi-polar scale. The respondent places a cross
‘X’ in the position which indicates his or her thinking about a product or service in terms of the
‘construct’ or ‘dimension’ defined by a pair of bi-polar adjectives.
Many bi-polar adjective pairs have been developed that have wide application:
Active/passive             Savoury/tasteless
Cruel/kind                 Hard/soft
Curved/straight            New/old
Masculine/feminine         Good/bad
Untimely/timely            Weak/strong
Unsuccessful/successful    Usual/unusual
Important/unimportant      Colourless/colourful
Angular/rounded            Slow/fast
Calm/excitable             Beautiful/ugly
False/true                 Wise/foolish
The Semantic Differential was developed to provide an indirect measure of how a person feels about a
concept, object, or other person. The technique measures subjective feelings toward something by
using adjectives such as those listed above. The Semantic Differential has been used for many
purposes. In marketing research, it is used to find out how consumers feel about a product, to
evaluate the image of a bank, to compare the images of different car models, etc.
Studies of a wide variety of adjectives found that they fall into three major classes of meaning:
evaluation (good-bad or valuable-worthless or true-false), potency (strong-weak or heavy-light or
hard-soft), and activity (fast-slow or active-passive or hot-cold). Of the three classes of meaning,
evaluation is usually the most significant.
The seven positions hold score values from 1 to 7, with larger values assigned to responses nearer the
favorable adjective. For the bi-polar adjective pair good/bad, the seven points are:
Extremely good          7
Very good               6
Somewhat good           5
Neither good nor bad    4
Somewhat bad            3
Very bad                2
Extremely bad           1
For example, in a survey for evaluating bank image, the bi-polar adjectives shown below are used.
               7   6   5   4   3   2   1
Secure       : x :   :   :   :   :   :   : Insecure
Aggressive   :   : x :   :   :   :   :   : Conservative
Friendly     :   :   : x :   :   :   :   : Unfriendly
Large        :   :   :   : x :   :   :   : Small
This rating shows that the bank is seen as extremely secure, very aggressive, somewhat friendly, and
neither large nor small.
Profile or disaggregate analysis is just one possible analysis for semantic differential data. In this
approach the average scores of the various groups of individuals are calculated for each
pair of polar adjectives and plotted on a master graph known as a ‘snake’ diagram.
For example: Two groups, G1 and G2, of 100 respondents each were interviewed to offer their
ratings of a given company in terms of four pairs of bi-polar adjectives: powerful-weak,
old fashioned-modern, reliable-unreliable, and rude-polite.
Powerful                                  Weak
       7    6    5    4    3    2    1
G1    20   42   10    5    4   12    7    AV = 5.05
G2     8    9   12   20   25   20    6    AV = 3.71

Reliable                                  Unreliable
       7    6    5    4    3    2    1
G1    52   12    8   10    8    5    5    AV = 5.55
G2    10   15   12   22   35    6    0    AV = 4.25

Rude                                      Polite
       7    6    5    4    3    2    1
G1    15    5   25   20   10   10    5    AV = 3.75
G2    15    5   15   20   15   20   20    AV = 3.75
The above average scores were calculated by summing the weighted scores and then dividing the sum
by the number of respondents. For example, the average score offered by G1 on the powerful-weak
attribute is:
(20×7 + 42×6 + 10×5 + 5×4 + 4×3 + 12×2 + 7×1) / 100 = 505 / 100 = 5.05
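The averaging step described above can be sketched in Python. This is a minimal illustration, not
part of the original notes: the function name mean_rating and the dictionary layout are assumptions,
while the frequency counts are the powerful-weak and reliable-unreliable figures from the tables
above.

```python
# Minimal sketch: group averages for a semantic differential profile analysis.
# mean_rating and the data layout are illustrative assumptions.

def mean_rating(counts, top=7):
    """Weighted mean of a frequency row.

    counts[0] is the number of respondents choosing score `top`,
    counts[1] the number choosing `top - 1`, and so on down to 1.
    """
    scores = range(top, top - len(counts), -1)
    weighted_total = sum(score * n for score, n in zip(scores, counts))
    return weighted_total / sum(counts)

# Frequency counts for scores 7, 6, 5, 4, 3, 2, 1 (from the tables above).
ratings = {
    "powerful-weak": {"G1": [20, 42, 10, 5, 4, 12, 7], "G2": [8, 9, 12, 20, 25, 20, 6]},
    "reliable-unreliable": {"G1": [52, 12, 8, 10, 8, 5, 5], "G2": [10, 15, 12, 22, 35, 6, 0]},
}

# One average per adjective pair and group: the points plotted on a 'snake' diagram.
profile = {
    pair: {group: round(mean_rating(counts), 2) for group, counts in groups.items()}
    for pair, groups in ratings.items()
}
print(profile)
```

Plotting the values in profile against the adjective pairs, one line per group, gives the profile
comparison of G1 and G2.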
Likert Scale:
A Likert scale consists of a series of statements, each followed by four or five response
alternatives. A list of attitude statements about the object under study is compiled, and the
respondent indicates his or her degree of agreement with each of these statements related to the
object in question, mostly on a five-point scale. The respondents are usually offered five
categories: strongly agree (SA), somewhat agree (A), neither agree nor disagree (uncertain,
undecided) (UD), somewhat disagree (D), and strongly disagree (SD), though three or seven divisions
are sometimes used. The respondents are asked to select the position corresponding most closely
with their opinion.
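To score the scale, the response categories must be weighted. For favorable or positively stated
items, the numbers 5, 4, 3, 2, 1, respectively, are assigned to the response categories beginning at
the favorable end. For unfavorable or negatively stated items the weights are reversed, so that
strong disagreement with a negative item receives the highest value.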
*7. Only God has the right to take a human life.
*These are negative items, agreement with which is considered to reflect a negative or unfavorable
attitude toward capital punishment.
A person with a favorable attitude towards the object would agree with the positive items and
disagree with the negative items. The numbers associated with each response are summed to provide
the overall score for each respondent. In this case, a 10-item scale, individual scores can
range from a low of 10 (if alternative “1” were chosen every time) to a high of 50 (if alternative “5”
were chosen every time). In general, the highest possible scale score is (5 x the number of items)
and the lowest possible score is (1 x the number of items).
Note that each item in a Likert scale is an ordinal measure, ranging from a low of “strongly disagree”
to a high of “strongly agree”. Likert scales are widely used and very common in survey research.
They are called summated-rating or additive scales because a person’s score on the scale is computed
by summing the numerical values of the responses the person gives.
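The summated scoring described above can be sketched in Python. This is a minimal illustration
under stated assumptions: the function name score_likert and the sample answers are hypothetical,
responses are coded 1 (strongly disagree) to 5 (strongly agree), and only item 7 is treated as a
negative item because it is the only starred item that appears in these notes.

```python
# Minimal sketch: scoring a summated (Likert) scale with reverse-coded negative items.
# score_likert and the sample responses are hypothetical, for illustration only.

def score_likert(responses, negative_items=(), points=5):
    """Return the total scale score for one respondent.

    responses: dict mapping item number -> chosen value,
               coded 1 (strongly disagree) to `points` (strongly agree).
    negative_items: item numbers whose weights are reversed.
    """
    total = 0
    for item, value in responses.items():
        if not 1 <= value <= points:
            raise ValueError(f"item {item}: value {value} is outside 1..{points}")
        # Reverse the weight of negative items: 1 <-> 5, 2 <-> 4, 3 stays 3.
        total += (points + 1 - value) if item in negative_items else value
    return total

# A hypothetical respondent on a 10-item scale: "agree" (4) on every item
# except the negative item 7, where the answer is "disagree" (2).
answers = {item: 4 for item in range(1, 11)}
answers[7] = 2
total_score = score_likert(answers, negative_items={7})
print(total_score)  # 40: nine items scored 4, plus item 7 reversed from 2 to 4
```

With reverse coding, a respondent who answers consistently in the favorable direction reaches the
maximum of 50 on a 10-item scale, and the minimum remains 10.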
In deciding which items will ultimately be used in a Likert scale, a final criterion is whether each
item discriminates among people. We want to eliminate non-discriminating items from
consideration for our scale. Non-discriminating items are those that are responded to in a similar
fashion by both people who score high and people who score low on the overall score. Such items
can be identified on the basis of results from a pretest in which people respond to all the
preliminary items of the scale. One way of identifying non-discriminating items is by computing a
discriminative power score (DP score) for each item.
The first step in obtaining the DP score is to calculate the total score of each respondent and rank
the scores from the highest to the lowest. We then identify the upper and lower quartiles of the
distribution of the total score. The upper quartile (Q3) is the cut-off point in a distribution above
which the highest 25 percent of the scores are located, and the lower quartile (Q1) is the cut-off
point below which the lowest 25 percent of the scores are located.
The following example illustrates the computation of DP scores for one item in a scale to which 40
people responded.
                   Response value       Weighted   Weighted   DP
Quartile    N      1   2   3   4   5    total      mean       score
Upper      10      0   1   2   4   3    39         3.9        2.1
Lower      10      2   8   0   0   0    18         1.8
Of the 40 respondents, 10 are above the upper quartile and 10 are below the lower quartile. The next
step is to compute a weighted total on this item for the two groups. This is done by multiplying
each score by the number of respondents with that score. For those above the upper quartile, the
weighted total is:
(0 × 1) + (1 × 2) + (2 × 3) + (4 × 4) + (3 × 5) = 39
The weighted mean is computed by dividing the weighted total by the number of cases in the
quartile, i.e., 39/10 = 3.9. In a similar way, for the lower quartile, the weighted mean is
computed and found to be 18/10 = 1.8. The DP score is then 3.9 – 1.8 = 2.1. This process is repeated
for every item in the preliminary scale so that each item has a calculated DP score on which to base
the final selection. The best items are those with the highest DP scores because this shows that
people in the upper and lower quartiles responded to the items very differently. As a general rule
of thumb, as many items as possible should have DP scores of 1.0 or greater, and few if any should
drop below 0.50.
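The DP computation described above can be sketched in Python. This is a minimal illustration: the
function names weighted_mean and dp_score are assumptions, and the counts are those of the example
item above (10 respondents in each quartile group).

```python
# Minimal sketch: discriminative power (DP) score for one Likert item.
# Function names are illustrative assumptions; counts come from the example above.

def weighted_mean(counts):
    """counts[i] is the number of respondents who chose response value i + 1."""
    weighted_total = sum(value * n for value, n in enumerate(counts, start=1))
    return weighted_total / sum(counts)

def dp_score(upper_counts, lower_counts):
    """Difference between the item means of the upper and lower quartile groups."""
    return weighted_mean(upper_counts) - weighted_mean(lower_counts)

upper = [0, 1, 2, 4, 3]  # response values 1..5, respondents above the upper quartile
lower = [2, 8, 0, 0, 0]  # response values 1..5, respondents below the lower quartile

dp = dp_score(upper, lower)
print(round(dp, 2))  # 2.1
```

Running the same function over every preliminary item gives one DP score per item, which can then
be compared against the rule of thumb given above.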
The Likert scale is one of the most popular multiple-item scales because of the many advantages it
possesses:
• It offers respondents a range of choices rather than the limited “yes-no” alternatives
offered by some other scales.
• Data produced by Likert-type scales are considered to be ordinal level, which enables
us to use more powerful statistical techniques than nominal-level data allow.
• Likert scales are fairly straightforward to construct.
Whenever we summarize data we lose some information. One must be careful in interpreting a
single score based on a Likert scale because it is a summary of so much information.