Sample LLM
UNIT 2
SAMPLING DESIGN AND DATA COLLECTION
Structure
2.1 Introduction
2.2 Objectives
2.3 Meaning of sampling
2.3.1 Sampling Design
2.3.2 Characteristics of Sampling Design
2.4 Types of sample design
2.5 Data in research
2.5.1 Importance of accuracy in Data Collection
2.5.2 Types of data
2.5.3 Methods of collecting primary data
2.5.4 Sources of secondary data
2.6 Summary
2.7 Glossary
2.8 References/Bibliography
2.9 Suggested Readings
2.10 Terminal Questions
2.1 INTRODUCTION
Sampling is a process used in statistical analysis in which a predetermined number of
observations are taken from a larger population. The methodology used to sample from a
larger population depends on the type of analysis being performed but may include
simple random sampling or systematic sampling.
2.2 Objectives
After reading this unit the learner will be able to:
Understand the meaning of sampling
Understand sampling design
Understand the characteristics of sampling design
Understand the aims in selecting a sample
Understand the various types of sample design
Understand the role of data in research
Understand the types of data
1. Proportional: Sample design must result in a truly representative sample. This means
that the sample selected should be exactly or almost similar to the population it
represents in terms of data and characteristics.
2. Error Free: Sample design should reduce the probability of errors. Keeping the
number of errors in a sample to a minimum ensures that correct data are obtained and analyzed.
3. Budgeted: Sample design must be practical and be within the limits of funds
available for the research study.
5. Generalization of Results: Sample should be such that the results of the sample
study can be applied, in general, for the universe with a reasonable level of
confidence.
While developing a sampling design, the researcher must pay attention to the following
points:
i. Type of universe: The accuracy of the results in any study depends on how
clearly the universe or population of interest is defined. The universe can be
ii. Sampling unit: The sampling unit can be anything that exists within the
population of interest. An assessment has to be taken with reference to a
sampling unit before selecting sample. Sampling unit may be:
iii. Source list: It is also known as the 'sampling frame' from which the sample is to be
drawn. It contains the names of all items of a finite universe. If a source list is
not available, the researcher has to prepare it. Such a list should be
comprehensive, correct, reliable and appropriate. It is extremely important for
the source list to be as representative of the population as possible.
iv. Size: The sample size should be justified; it should be neither excessively
large nor too small. Preferably the sample size should be optimal, i.e. one that
fulfills the requirements of efficiency, representativeness, reliability and
flexibility, so that dependable outcomes are obtained.
Population variance, population size, parameters of interest, and
budgetary constraints are some of the factors that impact the sample size.
vii. Sampling procedure: Finally, the researcher must decide the type of sample
he will use i.e., he must decide about the technique to be used in selecting the
items for the sample. In fact, this technique or procedure stands for the sample
design itself. An ideal design is the one that for a given sample size and for a
given cost, has a smaller sampling error.
With non-probability sampling methods, we do not know the probability that each
population element will be chosen, and/or we cannot be sure that each population
element has a non-zero chance of being chosen.
This offers the advantages of convenience and low cost, but the disadvantage is that
non-probability sampling methods do not allow us to estimate the extent to which sample
statistics may differ from population parameters.
E.g. To study the average spending or average number of days stayed by tourists visiting
religious destinations, the researcher is free to choose destinations and treat
them as representative of all other religious destinations.
Probability sampling: This sampling technique uses randomization so that every
element of the population has a known chance of being part of the selected sample; in
its simplest form, every element has an equal chance. It is also known as 'random
sampling' or 'chance sampling'. E.g. when the winner of a lottery is selected through a
mechanical process, all ticket holders have an equal chance of winning.
With probability sampling methods, each population element has a known (non-zero)
chance of being chosen for the sample.
Random sampling gives every element the same chance of selection, which makes it
possible to measure the errors of estimation and the significance of results obtained
from the sample. This is its main advantage over deliberate sampling design. The Law of
Statistical Regularity applies here: it states that, on average, a sample chosen at
random will have the same composition and characteristics as the universe. This is the
reason why random sampling is considered the best technique for selecting a
representative sample.
a) It gives each element in the population an equal probability of getting into the sample;
and all choices are independent of one another.
b) It gives each possible sample combination an equal probability of being chosen.
Voluntary Sampling: The sample consists of people who have a keen interest in the topic
of the survey being conducted and who come forward themselves to contribute as
respondents.
E.g. A survey or online poll conducted on a social site like Facebook attracts
volunteers with common interests, who choose to participate in it.
E.g. To study the popularity of handmade products or traditional goods the researcher
may choose to visit a local fair where it is easy to reach out to buyers of such goods
and services.
Probability Sampling Methods: The main types of probability sampling methods are
simple random sampling, stratified sampling, cluster sampling, multistage sampling, and
systematic random sampling. The basic advantage of probability sampling methods is that
the sample chosen can be expected to be representative of the population, which
supports the validity of the statistical conclusions.
Simple random sampling. Simple random sampling refers to any sampling method
that has the following properties.
There are many ways to obtain a simple random sample. One way would be the lottery
method. Each of the N population members is assigned a unique number. The numbers
are placed in a bowl and thoroughly mixed. Then, a blind-folded researcher selects n
numbers. Population members having the selected numbers are included in the sample.
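The lottery method described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `lottery_sample`, the guest list, and the sizes N = 500 and n = 20 are assumptions made for the example, not part of the unit.

```python
import random

def lottery_sample(population, n, seed=None):
    """Draw a simple random sample of size n without replacement."""
    rng = random.Random(seed)
    # Assign each of the N population members a unique number (its list index).
    tickets = list(range(len(population)))
    # Draw n tickets from the "bowl". random.sample selects without
    # replacement, so every subset of size n is equally likely.
    drawn = rng.sample(tickets, n)
    # Members holding the drawn numbers form the sample.
    return [population[i] for i in drawn]

# Hypothetical population of N = 500 guests; sample size n = 20.
guests = [f"guest_{i}" for i in range(1, 501)]
sample = lottery_sample(guests, n=20, seed=42)
```

Fixing the seed only makes the draw repeatable for checking; in an actual study the seed would normally be left unset.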
Stratified sampling. With stratified sampling, the population is divided into groups,
based on some characteristic. Then, within each group, a probability sample (often a
simple random sample) is selected. In stratified sampling, the groups are
called strata.
E.g. For a survey carried out across a state, the population may be divided by age
into groups or strata, like infants, children, minors, adolescents, teenagers, adults, etc.
Within each stratum, we might randomly select survey respondents.
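As a rough illustration of stratified sampling, the sketch below groups records into strata and then draws a simple random sample inside each stratum. The function name `stratified_sample`, the three age groups, and the equal allocation of 10 respondents per stratum are assumptions chosen for the example.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, n_per_stratum, seed=None):
    """Draw a simple random sample of up to n_per_stratum from every stratum."""
    rng = random.Random(seed)
    # Step 1: divide the population into strata based on a characteristic.
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    # Step 2: within each stratum, select a simple random sample.
    sample = {}
    for name, members in strata.items():
        k = min(n_per_stratum, len(members))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 150 people, 50 in each of three age groups.
groups = ["children", "teenagers", "adults"] * 50
people = [{"id": i, "age_group": g} for i, g in enumerate(groups)]
picked = stratified_sample(people, lambda p: p["age_group"], n_per_stratum=10, seed=1)
```

Every stratum contributes to the result, which is the feature that distinguishes stratified sampling from cluster sampling.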
Cluster sampling. With cluster sampling, every member of the population is assigned
to one, and only one, group. Each group is called a cluster. A sample of clusters is
With stratified sampling, the sample includes elements from each stratum. With cluster
sampling, in contrast, the sample includes elements only from sampled clusters.
For example, in Stage 1, we might use cluster sampling to choose clusters from a
population. Then, in Stage 2, we might use simple random sampling to select a subset
of elements from each chosen cluster for the final sample.
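The two stages in this example could be sketched as follows. The function name `two_stage_sample`, the hotels used as clusters, and all the sizes are hypothetical and serve only to make the sketch runnable.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample elements within each chosen cluster."""
    rng = random.Random(seed)
    # Stage 1: choose a random subset of the clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: simple random sample of elements from each chosen cluster.
    sample = []
    for name in chosen:
        members = clusters[name]
        k = min(n_per_cluster, len(members))
        sample.extend(rng.sample(members, k))
    return chosen, sample

# Hypothetical population: 10 hotels (clusters) with 30 guests each.
hotels = {
    f"hotel_{h}": [f"hotel_{h}_guest_{g}" for g in range(30)]
    for h in range(10)
}
chosen_hotels, final_sample = two_stage_sample(hotels, n_clusters=3, n_per_cluster=5, seed=7)
```

Only guests from the three sampled hotels can appear in the final sample; guests of the other seven hotels have no chance of selection once Stage 1 is complete.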
Systematic Random Sampling: This begins with creation of a list of each member of
the population. From the list, we randomly select the first sample element from the
first k elements on the population list. Thereafter, we select every kth element on the
list.
This method is different from simple random sampling since not every possible sample of n
elements is equally likely.
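A minimal sketch of systematic random sampling is given below, assuming a population list of 100 numbered members and an interval of k = 10; the function name `systematic_sample` and these sizes are illustrative assumptions.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k elements, then take every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # 0-based index of the first sampled element
    return population[start::k]   # every kth element from the start onwards

# Hypothetical population list of 100 numbered members, interval k = 10.
members = list(range(1, 101))
sample = systematic_sample(members, k=10, seed=3)

# Only k distinct samples can ever be drawn, one for each possible start.
possible_samples = {tuple(members[s::10]) for s in range(10)}
```

Because only k different samples are possible, most combinations of n elements can never be selected, which is why the method is not simple random sampling.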
Solution
The correct answer is (D). A simple random sample requires that every sample of
size n (in this problem, n is equal to 400) has an equal chance of being selected. In this
problem, there was a 100 percent chance that the sample would include 100 guests of
each brand of hotel. There was zero percent chance that the sample would include, for
example, 99 Taj Guests, 101 Marriott Guests, 100 ITC Guests, and 100 Best Western
Guests. Thus, not all possible samples of size 400 had an equal chance of being
selected, so this cannot be a simple random sample.
The fact that hotel guests of each of the brands were equally represented in the sample is
irrelevant to whether the sampling method was simple random sampling. Similarly, the
fact that the population consisted of guests of different hotel brands is irrelevant.
While methods differ by discipline, the importance of collecting accurate and honest data
remains in place. The goal for all data collection is to capture quality evidence that allows
analysis to lead to the formulation of convincing and credible answers to the questions
that have been posed.
Regardless of the field of study or the preferred way of defining data
(quantitative or qualitative), accurate data collection is essential to maintaining the
integrity of research.
Both the selection of appropriate data collection instruments (existing, modified, or
newly developed) and clearly delineated instructions for their correct use reduce the
likelihood of errors occurring.
A formal data collection process is necessary as it ensures that the data gathered are
both defined and accurate and that subsequent decisions based on arguments
embodied in the findings are valid.
The process provides both a baseline from which to measure and in certain cases an
indication of what to improve.
a) Primary Data: Primary data means original data that has been collected specially for
the purpose in mind, i.e. someone collected the data first-hand from the original
source. The people who gather primary data
may be an authorized organization, an investigator, an enumerator, or just someone
with a clipboard. Those who gather primary data may have knowledge of the study and
may be motivated to make the study a success. These people act as witnesses, so
primary data is only considered as reliable as the people who gathered it.
b) Secondary Data: Refers to data that is collected by someone other
than the user. Common sources of secondary data for social science include censuses,
information collected by government departments, organizational records and data that
was originally collected for other research purposes. Secondary data analysis can save
time that would otherwise be spent collecting data and, particularly in the case
of quantitative data, can provide larger and higher-quality databases that would be
unfeasible for any individual researcher to collect on their own. In addition, analysts of
social and economic change consider secondary data essential, since it is impossible to
conduct a new survey that can adequately capture past change and/or developments.
However, secondary data analysis can be less useful in marketing research, as data may
be outdated or inaccurate.
Observation can yield information which people are normally unwilling or unable to
provide. e.g. Observing numerous plates containing leftover / uneaten
portions of a particular menu item indicates that the food is not satisfactory.
Types of Observation:
Structured - for descriptive research
Unstructured - for exploratory research
Participant observation
Non-participant observation
Disguised observation
Limitations:
Feelings, beliefs and attitudes that motivate buying behavior, as well as infrequent
behavior, cannot be observed.
Expensive.
Because of these limitations, researchers often supplement observation with
survey research.
Unstructured Surveys: The interviewer probes the respondents and guides the
interview according to their answers. E.g. Debates on political issues on Television
Channels.
Direct Approach: The researcher asks direct questions about behaviors and thoughts.
e.g. Why don't you eat at McDonald's?
Indirect Approach: The researcher might ask: "What kind of people eat at
McDonald's?"
From the response, the researcher may be able to discover why the consumer avoids
McDonald's. It may suggest factors of which the consumer is not consciously aware.
Advantages:
Can be used to collect different kinds of information at same time.
Quick and low cost as compared to observation and experimental method.
Limitations:
Respondents' reluctance to answer questions asked by unknown interviewers about
things they consider private.
b. Telephone Interviewing:
Advantages:
Quick method
More flexible as interviewer can explain questions not understood by the respondent
Depending on the respondent's answers, the interviewer can skip some questions and probe more on others
Allows greater sample control
Response rate tends to be higher than mail
Limitations:
Cost per respondent higher
Some people may not want to discuss personal questions with an interviewer
Interviewer's manner of speaking may affect the respondent's answers
Different interviewers may interpret and record response in a variety of ways
Under time pressure, data may be entered without actually interviewing
g. Personal Interviewing:
It is very flexible and can also be used to collect large amounts of information. Skilled
interviewers are able to keep the respondent attentive and clarify difficult questions in
case of a doubt. They can guide interviews, explore issues, and probe as the situation
demands. Personal interviews can be used with any type of questionnaire and can be
conducted fairly quickly.
Types of Interviewing:
a. Intercept interviewing: It is an integral part of tourism research. It allows the researcher
to reach known people in a shorter time, and at the same time it reaches out to
respondents whose details are not known. The interviewer has to make an effort to
gain the attention and cooperation of respondents to obtain apt responses. The
interviews can be conducted at different locations like residences, offices, public
spaces, shopping destinations etc. The interviewer uses their own judgement to identify
respondents depending on convenience, and may also offer some compensation if the
interaction is prolonged.
Limitations:
Interviewer may be forceful in getting responses modified as per the objectives of
study.
There is possibility of an error and bias on the part of the interviewer who may not be
able to correctly judge the religion, age, race etc.
Interviewer may be uncomfortable talking to certain ethnic or age groups.
Such activity helps in identifying issues and subjects which may later be used in
conducting the study on a larger scale or in structured direct interviews. The
responses are recorded and noted so that they can be analyzed at later stages.
This method is especially suited for managers of hotels and restaurants, who have
easy access to their customers. e.g. Some hotel managers often invite a group of hotel
guests from a particular market segment to have a free breakfast with them. Managers
get the chance to meet the guests and discuss what they like about the hotel and what
the hotel could do to make their stay more enjoyable and comfortable. The guests
appreciate this recognition and the manager gets valuable information. Restaurant
managers use the same approach by holding discussion meetings over lunch or dinner.
Piloting the Questionnaire: Before being finalized, the questionnaire should be
cross-checked with peers, managers etc. Thereafter, the questionnaire must be piloted, i.e. it should
be tested to see whether it is obtaining results as per the objectives. This is done by asking
people to read it through and see if there are any ambiguities which you have not noticed.
They should also be asked to comment on the length, structure and wording of the
questionnaire. Alter the questions accordingly.
COLLECTING DATA: Data Collection becomes important once the other critical
issues like hypothesis, objectives, research problem, sampling design, location, and
population for study are addressed. This data gives the inputs from which the inferences
are drawn leading to conclusive findings. Depending upon your plans, you might
commence interviews, mail out a questionnaire, conduct experiments and/or make
observations.
Collecting data involves ethical issues in relation to the participants and the
researcher:
• Those from whom information is collected or those who are studied by a researcher
become participants of the study.
• Anyone who collects information for a specific purpose, adhering to the accepted
code of conduct, is a researcher.
Ethical issues concerning research participants:
a. Safety of respondents: During the course of collecting information the respondents
should not be subjected to unnecessary harassment, anxiety, or putting them through
c. Incentives: The data collected should not be exchanged for a price, as this
deters or de-motivates respondents from participating in a research study. Offering
incentives, gifts, etc. in exchange for information is unethical and equivalent to bribing.
e. Misuse of data: The data collected must be used only for the purpose for which it
was collected and not for any unethical use. E.g. if a banking institution shares
its users' data with an advertising company, it invades the privacy and rights of
the bank's clients.