Sampling Techniques Notes

Data Collection is the process of gathering information from various sources to address statistical inquiries, which is crucial for informed decision-making and trend analysis. It involves methods such as interviews, questionnaires, observations, and experiments, and the resulting data can be categorized as primary or secondary. Sampling techniques, both probability and non-probability, are used to select representative samples from populations for research purposes.

Uploaded by

naufathrafiya410
Copyright
© © All Rights Reserved

What is Data Collection?

Data Collection is the process of collecting information from relevant
sources to find a solution to the given statistical inquiry. Collection of
Data is the first and foremost step in a statistical investigation. It’s an
essential step because it helps us make informed decisions, spot trends,
and measure progress.

Different methods of collecting data include:

• Interviews
• Questionnaires
• Observations
• Experiments
• Published Sources and Unpublished Sources

Here, statistical inquiry means an investigation by any agency on a topic
in which the investigator collects the relevant quantitative information. In
simple terms, a statistical inquiry is a search for truth using statistical
methods of collection, compilation, analysis, interpretation, etc. The basic
problem for any statistical inquiry is the collection of facts and figures
related to the specific phenomenon being studied. Therefore, the basic
purpose of data collection is to collect evidence to reach a sound and
clear solution to a problem.

Terms Related to Data Collection


Data: Data is a tool that helps an investigator understand the problem by
providing the information required. Data can be classified into two
types, viz., Primary Data and Secondary Data.
Investigator: An investigator is a person who conducts the statistical
inquiry.
Enumerators: In order to collect information for a statistical inquiry, an
investigator needs the help of some people. These people are known as
enumerators.
Respondents: A respondent is a person from whom the statistical
information required for the inquiry is collected.
Survey: It is a method of collecting information from individuals. The
basic purpose of a survey is to collect data to describe different
characteristics such as usefulness, quality, price, kindness, etc. It involves
asking a large number of people questions about a product or service.

Types of Data Collection

Primary Data
Primary data refers to information collected directly from first-hand
sources specifically for a particular research purpose. This type of data is
gathered through various methods, including surveys, interviews,
experiments, observations, and focus groups. One of the main advantages
of primary data is that it provides current, relevant, and specific
information tailored to the researcher’s needs, offering a high level of
accuracy and control over data quality.
Methods of Collecting Primary Data
There are a number of methods of collecting primary data. Some of the
common methods are as follows:

1. Interviews: Collect data through direct, one-on-one conversations with
individuals. The investigator puts questions either to the source directly
or to people indirectly connected with it.
Direct Personal Investigation: The method of direct personal
investigation involves collecting data personally from the source of origin.
In simple words, the investigator makes direct contact with the person
from whom he/she wants to obtain information. For example, direct
contact with the household women to obtain information about their daily
routine and schedule.
Indirect Oral Investigation: In the indirect oral investigation method of
collecting primary data, the investigator does not make direct contact
with the person from whom he/she needs information, instead they collect
the data orally from some other person who has the necessary required
information. For example, collecting data of employees from their
superiors or managers.
Advantage: Allows in-depth, detailed answers; the investigator can clarify
questions and follow up on responses.
Disadvantage: Time-consuming and costly; interviewer bias may influence
the responses.
Suitable Use Case: Behavioral studies, user experience research.
2. Questionnaires: Collect data by asking people a set of questions,
either online, on paper, or face-to-face. In this method, the investigator
prepares a questionnaire while keeping in mind the motive of the study. The
investigator can collect data through the questionnaire in two ways:
Mailing Method: This method involves mailing the questionnaires to the
informants for the collection of data. The investigator attaches a letter
with the questionnaire in the mail to define the purpose of the study or
research.
Enumerator’s Method: This method involves the preparation of a
questionnaire according to the purpose of the study or research. However,
in this case, the enumerator personally visits the informants with the
prepared questionnaire.
Advantage: Can reach a large audience quickly and cost-effectively.
Disadvantage: Responses may be biased or inaccurate; low response rates.
Suitable Use Case: Customer satisfaction surveys, market research.
3. Observations: The observation method involves collecting data by
watching and recording behaviors, events, or conditions as they naturally
occur. The observer systematically watches and notes specific aspects of
a subject’s behavior or the environment, either covertly or overtly.
Advantage: Provides real-time, authentic data without reliance on self-
reported information.
Disadvantage: Observer bias can influence the results, and the presence
of an observer might alter subjects’ behavior.
Suitable Use Case: Studying user interactions with a product in a natural
setting, monitoring wildlife behavior, or assessing classroom dynamics.
4. Experiments: The experiment method involves manipulating one or
more variables to determine their effect on another variable, within a
controlled environment. Researchers create two groups (control and
experimental), apply the treatment or variable to the experimental group,
and compare the outcomes between the groups.
Advantage: Allows for the establishment of cause-and-effect
relationships with high precision.
Disadvantage: Experiments can be artificial, limiting the ability to
generalize findings to real-world settings, and they can be resource-
intensive.
Suitable Use Case: Testing the efficacy of a new drug, assessing the
impact of a new teaching method, or evaluating the effect of a marketing
campaign.
5. Focus Group: The focus group method involves gathering a small
group of people to discuss a specific topic or product, facilitated by a
moderator. A group of 6-12 participants engages in a guided discussion
led by a moderator who asks open-ended questions to elicit opinions,
attitudes, and perceptions.
Advantage: Provides in-depth insights and diverse perspectives through
interactive discussions, revealing the reasoning behind participants’
thoughts and feelings.
Disadvantage: Results can be influenced by dominant participants or
groupthink, and the findings are not easily generalizable due to the small,
non-representative sample size.
Suitable Use Case: Exploring customer attitudes towards a new product,
gathering feedback on a marketing campaign, or understanding public
opinion on social issues.
6. Information from Local Sources or Correspondents: In this method, the
investigator appoints correspondents or local persons at various places to
collect the data, which they then furnish to the investigator. With the
help of correspondents and local persons, the investigator can cover a
wide area.
Secondary Data
Secondary data refers to information that has already been collected,
processed, and published by others. This type of data can be sourced from
existing research papers, government reports, books, statistical databases,
and company records. The advantage of secondary data is that it is readily
available and often free or less expensive to obtain compared to primary
data. It saves time and resources since the data collection phase has
already been completed.
Methods of Collecting Secondary Data
Secondary data can be collected through different published and
unpublished sources. Some of them are as follows:
1. Published Sources
Government Publications: The government publishes documents containing
a wide variety of information and data, issued by the Ministries and the
Central and State Governments in India as part of their routine activity.
As the government publishes these statistics, they are fairly reliable for
the investigator. Examples of Government publications on Statistics are
the Annual Survey of Industries, Statistical Abstract of India, etc.
Semi-Government Publications: Different Semi-Government bodies
also publish data related to health, education, deaths and births. These
kinds of data are also reliable and are used by different investigators.
Some examples of semi-government bodies are Metropolitan Councils,
Municipalities, etc.
Publications of Trade Associations: Various large trade associations
collect data on different trading activities through their research and
statistical divisions and publish it. For example, data published by the
Sugar Mills Association regarding different sugar mills in India.
Journals and Papers: Different newspapers and magazines provide a
variety of statistical data in their writings, which are used by different
investigators for their studies.
International Publications: Different international organizations like
IMF, UNO, ILO, World Bank, etc., publish a variety of statistical
information which is used as secondary data.
Publications of Research Institutions: Research institutions and
universities also publish their research activities and their findings, which
are used by different investigators as secondary data. For example, the
National Council of Applied Economic Research, the Indian Statistical
Institute, etc.
2. Unpublished Sources
Unpublished sources are another source of secondary data. The data in
unpublished sources is collected by different government organizations
and other organizations. These organizations usually collect the data for
their own use, and it is not published anywhere. For example, research
work done by professors, professionals, and teachers, and records
maintained by business and private enterprises.

Sampling:
The sampling method or sampling technique is the process of studying a
population by gathering information from a subset of it (the sample) and
analyzing that data. It is the basis of data collection where the
population is too large to study in full.
There are several different sampling techniques available, and they can be
subdivided into two groups. All these methods of sampling may involve
specifically targeting hard-to-reach groups.

Types of Sampling Method


In Statistics, there are different sampling techniques available to get
relevant results from the population. The two different types of sampling
methods are:
• Probability Sampling
• Non-probability Sampling

What is Probability Sampling?


The probability sampling method utilizes some form of random selection.
In this method, every eligible individual has a chance of being selected
for the sample from the whole population. This method is more time
consuming and expensive than the non-probability sampling method. The
benefit of using probability sampling is that the sample is more likely to
be representative of the population.

Probability Sampling Types


Probability Sampling methods are further classified into different types,
such as simple random sampling, systematic sampling, stratified sampling,
and clustered sampling. Let us discuss the different types of probability
sampling methods along with illustrative examples here in detail.

• Simple Random Sampling


In the simple random sampling technique, every item in the population has
an equal chance of being selected in the sample. Since the item selection
depends entirely on chance, this method is known as the “Method of Chance
Selection”. When the sample size is large and the items are chosen
randomly, the sample tends to represent the population, so it is also
known as “Representative Sampling”.

Example:
Suppose we want to select a simple random sample of 200 students from
a school of 500 students. Here, we can assign a number from 1 to 500 to
every student in the school database and use a random number generator to
select a sample of 200 numbers.
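
The selection described in this example can be sketched in a few lines of
Python. This is only an illustration: the function name and the fixed seed
are assumptions made for the sketch, and any random number generator that
draws without replacement would serve the same purpose.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct ID numbers from 1..population_size."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    # sample() draws without replacement, so no student is picked twice
    return rng.sample(range(1, population_size + 1), sample_size)

# 500 numbered students, sample of 200 (the seed value is arbitrary)
selected = simple_random_sample(500, 200, seed=42)
print(len(selected), "students selected")
```

Every student number has the same chance of appearing in `selected`, which
is the defining property of simple random sampling.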

• Systematic Sampling
In the systematic sampling method, the items are selected from the target
population by choosing a random starting point and then selecting the
other items at a fixed sampling interval. The interval is calculated by
dividing the total population size by the desired sample size.

Example:
Suppose the names of 300 students of a school are sorted in reverse
alphabetical order. To select a sample of 20 students with the systematic
sampling method, the sampling interval is 300 / 20 = 15. We randomly
select a starting number between 1 and 15, say 5. From number 5 onwards,
we select every 15th person from the sorted list (5, 20, 35, ..., 290).
Finally, we end up with a sample of 20 students.
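
The arithmetic behind this example can be checked with a short Python
sketch. It assumes 300 numbered names, a starting position of 5, and every
15th name taken afterwards (an interval of 300 / 20 = 15). The function
name and parameters are hypothetical, chosen only for illustration.

```python
import random

def systematic_sample(items, sample_size, start=None, seed=None):
    """Pick every k-th item, where k = population size // sample size."""
    interval = len(items) // sample_size
    if interval < 1:
        raise ValueError("sample_size cannot exceed the population size")
    if start is None:
        # random 1-based starting position within the first interval
        start = random.Random(seed).randint(1, interval)
    # convert the 1-based start to a 0-based index, then step by the interval
    return items[start - 1::interval][:sample_size]

students = list(range(1, 301))           # positions 1..300 in the sorted list
sample = systematic_sample(students, 20, start=5)
print(sample[:3], "...", sample[-1])     # [5, 20, 35] ... 290
print(len(sample))                       # 20
```

Only the starting point is random; once it is fixed, the rest of the sample
is fully determined by the interval.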

• Stratified Sampling
In a stratified sampling method, the total population is divided into
smaller groups (strata) to complete the sampling process. The groups are
formed based on a few characteristics in the population. After separating
the population into smaller groups, the statisticians randomly select the
sample from each group.

For example, there are three bags (A, B and C), each with a different
number of balls. Bag A has 50 balls, bag B has 100 balls, and bag C has
200 balls. We have to choose a sample of balls from each bag
proportionally. Taking 10% from each bag gives 5 balls from bag A, 10
balls from bag B and 20 balls from bag C.
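
A minimal Python sketch of this proportional allocation follows. It assumes
the same fraction (10%) is drawn at random from every bag; the function
name, the ball labels, and the seed are made up for the illustration.

```python
import random

def proportional_stratified_sample(strata, fraction, seed=None):
    """Randomly draw the same fraction of members from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)    # stratum sample size
        sample[name] = rng.sample(members, k)  # random draw within the stratum
    return sample

# Three bags of 50, 100 and 200 labelled balls
bags = {
    "A": [f"A{i}" for i in range(50)],
    "B": [f"B{i}" for i in range(100)],
    "C": [f"C{i}" for i in range(200)],
}
picked = proportional_stratified_sample(bags, 0.10, seed=1)
print({name: len(balls) for name, balls in picked.items()})
# {'A': 5, 'B': 10, 'C': 20}
```

Which balls are drawn depends on the seed, but the counts per bag always
follow the 5 : 10 : 20 proportion.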

• Clustered Sampling
In the clustered sampling method, clusters or groups of people are formed
from the population set. The clusters have similar characteristics to one
another, and each cluster has an equal chance of being a part of the
sample. This method applies simple random sampling to the clusters rather
than to individual members of the population.

Example:
An educational institution has ten branches across the country with
almost the same number of students. If we want to collect some data
regarding facilities and other things, we can’t travel to every unit to
collect the required data. Hence, we can use random sampling to select
three or four branches as clusters.
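
The branch example might be sketched in Python as below. The branch names,
the figure of 100 students per branch, and the choice of three clusters
are assumptions made for the illustration. The sketch also assumes that
everyone in a chosen branch is surveyed, which is one common variant of
cluster sampling.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters; every member of a chosen cluster is kept."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random draw of cluster names
    return {name: clusters[name] for name in chosen}

# Ten hypothetical branches with 100 students each
branches = {
    f"branch_{i}": [f"b{i}_student_{j}" for j in range(100)]
    for i in range(1, 11)
}
surveyed = cluster_sample(branches, 3, seed=7)
print(sorted(surveyed))                            # the three selected branches
print(sum(len(s) for s in surveyed.values()))      # 300 students in total
```

The random draw is made over branches, not over individual students, which
is what distinguishes cluster sampling from simple random sampling.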

All these four methods can be understood in a better manner with the help
of the figure given below. The figure contains various examples of how
samples will be taken from the population using different techniques.
Uses of probability sampling
Probability sampling methods find widespread use across diverse
research disciplines because of their ability to yield representative and
unbiased samples. The advantages of employing probability sampling
include the following:

Representativeness
Probability sampling gives every element in the population a known,
non-zero chance of being included in the sample, which supports
representativeness of the entire population and reduces selection-related
bias. The researcher can acquire higher-quality data via probability
sampling, increasing confidence in the conclusions.

Statistical inference
Statistical methods, like confidence intervals and hypothesis testing,
depend on probability sampling to generalize findings from a sample to
the broader population. Probability sampling methods ensure unbiased
representation, allowing inferences about the population based on the
characteristics of the sample.
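
As a rough illustration of such an inference, the sketch below computes an
approximate 95% confidence interval for a population mean from a simple
random sample. It assumes a normal approximation (z = 1.96) and a
population much larger than the sample. The height data and the function
name are invented for the example.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    # standard error = sample standard deviation / sqrt(n)
    se = statistics.stdev(sample) / math.sqrt(n)
    return mean - z * se, mean + z * se

rng = random.Random(0)
population = [rng.gauss(160, 10) for _ in range(10_000)]  # made-up heights in cm
sample = rng.sample(population, 200)                      # simple random sample
low, high = mean_confidence_interval(sample)
print(f"sample mean lies in the interval {low:.1f} to {high:.1f}")
```

The interval is only meaningful because the sample was drawn at random; the
same formula applied to a convenience sample would not carry this
interpretation.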

Precision and reliability


The use of probability sampling improves the precision and reliability of
study results. Because the probability of selecting any single
element/individual is known, the sampling error can be estimated, which
is not possible with non-probability sampling methods, resulting in more
dependable and precise estimations.

Generalizability
Probability sampling enables the researcher to generalize study findings
to the entire population from which they were derived. The results
produced through probability sampling methods are more likely to be
applicable to the larger population, laying the foundation for making
broad predictions or recommendations.

Minimization of Selection Bias


By ensuring that each member of the population has a known, non-zero
chance of being selected in the sample, probability sampling lowers the
possibility of selection bias. This reduces the impact of systematic
errors that may occur in non-probability sampling methods, where data may
be skewed toward a specific demographic due to inadequate representation
of each segment of the population.

Challenges with Probability Sampling:

o Resource Intensive: This method requires a complete list of the
population, substantial time, and can be costly.
o Non-response bias: Even with random selection, the possibility of
non-response can introduce bias into the results.
o Complexity: Implementing probability sampling correctly,
particularly stratified or cluster sampling, can be complex and
requires expertise.

What is Non-Probability Sampling?


The non-probability sampling method is a technique in which the
researcher selects the sample based on subjective judgment rather than
random selection. In this method, not all the members of the population
have a chance to participate in the study.
Non-Probability Sampling Types
Non-probability Sampling methods are further classified into different
types, such as convenience sampling, consecutive sampling, quota
sampling, judgmental sampling, and snowball sampling. Here, let us discuss
all these types of non-probability sampling in detail.

• Convenience Sampling
In a convenience sampling method, the samples are selected from the
population directly because they are conveniently available to the
researcher. The samples are easy to select, and the researcher does not
attempt to choose a sample that represents the entire population.
Example:
In researching customer support services in a particular region, we ask
a few of our customers to complete a survey on the products after the
purchase. This is a convenient way to collect data. However, as we only
surveyed customers buying the same product, the sample is not
representative of all the customers in that area.

• Consecutive Sampling
Consecutive sampling is similar to convenience sampling with a slight
variation. The researcher picks a single person or a group of people for
sampling, studies them for a period of time, analyzes the result, and
then moves on to another person or group if needed.

• Quota Sampling
In the quota sampling method, the researcher forms a sample of
individuals chosen to represent the population based on specific traits or
qualities. The researcher chooses sample subsets that yield a useful
collection of data intended to reflect the entire population.
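
One common way of running quota sampling, assumed here for illustration,
is to accept respondents in the order they become available until a preset
quota for each trait is filled. The respondent records, the age groups,
and the quota sizes in this Python sketch are all hypothetical.

```python
def quota_sample(respondents, quotas, trait):
    """Accept respondents in arrival order until each group's quota is filled."""
    remaining = dict(quotas)          # places still open in each group
    accepted = []
    for person in respondents:
        group = person[trait]
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break                     # every quota is full
    return accepted

arrivals = [
    {"name": "P1", "age_group": "18-30"},
    {"name": "P2", "age_group": "18-30"},
    {"name": "P3", "age_group": "31-50"},
    {"name": "P4", "age_group": "18-30"},
    {"name": "P5", "age_group": "51+"},
    {"name": "P6", "age_group": "31-50"},
    {"name": "P7", "age_group": "51+"},
    {"name": "P8", "age_group": "31-50"},
]
quotas = {"18-30": 2, "31-50": 2, "51+": 1}
chosen = quota_sample(arrivals, quotas, "age_group")
print([p["name"] for p in chosen])    # ['P1', 'P2', 'P3', 'P5', 'P6']
```

No random selection takes place: who ends up in the sample depends on who
arrives first, which is why quota sampling is a non-probability method.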

• Purposive or Judgmental Sampling


In purposive sampling, the samples are selected based only on the
researcher’s knowledge. As this knowledge is instrumental in creating
the samples, the answers obtained can be highly accurate when the
researcher’s judgment is sound. It is also known as judgmental sampling
or authoritative sampling.

• Snowball Sampling
Snowball sampling is also known as a chain-referral sampling technique.
It is used when the members of the target population have traits that
make them difficult to find. So, each identified member of the population
is asked to find other sampling units, who also belong to the same
targeted population.
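
The chain-referral idea can be sketched in Python as a walk over a referral
list. This is an illustration only: the names, the referral data, and the
size limit are invented, and real studies record referrals as interviews
proceed rather than knowing them in advance.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Follow referrals outward from the seed participants, wave by wave."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        # each participant names further members of the target population
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled

referrals = {
    "Asha": ["Bilal", "Chen"],
    "Bilal": ["Dana"],
    "Chen": ["Dana", "Elif"],
    "Dana": [],
    "Elif": ["Asha"],
}
wave = snowball_sample(referrals, ["Asha"], max_size=4)
print(wave)   # ['Asha', 'Bilal', 'Chen', 'Dana']
```

The sample grows only through existing contacts, so people who are not
connected to the seed participants can never be reached.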

Uses of non-probability sampling


Non-probability sampling approaches are employed in qualitative or
exploratory research where the goal is to investigate underlying
population traits rather than generalizability. Non-probability sampling
methods are also helpful for the following purposes:

Generating a hypothesis
In the initial stages of exploratory research, non-probability methods such
as purposive or convenience sampling allow researchers to quickly gather
information and generate hypotheses that help build a future research
plan.

Qualitative research
Qualitative research is usually focused on understanding the depth and
complexity of human experiences, behaviors, and perspectives. Non-
probability methods like purposive or snowball sampling are commonly
used to select participants with specific traits that are relevant to the
research question.

Convenience and pragmatism


Non-probability sampling methods are valuable when resources and time
are limited or when preliminary data is required for a pilot study. For
example, a researcher may conduct a survey at a local shopping mall to
gather opinions on a consumer product because of the ease of access to
potential participants.

Challenges with Non-Probability Sampling:

o Potential for Bias: Since selection is not random, this method can
introduce bias and limit the generalizability of the findings.
o Quality of Data: The quality of data may be lower due to the
subjective nature of the selection process.
o Cannot Measure Sampling Error: It is impossible to measure the
sampling error or make statistical inferences about the population.

Difference Between Probability and Non-Probability Sampling

A more detailed comparison of probability and non-probability sampling
is given below:

1. Selection method: In probability sampling, individuals are
selected randomly, whereas in non-probability sampling, selection
is based on accessibility or the researcher's judgment.
2. Bias: Probability sampling is less prone to bias due to its random
nature, while non-probability sampling can be more susceptible.
3. Generalization: Results from probability sampling can be
generalized to the larger population, while non-probability
sampling may not permit such extrapolation.
4. Accuracy Estimation: Probability sampling allows for the
estimation of sample accuracy, unlike non-probability sampling.
5. Complexity: Probability sampling is more complex to implement
due to its need for a complete list of the population, whereas
non-probability sampling is less complex.
6. Time: Probability sampling can be more time-consuming, while
non-probability sampling is usually quicker to execute.
7. Cost: Due to its complexity and time requirements, probability
sampling can be costlier than non-probability sampling.
8. Sample Representation: Probability samples are more likely to
accurately represent the population compared to non-probability
samples.
9. Data Quality: Generally, the data quality of probability sampling
is higher, thanks to its reduced bias and better representation.
10. Sample Size Requirement: Probability sampling often requires a
larger sample size for statistical significance, while non-probability
sampling can work with smaller groups.
