Module 2 - Sampling

Sampling is a method used to select a subset of a population for statistical analysis, allowing researchers to gather insights without surveying the entire population. There are two main types of sampling: probability sampling, where every member has a known, non-zero chance of selection, and non-probability sampling, where selection is based on subjective judgment. Each type has various techniques and applications, with probability sampling generally being more representative and non-probability sampling being useful in exploratory research or when resources are limited.

Uploaded by

harishbl2323

SAMPLING

Types of Sampling: Sampling Methods with Examples

What is sampling?

Sampling is a technique of selecting individual members or a subset of the
population to make statistical inferences from them and estimate
characteristics of the whole population. Different sampling methods are widely
used by researchers in market research so that they do not need to research
the entire population to collect actionable insights.
It is also a time-efficient and cost-effective method and hence forms the
basis of any research design. Sampling techniques can also be applied within
research survey software to draw the sample.

For example, if a drug manufacturer would like to research the adverse side
effects of a drug on the country’s population, it is almost impossible to conduct
a research study that involves everyone. In this case, the researcher selects
a sample of people from each demographic and then studies them, which gives
him/her indicative feedback on the drug’s behavior.

Types of sampling: sampling methods

Sampling in market research is of two types – probability sampling and
non-probability sampling. Let’s take a closer look at these two methods of
sampling.

1. Probability sampling:

Probability sampling is a sampling technique where a researcher sets a
few selection criteria and then chooses members of a population
randomly. Every member has a known, non-zero chance of being a part of the
sample with this selection parameter.

The most critical requirement of probability sampling is that everyone in your
population has a known chance of getting selected (in simple random sampling,
this chance is also equal for everyone). For example, let us say that your
organization has passed an amendment to its employee code and you want
feedback on it. If you have a population of 100 people, every person would have
odds of 1 in 100 of getting selected on any single random draw. Probability
sampling gives you the best chance to create a sample that is truly
representative of the population.

From the responses received, management will now be able to know whether
employees in that organization are happy or not about the amendment.

2. Non-probability sampling:

In non-probability sampling, the researcher does not choose members for
research at random. This sampling method is not a fixed or predefined
selection process, which makes it difficult for all elements of a population
to have equal opportunities to be included in a sample.

Non-probability sampling is defined as a sampling technique in which the
researcher selects samples based on the subjective judgment of the
researcher rather than random selection. It is a less stringent method. This
sampling method depends heavily on the expertise of the researchers. It is
carried out by observation, and researchers use it widely for qualitative
research.

Non-probability sampling is a sampling method in which not all members of
the population have an equal chance of participating in the study. Unlike
probability sampling, where each member of the population has a known chance of
being selected, the chance of selection here is unknown. Non-probability
sampling is most useful for exploratory studies like a pilot survey (deploying a
survey to a smaller sample compared to the pre-determined sample size).
Researchers use this method in studies where it is not feasible to draw a
random probability sample due to time or cost considerations.

Let us discuss the various probability and non-probability sampling methods
that you can implement in any market research study.

Types of probability sampling with examples:

Probability sampling is a sampling technique in which researchers choose
samples from a larger population using a method based on the theory of
probability. This sampling method considers every member of the population
and forms samples based on a fixed process.

For example, if one member is drawn at random from a population of 1000
members, every member will have a 1/1000 chance of being selected. Probability
sampling reduces selection bias and gives all members a fair
chance to be included in the sample.

There are four types of probability sampling techniques:


● Simple random sampling:

Simple random sampling is one of the best probability sampling techniques
for saving time and resources. It is a reliable method of obtaining
information in which the members of the sample are chosen randomly, purely
by chance. Each individual has the same probability of being chosen to be
a part of a sample.
For example, in an organization of 500 employees, if the HR team
decides on conducting team building activities, it could select the
participants by picking chits out of a bowl. In this case, each of the 500
employees has an equal opportunity of being selected.
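
The chit-drawing example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the roster, the sample size of 25, and the seed are all assumptions made for the example.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct members, each equally likely to be picked."""
    rng = random.Random(seed)  # seeded generator so the draw can be repeated
    return rng.sample(population, sample_size)

# Hypothetical roster standing in for the 500 employees in the example.
employees = [f"employee_{i}" for i in range(1, 501)]

# Draw 25 names "out of the bowl" without replacement.
chosen = simple_random_sample(employees, 25, seed=42)
print(len(chosen), len(set(chosen)))  # 25 25
```

Because the draw is without replacement, no employee can be picked twice, which matches drawing chits from a bowl.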

● Cluster sampling:

Cluster sampling is a method where the researchers divide the entire
population into sections or clusters, each of which ideally represents the
population on a small scale. Clusters are usually identified from naturally
occurring groupings such as location, and a random selection of whole
clusters is then included in the sample. This makes it very simple for a
survey creator to derive effective inference from the feedback.
For example, if the United States government wishes to evaluate the
number of immigrants living in the US, they can divide it into
clusters based on states such as California, Texas, Florida,
Massachusetts, Colorado, Hawaii, etc., and survey a random selection of
those states. This way of conducting a survey will be more effective as
the results will be organized into states and provide insightful
immigration data.
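
A minimal sketch of one-stage cluster sampling follows, assuming whole clusters are drawn at random and every member of a drawn cluster is surveyed. The state-to-resident mapping is made-up illustration data.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters whole clusters at random and keep all their members."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: clusters[name] for name in picked}

# Hypothetical clusters: state name -> resident identifiers.
states = {
    "California": ["CA-1", "CA-2", "CA-3"],
    "Texas": ["TX-1", "TX-2"],
    "Florida": ["FL-1", "FL-2", "FL-3", "FL-4"],
    "Colorado": ["CO-1"],
}

sampled = cluster_sample(states, 2, seed=1)
for state, residents in sampled.items():
    print(state, residents)
```

The randomness applies to the choice of clusters, not to individuals, which is what distinguishes this from stratified sampling.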
● Systematic sampling:

Researchers use the systematic sampling method to choose the sample
members of a population at regular intervals. It requires the selection of
a starting point and a fixed sampling interval, which is then repeated
through the population. Because the interval is predefined, this sampling
technique is among the least time-consuming.
For example, a researcher intends to collect a systematic sample of 500
people in a population of 5000. He/she numbers each element of the
population from 1-5000, picks a starting point among the first 10, and
chooses every 10th individual from there to be a part of the sample
(Total population / Sample size = 5000/500 = 10).
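
The interval arithmetic can be sketched as below. The random starting point within the first interval is an assumption added here (a common convention), and the numbered population is illustrative.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Take every k-th element, where k = population size // sample size."""
    interval = len(population) // sample_size  # 5000 // 500 = 10
    rng = random.Random(seed)
    start = rng.randrange(interval)            # random start in the first interval
    return population[start::interval][:sample_size]

# Population numbered 1 to 5000, as in the example.
people = list(range(1, 5001))
sample = systematic_sample(people, 500, seed=7)
print(len(sample), sample[1] - sample[0])  # 500 10
```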

● Stratified random sampling:

Stratified random sampling is a method in which the researcher divides
the population into smaller groups that don’t overlap but together
represent the entire population. While sampling, the researcher organizes
these groups and then draws a random sample from each group separately.
For example, a researcher looking to analyze the characteristics of
people belonging to different annual income divisions will create strata
(groups) according to the annual family income, e.g., less than $20,000,
$20,000 to $29,999, $30,000 to $39,999, $40,000 to $49,999, etc. By
doing this, the researcher can draw conclusions about the characteristics
of people belonging to different income groups. Marketers can analyze which
income groups to target and which ones to eliminate to create a
roadmap that would bear fruitful results.
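
A sketch of stratified random sampling on the income example is shown below. The incomes, the band boundaries, and the 10% sampling fraction are assumptions chosen for illustration; the same fraction is drawn from every stratum (proportional allocation).

```python
import random
from collections import defaultdict

def income_band(income):
    """Assign an income to one non-overlapping stratum (illustrative bands)."""
    if income < 20_000:
        return "under 20k"
    if income < 30_000:
        return "20k-30k"
    if income < 40_000:
        return "30k-40k"
    if income < 50_000:
        return "40k-50k"
    return "50k and over"

def stratified_sample(records, stratum_of, fraction, seed=None):
    """Group records into strata, then draw a random sample from each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    return {
        name: rng.sample(members, max(1, round(len(members) * fraction)))
        for name, members in strata.items()
    }

# Hypothetical annual family incomes: 15,000 to 54,900 in steps of 100.
incomes = [15_000 + 100 * i for i in range(400)]
sample = stratified_sample(incomes, income_band, fraction=0.1, seed=3)
print({band: len(rows) for band, rows in sample.items()})
```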

Uses of probability sampling

There are multiple uses of probability sampling:

● Reduce Sample Bias:

Using the probability sampling method, the selection bias in the sample
derived from a population is kept to a minimum, because the selection of the
sample does not depend on the understanding and the inference of the
researcher. Probability sampling leads to higher quality data collection
as the sample appropriately represents the population.

● Diverse Population:

When the population is vast and diverse, it is essential to have
adequate representation so that the data is not skewed towards one
demographic. For example, if Square would like to understand the
people who could use their point-of-sale devices, a survey conducted
on a sample of people across the US from different industries and
socio-economic backgrounds helps.
● Create an Accurate Sample:

Probability sampling helps the researchers plan and create an accurate
sample. This helps to obtain well-defined data.

Types of non-probability sampling with examples

The non-probability method is a sampling method that involves a collection of
feedback based on a researcher or statistician’s sample selection capabilities
and not on a fixed selection process. In most situations, a survey
conducted with a non-probability sample can lead to skewed results, which may
not represent the desired target population. But there are situations, such as
the preliminary stages of research or cost constraints for conducting research,
where non-probability sampling will be much more useful than the other type.

Four types of non-probability sampling explain the purpose of this sampling
method in a better manner:

● Convenience sampling:

This method is dependent on the ease of access to subjects, such as
surveying customers at a mall or passers-by on a busy street. It is
termed convenience sampling because of the researcher’s ease of
carrying it out and getting in touch with the subjects.
Researchers have little control over the selection of the sample elements;
selection is based purely on proximity and not on representativeness.
This non-probability sampling method is used when there are time and
cost limitations in collecting feedback, for instance when resources are
limited in the initial stages of research.
For example, startups and NGOs usually conduct convenience
sampling at a mall to distribute leaflets of upcoming events or promote
a cause. They do that by standing at the mall entrance and giving out
pamphlets to whoever passes by.

● Judgmental or purposive sampling:

Judgmental or purposive samples are formed at the discretion of the
researcher. Researchers purely consider the purpose of the study, along
with their understanding of the target audience. For instance, suppose
researchers want to understand the thought process of people
interested in studying for their master’s degree. The selection criterion
will be: “Are you interested in doing your masters in …?” and those who
respond with a “No” are excluded from the sample.
● Snowball sampling:

Snowball sampling is a sampling method that researchers apply when
the subjects are difficult to trace. For example, it will be extremely
challenging to survey shelterless people or illegal immigrants. In such
cases, using the snowball theory, researchers can locate a few initial
subjects to interview and ask them to refer others, so that the sample
grows from one contact to the next. Researchers also implement
this sampling method in situations where the topic is highly sensitive
and not openly discussed, for example, surveys to gather information
about HIV/AIDS. Not many affected people will readily respond to the
questions. Still, researchers can contact people they might know or
volunteers associated with the cause to get in touch with them and collect
information.

● Quota sampling:

Quota sampling is defined as a non-probability sampling method in which
researchers create a sample involving individuals that represent a population.
Researchers choose these individuals according to specific traits or qualities.
They decide and create quotas so that the market research samples can be
useful in collecting data. The findings are then generalized to the entire
population, with the caution that selection within each quota is not random.
The final subset will be decided only according to the interviewer’s
or researcher’s knowledge of the population.
For example, a researcher at a cigarette company wants to find out what age
group prefers what brand of cigarettes in a particular city. He/she applies
quotas on the age groups of 21-30, 31-40, 41-50, and 51+. From this
information, the researcher gauges the smoking trend among the population of
the city.

Types of quota sampling:


Quota sampling can be of two kinds – controlled quota sampling and
uncontrolled quota sampling. Here’s what they mean:

Controlled quota sampling:


Controlled quota sampling imposes restrictions on the researcher’s choice of
samples. Here, the researcher is limited in which sample members he/she can
select.

Uncontrolled quota sampling:


Uncontrolled quota sampling does not impose any restrictions on the
researcher’s choice of samples. Here, the researcher chooses sample
members at will.

Quota sampling example:


Let’s look at a basic example of quota sampling:

A researcher wants to survey individuals about what smartphone brand they
prefer to use. He/she considers a sample size of 500 respondents. Also,
he/she is only interested in surveying ten states in the US. Here’s how the
researcher can divide the population by quotas:

● Gender: 250 males and 250 females


● Age: 100 respondents each between the ages of 16-20, 21-30, 31-40,
41-50, and 51+
● Employment status: 350 employed and 150 unemployed people.

○ (Researchers can apply further nested quotas. For example, out of the
150 unemployed people, 100 must be students.)

● Location: 50 responses per state

Depending on the type of research, the researcher can apply quotas based on
the sampling frame. It is not necessary for the researcher to divide the quotas
equally. He/she divides the quotas as per his/her need (as shown in the
example where the researcher interviews 350 employed and only 150
unemployed individuals). Random sampling can be conducted to reach out to
the respondents.
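
The bookkeeping behind a quota like "350 employed and 150 unemployed" can be sketched as follows. This is an illustrative sketch only: the respondent stream is fabricated for the example, and respondents are accepted in arrival order, which is one simple (non-random) way of filling quotas.

```python
def fill_quotas(respondents, quotas, key):
    """Accept respondents in arrival order until every quota cell is full."""
    remaining = dict(quotas)  # open slots left per quota cell
    accepted = []
    for person in respondents:
        cell = key(person)
        if remaining.get(cell, 0) > 0:  # skip people whose quota is already full
            accepted.append(person)
            remaining[cell] -= 1
        if not any(remaining.values()):  # stop once all quotas are met
            break
    return accepted, remaining

# Hypothetical stream of 1,000 respondents; every fourth one is unemployed.
stream = [
    {"id": i, "status": "unemployed" if i % 4 == 0 else "employed"}
    for i in range(1000)
]
quotas = {"employed": 350, "unemployed": 150}

accepted, remaining = fill_quotas(stream, quotas, key=lambda p: p["status"])
print(len(accepted), remaining)  # 500 {'employed': 0, 'unemployed': 0}
```

Nested quotas (for instance, 100 students among the 150 unemployed) could be handled the same way by making the key a tuple of traits.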

How to perform quota sampling:


Probability sampling techniques involve a significant number of rules that the
researcher needs to follow to form samples. Since quota sampling is a
non-probability sampling technique, there are no formal rules for creating
samples. Usually, though, there are four steps to form a quota sample. Here
are the steps:

1. Divide the sample population into subgroups: As with stratified
sampling, the researcher divides the entire population into mutually
exclusive subgroups, i.e., each element becomes a part of only one of
those subgroups. In stratified sampling the researcher then applies
random selection within each subgroup; in quota sampling the selection
within subgroups is non-random.
2. Figure out the weightage of subgroups: The researcher evaluates
the proportion in which the subgroups exist in the population. He/she
maintains this proportion in the sample selected using this type of
sampling method.
For example, if 58% of the people who are interested in purchasing
your Bluetooth headphones are in the age group of 25-35
years, your sample should also have the same percentage of
people belonging to that age group.
3. Select an appropriate sample size: In the third step, the researcher
should select the sample size while maintaining the proportion
evaluated in the previous step. If the population size is 500, the
researcher can pick a sample of 50 elements.
The sample chosen after following the first three steps should
represent the target population.
4. Conduct surveys according to the quotas defined: Make sure to
stick to the predefined quotas to achieve actionable results.
Don’t survey quotas that are full and focus on completing surveys for
each quota.
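
Steps 2 and 3 amount to turning population shares into whole-number quotas. The sketch below is one possible way to do it, using integer percentages and a largest-remainder rule for any leftover seats. Only the 58% share for ages 25-35 and the sample size of 50 come from the text; the other shares and group labels are assumed for illustration.

```python
def proportional_quotas(shares_pct, sample_size):
    """Convert percentage shares (summing to 100) into integer quotas."""
    # Whole seats per group, using integer arithmetic to avoid rounding drift.
    quotas = {g: pct * sample_size // 100 for g, pct in shares_pct.items()}
    remainders = {g: pct * sample_size % 100 for g, pct in shares_pct.items()}
    leftover = sample_size - sum(quotas.values())
    # Hand any remaining seats to the groups with the largest remainders.
    for g in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[g] += 1
    return quotas

# 58% is from the headphone example; 22% and 20% are hypothetical.
shares_pct = {"under 25": 22, "25-35": 58, "over 35": 20}
quotas = proportional_quotas(shares_pct, 50)
print(quotas)  # {'under 25': 11, '25-35': 29, 'over 35': 10}
```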

Applications of quota sampling:


Below are the instances where quota sampling is applied and used.

● In situations where researchers have specific criteria for conducting
research, it allows the selection of subgroups, due to which it
becomes extremely convenient for researchers to obtain desired
results. A trait or characteristic can be the filter for subgroup
formation.
● The researcher uses this method when he/she has time constraints.
Applying quotas gives the researcher an idea of the whole population
of interest in very little time.
● Quotas are applied when the researcher is on a tight budget. Instead
of researching a large population, the researcher saves money by
using a few quotas to get the whole picture of the population.
● Some research studies do not require pinpoint accuracy due to the
nature of the research project. Quota sampling is well suited to
these studies.

Uses of non-probability sampling

Non-probability sampling is used for the following:

● Create a hypothesis:

Researchers use the non-probability sampling method to create an
assumption when little or no prior information is available. This
method helps with the immediate return of data and builds a base for
further research.
● Exploratory research:

Researchers use this sampling technique widely when conducting
qualitative research, pilot studies, or exploratory research. Exploratory
research is defined as research used to investigate a problem which is not
clearly defined. It is conducted to have a better understanding of the existing
problem, but will not provide conclusive results. ... Such research is usually
carried out when the problem is at a preliminary stage.

● Budget and time constraints:

The non-probability method is used when there are budget and time
constraints, and some preliminary data must be collected. Since the survey
design is not rigid, it is easier to pick readily available respondents and
have them take the survey or questionnaire.

How do you decide on the type of sampling to use?

For any research, it is essential to choose a sampling method accurately to
meet the goals of your study. The effectiveness of your sampling relies on
various factors. Here are some steps expert researchers follow to decide the
best sampling method.
● Jot down the research goals. Generally, it must be a combination of
cost, precision, or accuracy.
● Identify the effective sampling techniques that might potentially
achieve the research goals.
● Test each of these methods and examine whether they help in
achieving your goal.
● Select the method that works best for the research.

Difference between probability sampling and non-probability sampling methods

We have looked at the different types of sampling methods above and their
subtypes. To encapsulate the whole discussion, though, the significant
differences between probability sampling methods and non-probability
sampling methods are as below:

Definition
Probability sampling methods: Probability sampling is a sampling technique in
which samples from a larger population are chosen using a method based on the
theory of probability.
Non-probability sampling methods: Non-probability sampling is a sampling
technique in which the researcher selects samples based on the researcher’s
subjective judgment rather than random selection.

Alternatively known as
Probability sampling methods: Random sampling method.
Non-probability sampling methods: Non-random sampling method.

Population selection
Probability sampling methods: The sample is selected from the population
randomly.
Non-probability sampling methods: The sample is selected from the population
arbitrarily.

Nature
Probability sampling methods: The research is conclusive.
Non-probability sampling methods: The research is exploratory.

Sample
Probability sampling methods: Since there is a method for deciding the sample,
the population demographics are conclusively represented.
Non-probability sampling methods: Since the sampling method is arbitrary, the
population demographics representation is almost always skewed.

Time taken
Probability sampling methods: Takes longer to conduct since the research design
defines the selection parameters before the market research study begins.
Non-probability sampling methods: This type of sampling method is quick since
neither the sample nor the selection criteria of the sample are defined in
advance.

Results
Probability sampling methods: This type of sampling is designed to minimize
selection bias, and hence the results are more reliable and conclusive.
Non-probability sampling methods: This type of sampling is prone to selection
bias, and hence the results tend to be biased too, rendering the research
speculative.

Hypothesis
Probability sampling methods: In probability sampling, there is an underlying
hypothesis before the study begins and the objective of this method is to test
the hypothesis.
Non-probability sampling methods: In non-probability sampling, the hypothesis
is derived after conducting the research study.

Sampling Errors

What are Sampling Errors?


Sampling errors are statistical errors that arise when a sample does not
represent the whole population. They are the difference between the real
values of the population and the values derived by using samples from the
population.

Sampling errors occur when numerical parameters of an entire population are
derived from a sample of the entire population. Since the whole population is
not included in the sample, the parameters derived from the sample differ
from those of the actual population.

They may create distortions in the results, leading users to draw incorrect
conclusions. When analysts do not select samples that represent the entire
population, the sampling errors are significant.

Sampling Errors Explained


Sampling errors are deviations of the sampled values from the true
population values, arising from the fact that a sample is never a perfect
representation of the population it is drawn from.

Sampling error does not by itself mean that there is a fault in the data
collection: even a randomly selected sample will differ from the population
by chance. When the selection is based on bias, however, the sample fails to
denote the whole population, the sampling errors become large, and the
results obtained from sampling can become invalid.

They can be reduced if the analysts select subsets or samples of data that
represent the whole population effectively. Sampling errors are affected by
factors such as the size and design of the sample, population variability, and
sampling fraction.

Increasing the size of samples reduces sampling errors, although it cannot
eliminate them short of surveying the whole population. However, to
reduce them by half, the sample size needs to be increased by four times,
because the standard error shrinks in proportion to the square root of the
sample size. If the selected samples are small and do not adequately
represent the whole data, the analysts can select a greater number of
samples for satisfactory representation.

The population variability causes variations in the estimates derived from
different samples, leading to larger errors. The effect of population variability
can be reduced by increasing the size of the samples so that these can more
effectively represent the population.
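
The "four times the sample size halves the error" rule follows from the usual formula for the standard error of a sample mean, sd / sqrt(n). The short sketch below assumes simple random sampling from a large population; the standard deviation of 12 is a made-up value.

```python
import math

def standard_error(population_sd, sample_size):
    """Standard error of the sample mean under simple random sampling."""
    return population_sd / math.sqrt(sample_size)

sd = 12.0  # hypothetical population standard deviation

se_100 = standard_error(sd, 100)  # n = 100
se_400 = standard_error(sd, 400)  # n = 400, four times larger
print(se_100, se_400)  # 1.2 0.6
```

A more variable population (a larger sd) raises the standard error at every sample size, which is why population variability appears among the factors listed above.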

Moreover, sampling errors must be considered when publishing survey results
so that the accuracy of the estimates and the related interpretations can be
established.

Example of Sampling Errors


Suppose the producers of Company XYZ want to determine the viewership of
a local program that airs twice a week. The producers will need to determine
the samples that can represent various types of viewers. They may need to
consider factors like age, level of education, and gender.
For example, people between the ages of 14 and 18 usually have fewer
commitments, and most of them can spare time to watch the program twice
weekly. On the contrary, people between the ages of 19 and 35 usually have
tighter schedules and may not have time to watch TV.

Hence, it is important to draw a sample proportionately. Otherwise, the results
will not represent the real population.

Since the exact population parameter is not known, sampling errors for
samples are generally unknown. However, analysts can use analytical
methods to measure the amount of variation caused by sampling errors.

Categories of Sampling Errors


● Population Specification Error – Happens when the analysts do not
understand who to survey. For example, for a survey of breakfast
cereals, the population can be the mother, children, or the entire family.
● Selection Error – Occurs when the respondents’ survey participation is
self-selected, implying only those who are interested respond. Selection
errors can be reduced by encouraging participation.
● Sample Frame Error – Occurs when a sample is selected from the
wrong population data.
● Non-Response Error – Occurs when a useful response is not obtained
from the surveys. It may happen due to the inability to contact potential
respondents or their refusal to respond.

Non-Sampling Error

What is Non-Sampling Error?


Non-sampling error refers to an error that arises during data collection
and processing and causes the data to differ from the true values. It is
different from sampling error, which is any difference between the sample
values and the population values that results from observing only a sample
of limited size.

Non-sampling errors can come in various forms, including non-response error,
measurement error, interviewer error, adjustment error, and processing error.

Mechanics of Non-Sampling Error


Non-sampling error can arise when either a sample or an entire population
(census) is taken. It falls under two categories:

1. Random errors
Random errors are errors that cannot be accounted for and just happen. In
statistical studies, it is believed that each random error offsets each other,
generally speaking, so they are of little to no concern.

2. Systematic errors
Systematic errors affect the sample of the study and, as a result, will often
create useless data. A systematic error is consistent and repeatable, so the
study’s creators must take great care to mitigate such an error.

Non-sampling errors can occur from several aspects of a study. The most
common non-sampling errors include errors in data entry, biased questions
and decision-making, non-responses, false information, and inappropriate
analysis.
Types of Non-Sampling Errors
There are several types of non-sampling errors, including:

1. Non-response error
A non-response error is caused by the differences between the people who
choose to participate compared to the people who do not participate in a
given survey. In other words, it exists when people are given the option to
participate but choose not to; therefore, their survey results are not
incorporated into the data.

2. Measurement error
A measurement error refers to all errors relating to the measurement of each
sampling unit, as opposed to errors relating to how they were selected. The
error often arises when there are confusing questions, low-quality data due to
sampling fatigue (i.e., someone is tired of taking a survey), and low-quality
measurement tools.

3. Interviewer error
Interviewer error occurs when the interviewer (or administrator) makes an
error when recording a response. In qualitative research, an interviewer may
lead a respondent to answer a certain way. In quantitative research, an
interviewer may ask the question differently, which leads to a different result.

4. Adjustment error
An adjustment error describes a situation where the analysis of the data
adjusts it so that it is not entirely accurate. Forms of adjustment error include
errors with weighting the data, data cleaning, and imputation.

5. Processing error
A processing error arises when there is a problem with processing the data
that causes an error of some kind. An example will be if the data were entered
incorrectly or if the data file is corrupt.

Sampling Error vs. Non-Sampling Error


Often, sampling error and non-sampling error are used in similar contexts, but
there are some crucial differences between both concepts. They include:

1. Sampling error can arise even when no apparent mistake’s been made, as
opposed to non-sampling error, which arises when a mistake occurs.

2. Sampling error occurs when the sample is not representative of the
universal truth, whereas non-sampling error is specific to a certain study
design.

3. Sampling error can be reduced greatly as sample size increases, but
non-sampling error requires more methodical processes to reduce.

4. Sampling error is often caused by internal factors, whereas non-sampling
error is caused by external factors not entirely related to a survey, study, or
census.

How to Reduce Errors


Reducing non-sampling error is not as easily achieved as reducing sampling
error. With sampling error, you can reduce the risk of error by simply
increasing the sample size. It will not work for non-sampling error, which is
often very difficult to detect and eliminate (unless very methodical
consideration is given to the source of the error).

To effectively reduce non-sampling error, careful consideration must be taken
by those designing the study to ensure the validity of the results. As such, a
researcher may design a mechanism into the study to reduce the error while
subsequently not introducing another error.

For example, a researcher may pay the individual a bonus depending on the
accuracy of their data entry, or they may film all interviews to ensure that the
interviewer stays on topic and on script.

Sampling Distribution
What Is a Sampling Distribution?
A sampling distribution is a probability distribution of a statistic obtained from a large number of
samples drawn from a specific population. The sampling distribution of a statistic is the
distribution of frequencies of the range of different values that the statistic could take
across those samples.

In statistics, a population is the entire pool from which a statistical sample is drawn. A population
may refer to an entire group of people, objects, events, hospital visits, or measurements. A
population can thus be said to be an aggregate observation of subjects grouped together by a
common feature.

● A sampling distribution is the distribution of a statistic that is arrived at
through repeated sampling from a larger population.
● It describes a range of possible outcomes of a statistic, such as the mean
or mode of some variable, used to estimate the value that truly exists in
a population.
● The majority of data analyzed by researchers are actually drawn from
samples, and not populations.

Understanding Sampling Distribution


A lot of data drawn and used by academicians, statisticians, researchers, marketers, analysts,
etc. are actually samples, not populations. A sample is a subset of a population.

For example, a medical researcher who wants to compare the average weight of all babies
born in North America from 1995 to 2005 to those born in South America within the same time
period cannot within a reasonable amount of time draw the data for the entire population of over
a million childbirths that occurred over the ten-year time frame. He will instead only use the
weight of, say, 100 babies in each continent to draw a conclusion. The weights of the 200 babies
used are the sample, and the average weight calculated is the sample mean.

Now suppose that instead of taking just one sample of 100 newborn weights from each
continent, the medical researcher takes repeated random samples from the general population,
and computes the sample mean for each sample group. So, for North America, he pulls up
newborn weights recorded in the US, Canada and Mexico as follows: four samples of 100
from select hospitals in the US, five samples of 70 from Canada and three samples of 150 from
Mexico, for a total of 1,200 weights of newborn babies grouped in 12 sets. He also collects a
sample of 100 birth weights from each of the 12 countries in South America.

The distribution of the average weights computed for the sample sets is the sampling
distribution of the mean. The mean is not the only statistic that can be calculated from a
sample. Other statistics, such as the standard deviation, variance, proportion, and range can
be calculated from sample data. The standard deviation and variance measure the variability
of the sampling distribution.

The number of observations in a population, the number of observations in a sample and the
procedure used to draw the sample sets determine the variability of a sampling distribution.

The standard deviation of a sampling distribution is called the standard error.

While the mean of a sampling distribution is equal to the mean of the population, the standard
error depends on the standard deviation of the population, the size of the population and the
size of the sample.

Knowing how spread apart the mean of each of the sample sets are from each other and from
the population mean will give an indication of how close the sample mean is to the population
mean. The standard error of the sampling distribution decreases as the sample size increases.
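
A small simulation can make these statements concrete. The sketch below is illustrative only: the population of birth weights is synthetic (normal with mean 7 pounds and standard deviation 1.2, both assumed), and the sample sizes of 25 and 100 are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)

# Synthetic population of 100,000 birth weights in pounds (assumed values).
population = [rng.gauss(7.0, 1.2) for _ in range(100_000)]

def sample_means(population, sample_size, n_samples, rng):
    """Draw n_samples random samples and return the mean of each one."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

means_25 = sample_means(population, 25, 500, rng)
means_100 = sample_means(population, 100, 500, rng)

# The centre of the sampling distribution sits near the population mean.
print(statistics.fmean(population), statistics.fmean(means_100))

# Its spread (the standard error) is smaller for the larger sample size.
print(statistics.stdev(means_25), statistics.stdev(means_100))
```

Each list of means is an empirical sampling distribution of the mean; its standard deviation is an estimate of the standard error for that sample size.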

Special Considerations
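A population or one sample set of numbers will not necessarily have a normal distribution.
The sampling distribution of the mean, however, tends toward a bell-curved shape as the size
of each sample grows, whatever the shape of the population, provided the population has a
finite variance (the central limit theorem).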

Following our example, the population average weight of babies in North America and in South
America has a normal distribution because some babies will be underweight (below the mean)
or overweight (above the mean), with most babies falling in between (around the mean). If the
average weight of newborns in North America is seven pounds, the sample mean weight in
each of the 12 sets of sample observations recorded for North America will be close to seven
pounds as well.

However, if you graph each of the averages calculated in each of the 12 sample groups, it is
difficult to predict with certainty what the actual shape will turn out to be; with so few
points it may even look roughly uniform. The more samples the researcher uses from the
population of over a million weight figures, the more the graph of their means will start
forming a normal distribution.
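
The tendency toward a bell shape can be checked numerically by comparing skewness, a measure of asymmetry that is zero for a normal distribution. The sketch below deliberately uses a strongly skewed synthetic population (exponential) instead of birth weights, so that the effect is visible; every number in it is an assumption for illustration.

```python
import math
import random
import statistics

def skewness(values):
    """Population skewness: mean cubed z-score (0 for a symmetric shape)."""
    mean = statistics.fmean(values)
    sd = math.sqrt(statistics.fmean([(v - mean) ** 2 for v in values]))
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

rng = random.Random(1)

# Strongly right-skewed synthetic population (exponential, mean 1).
population = [rng.expovariate(1.0) for _ in range(50_000)]

# Means of 2,000 random samples of 50 observations each.
means = [statistics.fmean(rng.sample(population, 50)) for _ in range(2_000)]

print(skewness(population))  # far from 0: the population is not bell-shaped
print(skewness(means))       # much closer to 0: the means look more normal
```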
