Module III-Sampling Design
Meaning and Definition
◦ Sampling refers to the method of selecting a small subset of data from a large population for the
purpose of carrying out an investigation. The selected subset is termed a sample, which is a
small and manageable version of the larger set of data. Sampling is most widely used in statistical
testing where the population is so large that it is impossible to include each individual
observation in the test.
◦ Under this technique, to ease the process of doing research on the whole population, the population is divided
into small sampling units. These sampling units should represent the characteristics of the whole
population and should not reflect bias towards a particular attribute. Samples drawn from the
population are used by the researcher for making statistical inferences and estimating
information about the whole population. The sampling methodology to be used
depends upon the type of analysis being conducted by the researcher.
◦ Characteristics of Sampling
◦ Various characteristics of sampling are discussed in points given below: –
1.Goal-oriented: The sampling design should be goal-oriented. It must align clearly with the objectives
of the research being conducted and should be in accordance with the conditions of the survey.
2.Proper universe representation: The sample chosen should adequately represent the characteristics of the
whole population from which it is taken. It should fairly represent all units without any
bias. There are different methods of choosing a sample, and the method needs to be chosen with utmost
care, as improper sampling leads to errors in the survey.
3.Proportional: The size of the sample should be proportional to the size of the population. It should be large
enough to represent the whole universe and must provide statistical reliability. The sample must
ensure adequate accuracy for the particular research study.
4.Economical: The process of sampling should be economical, requiring minimum cost and effort for
attaining the objectives of the survey.
5.Random selection: Sample units should be selected on a random basis under which every unit has
an equal chance of being chosen. This ensures that the sample is a fair representative of the whole
population.
6.Practical: The sample design should be simple and practical. It must be easy to understand
and to apply in fieldwork.
◦ Advantages of Sampling
◦ Various advantages of sampling are as discussed below: –
1.Lower sampling cost: Sampling reduces the overall cost involved in doing research. The cost of
collecting data about the entire population is quite high. Sampling reduces the population to small
manageable units. Acquiring data about a sample of the population involves lower cost, which is one
of its major advantages.
2.Less time consuming: Sampling reduces the overall time by reducing the number of units studied.
Data is not collected about every member of the population; only data related to the sample is gathered. It
is less time-consuming in comparison to the census technique.
3.Higher accuracy of data: A sample represents the whole population from which it is drawn. It is
used for calculating the desired descriptive statistics, and the stability of the derived sample values can
be easily determined. Samples permit a high level of accuracy because of the limited area of
operations. This enables proper execution of field work, and the results of studies conducted on the
basis of these sample units turn out to be accurate.
4.Higher scope of sampling: Sampling enables investigators to easily arrive at generalizations
about a set of data. It would be totally impractical to study the whole population when it is too large for
measuring the characteristics of all individual members. By analyzing variables
within a small proportion of the population, sampling eases the task of arriving at generalizations.
5.Intensive and exhaustive data: In studies based on sample units, observations are made of a
limited number of units. Therefore, exhaustive and intensive data can be collected.
6.Suitable in case of limited resources: Sampling is a very effective technique for collecting
information when an organization has limited resources. Studying the whole population
requires a large amount of resources in terms of both money and time. Sampling makes it possible
to cover the whole population satisfactorily even with limited resources.
7.Better rapport: Good rapport between the researcher and the respondents is a must for carrying
out an effective research study. With a large population, various issues of rapport arise; a
smaller sample makes rapport easier to establish.
◦ Disadvantages of Sampling
◦ Accuracy of a sample depends upon the appropriateness of the sampling method used. The theory of
sampling focuses on improving the efficiency of sampling. Major difficulties are posed at the time
of estimation, selection and administration of samples. Various disadvantages of the sampling
process are discussed in points given below: –
1.Chance of Bias: A major limitation of sampling is the chance of bias in choosing
sample units. When selection is judgmental, it depends on the mindset of the individual
choosing the units. Such a biased selection does not truly represent the whole population and may
lead the researcher to faulty conclusions.
2.Difficulty in choosing a truly representative sample: Choosing an adequate and reliable sample
that is truly representative of the population remains a difficult task. When the phenomenon under
study is complex and involves heterogeneous data, it becomes difficult to select proper
samples.
3.Lack of adequate subject knowledge: Applying the sampling process requires proper
knowledge of sampling technique on the part of the individual selecting sample units. The process
requires computation of probable error and statistical analysis. There are chances of serious
mistakes being committed by a researcher who lacks specialized knowledge about
sampling. Consequently, the overall results of the research study will be misleading.
4.Impossibility of sampling: Sampling is not applicable in cases where the universe is
too small and consists of a heterogeneous set of data. It is difficult to derive a representative
sample in such cases. A census study is the only alternative for such phenomena.
Also, sampling is inadequate for studies that need a high degree of accuracy. There is always a
chance of error in sampling even if sample units are chosen with utmost care.
Types of Sampling
◦ When you conduct research about a group of people, it’s rarely possible to collect data from every person
in that group. Instead, you select a sample. The sample is the group of individuals who will actually
participate in the research.
◦ To draw valid conclusions from your results, you have to carefully decide how you will select a sample
that is representative of the group as a whole. This is called a sampling method. There are two primary
types of sampling methods that you can use in your research:
• Probability sampling involves random selection, allowing you to make strong statistical inferences
about the whole group.
• Non-probability sampling involves non-random selection based on convenience or other criteria,
allowing you to easily collect data.
◦ Probability sampling methods
◦ Probability sampling means that every member of the population has a chance of being selected. It is
mainly used in quantitative research. If you want to produce results that are representative of the whole
population, probability sampling techniques are the most valid choice.
◦ There are four main types of probability sample.
◦ 1. Simple random sampling
◦ In a simple random sample, every member of the population has an equal chance of being selected. Your
sampling frame should include the whole population.
◦ To conduct this type of sampling, you can use tools like random number generators or other techniques
that are based entirely on chance.
◦ Example: Simple random sampling. You want to select a simple random sample of 100 of the 1000 employees of a
social media marketing company. You assign a number to every employee in the company database from
1 to 1000, and use a random number generator to select 100 numbers.
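The selection step in this example (100 numbers drawn from employee numbers 1 to 1000) can be sketched as follows. This is a minimal illustration using Python's standard `random` module; the function name `simple_random_sample` and the `seed` parameter are assumptions added for reproducibility, not part of the original example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct employee numbers from 1..population_size."""
    rng = random.Random(seed)  # seeded generator so the draw can be repeated
    # random.sample draws without replacement, so no number repeats
    # and every number has the same chance of being chosen.
    return rng.sample(range(1, population_size + 1), sample_size)

selected = simple_random_sample(1000, 100, seed=42)
print(sorted(selected)[:10])  # first few selected employee numbers
```

Drawing without replacement matters here: an employee cannot be surveyed twice, so the 100 numbers must all be different.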
◦ 2. Systematic sampling
◦ Systematic sampling is similar to simple random sampling, but it is usually slightly easier to conduct.
Every member of the population is listed with a number, but instead of randomly generating numbers,
individuals are chosen at regular intervals.
◦ Example: Systematic sampling. All employees of the company are listed in alphabetical order. From the
first 10 numbers, you randomly select a starting point: number 6. From number 6 onwards, every 10th
person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people.
◦ If you use this technique, it is important to make sure that there is no hidden pattern in the list that might
skew the sample. For example, if the HR database groups employees by team, and team members are
listed in order of seniority, there is a risk that your interval might skip over people in junior roles,
resulting in a sample that is skewed towards senior employees.
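The interval-based selection described above can be sketched in a few lines. This is a hedged illustration, assuming the 1000 employees are represented by their list positions 1 to 1000; `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(population, interval, seed=None):
    """Pick a random start within the first interval, then every interval-th unit."""
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # starting point among the first `interval` positions
    # Positions are 1-based in the example, list indices are 0-based.
    return population[start - 1::interval]

employees = list(range(1, 1001))  # positions in the alphabetical list
sample = systematic_sample(employees, interval=10, seed=1)
print(sample[:4], len(sample))
```

Only the starting point is random; every later pick is fixed by the interval, which is why a hidden periodic pattern in the list can skew the sample.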
◦ 3. Stratified sampling
◦ Stratified sampling involves dividing the population into subpopulations that may differ in important ways.
It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the
sample.
◦ To use this sampling method, you divide the population into subgroups (called strata) based on the relevant
characteristic (e.g., gender identity, age range, income bracket, job role).
◦ Based on the overall proportions of the population, you calculate how many people should be sampled
from each subgroup. Then you use random or systematic sampling to select a sample from each subgroup.
◦ Example: Stratified sampling. The company has 800 female employees and 200 male employees. You want
to ensure that the sample reflects the gender balance of the company, so you sort the population into two
strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men,
which gives you a representative sample of 100 people.
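The proportional allocation in this example can be sketched as below. It is a minimal illustration, assuming each stratum is available as a list of employee identifiers; the identifiers and the helper name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """Sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; with awkward proportions the rounded
        # counts may not add up exactly to total_sample_size.
        k = round(total_sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)  # simple random sample within the stratum
    return sample

strata = {
    "female": [f"F{i}" for i in range(800)],  # hypothetical employee IDs
    "male": [f"M{i}" for i in range(200)],
}
sample = stratified_sample(strata, total_sample_size=100, seed=7)
print({name: len(members) for name, members in sample.items()})
```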
◦ 4. Cluster sampling
◦ Cluster sampling also involves dividing the population into subgroups, but each subgroup should have similar
characteristics to the whole population. Instead of sampling individuals from each subgroup, you randomly select
entire subgroups.
◦ If it is practically possible, you might include every individual from each sampled cluster. If the clusters
themselves are large, you can also sample individuals from within each cluster using one of the techniques
above. This is called multistage sampling.
◦ This method is good for dealing with large and dispersed populations, but there is more risk of error in the
sample, as there could be substantial differences between clusters. It’s difficult to guarantee that the sampled
clusters are really representative of the whole population.
◦ Example: Cluster sampling. The company has offices in 10 cities across the country (all with roughly the same
number of employees in similar roles). You don’t have the capacity to travel to every office to collect your data,
so you use random sampling to select 3 offices – these are your clusters.
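A minimal sketch of this selection, assuming the 10 offices are stored as a mapping from office name to its employees. The office names, employee identifiers and office size are made up for illustration, and `cluster_sample` is a hypothetical helper.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters whole clusters and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random choice of cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical data: 10 city offices with 100 employee IDs each.
offices = {
    f"office_{i}": [f"office_{i}_emp_{j}" for j in range(100)]
    for i in range(1, 11)
}

sampled = cluster_sample(offices, n_clusters=3, seed=3)
print(sorted(sampled))  # the 3 offices that will be visited
```

For multistage sampling, one could go on to draw a simple random sample of employees inside each selected office instead of keeping everyone.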
◦ Non-probability sampling methods
◦ In a non-probability sample, individuals are selected based on non-random criteria, and not every
individual has a chance of being included.
◦ This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means
the inferences you can make about the population are weaker than with probability samples, and your
conclusions may be more limited. If you use a non-probability sample, you should still aim to make it as
representative of the population as possible.
◦ Non-probability sampling techniques are often used in exploratory and qualitative research. In these
types of research, the aim is not to test a hypothesis about a broad population, but to develop an initial
understanding of a small or under-researched population.
◦ 1. Convenience sampling
◦ A convenience sample simply includes the individuals who happen to be most accessible to the
researcher.
◦ This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is
representative of the population, so it can’t produce generalizable results. Convenience samples are at
risk for both sampling bias and selection bias.
◦ Example: Convenience sampling. You are researching opinions about student support services in your
university, so after each of your classes, you ask your fellow students to complete a survey on the topic.
This is a convenient way to gather data, but as you only surveyed students taking the same classes as you
at the same level, the sample is not representative of all the students at your university.
◦ 2. Voluntary response sampling
◦ Similar to a convenience sample, a voluntary response sample is mainly based on ease of access. Instead
of the researcher choosing participants and directly contacting them, people volunteer themselves (e.g.
by responding to a public online survey).
◦ Voluntary response samples are always at least somewhat biased, as some people will inherently be more
likely to volunteer than others, leading to self-selection bias.
◦ Example: Voluntary response sampling. You send out the survey to all students at your university and a lot
of students decide to complete it. This can certainly give you some insight into the topic, but the people
who responded are more likely to be those who have strong opinions about the student support services,
so you can’t be sure that their opinions are representative of all students.
◦ 3. Judgement sampling
◦ This type of sampling, also known as purposive sampling, involves the researcher using their expertise to
select a sample that is most useful to the purposes of the research.
◦ It is often used in qualitative research, where the researcher wants to gain detailed knowledge about a
specific phenomenon rather than make statistical inferences, or where the population is very small and
specific. An effective purposive sample must have clear criteria and rationale for inclusion. Always make
sure to describe your inclusion and exclusion criteria and beware of observer bias affecting your
arguments.
◦ Example: Purposive sampling. You want to know more about the opinions and experiences of disabled
students at your university, so you purposefully select a number of students with different support needs
in order to gather a varied range of data on their experiences with student services.
◦ 4. Snowball sampling
◦ If the population is hard to access, snowball sampling can be used to recruit participants via other
participants. The number of people you have access to “snowballs” as you get in contact with more
people. The downside here is also representativeness, as you have no way of knowing how representative
your sample is due to the reliance on participants recruiting others. This can lead to sampling bias.
◦ Example: Snowball sampling. You are researching experiences of homelessness in your city. Since there is
no list of all homeless people in the city, probability sampling isn’t possible. You meet one person who
agrees to participate in the research, and she puts you in contact with other homeless people that she
knows in the area.
◦ 5. Quota sampling
◦ Quota sampling relies on the non-random selection of a predetermined number or proportion of units.
This is called a quota.
◦ You first divide the population into mutually exclusive subgroups (called strata) and then recruit sample
units until you reach your quota. These units share specific characteristics, determined by you prior to
forming your strata. The aim of quota sampling is to control what or who makes up your sample.
◦ Example: Quota sampling. You want to gauge consumer interest in a new produce delivery service in
Boston, focused on dietary preferences. You divide the population into meat eaters, vegetarians, and
vegans, drawing a sample of 600 people. Since the company wants to cater to all consumers, you set a
quota of 200 people for each dietary group. In this way, all dietary preferences are equally represented in
your research, and you can easily compare these groups. You continue recruiting until you reach the quota
of 200 participants for each subgroup.
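The recruit-until-the-quota-is-full logic can be sketched as follows. This is an illustration only: the stream of respondents is synthetic, and `quota_sample` is a hypothetical helper. Respondents are taken in the order they arrive, which is what makes the method non-random.

```python
def quota_sample(respondents, quotas):
    """Accept respondents in arrival order until every group's quota is filled."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person, group in respondents:
        if group in counts and counts[group] < quotas[group]:
            sample.append((person, group))
            counts[group] += 1
        if all(counts[g] == quotas[g] for g in quotas):
            break  # every quota reached, stop recruiting
    return sample

# Synthetic arrival stream: meat eaters turn up three times as often.
pattern = ["meat eater", "meat eater", "meat eater", "vegetarian", "vegan"]
respondents = [(i, pattern[i % 5]) for i in range(2000)]

quotas = {"meat eater": 200, "vegetarian": 200, "vegan": 200}
sample = quota_sample(respondents, quotas)
print(len(sample))
```

Meat eaters who arrive after their quota is full are turned away, so the final sample is balanced across groups even though the arrival stream is not.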
◦ Population vs. sample
◦ First, you need to understand the difference between a population and a sample, and identify the target population of
your research.
• The population is the entire group that you want to draw conclusions about.
• The sample is the specific group of individuals that you will collect data from.
◦ The population can be defined in terms of geographical location, age, income, or many other characteristics.
◦ It can be very broad or quite narrow: maybe you want to make inferences about the whole adult population of your
country; maybe your research focuses on customers of a certain company, patients with a specific health condition, or
students in a single school.
◦ It is important to carefully define your target population according to the purpose and practicalities of your project.
◦ If the population is very large, demographically mixed, and geographically dispersed, it might be difficult to gain access
to a representative sample. A lack of a representative sample affects the validity of your results, and can lead to several
research biases, particularly sampling bias.
◦ The sample size is the number of individual observations that are collected
during an experiment or a survey. To understand it a little better, we can take an example. If we
test 100 plants for a certain type of disease-causing virus, the sample size is 100. If, in a survey,
you receive 30,500 completed questionnaires, then the sample size is 30,500. The sample size
in statistics is represented by the letter ‘n’.
◦ Factors Contributing To Sample Size Collection
◦ There are certain factors that are taken into consideration before determining the sample size of a
particular experiment or a study. These are-
• Size of the Population- The size of the population being studied is the first thing that is considered.
A study that draws conclusions about a larger area, say, an entire country, will require a larger sample
size than a study that covers a smaller area, like a state or a city.
• The margin of error- This is another thing that is considered: to what extent is the collected data accurate?
Data drawn from a sample always carries some chance of error, so the margin of error is always considered.
• Standard Deviation – Standard deviation refers to how much the individual observations
vary within the group being studied. For example, samples of soil collected from a single park
are likely to show less deviation in nitrogen content than samples of
soil collected from across the nation.
◦ What Dangers Are Related to Small Sample Size?
◦ It might be assumed that a small sample is enough to give accurate findings. Let us test
this with an example. A team decided to study how many people exercise daily in a country.
They picked 5 people and interviewed them. Two of
them said that they exercise regularly. The outcome of the study would be that 40% of the
population exercises regularly, and this would be taken to represent the country as a whole. Such data
carries a lot of inaccuracy, and the margin of error is quite high. The smaller the sample
size, the higher the margin of error, and vice versa. Hence, it is advisable to select a
sufficiently large sample to conduct any given experiment or study.
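To put a rough number on this, the margin of error for the 5-person survey (2 of 5, so 40%) can be compared with that of a larger sample. The sketch below assumes a 95% confidence level (z = 1.96) and the usual normal-approximation formula z·√(p(1−p)/n). Neither is stated in the example, and the approximation is itself crude for a sample as small as 5, so treat the figures as indicative only.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

small = margin_of_error(0.4, 5)    # the 5-person survey
large = margin_of_error(0.4, 385)  # an assumed larger survey, for comparison
print(f"n=5:   +/- {small:.1%}")
print(f"n=385: +/- {large:.1%}")
```

With 5 respondents the margin is roughly 43 percentage points either way, so the true share could be almost anything; with a few hundred respondents it falls below 5 points.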
◦Calculating the Sample Size
◦To calculate the size of the sample, the figures you need are the desired confidence level, the margin of error, and the total number of people in the
population. There are two sample size formulas-
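Sample Size Calculation is Done In the Following Manner
◦For a percentage distribution of 50% ($p = 0.5$), a margin of error of 5% ($e = 0.05$) and a confidence level of 95% (score $z = 1.96$), the sample size is the distribution term divided by the square of (margin of error / confidence level score):
◦$n = \dfrac{p(1-p)}{(e/z)^2} = \left(\dfrac{1.96 \times 0.5}{0.05}\right)^2 = 19.6^2$
◦Which comes out to be 384.16 and, since the sample size must be a whole number, it is rounded up to 385.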
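The calculation above can be reproduced with a short sketch. It assumes a 95% confidence level (z = 1.96), a 5% margin of error and a 50% distribution, which are the values that yield the 384.16 quoted above. The second function applies the standard finite population correction, which the text does not show; it is included only as one plausible use of the "total number of people in the population" input, and the population of 1000 is an illustrative assumption.

```python
import math

def sample_size(z, margin_of_error, p=0.5):
    """Unadjusted sample size: n = z^2 * p * (1 - p) / e^2."""
    return (z ** 2) * p * (1 - p) / margin_of_error ** 2

def finite_population_adjusted(n, population_size):
    """Finite population correction (an assumption here, not given in the text)."""
    return n / (1 + (n - 1) / population_size)

n = sample_size(z=1.96, margin_of_error=0.05)
n_rounded = math.ceil(n)  # round up: a fraction of a respondent is not possible
n_adjusted = math.ceil(finite_population_adjusted(n, population_size=1000))
print(round(n, 2), n_rounded, n_adjusted)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.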
◦ Conclusion
◦ Not only a small sample size, but even a very large one can be a problem. Interpreting
such large samples is difficult for the researcher and can also affect the figures. Hence, it
is not recommended to take samples larger than necessary. A moderate sample size is what
should be taken in order to obtain accurate results. Though there is no specified limit for
deciding the sample size, there are a few rules of thumb that can be followed. One of them says
a minimum of 30 samples should be taken, and another says a minimum of 12 samples should
be considered before carrying out a study.