Sample and Sampling
Studying every member of a population is often impractical because of constraints
like cost, time, and accessibility. Therefore, researchers typically select a smaller, manageable
subset of the population, known as a sample, to study. The process of selecting this subset is
called sampling or the sampling technique. The primary goal is to choose a sample that
accurately represents the larger population, allowing researchers to draw valid conclusions and
make statistical inferences about the population based on the sample data. Sampling methods are
broadly categorized into probability sampling (random selection, generalizable results) and non-
probability sampling (non-random selection, often used for exploratory research). The choice of
sampling technique significantly impacts the reliability and generalizability of research findings.
Understanding and mitigating potential sampling bias is crucial for ensuring the external validity
of the study.
What is a Sample?
A population refers to the complete set of individuals, objects, events, or items that share
common characteristics relevant to a research question. It's the entire group about which the
researcher wishes to draw conclusions. Populations can be broadly defined (e.g., all adults in a
country) or narrowly defined (e.g., patients with a specific condition at one hospital).
A sample is a subset, a smaller, manageable group of individuals or items selected from the
larger population. It is the specific group from which data is actually collected. The elements
within a sample are sometimes referred to as sample points, sampling units, or observations.
The core idea is that the sample should be representative of the population it is drawn from. This
representativeness allows researchers to generalize findings from the sample back to the entire
population using inferential statistics. If the sample accurately reflects the characteristics of the
population, the conclusions drawn from studying the sample are likely applicable to the
population.
Sampling Frame
The sampling frame is the actual list or source from which the sample is drawn. Ideally, it should
encompass the entire target population without including any elements outside that population.
For example, a company's HR database listing all employees could serve as the sampling frame
for a study on that company's workforce. Mismatches between the sampling frame and the target
population can introduce sampling bias.
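A mismatch between frame and target population can be checked with plain set operations. The following is a minimal sketch: the employee IDs, the out-of-date HR export, and the function name frame_coverage are all hypothetical and chosen only to illustrate the idea.

```python
def frame_coverage(target_population, sampling_frame):
    """Compare a sampling frame against the target population (iterables of IDs)."""
    target = set(target_population)
    frame = set(sampling_frame)
    return {
        # In the population but missing from the frame: can never be sampled.
        "undercoverage": target - frame,
        # In the frame but outside the population: should not be sampled.
        "overcoverage": frame - target,
        # Share of the target population that the frame actually lists.
        "coverage_rate": len(target & frame) / len(target) if target else 0.0,
    }


# Hypothetical data: six current employees and an HR export that is out of date.
current_employees = {"E01", "E02", "E03", "E04", "E05", "E06"}
hr_database = {"E01", "E02", "E03", "E04", "E99"}  # E99 has left; E05, E06 are new

report = frame_coverage(current_employees, hr_database)
print(sorted(report["undercoverage"]))    # ['E05', 'E06']
print(sorted(report["overcoverage"]))     # ['E99']
print(round(report["coverage_rate"], 2))  # 0.67
```

Any sample drawn from this frame would miss the two newest employees no matter how the selection is randomized, which is how a frame mismatch turns into sampling bias.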
Sampling offers several practical advantages over studying the whole population (a census):
Time-Saving: Data collection and analysis are faster with a sample than with a census.
Resource Efficiency: Fewer resources (personnel, equipment) are needed to study a sample.
Accuracy: A well-selected sample can sometimes yield more accurate results than a poorly
conducted census, as more resources can be dedicated to ensuring data quality from each
participant.
Necessity for Destructive Testing: In cases where the measurement process destroys the item
being tested (e.g., testing the lifespan of fuses), sampling is the only viable option.
Accessibility: Some populations are difficult to reach in their entirety (e.g., homeless individuals,
disaster survivors).
A sampling technique or sampling method refers to the statistical approach or process used to
select a representative sample from a population. It's the strategy employed to choose the
individuals or items that will participate in the research. The fundamental goal is to select a
sample that mirrors the characteristics of the target population as closely as possible, thereby
allowing for generalization of the research findings. The specific method chosen depends on the
research objectives, the nature of the population, and available resources. Researchers should
clearly justify their chosen sampling method in their reports.
Sampling methods are primarily divided into two major categories: probability sampling and
non-probability sampling.
1. Probability Sampling
This approach involves random selection, ensuring that every member of the target population
has a known, non-zero chance of being included in the sample. It is the foundation for making
strong statistical inferences about the population.
Advantages: Provides the best chance of obtaining a representative sample, allows for estimation
of sampling error, reduces sampling bias, enables generalization to the population.
1. Simple Random Sampling (SRS): Every individual in the population has an equal chance
of selection. Selection is typically done using random number generators or similar
chance-based methods. Example: Assigning numbers to all 1000 employees in a company and
randomly selecting 100 numbers.
3. Stratified Sampling: The population is first divided into mutually exclusive subgroups
(strata) based on relevant characteristics (e.g., age, gender, income). Then, a random or
systematic sample is drawn from each stratum, often in proportion to the stratum's size in the
population. This ensures representation of key subgroups. Example: Dividing company employees
into male and female strata and randomly selecting 80 women and 20 men, matching the company's
80:20 gender ratio, for a sample of 100.
4. Cluster Sampling: The population is divided into subgroups (clusters), typically based on
geography or other natural groupings. Unlike stratified sampling, clusters should ideally be mini-
representations of the population. Entire clusters are then randomly selected, and all individuals
within the selected clusters (or a random sample from within them) are included in the study.
This is useful for large, dispersed populations but carries a higher risk of error if clusters differ
significantly. Example: Randomly selecting 3 out of 10 company office locations (clusters) and
surveying all employees in those 3 offices.
5. Multistage Sampling: A more complex form of cluster sampling where sampling occurs
in stages. Smaller groups are progressively selected from larger groups at each stage.
Example: Randomly selecting states, then cities within those states, then neighborhoods within
those cities, then households within those neighborhoods.
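The simple random, stratified, and cluster examples above can be sketched with Python's standard library. This is an illustration under stated assumptions rather than a definitive implementation: the employee records, the 800/200 gender split, the even spread of 100 employees across each of 10 offices, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def simple_random_sample(units, n, rng):
    """Every unit has the same chance of selection; drawn without replacement."""
    return rng.sample(units, n)


def stratified_sample(units, stratum_of, n, rng):
    """Proportional allocation: each stratum contributes in proportion to its size."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n
        # when the stratum shares are not whole numbers.
        k = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(units, cluster_of, n_clusters, rng):
    """One-stage cluster sampling: pick whole clusters at random, keep every unit in them."""
    clusters = defaultdict(list)
    for unit in units:
        clusters[cluster_of(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical company: 1000 employees as (id, gender, office) records.
employees = [
    (i, "F" if i < 800 else "M", f"office-{i % 10}")
    for i in range(1000)
]

srs = simple_random_sample(employees, 100, rng)
strat = stratified_sample(employees, lambda e: e[1], 100, rng)
clust = cluster_sample(employees, lambda e: e[2], 3, rng)

print(len(srs))                                   # 100
print(sum(e[1] == "F" for e in strat),
      sum(e[1] == "M" for e in strat))            # 80 20
print(len(clust), len({e[2] for e in clust}))     # 300 3
```

The stratified draw always returns exactly 80 women and 20 men, whereas the gender split of a simple random sample varies from draw to draw. The cluster sample is cheaper to collect (three offices to visit) but depends entirely on those three offices resembling the rest.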
2. Non-Probability Sampling
This approach involves non-random selection based on criteria like convenience, accessibility, or
the researcher's judgment. Not every member of the population has a known chance of being
selected, and some members may have no chance at all.
Advantages: Often easier, faster, and less expensive to implement. Useful for exploratory
research, generating hypotheses, or studying hard-to-reach populations.
Disadvantages: Higher risk of sampling bias, results may not be generalizable to the broader
population, difficult to estimate sampling error. Inferences about the population are weaker.
1. Convenience Sampling: Selecting individuals who are easiest to reach or most accessible
to the researcher (also called accidental or haphazard sampling). Example: Surveying fellow
students in your own classes about university services. Highly prone to bias.
2. Voluntary Response Sampling: Individuals self-select into the study, often by responding to
an open invitation (e.g., an online poll). Prone to self-selection bias, as those with strong opinions
or particular characteristics are more likely to participate.
3. Purposive Sampling (Judgmental Sampling): The researcher uses their expertise or
judgment to deliberately select participants who meet specific criteria relevant to the research
question. Often used in qualitative research or when targeting a very specific population.
Example: Selecting students with specific disabilities to understand their experiences with
university support services.
4. Snowball Sampling (Chain-Referral Sampling): Used when participants are hard to find.
Initial participants are asked to refer others who meet the study criteria. Useful for hidden or
hard-to-reach populations (e.g., homeless individuals) but can lead to a non-representative
sample. Example: Asking an initial homeless participant to recommend other homeless
individuals for the study.
5. Quota Sampling: The researcher first identifies relevant subgroups (strata) and determines the
proportion of each subgroup needed in the sample. Participants are then recruited non-randomly
(often via convenience or judgment) until the quotas for each subgroup are filled. This controls
the sample composition, but selection within quotas remains non-random. Example: Aiming for a
sample of 600 consumers with quotas of 200 each for meat-eaters, vegetarians, and vegans,
recruiting non-randomly until each quota is met.
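The quota procedure can be sketched as a loop that accepts people in whatever order they turn up until every quota is full. The quotas of 200 per dietary group follow the example above; the stream of arrivals (7 meat-eaters, 2 vegetarians, and 1 vegan in every 10 people) and the function name quota_sample are assumptions made for illustration.

```python
from itertools import cycle, islice


def quota_sample(arrivals, group_of, quotas):
    """Accept arrivals in the order they turn up until every quota is full."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled; stop recruiting
    return sample


# Hypothetical stream of passers-by as (arrival_index, diet) pairs.
pattern = ["meat"] * 7 + ["vegetarian"] * 2 + ["vegan"]
arrivals = list(enumerate(islice(cycle(pattern), 5000)))

quotas = {"meat": 200, "vegetarian": 200, "vegan": 200}
sample = quota_sample(arrivals, lambda person: person[1], quotas)

counts = {group: sum(1 for p in sample if p[1] == group) for group in quotas}
print(counts)         # {'meat': 200, 'vegetarian': 200, 'vegan': 200}
print(sample[-1][0])  # 1999
```

No random selection happens anywhere in this loop. The meat-eater quota is filled by the earliest arrivals, while recruitment has to continue through the 2000th passer-by to find enough vegans, so who ends up in each quota depends on who is easiest to reach.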
Sampling Bias
Definition
Sampling bias occurs when the sampling process systematically favors certain members or
characteristics of the population over others, resulting in a sample that is not representative of the
population. It's a systematic error introduced during sample selection. This bias limits the
generalizability (external validity) of the research findings. Findings from biased samples can
only be reliably generalized to populations similar to the sample itself.
Undercoverage Bias: Certain groups within the population are inadequately represented
in the sample. Example: Online surveys missing elderly or low-income groups.
Self-Selection / Voluntary Response Bias: Participants choose to be in the study, leading
to a sample not representative of those less inclined to volunteer.
Nonresponse Bias: Differences between respondents and non-respondents skew the
results. Example: Employees with high workloads being less likely to respond to a stress
survey.
Survivorship Bias: Focusing only on cases that "survived" a process, ignoring those that
didn't. Example: Studying only successful companies, ignoring failed ones.
Convenience Bias: Selecting easily accessible participants.
Judgment Bias: Bias introduced through the researcher's subjective selection.
Recall Bias: Participants inaccurately remember past events or experiences.
Exclusion Bias: Intentionally leaving out certain groups.
Healthy User Bias: Participants in health interventions are often healthier or more health-
conscious than the general population.
Pre-screening/Advertising Bias: How participants are recruited or screened influences the
sample composition.
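A small simulation can make the undercoverage case concrete. This is a sketch on invented data: the population of 100,000, the 70/30 split between people reachable online and those who are not, and the two group averages (50 and 70 on some measured scale) are all assumptions, not figures from any study.

```python
import random
from statistics import fmean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 70,000 people reachable online, 30,000 who are not.
# The two groups differ on the quantity being measured.
online = [rng.gauss(50, 10) for _ in range(70_000)]
offline = [rng.gauss(70, 10) for _ in range(30_000)]
population = online + offline
true_mean = fmean(population)


def abs_error(frame, n):
    """Absolute error of the sample mean when n units are drawn at random from frame."""
    return abs(fmean(rng.sample(frame, n)) - true_mean)


errors = {}
for n in (100, 10_000):
    errors[n] = {
        "srs": abs_error(population, n),      # frame covers everyone
        "online_only": abs_error(online, n),  # frame misses the offline group
    }
    print(n, round(errors[n]["srs"], 2), round(errors[n]["online_only"], 2))
```

With the full frame, the error shrinks as the sample grows. With the online-only frame, the error stays close to the gap between the online group's average and the population average (about 6 points under these assumptions), however many people are surveyed.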
Sample Size
The number of individuals included in a sample is crucial. Sample size determination depends on
factors like the population size, the variability within the population, the desired level of
precision, the research design, and the statistical analysis planned.
While larger samples can reduce random error, they do not eliminate systematic bias if the
sampling method itself is flawed.
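One common way to turn these factors into a number, when the goal is to estimate a proportion from a simple random sample, is Cochran's formula with a finite population correction. The formula is not given in the text above and is used here as an assumption; the function name and its default values (95% confidence, p = 0.5 as the most conservative guess at variability) are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence=0.95, p=0.5,
                               population_size=None):
    """Cochran's sample size for estimating a proportion from a simple random sample.

    margin_of_error: desired half-width of the confidence interval (0.05 = 5 points)
    p: anticipated proportion; 0.5 gives the largest (most conservative) size
    population_size: if given, apply the finite population correction
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    n = z ** 2 * p * (1 - p) / margin_of_error ** 2
    if population_size is not None:
        n = n / (1 + (n - 1) / population_size)
    return math.ceil(n)


print(sample_size_for_proportion(0.05))                        # 385
print(sample_size_for_proportion(0.05, population_size=1000))  # 278
print(sample_size_for_proportion(0.025))                       # 1537
```

Halving the margin of error roughly quadruples the required sample, and a small population lowers it. The calculation addresses random error only, so it says nothing about bias introduced by the sampling method.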