Sampling Article
Selecting the Right Sample Size: Methods and Considerations for Social
Science Researchers
4 authors, including: Sathyanarayana S S, Pushpa B V, and Dr Hema Harsha (all M P Birla Institute of Management)
All content following this page was uploaded by Sathyanarayana S S on 24 September 2024.
ABSTRACT
The aim of this study is to provide a comprehensive framework of various sampling techniques utilized in social
science research. Sampling is a critical step in research design, influencing the accuracy and reliability of study
findings. This article covers essential methodologies including estimating a population proportion (single
proportion), estimating a population mean, estimating the difference between two population means, and
estimating the difference between two population proportions. Additionally, it delves into sample size
determination methods, highlighting Cochran’s formula for survey research, Nunnally’s formula for scale
development, Yamane’s formula, and Krejcie and Morgan’s table. The concept of confidence intervals and
confidence levels is thoroughly explored, elucidating their significance in inferential statistics. By examining how
confidence intervals work, the study emphasizes the importance of precision and reliability in research estimates.
The article also addresses critical sampling considerations that researchers must account for to ensure robust and
valid results. The findings provide a detailed comparison of these methodologies, offering insights into their
applicability and limitations in various research scenarios. This guide serves as a valuable resource for
researchers, aiding them in selecting appropriate sampling techniques for their studies, thereby enhancing the
quality and credibility of their research outcomes.
JEL Classification: C83, C10, C13, C18
Keywords: Sampling Techniques, Population Proportion, Population Mean, Confidence Intervals, Sample
Size Determination
Date of Submission: 15-07-2024    Date of Acceptance: 31-07-2024
I. INTRODUCTION
Sampling is a fundamental component of social science research, serving as a cornerstone for drawing
reliable conclusions about populations. By selecting representative samples, researchers can generalize findings
to broader contexts (Trochim and Donnelly (2008)). This process is essential for ensuring the generalizability of
research findings and enhancing the external validity of studies. Moreover, sampling enables researchers to
optimize resource allocation, minimizing costs and time requirements while still obtaining meaningful results
(Babbie (2020)). Through well-designed sampling techniques, such as probability sampling, researchers can
achieve accuracy and precision in estimating population parameters (Fowler (2013)). This approach helps mitigate
selection bias because each member of the population has a known, non-zero chance of being included in the
sample (an equal chance under simple random sampling).
Additionally, sampling addresses ethical considerations by minimizing the burden on participants and protecting
their rights and welfare (Bryman (2016)). By selecting representative subsets and anonymizing data, researchers
can uphold confidentiality and privacy standards. Therefore, sampling plays a vital role in social science research,
facilitating generalizability, resource efficiency, accuracy, precision, and ethical considerations, thus contributing
to the integrity and validity of research findings.
The sample size in research significantly influences the power, precision, and generalizability of the
study’s findings. Understanding these implications is crucial for designing robust studies and interpreting results
accurately. Power refers to the probability of detecting a true effect when it exists. A larger sample size increases
the statistical power of a study, making it more likely to identify significant effects. Smaller sample sizes can
result in underpowered studies, increasing the risk of Type II errors (Cohen (1992); Maxwell, Kelley, and Rausch
(2008)). Precision refers to the degree to which repeated measurements under unchanged conditions show the
same results. Larger sample sizes reduce the standard error of the estimate, leading to narrower confidence
intervals and more precise estimates of population parameters (Kish (1965); Hair et al. (2010)). Generalizability
is the extent to which the findings of a study can be applied to the broader population. A larger and more
representative sample size improves the external validity of the study, ensuring that the results are applicable to a
wider population (Babbie (2016); Henrich, Heine, and Norenzayan (2010)).
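The link between sample size and precision described above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes a known population standard deviation, simple random sampling, and a normal approximation, and the helper name `ci_half_width` is hypothetical rather than anything defined in this article.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(sigma: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a normal-approximation confidence interval for a mean.

    Assumes a known standard deviation `sigma` and simple random sampling.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sigma / sqrt(n)
    return z * standard_error


if __name__ == "__main__":
    # The standard error shrinks with the square root of n,
    # so quadrupling the sample size halves the interval width.
    for n in (25, 100, 400):
        print(n, round(ci_half_width(sigma=10, n=n), 2))
```

Because the half-width is proportional to 1/sqrt(n), each further gain in precision requires a disproportionately larger sample.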
$$ n = \frac{2\,(Z_{\alpha/2} + Z_{\beta})^{2}\,\sigma^{2}}{(\mu_1 - \mu_2)^{2}} $$
Where:
$n$ is the required sample size for each group.
$Z_{\alpha/2}$ is the critical value from the standard normal distribution corresponding to the desired significance level ($\alpha/2$).
$Z_{\beta}$ is the critical value from the standard normal distribution corresponding to the desired statistical power ($1-\beta$).
$\sigma$ is the common standard deviation of the populations.
$\mu_1 - \mu_2$ is the difference in the means of the populations or groups.
This formula provides an estimate of the sample size needed in each group to detect a difference between the
means with a specified level of significance and statistical power. A larger sample size generally leads to a smaller
margin of error, providing more precise estimates of the difference between the population means. It is important
to note that researchers typically use prior information, pilot studies, or literature reviews to estimate the common
standard deviation (𝜎 ) and select appropriate values for the significance level (𝛼 ), statistical power (1−β), and
desired difference in means (𝜇1 − 𝜇2 ) based on the research objectives and practical considerations.
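One way to see how the required sample size responds to the target difference is to rewrite the formula in terms of a standardized difference. The symbol $d$ below is introduced here purely for illustration and is not part of the article's notation; it assumes the same common standard deviation $\sigma$ as above.

$$ d = \frac{\mu_1 - \mu_2}{\sigma} \quad\Longrightarrow\quad n = \frac{2\,(Z_{\alpha/2} + Z_{\beta})^{2}}{d^{2}} $$

Under this form, halving the difference to be detected while holding $\sigma$, $\alpha$, and power fixed quadruples the required sample size per group.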
Example 1: Consider a researcher who wants to compare the effectiveness of two different teaching methods,
Method A and Method B, in improving students’ test scores. The researcher wants to test whether there is a
significant difference in the average test scores between the two methods at a significance level of 0.05 with a
statistical power of 0.80. Suppose a pilot study or a review of previous research indicates that the common
standard deviation of test scores for both methods is 10 points (𝜎 = 10). The researcher wants to detect a
difference of at least 5 points (𝜇1 − 𝜇2 = 5) between the average test scores of the two methods.
Solution
Using the formula for estimating sample size for comparing two population means:
$$ n = \frac{2\,(Z_{\alpha/2} + Z_{\beta})^{2}\,\sigma^{2}}{(\mu_1 - \mu_2)^{2}} $$
where:
$Z_{\alpha/2} = 1.96$ (corresponding to a significance level of 0.05)
$Z_{\beta} = 0.84$ (corresponding to a statistical power of 0.80)
$\sigma = 10$ (common standard deviation)
$\mu_1 - \mu_2 = 5$ (difference in means)
$$ n = \frac{2\,(1.96 + 0.84)^{2} \cdot 10^{2}}{5^{2}} $$
$$ n = 62.72 $$
Rounding up to the nearest whole number, the researcher would need a sample size of approximately 63 students
for each teaching method group to compare the effectiveness of Method A and Method B in improving test scores
with a significance level of 0.05 and a statistical power of 0.80, assuming a common standard deviation of 10
points and a difference in means of 5 points.
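The calculation in Example 1 can be reproduced programmatically. The following is a minimal sketch, assuming a two-sided test, equal group sizes, and a common standard deviation; the function name `n_per_group_two_means` and its parameter names are hypothetical. It takes critical values from Python's standard-library `statistics.NormalDist` rather than rounded table values, so intermediate figures differ slightly from the hand calculation.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(sigma: float, delta: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group sample size for a two-sided comparison of two means.

    Implements n = 2 * (Z_{alpha/2} + Z_beta)^2 * sigma^2 / delta^2,
    where delta is the difference in means to be detected.
    """
    if sigma <= 0 or delta == 0:
        raise ValueError("sigma must be positive and delta non-zero")
    std_normal = NormalDist()
    z_alpha_half = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = std_normal.inv_cdf(power)                # about 0.84 for power = 0.80
    n = 2 * (z_alpha_half + z_beta) ** 2 * sigma ** 2 / delta ** 2
    # Always round up: rounding down would leave the study underpowered.
    return ceil(n)


if __name__ == "__main__":
    # Example 1: sigma = 10, difference = 5, alpha = 0.05, power = 0.80
    print(n_per_group_two_means(sigma=10, delta=5))
```

With the inputs of Example 1 the function returns 63 per group, matching the hand calculation.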
Example 2: Let us consider a more complex example involving the comparison of the effectiveness of two
different medications, Medication A and Medication B, in reducing blood pressure for patients with hypertension.
A researcher wants to estimate whether there is a significant difference in the average reduction in blood pressure
between the two medications with a significance level of 0.01 and a statistical power of 0.90. Suppose he has
conducted a pilot study or reviewed previous research and found that the common standard deviation of blood
pressure reduction for both medications is 12 mmHg (𝜎 = 12). He wants to detect a difference of at least 8 mmHg
(𝜇1 − 𝜇2 = 8) in the average reduction in blood pressure between the two medications.
DOI: 10.35629/8028-1307152167 www.ijbmi.org 159 | Page
Solution: Using the formula for estimating sample size for comparing two population means:
$$ n = \frac{2\,(Z_{\alpha/2} + Z_{\beta})^{2}\,\sigma^{2}}{(\mu_1 - \mu_2)^{2}} $$
where:
$Z_{\alpha/2} = 2.576$ (corresponding to a significance level of 0.01)
$Z_{\beta} = 1.282$ (corresponding to a statistical power of 0.90)
$\sigma = 12$ (common standard deviation)
$\mu_1 - \mu_2 = 8$ (difference in means)
$$ n = \frac{2\,(2.576 + 1.282)^{2} \cdot 12^{2}}{8^{2}} = \frac{2 \times 14.884 \times 144}{64} $$
$$ n \approx 66.98 $$
Rounding up to the nearest whole number, the researcher would need a sample size of approximately 67
patients for each medication group to compare the effectiveness of Medication A and Medication B in reducing
blood pressure with a significance level of 0.01 and a statistical power of 0.90, assuming a common standard
deviation of 12 mmHg and a difference in means of 8 mmHg.
YAMANE’S FORMULA
Yamane’s formula is a straightforward method used for determining sample sizes in survey research, particularly
in social science studies where the population size is known or easily determinable. It is commonly used in
situations where researchers want to obtain a representative sample from a large population. The formula was
proposed by Taro Yamane in his book “Statistics: An Introductory Analysis” (1967).
$$ n = \frac{N}{1 + N e^{2}} $$
Where: 𝑛 = sample size, 𝑁 = population size, 𝑒 = margin of error (expressed as a proportion, usually between 0
and 1)
Yamane’s formula assumes a simple random sampling method and is based on the population size and the desired
margin of error for the sample estimate. The margin of error represents the acceptable amount of variability
between the sample estimate and the true population parameter.
Example: Suppose a researcher wants to conduct a survey on a university campus with a total student population
of 10,000 students. They aim to achieve a margin of error of 0.05 (5%).
Using Yamane’s formula:
$$ n = \frac{N}{1 + N e^{2}} $$
$$ n = \frac{10{,}000}{1 + 10{,}000 \times 0.05^{2}} $$
$$ n \approx 385 $$
So, according to Yamane’s formula, the researcher would need a sample size of approximately 385 students to
achieve a margin of error of 5% in their survey.
Example 2: A nonprofit organization is conducting a survey to assess the satisfaction levels of citizens regarding
the quality of healthcare services in a large city. The city has a total population of 500,000 residents. The
organization wants to ensure a representative sample with a margin of error of no more than 2%. They plan to use
Yamane’s formula to determine the required sample size for their survey.
Solution
Population size (𝑁 ) = 500,000; Margin of error (𝑒 ) = 0.02 (2%):
Using Yamane’s formula:
$$ n = \frac{N}{1 + N e^{2}} $$
$$ n = \frac{500{,}000}{1 + 500{,}000 \times 0.02^{2}} $$
$$ n \approx 2{,}487.56 $$
So, according to Yamane’s formula, the nonprofit organization would need a sample size of approximately 2,488
residents to achieve a margin of error of 2% in their survey. Because a sample must consist of a whole number of
respondents, and rounding down would fall just short of the target margin of error, the calculated value of
2,487.56 is rounded up to 2,488 participants.
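Both Yamane examples above can be checked with a few lines of code. This is a minimal sketch of the formula as stated in the text; the function name `yamane_sample_size` is hypothetical, and rounding up to the next whole respondent is a convention adopted here.

```python
from math import ceil


def yamane_sample_size(population: int, margin_of_error: float) -> int:
    """Sample size from Yamane's formula n = N / (1 + N * e^2), rounded up."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a proportion between 0 and 1")
    n = population / (1 + population * margin_of_error ** 2)
    # Round up so the achieved margin of error is no worse than requested.
    return ceil(n)


if __name__ == "__main__":
    print(yamane_sample_size(10_000, 0.05))   # university survey example
    print(yamane_sample_size(500_000, 0.02))  # city healthcare survey example
```

For the two examples the function returns 385 and 2,488 respectively. Note how weakly the result depends on the population size once it is large: the tighter margin of error, not the larger population, drives most of the increase.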
The adjusted Yamane formula incorporates the population proportion (π), whose variance is π(1 − π), and the
z-score (z) for a given significance level (α). It is designed to increase accuracy, especially for dichotomous
variables. The modified formula is:
$$ n = \frac{N \cdot z^{2} \cdot \pi(1 - \pi)}{N \cdot e^{2} + z^{2} \cdot \pi(1 - \pi)} $$
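The adjusted formula can be sketched in code as follows. This is an illustration under stated assumptions: the function name `adjusted_yamane_sample_size` and its defaults (a proportion of 0.5 and 95% confidence) are choices made here, not values prescribed by the article, and the z-score is taken from Python's `statistics.NormalDist`.

```python
from math import ceil
from statistics import NormalDist


def adjusted_yamane_sample_size(population: int, margin_of_error: float,
                                proportion: float = 0.5,
                                confidence: float = 0.95) -> int:
    """n = N z^2 pi(1 - pi) / (N e^2 + z^2 pi(1 - pi)), rounded up."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 < proportion < 1:
        raise ValueError("proportion must lie strictly between 0 and 1")
    # Two-sided z-score for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance = proportion * (1 - proportion)  # variance of a dichotomous variable
    n = (population * z ** 2 * variance) / (
        population * margin_of_error ** 2 + z ** 2 * variance
    )
    return ceil(n)


if __name__ == "__main__":
    # Same campus survey as the basic Yamane example: N = 10,000, e = 0.05
    print(adjusted_yamane_sample_size(10_000, 0.05))
```

For N = 10,000 and e = 0.05 this gives 370, slightly below the 385 from the basic formula. Substituting z = 2 and π = 0.5 into the adjusted formula reduces it algebraically to N / (1 + N e²), which suggests the basic formula can be read as the conservative special case with maximum variability and z rounded up to 2.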
PRACTICAL APPLICATION
Ease of Use: Researchers can simply look up the population size in the table and find the corresponding sample
size without performing complex calculations.
Accuracy: The table ensures that the sample size is sufficient to make reliable inferences about the population.
Standardization: It provides a standardized method to determine sample size, promoting consistency across
different studies.
Example: Imagine you are conducting a survey to understand the job satisfaction levels of employees at a large
corporation. The total number of employees at the corporation (population size, 𝑁 ) is 1,200. Using Krejcie and
Morgan’s Table:
Population Size (N): Find the row in the table that corresponds to a population size of 1,200.
Sample Size (S): Look at the recommended sample size for a population of 1,200.
According to Krejcie and Morgan’s Table, for a population size of 1,200, the recommended sample size (S) is
291. This is the number of employees you need to survey to estimate a population proportion with a 95%
confidence level and a margin of error of ±5%. The table's values follow from the formula below:
$$ S = \frac{X^{2} \cdot N \cdot P \cdot (1 - P)}{d^{2} \cdot (N - 1) + X^{2} \cdot P \cdot (1 - P)} $$
Parameters: $N = 1{,}200$; $X^{2} \approx 3.841$ (for 95% confidence level); $P = 0.5$ (population proportion, for maximum
variability); $d = 0.05$ (degree of accuracy, or margin of error)
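Substituting these parameters into the formula reproduces the table entry. The sketch below is a direct transcription of the formula with the parameter values listed above as defaults; the function name `krejcie_morgan_sample_size` is hypothetical, and the function returns the unrounded value so the rounding step stays visible.

```python
def krejcie_morgan_sample_size(population: int,
                               chi_square: float = 3.841,
                               proportion: float = 0.5,
                               accuracy: float = 0.05) -> float:
    """Unrounded sample size S from the Krejcie and Morgan formula.

    S = X^2 N P (1 - P) / (d^2 (N - 1) + X^2 P (1 - P))
    """
    if population <= 1:
        raise ValueError("population must be greater than 1")
    spread = chi_square * proportion * (1 - proportion)
    numerator = spread * population
    denominator = accuracy ** 2 * (population - 1) + spread
    return numerator / denominator


if __name__ == "__main__":
    s = krejcie_morgan_sample_size(1_200)
    # Rounded to the nearest whole number for comparison with the table entry.
    print(round(s, 2), round(s))
```

For N = 1,200 the formula gives roughly 291.15, which rounds to the 291 cited from the table. A researcher who prefers to round up for safety would survey 292.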
REFERENCES
[1]. Babbie, E. (2016). The Practice of Social Research (14th ed.). Belmont, CA: Wadsworth.
[2]. Babbie, E. R. (2020). The practice of social research. Cengage AU.
[3]. Bartlett, J. E., Kotrlik, J. W., & Higgins, C. C. (2001). Organizational research: Determining appropriate sample size in survey
research. Information Technology, Learning, and Performance Journal, 19(1), 43-50.
[4]. Brewer, J., & Hunter, A. (1989). Multimethod research: A synthesis of style. Newbury Park, CA: Sage.
[5]. Bryman, A. (2016). Social research methods. Oxford university press.
[6]. Cochran, W. G. (1977). Sampling Techniques (3rd ed.). New York: John Wiley & Sons.
[7]. Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Second Edition. Hillsdale, NJ: Lawrence Erlbaum Associates,
Publishers.
[8]. Cohen, J. (1992). Quantitative methods in psychology: A power primer. Psychological Bulletin, 112(1), 155-159.
[9]. Cohen, J. (1992). Statistical power analysis. Current Directions in Psychological Science, 1(3), 98-101.
[10]. Creswell, J. W. (2002). Educational research: Planning, conducting, and evaluating quantitative and qualitative research. Upper Saddle
River, NJ: Pearson Education.
[11]. Creswell, J. W. (2013). Qualitative Inquiry and Research Design: Choosing Among Five Approaches (3rd ed.). Thousand Oaks, CA:
SAGE Publications.
[12]. Creswell, J. W., & Clark, V. L. P. (2017). Designing and conducting mixed methods research. Sage publications.
[13]. Cumming, G., & Finch, S. (2005). Inference by eye: Confidence intervals and how to read pictures of data. American Psychologist,
60(2), 170-180. DOI: 10.1037/0003-066X.60.2.170
[14]. Ellis, P. D. (2010). The Essential Guide to Effect Sizes: Statistical Power, Meta-Analysis, and the Interpretation of Research Results.
Cambridge University Press.
[15]. Etikan, I., Musa, S. A., & Alkassim, R. S. (2016). Comparison of convenience sampling and purposive sampling. American Journal
of Theoretical and Applied Statistics, 5(1), 1-4. DOI: 10.11648/j.ajtas.20160501.11
[16]. Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and
regression analyses. Behavior Research Methods, 41(4), 1149-1160.
[17]. Fowler, F. J. (2013). Survey Research Methods (5th ed.). Thousand Oaks, CA: SAGE Publications.
[18]. Funder, D. C., & Ozer, D. J. (2019). Evaluating effect size in psychological research: Sense and nonsense. Advances in Methods and
Practices in Psychological Science, 2(2), 156-168. DOI: 10.1177/2515245919847202
[19]. Gardner, M. J., & Altman, D. G. (1986). Confidence intervals rather than P values: Estimation rather than hypothesis testing. British
Medical Journal (Clinical Research Ed.), 292(6522), 746-750. DOI: 10.1136/bmj.292.6522.746
[20]. Green, S. B. (1991). How many subjects does it take to do a regression analysis? Multivariate Behavioral Research, 26(3), 499-510.
https://doi.org/10.1207/s15327906mbr2603_7