A Basic Approach in Sampling Methodology and Sample Size Calculation
Review Article
Abstract
A major concept in clinical and experimental studies is the selection of subjects or units for the conduct of studies. This is essential in order to make acceptable inferences that can be generalized to the population. Failure to select a sample correctly from the population could result in errors that produce misleading conclusions. This review therefore introduces basic concepts in sampling methodology and the various methods used in sample selection. Finally, a method for calculating the sample size for a survey is discussed.
Keywords: Population; Sample; Survey; Sample size; Sampling; Sampling error; Randomization
© 2019 - Medtext Publications. All Rights Reserved. 050 2019 | Volume 1 | Article 1006
MedLife Clinics
Population size: Obtaining information from all subjects or items of interest is often cumbersome, especially when the population is very large. Hence, drawing a sample from such a population is considered the best alternative.

Sampling frame
It is a list of all population elements from which a sample can be drawn. A list of older people in an aged support home and a register of deaths or live births are examples of sampling frames.

Sampling techniques
The method through which a sample is selected from the population is called sampling. Sampling techniques play a critical role in clinical and other experimental studies [6]. There are two broad divisions of sampling techniques: probability sampling and non-probability sampling [7,8].

Probability sampling
This is a sampling method in which every unit or subject in the population has a known, non-zero chance of being selected; in the simplest designs this chance is equal for all units. Sample selection is done randomly [9]. Probability sampling is considered the best form of sampling because of the representativeness of the sampling units [10,11]. Examples of probability sampling techniques include simple random sampling, systematic sampling, cluster sampling, and stratified sampling (Figure 1).

Figure 1: Probability Sampling.

Simple random sampling (SRS): This is the simplest form of probability sampling. When selecting a sample, each observational unit/subject has an equal chance of being selected. This method is most suitable when the population is homogenous and fairly small. SRS can be done through a lottery system, or by allotting numbers to the population units and then selecting the units whose numbers match random numbers taken from a random number table or a computer random number generator [12]. A simple description of SRS is given below:

Suppose a hospital has 200 Nurses and the management board is interested in sponsoring 20 of them to an international conference.

• First Step: A name list of all the 200 Nurses needs to be made. The whole 200 Nurses is the population (N).

• Second Step: A sequence number is assigned to each of the 200 Nurses (say 1, 2, 3, ..., 200). This list is called the sampling frame.

• Third Step: Having confirmed that 20 Nurses are needed, i.e. the sample size is n=20, a random number generator is used to produce 20 distinct random numbers between 1 and 200. The Nurses whose assigned numbers from the second step match the 20 generated random numbers become the selected sample.

The merit of this method is that selection bias is almost non-existent and the researcher need not have prior knowledge of the subjects/units. It is also quite basic and requires no advanced quantitative computation. It is quite easy to obtain a sample of the required size from it.

A major demerit of this method is that it could be strenuous, with high financial implications, especially if the population is very large. This method can only be carried out provided that there is a sampling frame. Without a sampling frame, it is practically impossible. It is a fair method of selection; however, the selection could be deemed unfit if the eventually selected subjects or units do not meet the envisaged objectives. An example is the random selection of Nurses who do not have the requisite knowledge that ought to be a base requirement for entry into an advanced training course, leaving those qualified behind in the pool.

Stratified sampling: If a population consists of various distinct groups, it can be divided into sub-categories called strata, after which a random sample is selected from each stratum (the singular form of strata). Thus, each stratum is an independent sub-population of the general population, consisting of a unique or homogenous group classification. Each element in a stratum thus has an equal chance of being selected. As an example, Medical officers in a hospital can be divided into categories by years of experience, sex, race, department or type of institution graduated from (private or public medical school).

In terms of stratification by years of experience, those having 0 to 3 years of experience can be the first category (first stratum), those with 4 to 7 years of experience the second stratum, and those with more than 7 years the third category (third stratum). The strata thus become more homogenous with respect to years of experience. A random sample of medical officers is then selected from each of the three strata formed.

The advantage of stratified sampling is that it ensures a fair and even selection of units that vary across different groups or classifications, thus reducing sampling bias. However, the drawback of this method is that it requires a sampling frame for each stratum, and choosing the criteria for stratification can be difficult.

Systematic sampling: This is a probability sampling technique in which every k'th element in a population list is selected sequentially [12]. To mitigate bias, it is advised that the first element be selected at random while subsequent selections follow the every-k'th-element ordering. Given that an NGO volunteer register consists of 2000 individuals and 100 participants are to be selected for a program, the sampling interval is k = 2000/100 = 20. The first person is selected at random from the first 20 positions (say, the individual in the 9th position on the list), and then every 20th individual after the first is selected until the 100 participants are complete. In this example, after the 9th individual is selected, the second selection is the individual in the 29th position, followed by the individual in the 49th position, and so on until the 100-participant limit is reached.
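The three selection procedures described so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the fixed seed, and the stratum sizes are hypothetical and not taken from the article, and the systematic interval is taken as k = N/n (the usual convention, which gives 20 for a list of 2000 and a sample of 100) so that the selection spans the whole list.

```python
import random


def simple_random_sample(frame, n, rng):
    """Draw n distinct units; every unit in the frame has the same chance."""
    return rng.sample(frame, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of the same size from each stratum."""
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}


def systematic_sample(frame, n, rng):
    """Random start within the first interval, then every k-th unit."""
    k = len(frame) // n          # sampling interval k = N / n
    start = rng.randrange(k)     # zero-based random start in the first k units
    return [frame[start + i * k] for i in range(n)]


rng = random.Random(2019)  # fixed seed (an assumption) so the draw is repeatable

# Simple random sampling: 20 of the 200 numbered Nurses.
nurses = list(range(1, 201))
chosen_nurses = simple_random_sample(nurses, 20, rng)

# Stratified sampling: hypothetical strata by years of experience.
officers = {
    "0-3 years": [f"A{i}" for i in range(1, 41)],
    "4-7 years": [f"B{i}" for i in range(1, 31)],
    "above 7 years": [f"C{i}" for i in range(1, 21)],
}
chosen_officers = stratified_sample(officers, 5, rng)

# Systematic sampling: 100 of 2000 volunteers, interval k = 20.
volunteers = list(range(1, 2001))
chosen_volunteers = systematic_sample(volunteers, 100, rng)

print(sorted(chosen_nurses))
print(chosen_officers)
print(chosen_volunteers[:3])
```

Drawing the same number of officers from every stratum is a simplification; allocating the sample in proportion to stratum size is an equally common choice.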
The advantage of systematic sampling is that it is simple and the selection procedure is easy. Sample selection is evenly distributed over the population. The disadvantage is that if there is a hidden pattern in the list, there could be a problem of overrepresentation and the sample selection can be compromised. An example of such a hidden pattern could be the admission of nursing students into an academic institution. If the school policies favor admission of students from the locality before considering outsiders, the school register could be filled with local students ahead of the others on the list. Using systematic sampling on such a register for a research study on student perception of the school admission policy could yield biased and compromised results.

Cluster sampling: This is one of the most popular sampling techniques used in epidemiological research. It is often confused with stratified sampling, but there is a clear difference between the two methods. In cluster sampling, the population is divided into groups, and these groups are called clusters. Then, some clusters are randomly selected. Every unit or subject found in the randomly selected clusters is included in the sample. In stratified sampling, by contrast, elements are randomly selected from each of the strata to form the required sample.

Example: Suppose a town is selected for an immunization intervention program for children aged 5 years and below, and the town is made up of ten (10) districts. The ten (10) districts translate to ten (10) clusters. If three (3) districts are randomly selected out of the ten (10) districts, the selected three (3) districts are the clusters to be considered for the program, and all children aged 5 years and below in these three (3) selected districts are included in the sample for the intervention program.

Cluster sampling becomes more effective if the constituents of each cluster are heterogeneous in nature, unlike in stratified sampling, in which the strata are homogenous. A major advantage of cluster sampling is cost-effectiveness and timeliness, but its drawbacks include a high margin of sampling error.

Non-probability sampling
This is a sampling technique in which not every object or experimental unit has a known or equal chance of being selected [13]. As a result of the skewed selection of subjects/units, such samples are often highly susceptible to bias and other forms of selection error. This sampling technique is based on researcher preference and discretion. Examples of non-probability techniques are quota sampling, purposive sampling, convenience sampling, and snowball sampling (Figure 2).

Figure 2: Non-Probability Sampling.

Quota sampling: This method is widely used in polls and is considered the non-probability version of stratified sampling. The population is divided into distinct groups of units/objects and the researcher makes selections from each of the groups based on their own selection criteria, proportion or preference. Suppose a researcher is expected to obtain the opinion of five hundred (500) women on the use of contraception relative to socio-economic level. The researcher can decide to go to a low-income community and select 250 women based on personal judgment, and then repeat the exercise in a high-income neighborhood by selecting another 250 women.

Quota sampling is useful where there is a time limitation, a low budget, and no sampling frame available. However, it is very prone to selection bias since the subjects do not all have an equal chance of being chosen.

Purposive sampling: This is also called judgmental or subjective sampling. It is a method of non-probability sampling based on the knowledge and understanding of the researcher in selecting the needed sample from a population for a study. The researcher reaches out to participants who are judged able to meet the objective of the study or who know the subject the study is investigating. Individuals with no knowledge of the study objectives are not selected. This method is commonly used in qualitative research studies and focus group discussions in which experts in the subject of interest are selected based on their experience and knowledge.

The advantages of this sampling technique are that it is time- and cost-effective and that it makes it easier to narrow down to subjects of interest. The disadvantages are that the researcher is prone to making subjective judgments, which increases the bias in sample selection. It is not always reliable, and the results of the study might not be generalizable to the population.

Convenience sampling: This is sample selection based on the accessibility of respondents within reach. It is also called accidental sampling. Subjects or units are selected merely because they can easily be found and the researcher has regular access to them. Examples of this sampling method include surveying friends, neighbors or family members, going around street corners to ask volunteers for their opinions, online polls and so on. Convenience sampling is often used in a pilot study in order to gain insight before a full-fledged research activity takes place. It is a very simple and easy sampling technique. When used in a pilot study, it enables researchers to reframe research questions that lack clarity and to add other insightful questions in a bid to generate effective responses. It is time-efficient and requires little financial expenditure. However, it is risky because the respondents who actually fit the research objective might not be the ones eventually selected for the study. Hence, results from such studies can be seriously flawed and can deviate from the research objectives.

Snowball sampling: This sampling method is commonly used in culturally sensitive studies and in situations where experimental subjects/respondents are rare or very hard to reach. It is based on a referral mechanism in which identified subjects help to reach out to other unknown subjects. Examples of its use include public health studies such as research on the consumption of illicit drugs, individuals living with HIV/AIDS, or victims of sexual abuse.
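The referral mechanism behind snowball sampling can be sketched as a wave-by-wave walk over a referral list. This is a minimal illustration, assuming a hypothetical `referrals` mapping (who each recruited participant is able to refer) and a hypothetical function name; none of the identifiers or data come from the article.

```python
from collections import deque


def snowball_sample(referrals, seeds, target_size):
    """Recruit participants wave by wave by following referrals from the seeds."""
    sample = []
    seen = set(seeds)          # everyone already recruited or queued
    queue = deque(seeds)       # participants waiting to be interviewed
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        # Each recruited participant refers contacts not yet known to the study.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sample


# Hypothetical referral data: keys are participants, values are their referrals.
referrals = {
    "P1": ["P2", "P3"],
    "P2": ["P4"],
    "P3": ["P4", "P5"],
    "P5": ["P6"],
}

recruited = snowball_sample(referrals, seeds=["P1"], target_size=5)
print(recruited)  # ['P1', 'P2', 'P3', 'P4', 'P5']
```

Everyone recruited is reachable from the initial seed, so people outside the seed's network can never enter the sample.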
Snowball sampling lowers the cost of selecting a study's sample but could be plagued with a lot of bias, as the selected sample may not be truly representative of the cross-section of participants required.

Sampling and Non-Sampling Errors
Errors need to be taken into account when conducting research studies. This is especially essential when the findings of such studies are to be applied to the general population. There are two major types of errors in a study: sampling and non-sampling errors.

Sampling error
The error that occurs due to the use of a sample rather than the whole population is called sampling error. This error is a deviation of the sample estimate from the actual measure, trait or attribute of the true entity. The major cause of sampling error is that subjects are drawn as a sample from the population and these subjects have their own inherent individual variability. Aside from this, the use of a defective or wrong sampling method could result in this type of error. Two major factors that influence the extent of sampling error are the sample size and the sampling technique. In quantitative studies, confidence limits, standard errors, p-values and coefficients of variation are used as measures of sampling error. Because of sampling error, researchers are always advised to use the sampling technique that best fits the study objectives and to use a sufficient sample size in order to reduce the magnitude of sampling error. Likewise, randomization during sampling is encouraged.

Non-sampling error
All other forms of error that occur apart from sampling error are called non-sampling errors. These errors could occur as a result of several factors stemming from the instrument used in data collection, the subjects/respondents, or the individuals who collect, collate, sort, analyze and present the data output. In terms of the instrument, it might not be well specified. If the instrument is a questionnaire, it might not contain appropriate or clear questions that align with the research objectives. The questions and instructions might also be difficult for respondents to understand, thus leading to inappropriate responses.

Errors from respondents could be a result of mood/feeling, non-response, deliberate answer omissions, false responses, sentiments, fatigue and so on. Errors can also occur when research staff collate or sort data. There could be wrong data entries during analysis and use of the wrong statistical or mathematical methods. The interpretation of results could be wrongly typed or inappropriately organized (Figure 3).

Sample size calculation
Many studies are conducted without taking cognizance of the effect of sample size. Most researchers/investigators with no quantitative background find this a major barrier in their research studies. After setting a research objective and identifying the appropriate respondents, a major step to consider is obtaining an appropriate sample size for the study.

Obtaining an appropriate and adequate sample of randomly selected respondents helps reduce sampling error or bias in research. The adequacy of the sample size is not just about its proportion to the population; it also takes into consideration the selected sample in relation to the diversity existing in the population, the objectives of the investigators, and the statistical modeling techniques to be employed [14]. Although it is established that a larger sample size that nears the population size diminishes sampling error and increases the validity of the results [15], seeking more samples than an appropriately computed size could overstrain the resources of investigators [16].

There are several methods for calculating sample size, depending on the objectives of the study and the study design. However, a common formula for calculating the sample size in survey studies from a finite (countable) population is given below.

$$ n = \frac{N \cdot X}{X + N - 1} $$

where

$$ X = \frac{Z_{\alpha/2}^{2} \cdot p(1 - p)}{MOE^{2}} $$

n = Sample size; p = Proportion of sample; MOE = Margin of error; N = Population size

Z_(α/2) = the critical value of the normal distribution at α/2 (for a confidence level of 95%, α is 0.05 and the critical value is 1.96). This reflects the required level of confidence for the estimate.
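The finite-population formula above can be checked numerically. The sketch below is a minimal illustration: the function name and default arguments are assumptions, the input values (N = 3510, p = 0.5, MOE = 0.05, critical value 1.96) are the ones used in the article's worked solution, and rounding the result up to a whole respondent is a common convention rather than something the article states.

```python
import math


def survey_sample_size(population_size, proportion=0.5,
                       margin_of_error=0.05, z=1.96):
    """Return (X, n) for the finite-population survey sample size formula.

    X = z^2 * p * (1 - p) / MOE^2
    n = N * X / (X + N - 1)
    """
    x = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    n = population_size * x / (x + population_size - 1)
    return x, n


x, n = survey_sample_size(3510)
print(round(x, 2))    # 384.16
print(round(n, 2))    # 346.35
print(math.ceil(n))   # 347 respondents after rounding up
```

Using p = 0.5 maximizes p(1 - p), so it gives the most conservative (largest) sample size when the true proportion is unknown.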
effectiveness.

Solution:
Thus, N = 3510, p = 0.5, MOE = 0.05, and Z_(α/2) = 1.96

$$ n = \frac{N \cdot X}{X + N - 1}, \qquad X = \frac{Z_{\alpha/2}^{2} \cdot p(1 - p)}{MOE^{2}} $$

$$ X = \frac{1.96^{2} \cdot 0.5(1 - 0.5)}{0.05^{2}} = 384.16 $$

$$ n = \frac{3510 \cdot 384.16}{384.16 + 3510 - 1} \approx 346.35 $$

study.

Conclusion
In conclusion, the importance of sampling in carrying out research cannot be overemphasized. Identifying the most suitable sampling method is important, as it affects the level of error recorded. A large sample size that nears the population size can provide a better estimate that can be generalized to the population. To reduce the risk of non-sampling errors, researchers are urged to take into account measures that can effectively ensure valid studies, from tested and reviewed data collection instruments to the use of qualified enumerators and other workers involved in data analysis and the preparation of results.

References
6. Suresh K, Thomas SV, Suresh G. Design, data analysis and sampling techniques for clinical research. Ann Ind Acad Neurol. 2011;14(4):287-90.
7. Elfil M, Negida A. Sampling methods in Clinical Research; an Educational Review. Emerg (Tehran). 2017;5(1):e52.
8. Shorten A, Moorley C. Selecting the sample. Evid Based Nurs. 2014;17(2):32-3.
9. Fowler FJ. Sampling. Survey Research Methods. 4th ed. Thousand Oaks: Sage Publications; 2009. p. 19-47.
10. Thompson SK. Sampling. 3rd ed. Hoboken NJ: John Wiley & Sons Inc; 2012.
11. Curtin R, Presser S, Singer E. Changes in telephone survey non response over the past quarter century. Public Opinion Quarterly. 2005;69(1):87-98.
12. Babbie E. The Logic of Sampling. The Practice of Social Research. 10th ed. Belmont: Wadsworth/Thomson Learning; 2004. p. 178-217.
13. Etikan I, SulaimanAbubakar M, Rukayya Sunusi A. Comparison of Convenience Sampling and Purposive Sampling. Am J Theor Appl Stat. 2016;5(1):1-4.
17. Bartlett JE, Kotrlik JW, Higgins CC. Organizational Research: Determining Appropriate Sample Size in Survey Research. Inf Technol Learn Perform J. 2001;19:43-50.