Sample Size

Uploaded by

smart9aparna
Sample Size & Sampling Method
• Cooking rice example
Why sampling?
• To know the prevalence of certain diseases or health conditions in a particular area
• Done by studying a small part of the population
• No need to study the whole population
– E.g. to know the prevalence rate of anaemia, HT, DM, obesity etc.
– Various rates like accident rate, death rate, birth rate etc.
Identifying such problems in different set-ups
• In India, Gujarat, Rajkot district, Rajkot city, a particular taluka or village etc.
• Rural/urban area
• In a certain religion – Hindu, Muslim, Sikh, Christian etc.
• Various age groups: 0-5 years, 15-45 years, >60 years
• Gender based: Male/Female
• Various groups: students, teachers, doctors, bank employees, pregnant women, elderly etc.
Definition
• A sample is a finite number of people, objects or items statistically selected from the population
• Sampling is the procedure by which some members of a given population are selected as representatives of the entire population
Definitions
The class or the families living in the city from which you select your sample are called the population or study population, and are usually denoted by the letter N.

The small group of students or families from whom you collect the required information to estimate the average age of the class or average income is called the sample

The number of students or families from whom you obtain the required information is called the sample size and is usually denoted by the letter n
Definitions (Cont.)
The way you select students or families is called the sampling design or sampling method

Each student or family that becomes the basis for selecting your sample is called the sampling unit or sampling element

The smallest units of a population (individuals in a survey) are called sampling units

A list identifying each student or family in the study population is called the sampling frame
Sampling frame
• A list defining the population from which
the sample will be drawn
• Examples:
– Telephone book
– Voter list

• Essential for probability sampling


The main objectives of sampling

• Estimation of population parameters (mean, proportion, etc.) from the sample statistics
• To test the hypothesis about the population from which the sample or samples are drawn
• To compare with known facts/indices/data

Inference drawn from a sample applies to the defined population (universe) from which the sample/samples are drawn and not to other populations
Why Sample Size Calculation?

Large sample:
• Cost, time and personnel are wasted
• Unethical

Small sample:
• Unable to detect clinically important results

• The sample should be neither very small nor very large
• As a rule of thumb, a sample of size greater than 30 is considered large enough for statistical purposes
Limitation
• Must be done by qualified and experienced
persons

• There is the possibility of sampling errors


Sampling Techniques
• Probability Sampling:
1. Simple Random Sampling
2. Systematic Random Sampling
3. Stratified Random Sampling
4. Cluster Sampling
5. Multi-stage Sampling
6. Multi-phase Sampling
• Non-probability Sampling:
1. Purposive Sampling
2. Snowball Sampling
3. Quota Sampling
4. Convenience Sampling
Probability sampling
• Each member of the population has a known non-zero probability of being selected.
• Probability of sampling error can be calculated.
Non-probability sampling
• Members are selected from the population in some non-random manner.
• Probability of sampling error cannot be calculated.
1. Simple random sampling
• It is applicable when the population is small,
homogeneous and readily available
• Principle here is that every unit of the population
has an equal chance for selection
• The sample may be drawn unit by unit, either by
numbering or from published tables
• To ensure random selection one may adopt either
lottery method or table of random number
Lottery method
• Suppose 10 patients are to be put on a drug trial out of 100 patients.
o Note the serial numbers of the patients on 100 cards and shuffle them well
o Draw out one card and note the number
o Replace the card drawn, reshuffle and draw the 2nd card
o Reject any card that is drawn for the 2nd time
o Repeat the process till 10 different numbers are drawn
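The card-drawing steps above can be sketched in Python. This is only an illustration: the function name `lottery_draw` and the `seed` parameter are assumptions made for the example, and a random integer stands in for shuffling and drawing a card.

```python
import random

def lottery_draw(population_size, sample_size, seed=None):
    """Draw cards one at a time with replacement, rejecting repeated numbers."""
    rng = random.Random(seed)          # seed is optional, used only for repeatability
    chosen = []
    while len(chosen) < sample_size:
        card = rng.randint(1, population_size)  # shuffle and draw one card
        if card not in chosen:                  # reject a card drawn a 2nd time
            chosen.append(card)
    return chosen

patients = lottery_draw(100, 10, seed=1)
print(patients)
```

Because repeated cards are rejected, the result is equivalent to drawing 10 cards without replacement.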
Table of random number
• Published tables of random numbers are used.
• To draw a sample of 10 out of 100:
– First give a serial number to each of the 100 patients
– The total number of patients is 100, a 3-digit number, so 3-digit random numbers are read from the table
– Example of a random number table:
369 495
428 572
565 169
969 786
385 094
– Numbers less than 100 may be chosen as they are
– Numbers higher than 100 are divided by 100 and the remainder is taken as the sample unit, e.g. 369 = 3 x 100 + 69, so the remainder 69 is our sample unit
• Reading down the first column and then the second, the chosen numbers in the above table are: 69, 28, 65, 69, 85, 95, 72, 69, 86 and 94
• 69 is repeated 3 times, so two more 3-digit numbers from the subsequent rows, i.e. 441 and 811, are taken, giving 41 and 11 as sample units
• Final 10 sample units are: 69, 28, 65, 85, 95, 72, 86, 94, 41 and 11
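The remainder rule can be checked with a short Python sketch using the numbers from the example table, read down the first column and then the second, followed by 441 and 811. The function name `units_from_table` is hypothetical, and mapping a remainder of 0 to serial number 100 is an assumption, since the slides do not cover that case.

```python
def units_from_table(numbers, population_size, sample_size):
    """Map random-table numbers to serial numbers by remainder, skipping repeats."""
    chosen = []
    for number in numbers:
        unit = number % population_size      # remainder after dividing by 100
        if unit == 0:
            unit = population_size           # assumption: remainder 0 means patient 100
        if unit not in chosen:               # a repeated unit is skipped
            chosen.append(unit)
        if len(chosen) == sample_size:
            break
    return chosen

table = [369, 428, 565, 969, 385, 495, 572, 169, 786, 94, 441, 811]
sample_units = units_from_table(table, 100, 10)
print(sample_units)  # [69, 28, 65, 85, 95, 72, 86, 94, 41, 11]
```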

Merits of simple random sampling
1) Scientific method
2) More representative
3) More economical
2. Systematic sampling
• This method is used when a complete list of the population is available.
• It is often applied to field studies when the population is large, scattered and not homogeneous.
• A systematic procedure is followed: every Kth unit is chosen, where K is the sampling interval

$$K = \frac{\text{Total population}}{\text{Desired sample size}}$$

• E.g. for a population of 1000 and a desired sample size of 100:

$$K = \frac{1000}{100} = 10$$

So, every 10th subject will be taken
• One random number is chosen from 10 cards serially numbered from 1 to 10
• Suppose it is 7; then the sample units are 7, 7 + 10 = 17, 17 + 10 = 27, 37, 47, ..., 987, 997
• So, every Kth number is chosen
• E.g.
– Every 6th patient coming to OPD
– Every 3rd roll-numbered student
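A minimal Python sketch of the same selection, assuming serial numbers 1 to 1000 and a population size that is an exact multiple of the sample size. The function name `systematic_sample` and its parameters are illustrative only.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Pick every Kth serial number after a random start between 1 and K."""
    k = population_size // sample_size               # sampling interval K
    if start is None:
        start = random.Random(seed).randint(1, k)    # random card from 1..K
    return list(range(start, population_size + 1, k))[:sample_size]

sample = systematic_sample(1000, 100, start=7)
print(sample[:5], sample[-2:])  # [7, 17, 27, 37, 47] [987, 997]
```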
Merits of systematic random sampling
1) Procedure is simple
2) Time and labour involved in the selection/collection of the sample is relatively small
3) If the population is sufficiently large, it is the preferred sampling method
3. Stratified sampling
• This method is used when the population is composed of diverse segments (not homogeneous).
• The population under study is first divided into homogeneous groups called ‘strata’
E.g. Workers: Skilled/unskilled/clerical/non-clerical
Religion wise: Hindu/Muslim/Christian/Sikh/Buddhist
Age wise: <5 years/5-14 years/15-49 years/≥50 years
• A sample is drawn from each stratum by the simple or systematic random method, in proportion to the stratum's size

• Merits:
1. More representative sample of population can
be studied
2. It gives greater accuracy among various strata
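Drawing from each stratum in proportion to its size can be sketched as follows. The stratum sizes (500, 300 and 200 workers) are invented for illustration, and the function names are hypothetical; leftover units after rounding down are given to the strata with the largest fractional parts, which is one common convention rather than something the slides specify.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Split the sample across strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    exact = {s: sample_size * n / total for s, n in strata_sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}          # round down first
    shortfall = sample_size - sum(alloc.values())
    # Hand the remaining units to the strata with the largest fractional parts
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:shortfall]:
        alloc[s] += 1
    return alloc

def stratified_sample(strata, sample_size, seed=None):
    """Simple random sample within each stratum, sized proportionally."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    alloc = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(units, alloc[name]) for name, units in strata.items()}

# Hypothetical worker population of 1000, identified by serial number
workers = {
    "skilled": list(range(1, 501)),
    "unskilled": list(range(501, 801)),
    "clerical": list(range(801, 1001)),
}
counts = {name: len(units) for name, units in stratified_sample(workers, 100, seed=0).items()}
print(counts)  # {'skilled': 50, 'unskilled': 30, 'clerical': 20}
```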
4. Multistage sampling
• Sampling procedure carried out in several
stages using random sampling techniques
• This is employed in large country surveys
• E.g. In a survey of villages in India:
– In 1st stage, six states are randomly selected
– In 2nd stage, four districts are randomly selected from each state
– In 3rd stage, two talukas from each district are randomly selected
– In 4th stage, two villages from each taluka are randomly selected
– So, 6 x 4 x 2 x 2 = 96 villages are selected for the survey
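The four stages can be sketched as nested random draws. The frame sizes used here (28 states, 20 districts per state, 10 talukas per district, 50 villages per taluka) are made-up placeholders, not real counts, and the function name is hypothetical.

```python
import random

def multistage_sample(n_states=6, n_districts=4, n_talukas=2, n_villages=2, seed=None):
    """Select units stage by stage; each stage samples only within the units already chosen."""
    rng = random.Random(seed)
    villages = []
    for state in rng.sample(range(1, 29), n_states):                   # 1st stage
        for district in rng.sample(range(1, 21), n_districts):         # 2nd stage
            for taluka in rng.sample(range(1, 11), n_talukas):         # 3rd stage
                for village in rng.sample(range(1, 51), n_villages):   # 4th stage
                    villages.append((state, district, taluka, village))
    return villages

selected = multistage_sample(seed=42)
print(len(selected))  # 96
```

Only the selected states, districts and talukas need to be listed in detail, which is why a sampling frame of the entire population is not required.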
• Merits:
1. The sample is spread over entire population
2. Sampling frame of entire population is not
required
3. Every unit has equal chance of selection
4. Save time & cost

• Demerits:
1. Sampling error is high compared to SRS
5. Cluster sampling
• A cluster is randomly selected group
• This method is used when units of population
are natural groups or clusters such as villages,
wards, blocks, slums of town, factories etc.
• From entire population, usually 30 clusters
are surveyed.
• Identification of clusters for data collection:
– List all cities/towns/villages/wards/slum areas with their populations
– Calculate the cumulative population and divide the total by 30. This gives the sampling interval
– Select a random number less than or equal to the sampling interval. The area whose cumulative population contains this number forms the 1st cluster
– Random number plus sampling interval locates the 2nd cluster
– Number for the 2nd cluster + sampling interval locates the 3rd cluster, and so on
– Now, all houses in each cluster are numbered
– The first house for survey in each cluster should be selected randomly
Sr. No. | Name of slum area | Population | Cumulative population | Selected cluster
1 | AA | 1500 | 1500 |
2 | AB | 2000 | 3500 | 1st
3 | AC | 1000 | 4500 |
4 | AD | 3000 | 7500 | 2nd
5 | AE | 500 | 8000 | 3rd
6 | AF | 750 | 8750 |
7 | AG | 2250 | 11000 | 4th
8 | AH | 3750 | 14750 | 5th
9 | AI | 2000 | 16750 |
... | ... | ... | ... | ...
64 | CL | 1100 | 90000 |

o 90000/30 = 3000. So, 3000 is the cluster interval in a 30 cluster survey.
o Suppose the random number is 1993
o 1993 falls in the 2nd area, AB. It is selected as the 1st cluster for study.
o 1993 + 3000 = 4993. 4993 falls in the 4th area, AD, which will be the 2nd cluster
o 4993 + 3000 = 7993, which falls in AE (3rd cluster); 7993 + 3000 = 10993, which falls in AG (4th cluster); and so on
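The cumulative-population lookup can be reproduced with a short Python sketch. It uses only the first nine slum areas shown in the table, with the interval of 3000 taken from the full list of 64 areas; the function name `select_clusters` is an assumption for this example.

```python
from bisect import bisect_left
from itertools import accumulate

def select_clusters(populations, n_clusters, interval, start):
    """Return the index of the area containing each target number
    start, start + interval, start + 2 * interval, ..."""
    cumulative = list(accumulate(populations))
    targets = [start + i * interval for i in range(n_clusters)]
    # bisect_left finds the first area whose cumulative population reaches the target
    return [bisect_left(cumulative, t) for t in targets]

areas = ["AA", "AB", "AC", "AD", "AE", "AF", "AG", "AH", "AI"]
populations = [1500, 2000, 1000, 3000, 500, 750, 2250, 3750, 2000]
picked = [areas[i] for i in select_clusters(populations, 5, interval=3000, start=1993)]
print(picked)  # ['AB', 'AD', 'AE', 'AG', 'AH']
```

Areas with larger populations span more of the cumulative range, so they are more likely to contain a target number.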
• WHO survey to evaluate vaccination coverage (UIP): survey of 210 children from 30 clusters, taking 7 children in the age group 12-23 months from each cluster
6. Multiphase sampling
• In this method, part of the information is collected from the whole sample and part from a sub-sample.
• Example of tuberculosis survey:
– 1st phase: Clinical history from all subjects of the sample
– 2nd phase: Those who are positive for symptoms are screened by chest X-ray/Mantoux test, which is more expensive than the first step
– 3rd phase: Sputum examination, for confirmation of tuberculosis, for those who are positive for chest X-ray and clinical symptoms
• So, those who need sputum examination will be smaller in number
Non-probability Sampling

• It is the sampling procedure which does not have any basis for estimating the probability that each subject has been included in the sample
• In this type of sampling, subjects are selected deliberately by the researcher
• Researcher's choice is supreme
• In this design, the personal element has a great chance of entering into the selection of the sample
• The investigator may select a sample which yields results favourable to his/her point of view
• High chance that the entire study is biased
• But, if the investigators are impartial, work without bias and have the necessary experience to make sound judgements, the results obtained from the analysis may be reliable
Types of non-probability sampling

1. Quota Sampling
2. Purposive Sampling
3. Convenience Sampling
4. Snow Ball Sampling
Purposive/Judgemental sampling
• Selection of the sample is based on the researcher’s judgement regarding the best population to get the required information related to the study
• Useful when you want to develop something about which only little is known
• E.g.
– All the patients with some rare syndrome are selected
Quota sampling
Quota sampling may be viewed as two-stage restricted judgmental sampling.
– The first stage consists of developing control categories, or quotas, of population elements.
– In the second stage, sample elements are selected based on convenience or judgment.
• Pre-planned number of subjects in specified categories (e.g. 60% men, 40% women)
• Interviewer selects the first available subjects from the specified categories
• Among the least expensive methods
• No requirement of sampling frame
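A rough Python sketch of filling quotas with the first available subjects. The arrival list and the function name `quota_sample` are invented for illustration; a sample of 5 with a 60%/40% split gives quotas of 3 men and 2 women.

```python
def quota_sample(arrivals, quotas):
    """Take subjects in order of arrival until every category quota is full."""
    remaining = dict(quotas)
    selected = []
    for subject_id, category in arrivals:
        if remaining.get(category, 0) > 0:   # quota for this category still open
            selected.append(subject_id)
            remaining[category] -= 1
        if not any(remaining.values()):      # all quotas filled
            break
    return selected

# Hypothetical subjects in the order the interviewer meets them
arrivals = [(1, "woman"), (2, "woman"), (3, "man"), (4, "woman"),
            (5, "man"), (6, "man"), (7, "woman"), (8, "man")]
selected = quota_sample(arrivals, {"man": 3, "woman": 2})
print(selected)  # [1, 2, 3, 5, 6]
```

No random draw is involved, which is why no sampling frame is needed and why the sampling error cannot be calculated.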
Snowball sampling
• Process of selecting a sample using networks.
• Start with a few individuals and extend to their contacts until the required sample is collected
• Advantages:
– Need to make contact with only a few people initially
– No need of an entire sampling frame
• Disadvantages:
– Choice of the entire sample rests on the choice of individuals at the first stage
– If they are from a particular group then the study might get biased
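The way the sample grows from the first contacts can be sketched as a walk over a contact network. The network below and the function name `snowball_sample` are hypothetical, and following contacts in first-come order is an assumption made to keep the example simple.

```python
from collections import deque

def snowball_sample(contacts, seeds, sample_size):
    """Start from the seed individuals and follow their contacts
    until the required number of subjects is reached."""
    selected = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(selected) < sample_size:
        person = queue.popleft()
        selected.append(person)
        for contact in contacts.get(person, []):   # referrals named by this person
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return selected

# Hypothetical referral network: who names whom
contacts = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"], "E": [], "F": []}
recruited = snowball_sample(contacts, ["A"], 5)
print(recruited)  # ['A', 'B', 'C', 'D', 'E']
```

Everyone in the sample is reachable from seed "A", which illustrates how the choice of first-stage individuals determines the whole sample.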
CALCULATION OF SAMPLE SIZE
• What should be the sample size to get correct results?
• Too small a sample → study is not valid
• Too large a sample → laborious, costly & time consuming
• We need an optimum sample size, which gives reliable results
Sample size calculation

• For Qualitative data

• For Quantitative data


Sample size calculation for qualitative data:
• In such data, we deal with proportions such as morbidity rates, cure rates, and proportions vaccinated, died, survived etc.
• The first step is to decide how large an error due to sampling defects can be tolerated or allowed in the estimates
• Such allowable error has to be stated by the investigator
• For finding the suitable size of the sample, the assumption usually made is that the allowable error does not exceed 10% or 20% of the positive character.
For Qualitative data

$$n = \frac{4pq}{L^2}$$

p = positive character (in percentage)
q = negative character = 1 - p, or q = 100 - p in percentage, as p + q = 100%
L = allowable error of p, usually 10% or 20% of p


• Incidence rate in the last influenza epidemic was found to be 50/1000 (5%) of the population exposed. What should be the size of the sample to find the incidence rate of influenza in the current epidemic if the allowable error is 10% and 20%?
p = 5%
q = 100 - p = 100 - 5 = 95%
L = 0.5 (at 10% of p)
L = 1 (at 20% of p)

• Sample size calculation at 10% allowable error

$$n = \frac{4pq}{L^2} = \frac{4 \times 5 \times 95}{0.5 \times 0.5} = 7600$$

• Sample size calculation at 20% allowable error

$$n = \frac{4pq}{L^2} = \frac{4 \times 5 \times 95}{1 \times 1} = 1900$$
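The influenza calculation can be reproduced with a small Python helper. The function name `sample_size_proportion` is an assumption for this sketch; p and L are both given in percentage points, and rounding up to the next whole subject is a convention added here, not stated in the slides.

```python
import math

def sample_size_proportion(p, L):
    """n = 4pq / L^2 with p, q and L in percentage points."""
    q = 100 - p
    return math.ceil(4 * p * q / L ** 2)   # round up to a whole subject

n_10 = sample_size_proportion(5, 0.5)   # allowable error 10% of p
n_20 = sample_size_proportion(5, 1)     # allowable error 20% of p
print(n_10, n_20)  # 7600 1900
```

Doubling the allowable error cuts the required sample to a quarter, because L enters the formula squared.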
• Hookworm prevalence rate was 30% before the specific treatment and adoption of other measures. Calculate the size of sample required to find the prevalence rate now if allowable error is 10% or 20%.
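A worked solution for this exercise, applying the same formula with p = 30% and q = 70%. Rounding up to the next whole subject is an assumption; the slides do not say how fractions are handled.

```latex
\begin{aligned}
p &= 30, \qquad q = 100 - 30 = 70 \\
L = 3 \ (10\% \text{ of } p): \quad
n &= \frac{4 \times 30 \times 70}{3^2} = \frac{8400}{9} \approx 933.3
  \;\Rightarrow\; n = 934 \\
L = 6 \ (20\% \text{ of } p): \quad
n &= \frac{4 \times 30 \times 70}{6^2} = \frac{8400}{36} \approx 233.3
  \;\Rightarrow\; n = 234
\end{aligned}
```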
For quantitative data

• If the Standard Deviation (SD) of the population is known from past experience, the size of the sample can be determined by the following formula with the desired allowable error (L) at 5% risk level.

$$n = \frac{4\,SD^2}{L^2}$$
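The slides do not say where the factor 4 comes from. A likely reading, offered here as an assumption, is that it is the square of the normal deviate for a 5% risk level (z = 1.96, rounded to 2), with the allowable error set equal to z standard errors of the mean:

```latex
\begin{aligned}
L &= z \cdot \frac{SD}{\sqrt{n}}
\quad\Longrightarrow\quad
n = \frac{z^2 \, SD^2}{L^2} \\
z &= 1.96 \approx 2
\quad\Longrightarrow\quad
n \approx \frac{4 \, SD^2}{L^2}
\end{aligned}
```

The same reading applies to the qualitative formula, where pq takes the place of SD squared.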
• Mean pulse rate of a population is believed to be 72/minute with Standard Deviation of 8. Calculate minimum sample size to verify this if allowable error is 1 at 5% significance level.
• Mean systolic blood pressure of students in one college was found to be 120 with SD of 10. Calculate the minimum size of the sample to verify the result if allowable error is 2 at 5% risk.
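Both exercises can be worked with a small Python helper for the quantitative formula. The function name `sample_size_mean` is hypothetical, and rounding up to a whole subject is an added convention.

```python
import math

def sample_size_mean(sd, L):
    """n = 4 * SD^2 / L^2, with SD and L in the same units."""
    return math.ceil(4 * sd ** 2 / L ** 2)   # round up to a whole subject

n_pulse = sample_size_mean(8, 1)    # pulse rate: SD 8, allowable error 1
n_bp = sample_size_mean(10, 2)      # systolic BP: SD 10, allowable error 2
print(n_pulse, n_bp)  # 256 100
```

The mean itself (72 or 120) does not enter the formula; only the SD and the allowable error do.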
Example
• Mean hemoglobin level of girl students in the
colleges is estimated to be 11.5 gm% with SD
of 1.5 gm%.
• Calculate sample size for a study of
Hemoglobin estimation of girls of
physiotherapy colleges of Saurashtra region
with allowable error of 1 gm% & 0.5 gm%.
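A worked solution for the haemoglobin example, using the quantitative formula with SD = 1.5 gm%:

```latex
\begin{aligned}
L = 1 \text{ gm\%}: \quad
n &= \frac{4 \times (1.5)^2}{1^2} = \frac{9}{1} = 9 \\
L = 0.5 \text{ gm\%}: \quad
n &= \frac{4 \times (1.5)^2}{(0.5)^2} = \frac{9}{0.25} = 36
\end{aligned}
```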
Thank You
