Sampling and Sample Size: Dr. Keerti Jain, NIIT University, Neemrana
POPULATION AND SAMPLE
Population:
The set of all measurements of interest to the researcher
(the collection of all responses, measurements, or counts
that are of interest)
Sample:
A subset of the population
Dr. Keerti Jain, NIIT University Neemrana 07/22/20
POPULATION DEFINITION
EXAMPLE
SAMPLING
WHY SAMPLING?
• Lower cost
• Less field time
• But less accuracy than studying the whole population
• Necessary when it is impossible to study the whole population
TERMINOLOGY
Target population:
The population to be studied, to which the investigator wants to generalize the results
Sampling unit:
The smallest unit that can be selected for the sample
Study population:
The part of the target population from which the investigator draws the sample
Sampling frame:
A list of all the sampling units from which the sample is drawn
Sampling scheme:
The method of selecting sampling units from the sampling frame
SAMPLING
[Figure: diagram relating the target population, study population, sampling frame, and sample]
SAMPLING BREAKDOWN
IMPORTANCE OF SAMPLING FRAME
• In the most straightforward case, such as sentencing (accepting or rejecting) a
batch of material from production (acceptance sampling by lots), it is possible to
identify and measure every single item in the population and to include any one of
them in the sample. In the more general case, however, this is not possible.
• There is no way to identify all rats in the set of all rats. Where voting is not
compulsory, there is no way to know in advance which people will actually vote at a
forthcoming election.
• As a remedy, we seek a sampling frame in which every single element can be
identified and any element can be included in the sample.
• The sampling frame must be representative of the population.
• Sampling procedure
• Sample size
• Participation (response)
SAMPLING PROCESS
TYPES OF SAMPLING TECHNIQUES
• Probability sampling
• Non-probability sampling
NON-PROBABILITY SAMPLING
PROBABILITY SAMPLING
• Random sampling
• Each subject has a known probability of being selected
TYPES OF NON-PROBABILITY SAMPLE
• Convenience sample
• Purposive sample
• Judgmental sample
• Quota sample
• Snowball sample
• Panel sample
TYPES OF PROBABILITY SAMPLING
• Simple random sample
• Systematic random sample
• Stratified random sample
• Multistage sample
• Multiphase sample
• Cluster sample
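As a rough illustration of the first three designs in this list, the sketch below draws a simple random, a systematic, and a stratified sample from a small sampling frame using Python's standard random module. The frame of 100 numbered units, the two strata, and the function names are hypothetical and chosen only for the example.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every unit in the frame has the same chance of being selected."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th unit, with k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum."""
    sample = []
    for units in strata.values():
        size = round(len(units) * fraction)
        sample.extend(rng.sample(units, size))
    return sample

rng = random.Random(0)                       # fixed seed so the run is repeatable
frame = list(range(1, 101))                  # hypothetical frame: units numbered 1..100
strata = {"A": frame[:60], "B": frame[60:]}  # hypothetical strata of 60 and 40 units

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
```

With a 10% sampling fraction the stratified draw takes 6 units from stratum A and 4 from stratum B, so each stratum is represented in proportion to its size.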
Errors in Sample
TYPE 1 ERROR
TYPE 2 ERROR
EXAMPLE 1
SOLUTION
In order to ensure that the 95% confidence interval estimate of the mean
systolic blood pressure in children between the ages of 3 and 5 with
congenital heart disease is within 5 units of the true mean, a sample of size
62 is needed.
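The figure of 62 is consistent with the standard formula for estimating a mean, n = (Zσ/E)^2, rounded up. The sketch below reproduces it under the assumption that the standard deviation of systolic blood pressure is 20 units. That value is not stated in the text above and is used here only because it matches the reported result. The function name is illustrative.

```python
import math

def sample_size_for_mean(sigma, margin, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= margin (z = 1.96 for 95% confidence)."""
    return math.ceil((z * sigma / margin) ** 2)

# sigma = 20 is an assumed standard deviation, not a value given in the example.
n = sample_size_for_mean(sigma=20, margin=5)
print(n)  # 62
```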
EXAMPLE 2
EXAMPLE 3
EXAMPLE 4
An investigator wants to compare two diet programs in children who are obese. One diet
is a low-fat diet, and the other is a low-carbohydrate diet. The plan is to enroll children
and weigh them at the start of the study. Each child will then be randomly assigned to
either the low-fat or the low-carbohydrate diet. Each child will follow the assigned diet
for 8 weeks and will then be weighed again. The number of pounds lost will
be computed for each child. Based on data reported from diet trials in adults, the
investigator expects that 20% of all children will not complete the study. A 95%
confidence interval will be estimated to quantify the difference in weight lost between
the two diets, and the investigator would like the margin of error to be no more than 3
pounds. How many children should be recruited into the study?
SOLUTION
Samples of size n1=56 and n2=56 will ensure that the 95% confidence interval for
the difference in weight lost between diets will have a margin of error of no more
than 3 pounds. Again, these sample sizes refer to the numbers of children with
complete data.
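Because the 56 per group counts only children who finish the study, the expected 20% dropout still has to be allowed for when recruiting. A minimal sketch of that adjustment, dividing the required number by the expected retention rate and rounding up (the function name is illustrative):

```python
import math

def number_to_enroll(n_complete, dropout_percent):
    """Participants to recruit so that about n_complete remain after dropout."""
    retained_percent = 100 - dropout_percent
    return math.ceil(n_complete * 100 / retained_percent)

per_group = number_to_enroll(n_complete=56, dropout_percent=20)
print(per_group, 2 * per_group)  # 70 140
```

Under these assumptions, about 70 children per diet, 140 in total, would need to be recruited.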
where p is the proportion and
E is the sampling error, or tolerable margin of error:
the largest acceptable difference between the population proportion and the sample proportion
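The formula these definitions belong to is not reproduced here. The sketch below assumes it is the usual sample size formula for estimating a single proportion, n = Z^2 p(1-p) / E^2, rounded up. The inputs p = 0.5 and E = 0.05 are illustrative values and are not taken from the examples.

```python
import math

def sample_size_for_proportion(p, margin, z=1.96):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# p = 0.5 is the most conservative choice because it maximises p * (1 - p).
n = sample_size_for_proportion(p=0.5, margin=0.05)
print(n)  # 385
```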
EXAMPLE 5
SOLUTION
EXAMPLE 6
SOLUTION
EXAMPLE 7
EXAMPLE 8
SOLUTION
The sample sizes (i.e., numbers of women who smoked and did not smoke
during pregnancy) can be computed using the formula shown above.
National data suggest that 12% of infants are born prematurely. We will use
that estimate for both groups in the sample size computation.
Samples of size n1=508 women who smoked during pregnancy and n2=508
women who did not smoke during pregnancy will ensure that the 95%
confidence interval for the difference in proportions who deliver prematurely
will have a margin of error of no more than 4%.
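The result can be checked with the usual formula for the difference between two independent proportions, n per group = [p1(1-p1) + p2(1-p2)] (Z/E)^2, which is assumed here to be the formula the solution refers to. The inputs are the values stated above: 12% in both groups, 95% confidence (Z = 1.96), and a 4% margin of error. The function name is illustrative.

```python
import math

def sample_size_two_proportions(p1, p2, margin, z=1.96):
    """Per-group n so the CI for p1 - p2 has half-width at most margin."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(variance_sum * (z / margin) ** 2)

n_per_group = sample_size_two_proportions(p1=0.12, p2=0.12, margin=0.04)
print(n_per_group)  # 508
```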