Slidesc53 - 1 - 2 Statistics

The document discusses various sampling methods including Simple Random Sampling with Replacement (SRSWR), Systematic Sampling, and Cluster Sampling, highlighting their characteristics and applications. It provides R code examples for selecting systematic samples and estimating population means using cluster sampling. Additionally, it addresses sample size estimation for achieving specified tolerable errors in statistical studies.

Uploaded by

Ki Yan Shih
Copyright
© All Rights Reserved

STAC53

Slides 1 contd.

1
Simple Random Sampling with replacement (SRSWR)

2
Simple Random Sampling with replacement (SRSWR)

3
Systematic Sampling

4
Systematic Sampling
• If the population is in random order, a systematic sample will
behave much like an SRS.
• Systematic sampling does not necessarily give a
representative sample if the population list is in
some periodic order.
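The periodicity problem can be illustrated with a small hypothetical population (the values, N, and k below are made up for illustration). When the sampling interval equals the period of the list, every systematic sample repeats a single value, so the sample mean depends entirely on the random start:

```r
# Hypothetical population of N = 50 units whose values repeat with period 10
N <- 50
k <- 10                       # sampling interval, equal to the period
y <- rep(1:10, times = 5)     # values 1, 2, ..., 10 repeated five times

# Mean of each of the k possible systematic samples (one per random start r)
sys_means <- sapply(1:k, function(r) mean(y[seq(r, N, by = k)]))

pop_mean <- mean(y)           # true population mean
print(sys_means)              # one mean per possible start
print(pop_mean)
```

Each possible sample mean equals its starting position, while the population mean is 5.5, so no single systematic sample is representative here.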

5
Systematic Sampling

6
Example : Systematic Sampling

7
Notes
• An unbiased estimator of the population variance
based on a systematic sample is not available.
• If the population is in a random order, a systematic
sample will behave much like an SRSWOR and so the
variance formula based on an SRSWOR can be used.
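If the list order can reasonably be treated as random, the SRSWOR standard error of the sample mean, sqrt((1 - n/N) s^2 / n), can be applied to the systematic sample. A minimal sketch, in which the function name srswor_se and the data are hypothetical:

```r
# Standard error of the sample mean under the SRSWOR formula.
# y: sampled values, N: population size (both supplied by the user)
srswor_se <- function(y, N) {
  n <- length(y)
  sqrt((1 - n / N) * var(y) / n)   # finite population correction times s^2 / n
}

# Hypothetical measurements on a systematic sample of 5 units from N = 50
y_sample <- c(1, 2, 3, 4, 5)
se <- srswor_se(y_sample, N = 50)
print(se)
```

This is only appropriate under the random-order assumption stated above; with a periodic or trending list the formula can be badly off.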

8
An R function for selecting a systematic sample
# This R code selects a systematic sample of size n from a
# population of size N.
# The values of N and n must be provided.
sys.sample = function(N, n){
  # ceiling(x) gives the smallest integer greater than or equal to x,
  # so ceiling(2.1) = 3.
  k = ceiling(N/n)    # sampling interval
  r = sample(1:k, 1)  # random start between 1 and k
  sys.samp = seq(r, r + k*(n-1), k)
  # Note: if N is not a multiple of n, the last indices can exceed N.
  cat("The selected systematic sample is: \"", sys.samp, "\"\n")
  # Note: "\n" ends the printed output with a new line
}

9
An R function for selecting a systematic sample
# To select a systematic sample, type the following command,
# providing the values of N and n
set.seed(123)
sys.sample(50, 5)
The selected systematic sample is: " 3 13 23 33 43 "

10
Cluster sampling
• In some practical situations, the population is grouped
into clusters.
• In cluster sampling we select a random sample of
these clusters and observe some or all elements in the
selected clusters.
• Convenience and reduced cost are often the reasons
for using cluster sampling.
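Selecting a one-stage cluster sample can be sketched as follows. The sampling frame, the cluster counts, and all object names here are hypothetical; the point is only that clusters, not individual elements, are drawn at random and every element of a chosen cluster is observed:

```r
set.seed(1)
N <- 100   # number of clusters in the population (hypothetical)
M <- 4     # elements per cluster (hypothetical, equal sizes)
n <- 5     # number of clusters to select

# Hypothetical sampling frame: one row per element, labelled by cluster
frame <- data.frame(cluster_id = rep(1:N, each = M),
                    element_id = rep(1:M, times = N))

# Draw n clusters by SRSWOR, then keep every element in the chosen clusters
chosen <- sample(1:N, n)
clus_sample <- frame[frame$cluster_id %in% chosen, ]
print(clus_sample)
```

The resulting sample has n * M rows, but only n independent random selections were made.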

11
Cluster sampling: Notations
• The population consists of $N$ clusters.
• The $i$th cluster consists of $M_i$ elements. For convenience, let
us assume that all clusters have the same number of
elements, i.e. $M_i = M$.
• Let $y_{ij}$ be the measurement on the $j$th element in the $i$th
cluster.
• Then the population size is $M_0 = \sum_{i=1}^{N} M_i = NM$.

12
Cluster sampling: Notations and formulas

13
One-stage cluster sampling

14
Result

15
Example
• In Example 5.2 of Lohr (2010, p. 171), a student wants to
estimate the average grade point average (GPA) in his
dormitory. Instead of obtaining a listing of all
students in the dorm and conducting an SRS, he notices that
the dorm consists of 100 suites, each with four students; he
chooses 5 of those suites at random and asks every person
in the 5 suites what her or his GPA is.

16
The R code to estimate the population mean and its standard error
using Cluster Sampling
cluster <- read.table("eg52cluster.txt", header = TRUE)
cluster

17
The R code to estimate the population mean and its standard error
using Cluster Sampling
library(doBy) # Install the doBy package if you haven’t done so
# Mean, standard deviation, and count of GPA within each suite
summaryBy(GPA ~ suite, data = cluster,
          FUN = function(x) c(m = mean(x), s = sd(x), n = length(x)))

18
The R code to estimate the population mean and its standard error
using Cluster Sampling
library (survey)
N = 100 # The total number of clusters
n = 5 # The number of clusters selected (this is not the sample size)
M = 4 # The number of elements in each cluster
# Note: This is assumed to be equal here
clusterdesign <- svydesign(id=~suite, fpc=rep(N, n*M), data=cluster)
# id = ~suite tells R that suites are the clusters.
svymean(~GPA, clusterdesign)
mean SE
GPA 2.826 0.1637
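The survey output can be cross-checked by hand with the usual one-stage formulas for equal-sized clusters: the estimated mean is the sum of the sampled cluster totals divided by n*M, and its standard error is (1/M) * sqrt((1 - n/N) * s_t^2 / n), where s_t^2 is the sample variance of the cluster totals. The suite totals below are an assumption: they are recalled from Lohr's Example 5.2 rather than read from eg52cluster.txt, so verify them against the data file before relying on them.

```r
N <- 100   # total number of suites (clusters)
n <- 5     # number of suites sampled
M <- 4     # students per suite

# ASSUMED suite totals of GPA (recalled from Lohr's Example 5.2; check the file)
suite_totals <- c(12.16, 11.36, 8.96, 12.96, 11.08)

# Estimated population mean GPA per student
ybar_hat <- sum(suite_totals) / (n * M)

# Standard error from the variance of the cluster totals
s2_t <- var(suite_totals)
se_ybar <- sqrt((1 - n / N) * s2_t / n) / M

print(round(ybar_hat, 3))   # 2.826
print(round(se_ybar, 4))    # 0.1637
```

With these totals the hand calculation agrees with the svymean() result, which suggests that the design was specified as intended.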

19
Sample size estimation
• The answer to the question "How big should the sample be?"
depends on how much error we can tolerate.
• There are two ways to specify the tolerable error: as an
absolute error (𝑒) or as a relative error (𝑟).

20
Sample size estimation
• When the absolute tolerable error (𝑒) is specified and SRSWOR is used,
the equation relating 𝑒 to the sample size comes from the formula for
the confidence interval for the population mean:
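The formula itself did not survive in the text here. The standard SRSWOR result, which agrees with the $n_0 = z_{\alpha/2}^2 S^2 / e^2$ used in the proportion example in these slides, can be sketched as follows, assuming $S^2$ denotes the population variance and $N$ the population size:

```latex
e = z_{\alpha/2}\sqrt{1-\frac{n}{N}}\;\frac{S}{\sqrt{n}}
\quad\Longrightarrow\quad
n = \frac{z_{\alpha/2}^{2} S^{2}}{e^{2} + \dfrac{z_{\alpha/2}^{2} S^{2}}{N}}
  = \frac{n_0}{1 + \dfrac{n_0}{N}},
\qquad
n_0 = \frac{z_{\alpha/2}^{2} S^{2}}{e^{2}}.
```

Here $n_0$ is the sample size that would be needed under sampling with replacement, and the factor $1/(1 + n_0/N)$ is the adjustment for the finite population.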

21
22
Sample size estimation
• When the relative tolerable error (𝑟) is specified and if
SRSWOR is used, then (substituting $r\bar{y}_U$ for 𝑒)
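The resulting expression is missing from the text here. Under the standard SRSWOR sample size formula $n = z_{\alpha/2}^2 S^2 / \left(e^2 + z_{\alpha/2}^2 S^2 / N\right)$, which is assumed rather than taken from the slide, the substitution would give:

```latex
n = \frac{z_{\alpha/2}^{2} S^{2}}{\left(r\,\bar{y}_U\right)^{2} + \dfrac{z_{\alpha/2}^{2} S^{2}}{N}}
  = \frac{z_{\alpha/2}^{2}\,\mathrm{CV}^{2}}{r^{2} + \dfrac{z_{\alpha/2}^{2}\,\mathrm{CV}^{2}}{N}},
\qquad
\mathrm{CV} = \frac{S}{\bar{y}_U}.
```

The second form follows by dividing the numerator and denominator by $\bar{y}_U^2$, and shows that only the coefficient of variation is needed when the error is specified in relative terms.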

23
Example

24
Example

25
Sample size for estimating proportions

26
Sample size for estimating proportion
• A population has 𝑁 = 1500. Determine the sample size necessary to
estimate a population proportion to within .03 with 90% confidence
assuming you have no knowledge of the true population proportion.
• 𝑒 = 0.03
• For 90 % confidence interval 𝑧 = 1.645
• If you have no knowledge of the true population proportion, use 𝑝 =
1/2 = 0.5
• $n_0 = z_{\alpha/2}^2 S^2 / e^2$
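The remaining arithmetic for this example can be sketched in R. The sketch assumes $S^2 \approx p(1-p)$ for a proportion and applies the usual finite population adjustment $n = n_0 / (1 + n_0/N)$; neither step appears in the text above, so treat both as assumptions. All variable names are illustrative.

```r
N <- 1500    # population size
e <- 0.03    # absolute tolerable error
z <- 1.645   # normal critical value for 90% confidence
p <- 0.5     # conservative guess when the true proportion is unknown

S2 <- p * (1 - p)        # assumed approximation: S^2 ~ p(1 - p)
n0 <- z^2 * S2 / e^2     # sample size ignoring the finite population
n  <- n0 / (1 + n0 / N)  # adjusted for the finite population (assumed step)
n_required <- ceiling(n) # round up to a whole unit

print(round(n0, 2))      # 751.67
print(round(n, 2))       # 500.74
print(n_required)        # 501
```

Rounding up rather than to the nearest integer keeps the achieved error at or below the specified tolerance.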

27
