Lecture: Sampling Design
[Figure: research process diagram with boxes for problem definition (research problem delineated), theoretical framework (variables clearly identified and labeled), generation of hypotheses, scientific research design, and report writing.]
Issues Involved in the Research Design
[Figure: research design diagram. Recoverable panel labels:]
• Problem statement
• Details of study: exploration, description, hypotheses testing; establishing causal relations, correlations, group differences, rank, etc.; studying events, manipulation, control, simulation; contrived, non-contrived
• Measurement: operational definition, items, scaling, categorizing, coding
• Unit of analysis: individuals, dyads, groups, organizations, etc.
• Sampling design: sampling method, sample size
• Time horizon: cross-sectional, longitudinal
• Data collection method: observation, interview, questionnaire, physical measurements, unobtrusive
• Data analysis: hypotheses testing
Data Collection Process
[Figure: stages of the data collection process, including preliminary planning, sample design, editing, coding, and presentation of results.]
Definition of sampling
Parameter (population value)    Statistic (sample estimate)
true proportion                 sample proportion
true mean                       sample mean
Sampling and representativeness
[Figure: Target population → Sampling population → Sample]
Types of Survey Error
[Figure: breakdown of total survey error. Recoverable labels:]
• Sample design error: selection error, frame error, population specification error
• Measurement error: surrogate information error, processing error, interviewer error, response error, nonresponse bias, instrument bias
Sampling error
• The random difference between a sample and the population from which
the sample is drawn
• The size of the error can be measured in probability samples
• Expressed as the “standard error”
– of a mean, a proportion…
• The standard error (or precision) depends upon:
– Size of the sample
– Distribution of the characteristic of interest in the population
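As a rough illustration of the two dependencies listed above, the sketch below computes the standard error of a mean as sd / √n. The function name and the numbers are illustrative assumptions, not part of the lecture.

```python
import math

def standard_error_mean(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

# Hypothetical values chosen only for illustration.
se_small = standard_error_mean(10, 100)    # sd = 10, n = 100
se_large = standard_error_mean(10, 400)    # same spread, four times the sample
se_spread = standard_error_mean(20, 100)   # twice the spread, same n

print(se_small, se_large, se_spread)
```

Quadrupling the sample size halves the standard error, while doubling the spread of the characteristic doubles it.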
Sampling errors
• This is not an "error" in the sense of making a mistake. Rather, it
is a measure of the possible range of approximation in the
results because a sample was used
• Interviews with a representative sample of 1,000 adults can
accurately reflect the opinions of nearly 2 million adults
• This range of possible results is called the error due to sampling,
often called the margin of error (MOE)
Sampling errors
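[Figure: a population distribution (e.g. income) with population mean $\mu$, and a sample drawn from it with sample mean $\bar{x}$. The sampling error is the gap between $\bar{x}$ and $\mu$: the sample mean falls where it does only because certain randomly selected observations were included in the sample.]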
Non-sampling errors…
– Process errors:
• Examples include measurement error, interviewer
error, and processing error.
• They can be minimised by proper interviewer training,
good questionnaire design, pre-testing, and careful
management of the data recording process.
– The problem is most serious when a bias is
created.
Non-sampling errors…
• Errors in data acquisition:
– Selection bias
• Select people randomly; do not let respondents
(or yourself) choose who takes part.
– Non-response errors
• Anonymity, questionnaire design, relevance
• Call backs, substitution, re-weighting data
Non-sampling error
Population
• Important points:
• Sample size is NOT related to representativeness … you
could sample 20,000 persons walking by a street corner and
the results would still not represent the city; however, an n of
100 could be “right on.”
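The street-corner point above can be sketched with a small simulation. Everything in it is a made-up assumption for illustration: the city, the two groups, and their scores are hypothetical, and the only purpose is to compare a large convenience sample with a small random one.

```python
import random
import statistics

random.seed(1)

# Hypothetical city: 70% of residents score around 50, while the 30% who
# pass by one busy street corner score around 80 (made-up numbers).
elsewhere = [random.gauss(50, 10) for _ in range(70_000)]
corner = [random.gauss(80, 10) for _ in range(30_000)]
city = elsewhere + corner
true_mean = statistics.fmean(city)

# Convenience sample: 20,000 people, all taken from the street corner.
convenience = random.sample(corner, 20_000)
# Probability sample: only 100 people, drawn at random from the whole city.
probability = random.sample(city, 100)

convenience_error = abs(statistics.fmean(convenience) - true_mean)
probability_error = abs(statistics.fmean(probability) - true_mean)

print(f"true mean: {true_mean:.1f}")
print(f"error with n = 20,000 (street corner): {convenience_error:.1f}")
print(f"error with n = 100 (random):           {probability_error:.1f}")
```

Under these assumptions the large sample stays far from the city mean because the selection is biased, whereas the small random sample lands much closer.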
Sample Accuracy
• Important points:
• Sample size, however, IS related to accuracy. How
close the sample statistic is to the actual population
parameter (e.g. sample mean vs. population mean) is
a function of sample size.
Sample Size AXIOMS
If we conducted our study over and over, e.g. 1,000 times, we would expect our result to fall within a
known range (±1.96 standard errors of the mean) in about 95% of those studies. Based upon this, there are 95 chances in 100 that the true
value of the population parameter (proportion, share, mean) falls within this range.
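The "study repeated 1,000 times" idea can be checked by simulation. The sketch below assumes a hypothetical normal population with known mean and standard deviation (values invented for illustration) and counts how often the range of ±1.96 standard errors around the sample mean captures the true mean.

```python
import math
import random
import statistics

random.seed(42)

MU, SIGMA = 100.0, 15.0   # hypothetical population mean and standard deviation
N = 50                    # sample size of each repeated study
REPEATS = 1_000
Z = 1.96                  # z value for 95% confidence

half_width = Z * SIGMA / math.sqrt(N)
covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.fmean(sample)
    # Does the range x_bar ± 1.96 standard errors contain the true mean?
    if abs(x_bar - MU) <= half_width:
        covered += 1

coverage = covered / REPEATS
print(f"{coverage:.1%} of the {REPEATS} studies captured the true mean")
```

The share of studies that capture the true mean should come out close to 95%, varying slightly with the random seed.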
Normal Distribution
[Figure: normal sampling distributions for n = 500 and n = 1000]
We also know that, given the amount of variability in the population, the
sample size affects the size of the confidence interval; as n goes down the
interval widens (more “sloppy”)
Central Limit Theorem
So, what have we learned thus far?
There is a relationship among:
• the level of confidence we desire that our results be repeated
within some known range if we were to conduct the study
again, and…
• the variability (in responses) in the population and…
• the amount of acceptable sample error (desired accuracy) we
wish to have and…
• the size of the sample.
Sample Size Formula
• The formula requires that we
a. specify the amount of confidence we wish to have,
b. estimate the variance in the population, and
c. specify the level of desired accuracy we want.
• When we specify the above, the formula tells us
what sample size, n, we need to use.
Sample Size Formula for Estimating
a Mean
Sample Size Formula
Estimating a Mean
This requires a different formula
How to estimate s?
• Use standard deviation of the sample from a previous study on the target
population
• Conduct a pilot study of a few members of the target population and
calculate s
Sample Size Calculation
Example: Estimating the Mean of a Population
What is the required sample size, n?
• Management wants to know customers’ level of satisfaction with their service. They
propose conducting a survey and asking for satisfaction on a scale from 1 to 10
(since there are 10 possible answers, the range = 10).
• Management wants to be 99% confident in the results (99 chances in 100 that the true
value is captured) and they do not want the allowed error to be more than ±0.5 scale
points.
• What is n?
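One way to answer the question, sketched under two assumptions that the slide does not state: the sample size formula for a mean is taken to be $n = z^2 s^2 / e^2$, and s is approximated by the rule of thumb range / 6 (about ±3 standard deviations span nearly all of a normal distribution). With z = 2.58 for 99% confidence and e = 0.5:

```latex
n = \frac{z^2 s^2}{e^2}
  = \frac{(2.58)^2 \left(\tfrac{10}{6}\right)^2}{(0.5)^2}
  = \frac{6.6564 \times 2.7778}{0.25}
  \approx 73.96
  \quad\Rightarrow\quad n = 74
```

A standard deviation taken from a pilot study or a previous survey would likely be a better input than the range rule and could change the result.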
Sample Size Calculation
• Sampling Error
• Probability Statements about the Sampling Error
• Interval Estimation: $\sigma$ Assumed Known
• Interval Estimation: $\sigma$ Estimated by s
Sample Size Formula
Sampling Error
• The absolute value of the difference between an unbiased
point estimate and the population parameter it estimates is
called the sampling error.
• For the case of a sample mean estimating a population mean,
the sampling error is
Sampling Error = $|\bar{x} - \mu|$
Sample Size Formula
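Probability Statements about the Sampling Error
[Figure: sampling distribution of $\bar{x}$, centered at $\mu$; $1 - \alpha$ of all $\bar{x}$ values lie in the central region, with area $\alpha/2$ in each tail.]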
Interval Estimate of a Population Mean:
Large-Sample Case (n > 30)
• $\sigma$ Assumed Known
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
where: $\bar{x}$ is the sample mean
$1 - \alpha$ is the confidence coefficient
$z_{\alpha/2}$ is the z value providing an area of
$\alpha/2$ in the upper tail of the standard
normal probability distribution
$\sigma$ is the population standard deviation
n is the sample size
Interval Estimate of a Population Mean:
Large-Sample Case (n > 30)
• $\sigma$ Estimated by s
In most applications the value of the population standard
deviation is unknown. We simply use the value of the
sample standard deviation, s, as the point estimate of the
population standard deviation.
$$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$$
Example: Airport Check-in Counter
At 95% confidence, what is the margin of error?
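Margin of error is defined as:
$$z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Given:
$\sigma$ = 5 minutes, n = 49, $\alpha$ = 1 - .95 = .05
Margin of error:
$$z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = z_{.05/2} \frac{5}{\sqrt{49}} = z_{.025} \cdot \frac{5}{7} = 1.96 \cdot \frac{5}{7} = 1.4$$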
Example: Airport Check-in Counter
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
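The margin of error for the check-in counter example can be recomputed with a few lines of Python. This is a minimal sketch using only the given values ($\sigma$ = 5 minutes, n = 49, 95% confidence); the variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma = 5.0        # population standard deviation in minutes (given)
n = 49             # sample size (given)
confidence = 0.95

alpha = 1 - confidence
# z value leaving alpha/2 in the upper tail of the standard normal
z = NormalDist().inv_cdf(1 - alpha / 2)
margin = z * sigma / math.sqrt(n)

print(f"z = {z:.2f}, margin of error = {margin:.1f} minutes")
```

The interval estimate is then the sample mean plus or minus this margin.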
Interval Estimation of a Population Mean:
Small-Sample Case (n < 30)
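[Figure: t distribution (10 degrees of freedom); horizontal axis z, t, centered at 0.]
t Distribution
[Figure: t distribution centered at 0, with $\alpha/2$ as the area or probability in the upper tail to the right of $t_{\alpha/2}$.]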
Interval Estimation of a Population Mean:
Small-Sample Case (n < 30) and $\sigma$ Estimated by s
Interval Estimate
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
where $1 - \alpha$ = the confidence coefficient
$t_{\alpha/2}$ = the t value providing an area of $\alpha/2$ in the
upper tail of a t distribution
with n - 1 degrees of freedom
s = the sample standard deviation
Example
Interval Estimation of a Population Mean:
Small-Sample Case (n < 30) with $\sigma$ Estimated by s
In the testing of a new method, 18 employees were selected
randomly and asked to try the new method. The sample mean
production rate for the 18 employees was 80 parts per hour and
the sample standard deviation was 10 parts per hour. Provide a
95% confidence interval estimate for the population mean
production rate for the new method. Assume the population has
a normal probability distribution.
Example
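Given: $\bar{x} = 80$, $s = 10$, $n = 18$
$1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$
$$\frac{s}{\sqrt{n}} = \frac{10}{\sqrt{18}} = 2.36$$
$t_{.025,17} = 2.11$
$$\bar{x} \pm t_{.025,17} \frac{s}{\sqrt{n}} = 80 \pm 2.11(2.36) = 80 \pm 4.98$$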
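The same calculation, sketched in Python. The t value is hard-coded from the table value used in the example, since the standard library has no t distribution; variable names are illustrative.

```python
import math

x_bar = 80.0     # sample mean, parts per hour
s = 10.0         # sample standard deviation
n = 18           # sample size
t_crit = 2.110   # t value for .025 upper-tail area, 17 degrees of freedom (table value)

standard_error = s / math.sqrt(n)
margin = t_crit * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"standard error = {standard_error:.2f}")
print(f"margin = {margin:.2f}")
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```

Without intermediate rounding the margin comes out near 4.97; the 4.98 in the worked example results from rounding the standard error to 2.36 before multiplying.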
Summary of Interval Estimation Procedures
for a Population Mean
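• Is n > 30?
– Yes: is $\sigma$ known?
   Yes: use $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
   No: use s to estimate $\sigma$, then $\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
– No: is the population approximately normal?
   Yes: is $\sigma$ known?
      Yes: use $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
      No: use s to estimate $\sigma$, then $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
   No: increase n to > 30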
Sample Size for an Interval Estimate
of a Population Mean
Margin of Error
$$e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Necessary Sample Size
$$n = \frac{(z_{\alpha/2})^2 \sigma^2}{e^2}$$
Example: Starting Salaries of College Graduates
Sample Size for an Interval Estimate of a Population Mean
$$n = \frac{(z_{\alpha/2})^2 \sigma^2}{e^2} = \frac{(1.96)^2 (2{,}000)^2}{(200)^2} = \frac{3.8416 \times 4{,}000{,}000}{40{,}000} = 384.16$$
Rounding up gives n = 385.
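The necessary-sample-size formula can be wrapped in a small helper. This is a sketch, with a hypothetical function name, that rounds up because n must be a whole number large enough to keep the margin of error within e.

```python
import math

def sample_size_for_mean(z, sigma, e):
    """Smallest n with z * sigma / sqrt(n) <= e, i.e. n >= (z * sigma / e) ** 2."""
    return math.ceil((z * sigma / e) ** 2)

# Values from the starting-salaries example: z = 1.96, sigma = 2,000, e = 200.
n = sample_size_for_mean(z=1.96, sigma=2000, e=200)
# Halving the allowed error roughly quadruples the required sample.
n_tighter = sample_size_for_mean(z=1.96, sigma=2000, e=100)

print(n, n_tighter)
```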
The Confidence Interval Method
• Confidence interval approach: applies the concepts of accuracy,
variability, and confidence interval to create a “correct” sample
size
• Two types of error:
• Non-sampling error: pertains to all sources of error other than
sample selection method and sample size
• Sampling error: involves sample selection and sample size…this
is the error that we are controlling through formulas
• Sample error formula:
Interpreting the meaning of a confidence
interval estimate
• $\bar{x}$ and $\bar{p}$, the point estimates of $\mu$ and p respectively,
provide the best guess of these population
parameters.
• The estimated standard errors provide information
about the sampling variability.
• Confidence interval estimates not only give us an idea
about the value of the estimated parameter, but also
inform us about the sampling variability via the
estimated standard error and the level of confidence
($1 - \alpha$).
The Confidence Interval Method
The relationship between sample size and sample error:
How to Calculate Sample Error (Accuracy)
$$\text{error} = z \sqrt{\frac{pq}{n}}$$
where z = 1.96 (95%) or 2.58 (99%)
Sample Size and Accuracy
[Figure: accuracy (sample error $s_p$, from 0% to 16%) plotted against sample size (from 50 to 2,000).]
Accuracy Levels for Different Sample Sizes
Accuracy at 95% (z = 1.96), i.e. $\pm 1.96\, s_p$, for the “p” you found in your sample:
n        p=50%     p=70%     p=90%
10       ±31.0%    ±28.4%    ±18.6%
100      ±9.8%     ±9.0%     ±5.9%
250      ±6.2%     ±5.7%     ±3.7%
500      ±4.4%     ±4.0%     ±2.6%
1,000    ±3.1%     ±2.8%     ±1.9%
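The accuracy levels above can be reproduced from the sample error formula, z·√(pq/n). The sketch below assumes q = 1 − p with p expressed as a fraction; the function name is illustrative.

```python
import math

def sample_error(p, n, z=1.96):
    """Sample error (margin of error) of a proportion: z * sqrt(p * q / n)."""
    q = 1 - p
    return z * math.sqrt(p * q / n)

print(f"{'n':>6} {'p=50%':>8} {'p=70%':>8} {'p=90%':>8}")
for n in (10, 100, 250, 500, 1000):
    row = [f"±{sample_error(p, n):.1%}" for p in (0.5, 0.7, 0.9)]
    print(f"{n:>6} {row[0]:>8} {row[1]:>8} {row[2]:>8}")
```

For any given n the error is largest at p = 50%, which is why that column has the widest margins.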
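[Figure: sampling distribution of $\bar{p}$, centered at p, with standard error $\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$; area $\alpha/2$ lies in each tail, beyond $p \pm z_{\alpha/2}\sigma_{\bar{p}}$.]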
Interval Estimation of a Population
Proportion
Interval Estimate
$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$
where: $1 - \alpha$ is the confidence coefficient
$z_{\alpha/2}$ is the z value providing an area of
$\alpha/2$ in the upper tail of the standard normal probability
distribution
$\bar{p}$ is the sample proportion
Sample Size Formula for
Estimating a Proportion
The sample size formula for estimating a proportion (also
called a percentage or share):
Sample Size Formula for
Estimating a Proportion
How to estimate variability (p and q shares) in
the population ?
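$$E = z_{\alpha/2}\,\sigma_{\bar{p}} = z_{.05/2}\,\sigma_{\bar{p}} = z_{.025}\,\sigma_{\bar{p}} = z_{.025} \sqrt{\frac{p(1-p)}{n}}$$
$$E^2 = z_{.025}^2\, \frac{p(1-p)}{n}$$
$$n = z_{.025}^2\, \frac{p(1-p)}{E^2} = (1.96)^2\, \frac{.5(1-.5)}{(.03)^2} = 3.8416 \times \frac{.25}{.0009} \approx 1068$$
Notice that the value of p in this example is 0.5 and not 0.44. Why?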
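A short sketch of the calculation above, with a hypothetical helper name. It also bears on the closing question: p(1 − p) is largest at p = 0.5, so planning with 0.5 gives the most conservative (largest) sample size, whereas plugging in 0.44 would give a smaller one.

```python
import math

def sample_size_for_proportion(p, E, z=1.96):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= E."""
    return math.ceil(z ** 2 * p * (1 - p) / E ** 2)

n_conservative = sample_size_for_proportion(p=0.5, E=0.03)   # the slide's calculation
n_with_044 = sample_size_for_proportion(p=0.44, E=0.03)      # using 0.44 instead

print(n_conservative, n_with_044)
```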
Other Methods of Sample Size Determination