Sampling Theory
This is a milestone. From now on, we’ll be dealing with the unknown monster, the population. We don’t
get to see the whole picture, period. We can’t even be sure whether our guess comes close enough. Yet
coming close is exactly the thing we are trying to do.
Again, I want you to know exactly why we can’t get to see the whole picture, that is, the population. It’s
nothing mysterious: usually the population is simply too large for us to examine every member.
Why sampling? Because it’s impossible to know the entire population.
How do we sample, and what for? Here’s a list of questions.
1. What is a population?
It is the collection of all the objects that possess the characteristics that we’re interested in knowing.
2. Ways to investigate.
Census
Sample survey
3. Comparisons between the two ways.
Time
Money
Precision
Know-how
4. What is a random sample? What are μ (mu) and σ (sigma)?
10. So now we’re taking the value of the sample mean for the population mean. What exactly is the thing
that we do?
12. How sure are we in assuming that the population is normal? Approximately normal?
13. So, now that the population is good enough, what can we achieve in terms of inference?
14. We’re using the sample mean for the estimator. It is a random variable, that is, it can be a different
number every time we take a sample. How can we be comfortable using such a thing to estimate
anything?
For finite populations, the population is expressed as a collection of numbers, $x_1, x_2, \dots, x_N$. The mean is
$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$, and the variance is $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$.
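As a quick check of the two formulas for a finite population, here is a minimal Python sketch. The five population values are made up purely for illustration; the point is only that both the mean and the variance divide by N, the population size, and that the hand-written formulas agree with the standard library's population variance.

```python
from statistics import pvariance

# Hypothetical finite population of N = 5 values (illustrative data only).
population = [2.0, 4.0, 4.0, 6.0, 9.0]

N = len(population)

# Population mean: add up every member and divide by N.
mu = sum(population) / N

# Population variance: average squared distance from mu, again dividing by N.
sigma2 = sum((x - mu) ** 2 for x in population) / N

print("mu =", mu)
print("sigma^2 =", sigma2)

# statistics.pvariance uses the same divide-by-N definition.
print("pvariance =", pvariance(population))
```

Note that this only works because we can list the whole population, which is exactly what we usually cannot do.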
For infinite populations, it is expressed as $x_1, x_2, \dots, x_n, \dots$, that is, there is no last element. The
computing formulas for the finite case cannot be used. Instead, we assume that there is a random variable $X$
possessing the population distribution. Then $\mu = E[X]$ and $\sigma^2 = \sigma^2\{X\}$, and they, too, are fixed numbers.
We say that we use $\bar{X}$ to estimate $\mu$. What we actually mean is the following:
First, we determine the statistic to use for the estimator.
Then we will take a random sample and calculate the said statistic.
We then use the calculated value for the estimate of the target parameter.
So you see, an estimator is a tool, a formula, a means for the job.
After obtaining the sample and calculating the value of the statistic, that is, of the estimator, the number
calculated is an estimate. We say that it is a realization of the process.
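The difference between an estimator and an estimate can be sketched in Python. Everything here is an assumption made for illustration: the population is simulated (10,000 values, roughly normal with mean 50 and standard deviation 10), the sample size is 25, and the function name `sample_mean` is ours. In this toy setting we happen to know the population mean, so we can watch how the estimates behave.

```python
import random
from statistics import fmean, stdev

def sample_mean(sample):
    """The estimator: a rule applied to whatever sample we happen to draw."""
    return fmean(sample)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population. In real work we would NOT get to see this list.
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]
mu = fmean(population)

# One random sample gives one estimate: a single realization of the estimator.
one_sample = rng.sample(population, 25)
estimate = sample_mean(one_sample)
print("one estimate:", estimate, "target mu:", mu)

# Repeating the process gives a different estimate each time,
# but the estimates cluster around the population mean.
estimates = [sample_mean(rng.sample(population, 25)) for _ in range(1000)]
print("average of 1000 estimates:", fmean(estimates))
print("spread of the estimates:", stdev(estimates))
```

The estimator is the function; each number it returns is an estimate. The spread of the estimates is much smaller than the spread of the population itself, which is one reason we can be comfortable using a random quantity to estimate a fixed one.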
Population                      Sample
$x_1, x_2, \dots, x_N$          $X_1, X_2, \dots, X_n$
$\mu$ and $\sigma^2$            $\bar{X}$ and $S^2$
2 Each entry in a table of random digits has probability 0.1 of being a 0, and the digits are independent of
one another. Each line in a specific table of random digits, say, contains 40 random digits. What are the
mean μ and standard deviation σ of the number of 0’s in a randomly selected line?
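One way to approach this exercise, offered as a sketch rather than the official solution: call the number of 0's in a line $Y$ (a symbol introduced here). Since the 40 digits are independent and each is a 0 with probability 0.1, $Y$ is a binomial count, and the standard binomial formulas for the mean and standard deviation apply.

```latex
Y \sim \mathrm{Binomial}(n = 40,\ p = 0.1), \qquad
\mu = np = 40 \times 0.1 = 4, \qquad
\sigma = \sqrt{np(1-p)} = \sqrt{40 \times 0.1 \times 0.9} = \sqrt{3.6} \approx 1.897
```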
3 The weight of tomatoes chosen at random from a bin at the farmer’s market follows a Normal
distribution with mean μ = 12 ounces and standard deviation σ = 2.5 ounces. Suppose we pick four
tomatoes at random from the bin and find their total weight T. The random variable T is