STT251 Lecture-02
where N = population size = the total number of individuals, n = sample size = number of
individuals selected in the random sample, and σ = standard deviation of the individuals in the
population (so σ² is the population variance).
We use the above formula for computing SE(x̄) when N is not very large relative to n. When N is
large relative to n, we use the formula
$$SE(\bar{x}) = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$$
This formula can also be used for calculating SE(x̄) for an infinite population.
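The link between the two formulas can be written out explicitly. The following is a short sketch, assuming the finite-population formula referred to above is the one applied in Example 2.2(ii), i.e. the simple formula multiplied by the correction factor (N − n)/(N − 1) under the square root:

```latex
SE(\bar{x}) = \sqrt{\frac{N-n}{N-1}} \cdot \frac{\sigma}{\sqrt{n}},
\qquad
\frac{N-n}{N-1} \longrightarrow 1 \quad \text{as } N \to \infty \text{ with } n \text{ fixed},
\qquad \text{so} \qquad
SE(\bar{x}) \longrightarrow \frac{\sigma}{\sqrt{n}} .
```

This is why the simpler formula is adequate when N is large relative to n and for an infinite population: the correction factor is then practically equal to 1.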
Example 2.2: The U.S. Bureau of the Census wishes to estimate the birth rates per 1,00,000 people in
the nation's 100 largest cities. It is known that the standard deviation of the birth rates for
these 100 urban centres is 12 births per 1,00,000 people.
(a) Calculate the variance and standard error of the sampling distribution of the sample mean for
samples of (i) n = 8 cities and (ii) n = 15 cities.
(b) Compare the values obtained in the two cases.
Solution:
(a) (i) Here N = 100, n = 8 and the population standard deviation is σ = 12. Since n/N = 0.08 is
less than 0.1, we use the formula for the variance without the finite population correction:
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n} = \frac{12^2}{8} = 18$$
and the standard error is $\sigma_{\bar{x}} = \sqrt{18} = 4.24$.
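(ii) In this case N = 100 and n = 15, so n/N = 0.15 is greater than 0.1. Also, σ = 12.
Therefore we use the formula for the variance with the finite population correction:
$$\sigma^2_{\bar{x}} = \left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n} = \left(\frac{100-15}{100-1}\right)\frac{12^2}{15} = 8.24$$
Hence, $\sigma_{\bar{x}} = \sqrt{8.24} = 2.87$.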
(b) Comparing the two values, we observe that the larger sample has a smaller standard error
and will tend to result in less sampling error in estimating the birth rates in the 100 cities.
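The arithmetic in Example 2.2 can be checked with a few lines of Python. This is a minimal sketch, not part of the original solution: the function name `variance_of_mean` is chosen here for illustration, and the n/N > 0.1 threshold for applying the finite population correction is the rule used in the solution above.

```python
import math

def variance_of_mean(sigma, n, N=None):
    """Variance of the sample mean.

    Applies the finite population correction (N - n) / (N - 1)
    only when N is given and n / N exceeds 0.1, as in Example 2.2.
    """
    var = sigma ** 2 / n
    if N is not None and n / N > 0.1:
        var *= (N - n) / (N - 1)
    return var

sigma, N = 12, 100  # population standard deviation and population size
for n in (8, 15):
    var = variance_of_mean(sigma, n, N)
    print(f"n = {n}: variance = {var:.2f}, SE = {math.sqrt(var):.2f}")
# n = 8: variance = 18.00, SE = 4.24
# n = 15: variance = 8.24, SE = 2.87
```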
determine the mean of each sample, the distribution of these sample means will tend to be described
by the normal probability distribution with a mean μ and variance σ²/n.
More formally, the CLT tells us that if x has mean μ and variance σ², then, for sufficiently
large n,
$$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{approximately.}$$
In other words, the sampling distribution of the sample mean approaches a normal distribution:
the sampling distribution of the mean of a random sample drawn from any population is
approximately normal for a sufficiently large sample size.
The central limit theorem tells us that no matter the population distribution, the sampling
distribution’s shape will approach normality as the sample size (n) increases.
This is useful, as the researcher never knows which mean in the sampling distribution is the same as
the population mean, but by selecting many random samples from a population, the sample means
will cluster together, allowing the researcher to make a very good estimate of the population mean.
Thus, the sampling error will decrease as the sample size (n) increases.
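The behaviour described by the CLT can be seen in a small simulation. The sketch below is illustrative only: the exponential population (mean 1, variance 1), the sample sizes, the number of repetitions and the function name `sample_means` are all assumptions made here, not part of the lecture. It draws repeated samples from a right-skewed population and compares the mean and variance of the sample means with μ and σ²/n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, num_samples=5000):
    """Means of num_samples random samples of size n drawn from an
    exponential population with mean 1 and variance 1 (right-skewed)."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

for n in (2, 10, 50):
    means = sample_means(n)
    print(
        f"n = {n}: mean of sample means = {statistics.fmean(means):.3f}, "
        f"variance of sample means = {statistics.variance(means):.4f}, "
        f"sigma^2 / n = {1 / n:.4f}"
    )
```

For each n, the mean of the sample means should sit near μ = 1 and their variance near σ²/n, with the spread shrinking as n grows. A histogram of `means` would also look increasingly bell-shaped for larger n.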
Example 2.3: A statistician is analyzing the participation rate in a school’s math club. The school
has 6 students, and each student either participates in the math club (coded as 1) or does not
participate (coded as 0). The data is summarized as follows:
Student Participates in Math Club (1) / Does Not Participate (0)
A 1
B 1
C 0
D 1
E 0
F 1
$\mu_{\hat{p}} = 0.633$
The population proportion of students participating in the math club is p = 4/6 ≈ 0.67, which is
close to the mean of the sampling distribution, $\mu_{\hat{p}} = 0.633$. This illustrates that the
sample proportion p̂ is an unbiased estimator of the population proportion p: its sampling
distribution is centred on p.
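The unbiasedness claim can be checked by listing every possible sample from the six students. This is a sketch under stated assumptions: the sample size n = 2 and sampling without replacement are hypothetical choices made here, since the sample size behind the value 0.633 is not shown in this section. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

# Participation codes for students A-F, taken from the table in Example 2.3.
population = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 0, "F": 1}
n = 2  # assumed sample size (hypothetical; not stated in the text)

# Population proportion p.
p = Fraction(sum(population.values()), len(population))

# Sample proportion p-hat for every possible sample of n distinct students.
p_hats = [
    Fraction(sum(population[s] for s in sample), n)
    for sample in combinations(population, n)
]
mean_p_hat = sum(p_hats) / len(p_hats)

print(f"number of samples = {len(p_hats)}")       # number of samples = 15
print(f"p = {p}, mean of p-hat = {mean_p_hat}")   # p = 2/3, mean of p-hat = 2/3
```

Under these assumptions the mean of the sample proportions equals p = 2/3 exactly, which is what unbiasedness means. The value 0.633 quoted in the example comes from the lecture's own tabulation, which is not reproduced here.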