
STT251 Lecture-01

The document introduces the concept of sampling distribution, which is the probability distribution of a statistic derived from repeated random samples of a fixed size from a population. It explains how to calculate sample statistics such as means and variances, and discusses the importance of sampling distributions in inferential statistics. Additionally, it outlines various types of sampling distributions, including those for means, differences between means, and proportions.

Uploaded by

RA Shajid
Copyright © All Rights Reserved

Chapter # 01 Sampling Distribution

Introduction
Sampling distribution of a statistic may be defined as the probability law, which the statistic
follows, if repeated random samples of a fixed size are drawn from a specified population.
Let us consider a random sample x1, x2, ..., xn of size n drawn from a population containing N units.
Let us further suppose that we are interested in the sampling distribution of the statistic x̄ (i.e.,
the sample mean), where

\[ \bar{x} = \frac{1}{n}\,(x_1 + x_2 + \cdots + x_n). \]
If the population size N is finite, there is a finite number (say k) of possible ways of drawing n units
in the sample out of a total of N units in the population; for sampling without replacement,
k = C(N, n). Although the k samples are distinct, their means need not all be different. Under simple
random sampling, however, each of the k samples occurs with equal probability. Thus, we can construct
a table showing the set of possible values of the statistic x̄ and the probability that x̄ takes each
of these values. This probability distribution of the statistic x̄ is called the 'sampling
distribution' of the sample mean. The method is quite general: the sampling distribution of any other
statistic, say the median or standard deviation of the sample, may be obtained in the same way.

What is a Sampling Distribution?


A sampling distribution is the probability distribution of a statistic computed from random samples
of a given population. Also known as a finite-sample distribution, it describes how the values of the
statistic are spread across the possible samples from that population.

Sampling Distribution | Lecture #1
Subject: STT251: Sampling Technique
Lectured by Md. Kaderi Kibria, STAT, HSTU

The sampling distribution depends on multiple factors: the statistic, the sample size, the sampling
process, and the underlying population. It is used to quantify how sample statistics such as means,
ranges, variances, and standard deviations vary from one sample to another.

How Does it Work?


1. Select a random sample of a specific size from a given population.
2. Calculate a statistic for the sample, such as the mean, median, or standard deviation.
3. Repeat steps 1 and 2 many times, then develop a frequency distribution of the sample statistics
calculated.
4. Plot the frequency distribution of the sample statistics developed in the step above. The
resulting graph approximates the sampling distribution.
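The four steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `simulate_sampling_distribution`, the choice of a fair die as the population, sampling with replacement, and the number of repetitions are all hypothetical choices that do not come from the lecture.

```python
import random
import statistics
from collections import Counter


def simulate_sampling_distribution(population, sample_size, num_samples, statistic, seed=0):
    """Approximate the sampling distribution of `statistic` by repeated sampling."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    values = []
    for _ in range(num_samples):
        # Step 1: draw a random sample (with replacement) of the given size.
        sample = rng.choices(population, k=sample_size)
        # Step 2: compute the statistic for this sample.
        values.append(statistic(sample))
    # Step 3: tabulate the relative frequency of each value of the statistic.
    counts = Counter(values)
    return {value: count / num_samples for value, count in sorted(counts.items())}


die = [1, 2, 3, 4, 5, 6]  # assumed population: faces of a fair die
dist = simulate_sampling_distribution(
    die, sample_size=2, num_samples=20000, statistic=statistics.fmean
)

# Step 4: a crude text "plot" of the relative frequencies.
for value, freq in dist.items():
    print(f"{value:>4}  {'#' * round(freq * 100)}")
```

With more repetitions the relative frequencies settle closer to the exact probabilities; the simulation only approximates the sampling distribution, it does not compute it exactly.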

A fair die is thrown infinitely many times, with the random variable X = number of spots on any throw.
The probability distribution of X is:

X       1     2     3     4     5     6
P(X)   1/6   1/6   1/6   1/6   1/6   1/6

The mean and variance of X are:

\[ \mu = \sum x\,P(x) = \frac{1+2+3+4+5+6}{6} = 3.5 \]
\[ \sigma^2 = \sum (x-\mu)^2\,P(x) = \frac{35}{12} \approx 2.92 \]

A sampling distribution is created by looking at all samples of size n = 2 (i.e., two dice) and their
means.

While there are 36 possible samples of size 2, there are only 11 distinct values of x̄, and some
(e.g., x̄ = 3.5) occur more frequently than others (e.g., x̄ = 1).


The sampling distribution of x̄ is shown below:

x̄       P(x̄)
1.0     1/36
1.5     2/36
2.0     3/36
2.5     4/36
3.0     5/36
3.5     6/36
4.0     5/36
4.5     4/36
5.0     3/36
5.5     2/36
6.0     1/36

[Figure: bar chart of P(x̄) against x̄ = 1.0, 1.5, ..., 6.0]

Compare the distribution of X with that of x̄:

[Figure: distribution of X over the values 1 to 6, beside the distribution of x̄ over the values
1.0 to 6.0]

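As well, note that:

\[ \mu_{\bar{x}} = \mu = 3.5, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{2} = \frac{35}{24} \approx 1.46, \]

where σ² = 35/12 is the variance of a single throw.

We can generalize the mean and variance of the sampling distribution for two dice:

\[ \mu_{\bar{x}} = \mu, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{2} \]

and to n dice:

\[ \mu_{\bar{x}} = \mu, \qquad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \]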
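The two-dice and n-dice statements can be checked by brute-force enumeration. The sketch below lists every possible outcome of n fair dice and uses exact fractions; the helper names `dice_mean_distribution` and `mean_and_variance` are illustrative and not part of the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def dice_mean_distribution(n):
    """Exact sampling distribution of the mean of n fair dice."""
    faces = range(1, 7)
    # Enumerate all 6**n equally likely outcomes and record each sample mean.
    counts = Counter(Fraction(sum(throw), n) for throw in product(faces, repeat=n))
    total = 6 ** n
    return {mean: Fraction(count, total) for mean, count in sorted(counts.items())}


def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var


one = dice_mean_distribution(1)
two = dice_mean_distribution(2)
mu, var = mean_and_variance(one)
mu2, var2 = mean_and_variance(two)

print(mu, var)    # mean and variance of a single throw
print(mu2, var2)  # mean and variance of the mean of two throws
```

Enumeration grows as 6^n, so this check is practical only for small n; it is meant to confirm the pattern, not to replace the general formula.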


The standard deviation of the sampling distribution is called the standard error:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

Example 1.1: Suppose we have a population of N = 4 business firms and we want to find the average
income of these firms. The incomes (in lakhs) are 100, 200, 300 and 400, so the population mean
income is 250 lakhs. We use this situation to illustrate how sample means differ from the population
mean. Suppose we select a sample of n = 2 observations in order to estimate the population mean μ.
There are C(4,2) = 6 possible samples of size 2, and we randomly select one of them. We now calculate
the means of these 6 different samples. The six samples and their means are given in the following
table.

Sample    Sample elements (Xi)    Sample mean (x̄)
1         100, 200                150
2         100, 300                200
3         100, 400                250
4         200, 300                250
5         200, 400                300
6         300, 400                350

From the table above, each sample has a different mean, with the exception of the third and fourth
samples, whose mean (250) equals the population mean. Therefore, four of the six samples will result
in some error in the estimation process. This sampling error is the difference between the population
mean μ and the sample mean we use to estimate it. Let us now consider the possible sample means and
calculate their probabilities. We assume that each sample is equally likely to be chosen, so the
probability of selecting any one sample is 1/6.
We then list every possible sample mean and its probability in a table.

Sample mean (x̄)    Number of samples yielding x̄    Probability P(x̄)
150                1                               1/6
200                1                               1/6
250                2                               2/6
300                1                               1/6
350                1                               1/6

The table obtained above is called the sampling distribution of the mean.
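The sampling distribution in Example 1.1 can be reproduced by listing all samples explicitly. This is a minimal sketch, assuming simple random sampling without replacement as in the example; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

incomes = [100, 200, 300, 400]  # population from Example 1.1 (in lakhs)
n = 2                           # sample size

# All C(4, 2) = 6 samples of size 2 drawn without replacement.
samples = list(combinations(incomes, n))
means = [Fraction(sum(sample), n) for sample in samples]

# Each sample is equally likely, so P(mean) = (number of samples with that mean) / 6.
counts = Counter(means)
sampling_dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m, p in sampling_dist.items():
    print(m, counts[m], p)  # sample mean, number of samples, probability

# The mean of the sampling distribution coincides with the population mean.
expected_mean = sum(m * p for m, p in sampling_dist.items())
print(expected_mean)
```

Exact fractions are used so that the probabilities print in reduced form (2/6 appears as 1/3) and no rounding error enters the comparison with the population mean.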


Types of Sampling Distributions


Here is a brief description of the types of sampling distributions:
Sampling Distribution of the Mean
For sufficiently large samples, the sampling distribution of the mean is approximately normal, and its
centre is the mean of the sampling distribution, which equals the mean of the overall population. To
construct it, the researcher computes the mean of each sample and tabulates those means.
Let X̄ be the sample mean of a random sample of size n drawn from a population having mean µ
and standard deviation σ. Then the mean of X̄ is

\[ \mu_{\bar{X}} = \mu \]

and the standard deviation of X̄ is

\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}. \]

Sampling Distribution of the Difference between Two Means


Suppose that we have two populations, the first with mean µ1 and standard deviation σ1, and the
second with mean µ2 and standard deviation σ2. We take a random sample of size n1 from the first
population and measure some variable X1, and take an independent random sample of size n2 from
the second population and measure some variable X2.
By the Central Limit Theorem, we know that, if n1 and n2 are sufficiently large, approximately

\[ \bar{X}_1 \sim N\!\left(\mu_1,\ \frac{\sigma_1}{\sqrt{n_1}}\right) \]

and

\[ \bar{X}_2 \sim N\!\left(\mu_2,\ \frac{\sigma_2}{\sqrt{n_2}}\right). \]

It can also be shown that

\[ \bar{X}_1 - \bar{X}_2 \sim N\!\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right), \]

where the second argument of N denotes the standard deviation.
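The mean and standard deviation in the last result follow from the independence of the two samples. A short derivation, assuming only that X̄1 and X̄2 are independent and have the means and standard deviations stated above:

```latex
\begin{align*}
E(\bar{X}_1 - \bar{X}_2) &= E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2, \\
\operatorname{Var}(\bar{X}_1 - \bar{X}_2)
  &= \operatorname{Var}(\bar{X}_1) + \operatorname{Var}(\bar{X}_2)
   = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, \\
\sigma_{\bar{X}_1 - \bar{X}_2}
  &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.
\end{align*}
```

The variances add even though the means are subtracted, because Var(−X̄2) = Var(X̄2) and the covariance term vanishes for independent samples.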

Sampling Distribution of Proportion


Here a sample is drawn from the overall population and the proportion of sample units having a
particular characteristic is computed. The mean of these sample proportions, taken over all possible
samples, equals the proportion in the population. The population parameter in this case is the
proportion, which we denote by π. In general, for a finite population, we define the population
proportion π as

\[ \pi = \frac{k}{N} \]

where k is the number of observations in the population that fall in a particular category and N is
the total number of observations. When the population is very large, we may take samples to study the
population, and for each sample we calculate the sample proportion p̂ as

\[ \hat{p} = \frac{s}{n} \]

where s denotes the number of observations in the sample that have the characteristic under study
and n is the sample size.

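If the sample size is much smaller than the size of a population with proportion p of successes (the
population proportion, denoted π above), then the mean and standard deviation of p̂ are:

\[ \mu_{\hat{p}} = p \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]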
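These two formulas can be checked numerically. The sketch below assumes the number of successes in the sample follows a binomial distribution, which is the usual model when the sample is small relative to the population; the function name `phat_mean_and_sd` and the values n = 50 and p = 0.3 are hypothetical.

```python
from math import comb, sqrt


def phat_mean_and_sd(n, p):
    """Exact mean and standard deviation of p-hat = S/n when S ~ Binomial(n, p)."""
    # Binomial probabilities P(S = s) for s = 0, 1, ..., n.
    pmf = [comb(n, s) * p**s * (1 - p) ** (n - s) for s in range(n + 1)]
    mean = sum((s / n) * prob for s, prob in enumerate(pmf))
    var = sum((s / n - mean) ** 2 * prob for s, prob in enumerate(pmf))
    return mean, sqrt(var)


n, p = 50, 0.3  # hypothetical sample size and population proportion
mean, sd = phat_mean_and_sd(n, p)

print(mean, p)                     # mean of p-hat next to p
print(sd, sqrt(p * (1 - p) / n))   # sd of p-hat next to sqrt(p(1-p)/n)
```

The two numbers on each printed line should agree up to floating-point rounding. When the sample is not small relative to the population, the binomial assumption no longer holds and the standard deviation formula needs adjusting.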

The t-distribution, chi-square distribution and F-distribution are also sampling distributions.

Why Do We Study Sampling Distributions?


Sample statistics form the basis of all inferences drawn about populations. Thus, sampling
distributions are of great value in inferential statistics. The sampling distribution of a sample
statistic possesses well-defined properties, which help lay down rules for making generalizations
about a population on the basis of a single sample drawn from it. The variations in the value of a
sample statistic not only determine the shape of its sampling distribution, but also account for the
element of error in statistical inference. If we know the probability distribution of the sample
statistic, then we can calculate the risks (error due to chance) involved in making generalizations
about the population. With the help of the properties of the sampling distribution of a sample
statistic, we can calculate the probability that the statistic assumes a particular value or has a
value in a given interval. This ability to calculate the probability that the sample statistic lies
in a particular interval is the most important factor in all statistical inferences.

