Statistics For Economists Lecture V

The document discusses sampling and sampling distributions, highlighting the importance of representative samples for statistical analysis. It explains the central limit theorem, which states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases. Additionally, it covers the use of the t-distribution for estimating population means when the population standard deviation is unknown.

5-1

Sampling and Sampling Distributions

By:

Amsalu B. (MSc.)

Lecturer, Department of Economics

Wolkite University

Email: [email protected]

Wolkite, Addis Ababa, Ethiopia


5-2
Sampling and Sampling Distributions
• Sampling is the process by which a representative sample, or
portion of the elements of a population or process, is selected
and then analyzed.
• The sampling distribution of x̄ is the probability distribution of
all possible values of the sample mean.
• More generally, a sampling distribution is the distribution of the
values of a statistic across random samples drawn from a given
population.
• When the sample means are normally distributed, the normal
distribution can be used to describe the characteristics
(properties) of the sampling distribution.
5-3
Cont’d
• The properties of the sampling distribution help to frame rules
for making statistical inferences about a population on the basis
of a single sample drawn from it, that is, without repeating the
sampling process.
• When the population has a normal distribution, the normal
probability distribution provides a useful description of the
distribution of population values.
• The population distribution is the distribution of the values of
its members and has mean µ, variance σ² and standard
deviation σ.
5-4
Sampling Distribution of Mean
• Sampling distribution of the mean from a normal population
• For any given sample of size n taken from a population with
mean µ and standard deviation σ, the mean and standard deviation
of the sampling distribution of the sample mean x̄ are given,
respectively, by:
    µx̄ = E(x̄) = µ
    σx̄ = σ/√n
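As a small illustration of the standard deviation of the sample mean (the standard error, σ/√n), the sketch below shows how it shrinks as the sample size grows. The function name `standard_error` and the value σ = 16 are chosen here for illustration only.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)

sigma = 16.0  # illustrative population standard deviation (an assumption)
for n in (4, 16, 64):
    # Quadrupling n halves the standard error.
    print(n, standard_error(sigma, n))
```

Each fourfold increase in n cuts the standard error in half, which is why larger samples give tighter sampling distributions.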
5-5
Cont’d

5-6
Cont’d

5-7
Central Limit Theorem
• In selecting random samples of size n from a population, the
sampling distribution of the sample mean can be approximated
by a normal distribution as the sample size becomes large.
• The central limit theorem is used to solve problems involving
sample means for large samples.
• As the sample size n increases without limit, the shape of the
distribution of the sample means taken with replacement from a
population with mean µ and standard deviation σ will approach a
normal distribution.
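The central limit theorem can be checked informally by simulation. The sketch below assumes an illustrative population that takes the values 1 to 8 with equal probability (not data from the lecture), draws many samples with replacement, and compares the spread of the sample means with σ/√n. The helper name `sample_means` and the number of trials are assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

def sample_means(n, trials=5000):
    """Return the means of `trials` random samples of size n.

    Each observation is drawn with replacement from the values 1..8,
    all equally likely (an illustrative population).
    """
    return [
        statistics.fmean(random.randint(1, 8) for _ in range(n))
        for _ in range(trials)
    ]

population_mean = 4.5                      # (1 + 8) / 2
population_sd = ((8**2 - 1) / 12) ** 0.5   # sd of a discrete uniform on 1..8

for n in (2, 30):
    means = sample_means(n)
    print(
        n,
        round(statistics.fmean(means), 3),    # should be close to population_mean
        round(statistics.stdev(means), 3),    # should be close to the next value
        round(population_sd / n ** 0.5, 3),   # theoretical standard error
    )
```

A histogram of `sample_means(30)` would look roughly bell-shaped even though the population itself is flat.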
5-8
Cont’d
• It’s important to remember two things when you use the
central limit theorem:
  • When the original variable is normally distributed, the
  distribution of the sample means will be normally distributed
  for any sample size n.
  • When the distribution of the original variable might not be
  normal, a sample size of 30 or more is needed to use a
  normal distribution to approximate the distribution of the
  sample means. The larger the sample, the better the
  approximation will be.
5-9
Cont’d

• Comparing the population distribution and the sampling
distribution of the mean:
• The sampling distribution is more bell-shaped and symmetric.
• Both have the same center.
• The sampling distribution of the mean is more compact, with a
smaller variance.

[Figure: two panels. Top: "Uniform Distribution (1,8)", P(X) plotted against X. Bottom: "Sampling Distribution of the Mean", P(X) plotted against X.]
5-10

Cont’d
[Figure: four population shapes (Normal, Uniform, Skewed, General) with the corresponding sampling distributions of the sample mean for n = 2 and n = 30.]

The Central Limit Theorem Applies to Sampling Distributions from Any Population
5-11
Cont’d
• When the data values are evenly distributed about the mean, a
distribution is said to be a symmetric distribution.
• When the majority of the data values fall to the left or right of the
mean, the distribution is said to be skewed.
• When the majority of the data values fall to the right of the mean,
the distribution is said to be a negatively or left-skewed
distribution.
• In a left-skewed distribution, the mean is to the left of the
median, and both the mean and the median are to the left of the
mode.
• When the majority of the data values fall to the left of the mean, a
distribution is said to be a positively or right-skewed distribution.
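The ordering of mean, median and mode under skewness can be illustrated with two small data sets. Both lists below are hypothetical values made up for this sketch.

```python
import statistics

# Hypothetical left-skewed data: one low value pulls the mean down.
left_skewed = [1, 6, 7, 8, 8, 9, 9, 9, 10]
# Hypothetical right-skewed data: one high value pulls the mean up.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 10]

for name, data in (("left", left_skewed), ("right", right_skewed)):
    mean = statistics.fmean(data)
    median = statistics.median(data)
    mode = statistics.mode(data)
    print(name, round(mean, 2), median, mode)
```

In the left-skewed list the mean falls below the median, which falls below the mode; in the right-skewed list the order is reversed.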
5-12
Cont’d
• In particular, if the sampling distribution of x̄ is normal, the
standard error of the mean σx̄ can be used in conjunction with
the normal distribution to determine the probabilities of various
values of the sample mean.
• For this purpose, the value of the sample mean x̄ is first
converted into a value z on the standard normal distribution, to
show how far any single sample mean deviates from the mean µx̄
of the sample means, by using the formula:
    z = (x̄ − µx̄)/σx̄ = (x̄ − µ)/(σ/√n)
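A minimal sketch of the conversion of a sample mean to a z value, z = (x̄ − µ)/(σ/√n). The function name `z_for_sample_mean` and the numbers used (µ = 50, σ = 10, n = 25, x̄ = 53) are hypothetical.

```python
import math

def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean using the standard error sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error

# Hypothetical values: the standard error is 10 / 5 = 2, so z = 3 / 2.
print(z_for_sample_mean(x_bar=53, mu=50, sigma=10, n=25))  # 1.5
```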
5-13
Z- Standard Normal Distribution Table (Right)
5-14
Z- Standard Normal Distribution Table (Left)
5-15
Cont’d
• Example 1: A survey found that women spend on average $146.21
on beauty products during the summer months. Assume the
standard deviation is $29.44. Find the percentage of women
who spend less than $160.00. Assume the variable is normally
distributed.
• Solution:
    z = (x − µ)/σ = (160.00 − 146.21)/29.44 ≈ 0.47
• Hence $160.00 is 0.47 of a standard deviation above the mean of
$146.21, as shown in the z distribution.
5-16
Cont’d

• The area under the curve to the left of z = 0.47 is 0.6808.
• Therefore 0.6808, or 68.08%, of the women spend less than
$160.00 on beauty products during the summer months.
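The table lookup in Example 1 can be reproduced with Python's standard library. This is a sketch: `NormalDist` from the `statistics` module stands in for the printed z table, and z is rounded to two decimals to mimic the table lookup.

```python
from statistics import NormalDist

mu, sigma, x = 146.21, 29.44, 160.00

# Round z to two decimals, as one would before reading a z table.
z = round((x - mu) / sigma, 2)
area_left = NormalDist().cdf(z)  # area under the standard normal curve left of z

print(z, round(area_left, 4))  # 0.47 0.6808

# Without rounding z, the area is slightly different (about 0.680).
exact_area = NormalDist(mu, sigma).cdf(x)
```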
5-17
Cont’d
• Example 2: The average age of a vehicle registered in the US is 8
years, or 96 months. Assume the standard deviation is 16
months. If a random sample of 36 vehicles is selected, find the
probability that the mean of their age is between 90 and 100
months.
• Solution: Since the sample size is 30 or larger, the normality
assumption is not necessary.
5-18
Cont’d
5-19
Cont’d
• With µ = 96, σ = 16 and n = 36, the z values are
z = (90 − 96)/(16/√36) = −2.25 and z = (100 − 96)/(16/√36) = 1.50.
• To find the area between the two z values of −2.25 and 1.50, look
up the corresponding areas in the table and subtract the smaller
from the larger.
• The area for z = −2.25 is 0.0122, and the area for z = 1.50 is
0.9332.
• Then the area between the two values is 0.9332 − 0.0122 = 0.9210,
or 92.1%.
• Hence, the probability of obtaining a sample mean between 90
and 100 months is 92.1%; that is, P(90 < x̄ < 100) = 92.1%.
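The steps of Example 2 can be sketched in Python, with `statistics.NormalDist` standing in for the z table (an assumption of this sketch; the lecture uses printed tables).

```python
import math
from statistics import NormalDist

mu, sigma, n = 96, 16, 36          # values from Example 2 (months)
se = sigma / math.sqrt(n)          # standard error of the mean

z_low = (90 - mu) / se             # about -2.25
z_high = (100 - mu) / se           # about 1.50

std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(prob, 3))
```

The computed probability agrees with the table-based answer of about 92.1%.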
5-20
Margin of Error and the Interval Estimate
• Interval estimate of a population mean: σ known
    x̄ ± zα/2 · σ/√n
• where (1 − α) is the confidence coefficient and zα/2 is the z value
providing an area of α/2 in the upper tail of the standard normal
probability distribution.
• Note that:
  • A narrower confidence interval is more precise.
  • Larger samples give more precise estimates.
  • Smaller variance leads to more precise estimates.
  • Lower confidence coefficients allow us to construct more precise
  estimates.
5-21
Cont’d
• Values of zα/2 for the most commonly used confidence
levels:

    Confidence level    α      α/2     zα/2
    90%                 .10    .05     1.645
    95%                 .05    .025    1.960
    99%                 .01    .005    2.576

• Comparing the results for the 90%, 95%, and 99% confidence
levels, we see that in order to have a higher degree of confidence,
the margin of error and thus the width of the confidence interval
must be larger.
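The z values for the common confidence levels can be computed rather than read from a table. In this sketch, `NormalDist().inv_cdf` from the standard library returns the z value with a given area to its left; the helper name `z_critical` is an assumption.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z value leaving an area of alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

The critical value grows with the confidence level, which is why higher confidence means a wider interval for the same data.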
5-22
Cont’d
• Example: Assume a known population standard deviation of
σ = $20, a sample size of n = 100, and a sample mean of
x̄ = $82. Solution:
5-23
Cont’d
• Let us use the expression x̄ ± zα/2 · σ/√n to construct a 95% confidence
interval. For a 95% confidence interval, the confidence coefficient is
(1 − α) = .95 and thus α = .05. Using the standard normal probability table,
an area of α/2 = .05/2 = .025 in the upper tail provides z.025 = 1.96. With
the sample mean x̄ = 82, σ = 20, and a sample size n = 100, we obtain
82 ± 1.96 · 20/√100 = 82 ± 3.92.
• Thus, the margin of error is 3.92 and the 95% confidence interval is
82 − 3.92 = 78.08 to 82 + 3.92 = 85.92.
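The interval in this example can be reproduced with a short function. This is a sketch under the σ-known assumption; the name `z_interval` is illustrative, and the limits are the sample mean minus and plus the margin of error.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin

lower, upper, margin = z_interval(x_bar=82, sigma=20, n=100)
print(round(lower, 2), round(upper, 2), round(margin, 2))  # 78.08 85.92 3.92
```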
5-24
Student’s t Distribution
• When s is used to estimate σ, the margin of error and the
interval estimate for the population mean are based on a
probability distribution known as the t distribution.
• Although the mathematical development of the t distribution is
based on the assumption of a normal distribution for the
population we are sampling from, research shows that the t
distribution can be successfully applied in many situations where
the population deviates significantly from normal.
• The t distribution is a family of similar probability
distributions, with a specific t distribution depending on a
parameter known as the degrees of freedom.
5-25
Cont’d
5-26
Cont’d
• The t distribution is a family of bell-shaped and symmetric
distributions, one for each number of degrees of freedom.
• The expected value of t is 0.
• The variance of t is greater than 1, but approaches 1 as the number
of degrees of freedom increases.
• The t distribution is flatter and has fatter tails than the
standard normal.
• The t distribution approaches the standard normal as the number of
degrees of freedom increases.
5-27
Cont’d
• The t-distribution is symmetrical, bell-shaped and has zero as its
mean. The interval estimate of the population mean for the
small-sample case with σ unknown is given by:
    x̄ ± tα/2 · s/√n
• where s is the sample standard deviation, (1 − α) is the
confidence coefficient, and tα/2 is the t-value providing an area
of α/2 in the upper tail of a t-distribution with n − 1 degrees of
freedom.
5-28

Cont’d
• Each sampling distribution is a discrete distribution, because the
value of the sample mean varies from sample to sample.
• This variability serves as the basis for the random sampling
distribution.
• The sample standard deviation s, which measures the variability
among the sample values, is considered a good approximation of
the population standard deviation σ.
• The reason the number of degrees of freedom associated with the
t value in the interval expression is n − 1 concerns the use of s as
an estimate of the population standard deviation σ.
• The expression for the sample standard deviation is
    s = √( Σ(xi − x̄)² / (n − 1) )
5-29

Cont’d

• Where:
  • n is the size of the sample.
  • Using n − 1 rather than n in the denominator results in a larger
  value of s than dividing by n would give. Here n − 1 is also known
  as the degrees of freedom.
  • The number of degrees of freedom, df = n − 1, indicates the
  number of values that are free to vary in a random sample.
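The effect of dividing by n − 1 instead of n can be seen on a small data set. The five values below are hypothetical, and `sample_std` is an illustrative helper; `statistics.stdev` and `statistics.pstdev` are the standard-library equivalents for the n − 1 and n divisors respectively.

```python
import math
import statistics

def sample_std(values):
    """Sample standard deviation with n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - 1))

data = [4, 8, 6, 5, 7]  # hypothetical observations, for illustration only

print(round(sample_std(data), 4))          # divisor n - 1
print(round(statistics.pstdev(data), 4))   # divisor n, always smaller
```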
5-30
Cont’d
• Interval estimate of a population mean (σ unknown):
    x̄ ± tα/2 · s/√n
• where s is the sample standard deviation, (1 − α) is the confidence
coefficient, and tα/2 is the t value providing an area of α/2 in the
upper tail of the t distribution with n − 1 degrees of freedom.
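A minimal sketch of the t-based interval x̄ ± tα/2 · s/√n. Python's standard library has no t distribution, so the critical value is passed in as read from a t table; the data are hypothetical, and 2.776 is the tabulated t value for an upper-tail area of .025 with 4 degrees of freedom.

```python
import math
import statistics

def t_interval(values, t_crit):
    """Interval estimate of the mean using s and a tabulated t value."""
    n = len(values)
    x_bar = statistics.fmean(values)
    s = statistics.stdev(values)          # uses n - 1 in the denominator
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

data = [4, 8, 6, 5, 7]   # hypothetical sample, n = 5, so df = 4
t_crit = 2.776           # t table value for alpha/2 = .025, df = 4

low, high = t_interval(data, t_crit)
print(round(low, 3), round(high, 3))
```

Passing the critical value explicitly keeps the sketch dependency-free; a statistics package with a t distribution could supply it instead.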
5-31
t-distribution table 1
5-32
t-distribution table 2 (continued)
5-33
t-distribution table 3 (continued)
5-34
Cont’d
• Example 1: Training time in days for a sample of 20 ABC
industries employees

• Solution:
5-35
Cont’d

5-36
Determining the Sample Size

Reading Assignment!!!
5-37

THANKS!!!
