Session 3 Statlan
Third lecture
▪ Testing of single variance
▪ Testing of two variances
▪ Using Excel for testing of variance
▪ Inferences about $\sigma^2$ are based on the sample
variance $S^2$.
▪ These inferences are based on a new
distribution: the chi-square or $\chi^2$ distribution.
▪ The $\chi^2$ distribution is characterized by a family of
distributions, where each distribution depends
on its particular degrees of freedom df.
▪ It is common to use the notation $\chi^2_{df}$ when
referring to the chi-square distribution.
LO 11.1
▪ The Sampling Distribution of $S^2$.
• $\chi^2_{df}$ is the probability distribution of the
sum of df independent, squared
standard normal random variables.
• $S^2$ is based on the squared differences
between the sample values and the
sample mean.
▪ The Sampling Distribution of $S^2$ (continued).
• If a sample of size n is taken from a normal population with a finite
variance, then we can define the statistic
$$\chi^2_{df} = \frac{(n-1)S^2}{\sigma^2},$$
which follows the $\chi^2_{df}$ distribution with df = n − 1.
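A quick simulation can make this sampling distribution concrete. The sketch below is only an illustration: the population mean, standard deviation, sample size, seed, and number of repetitions are all assumed values, not taken from the lecture. It draws many normal samples, computes $(n-1)S^2/\sigma^2$ for each, and compares the simulated mean and variance with the known chi-square values (mean df, variance 2·df).

```python
import random

def scaled_sample_variance(n, mu, sigma, rng):
    """Draw n normal values and return (n - 1) * S^2 / sigma^2."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return (n - 1) * s2 / sigma ** 2

rng = random.Random(42)          # fixed seed so the run is repeatable
n, mu, sigma = 10, 5.0, 2.0      # illustrative values, not from the lecture
draws = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(20000)]

sim_mean = sum(draws) / len(draws)
sim_var = sum((d - sim_mean) ** 2 for d in draws) / (len(draws) - 1)

# A chi-square variable with df degrees of freedom has mean df and variance 2 * df.
print(f"df = {n - 1}, simulated mean = {sim_mean:.2f}, simulated variance = {sim_var:.2f}")
```

With n = 10 the simulated mean should land close to 9 and the simulated variance close to 18, up to simulation noise.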
▪ The Chi-Square Distribution.
• The shape of the $\chi^2_{df}$ distribution
depends on the degrees of
freedom, df = n − 1.
• As df grows larger, the $\chi^2_{df}$
distribution tends to the normal
distribution.
▪ Finding $\chi^2_{df}$ Values and Probabilities.
• $\chi^2_{\alpha,df}$ represents a value such
that the area in the right tail of
the distribution is equal to α.
• In other words, $P(\chi^2_{df} \ge \chi^2_{\alpha,df}) = \alpha$.
▪ The Chi-Square Table.
• Find $\chi^2_{\alpha,df}$ with α = 0.05 and df = 10.
• Use the chi-square table.
▪ Look at the first column labeled df and find the value 10.
▪ Continue along this row until reaching the column 0.050.
▪ $\chi^2_{0.05,10} = 18.307$.
▪ Left-tail Values.
• $\chi^2_{1-\alpha,df}$ represents a value such that the area in the left tail of the distribution
is α while the area in the right tail equals 1 − α.
• In other words, $P(\chi^2_{df} \ge \chi^2_{1-\alpha,df}) = 1 - \alpha$.
• For example, if df = 10 and we want to find the
value such that the area to its left under the $\chi^2_{10}$ distribution
equals 0.05, we find $\chi^2_{0.95,10} = 3.940$.
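The right-tail and left-tail table lookups can also be checked numerically. The following is a minimal sketch, not the lecture's method: it assumes an even df (as in the df = 10 examples), for which the right-tail area has a simple closed form, and it finds the critical value by bisection. The function names and the search bounds are illustrative choices.

```python
import math

def chi2_right_tail(x, df):
    """P(chi2_df >= x) for EVEN df, using the closed-form Poisson sum."""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total

def chi2_critical(alpha, df, lo=0.0, hi=200.0):
    """Value whose right-tail area is alpha, found by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_right_tail(mid, df) > alpha:
            lo = mid              # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0

right = chi2_critical(0.05, 10)   # right-tail area 0.05
left = chi2_critical(0.95, 10)    # right-tail area 0.95, so left-tail area 0.05
print(f"{right:.3f} {left:.3f}")
```

The two results should agree with the table values 18.307 and 3.940 quoted above to about three decimals.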
▪ Graph of the Probability α = 0.05 on Both Sides of $\chi^2_{1-\alpha,df}$.
▪ Confidence Interval for $\sigma^2$.
• A 100(1 − α)% confidence interval for the population variance $\sigma^2$ is
computed as
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2,df}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,df}}\right],$$
where df = n − 1.
• The formula is valid only when the random sample is drawn from a normally
distributed population.
LO 11.2
▪ Confidence Interval for $\sigma$.
• Since the standard deviation is just the positive square root of the variance, a
100(1 − α)% confidence interval for the population standard deviation is
computed as
$$\left[\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,df}}},\ \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,df}}}\right],$$
where df = n − 1.
▪ Example.
• Compute the 95% confidence interval for the population
standard deviation for the Growth fund. Assume that returns
are normally distributed.
• n = 10, s = 20.45, α = 0.05, so df = 9 and α/2 = 0.025.
• The 95% confidence interval for $\sigma^2$:
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2,df}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,df}}\right] = \left[\frac{(10-1)(20.45)^2}{19.023},\ \frac{(10-1)(20.45)^2}{2.700}\right] = [197.86,\ 1{,}394.01].$$
• To find the 95% confidence interval for $\sigma$, take the positive
square root of the limits of this interval: [14.07, 37.34].
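The arithmetic of this example can be reproduced in a few lines. This is a small sketch that takes the two chi-square table values (19.023 and 2.700) as given from the example rather than computing them; the variable names are illustrative.

```python
import math

n, s = 10, 20.45                 # sample size and sample standard deviation
chi2_upper = 19.023              # table value for alpha/2 = 0.025, df = 9
chi2_lower = 2.700               # table value for 1 - alpha/2 = 0.975, df = 9

numerator = (n - 1) * s ** 2
# The larger table value gives the lower limit, the smaller one the upper limit.
var_low, var_high = numerator / chi2_upper, numerator / chi2_lower
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"variance: [{var_low:.2f}, {var_high:.2f}]")
print(f"std dev:  [{sd_low:.2f}, {sd_high:.2f}]")
```

Note that the interval is not symmetric around $s^2$, because the chi-square distribution is skewed.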
▪ Hypothesis Test for the Population Variance.
• A test about $\sigma^2$ could have one of three forms:
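For reference, the three forms are conventionally written as below. The symbol $\sigma_0^2$ for the hypothesized variance is notation assumed here, not taken from the slide; the test statistic is the one from the sampling distribution of $S^2$, evaluated at $\sigma_0^2$.

```latex
\begin{array}{lll}
\text{Two-tailed} & \text{Right-tailed} & \text{Left-tailed} \\
H_0: \sigma^2 = \sigma_0^2 & H_0: \sigma^2 \le \sigma_0^2 & H_0: \sigma^2 \ge \sigma_0^2 \\
H_A: \sigma^2 \ne \sigma_0^2 & H_A: \sigma^2 > \sigma_0^2 & H_A: \sigma^2 < \sigma_0^2
\end{array}
\qquad
\chi^2_{df} = \frac{(n-1)s^2}{\sigma_0^2}, \quad df = n - 1
```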
▪ Example.
• At the 5% significance level, conduct the test to verify if the
standard deviation of returns for the Value fund differs from
10%. This is equivalent to testing if the variance differs from
100 (%)$^2$. (Assume that returns are normally distributed.)
▪ Example (continued).
• Since df = 10 – 1 = 9 and we have a two-tailed test, we
compute the p-value as two times the area in the right-tail
of the distribution:
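The doubling rule can be sketched in code. Because the Value fund's test statistic is not reproduced here, the example below uses the table value 19.023 (the df = 9 value from the earlier confidence-interval example) purely as a stand-in statistic. The chi-square CDF is computed with a standard series for the regularized incomplete gamma function; the function names are illustrative. Doubling the smaller tail is the general rule, and it equals twice the right-tail area whenever the statistic falls in the upper half of the distribution.

```python
import math

def chi2_cdf(x, df):
    """P(chi2_df <= x) via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)       # next term: z^n / (a (a+1) ... (a+n))
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def two_tailed_p_value(stat, df):
    """Twice the smaller of the two tail areas."""
    left = chi2_cdf(stat, df)
    right = 1.0 - left
    return 2.0 * min(left, right)

df = 10 - 1
stand_in_stat = 19.023            # NOT the Value fund statistic; a table value used as a stand-in
p_value = two_tailed_p_value(stand_in_stat, df)
print(f"p-value = {p_value:.3f}")
```

Since 19.023 cuts off a right-tail area of 0.025 at df = 9, the two-tailed p-value for this stand-in should come out at about 0.05.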
▪ Example (continued).
• Excel:
▪ Inferences about the ratio of two population variances $\sigma_1^2/\sigma_2^2$ are
based on the ratio of the corresponding sample variances $S_1^2/S_2^2$.
LO 11.3
▪ Sampling Distribution of $S_1^2/S_2^2$.
• If independent samples of size n1 and n2 are drawn from normal populations
with equal variances, then the statistic
$$F_{(df_1,df_2)} = \frac{S_1^2}{S_2^2}$$
follows the $F_{(df_1,df_2)}$ distribution with df1 = n1 − 1 and df2 = n2 − 1.
▪ The F Distribution.
• $F_{\alpha,(df_1,df_2)}$ represents the
value such that the area in the
right tail of the distribution is
equal to α.
• In other words, $P(F_{(df_1,df_2)} \ge F_{\alpha,(df_1,df_2)}) = \alpha$.
▪ The F Table.
• With df1 = 6 and df2 = 8, 5% of
the area falls above 3.58.
• In other words, $P(F_{(6,8)} \ge 3.58) = 0.05$.
▪ The F Distribution: Left-Tail Values.
• $F_{1-\alpha,(df_1,df_2)}$ represents the value such that the area in the left tail of the
distribution is equal to α.
▪ Given that the entire area equals one, the area to the right of the given value
must equal 1 − α.
• $F_{1-\alpha,(df_1,df_2)}$ can be found as:
$$F_{1-\alpha,(df_1,df_2)} = \frac{1}{F_{\alpha,(df_2,df_1)}}.$$
• Reverse the order of the numerator and the denominator degrees of
freedom!
▪ The F Distribution: Left-Tail Values (continued).
• Find $F_{1-\alpha,(df_1,df_2)}$ where α = 0.05, df1 = 6 and df2 = 8.
• We find $F_{0.95,(6,8)}$ as:
$$F_{0.95,(6,8)} = \frac{1}{F_{0.05,(8,6)}} = \frac{1}{4.15} = 0.24.$$
• In other words, the lower (left) tail area is $P(F_{(6,8)} < 0.24) = 0.05$.
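The reciprocal rule and both tail areas can be checked by simulation. The sketch below is an illustration under assumed settings: standard normal populations, a fixed seed, and 20,000 repetitions are all choices made here, not part of the lecture. It forms the ratio of two sample variances with df1 = 6 and df2 = 8, then estimates the area above 3.58 and the area below 1/4.15.

```python
import random

def sample_variance(xs):
    """Unbiased sample variance S^2."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(7)           # fixed seed so the run is repeatable
n1, n2 = 7, 9                    # sample sizes giving df1 = 6 and df2 = 8
reps = 20000
ratios = []
for _ in range(reps):
    a = [rng.gauss(0.0, 1.0) for _ in range(n1)]   # equal population variances
    b = [rng.gauss(0.0, 1.0) for _ in range(n2)]
    ratios.append(sample_variance(a) / sample_variance(b))

left_cut = 1 / 4.15              # reciprocal rule with the table value for (8, 6)
right_tail = sum(r >= 3.58 for r in ratios) / reps
left_tail = sum(r < left_cut for r in ratios) / reps
print(f"left cut = {left_cut:.2f}, P(F >= 3.58) ~ {right_tail:.3f}, P(F < cut) ~ {left_tail:.3f}")
```

Both estimated tail areas should come out near 0.05, up to simulation noise.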
▪ Confidence Interval for $\sigma_1^2/\sigma_2^2$.
LO 11.4
▪ Example.
▪ Hypothesis Test About $\sigma_1^2/\sigma_2^2$.
▪ A hypothesis test about the ratio $\sigma_1^2/\sigma_2^2$ has one of the
three forms:
▪ Example.
▪ Example (continued).
• To find the p-value, compute the probability in the
right tail of the distribution:
▪ Example (continued).
• Excel:
1. The administrator of a college is concerned about the students using cell phones
for texting during the class hours. She randomly selects 25 students to check if the
variance of the number of students in the college using cell phones during the class
hours is less than 12. She arrives at a sample variance of 7. She assumes the
distribution to be normal.
▪ a. State the appropriate null and alternative hypotheses for the test.
▪ b. Compute the value of the test statistic.
▪ c. Use the critical value approach to test the administrator's concern at α = 0.10.
▪ d. Repeat the test at α = 0.01.
2. A supermarket has just added a new cash register to reduce the waiting times of
customers during weekends. Because a new cash register has been added,
customers expect their waiting time to be less than it was before the
cash register was added. Suppose the sample variance of the waiting times for 25
customers before adding the new cash register is 4.8 (minutes)$^2$, while the sample
variance of the waiting times for 25 customers after adding the cash register is 2.2
(minutes)$^2$. The manager of the supermarket believes that the waiting times are normally
distributed and that the two samples are drawn independently.
▪ a. Develop the hypotheses to test whether the ratio of two population variances is less
than 1.
▪ b. Calculate the appropriate test statistic.
▪ c. Using the p-value approach, what is the decision at the 5% significance level?
▪ d. What is the conclusion? Given that all other criteria are satisfied, should the
supermarket continue with the new cash register?