
Session 3: Statlan

This document discusses hypothesis testing and confidence intervals for population variances and the ratio of variances using chi-square and F distributions. It provides examples of computing test statistics and p-values for tests of a single variance and the ratio of variances. Confidence intervals are also demonstrated for a single population variance and the ratio of variances using the chi-square and F distributions and tables. Formulas and Excel functions are presented.

Uploaded by

Hanan Tsabitah

Nurlatifah

Third lecture
▪ Testing of single variance
▪ Testing of two variances
▪ Using Excel for testing of variance
▪ Inferences about $\sigma^2$ are based on the sample variance $S^2$.
▪ These inferences are based on a new distribution: the chi-square or $\chi^2$ distribution.
▪ The $\chi^2$ distribution is characterized by a family of distributions, where each distribution depends on its particular degrees of freedom df.
▪ It is common to use the notation $\chi^2_{df}$ when referring to the chi-square distribution.

LO 11.1
▪ The Sampling Distribution of $S^2$.
• $\chi^2_{df}$ is the probability distribution of the sum of df independent, squared standard normal random variables.
• $S^2$ is based on the squared differences between the sample values and the sample mean.

LO 11.1
▪ The Sampling Distribution of $S^2$.
• If a sample of size n is taken from a normal population with a finite variance, then we can define the statistic
$$\chi^2_{df} = \frac{(n-1)S^2}{\sigma^2},$$
which follows the $\chi^2_{df}$ distribution with df = n − 1.

LO 11.1
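
The statistic defined above can be checked by simulation. The following is a minimal sketch, not part of the original slides: the function names, the seed, and the choice of σ = 2 are illustrative assumptions. It draws many normal samples of size n = 10 and averages $(n-1)S^2/\sigma^2$; since a chi-square variable has mean equal to its degrees of freedom, the average should land close to df = 9.

```python
import random

def chi_square_statistic(sample, sigma2):
    """Return (n - 1) * S^2 / sigma^2 for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return (n - 1) * s2 / sigma2

def simulate_mean_statistic(n=10, sigma=2.0, reps=20000, seed=1):
    """Average the statistic over many simulated normal samples."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += chi_square_statistic(sample, sigma ** 2)
    return total / reps

if __name__ == "__main__":
    # Should be close to df = n - 1 = 9.
    print(simulate_mean_statistic())
```

Changing σ in the sketch should leave the average essentially unchanged, because the statistic divides by the true variance.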
▪ The Chi-Square Distribution.
• The shape of the $\chi^2_{df}$ distribution depends on the degrees of freedom, df = n − 1.
• As df grows larger, the $\chi^2_{df}$ distribution tends to the normal distribution.

LO 11.1
▪ Finding $\chi^2_{df}$ Values and Probabilities.
• $\chi^2_{\alpha,df}$ represents a value such that the area in the right tail of the distribution is equal to α.
• In other words, $P(\chi^2_{df} \ge \chi^2_{\alpha,df}) = \alpha$.

LO 11.1
▪ The Chi-Square Table.
• Find $\chi^2_{\alpha,df}$ with α = 0.05 and df = 10.
• Use the chi-square table.

▪ Look at the first column labeled df and find the value 10.
▪ Continue along this row until reaching the column 0.050.
▪ $\chi^2_{0.05,10} = 18.307$.
LO 11.1
▪ Left-tail Values.
• $\chi^2_{1-\alpha,df}$ represents a value such that the area in the left tail of the distribution is α while the area in the right tail equals 1 − α.
• In other words, $P(\chi^2_{df} \ge \chi^2_{1-\alpha,df}) = 1 - \alpha$.
• For example, if df = 10 and we want to find the value such that the area to its left under the $\chi^2_{10}$ distribution equals 0.05, we find $\chi^2_{0.95,10} = 3.940$.

LO 11.1
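
The two table values for df = 10 can be cross-checked numerically. The sketch below is an illustration rather than the slides' method: for an even df the right-tail area has the closed form $e^{-x/2}\sum_{k=0}^{df/2-1}(x/2)^k/k!$, and the helper name `chi2_right_tail_even_df` is hypothetical. It only covers even degrees of freedom.

```python
import math

def chi2_right_tail_even_df(x, df):
    """P(chi2_df >= x) for an even df, using the finite closed-form sum."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    z = x / 2.0
    term = 1.0   # (z^k / k!) for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= z / k
        total += term
    return math.exp(-z) * total

# Right-tail area above the table value 18.307 (should be about 0.05).
right_area = chi2_right_tail_even_df(18.307, 10)
# Right-tail area above the left-tail value 3.940 (should be about 0.95).
area_above_left_value = chi2_right_tail_even_df(3.940, 10)
print(round(right_area, 3), round(area_above_left_value, 3))
```

The second result is the area to the right of 3.940, so the area to its left is one minus that value, which matches the left-tail definition on the slide.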
▪ Graph of the Probability α = 0.05 on Both Sides of $\chi^2_{1-\alpha,df}$.

LO 11.1
▪ Confidence Interval for $\sigma^2$.
• A 100(1 − α)% confidence interval for the population variance $\sigma^2$ is computed as
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2,df}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,df}}\right],$$
• where df = n − 1.
• The formula is valid only when the random sample is drawn from a normally distributed population.

LO 11.2
▪ Confidence Interval for σ.
• Since the standard deviation is just the positive square root of the variance, a 100(1 − α)% confidence interval for the population standard deviation is computed as
$$\left[\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,df}}},\ \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,df}}}\right],$$
• where df = n − 1.

LO 11.2
▪ Example.
• Compute the 95% confidence interval for the population standard deviation for the Growth fund. Assume that returns are normally distributed.
• n = 10, s = 20.45, α = 0.05.
• α = 0.05, so α/2 = 0.025 and df = 9.
• The 95% confidence interval for $\sigma^2$:
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2,df}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,df}}\right] = \left[\frac{(10-1)(20.45)^2}{19.023},\ \frac{(10-1)(20.45)^2}{2.700}\right] = [197.86,\ 1{,}394.01].$$
• To find the 95% confidence interval for σ, take the positive square root of the limits of this interval: [14.07, 37.34].

LO 11.2
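
The arithmetic of the Growth fund example can be reproduced in a few lines. This is a sketch under the assumption that the two chi-square values (19.023 and 2.700) are read from the table as on the slide; the function name `variance_interval` is hypothetical.

```python
import math

def variance_interval(n, s, chi2_upper, chi2_lower):
    """CI for sigma^2 given table values chi2_{alpha/2,df} and chi2_{1-alpha/2,df}."""
    numerator = (n - 1) * s ** 2
    # Dividing by the larger table value gives the lower limit.
    return numerator / chi2_upper, numerator / chi2_lower

low_var, high_var = variance_interval(n=10, s=20.45, chi2_upper=19.023, chi2_lower=2.700)
low_sd, high_sd = math.sqrt(low_var), math.sqrt(high_var)

print(f"variance: [{low_var:.2f}, {high_var:.2f}]")  # [197.86, 1394.01]
print(f"std dev:  [{low_sd:.2f}, {high_sd:.2f}]")    # [14.07, 37.34]
```

Note that the interval for σ is obtained by taking square roots of the variance limits, not by re-deriving a separate formula.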
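▪ Hypothesis Test for the Population Variance.
• A test about $\sigma^2$ could have one of three forms:
  Two-tailed test: $H_0: \sigma^2 = \sigma_0^2$ versus $H_A: \sigma^2 \ne \sigma_0^2$
  Right-tailed test: $H_0: \sigma^2 \le \sigma_0^2$ versus $H_A: \sigma^2 > \sigma_0^2$
  Left-tailed test: $H_0: \sigma^2 \ge \sigma_0^2$ versus $H_A: \sigma^2 < \sigma_0^2$

• The value of the test statistic is computed as:
$$\chi^2_{df} = \frac{(n-1)s^2}{\sigma_0^2},$$
• where df = n − 1, $s^2$ is the sample variance, and $\sigma_0^2$ is the hypothesized value of the population variance.
• The formula is valid only if the underlying population is normally distributed.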

LO 11.2
▪ Example.
• At the 5% significance level, conduct the test to verify if the standard deviation of returns for the Value fund differs from 10%. This is equivalent to testing if the variance differs from 100 (%)$^2$. (Assume that returns are normally distributed.)

▪ State the hypotheses:
$H_0: \sigma^2 = 100$
$H_A: \sigma^2 \ne 100$
▪ Given that n = 10 and s = 18.46, find the value of the test statistic:
$$\chi^2_{df} = \frac{(n-1)s^2}{\sigma_0^2} = \frac{(10-1)(18.46)^2}{100} = 30.669.$$

LO 11.2
▪ Example (continued).
• Since df = 10 − 1 = 9 and we have a two-tailed test, we compute the p-value as two times the area in the right tail of the distribution:

p-value = $2 \cdot P(\chi^2_{df} \ge 30.669)$.

• Excel: CHISQ.DIST.RT() function

▪ =CHISQ.DIST.RT($\chi^2_{df}$, df) returns the probability in the right tail of the distribution.

LO 11.2
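
For readers without Excel, the same two-tailed p-value can be approximated with the Python standard library. The sketch below is an assumption-laden stand-in for CHISQ.DIST.RT, not a reimplementation of it: the helper `chi2_right_tail` is hypothetical and uses the power series for the lower incomplete gamma function.

```python
import math

def chi2_right_tail(x, df):
    """Approximate P(chi2_df >= x) via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # first series term
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    # Regularized lower incomplete gamma P(a, z).
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower

n, s, sigma0_sq = 10, 18.46, 100.0
statistic = (n - 1) * s ** 2 / sigma0_sq          # 30.669 as on the slide
p_value = 2 * chi2_right_tail(statistic, n - 1)   # two-tailed test

print(round(statistic, 3), p_value < 0.05)
```

Doubling the right-tail area is appropriate here because the statistic falls in the upper tail of the distribution.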
▪ Example (continued).
• Excel: [output not shown]

▪ Since the p-value < 0.05, we reject the null hypothesis. Therefore, we can conclude that the risk, measured by the variance of the return, differs from 100 (%)$^2$ or, equivalently, that the standard deviation differs from 10%.

LO 11.2
▪ Inferences about the ratio of two population variances $\sigma_1^2/\sigma_2^2$ are based on the ratio of the corresponding sample variances $S_1^2/S_2^2$.

▪ These inferences are based on a new distribution: the F distribution.

▪ It is common to use the notation $F_{(df_1,df_2)}$ when referring to the F distribution:
• df1 is the numerator degrees of freedom;
• df2 is the denominator degrees of freedom.

LO 11.3
▪ Sampling Distribution of $S_1^2/S_2^2$.
• If independent samples of size n1 and n2 are drawn from normal populations with equal variances, then the statistic
$$F_{(df_1,df_2)} = \frac{S_1^2}{S_2^2}$$
follows the $F_{(df_1,df_2)}$ distribution with df1 = n1 − 1 and df2 = n2 − 1.

LO 11.3
▪ The F Distribution.
• $F_{\alpha,(df_1,df_2)}$ represents the value such that the area in the right tail of the distribution is equal to α.
• In other words, $P(F_{(df_1,df_2)} \ge F_{\alpha,(df_1,df_2)}) = \alpha$.

LO 11.3
▪ The F Table.
• With df1 = 6 and df2 = 8, 5% of the area falls above 3.58.
• In other words, $P(F_{(df_1,df_2)} \ge 3.58) = 0.05$.

LO 11.3
▪ The F Distribution: Left-Tail Values.
• $F_{1-\alpha,(df_1,df_2)}$ represents the value such that the area in the left tail of the distribution is equal to α.
▪ Given that the entire area equals one, the area to the right of the given value must equal 1 − α.
• $F_{1-\alpha,(df_1,df_2)}$ can be found as:
$$F_{1-\alpha,(df_1,df_2)} = \frac{1}{F_{\alpha,(df_2,df_1)}}.$$
• Reverse the order of the numerator and the denominator degrees of freedom!

LO 11.3
▪ The F Distribution: Left-Tail Values.
• Find $F_{1-\alpha,(df_1,df_2)}$ where α = 0.05, df1 = 6 and df2 = 8.
• We find $F_{0.95,(6,8)}$ as:
$$F_{0.95,(6,8)} = \frac{1}{F_{0.05,(8,6)}} = \frac{1}{4.15} = 0.24.$$

• In other words, the lower (left) tail area is $P(F_{(6,8)} < 0.24) = 0.05$.

LO 11.3
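
The reciprocal rule for left-tail values follows from a standard property that the slides do not spell out: if a variable follows the $F_{(df_1,df_2)}$ distribution, its reciprocal follows the $F_{(df_2,df_1)}$ distribution. A short derivation under that assumption, writing c for the left-tail value:

```latex
\begin{aligned}
\alpha &= P\left(F_{(df_1,df_2)} \le c\right)
        = P\left(\frac{1}{F_{(df_1,df_2)}} \ge \frac{1}{c}\right)
        = P\left(F_{(df_2,df_1)} \ge \frac{1}{c}\right) \\
\Rightarrow\quad \frac{1}{c} &= F_{\alpha,(df_2,df_1)}
\quad\Rightarrow\quad
c = F_{1-\alpha,(df_1,df_2)} = \frac{1}{F_{\alpha,(df_2,df_1)}} \\
F_{0.95,(6,8)} &= \frac{1}{F_{0.05,(8,6)}} = \frac{1}{4.15} \approx 0.24
\end{aligned}
```

The swap of the degrees of freedom comes from taking the reciprocal, which exchanges the numerator and denominator sample variances.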
▪ Confidence interval for $\sigma_1^2/\sigma_2^2$.

▪ A 100(1 − α)% confidence interval for the ratio of the population variances $\sigma_1^2/\sigma_2^2$ is computed as
$$\left[\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{\alpha/2,(df_1,df_2)}},\ \frac{s_1^2}{s_2^2}\cdot F_{\alpha/2,(df_2,df_1)}\right],$$
where for samples of size n1 and n2, df1 = n1 − 1 and df2 = n2 − 1.
▪ This formula is valid if the sample variances are computed from independently drawn samples from two normally distributed populations.

LO 11.4
▪ Example.

▪ A professor wants to compare the variability in scores between two sections of a statistics course. Random samples of n1 = 11 and n2 = 16 yield sample variances of $s_1^2 = 182.25$ and $s_2^2 = 457.96$. Construct the 95% confidence interval for the ratio of the population variances.
▪ From the F table: $F_{0.025,(10,15)} = 3.06$, $F_{0.025,(15,10)} = 3.52$.
▪ The confidence interval is:
$$\left[\frac{182.25}{457.96}\cdot\frac{1}{3.06},\ \frac{182.25}{457.96}\cdot 3.52\right] = [0.13,\ 1.40].$$
▪ Therefore, with 95% confidence we can conclude that the variance of scores in the first section is between 13% and 140% of the variance of scores in the second section.

LO 11.4
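
The interval in the example above can be reproduced directly. This is a minimal sketch that takes the two F table values (3.06 and 3.52) as given on the slide; the function and parameter names are hypothetical.

```python
def variance_ratio_interval(s1_sq, s2_sq, f_df1_df2, f_df2_df1):
    """CI for sigma1^2 / sigma2^2 from F_{alpha/2,(df1,df2)} and F_{alpha/2,(df2,df1)}."""
    ratio = s1_sq / s2_sq
    # Lower limit divides by F(df1, df2); upper limit multiplies by F(df2, df1).
    return ratio / f_df1_df2, ratio * f_df2_df1

lower, upper = variance_ratio_interval(182.25, 457.96, f_df1_df2=3.06, f_df2_df1=3.52)
print(f"[{lower:.2f}, {upper:.2f}]")  # [0.13, 1.40]
```

Because the interval contains 1, the data are compatible with equal variances in the two sections at the 95% confidence level.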
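▪ Hypothesis Test About $\sigma_1^2/\sigma_2^2$.

▪ A hypothesis test about the ratio $\sigma_1^2/\sigma_2^2$ has one of the three forms:
  Two-tailed test: $H_0: \sigma_1^2/\sigma_2^2 = 1$ versus $H_A: \sigma_1^2/\sigma_2^2 \ne 1$
  Right-tailed test: $H_0: \sigma_1^2/\sigma_2^2 \le 1$ versus $H_A: \sigma_1^2/\sigma_2^2 > 1$
  Left-tailed test: $H_0: \sigma_1^2/\sigma_2^2 \ge 1$ versus $H_A: \sigma_1^2/\sigma_2^2 < 1$

▪ The value of the test statistic is:
$$F_{(df_1,df_2)} = \frac{s_1^2}{s_2^2}.$$
▪ The formula is valid if the sample variances are computed from independently drawn samples from normally distributed populations.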
LO 11.4
▪ Hypothesis Test About $\sigma_1^2/\sigma_2^2$: Notes.
▪ It is preferable to place the larger sample variance in the numerator of the $F_{(df_1,df_2)}$ statistic.
▪ The resulting value allows us to focus only on the upper (right) tail of the distribution.

LO 11.4
▪ Example.

▪ At the 5% significance level, test if the Growth fund is riskier than the Value fund. Sample summary measures:
▪ Growth fund: $\bar{x}_1 = 10.09$, s1 = 20.45, n1 = 10;
▪ Value fund: $\bar{x}_2 = 7.56$, s2 = 18.46, n2 = 10.

▪ State the hypotheses:
$H_0: \sigma_1^2/\sigma_2^2 \le 1$
$H_A: \sigma_1^2/\sigma_2^2 > 1$
▪ Find the value of the test statistic:
$$F_{(df_1,df_2)} = \frac{s_1^2}{s_2^2} = \frac{(20.45)^2}{(18.46)^2} = 1.227.$$

LO 11.4
▪ Example (continued).
• To find the p-value, compute the probability in the right tail of the distribution:

p-value = $P(F_{(df_1,df_2)} \ge 1.227)$.

• Excel: F.DIST.RT() function

• =F.DIST.RT($F_{(df_1,df_2)}$, df1, df2) returns the probability in the right tail of the distribution for given df1 and df2.

LO 11.4
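
As a rough stand-in for F.DIST.RT outside Excel, the right-tail area can be approximated numerically. The sketch below is an illustration, not the slides' method: the helper `f_right_tail` is hypothetical, it relies on the standard identity that the right tail of $F_{(df_1,df_2)}$ equals a regularized incomplete beta integral, and it evaluates that integral with Simpson's rule. It assumes both degrees of freedom are at least 2 so the integrand stays bounded.

```python
import math

def f_right_tail(f, df1, df2, steps=2000):
    """Approximate P(F_(df1,df2) >= f) with Simpson's rule (steps must be even)."""
    if df1 < 2 or df2 < 2:
        raise ValueError("this sketch assumes df1 >= 2 and df2 >= 2")
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f)  # upper integration limit in (0, 1]

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3
    beta = math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))
    return integral / beta

statistic = 20.45 ** 2 / 18.46 ** 2          # 1.227 as on the slide
p_value = f_right_tail(statistic, 9, 9)      # df1 = df2 = 10 - 1

print(round(statistic, 3), p_value > 0.05)
```

A quick sanity check on the sketch: with equal degrees of freedom the F distribution has median 1, so `f_right_tail(1.0, 9, 9)` should come out very close to 0.5.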
▪ Example (continued).
• Excel: [output not shown]

▪ Since the p-value > 0.05, we do not reject $H_0$. At the 5% significance level, we cannot conclude that the Growth fund is riskier than the Value fund.

LO 11.4
1. The administrator of a college is concerned about the students using cell phones
for texting during the class hours. She randomly selects 25 students to check if the
variance of the number of students in the college using cell phones during the class
hours is less than 12. She arrives at a sample variance of 7. She assumes the
distribution to be normal.
▪ a. State the appropriate null and alternative hypotheses for the test.
▪ b. Compute the value of the test statistic.
▪ c. Use the critical value approach to test the administrator's concern at α = 0.10.
▪ d. Repeat the test at α = 0.01.
2. A supermarket has just added a new cash register to reduce the waiting times of the customers during weekends. Because a new cash register has been added, the customers expect their waiting time to be less than it had been before the cash register was added. Suppose the sample variance of the waiting times for 25 customers before adding the new cash register is 4.8 minutes, while the sample variance of the waiting times for 25 customers after adding the cash register is 2.2 minutes. The manager of the supermarket believes that the waiting times are normally distributed and that the two samples are drawn independently.

▪ a. Develop the hypotheses to test whether the ratio of two population variances is less
than 1.
▪ b. Calculate the appropriate test statistic.
▪ c. Using the p-value approach, what is the decision at the 5% significance level?
▪ d. What is the conclusion? Given that all other criteria are satisfied, should the
supermarket continue with the new cash register?
