
Engineering Data Analysis M9 Finals

This document provides an overview of statistical inference methods for comparing two samples or populations. It discusses how to construct confidence intervals and perform hypothesis tests on the difference in means, variance, and proportions of two normal distributions. Specifically, it covers inference on the difference in means when variances are known or unknown, inference on the variance of two distributions, and inference on two population proportions. The objectives are to structure comparative experiments, test hypotheses and construct confidence intervals for differences and ratios of parameters between two samples.

LEARNING MODULE SURIGAO STATE COLLEGE OF TECHNOLOGY

Module No. 9

I. Title: Statistical Inference of Two Samples

II. Topic: 9.1. Inference on the Difference in Means of Two Normal Distributions,
Variances Known
9.2. Inference on the Difference in Means of Two Normal Distributions,
Variances Unknown
9.3. Inference on the Variance of Two Normal Distributions
9.4. Inference on Two Population Proportions

III. Time Frame: 5 hours

IV. Introduction:

Modules 7 and 8 presented confidence intervals (CIs) and hypothesis-testing
procedures for a single mean $\mu$, a single proportion $p$, and a single variance
$\sigma^2$. Here we extend these methods to situations involving the means,
proportions, and variances of two different population distributions. For example,
let $\mu_1$ denote the true average Rockwell hardness for heat-treated steel
specimens and $\mu_2$ denote the true average hardness for cold-rolled specimens.
Then an investigator might wish to use samples of hardness observations from each
type of steel as a basis for calculating an interval estimate of $\mu_1 - \mu_2$,
the difference between the two true average hardnesses. As another example, let
$p_1$ denote the true proportion of nickel-cadmium cells produced under current
operating conditions that are defective because of internal shorts, and let $p_2$
represent the true proportion of cells with internal shorts produced under modified
operating conditions. If the rationale for the modified conditions is to reduce the
proportion of defective cells, a quality engineer would want to use sample
information to test the null hypothesis $H_0: p_1 - p_2 = 0$ (i.e., $p_1 = p_2$)
versus the alternative hypothesis $H_1: p_1 - p_2 > 0$ (i.e., $p_1 > p_2$).

V. Objectives:

At the end of this module, the students should be able to:

1. Structure comparative experiments involving two samples as hypothesis tests.
2. Test hypotheses and construct confidence intervals on the difference in means
of two normal distributions.
3. Test hypotheses and construct confidence intervals on the ratio of the variances
or standard deviations of two normal distributions.
4. Test hypotheses and construct confidence intervals on the difference in two
population proportions.

VI. Pre-test: This pre-test will be conducted online.

MATH 114 – ENGINEERING DATA ANALYSIS
MODULE 9

VII. Learning Activities:

9.1. Inference on the Difference in Means of Two Normal Distributions, Variances Known

The previous module presented hypothesis tests and confidence intervals for a
single population parameter (the mean $\mu$, the variance $\sigma^2$, or a
proportion $p$). This module extends those results to the case of two independent
populations.

The general situation is shown in Figure 9.1. Population 1 has mean $\mu_1$ and
variance $\sigma_1^2$, and population 2 has mean $\mu_2$ and variance $\sigma_2^2$.
Inferences will be based on two random samples of sizes $n_1$ and $n_2$,
respectively. That is, $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample of
$n_1$ observations from population 1, and $X_{21}, X_{22}, \ldots, X_{2n_2}$ is a
random sample of $n_2$ observations from population 2. Most of the
practical applications of the procedures in this chapter arise in the context of
simple comparative experiments in which the objective is to study the difference
in the parameters of the two populations.

Engineers and scientists are often interested in comparing two different
conditions to determine whether either condition produces a significant effect on
the response that is observed. These conditions are sometimes called
treatments. Example 9.1 describes such an experiment; the two different
treatments are two paint formulations, and the response is the drying time. The
purpose of the study is to determine whether the new formulation results in a
significant effect, namely a reduction in drying time. In this situation, the product developer (the
experimenter) randomly assigned 10 test specimens to one formulation and 10
test specimens to the other formulation. Then the paints were applied to the test
specimens in random order until all 20 specimens were painted. This is an
example of a completely randomized experiment.

When statistical significance is observed in a randomized experiment, the
experimenter can be confident in the conclusion that the difference in treatments
resulted in the difference in response. That is, we can be confident that a cause
and effect relationship has been found.


Sometimes the objects to be used in the comparison are not assigned at
random to the treatments. For example, the September 1992 issue of Circulation
(a medical journal published by the American Heart Association) reports a study
linking high iron levels in the body with increased risk of heart attack. The study,
done in Finland, tracked 1931 men for 5 years and showed a statistically
significant effect of increasing iron levels on the incidence of heart attacks. In this
study, the comparison was not performed by randomly selecting a sample of men
and then assigning some to a “low iron level” treatment and the others to a “high
iron level” treatment. The researchers just tracked the subjects over time. Recall
from Module 1 that this type of study is called an observational study.

It is difficult to identify causality in observational studies because the observed
statistically significant difference in response for the two groups may be due to
some other underlying factor (or group of factors) that was not equalized by
randomization and not due to the treatments. For example, the difference in heart
attack risk could be attributable to the difference in iron levels or to other underlying
factors that form a reasonable explanation for the observed results, such as
cholesterol levels or hypertension.

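In this section, we consider statistical inferences on the difference in means
$\mu_1 - \mu_2$ of two normal distributions where the variances $\sigma_1^2$ and
$\sigma_2^2$ are known. The assumptions for this section are summarized as follows.

1. $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample from population 1.
2. $X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample from population 2.
3. The two populations represented by $X_1$ and $X_2$ are independent.
4. Both populations are normal.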

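A logical point estimator of $\mu_1 - \mu_2$ is the difference in sample means
$\bar{X}_1 - \bar{X}_2$. Based on the properties of expected values,

$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$

and, because the two samples are independent, the variance of
$\bar{X}_1 - \bar{X}_2$ is

$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Based on the assumptions and the preceding results, we may state the following.
The quantity

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \qquad (9.1)$$

has an $N(0, 1)$ distribution.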


This result is used to develop procedures for tests of hypotheses and to
construct confidence intervals on $\mu_1 - \mu_2$. Essentially, we may think of
$\mu_1 - \mu_2$ as a parameter $\theta$ whose estimator is
$\hat{\Theta} = \bar{X}_1 - \bar{X}_2$ with variance
$\sigma_{\hat{\Theta}}^2 = \sigma_1^2/n_1 + \sigma_2^2/n_2$. If $\theta_0$ is the
null hypothesis value specified for $\theta$, the test statistic will be
$(\hat{\Theta} - \theta_0)/\sigma_{\hat{\Theta}}$. Notice how similar this is to the
test statistic for a single mean used in the previous module.

9.1.1. Hypothesis Tests on the Difference in Means, Variances Known

We now consider hypothesis testing on the difference in the means 𝜇1 − 𝜇2 of
two normal populations. Suppose that we are interested in testing whether the
difference in means 𝜇1 − 𝜇2 is equal to a specified value ∆0. Thus, the null
hypothesis is stated as 𝐻0 : 𝜇1 − 𝜇2 = ∆0. Obviously, in many cases, we specify
∆0 = 0 so that we are testing the equality of two means (i.e., 𝐻0 : 𝜇1 = 𝜇2 ). The
appropriate test statistic would be found by replacing 𝜇1 − 𝜇2 in Equation 9.1
by ∆0 : this test statistic would have a standard normal distribution under 𝐻0 .
That is, the standard normal distribution is the reference distribution for the test
statistic. Suppose that the alternative hypothesis is 𝐻1 : 𝜇1 − 𝜇2 ≠ ∆0. A sample
value of 𝑥1 − 𝑥2 that is considerably different from Δ0 is evidence that 𝐻1 is true.
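Because $Z_0$ has the $N(0, 1)$ distribution when $H_0$ is true, we would calculate
the P-value as the sum of the probabilities beyond the test statistic value $z_0$
and $-z_0$ in the standard normal distribution. That is, $P = 2[1 - \Phi(|z_0|)]$.
This is exactly what we did in the one-sample z-test. If we wanted to perform a
fixed-significance-level test, we would take $-z_{\alpha/2}$ and $z_{\alpha/2}$ as
the boundaries of the critical region, just as we did in the single-sample z-test.
This would give a test with level of significance $\alpha$. P-values or critical
regions for the one-sided alternatives would be determined similarly. Formally, we
summarize these results in the following display.

Tests on the Difference in Means, Variances Known

Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$

Test statistic:

$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \qquad (9.2)$$

For $H_1: \mu_1 - \mu_2 \ne \Delta_0$: P-value $= 2[1 - \Phi(|z_0|)]$; reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$.
For $H_1: \mu_1 - \mu_2 > \Delta_0$: P-value $= 1 - \Phi(z_0)$; reject $H_0$ if $z_0 > z_{\alpha}$.
For $H_1: \mu_1 - \mu_2 < \Delta_0$: P-value $= \Phi(z_0)$; reject $H_0$ if $z_0 < -z_{\alpha}$.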


When the population variances are unknown, the sample variances $s_1^2$ and $s_2^2$ can
be substituted into the test statistic Equation 9.2 to produce a large-sample test for
the difference in means. This procedure will also work well when the populations
are not necessarily normally distributed. However, both $n_1$ and $n_2$ should exceed
40 for this large-sample test to be valid.
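
The test described in this section can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library; the function name `two_sample_z_test`, its argument names, and the numbers in the usage lines are assumptions chosen for illustration and do not come from the module.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2,
                      delta0=0.0, alternative="two-sided"):
    """Return (z0, p_value) for H0: mu1 - mu2 = delta0 with known variances."""
    # Standard error of the difference in sample means
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z0 = (xbar1 - xbar2 - delta0) / se
    phi = NormalDist().cdf  # standard normal CDF
    if alternative == "two-sided":
        p_value = 2.0 * (1.0 - phi(abs(z0)))
    elif alternative == "greater":   # H1: mu1 - mu2 > delta0
        p_value = 1.0 - phi(z0)
    elif alternative == "less":      # H1: mu1 - mu2 < delta0
        p_value = phi(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z0, p_value


# Hypothetical data: two samples of 10, both with known sigma = 8
z0, p = two_sample_z_test(121.0, 112.0, 8.0, 8.0, 10, 10, alternative="greater")
print(round(z0, 2), round(p, 4))
```

For the large-sample version of the test, the sample standard deviations would be passed in place of `sigma1` and `sigma2`.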


9.1.2. Type II Error and Choice of Sample Size

Use of Operating Characteristic Curves. The operating characteristic (OC)
curves in Appendix Charts VIIa, VIIb, VIIc, and VIId may be used to evaluate
the type II error probability for the hypotheses in the display (9.2). These curves
are also useful in determining sample size. Curves are provided for $\alpha = 0.05$
and $\alpha = 0.01$. For the two-sided alternative hypothesis, the abscissa scale of
the operating characteristic curve in Charts VIIa and VIIb is $d$, where

$$d = \frac{|\mu_1 - \mu_2 - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{|\Delta - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}}$$

and one must choose equal sample sizes, say, $n = n_1 = n_2$. The one-sided
alternative hypotheses require the use of Charts VIIc and VIId. For the one-sided
alternatives $H_1: \mu_1 - \mu_2 > \Delta_0$ or $H_1: \mu_1 - \mu_2 < \Delta_0$, the
abscissa scale is also given by

$$d = \frac{|\mu_1 - \mu_2 - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{|\Delta - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}}$$

It is not unusual to encounter problems where the costs of collecting data differ
substantially for the two populations or when the variance for one population is
much greater than the other. In those cases, we often use unequal sample
sizes. If $n_1 \ne n_2$, the operating characteristic curves may be entered with an
equivalent value of $n$ computed from

$$n = \frac{\sigma_1^2 + \sigma_2^2}{\sigma_1^2/n_1 + \sigma_2^2/n_2} \qquad (9.4)$$

If $n_1 \ne n_2$ and their values are fixed in advance, Equation 9.4 is used directly to
calculate $n$, and the operating characteristic curves are entered with a specified
$d$ to obtain $\beta$. If we are given $d$ and it is necessary to determine $n_1$ and
$n_2$ to obtain a specified $\beta$, say, $\beta^*$, we guess at trial values of $n_1$
and $n_2$, calculate $n$ in Equation 9.4, and enter the curves with the specified
value of $d$ to find $\beta$. If $\beta = \beta^*$, the trial values of $n_1$ and $n_2$
are satisfactory. If $\beta \ne \beta^*$, adjustments to $n_1$ and $n_2$ are made and
the process is repeated.
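
As a rough numerical companion to the charts, the equivalent sample size of Equation 9.4 and the two-sided type II error probability can be computed directly from the standard normal distribution. The sketch below assumes known variances; the function names and the example numbers are hypothetical, and the result is a calculation under the normal model rather than a chart reading.

```python
from math import sqrt
from statistics import NormalDist


def equivalent_n(sigma1, sigma2, n1, n2):
    """Equivalent common sample size for unequal n1 and n2 (Equation 9.4)."""
    return (sigma1 ** 2 + sigma2 ** 2) / (sigma1 ** 2 / n1 + sigma2 ** 2 / n2)


def beta_two_sided(delta, delta0, sigma1, sigma2, n1, n2, alpha=0.05):
    """Type II error probability of the two-sided z-test when mu1 - mu2 = delta."""
    nd = NormalDist()
    z_half = nd.inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    shift = (delta - delta0) / sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return nd.cdf(z_half - shift) - nd.cdf(-z_half - shift)


# Hypothetical case: sigma1 = sigma2 = 8, n1 = 10, n2 = 20
print(round(equivalent_n(8.0, 8.0, 10, 20), 2))
# Hypothetical case: true difference 10, n1 = n2 = 10
print(round(beta_two_sided(10.0, 0.0, 8.0, 8.0, 10, 10), 3))
```

Trial values of `n1` and `n2` can be adjusted and the function re-evaluated until the computed `beta` is at or below the target value.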


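Sample Size Formulas. It is also possible to obtain formulas for calculating the
sample sizes directly. Suppose that the null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$
is false and that the true difference in means is $\mu_1 - \mu_2 = \Delta$, where
$\Delta > \Delta_0$. One may find formulas for the sample size required to obtain a
specific value of the type II error probability $\beta$ for a given difference in
means $\Delta$ and level of significance $\alpha$. For example, we first write the
expression for the $\beta$-error for the two-sided alternative, which is

$$\beta = \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right) - \Phi\left(-z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right)$$

The derivation for sample size closely follows the single-sample case in the previous
module. For the two-sided alternative with $n_1 = n_2 = n$ and significance level
$\alpha$, the sample size required to detect a true difference in means of $\Delta$
with power at least $1 - \beta$ is

$$n \simeq \frac{(z_{\alpha/2} + z_{\beta})^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2} \qquad (9.5)$$

This approximation is valid when
$\Phi\left(-z_{\alpha/2} - (\Delta - \Delta_0)\sqrt{n}/\sqrt{\sigma_1^2 + \sigma_2^2}\right)$
is small compared to $\beta$.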


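For a one-sided alternative with $n_1 = n_2 = n$ and significance level $\alpha$, the
sample size required to detect a true difference in means of $\Delta$
($\ne \Delta_0$) with power at least $1 - \beta$ is

$$n = \frac{(z_{\alpha} + z_{\beta})^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2} \qquad (9.6)$$

where $\Delta$ is the true difference in means of interest. Then, by following a
procedure similar to that used in the previous module, the expression for $\beta$ can
be obtained for the case where $n = n_1 = n_2$.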
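
The sample-size formulas for equal sample sizes are easy to evaluate numerically. The following sketch assumes known standard deviations and rounds up to the next integer; the function name `sample_size_for_test` and the example values are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_test(delta, delta0, sigma1, sigma2, alpha, beta,
                         two_sided=True):
    """Per-group n (n1 = n2 = n) for the z-test on mu1 - mu2, rounded up."""
    nd = NormalDist()
    # z_{alpha/2} for a two-sided test, z_{alpha} for a one-sided test
    z_a = nd.inv_cdf(1.0 - (alpha / 2.0 if two_sided else alpha))
    z_b = nd.inv_cdf(1.0 - beta)  # z_{beta}
    n = (z_a + z_b) ** 2 * (sigma1 ** 2 + sigma2 ** 2) / (delta - delta0) ** 2
    return ceil(n)


# Hypothetical requirement: detect a difference of 10 with sigma1 = sigma2 = 8,
# alpha = 0.05, beta = 0.10
print(sample_size_for_test(10.0, 0.0, 8.0, 8.0, 0.05, 0.10, two_sided=False))
print(sample_size_for_test(10.0, 0.0, 8.0, 8.0, 0.05, 0.10, two_sided=True))
```

Rounding up keeps the achieved type II error probability at or below the requested value.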

9.1.3. Confidence Interval on the Difference in Means, Variances Known

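The $100(1 - \alpha)\%$ confidence interval on the difference in two means
$\mu_1 - \mu_2$ when the variances are known can be found directly from results given
previously in this section. Recall that $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random
sample of $n_1$ observations from the first population and
$X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample of $n_2$ observations from the
second population. The difference in sample means $\bar{X}_1 - \bar{X}_2$ is a point
estimator of $\mu_1 - \mu_2$, and

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

has a standard normal distribution if the two populations are normal or is
approximately standard normal if the conditions of the central limit theorem
apply. This implies that $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$, or

$$P\left[-z_{\alpha/2} \le \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \le z_{\alpha/2}\right] = 1 - \alpha$$

This can be rearranged as

$$P\left(\bar{X}_1 - \bar{X}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right) = 1 - \alpha$$

Therefore, the $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is defined as
follows. If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of
sizes $n_1$ and $n_2$ from two independent normal populations with known variances
$\sigma_1^2$ and $\sigma_2^2$, a $100(1 - \alpha)\%$ confidence interval for
$\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \qquad (9.7)$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ percentage point of the standard normal
distribution.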

The confidence level $1 - \alpha$ is exact when the populations are normal. For non-normal
populations, the confidence level is approximately valid for large sample sizes.
Equation 9.7 can also be used as a large-sample CI on the difference in means when
$\sigma_1^2$ and $\sigma_2^2$ are unknown by substituting $s_1^2$ and $s_2^2$ for the
population variances. For this to be a valid procedure, both sample sizes $n_1$ and
$n_2$ should exceed 40.
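
A short sketch of the interval computation may help fix ideas. It implements the two-sided interval for $\mu_1 - \mu_2$ with known variances, together with the usual sample-size rule $n = (z_{\alpha/2}/E)^2(\sigma_1^2 + \sigma_2^2)$ for bounding the estimation error by $E$ when $n_1 = n_2 = n$. The function names and the numbers are assumptions for illustration only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_confidence_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, conf=0.95):
    """Two-sided CI on mu1 - mu2 when both variances are known."""
    z_half = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)  # z_{alpha/2}
    half_width = z_half * sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


def sample_size_for_error(error, sigma1, sigma2, conf=0.95):
    """Common n so that |(xbar1 - xbar2) - (mu1 - mu2)| < error at level conf."""
    z_half = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    return ceil((z_half / error) ** 2 * (sigma1 ** 2 + sigma2 ** 2))


# Hypothetical data: n1 = 10, n2 = 12, sigma1 = 1.0, sigma2 = 1.5, 90% confidence
low, high = z_confidence_interval(87.6, 74.5, 1.0, 1.5, 10, 12, conf=0.90)
print(round(low, 2), round(high, 2))
print(sample_size_for_error(1.0, 1.0, 1.5, conf=0.90))
```

If the interval excludes zero, the corresponding two-sided test of $H_0: \mu_1 = \mu_2$ at the same $\alpha$ would reject $H_0$.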


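Choice of Sample Size. If the standard deviations $\sigma_1$ and $\sigma_2$ are known
(at least approximately) and the two sample sizes $n_1$ and $n_2$ are equal
($n_1 = n_2 = n$, say), we can determine the sample size required so that the error in
estimating $\mu_1 - \mu_2$ by $\bar{x}_1 - \bar{x}_2$ will be less than $E$ at
$100(1 - \alpha)\%$ confidence. The required sample size from each population is

$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 (\sigma_1^2 + \sigma_2^2) \qquad (9.8)$$

Remember to round up if $n$ is not an integer. This ensures that the level of
confidence does not drop below $100(1 - \alpha)\%$.

One-Sided Confidence Bounds. One-sided confidence bounds on $\mu_1 - \mu_2$ may
also be obtained. A $100(1 - \alpha)\%$ upper-confidence bound on $\mu_1 - \mu_2$ is

$$\mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \qquad (9.9)$$

and a $100(1 - \alpha)\%$ lower-confidence bound is

$$\bar{x}_1 - \bar{x}_2 - z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \qquad (9.10)$$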

9.2. Inference on the Difference in Means of Two Normal Distributions, Variances Unknown

We now extend the results of the previous section to the difference in means
of the two distributions in Figure 9.1 when the variances of both distributions,
$\sigma_1^2$ and $\sigma_2^2$, are unknown. If the sample sizes $n_1$ and $n_2$ exceed
40, the normal distribution procedures in Section 9.1 could be used. However, when
small samples are taken, we assume that the populations are normally distributed
and base our hypothesis tests and confidence intervals on the t distribution.
This nicely parallels the case of inference on the mean of a single sample
with unknown variance.


9.2.1 Hypothesis Test on the Difference in Means, Variances Unknown

We now consider tests of hypotheses on the difference in means $\mu_1 - \mu_2$ of
two normal distributions where the variances $\sigma_1^2$ and $\sigma_2^2$ are unknown.
A t-statistic is used to test these hypotheses. As noted earlier and in Module 8.3,
the normality assumption is required to develop the test procedure, but
moderate departures from normality do not adversely affect the procedure.
Two different situations must be treated. In the first case, we assume that the
variances of the two normal distributions are unknown but equal; that is,
$\sigma_1^2 = \sigma_2^2 = \sigma^2$. In the second, we assume that $\sigma_1^2$ and
$\sigma_2^2$ are unknown and not necessarily equal.

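Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$

Suppose that we have two independent normal populations with unknown
means $\mu_1$ and $\mu_2$, and unknown but equal variances,
$\sigma_1^2 = \sigma_2^2 = \sigma^2$. We wish to test

$$H_0: \mu_1 - \mu_2 = \Delta_0 \qquad H_1: \mu_1 - \mu_2 \ne \Delta_0 \qquad (9.11)$$

Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ be a random sample of $n_1$ observations from the
first population and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be a random sample of $n_2$
observations from the second population. Let $\bar{X}_1$, $\bar{X}_2$, $S_1^2$, and
$S_2^2$ be the sample means and sample variances, respectively. Now the expected value
of the difference in sample means $\bar{X}_1 - \bar{X}_2$ is
$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$, so $\bar{X}_1 - \bar{X}_2$ is an unbiased
estimator of the difference in means. The variance of $\bar{X}_1 - \bar{X}_2$ is

$$V(\bar{X}_1 - \bar{X}_2) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2} = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

It seems reasonable to combine the two sample variances $S_1^2$ and $S_2^2$ to form
an estimator of $\sigma^2$. The pooled estimator of $\sigma^2$, denoted by $S_p^2$, is
defined by

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} \qquad (9.12)$$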

It is easy to see that the pooled estimator $S_p^2$ can be written as

$$S_p^2 = \frac{n_1 - 1}{n_1 + n_2 - 2}\,S_1^2 + \frac{n_2 - 1}{n_1 + n_2 - 2}\,S_2^2 = wS_1^2 + (1 - w)S_2^2$$

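where $0 < w \le 1$. Thus, $S_p^2$ is a weighted average of the two sample variances
$S_1^2$ and $S_2^2$, where the weights $w$ and $1 - w$ depend on the two sample sizes
$n_1$ and $n_2$. Obviously, if $n_1 = n_2 = n$, then $w = 0.5$ and $S_p^2$ is just the
arithmetic average of $S_1^2$ and $S_2^2$. If $n_1 = 10$ and $n_2 = 20$ (say),
$w = 9/28 \approx 0.32$ and $1 - w \approx 0.68$. The first sample contributes
$n_1 - 1$ degrees of freedom to $S_p^2$ and the second sample contributes $n_2 - 1$
degrees of freedom. Therefore, $S_p^2$ has $n_1 + n_2 - 2$ degrees of freedom. Now we
know that

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

has an $N(0, 1)$ distribution. Replacing $\sigma$ by $S_p$ gives the following. Given
the assumptions of this section, the quantity

$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \qquad (9.13)$$

has a t distribution with $n_1 + n_2 - 2$ degrees of freedom.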

The use of this information to test the hypotheses in Equation 9.11 is now
straightforward: simply replace $\mu_1 - \mu_2$ by $\Delta_0$, and the resulting test
statistic has a t distribution with $n_1 + n_2 - 2$ degrees of freedom under
$H_0: \mu_1 - \mu_2 = \Delta_0$. Therefore, the reference distribution for the test
statistic is the t distribution with $n_1 + n_2 - 2$ degrees of freedom. The
calculation of P-values and the location of the critical region for
fixed-significance-level testing for both two- and one-sided alternatives parallel
those in the one-sample case. Because a pooled estimate of variance is used, the
procedure is often called the pooled t-test.
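
The pooled t-statistic can be computed from raw data as sketched below. The code uses only the Python standard library, which has no t distribution, so it returns the statistic and its degrees of freedom for comparison against a t table; the function name `pooled_t_statistic` and the sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2, delta0=0.0):
    """Return (t0, degrees_of_freedom, pooled_variance) for the pooled t-test."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)  # sample variance with divisor n1 - 1
    s2_sq = variance(sample2)
    df = n1 + n2 - 2
    # Pooled estimator: variances weighted by their degrees of freedom
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    t0 = (mean(sample1) - mean(sample2) - delta0) / (
        sqrt(sp_sq) * sqrt(1.0 / n1 + 1.0 / n2))
    return t0, df, sp_sq


# Hypothetical observations
t0, df, sp_sq = pooled_t_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t0, 3), df, sp_sq)
```

For a two-sided test, $H_0$ would be rejected when $|t_0|$ exceeds the tabulated value $t_{\alpha/2,\,n_1+n_2-2}$.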


Figure: Normal probability plot and comparative box plot for the catalyst-yield data in
Example 9.5. (a) Normal probability plot. (b) Box plots.


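Case 2: $\sigma_1^2 \ne \sigma_2^2$

In some situations, we cannot reasonably assume that the unknown variances
$\sigma_1^2$ and $\sigma_2^2$ are equal. There is not an exact t-statistic available
for testing $H_0: \mu_1 - \mu_2 = \Delta_0$ in this case. However, an approximate
result can be applied. If $H_0: \mu_1 - \mu_2 = \Delta_0$ is true, the statistic

$$T_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \qquad (9.15)$$

is distributed approximately as t with degrees of freedom given by

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}} \qquad (9.16)$$

If $\nu$ is not an integer, round down to the nearest integer.

Therefore, if $\sigma_1^2 \ne \sigma_2^2$, the hypotheses on differences in the means
of two normal distributions are tested as in the equal variances case except that
$T_0^*$ is used as the test statistic and $n_1 + n_2 - 2$ is replaced by $\nu$ in
determining the degrees of freedom for the test. The pooled t-test is very sensitive
to the assumption of equal variances (so is the CI procedure in Section 9.2.3). The
two-sample t-test assuming that $\sigma_1^2 \ne \sigma_2^2$ is a safer procedure unless
one is very sure about the equal variance assumption.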
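
For the unequal-variance case, the statistic and its approximate degrees of freedom can be computed as in the following sketch. It assumes the commonly used approximation in which the statistic divides by $\sqrt{S_1^2/n_1 + S_2^2/n_2}$ and the degrees of freedom are rounded down; the function name and data are hypothetical, and no P-value is computed because the standard library lacks a t distribution.

```python
from math import floor, sqrt
from statistics import mean, variance


def unequal_variance_t(sample1, sample2, delta0=0.0):
    """Return (t0_star, nu, nu rounded down) without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # S1^2 / n1
    v2 = variance(sample2) / n2  # S2^2 / n2
    t0_star = (mean(sample1) - mean(sample2) - delta0) / sqrt(v1 + v2)
    # Approximate degrees of freedom
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t0_star, nu, floor(nu)


# Hypothetical observations
t0_star, nu, nu_int = unequal_variance_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t0_star, 3), round(nu, 2), nu_int)
```

With equal sample sizes the statistic coincides numerically with the pooled one; only the degrees of freedom differ.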


9.2.2 Type II Error and Choice of Sample Size

The operating characteristic curves in Appendix Charts VIIe, VIIf, VIIg, and
VIIh are used to evaluate the type II error for the case in which
$\sigma_1^2 = \sigma_2^2 = \sigma^2$. Unfortunately, when $\sigma_1^2 \ne \sigma_2^2$,
the distribution of $T_0^*$ is unknown if the null hypothesis is false, and no
operating characteristic curves are available for this case. For the two-sided
alternative $H_1: \mu_1 - \mu_2 = \Delta \ne \Delta_0$, when
$\sigma_1^2 = \sigma_2^2 = \sigma^2$ and $n_1 = n_2 = n$, Charts VIIe and VIIf are used
with

$$d = \frac{|\Delta - \Delta_0|}{2\sigma} \qquad (9.17)$$

where $\Delta$ is the true difference in means that is of interest. To use these
curves, they must be entered with the sample size $n^* = 2n - 1$. For the one-sided
alternative hypothesis, we use Charts VIIg and VIIh and define $d$ and $\Delta$ as in
Equation 9.17. Note that the parameter $d$ is a function of $\sigma$, which is
unknown. As in the single-sample t-test, we may have to rely on a prior
estimate of $\sigma$ or use a subjective estimate. Alternatively, we could define the
differences in the mean that we wish to detect relative to $\sigma$.

9.2.3 Confidence Interval on the Difference in Means, Variances Unknown

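Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$

To develop the confidence interval for the difference in means $\mu_1 - \mu_2$ when
both variances are equal, note that the distribution of the statistic

$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \qquad (9.18)$$

is the t distribution with $n_1 + n_2 - 2$ degrees of freedom.
Therefore, $P(-t_{\alpha/2,\,n_1+n_2-2} \le T \le t_{\alpha/2,\,n_1+n_2-2}) = 1 - \alpha$.
Now substituting Equation 9.18 for $T$ and manipulating the quantities inside the
probability statement leads to the $100(1 - \alpha)\%$ confidence interval on
$\mu_1 - \mu_2$:

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \qquad (9.19)$$

where $s_p = \sqrt{[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2]/(n_1 + n_2 - 2)}$ is the pooled
estimate of the common population standard deviation, and
$t_{\alpha/2,\,n_1+n_2-2}$ is the upper $\alpha/2$ percentage point of the t
distribution with $n_1 + n_2 - 2$ degrees of freedom.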


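Case 2: $\sigma_1^2 \ne \sigma_2^2$

In many situations, assuming that $\sigma_1^2 = \sigma_2^2$ is not reasonable. When this
assumption is unwarranted, we may still find a $100(1 - \alpha)\%$ confidence
interval on $\mu_1 - \mu_2$ using the fact that
$T^* = [\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)]/\sqrt{S_1^2/n_1 + S_2^2/n_2}$ is
distributed approximately as t with degrees of freedom $\nu$ given by Equation 9.16.
The CI expression follows.

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \qquad (9.20)$$

where $t_{\alpha/2,\,\nu}$ is the upper $\alpha/2$ percentage point of the t distribution
with $\nu$ degrees of freedom.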


9.2.4 A Nonparametric Test for the Difference in Two Means

Suppose that we have two independent continuous populations 𝑋1 and 𝑋2 with
means μ1 and μ2, but we are unwilling to assume that they are
(approximately) normal. However, we can assume that the distributions of 𝑋1
and 𝑋2 are continuous and have the same shape and spread, and differ only
(possibly) in their locations. The Wilcoxon rank-sum test can be used to test
the hypothesis 𝐻0 : 𝜇1 = 𝜇2 . This procedure is sometimes called the Mann-
Whitney test, although the Mann-Whitney test statistic is usually expressed in
a different form.

9.2.4.1 Description of the Wilcoxon Rank-Sum Test

Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ be two
independent random samples of sizes $n_1 \le n_2$ from the continuous populations
$X_1$ and $X_2$ described earlier. We wish to test the hypotheses

$$H_0: \mu_1 = \mu_2 \qquad H_1: \mu_1 \ne \mu_2$$

The test procedure is as follows. Arrange all $n_1 + n_2$ observations in ascending
order of magnitude and assign ranks to them. If two or more observations are
tied (identical), use the mean of the ranks that would have been assigned if
the observations differed. Let $W_1$ be the sum of the ranks in the smaller sample
(sample 1), and define $W_2$ to be the sum of the ranks in the other sample. Then,

$$W_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2} - W_1$$

Now if the sample means do not differ, we expect the sum of the ranks to be
nearly equal for both samples after adjusting for the difference in sample size.
Consequently, if the sums of the ranks differ greatly, we conclude that the
means are not equal.
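
The ranking procedure, including the averaging of tied ranks, can be sketched as follows. The helper names `midranks` and `wilcoxon_rank_sums` and the data are hypothetical, and the code assumes that the first argument is the smaller sample.

```python
def midranks(values):
    """Ranks 1..N in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sums(sample1, sample2):
    """Return (w1, w2): rank sums of sample 1 (the smaller sample) and sample 2."""
    n1, n2 = len(sample1), len(sample2)
    ranks = midranks(list(sample1) + list(sample2))
    w1 = sum(ranks[:n1])
    w2 = (n1 + n2) * (n1 + n2 + 1) / 2 - w1
    return w1, w2


# Hypothetical data with one tie (the value 2.0 appears in both samples)
print(wilcoxon_rank_sums([1.1, 2.0, 3.5], [2.0, 4.2, 5.0, 6.3]))
```

The observed rank sums would then be compared with the tabulated critical value for the given sample sizes.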


Appendix Table X contains the critical values of the rank sums for $\alpha = 0.05$ and
$\alpha = 0.01$, assuming the preceding two-sided alternative. Refer to Appendix
Table X with the appropriate sample sizes $n_1$ and $n_2$, and the critical value
$w_{\alpha}$ can be obtained. The null hypothesis $H_0: \mu_1 = \mu_2$ is rejected in
favor of $H_1: \mu_1 \ne \mu_2$ if either of the observed values $w_1$ or $w_2$ is less
than or equal to the tabulated critical value $w_{\alpha}$.

The procedure can also be used for one-sided alternatives. If the alternative
is $H_1: \mu_1 < \mu_2$, reject $H_0$ if $w_1 \le w_{\alpha}$; for $H_1: \mu_1 > \mu_2$,
reject $H_0$ if $w_2 \le w_{\alpha}$. For these one-sided tests, the tabulated critical
values $w_{\alpha}$ correspond to levels of significance of $\alpha = 0.025$ and
$\alpha = 0.005$.


9.2.4.2 Large-Sample Approximation

When both $n_1$ and $n_2$ are moderately large, say, more than eight, the
distribution of $W_1$ can be well approximated by the normal distribution with
mean

$$\mu_{W_1} = \frac{n_1(n_1 + n_2 + 1)}{2}$$

and variance

$$\sigma_{W_1}^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$

Therefore, for $n_1$ and $n_2 > 8$, we could use

Normal Approximation for Wilcoxon Rank-Sum Test Statistic

$$Z_0 = \frac{W_1 - \mu_{W_1}}{\sigma_{W_1}}$$

as a statistic, and the appropriate critical region is $|z_0| > z_{\alpha/2}$,
$z_0 > z_{\alpha}$, or $z_0 < -z_{\alpha}$, depending on whether the test is a
two-tailed, upper-tailed, or lower-tailed test.
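
The large-sample approximation translates directly into code. This sketch takes an observed rank sum and the two sample sizes and returns the standardized statistic with a two-sided P-value; the function name `rank_sum_z` and the numbers are assumed for illustration, and no tie or continuity correction is applied.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_z(w1, n1, n2):
    """Normal approximation to the rank-sum test: return (z0, two-sided P-value)."""
    mu_w1 = n1 * (n1 + n2 + 1) / 2
    sigma_w1 = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z0 = (w1 - mu_w1) / sigma_w1
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z0)))
    return z0, p_value


# Hypothetical case: n1 = n2 = 10 and an observed rank sum of 78 for sample 1
z0, p = rank_sum_z(78, 10, 10)
print(round(z0, 3), round(p, 4))
```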

9.2.4.3 Comparison to the t-Test

In Module 8, we discussed the comparison of the t-test with the Wilcoxon
signed-rank test. The results for the two-sample problem are similar to the
one-sample case. That is, when the normality assumption is correct, the
Wilcoxon rank-sum test is approximately 95% as efficient as the t-test in large
samples. On the other hand, regardless of the form of the distributions, the
Wilcoxon rank-sum test will always be at least 86% as efficient. The efficiency
of the Wilcoxon test relative to the t-test is usually high if the underlying
distribution has heavier tails than the normal because the behavior of the t -
test is very dependent on the sample mean, which is quite unstable in heavy-
tailed distributions.


9.2.5 Paired t-Test

A special case of the two-sample t-tests in Section 9.2 occurs when the
observations on the two populations of interest are collected in pairs. Each
pair of observations, say (𝑋1𝑗 , 𝑋2𝑗 ), is taken under homogeneous conditions,
but these conditions may change from one pair to another. For example,
suppose that we are interested in comparing two different types of tips for a
hardness-testing machine. This machine presses the tip into a metal specimen
with a known force. By measuring the depth of the depression caused by the
tip, the hardness of the specimen can be determined. If several specimens
were selected at random, half tested with tip 1, half tested with tip 2, and the
pooled or independent t - test in Section 9.2 was applied, the results of the
test could be erroneous. The metal specimens could have been cut from bar
stock that was produced in different heats, or they might not be homogeneous
in some other way that might affect hardness. Then the observed difference
in mean hardness readings for the two tip types also includes hardness
differences in specimens.

A more powerful experimental procedure is to collect the data in pairs, that
is, to make two hardness readings on each specimen, one with each tip. The
test procedure would then consist of analyzing the differences in hardness
readings on each specimen. If there is no difference between tips, the mean
of the differences should be zero. This test procedure is called the paired
t-test.

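Let $(X_{11}, X_{21}), (X_{12}, X_{22}), \ldots, (X_{1n}, X_{2n})$ be a set of $n$
paired observations for which we assume that the mean and variance of the population
represented by $X_1$ are $\mu_1$ and $\sigma_1^2$ and the mean and variance of the
population represented by $X_2$ are $\mu_2$ and $\sigma_2^2$. Define the difference for
each pair of observations as $D_j = X_{1j} - X_{2j}$, $j = 1, 2, \ldots, n$. The
$D_j$'s are assumed to be normally distributed with mean

$$\mu_D = E(X_1 - X_2) = E(X_1) - E(X_2) = \mu_1 - \mu_2$$

and variance $\sigma_D^2$, so testing hypotheses about the difference between $\mu_1$
and $\mu_2$ can be accomplished by performing a one-sample t-test on $\mu_D$.
Specifically, testing $H_0: \mu_1 - \mu_2 = \Delta_0$ against
$H_1: \mu_1 - \mu_2 \ne \Delta_0$ is equivalent to testing

$$H_0: \mu_D = \Delta_0 \qquad H_1: \mu_D \ne \Delta_0$$

The test statistic and decision procedure follow.

Paired t-Test

Test statistic:

$$T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}} \qquad (9.24)$$

For $H_1: \mu_D \ne \Delta_0$: reject $H_0$ if $|t_0| > t_{\alpha/2,\,n-1}$.
For $H_1: \mu_D > \Delta_0$: reject $H_0$ if $t_0 > t_{\alpha,\,n-1}$.
For $H_1: \mu_D < \Delta_0$: reject $H_0$ if $t_0 < -t_{\alpha,\,n-1}$.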


In Equation 9.24, $\bar{D}$ is the sample average of the $n$ differences
$D_1, D_2, \ldots, D_n$, and $S_D$ is the sample standard deviation of these
differences.

The software results essentially agree with the manual calculations. In addition to
the hypothesis test results, most computer software reports a two-sided CI on the
difference in means. This CI is found by constructing a single-sample CI on $\mu_D$.
We provide the details later.
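
A paired analysis reduces to a one-sample computation on the differences, as the following sketch shows. It returns the statistic $(\bar{d} - \Delta_0)/(s_D/\sqrt{n})$ and its $n - 1$ degrees of freedom; the function name `paired_t_statistic` and the readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(x1, x2, delta0=0.0):
    """Return (t0, degrees_of_freedom) for the paired t-test on mu_D."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x1, x2)]  # D_j = X_1j - X_2j
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation of the differences
    t0 = (d_bar - delta0) / (s_d / sqrt(n))
    return t0, n - 1


# Hypothetical readings taken on the same five specimens with two tips
t0, df = paired_t_statistic([10, 12, 9, 14, 11], [8, 11, 9, 11, 10])
print(round(t0, 3), df)
```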


Paired Versus Unpaired Comparisons. In performing a comparative
experiment, the investigator can sometimes choose between the paired
experiment and the two-sample (or unpaired) experiment. If $n$ measurements are
to be made on each population, the two-sample t-statistic is

$$T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{\dfrac{1}{n} + \dfrac{1}{n}}}$$

which would be compared to $t_{2n-2}$, and of course, the paired t-statistic is

$$T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$$

which is compared to $t_{n-1}$. Notice that because

$$\bar{D} = \sum_{j=1}^{n}\frac{D_j}{n} = \sum_{j=1}^{n}\frac{X_{1j} - X_{2j}}{n} = \sum_{j=1}^{n}\frac{X_{1j}}{n} - \sum_{j=1}^{n}\frac{X_{2j}}{n} = \bar{X}_1 - \bar{X}_2$$

the numerators of both statistics are identical. However, the denominator of the
two-sample t-test is based on the assumption that $X_1$ and $X_2$ are independent.
In many paired experiments, a strong positive correlation $\rho$ exists for $X_1$ and
$X_2$. Then it can be shown that

$$V(\bar{D}) = V(\bar{X}_1 - \bar{X}_2 - \Delta_0) = V(\bar{X}_1) + V(\bar{X}_2) - 2\,\mathrm{cov}(\bar{X}_1, \bar{X}_2) = \frac{2\sigma^2(1 - \rho)}{n}$$

assuming that both populations $X_1$ and $X_2$ have identical variances $\sigma^2$.
Furthermore, $S_D^2/n$ estimates the variance of $\bar{D}$. Whenever a positive
correlation exists within the pairs, the denominator for the paired t-test will be
smaller than the denominator of the two-sample t-test. This can cause the two-sample
t-test to considerably understate the significance of the data if it is incorrectly
applied to paired samples. Although pairing will often lead to a smaller value of the
variance of $\bar{X}_1 - \bar{X}_2$, it does have a disadvantage: the paired t-test
leads to a loss of $n - 1$ degrees of freedom in comparison to the two-sample t-test.
Generally, we know that increasing the degrees of freedom of a test increases
the power against any fixed alternative values of the parameter.

So how do we decide to conduct the experiment? Should we pair the
observations or not? Although this question has no general answer, we can give
some guidelines based on the preceding discussion.

1. If the experimental units are relatively homogeneous (small σ) and the
correlation within pairs is small, the gain in precision attributable to pairing will
be offset by the loss of degrees of freedom, so an independent sample
experiment should be used.


2. If the experimental units are relatively heterogeneous (large σ) and there is
large positive correlation within pairs, the paired experiment should be used.
Typically, this case occurs when the experimental units are the same for both
treatments; as in Example 9.11, the same girders were used to test the two
methods.

Implementing the rules still requires judgment because σ and ρ are never known
precisely. Furthermore, if the number of degrees of freedom is large (say, 40 or
50), the loss of n−1 of them for pairing may not be serious. However, if the
number of degrees of freedom is small (say, 10 or 20), losing half of them is
potentially serious if not compensated for by increased precision from pairing.
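
The effect of within-pair correlation on the two statistics can be illustrated with a short Python sketch. The data values below are hypothetical (they are not from Example 9.11), and the function names are illustrative only. With strongly correlated pairs, the numerators agree but the paired denominator is much smaller.

```python
import math
from statistics import mean, stdev

def two_sample_t(x1, x2, delta0=0.0):
    """Pooled two-sample t statistic, assuming equal sample sizes n."""
    n = len(x1)
    sp = math.sqrt((stdev(x1) ** 2 + stdev(x2) ** 2) / 2)  # pooled std. dev.
    return (mean(x1) - mean(x2) - delta0) / (sp * math.sqrt(1 / n + 1 / n))

def paired_t(x1, x2, delta0=0.0):
    """Paired t statistic computed from the within-pair differences."""
    d = [a - b for a, b in zip(x1, x2)]
    return (mean(d) - delta0) / (stdev(d) / math.sqrt(len(d)))

# Hypothetical measurements on the same six units under two treatments.
x1 = [10.2, 12.5, 15.1, 18.0, 20.4, 23.3]
x2 = [9.8, 12.0, 14.9, 17.4, 20.1, 22.6]

print("two-sample t0:", round(two_sample_t(x1, x2), 3))  # compare with t, 2n-2 df
print("paired t0:    ", round(paired_t(x1, x2), 3))      # compare with t, n-1 df
```

Here the units differ widely from one another while the two treatments track each other closely on each unit, which is the situation described in guideline 2.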

Confidence Interval for $\mu_D$: To construct the confidence interval for
$\mu_D = \mu_1 - \mu_2$, note that

$$T = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}}$$

follows a t distribution with n − 1 degrees of freedom. Then, because
$P(-t_{\alpha/2,n-1} \le T \le t_{\alpha/2,n-1}) = 1 - \alpha$, we can substitute for T in the
preceding expression and perform the necessary steps to isolate $\mu_D = \mu_1 - \mu_2$
in the inequalities. This leads to the following 100(1 − α)% confidence interval
on $\mu_1 - \mu_2$:

$$\bar{d} - t_{\alpha/2,n-1}\,\frac{s_D}{\sqrt{n}} \le \mu_D \le \bar{d} + t_{\alpha/2,n-1}\,\frac{s_D}{\sqrt{n}}$$

This confidence interval is also valid for the case in which $\sigma_1^2 \ne \sigma_2^2$
because $s_D^2$ estimates $\sigma_D^2 = V(X_1 - X_2)$. Also, for large samples (say,
n ≥ 30 pairs), the explicit assumption of normality is unnecessary because of the
central limit theorem.
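
As a minimal sketch of the interval computation, the Python function below takes the paired observations and a critical value read from a t table. The data are hypothetical, and the function name `paired_ci` is an assumption made for illustration. The value $t_{0.025,5} = 2.571$ is the standard table entry for 5 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_ci(x1, x2, t_crit):
    """Two-sided CI on mu_D = mu_1 - mu_2 from paired data.

    t_crit is t_{alpha/2, n-1}, supplied by the caller from a t table.
    """
    d = [a - b for a, b in zip(x1, x2)]
    half_width = t_crit * stdev(d) / math.sqrt(len(d))
    return mean(d) - half_width, mean(d) + half_width

# Hypothetical paired measurements (n = 6 pairs, so n - 1 = 5 df).
x1 = [10.2, 12.5, 15.1, 18.0, 20.4, 23.3]
x2 = [9.8, 12.0, 14.9, 17.4, 20.1, 22.6]

low, high = paired_ci(x1, x2, t_crit=2.571)  # 95% CI
print(round(low, 3), round(high, 3))
```

Because the interval for these hypothetical data excludes zero, the corresponding two-sided paired t-test at α = 0.05 would reject $H_0: \mu_D = 0$.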


Nonparametric Approach to Paired Comparisons: Both the sign test and the
Wilcoxon signed-rank test discussed in Module 8 can be applied to paired
observations. In the case of the sign test, the null hypothesis is that the median
of the differences is equal to zero (that is, $H_0: \tilde{\mu}_D = 0$). The Wilcoxon
signed-rank test is for the null hypothesis that the mean of the differences is equal
to zero. The procedures are applied to the observed differences as described in
Module 8.
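
A sign test applied to paired differences can be sketched in Python as follows. This sketch assumes the usual exact binomial form of the two-sided sign test (ties dropped, P-value doubled from the smaller tail); the data and the function name are hypothetical.

```python
from math import comb

def sign_test_p_value(x1, x2):
    """Exact two-sided sign test P-value for H0: median difference = 0."""
    d = [a - b for a, b in zip(x1, x2) if a != b]  # zero differences are dropped
    n = len(d)
    r_plus = sum(1 for v in d if v > 0)            # number of positive differences
    k = min(r_plus, n - r_plus)
    # P(R <= k) when R ~ Binomial(n, 1/2), doubled for the two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical paired data in which every difference is positive.
x1 = [10.2, 12.5, 15.1, 18.0, 20.4, 23.3]
x2 = [9.8, 12.0, 14.9, 17.4, 20.1, 22.6]

p = sign_test_p_value(x1, x2)
print(p)
```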

9.3 Inference on the Variances of Two Normal Distributions

We now introduce tests and confidence intervals for the two population variances
shown in Figure 9.1. We assume that both populations are normal. Both the
hypothesis testing and confidence interval procedures are relatively sensitive to
the normality assumption.

9.3.1 𝑭 Distribution

Suppose that two independent normal populations are of interest when the
population means and variances, say, $\mu_1$, $\sigma_1^2$, $\mu_2$, and $\sigma_2^2$, are unknown.
We wish to test hypotheses about the equality of the two variances, say,
$H_0: \sigma_1^2 = \sigma_2^2$. Assume that two random samples of size $n_1$ from population 1
and of size $n_2$ from population 2 are available, and let $S_1^2$ and $S_2^2$ be the sample
variances. We wish to test the hypotheses


$$H_0: \sigma_1^2 = \sigma_2^2$$
$$H_1: \sigma_1^2 \ne \sigma_2^2$$

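The development of a test procedure for these hypotheses requires a new
probability distribution, the F distribution. The random variable F is defined to be
the ratio of two independent chi-square random variables, each divided by its
number of degrees of freedom. That is,

$$F = \frac{W/u}{Y/v}$$

where W and Y are independent chi-square random variables with u and v
degrees of freedom, respectively. We now formally state the sampling distribution
of F: the ratio above has the probability density function

$$f(x) = \frac{\Gamma\!\left(\dfrac{u+v}{2}\right)\left(\dfrac{u}{v}\right)^{u/2} x^{(u/2)-1}}{\Gamma\!\left(\dfrac{u}{2}\right)\Gamma\!\left(\dfrac{v}{2}\right)\left[\left(\dfrac{u}{v}\right)x + 1\right]^{(u+v)/2}}, \qquad 0 < x < \infty$$

and is said to follow the F distribution with u degrees of freedom in the numerator
and v degrees of freedom in the denominator.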

The mean and variance of the F distribution are $\mu = v/(v - 2)$ for $v > 2$ and

$$\sigma^2 = \frac{2v^2(u + v - 2)}{u(v - 2)^2(v - 4)}, \qquad v > 4$$

Two F distributions are shown in Figure 9.4. The F random variable is nonnegative,
and the distribution is skewed to the right. The F distribution looks very similar to
the chi-square distribution; however, the two parameters u and v provide extra
flexibility regarding shape.
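
The mean and variance formulas are easy to evaluate directly. The following Python sketch simply encodes them; the function names are illustrative, and the degrees of freedom u = 5, v = 10 are chosen only as an example.

```python
def f_mean(v):
    """Mean of an F distribution; requires denominator df v > 2."""
    if v <= 2:
        raise ValueError("mean is defined only for v > 2")
    return v / (v - 2)

def f_variance(u, v):
    """Variance of an F distribution; requires denominator df v > 4."""
    if v <= 4:
        raise ValueError("variance is defined only for v > 4")
    return 2 * v ** 2 * (u + v - 2) / (u * (v - 2) ** 2 * (v - 4))

print(f_mean(10))         # depends only on the denominator df
print(f_variance(5, 10))
```

Note that the mean does not depend on the numerator degrees of freedom u.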

The percentage points of the F distribution are given in Table VI of the
Appendix. Let $f_{\alpha,u,v}$ be the percentage point of the F distribution with numerator
degrees of freedom u and denominator degrees of freedom v such that the
probability that the random variable F exceeds this value is



$$P(F > f_{\alpha,u,v}) = \int_{f_{\alpha,u,v}}^{\infty} f(x)\,dx = \alpha$$

This is illustrated in Figure 9.5. For example, if u = 5 and v = 10, we find from
Table VI of the Appendix that

$$P(F > f_{0.05,5,10}) = P(F_{5,10} > 3.33) = 0.05$$

That is, the upper 5 percentage point of $F_{5,10}$ is $f_{0.05,5,10} = 3.33$.

Table VI contains only upper-tailed percentage points (for selected values of $f_{\alpha,u,v}$
for α ≤ 0.25) of the F distribution. The lower-tailed percentage points $f_{1-\alpha,u,v}$ can
be found as follows.

$$f_{1-\alpha,u,v} = \frac{1}{f_{\alpha,v,u}} \tag{9.31}$$

For example, to find the lower-tailed percentage point $f_{0.95,5,10}$, note that

$$f_{0.95,5,10} = \frac{1}{f_{0.05,10,5}} = \frac{1}{4.74} = 0.211$$
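
The reciprocal rule, with the degrees of freedom swapped, can be sketched as a small table lookup in Python. The dictionary below holds only three upper-tail values quoted in this module; it is a stand-in for the full appendix table, and the names are hypothetical.

```python
# Upper-tail points f_{alpha,u,v}, keyed by (alpha, u, v).
# Only values quoted in this module are included.
UPPER_TAIL = {
    (0.05, 5, 10): 3.33,
    (0.05, 10, 5): 4.74,
    (0.25, 15, 15): 1.43,
}

def lower_tail_point(alpha, u, v, table=UPPER_TAIL):
    """Return f_{1-alpha,u,v} = 1 / f_{alpha,v,u} (note the swapped df)."""
    return 1.0 / table[(alpha, v, u)]

print(round(lower_tail_point(0.05, 5, 10), 3))   # f_{0.95,5,10}
print(round(lower_tail_point(0.25, 15, 15), 2))  # f_{0.75,15,15}
```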


9.3.2 Hypothesis Tests on the Equality of Two Variances

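A hypothesis-testing procedure for the equality of two variances is based on the
following result: if $S_1^2$ and $S_2^2$ are the sample variances of independent random
samples of sizes $n_1$ and $n_2$ from two normal populations with variances $\sigma_1^2$ and
$\sigma_2^2$, the ratio

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

has an F distribution with $n_1 - 1$ numerator degrees of freedom and $n_2 - 1$
denominator degrees of freedom.

This result is based on the fact that $(n_1 - 1)S_1^2/\sigma_1^2$ is a chi-square random variable
with $n_1 - 1$ degrees of freedom, that $(n_2 - 1)S_2^2/\sigma_2^2$ is a chi-square random
variable with $n_2 - 1$ degrees of freedom, and that the two normal populations are
independent. Clearly, under the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$, the ratio $F_0 = S_1^2/S_2^2$
has an $F_{n_1-1,n_2-1}$ distribution. This is the basis of the following test procedure.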

The critical regions for these fixed-significance-level tests are shown in Figure 9.6.
Remember that this procedure is relatively sensitive to the normality assumption.
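
A fixed-significance-level F-test can be sketched in Python as below. The critical values are passed in by the caller, as if read from the appendix table. The sample variances are hypothetical; the critical value 2.65 is the $f_{0.05,9,14}$ entry quoted later in this module. The function name and argument layout are assumptions made for illustration.

```python
def f_test_equal_variances(s1_sq, s2_sq, upper_crit, lower_crit=None):
    """Fixed-level F-test of H0: sigma1^2 = sigma2^2.

    If lower_crit is None, the test is upper-tailed (H1: sigma1^2 > sigma2^2)
    and upper_crit should be f_{alpha, n1-1, n2-1}.
    Otherwise the test is two-sided, with upper_crit = f_{alpha/2, n1-1, n2-1}
    and lower_crit = f_{1-alpha/2, n1-1, n2-1}.
    """
    f0 = s1_sq / s2_sq
    if lower_crit is None:
        reject = f0 > upper_crit
    else:
        reject = f0 > upper_crit or f0 < lower_crit
    return f0, reject

# Hypothetical sample variances with n1 = 10 and n2 = 15 (9 and 14 df).
f0, reject = f_test_equal_variances(6.1, 2.0, upper_crit=2.65)
print(round(f0, 2), reject)
```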


P-Values for the F-Test:

The P-value approach can also be used with F-tests. To show how to do this,
consider the upper-tailed test. The P-value is the area (probability) under the F
distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom that lies beyond the computed
value of the test statistic $f_0$. Appendix A Table VI can be used to obtain upper and
lower bounds on the P-value. For example, consider an F-test with 9 numerator and
14 denominator degrees of freedom for which $f_0 = 3.05$. From Appendix A Table VI,
we find that $f_{0.05,9,14} = 2.65$ and $f_{0.025,9,14} = 3.21$, so because $f_0 = 3.05$ lies between
these two values, the P-value is between 0.025 and 0.05; that is, 0.025 < P < 0.05.
The P-value for a lower-tailed test would be found similarly, although because
Appendix A Table VI contains only upper-tailed points of the F distribution,
Equation 9.31 would have to be used to find the necessary lower-tail points. For a
two-tailed test, the bounds obtained from a one-tailed test would be doubled to
obtain the P-value.
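
The table-bracketing step for an upper-tailed test can be sketched in Python as follows. The dictionary holds the two table entries quoted above for 9 and 14 degrees of freedom; a fuller table would give tighter bounds. The function name is hypothetical.

```python
def p_value_bounds(f0, table):
    """Bracket the upper-tailed P-value using tabulated points.

    table maps an upper-tail area alpha to the percentage point f_alpha.
    Returns (lower_bound, upper_bound) for the P-value.
    """
    lower, upper = 0.0, 1.0
    for alpha, crit in table.items():
        if f0 >= crit:
            upper = min(upper, alpha)  # f0 is at or beyond f_alpha, so P <= alpha
        else:
            lower = max(lower, alpha)  # f0 is short of f_alpha, so P > alpha
    return lower, upper

table_9_14 = {0.05: 2.65, 0.025: 3.21}  # entries quoted in the text
print(p_value_bounds(3.05, table_9_14))
```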

To illustrate calculating bounds on the P-value for a two-tailed F-test, reconsider
Example 9.13. The computed value of the test statistic in this example is $f_0 = 0.85$.
This value falls in the lower tail of the $F_{15,15}$ distribution. The lower-tailed point that
has 0.25 probability to the left of it is $f_{0.75,15,15} = 1/f_{0.25,15,15} = 1/1.43 = 0.70$, and
because 0.70 < 0.85, the probability that lies to the left of 0.85 exceeds 0.25.
Therefore, we would conclude that the P-value for $f_0 = 0.85$ is greater than
2(0.25) = 0.5, so there is insufficient evidence to reject the null hypothesis. This is
consistent with the original conclusions from Example 9.13. The actual P-value is
0.7570. This value was obtained from a calculator from which we found that
$P(F_{15,15} \le 0.85) = 0.3785$ and 2(0.3785) = 0.7570. Computer software can also be
used to calculate the required probabilities.
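
As one way to obtain such a probability without tables, the sketch below integrates the F density numerically with Simpson's rule, using only the Python standard library. It is an illustrative approximation, assumes numerator degrees of freedom u > 2 (so the density is zero at x = 0), and the function names are not from any particular package.

```python
import math

def f_pdf(x, u, v):
    """Density of the F distribution with u and v df (sketch assumes u > 2)."""
    if x <= 0:
        return 0.0
    log_c = (math.lgamma((u + v) / 2) - math.lgamma(u / 2) - math.lgamma(v / 2)
             + (u / 2) * math.log(u / v))
    return math.exp(log_c + (u / 2 - 1) * math.log(x)
                    - ((u + v) / 2) * math.log1p(u * x / v))

def f_cdf(x, u, v, steps=2000):
    """P(F <= x) by composite Simpson's rule on [0, x]; steps must be even."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = f_pdf(0.0, u, v) + f_pdf(x, u, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, u, v)
    return total * h / 3

lower_tail = f_cdf(0.85, 15, 15)  # P(F_{15,15} <= 0.85)
p_value = 2 * lower_tail          # two-tailed P-value
print(round(lower_tail, 4), round(p_value, 4))
```

The result should agree closely with the calculator values quoted in the text.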


9.3.3. Type II Error and Choice of Sample Size

Appendix Charts VIIo, VIIp, VIIq, and VIIr provide operating characteristic curves
for the F-test given in Section 9.3.2 for α = 0.05 and α = 0.01, assuming that
$n_1 = n_2 = n$. Charts VIIo and VIIp are used with the two-sided alternative
hypothesis. They plot β against the abscissa parameter

$$\lambda = \frac{\sigma_1}{\sigma_2}$$

for various $n_1 = n_2 = n$. Charts VIIq and VIIr are used for the one-sided alternative
hypotheses.

9.3.4. Confidence Interval on the Ratio of Two Variances

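To find the confidence interval on $\sigma_1^2/\sigma_2^2$, recall that the sampling distribution of

$$F = \frac{S_2^2/\sigma_2^2}{S_1^2/\sigma_1^2}$$

is an F with $n_2 - 1$ and $n_1 - 1$ degrees of freedom. Therefore,
$P(f_{1-\alpha/2,n_2-1,n_1-1} \le F \le f_{\alpha/2,n_2-1,n_1-1}) = 1 - \alpha$. Substitution for F and
manipulation of the inequalities will lead to the 100(1 − α)% confidence interval
for $\sigma_1^2/\sigma_2^2$:

$$\frac{s_1^2}{s_2^2}\, f_{1-\alpha/2,n_2-1,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\, f_{\alpha/2,n_2-1,n_1-1}$$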
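
The interval computation can be sketched in Python as below. The sample variances are hypothetical, and the upper percentage point 2.86 is assumed here to be the table value of $f_{0.025,15,15}$; the lower point follows from the reciprocal rule for lower-tail points. The function name is illustrative.

```python
def variance_ratio_ci(s1_sq, s2_sq, f_lower, f_upper):
    """CI on sigma1^2 / sigma2^2.

    f_lower = f_{1-alpha/2, n2-1, n1-1} and f_upper = f_{alpha/2, n2-1, n1-1},
    both supplied by the caller from an F table.
    """
    ratio = s1_sq / s2_sq
    return ratio * f_lower, ratio * f_upper

# Hypothetical data: n1 = n2 = 16, so both df are 15.
f_upper = 2.86           # assumed table value of f_{0.025,15,15}
f_lower = 1.0 / f_upper  # f_{0.975,15,15} via the reciprocal rule (equal df)

low, high = variance_ratio_ci(3.4, 4.0, f_lower, f_upper)
print(round(low, 3), round(high, 3))
```

An interval that contains 1 is consistent with equal population variances at the corresponding significance level.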


9.4 Inference on Two Population Proportions

We now consider the case with two binomial parameters of interest, say, 𝑝1 and 𝑝2 ,
and we wish to draw inferences about these proportions. We present large-sample
hypothesis testing and confidence interval procedures based on the normal
approximation to the binomial.

9.4.1 Large-Sample Tests on the Difference in Population Proportions

Suppose that two independent random samples of sizes $n_1$ and $n_2$ are taken
from two populations, and let $X_1$ and $X_2$ represent the number of observations
that belong to the class of interest in samples 1 and 2, respectively.
Furthermore, suppose that the normal approximation to the binomial is
applied to each population, so the estimators of the population proportions
$\hat{P}_1 = X_1/n_1$ and $\hat{P}_2 = X_2/n_2$ have approximate normal distributions. We are
interested in testing the hypotheses

𝐻0 : 𝑝1 = 𝑝2 𝐻1 : 𝑝1 ≠ 𝑝2

The statistic

$$Z = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$

is distributed approximately as standard normal and is the basis of a test for
$H_0: p_1 = p_2$. Specifically, if the null hypothesis $H_0: p_1 = p_2$ is true, by using the fact
that $p_1 = p_2 = p$, the random variable

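$$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{p(1 - p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

is distributed approximately N(0, 1). A pooled estimator of the common parameter
p is

$$\hat{P} = \frac{X_1 + X_2}{n_1 + n_2}$$

The test statistic for $H_0: p_1 = p_2$ is then

$$Z_0 = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

This leads to the test procedures described as follows.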
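
The pooled test statistic and a two-sided P-value can be computed with a few lines of Python. The counts below are hypothetical, and the function name is illustrative; the sketch assumes samples large enough for the normal approximation to apply.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Large-sample test of H0: p1 = p2 against H1: p1 != p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled estimator of the common p
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z0 = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
    return z0, p_value

# Hypothetical counts: 60 of 200 in sample 1 and 40 of 200 in sample 2.
z0, p_value = two_proportion_z_test(60, 200, 40, 200)
print(round(z0, 3), round(p_value, 4))
```

For a fixed-level test, $H_0$ is rejected when $|z_0| > z_{\alpha/2}$, or equivalently when the P-value is below α.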


9.4.2 Type II Error and Choice of Sample Size

The computation of the β-error for the large-sample test of $H_0: p_1 = p_2$ is
somewhat more involved than in the single-sample case. The problem is that
the denominator of the test statistic $Z_0$ is an estimate of the standard
deviation of $\hat{P}_1 - \hat{P}_2$ under the assumption that $p_1 = p_2 = p$. When $H_0: p_1 = p_2$
is false, the standard deviation of $\hat{P}_1 - \hat{P}_2$ is

$$\sigma_{\hat{P}_1 - \hat{P}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
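
A rough Python sketch of the β computation for the two-sided test follows. It is an assumption of this sketch, not a formula reproduced from the module: the acceptance region $|Z_0| \le z_{\alpha/2}$ is approximated using the weighted average $\bar{p} = (n_1 p_1 + n_2 p_2)/(n_1 + n_2)$ in the null standard error, and the true standard deviation above is used for $\hat{P}_1 - \hat{P}_2$. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def beta_two_sided(p1, p2, n1, n2, alpha=0.05):
    """Approximate type II error of the two-sided large-sample test."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    se_null = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))  # used in Z0
    sigma = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)    # true std. dev.
    delta = p1 - p2
    # Probability that P1_hat - P2_hat falls inside the acceptance region.
    return nd.cdf((z * se_null - delta) / sigma) - nd.cdf((-z * se_null - delta) / sigma)

print(round(beta_two_sided(0.30, 0.20, 200, 200), 3))
print(round(beta_two_sided(0.30, 0.20, 50, 50), 3))  # smaller n gives larger beta
```

When $p_1 = p_2$, the two standard deviations coincide and the function returns 1 − α, as expected.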


9.4.3 Confidence Interval on the Difference in Population Proportions

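The traditional confidence interval for $p_1 - p_2$ can be found directly because we
know that

$$Z = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$

is approximately a standard normal random variable. Thus
$P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) \simeq 1 - \alpha$, so we can substitute for Z in this last expression
and use an approach similar to the one employed previously to find an approximate
100(1 − α)% two-sided confidence interval for $p_1 - p_2$:

$$\hat{p}_1 - \hat{p}_2 - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \tag{9.41}$$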


The CI in Equation 9.41 is the traditional one usually given for a difference in
two binomial proportions. However, the actual confidence level for this interval can
deviate substantially from the nominal or advertised value. So when we want a
95% CI (for example) and use $z_{0.025} = 1.96$ in Equation 9.41, the actual confidence
level that we experience may differ from 95%. This situation can be improved by
a very simple adjustment to the procedure: add one success and one failure to
the data from each sample (that is, replace $x_i$ by $x_i + 1$ and $n_i$ by $n_i + 2$) and
then calculate the interval in Equation 9.41 from the adjusted values.
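
Both the traditional interval and the adjusted one can be sketched with a single Python function. The counts are hypothetical and the function name is illustrative; the `adjusted` flag simply adds one success and one failure to each sample before applying the traditional formula.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, conf=0.95, adjusted=False):
    """Approximate two-sided CI on p1 - p2."""
    if adjusted:
        # One extra success and one extra failure in each sample.
        x1, n1, x2, n2 = x1 + 1, n1 + 2, x2 + 1, n2 + 2
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2 - half_width, p1 - p2 + half_width

# Hypothetical counts: 60 of 200 in sample 1 and 40 of 200 in sample 2.
trad = two_proportion_ci(60, 200, 40, 200)
adj = two_proportion_ci(60, 200, 40, 200, adjusted=True)
print([round(v, 4) for v in trad])
print([round(v, 4) for v in adj])
```

With samples this large the two intervals are nearly identical; the adjustment matters more when the samples are small or the proportions are near 0 or 1.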


VIII. Self-Evaluation:

1. An article in the November 1983 Consumer Reports compared various types of
batteries. The average lifetimes of Duracell Alkaline AA batteries and Eveready
Energizer Alkaline AA batteries were given as 4.1 hours and 4.5 hours,
respectively. Suppose these are the population average lifetimes.

a. Let $\bar{X}$ be the sample average lifetime of 100 Duracell batteries and $\bar{Y}$ be the
sample average lifetime of 100 Eveready batteries. What is the mean value
of $\bar{X} - \bar{Y}$ (i.e., where is the distribution of $\bar{X} - \bar{Y}$ centered)? How does your
answer depend on the specified sample sizes?

b. Suppose the population standard deviations of lifetime are 1.8 hours for
Duracell batteries and 2.0 hours for Eveready batteries. With the sample
sizes given in part (a), what is the variance of the statistic $\bar{X} - \bar{Y}$, and what
is its standard deviation?

c. For the sample sizes given in part (a), draw a picture of the approximate
distribution curve of $\bar{X} - \bar{Y}$ (include a measurement scale on the horizontal
axis). Would the shape of the curve necessarily be the same for sample
sizes of 10 batteries of each type? Explain.

2. Determine the number of degrees of freedom for the two-sample t test or CI in
each of the following situations:

a. $m = 10$, $n = 10$, $s_1 = 5.0$, $s_2 = 6.0$
b. $m = 10$, $n = 15$, $s_1 = 5.0$, $s_2 = 6.0$
c. $m = 10$, $n = 10$, $s_1 = 2.0$, $s_2 = 6.0$
d. $m = 12$, $n = 24$, $s_1 = 5.0$, $s_2 = 6.0$

3. Consider the accompanying data on breaking load (kg/25 mm width) for
various fabrics in both an un-abraded condition and an abraded condition (“The
Effect of Wet Abrasive Wear on the Tensile Properties of Cotton and
Polyester-Cotton Fabrics,” J. Testing and Evaluation, 1993: 84–93). Use the paired
t test, as did the authors of the cited article, to test $H_0: \mu_D = 0$ versus $H_a: \mu_D > 0$
at significance level .01.


IX. Review of Concepts:

Summary table and road map for inference procedures for two samples. The table
in the module summarizes all of the two-sample parametric inference procedures
given in this module. The table contains the null hypothesis statements, the test
statistics, the criteria for rejection of the various alternative hypotheses, and the
formulas for constructing the 100(1 − α)% confidence intervals. The road map to
select the appropriate parametric confidence interval formula or hypothesis test
method for one-sample problems was presented in the previous module. In Table 9.5,
we extend the road map to two-sample problems. The primary comments stated
previously also apply here (except that we usually apply conclusions to a function
of the parameters from each sample, such as the difference in means):

1. Determine the function of the parameters (and the distribution of the data) that
is to be bounded by the confidence interval or tested by the hypothesis.

2. Check whether other parameters are known or need to be estimated (and
whether any assumptions are made).

X. Post-test: This post-test will be conducted online.

XI. References:

• Douglas C. Montgomery & George C. Runger. Applied Statistics and
Probability for Engineers. John Wiley & Sons; 7th ed. 2018.
• Hongshik Ahn. Probability and Statistics for Sciences & Engineering with
Examples in R. Cognella, Inc.; 2nd ed. 2018.
• Jay L. Devore. Probability and Statistics for Engineering and the Sciences.
Cengage Learning; 9th ed. 201


PRE-TEST 9
Statistical Inference of Two Samples

Name: Score:
Course & Year: Date:

Direction: Read the problems carefully. Write your solutions in a separate sheet of
paper.

1. An experiment to compare the tension bond strength of polymer latex modified
mortar (Portland cement mortar to which polymer latex emulsions have been
added during mixing) to that of unmodified mortar resulted in
$\bar{x} = 18.12$ kgf/cm² for the modified mortar ($m = 40$) and $\bar{y} = 16.87$ kgf/cm² for
the unmodified mortar ($n = 32$). Let $\mu_1$ and $\mu_2$ be the true average tension
bond strengths for the modified and unmodified mortars, respectively. Assume
that the bond strength distributions are both normal.

a. Assuming that $\sigma_1 = 1.6$ and $\sigma_2 = 1.4$, test $H_0: \mu_1 - \mu_2 = 0$ versus
$H_a: \mu_1 - \mu_2 > 0$ at level .01.
b. Compute the probability of a type II error for the test of part (a) when
$\mu_1 - \mu_2 = 1$.
c. Suppose the investigator decided to use a level .05 test and wished
$\beta = 0.10$ when $\mu_1 - \mu_2 = 1$. If $m = 40$, what value of $n$ is necessary?
d. How would the analysis and conclusion of part (a) change if $\sigma_1$ and $\sigma_2$
were unknown but $s_1 = 1.6$ and $s_2 = 1.4$?

2. Tensile-strength tests were carried out on two different grades of wire rod
(“Fluidized Bed Patenting of Wire Rods,” Wire J., June 1977: 56–61), resulting
in the accompanying data.

a. Does the data provide compelling evidence for concluding that true average
strength for the 1078 grade exceeds that for the 1064 grade by more than
10 kg/mm²? Test the appropriate hypotheses using the P-value
approach.

b. Estimate the difference between true average strengths for the two grades
in a way that provides information about precision and reliability.


POST-TEST 9
Statistical Inference of Two Samples

Name: Score:
Course & Year: Date:

Direction: Read the problems carefully. Write your solutions in a separate sheet of
paper.

1. An engineer wishes to compare strength properties of steel beams with similar
beams made with a particular alloy. The same number of beams, n, of each
type will be tested. Each beam will be set in a horizontal position with a support
on each end, a force of 2500 lbs. will be applied at the center, and the deflection
will be measured. From past experience with such beams, the engineer is
willing to assume that the true standard deviation of deflection for both types of
beam is .05 in. Because the alloy is more expensive, the engineer wishes to
test at level .01 whether it has smaller average deflection than the steel beam.
What value of n is appropriate if the desired type II error probability is .05 when
the difference in true average deflection favors the alloy by .04 in.?

2. The slant shear test is widely accepted for evaluating the bond of resinous
repair materials to concrete; it utilizes cylinder specimens made of two identical
halves bonded at 30°. The article “Testing the Bond between Repair Materials
and Concrete Substrate” (ACI Materials J., 1996: 553–558) reported that for 12
specimens prepared using wire-brushing, the sample mean shear strength
(N/mm²) and sample standard deviation were 19.20 and 1.58, respectively,
whereas for 12 hand-chiseled specimens, the corresponding values were 23.13
and 4.01. Does the true average strength appear to be different for the two
methods of surface preparation? State and test the relevant hypotheses using
a significance level of .05. What are you assuming about the shear strength
distributions?

3. It has been estimated that between 1945 and 1971, as many as 2 million
children were born to mothers treated with diethylstilbestrol (DES), a
nonsteroidal estrogen recommended for pregnancy maintenance. The FDA
banned this drug in 1971 because research indicated a link with the incidence
of cervical cancer. The article “Effects of Prenatal Exposure to Diethylstilbestrol
(DES) on Hemispheric Laterality and Spatial Ability in Human Males”
(Hormones and Behavior, 1992: 62–75) discussed a study in which 10 males
exposed to DES, and their unexposed brothers, underwent various tests. This
is the summary data on the results of a spatial ability test: $\bar{x} = 12.6$ (exposed),
$\bar{y} = 13.7$, and standard error of mean difference = 0.50. Test at level .05 to see
whether exposure is associated with reduced spatial ability by obtaining the
P-value.
