IE-525-Chapter 2
1. Dot Diagrams
A stack of observed values along a
horizontal axis.
Example
Tension Bond Strength Data Table 2-1
[Figure: dot diagram of the tension bond strength data; values printed with the figure: 16.76 (0.316) and 17.04 (0.247).]
2. Scatter Plots
A two-dimensional plot of individual data values
for measurements on pairs of variables.
Example
Tension bond strength for the modified
formulation, in order of data collection
(Table 2-1).
[Figure: Scatter Plot for Tension Bond Strength. Vertical axis: Modified Mortar, 16.3 to 17.7; horizontal axis: Run Order, 0 to 10.]
3. Histograms
An alternative to dot diagrams: a histogram shows how
often the data occur in various intervals of the
measured variable.
Example
Job CPU times (in seconds), n = 25
[Figure: Histogram. Vertical axis: Freq.; horizontal axis: $y_i$ (seconds).]
Interpreting a Histogram:
4. Box Plot
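A box plot displays five values: the minimum, $Q_L$ (lower quartile),
the median, $Q_U$ (upper quartile), and the maximum of the data set.
[Figure: box plot on the $y_i$ axis with $Q_L$ and $Q_U$ marked, the inner fences, the outer fences, and points beyond the fences marked X and *.]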
Outliers:
Errant observations (i.e., unusually large or
small relative to the other data values)
which lie outside of the range of data values
under study.
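The box plot quantities and the outlier check can be sketched in a few lines. This is a minimal illustration, not the slide's own procedure: the fence multipliers (1.5 and 3 times the interquartile range) are the common Tukey convention and are an assumption here, the quartile method is one of several in use, and the sample values are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Five-number summary plus fences (1.5*IQR and 3*IQR are assumed conventions)."""
    data = sorted(values)
    # "inclusive" treats the data as the whole population of interest;
    # other quartile definitions give slightly different Q_L and Q_U.
    q_l, median, q_u = quantiles(data, n=4, method="inclusive")
    iqr = q_u - q_l
    return {
        "min": data[0],
        "QL": q_l,
        "median": median,
        "QU": q_u,
        "max": data[-1],
        "inner_fences": (q_l - 1.5 * iqr, q_u + 1.5 * iqr),
        "outer_fences": (q_l - 3.0 * iqr, q_u + 3.0 * iqr),
    }


def flag_outliers(values):
    """Return the observations that fall outside the inner fences."""
    low, high = box_plot_summary(values)["inner_fences"]
    return [v for v in values if v < low or v > high]


sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]  # hypothetical data, not from the slides
summary = box_plot_summary(sample)
outliers = flag_outliers(sample)
print(summary)
print("suspected outliers:", outliers)
```

A flagged point is only a candidate outlier; deciding which of the sources listed next applies still requires checking how the value was obtained.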
Sources of outliers:
1. The measurement is observed, recorded, or
entered into a computer incorrectly.
2. The measurement comes from a different
population.
3. The measurement is correct but represents a
rare event.
Interval Estimator:
An interval between two values $\hat{\theta}_L$ and $\hat{\theta}_U$
that will enclose the parameter $\theta$ with a
specified probability (the confidence level),
i.e.,
$$P(\hat{\theta}_L \le \theta \le \hat{\theta}_U) = 1 - \alpha$$
where
$\hat{\theta}_L$ is the lower confidence limit, constructed
such that $(\alpha/2) \times 100\%$ of the sampling
distribution of $\hat{\theta}$ lies to its left;
$\hat{\theta}_U$ is the upper confidence limit, constructed
such that $(\alpha/2) \times 100\%$ of the sampling
distribution of $\hat{\theta}$ lies to its right;
$(1 - \alpha)$ is the confidence level of the
interval.
{$\alpha = 0.05$ for most practical applications}
Note:
The higher the confidence level, the wider
the interval.
(CLT)
Example:
A manufacturing company produces light
bulbs whose length of life is
approximately normally distributed with mean $\mu$
and standard deviation of 40 hours.
If a random sample of 16 bulbs has a mean
of 780 hours, construct a 95% confidence
interval for the population mean.
Conclusion
“At the 95% confidence level, we expect the
mean life of all bulbs produced to be between
760.4 and 799.6 hours.”
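The light bulb interval can be reproduced numerically. The sketch below assumes the usual known-sigma interval, $\bar{y} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$; the function name `z_interval` is illustrative and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(ybar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is known (assumed z-interval)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return ybar - half_width, ybar + half_width


# Values from the light bulb example: n = 16, mean 780 hours, sigma = 40 hours.
lower, upper = z_interval(ybar=780.0, sigma=40.0, n=16)
print(f"{lower:.1f} to {upper:.1f} hours")
```

Raising `confidence` to 0.99 widens the interval, which illustrates the earlier note that a higher confidence level gives a wider interval.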
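Estimating the mean ($\mu$):
Case II: $\sigma$ Unknown
Theorem:
If $\bar{y}$ and $s$ are the mean and standard
deviation of a random sample of size $n$ from a
normal population with unknown
variance, then the statistic
$$t = \frac{\bar{y} - \mu}{s/\sqrt{n}}$$
follows a t-distribution with $\nu = n - 1$ degrees of freedom.
The statistic $t$ can be used to construct
the interval such that:
$$P\left(\bar{y} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{y} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right) = 1 - \alpha$$
$t_{\alpha/2,\nu}$ is the tabulated value of the t statistic
with $\nu = n - 1$ df corresponding
to a tail area of $(\alpha/2)$.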
Example:
The contents of 7 similar containers of
sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0,
10.2, and 9.6 liters. Construct a 95%
confidence interval for the mean of all such
containers assuming that contents are
normally distributed.
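From sample data:
$\bar{y} = 10.0$ and $s = 0.283$ liters.
Using the t-distribution table,
for $\nu = 7 - 1 = 6$ df and $\alpha = 0.05$:
$t_{\alpha/2} = t_{0.025} = 2.447$
[Figure: t density with area $\alpha/2$ in each tail, beyond $-t_{\alpha/2}$ and $t_{\alpha/2}$.]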
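As a check on the sulfuric acid example, the sketch below recomputes $\bar{y}$ and $s$ from the seven listed contents and applies $\bar{y} \pm t_{\alpha/2}\, s/\sqrt{n}$. The critical value 2.447 is taken from the table lookup above rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Container contents in liters, from the example.
contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]
T_CRIT = 2.447  # tabulated t_{0.025} with 6 df, as quoted in the example

n = len(contents)
ybar = mean(contents)
s = stdev(contents)  # sample standard deviation (divisor n - 1)
half_width = T_CRIT * s / sqrt(n)
t_lower, t_upper = ybar - half_width, ybar + half_width

print(f"ybar = {ybar:.2f}, s = {s:.3f}")
print(f"95% interval: {t_lower:.2f} to {t_upper:.2f} liters")
```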
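Estimation of the Variance ($\sigma^2$)
Theorem:
If a random sample of $n$ observations
$y_1, y_2, \ldots, y_n$ is selected from a normal
population with finite mean $\mu$ and
variance $\sigma^2$, then the sampling
distribution of $(n-1)S^2/\sigma^2$ will follow
a $\chi^2$-distribution with $\nu = n - 1$ df.
$$P\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}}\right) = 1 - \alpha$$
where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are the tabulated $\chi^2$ values
with $\nu = n - 1$ df that have areas of $\alpha/2$ and $1 - \alpha/2$,
respectively, to their right.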
Example:
To estimate the variance of fill at a cannery,
10 cans were selected at random and their
contents were weighed. The following data
were obtained (in ounces): 7.96, 7.90, 7.98,
8.01, 7.97, 7.96, 8.03, 8.02, 8.04, 8.02.
Construct a 90% confidence interval
for estimating the variance.
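From sample data,
$s = 0.0430$
Using tables:
at $(1 - \alpha) = 0.90$ and $\nu = n - 1 = 9$,
$\chi^2_{0.05} = 16.919$ and
$\chi^2_{0.95} = 3.32511$
[Figure: $\chi^2$ density $f(\chi^2)$ with area $\alpha/2$ in each tail, below $\chi^2_{1-\alpha/2}$ and above $\chi^2_{\alpha/2}$.]
Conclusion
Note:
$0.0314 < \sigma < 0.0707$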
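The cannery interval can be recomputed from the raw data. This sketch uses the two tabulated $\chi^2$ values quoted above (they are not computed here) and the interval $(n-1)S^2/\chi^2_{\alpha/2} \le \sigma^2 \le (n-1)S^2/\chi^2_{1-\alpha/2}$; taking square roots gives the limits for $\sigma$. Small differences in the last digit relative to the slide come from rounding $s$ to 0.0430 before the calculation.

```python
from math import sqrt
from statistics import variance

# Fill weights in ounces, from the example.
fills = [7.96, 7.90, 7.98, 8.01, 7.97, 7.96, 8.03, 8.02, 8.04, 8.02]
CHI2_UPPER = 16.919   # tabulated chi-square with area 0.05 to its right, 9 df
CHI2_LOWER = 3.32511  # tabulated chi-square with area 0.95 to its right, 9 df

n = len(fills)
s2 = variance(fills)  # sample variance (divisor n - 1)

var_lower = (n - 1) * s2 / CHI2_UPPER
var_upper = (n - 1) * s2 / CHI2_LOWER
sd_lower, sd_upper = sqrt(var_lower), sqrt(var_upper)

print(f"s = {sqrt(s2):.4f}")
print(f"90% interval for the variance: {var_lower:.6f} to {var_upper:.6f}")
print(f"90% interval for sigma: {sd_lower:.4f} to {sd_upper:.4f}")
```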
Hypotheses Testing:
Using sample evidence to reject or accept a
claim regarding the value of a population
parameter.
Properties
We are to make a decision based on sample
data to either reject or accept a claim:

Decision    H0 True         H0 False
Accept      No error        Type II error
Reject      Type I error    No error
Definitions:
• The probability of making a type I error ($\alpha$):
“The conditional probability of rejecting $H_0$ when
$H_0$ is true.”
• A two-tailed test:
A test of hypothesis where the alternative is
two-directional.
[Figure: rejection regions of area $\alpha/2$ in each tail, below $\theta_L$ and above $\theta_U$; acceptance region between them, centered at $\theta_0$.]
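• A one-tailed test:
A test of hypothesis where the alternative is
one-directional:
Upper-tailed
$H_0: \theta \le \theta_0$ vs. $H_1: \theta > \theta_0$
[Figure: rejection region of area $\alpha$ in the upper tail, above $\theta_U$; acceptance region below it, around $\theta_0$.]
Lower-tailed
[Figure: rejection region of area $\alpha$ in the lower tail, below $\theta_L$; acceptance region above it, around $\theta_0$.]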
Testing a population mean ($\mu$)
Case I: $\sigma$ known
Example
A manufacturer of sports equipment has
developed a new synthetic fishing line that
he claims to have a mean breaking strength
of 8 kg with a standard deviation of 0.5 kg. A
random sample of 20 lines is tested and
found to have a mean of 7.8 kg.
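$H_0: \mu = 8$ vs. $H_1: \mu < 8$
(lower-tailed test)
Rejection region:
at $\alpha = 0.01$, $Z_{0.01} = -2.326$
[Figure: standard normal density with the rejection region of area 0.01 to the left of $Z_{0.01}$ and the acceptance region to its right.]
In terms of the sample mean:
$$\bar{y}_L = \mu_0 + Z_{0.01}\frac{\sigma}{\sqrt{n}} = 8 - 2.326\,\frac{0.5}{\sqrt{20}} = 7.74 \text{ kg}$$
[Figure: sampling distribution of $\bar{y}$ centered at $\mu_0$, with the rejection region of area 0.01 below $\bar{y}_L$.]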
Conclusion
Since $\bar{y} = 7.8 > \bar{y}_L = 7.74$, we accept (fail to
reject) $H_0$ and conclude that there is not
sufficient evidence to reject the
manufacturer’s claim.
[Figure: the observed mean 7.8 lies in the acceptance region, above $\bar{y}_L$.]
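Alternative Procedure:
Using Z as the test statistic:
Rejection region: $Z < Z_{0.01} = -2.326$
From sample data:
$$Z = \frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} = \frac{7.8 - 8.0}{0.5/\sqrt{20}} = -1.79$$
[Figure: the observed $Z = -1.79$ lies in the acceptance region, to the right of $-2.326$.]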
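Both versions of the fishing line test (critical sample mean and Z statistic) can be checked with the standard library. This is a sketch under the example's assumptions (normal population, $\sigma = 0.5$ known); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the fishing line example.
mu0, sigma, n, ybar, alpha = 8.0, 0.5, 20, 7.8, 0.01

std_error = sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(alpha)   # lower-tail critical value, about -2.326
y_crit = mu0 + z_crit * std_error      # critical sample mean, about 7.74 kg
z_obs = (ybar - mu0) / std_error       # observed test statistic, about -1.79

reject_h0 = z_obs < z_crit             # equivalent to ybar < y_crit
print(f"z_crit = {z_crit:.3f}, y_crit = {y_crit:.2f}, z_obs = {z_obs:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The two decision rules always agree, because `z_obs < z_crit` is an algebraic rearrangement of `ybar < y_crit`.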
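$\beta = P\{\text{accepting } H_0: \mu = 8 \mid \mu_a = 7.5\}$
[Figure: sampling distributions of $\bar{y}$ under $\mu = 8$ and $\mu_a = 7.5$; $\beta$ is the area of the alternative distribution that falls in the acceptance region.]
Notes
• The power of the test at $\mu_a = 7.5$ kg:
$(1 - \beta) = 0.9842$
• $\beta$ is a function of $\mu_a$, $n$, and $\alpha$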
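The type II error probability for the fishing line test can be sketched as the chance that $\bar{y}$ lands in the acceptance region when the true mean is $\mu_a = 7.5$. The calculation below assumes the same $\sigma = 0.5$, $n = 20$, and $\alpha = 0.01$ as the example; it carries more decimals than a table lookup, so the last digit of the power may differ slightly from the slide's 0.9842.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_a, sigma, n, alpha = 8.0, 7.5, 0.5, 20, 0.01

std_error = sigma / sqrt(n)
# Critical sample mean of the lower-tailed test: reject H0 when ybar < y_crit.
y_crit = mu0 + NormalDist().inv_cdf(alpha) * std_error

# beta = P(ybar >= y_crit) when ybar ~ Normal(mu_a, std_error).
beta = 1.0 - NormalDist(mu_a, std_error).cdf(y_crit)
power = 1.0 - beta

print(f"beta = {beta:.4f}, power = {power:.4f}")
```

Changing `mu_a`, `n`, or `alpha` in this sketch shows directly how $\beta$ depends on each of them.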
The p-value of a test:
(Observed Significance)
“The probability of observing a value of the
test statistic at least as contradictory to the null ($H_0$)
as that computed from sample data.”
Example:
Compute the p-value for the test
conducted in the previous example:
$$Z = \frac{7.8 - 8.0}{0.5/\sqrt{20}} = -1.79$$
Using the table:
p-value $= P(z < -1.79) = 0.0367$
[Figure: standard normal density; the p-value is the area to the left of $-1.79$.]
$H_0$ will be rejected only at $\alpha > 0.0367$
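The p-value lookup can be reproduced with the standard normal CDF. This sketch keeps the unrounded statistic, so the result may differ from the table value 0.0367 in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

# Lower-tailed test from the fishing line example.
z_obs = (7.8 - 8.0) / (0.5 / sqrt(20))
p_value = NormalDist().cdf(z_obs)  # P(Z < z_obs)

alpha = 0.01
print(f"z = {z_obs:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```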
Testing a population mean ($\mu$)
Case II: $\sigma$ Unknown
Example
In the fishing line example, suppose $\sigma$ is
unknown and sample data are:
$n = 20$, $\bar{y} = 7.8$ kg and $S = 0.6$ kg.
Assuming that the breaking strength is
normally distributed, test the manufacturer’s
claim at $\alpha = 0.01$.
$H_0: \mu = 8$ vs. $H_1: \mu < 8$
(lower-tailed test)
Rejection region
using the t-distribution table:
at $\alpha = 0.01$ and $\nu = (n - 1) = 19$,
$t_{0.01} = -2.539$
[Figure: t density with the rejection region of area $\alpha$ to the left of $t_{0.01}$.]
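From sample data:
$$t = \frac{\bar{y} - \mu_0}{S/\sqrt{n}} = \frac{7.8 - 8.0}{0.6/\sqrt{20}} = -1.49$$
[Figure: the observed $t = -1.49$ lies in the acceptance region, to the right of $-2.539$.]
Conclusion
Since $t = -1.49 > t_{0.01} = -2.539$, we fail to reject $H_0$;
there is not sufficient evidence to reject the manufacturer’s claim.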
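The t test above reduces to one line of arithmetic plus a comparison. In this sketch the critical value is the tabulated $-2.539$ quoted in the example (19 df, $\alpha = 0.01$), since the standard library has no t quantile function.

```python
from math import sqrt

# Values from the fishing line example with sigma unknown.
mu0, ybar, s, n = 8.0, 7.8, 0.6, 20
T_CRIT = -2.539  # tabulated lower-tail value, 19 df, alpha = 0.01

t_obs = (ybar - mu0) / (s / sqrt(n))
reject_h0 = t_obs < T_CRIT

print(f"t = {t_obs:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```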
Testing the population variance $\sigma^2$
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$
Example
A manufacturer of car batteries claims that
the life of his batteries is approximately
normally distributed with standard deviation
equal to 0.9 years. A random sample of 10
such batteries has a standard deviation of
1.2 years. Is there sufficient evidence to
conclude that $\sigma > 0.9$ years? Use a 5% level
of significance.
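$H_0: \sigma^2 = (0.9)^2$ vs. $H_1: \sigma^2 > (0.9)^2$
(upper-tailed test)
At $\nu = (n - 1) = 9$ and $\alpha = 0.05$:
$\chi^2_{0.05} = 16.92$
[Figure: $\chi^2$ density $f(\chi^2)$ with the rejection region of area $\alpha$ to the right of 16.92.]
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \frac{9(1.2)^2}{(0.9)^2} = 16.0$$
Conclusion:
Since $\chi^2 = 16.0 < 16.92$, we fail to reject $H_0$:
there is not sufficient evidence to conclude that $\sigma > 0.9$ years.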
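The battery example can be verified with a short calculation. The critical value 16.92 is the tabulated figure quoted above; the rest is the statistic $(n-1)S^2/\sigma^2$ evaluated at the hypothesized $\sigma = 0.9$.

```python
# Values from the car battery example.
n, s, sigma0 = 10, 1.2, 0.9
CHI2_CRIT = 16.92  # tabulated chi-square with area 0.05 to its right, 9 df

chi2_obs = (n - 1) * s**2 / sigma0**2
reject_h0 = chi2_obs > CHI2_CRIT  # upper-tailed test

print(f"chi-square = {chi2_obs:.1f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```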
Studies based on Two Samples:
Randomized designs to compare two
conditions or treatments:
$$P\left(F_{1-\alpha/2}(\nu_1,\nu_2) \le F \le F_{\alpha/2}(\nu_1,\nu_2)\right) = 1 - \alpha$$
$F_{\alpha/2}$ and $F_{1-\alpha/2}$ are the F-values, from the
F-distribution tables, corresponding
to an area $(\alpha/2)$ to the right and to the left, respectively,
of the F-distribution with:
$\nu_1 = (n_1 - 1)$ = numerator d.f.
and
$\nu_2 = (n_2 - 1)$ = denominator d.f.
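Note:
$$F_{1-\alpha/2}(\nu_1,\nu_2) = \frac{1}{F_{\alpha/2}(\nu_2,\nu_1)}$$
Example 1
Using the F-distribution tables, find:
a) $F_{0.025}(6,8)$
b) $F_{0.975}(6,8)$
a) $F_{0.025}(6,8) = 4.65$
[Figure: F(6,8) density with area 0.025 to the right of 4.65.]
b) $F_{0.975}(6,8) = \dfrac{1}{F_{0.025}(8,6)} = \dfrac{1}{5.6} = 0.1786$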
Example 2
A random sample of $n_1 = 16$ measurements on
the breaking strength of a certain type of
material has $s_1^2 = 3.68$ (psi)$^2$. Repeated
measurements on a second machine with
$n_2 = 10$ show $s_2^2 = 2.3$ (psi)$^2$.
At $1 - \alpha = 0.90$, $\alpha/2 = 0.05$
Using the F-distribution table,
$F_{0.05}(15,9) = 3.01$
$F_{0.05}(9,15) = 2.59$
Thus,
$$0.532 \le \frac{\sigma_1^2}{\sigma_2^2} \le 4.144$$
Conclusions
At the 90% confidence level, the ratio
of the two variances is expected to
lie between 0.532 and 4.144.
Since the interval includes a ratio of
1.0, it can be further concluded that the
difference between the two variances is not significant.
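The limits 0.532 and 4.144 follow from dividing and multiplying the sample variance ratio by the two tabulated F values. The sketch below assumes the interval $\dfrac{s_1^2}{s_2^2}\cdot\dfrac{1}{F_{\alpha/2}(\nu_1,\nu_2)} \le \dfrac{\sigma_1^2}{\sigma_2^2} \le \dfrac{s_1^2}{s_2^2}\cdot F_{\alpha/2}(\nu_2,\nu_1)$, which reproduces the slide's numbers; the F values are the table lookups quoted above, not computed.

```python
# Sample variances from Example 2, in (psi)^2.
s1_sq, s2_sq = 3.68, 2.30
F_15_9 = 3.01  # tabulated F_{0.05}(15, 9)
F_9_15 = 2.59  # tabulated F_{0.05}(9, 15)

ratio = s1_sq / s2_sq
ratio_lower = ratio / F_15_9
ratio_upper = ratio * F_9_15
includes_one = ratio_lower < 1.0 < ratio_upper

print(f"s1^2/s2^2 = {ratio:.1f}")
print(f"90% interval for the variance ratio: {ratio_lower:.3f} to {ratio_upper:.3f}")
print("difference not significant" if includes_one else "difference significant")
```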
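Similar results would have been obtained
by testing the hypothesis:
$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1.0 \quad \text{vs.} \quad H_1: \frac{\sigma_1^2}{\sigma_2^2} \ne 1.0$$
and using the test statistic:
$$F = \frac{3.68}{2.30} = 1.6$$
[Figure: F density $f(F)$ with the observed $F = 1.6$ lying between the critical values 0.386 and 3.01; each tail has area 0.05.]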
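$$\sigma_{\bar{y}_1 - \bar{y}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Using the statistic z, the interval
is given by:
$$(\bar{y}_1 - \bar{y}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$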
Example:
A standard chemistry test was given to 50
girls and 75 boys. The girls made an
average of 82 while the boys made an
average of 76.
Construct a 95% confidence interval
for the difference between the population
means, assuming that both populations
are normally distributed with variances
64 and 36 for boys and girls, respectively.
Using tables,
$Z_{\alpha/2} = 1.96$
Thus,
$3.54 < \mu_1 - \mu_2 < 8.46$
Conclusions
At the 95% confidence level, the true
difference is expected to lie between
3.54 and 8.46.
Since the interval does not include
zero, it can be further concluded
that the difference is significant.
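The chemistry test interval can be reproduced as follows. The sketch takes population 1 to be the girls and population 2 the boys, which matches the positive difference $82 - 76 = 6$; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(ybar1, ybar2, var1, var2, n1, n2, confidence=0.95):
    """Interval for mu1 - mu2 when both population variances are known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    std_error = sqrt(var1 / n1 + var2 / n2)
    diff = ybar1 - ybar2
    return diff - z * std_error, diff + z * std_error


# Girls: n = 50, mean 82, variance 36. Boys: n = 75, mean 76, variance 64.
diff_lower, diff_upper = two_sample_z_interval(82.0, 76.0, 36.0, 64.0, 50, 75)
print(f"{diff_lower:.2f} < mu1 - mu2 < {diff_upper:.2f}")
```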
If $\bar{y}_1$ and $\bar{y}_2$ are the means of two independent
random samples of sizes $n_1$ and $n_2$ drawn from
normal distributions with unknown but
equal variances, a $(1 - \alpha)$ confidence interval
for estimating $(\mu_1 - \mu_2)$ is given by:
$$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where
$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$$
= the pooled standard deviation
Example
Copper produced by sintering under certain
conditions is measured for porosity in a
laboratory. A random sample of 4
measurements shows a mean of 0.22 and a
variance of 0.001.
From sample data:
$S_p = 0.04$
Using the t-distribution table:
for $\nu = (9 - 2) = 7$,
$t_{\alpha/2} = t_{0.025} = 2.365$
Thus,
$-0.01 < \mu_1 - \mu_2 < 0.11$
Conclusions
At the 95% confidence level, we expect
the difference between the two
population means to lie within
[-0.01, 0.11].
Further, since the interval includes zero, it may be concluded that the
difference is not significant.
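The pooled interval's half-width can be checked from the quantities that survive on these slides. In the sketch below, $S_p = 0.04$ and $t_{0.025} = 2.365$ are taken from the text, $n_1 = 4$ is given in the example, and $n_2 = 5$ is inferred from $\nu = 9 - 2 = 7$; the second sample's mean and variance are not shown in the text, so the full interval is not recomputed. The helper names are illustrative.

```python
from math import sqrt


def pooled_sd(s1_sq, s2_sq, n1, n2):
    """Pooled standard deviation S_p from two sample variances."""
    return sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))


def pooled_half_width(sp, n1, n2, t_crit):
    """Half-width of the pooled-variance interval for mu1 - mu2."""
    return t_crit * sp * sqrt(1.0 / n1 + 1.0 / n2)


T_CRIT = 2.365  # tabulated t_{0.025} with 7 df, as quoted in the example
half_width = pooled_half_width(sp=0.04, n1=4, n2=5, t_crit=T_CRIT)
print(f"half-width = {half_width:.3f}")
```

The result, roughly 0.06, is consistent with the reported interval from -0.01 to 0.11 once the limits are rounded to two decimals.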
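Estimation of the difference
between two means ($\mu_1 - \mu_2$):
$$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
with
$$\nu = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$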
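The degrees-of-freedom formula is easy to get wrong by hand, so a small helper may be useful. This is a sketch of the expression above; the function name `approx_df` and the inputs in the usage lines are hypothetical and are not data from the slides.

```python
def approx_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom for the unequal-variance t interval."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


# Hypothetical inputs for illustration only.
df_equal = approx_df(1.0, 5, 1.0, 5)        # equal variances and sizes
df_unequal = approx_df(0.001, 4, 0.004, 5)  # unequal variances
print(df_equal, df_unequal)
```

With equal sample sizes and equal sample variances the result equals $n_1 + n_2 - 2$, the pooled case; otherwise it is smaller and is usually not an integer, so it is commonly rounded down before using a t table.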
Example
Consider the previous example without the
assumption of equal variances.
Estimation of the difference
between two means ($\mu_1 - \mu_2$):
An interval for estimating the mean
difference $\mu_d$ of the matched pairs is given
by:
$$\bar{d} - t_{\alpha/2}\frac{S_d}{\sqrt{n}} \le \mu_d \le \bar{d} + t_{\alpha/2}\frac{S_d}{\sqrt{n}}$$
Example:
It is claimed that a new diet will reduce a
person’s weight by 4.5 kg on the average in
a period of 2 weeks. The weight of seven
individuals who followed this diet were
recorded before and after a 2-week period:
Subject   Before   After   d = (B - A)
1         58.5     60.0    -1.5
2         60.3     54.9     5.4
3         61.7     58.1     3.6
4         69.0     62.1     6.9
5         64.0     58.5     5.5
6         62.6     59.9     2.7
7         56.7     54.4     2.3
From sample data:
$\bar{d} = 3.56$ kg and $S_d = 2.776$ kg
Conclusions
At the 95% confidence level, the
mean difference is expected to lie
between 0.99 and 6.13 kg.
Since the claimed reduction of 4.5 kg lies within this interval,
there is not enough evidence to
reject the claim.
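The matched-pairs interval can be recomputed from the before and after weights. This sketch uses the tabulated $t_{0.025}$ with 6 df (2.447, the same value quoted in the sulfuric acid example); because it keeps $\bar{d}$ unrounded, the upper limit may differ from the slide's 6.13 in the second decimal.

```python
from math import sqrt
from statistics import mean, stdev

# Weights in kg for the seven subjects, from the table.
before = [58.5, 60.3, 61.7, 69.0, 64.0, 62.6, 56.7]
after = [60.0, 54.9, 58.1, 62.1, 58.5, 59.9, 54.4]
T_CRIT = 2.447  # tabulated t_{0.025} with 6 df
CLAIMED_LOSS = 4.5

d = [b - a for b, a in zip(before, after)]  # d = Before - After
d_bar = mean(d)
s_d = stdev(d)
half_width = T_CRIT * s_d / sqrt(len(d))
d_lower, d_upper = d_bar - half_width, d_bar + half_width
claim_inside = d_lower <= CLAIMED_LOSS <= d_upper

print(f"d_bar = {d_bar:.2f} kg, S_d = {s_d:.3f} kg")
print(f"95% interval: {d_lower:.2f} to {d_upper:.2f} kg")
print("claim not rejected" if claim_inside else "claim rejected")
```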