
IE-525-Chapter 2

Design of Experiments

Uploaded by

Farooq Alhamdany

Chapter 2

Simple Comparative Experiments

Graphical Description of Variability:


1. Dot Diagrams
2. Scatter Plots
3. Histograms
4. Box Plots
The proper method for representing the data depends on the objectives of the
presentation.

1. Dot Diagrams
A stack of observed values along a
horizontal axis.
Example
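Tension bond strength data (Table 2-1).

[Figure: dot diagram of tension bond strength, $y_i$ (kgf/cm$^2$), on a horizontal axis from 16.4 to 17.4; the values 16.76 and 17.04 are marked, with 0.316 and 0.247 shown in parentheses beneath them.]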

2. Scatter Plots
A two-dimensional plot of individual data values
for measurements on pairs of variables.

Example
Tension bond strength for the modified
formulation, in order of data collection
(Table 2-1).

[Figure: scatter plot of tension bond strength. Vertical axis: Modified Mortar, 16.3 to 17.7; horizontal axis: Run Order, 0 to 10.]

3. Histograms
An alternative to dot diagrams that shows how
often the data occur in various intervals of the
measured variable.

Example
Job CPU times (in seconds), n = 25

1.17 1.61 1.16 1.38 3.53


1.23 3.76 1.94 0.96 4.75
0.15 2.41 0.71 0.02 1.59
0.19 0.82 0.47 2.16 2.01
0.92 0.75 2.59 3.07 1.40

Frequency Distribution Table:
(Fi = class frequency, R. Fi = relative frequency)

Lower boundary   Upper boundary   Tally       Fi   R. Fi
0.015            0.715            |||||        5   0.20
0.715            1.415            |||||||||    9   0.36
1.415            2.115            ||||         4   0.16
2.115            2.815            |||          3   0.12
2.815            3.515            |            1   0.04
3.515            4.215            ||           2   0.08
4.215            4.915            |            1   0.04
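
The tally can be reproduced programmatically. The following is a minimal Python sketch, assuming equal-width classes of 0.7 s starting at 0.015 (both read off the table); the variable names are illustrative and not part of the slides.

```python
# Job CPU times (seconds) from the example, n = 25
cpu_times = [
    1.17, 1.61, 1.16, 1.38, 3.53,
    1.23, 3.76, 1.94, 0.96, 4.75,
    0.15, 2.41, 0.71, 0.02, 1.59,
    0.19, 0.82, 0.47, 2.16, 2.01,
    0.92, 0.75, 2.59, 3.07, 1.40,
]

LOWER = 0.015  # lowest class boundary (from the table)
WIDTH = 0.7    # class width (from the table)
K = 7          # number of classes

# Count how many observations fall in each class
counts = [0] * K
for y in cpu_times:
    counts[int((y - LOWER) / WIDTH)] += 1

# Relative frequency = class frequency / sample size
rel_freq = [c / len(cpu_times) for c in counts]

for i, (c, r) in enumerate(zip(counts, rel_freq)):
    lo = LOWER + i * WIDTH
    print(f"{lo:.3f} to {lo + WIDTH:.3f}: Fi = {c:2d}, R. Fi = {r:.2f}")
```

The half-unit offset in the boundaries (0.015 rather than 0.01 or 0.02) ensures that no observation recorded to two decimals can fall exactly on a class boundary.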

[Figure: histogram of the job CPU times. Vertical axis: frequency (0 to 10); horizontal axis: $y_i$ (seconds).]


Interpreting a Histogram:

• Proportion of sample observations


• Numerical measures
• Distribution shape
• Sampling issues

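4. Box Plot
A display of five values: the maximum, the minimum, the upper quartile QU, the
lower quartile QL, and the median of the data set.

[Figure: box plot along the $y_i$ axis, with the box spanning QL to QU, the inner and outer fences marked, and points beyond the fences flagged with the symbols X and *.]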

Outliers:
Errant observations (i.e., unusually large or
small relative to the other data values)
that lie outside the range of the data values
under study.

Sources of outliers:
1. The measurement is observed, recorded, or
entered into a computer incorrectly.
2. The measurement comes from a different
population.
3. The measurement is correct but represents a
rare event.

Studies Based on a Single Sample:

Parameter Estimation:
Using information from a single random sample to
estimate the value of an unknown parameter.
Point Estimator:
A rule or formula used to calculate a sample
statistic that corresponds to the unknown
parameter, e.g.,
$$\hat{\mu} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

Interval Estimator:
An interval between two values $\hat{\theta}_L$ and $\hat{\theta}_U$
that will enclose the parameter θ with a
specified probability (the confidence level),
i.e.,
$$P(\hat{\theta}_L \le \theta \le \hat{\theta}_U) = 1 - \alpha$$

where
$\hat{\theta}_L$ is the lower confidence limit, constructed
such that (α/2)·100% of the sampling
distribution of $\hat{\theta}$ lies to its left;
$\hat{\theta}_U$ is the upper confidence limit, constructed
such that (α/2)·100% of the sampling
distribution of $\hat{\theta}$ lies to its right;
(1 − α) is the confidence level of the
interval.
{α = 0.05 for most practical applications}
Note:
The higher the confidence level, the wider
the interval!

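Estimating the mean (μ):

Case I: σ Known

The statistic z can be used to construct
the interval such that:
$$P\left(\bar{y} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{y} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

(by the Central Limit Theorem, CLT)
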
Example:
A manufacturing company produces light
bulbs whose length of life is
approximately normally distributed with mean
μ and a standard deviation of 40 hr.
If a random sample of 16 bulbs has a mean
of 780 hr, construct a 95% confidence
interval for the population mean.

Using the cumulative Normal Table
at 1 − α = 0.95:
$z_{\alpha/2}$ = 1.96

[Figure: standard normal curve with central area 1 − α and area α/2 in each tail.]

The interval is given by:
$$780 \pm 1.96\,\frac{40}{\sqrt{16}} = 780 \pm 19.6$$
760.4 < μ < 799.6

Conclusion
“At the 95% confidence level, we expect the
mean life of all bulbs produced to be between
760.4 and 799.6 hr.”
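
The interval can be checked numerically. The following is a minimal Python sketch of the σ-known interval, using only the standard library; the function name `z_interval` is illustrative and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(ybar, sigma, n, conf=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 at 95%
    half_width = z * sigma / sqrt(n)
    return ybar - half_width, ybar + half_width


# Light bulb example: n = 16, sample mean 780 hr, sigma = 40 hr
lo, hi = z_interval(780, 40, 16)
print(f"{lo:.1f} < mu < {hi:.1f}")
```

Because the half-width is $z_{\alpha/2}\,\sigma/\sqrt{n}$, calling the function with a larger `n` or a lower `conf` returns a narrower interval.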

• Would you accept the claim that the
mean life of these light bulbs is 850 hr?

• How would you achieve a shorter interval
for estimating the mean life?

Estimating the mean (μ):
Case II: σ Unknown

Theorem:
If $\bar{y}$ and S are the mean and standard
deviation of a random sample of size n from a
normal population with unknown
variance, then the statistic
$$t = \frac{\bar{y} - \mu}{S/\sqrt{n}}$$
follows a t-distribution with ν = n − 1 degrees
of freedom.

The statistic t can be used to construct
the interval such that:
$$P\left(\bar{y} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{y} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right) = 1 - \alpha$$
where $t_{\alpha/2}$ is the tabulated value of the t statistic
with ν = n − 1 d.f. corresponding
to a tail area of α/2.

Example:
The contents of 7 similar containers of
sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0,
10.2, and 9.6 liters. Construct a 95%
confidence interval for the mean of all such
containers assuming that contents are
normally distributed.

From sample data:
$\bar{y}$ = 10.0 and S = 0.283 liters.
Using the t-distribution Table,
for ν = 7 − 1 = 6 d.f. and α = 0.05:
$t_{\alpha/2} = t_{0.025}$ = 2.447

[Figure: t-distribution curve with area α/2 in each tail, beyond $-t_{\alpha/2}$ and $t_{\alpha/2}$.]

The interval is given by:
$$10.0 \pm 2.447\,\frac{0.283}{\sqrt{7}}$$
9.74 < μ < 10.26
Conclusion:
“At the 95% confidence level, we expect the
mean content of all such containers to lie
between 9.74 and 10.26 liters.”
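
The same numbers can be reproduced from the raw data. This is a minimal Python sketch using the standard library; since the standard library has no t-distribution, the critical value 2.447 is taken from the table as in the example, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Contents (liters) of the 7 containers from the example
contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]

T_CRIT = 2.447  # tabulated t_{0.025} with 6 d.f. (from the example)

n = len(contents)
ybar = mean(contents)
s = stdev(contents)  # sample standard deviation (n - 1 divisor)

half_width = T_CRIT * s / sqrt(n)
lo, hi = ybar - half_width, ybar + half_width
print(f"ybar = {ybar:.1f}, s = {s:.3f}")
print(f"{lo:.2f} < mu < {hi:.2f}")
```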

Estimation of the Variance (σ²)

Theorem:
If a random sample of n observations
y1, y2, ..., yn is selected from a normal
population with finite mean μ and
variance σ², then the sampling
distribution of (n − 1)S²/σ² will follow
a χ²-distribution with ν = n − 1 d.f.

The statistic χ² can be used to construct
an interval such that:
$$P\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}}\right) = 1 - \alpha$$

where
$\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ are the values of the χ²
statistic with ν = n − 1 d.f. that leave an
area of α/2 to the left and to the right of
the distribution, respectively.


Example:
To estimate the variance of fill at a cannery,
10 cans were selected at random and their
contents are weighed. The following data
were obtained ( in ounces): 7.96, 7.90, 7.98,
8.01, 7.97, 7.96, 8.03, 8.02, 8.04, 8.02.
Construct a 90% confidence interval
for estimating the variance.

From sample data,
S = 0.0430
Using Tables,
at (1 − α) = 0.90 and ν = n − 1 = 9:
$\chi^2_{0.05}$ = 16.919, and
$\chi^2_{0.95}$ = 3.32511

[Figure: χ² density f(χ²) with area α/2 in each tail, to the left of $\chi^2_{1-\alpha/2}$ and to the right of $\chi^2_{\alpha/2}$.]

Conclusion

At the 90% confidence level, we
expect the variance to lie between
0.00098 and 0.0050 (sq. ounces).

Note:
0.0314 < σ < 0.0707
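
The limits can be recomputed from the raw data. The following minimal Python sketch uses the tabulated χ² values quoted in the example (the standard library has no χ² quantile function); the variable names are illustrative.

```python
from math import sqrt
from statistics import variance

# Fill weights (ounces) of the 10 cans from the example
fills = [7.96, 7.90, 7.98, 8.01, 7.97, 7.96, 8.03, 8.02, 8.04, 8.02]

CHI2_UPPER = 16.919   # tabulated chi-square, area 0.05 to the right, 9 d.f.
CHI2_LOWER = 3.32511  # tabulated chi-square, area 0.95 to the right, 9 d.f.

n = len(fills)
s2 = variance(fills)  # sample variance (n - 1 divisor)

var_lo = (n - 1) * s2 / CHI2_UPPER
var_hi = (n - 1) * s2 / CHI2_LOWER
print(f"s = {sqrt(s2):.4f}")
print(f"{var_lo:.5f} < sigma^2 < {var_hi:.4f}")
print(f"{sqrt(var_lo):.4f} < sigma < {sqrt(var_hi):.4f}")
```

Note that the larger χ² value produces the lower limit, because it appears in the denominator. Results may differ from the slide in the last digit, depending on where intermediate values are rounded.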

Hypothesis Testing:
Using sample evidence to reject or accept a
claim regarding the value of a population
parameter.

Elements of the Test:

• The null hypothesis, H0
• The alternative hypothesis, H1
• The test statistic
• The rejection region

Properties
We are to make a decision based on sample
data to either reject or accept a claim:

Decision    H0 True         H0 False
Accept      No error        Type II error
Reject      Type I error    No error

Definitions:
• The probability of making a type I error (α):
“The conditional probability of rejecting H0 when
H0 is true.”

• The probability of making a type II error (β):
“The conditional probability of accepting H0 when
H0 is false.”

• The power of the test (1 − β):
“The probability of rejecting H0 when H0 is false.”

• A two-tailed test:
A test of hypothesis where the alternative is
two-directional:

H0: θ = θ0 vs. H1: θ ≠ θ0

[Figure: sampling distribution centered at θ0, with rejection regions of area α/2 below θL and above θU and the acceptance region between them.]

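• A one-tailed test:
A test of hypothesis where the alternative is
one-directional:

Upper-tailed
H0: θ ≤ θ0 vs. H1: θ > θ0

[Figure: sampling distribution centered at θ0, with the rejection region of area α above θU and the acceptance region below it.]

Lower-tailed

H0: θ ≥ θ0 vs. H1: θ < θ0

[Figure: sampling distribution centered at θ0, with the rejection region of area α below θL and the acceptance region above it.]
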
Testing a population mean (μ)
Case I: σ known
Example
A manufacturer of sports equipment has
developed a new synthetic fishing line that
he claims to have a mean breaking strength
of 8 kg with a standard deviation of 0.5 kg. A
random sample of 20 lines is tested and
found to have a mean of 7.8 kg.

• Is there sufficient evidence to reject the
manufacturer's claim at α = 0.01?

• What is β if in fact μ = 7.5 kg?

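H0: μ = 8 vs. H1: μ < 8
lower-tailed test

Rejection region:
at α = 0.01, the lower-tail critical value is
$Z_{0.01}$ = −2.326

[Figure: standard normal curve with the rejection region of area 0.01 to the left of $Z_{0.01}$ and the acceptance region to its right.]

In terms of the sample mean, the critical value is:
$$\bar{y}_L = \mu_0 + Z_{\alpha}\frac{\sigma}{\sqrt{n}} = 8 + (-2.326)\frac{0.5}{\sqrt{20}} = 7.74 \text{ kg}$$

[Figure: sampling distribution of $\bar{y}$ centered at $\mu_0$, with the rejection region of area 0.01 to the left of $\bar{y}_L$.]
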
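Conclusion
Since $\bar{y}$ = 7.8 > $\bar{y}_L$ = 7.74, we accept (fail to
reject) H0 and conclude that there is
insufficient evidence to reject the
manufacturer’s claim.

[Figure: sampling distribution of $\bar{y}$ centered at $\mu_0$; the observed mean 7.8 falls in the acceptance region, to the right of $\bar{y}_L$.]

Alternative Procedure:
Using Z as the test statistic:
Rejection region: Z < $Z_{\alpha}$ = −2.326 (α = 0.01)
From sample data:
$$Z = \frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} = \frac{7.8 - 8.0}{0.5/\sqrt{20}} = -1.79$$

[Figure: standard normal curve with the rejection region to the left of −2.326; the observed Z = −1.79 falls in the acceptance region.]

Since Z = −1.79 > $Z_{\alpha}$ = −2.326, we fail to reject the null.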
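
Both procedures can be sketched in a few lines of Python using the standard library; the variable names are illustrative and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

# Fishing line example: H0: mu = 8 vs. H1: mu < 8 (lower-tailed)
mu0, sigma, n, ybar, alpha = 8.0, 0.5, 20, 7.8, 0.01

se = sigma / sqrt(n)                  # standard error of the mean
z_crit = NormalDist().inv_cdf(alpha)  # lower-tail critical value, about -2.326
y_crit = mu0 + z_crit * se            # same cutoff on the scale of the mean

z = (ybar - mu0) / se                 # observed test statistic

# The two decision rules are equivalent
reject_by_z = z < z_crit
reject_by_mean = ybar < y_crit
print(f"z = {z:.2f}, critical z = {z_crit:.3f}, critical mean = {y_crit:.2f}")
print("reject H0" if reject_by_z else "fail to reject H0")
```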

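β = P{accepting H0: μ = 8 | $\mu_a$ = 7.5}

= P{$\bar{y}$ > 7.74 | $\mu_a$ = 7.5}

= P{z > (7.74 − 7.5)/(0.5/√20)}

= P{z > 2.15} = 0.0158

[Figure: sampling distribution of $\bar{y}$ centered at 7.5, with β the area to the right of 7.74 (acceptance region); 8.0 is marked on the same axis.]

Notes
• The power of the test at $\mu_a$ = 7.5 kg is
(1 − β) = 0.9842

• β is a function of $\mu_a$, n and α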
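
A minimal Python sketch of the β calculation follows, using the standard library. It keeps the critical mean unrounded, so the result is close to, but not digit-for-digit identical with, the table-based 0.0158; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_a = 8.0, 7.5             # hypothesized mean and assumed true mean
sigma, n, alpha = 0.5, 20, 0.01

se = sigma / sqrt(n)
# Lower-tail cutoff for the sample mean under H0 (about 7.74)
y_crit = mu0 + NormalDist().inv_cdf(alpha) * se

# beta = P(sample mean lands in the acceptance region | true mean = mu_a)
beta = 1 - NormalDist(mu_a, se).cdf(y_crit)
power = 1 - beta
print(f"critical mean = {y_crit:.2f}")
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

Rerunning the sketch with a larger `n`, a larger `alpha`, or a `mu_a` further from `mu0` lowers β, which illustrates the dependence stated in the notes.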

The p-value of a test:
(Observed Significance)
“The probability of observing a value of the
test statistic at least as contradictory to the null (H0)
as that computed from sample data.”

“The smallest level of significance that
would lead to the rejection of the null
hypothesis.”

• Using the p-value allows reporting
test results and leaving the selection
of α to the decision maker.

• The decision criterion is to:

Reject H0 only if α > p-value

Example:
Compute the p-value for the test
conducted in the previous example:

{ H0: μ ≥ 8 vs. H1: μ < 8 }

$$Z = \frac{7.8 - 8.0}{0.5/\sqrt{20}} = -1.79$$

[Figure: standard normal curve; the p-value is the area to the left of the observed Z = −1.79.]

Using the Normal Table:
p-value = P(z < −1.79)
= 0.0367
H0 will be rejected only at α > 0.0367
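
The table lookup can be replaced by a direct evaluation of the normal CDF. This is a minimal Python sketch using the standard library; because it does not round Z to −1.79 first, the result agrees with 0.0367 only to about three decimals.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, ybar = 8.0, 0.5, 20, 7.8

z = (ybar - mu0) / (sigma / sqrt(n))
# Lower-tailed test: the p-value is the area to the left of the observed z
p_value = NormalDist().cdf(z)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")

# Decision rule: reject H0 only if alpha > p-value
for alpha in (0.01, 0.05):
    decision = "reject H0" if alpha > p_value else "fail to reject H0"
    print(f"alpha = {alpha}: {decision}")
```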

Testing a population mean (μ)
Case II: σ Unknown
Example
In the fishing line example, suppose σ is
unknown and the sample data are:
n = 20, $\bar{y}$ = 7.8 kg and S = 0.6 kg.
Assuming that the breaking strength is
normally distributed, test the manufacturer’s
claim at α = 0.01.

H0: μ = 8 vs. H1: μ < 8
lower-tailed test

Rejection region
Using the t-Distribution Table
at α = 0.01 and ν = (n − 1) = 19,
the lower-tail critical value is:
$t_{0.01}$ = −2.539

[Figure: t-distribution curve with the rejection region of area α to the left of $t_{0.01}$ and the acceptance region to its right.]

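From sample data:

$$t = \frac{\bar{y} - \mu_0}{S/\sqrt{n}} = \frac{7.8 - 8.0}{0.6/\sqrt{20}} = -1.49$$

[Figure: t-distribution curve with the rejection region to the left of −2.539; the observed t = −1.49 falls in the acceptance region.]

Conclusion
Since t = −1.49 > $t_{\alpha}$ = −2.539, we fail to reject H0
and accept the manufacturer’s claim.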
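
As a quick numerical check, the following minimal Python sketch recomputes the t statistic; the critical value is the tabulated one from the example, and the variable names are illustrative.

```python
from math import sqrt

mu0, ybar, s, n = 8.0, 7.8, 0.6, 20

T_CRIT = -2.539  # tabulated lower-tail t value, alpha = 0.01, 19 d.f.

t = (ybar - mu0) / (s / sqrt(n))
reject = t < T_CRIT
print(f"t = {t:.2f}, critical t = {T_CRIT}")
print("reject H0" if reject else "fail to reject H0")
```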

Testing the population variance σ²

• When sampling from a normal population,
the test statistic is:

$$\chi^2 = \frac{(n-1)S^2}{\sigma_0^2}$$

where $\sigma_0^2$ is the variance stated in H0.

Example
A manufacturer of car batteries claims that
the life of his batteries is approximately
normally distributed with standard deviation
equal to 0.9 years. A random sample of 10
such batteries has a standard deviation of
1.2 years. Is there sufficient evidence to
conclude that σ > 0.9 years? Use a 5% level
of significance.

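H0: σ² = (0.9)² vs. H1: σ² > (0.9)²
Upper-tailed test

From the χ² Distribution Table,
at ν = (n − 1) = 9 and α = 0.05:
$\chi^2_{\alpha}$ = 16.92

[Figure: χ² density f(χ²) with the rejection region of area α to the right of 16.92 and the acceptance region to its left.]

From sample data:

$$\chi^2 = \frac{(n-1)S^2}{\sigma_0^2} = \frac{9(1.2)^2}{(0.9)^2} = 16.0$$

Conclusion:
Since χ² = 16.0 < 16.92, accept (fail to reject) H0 and
reject the claim that σ > 0.9 years.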
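
The test statistic can be checked with a minimal Python sketch; the critical value is the tabulated one from the example, and the variable names are illustrative.

```python
n = 10
s = 1.2        # sample standard deviation (years)
sigma0 = 0.9   # standard deviation under H0 (years)

CHI2_CRIT = 16.92  # tabulated chi-square, area 0.05 to the right, 9 d.f.

chi2 = (n - 1) * s**2 / sigma0**2
reject = chi2 > CHI2_CRIT  # upper-tailed test
print(f"chi-square = {chi2:.1f}, critical value = {CHI2_CRIT}")
print("reject H0" if reject else "fail to reject H0")
```

The observed statistic is fairly close to the critical value, so a slightly larger sample standard deviation would have reversed the decision.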

Studies based on Two Samples:
Randomized designs to compare two
conditions or treatments:

Independent Random Samples

Estimating the ratio of two variances ($\sigma_1^2/\sigma_2^2$)

If $\sigma_1^2$ and $\sigma_2^2$ are the variances of two
normal populations, then the statistic F can
be used to construct an interval for
estimating the ratio $\sigma_1^2/\sigma_2^2$ such that:

$$P\left(F_{1-\alpha/2}(\nu_1,\nu_2) \le F \le F_{\alpha/2}(\nu_1,\nu_2)\right) = 1 - \alpha$$

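$F_{\alpha/2}$ and $F_{1-\alpha/2}$ are the F-values, from the F-distribution Tables, corresponding
to an area of α/2 to the right and to the left of the F-
distribution, respectively, with:
$\nu_1$ = ($n_1$ − 1) = numerator d.f.
and
$\nu_2$ = ($n_2$ − 1) = denominator d.f.

The statistic F is given by:

$$F = \left(\frac{S_1^2}{S_2^2}\right)\left(\frac{\sigma_2^2}{\sigma_1^2}\right)$$
Substituting:
$$\left(\frac{S_1^2}{S_2^2}\right)\frac{1}{F_{\alpha/2}(\nu_1,\nu_2)} \le \frac{\sigma_1^2}{\sigma_2^2} \le \left(\frac{S_1^2}{S_2^2}\right)F_{\alpha/2}(\nu_2,\nu_1)$$
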
Note:
$$F_{1-\alpha/2}(\nu_1,\nu_2) = \frac{1}{F_{\alpha/2}(\nu_2,\nu_1)}$$

Example 1
Using the F distribution Tables, find:
a) $F_{0.025}(6,8)$
b) $F_{0.975}(6,8)$

a) $F_{0.025}(6,8)$ = 4.65

[Figure: F(6,8) density f(F) with area 0.025 to the right of 4.65.]

b)
$$F_{0.975}(6,8) = \frac{1}{F_{0.025}(8,6)} = \frac{1}{5.6} = 0.1786$$

Example 2
A random sample of $n_1$ = 16 measurements on
the breaking strength of a certain type of
material has $S_1^2$ = 3.68 (psi)². Repeated
measurements on a second machine with
$n_2$ = 10 show $S_2^2$ = 2.3 (psi)².

Assuming that the measurements are
normally distributed, would you conclude
that the two variances are significantly
different at the 90% confidence level?

At 1 − α = 0.90, α/2 = 0.05
Using the F distribution Table,
$F_{0.05}(15,9)$ = 3.01
$F_{0.05}(9,15)$ = 2.59

Thus,
$$\frac{3.68}{2.3}\cdot\frac{1}{3.01} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{3.68}{2.3}\,(2.59)$$
$$0.532 \le \frac{\sigma_1^2}{\sigma_2^2} \le 4.144$$

Conclusions
• At the 90% confidence level, the ratio
between the two variances is expected to
lie within 0.532 and 4.144.
• Since the interval includes a ratio of
1.0, it can be further concluded that the
difference is not significant.
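
The interval for the variance ratio can be reproduced with a minimal Python sketch. The two F values are the tabulated ones from the example (the standard library has no F quantile function); the variable names are illustrative.

```python
s1_sq, n1 = 3.68, 16  # first sample: variance (psi^2) and size
s2_sq, n2 = 2.3, 10   # second sample: variance (psi^2) and size

F_15_9 = 3.01  # tabulated F_{0.05}(15, 9)
F_9_15 = 2.59  # tabulated F_{0.05}(9, 15)

ratio = s1_sq / s2_sq
lo = ratio / F_15_9   # divide by F_{alpha/2}(nu1, nu2)
hi = ratio * F_9_15   # multiply by F_{alpha/2}(nu2, nu1)
print(f"{lo:.3f} < sigma1^2/sigma2^2 < {hi:.3f}")

# The variances differ significantly only if 1.0 lies outside the interval
print("significant" if not (lo <= 1.0 <= hi) else "not significant")
```

Note the reversed degrees of freedom in the upper limit, which follows from the reciprocal property of the F distribution.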

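Similar results would have been obtained
by testing the hypothesis:
$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1.0 \quad \text{vs.} \quad H_1: \frac{\sigma_1^2}{\sigma_2^2} \ne 1.0$$
and using the test statistic:
$$F = \frac{3.68}{2.30} = 1.6$$

[Figure: F density f(F) with area 0.05 in each tail, critical values 0.386 and 3.01; the observed F = 1.6 falls between them.]
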
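Estimation of the difference
between two means ($\mu_1 - \mu_2$):

Case I: $\sigma_1$ and $\sigma_2$ known.

The difference ($\bar{y}_1 - \bar{y}_2$) is normally distributed
with mean ($\mu_1 - \mu_2$) and standard deviation:

$$\sigma_{\bar{y}_1 - \bar{y}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
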
Using the statistic z, the interval
is given by:

$$(\bar{y}_1 - \bar{y}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Example:
A standard chemistry test was given to 50
girls and 75 boys. The girls made an
average of 82 while the boys made an
average of 76.
Construct a 95% confidence interval
for the difference between the population
means, assuming that both populations
are normally distributed with variances
64 and 36 for boys and girls respectively.

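From sample data:

$$\sigma_{\bar{y}_1 - \bar{y}_2} = \sqrt{\frac{36}{50} + \frac{64}{75}} = 1.25$$

Using Tables,
$Z_{\alpha/2}$ = 1.96
Thus, with $\bar{y}_1 - \bar{y}_2$ = 82 − 76 = 6:
3.54 < $\mu_1 - \mu_2$ < 8.46
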
Conclusions
• At the 95% confidence level, the true
difference is expected to lie within
3.54 and 8.46.
• Since the interval does not include
zero, it can be further concluded
that the difference is significant.
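
A minimal Python sketch of this interval follows, using the standard library. Population 1 is taken to be the girls and population 2 the boys, matching the order of the calculation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Population 1: girls, population 2: boys
ybar1, var1, n1 = 82, 36, 50
ybar2, var2, n2 = 76, 64, 75

conf = 0.95
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96

se = sqrt(var1 / n1 + var2 / n2)  # std. deviation of the difference in means
diff = ybar1 - ybar2
lo, hi = diff - z * se, diff + z * se
print(f"standard error = {se:.2f}")
print(f"{lo:.2f} < mu1 - mu2 < {hi:.2f}")
```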

Estimation of the difference
between two means ($\mu_1 - \mu_2$):

Case II: $\sigma_1$ and $\sigma_2$ Unknown
but can be assumed equal

If $\bar{y}_1$ and $\bar{y}_2$ are the means of two independent
random samples of sizes $n_1$ and $n_2$ drawn from
normal distributions with unknown but
equal variances, a (1 − α) confidence interval
for estimating ($\mu_1 - \mu_2$) is given by:

$$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where
$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$$
is the pooled standard deviation,

and $t_{\alpha/2}$ is the value of the t statistic
corresponding to a tail area of α/2 with
ν = ($n_1 + n_2 - 2$) d.f.

Example
Copper produced by sintering under certain
conditions is measured for porosity in a
laboratory. A random sample of 4
measurements shows a mean of 0.22 and a
variance of 0.001.

A second laboratory repeats the same
process with an independent sample of 5
measurements with mean 0.17 and variance
0.002. Construct a 95% confidence interval
for estimating the difference between
the population means assuming equal
variances.

From sample data:
$S_p$ = 0.04
Using the t distribution Table,
for ν = (9 − 2) = 7:
$t_{\alpha/2} = t_{0.025}$ = 2.365
Thus,
−0.01 < $\mu_1 - \mu_2$ < 0.11

Conclusions
• At the 95% confidence level, we expect
the difference between the two
population means to lie within
[−0.01, 0.11].
• Further, since the interval includes zero, it may be concluded that the
difference is not significant.
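
The pooled calculation can be sketched in Python as follows. The critical value is the tabulated one from the example, and the variable names are illustrative.

```python
from math import sqrt

ybar1, var1, n1 = 0.22, 0.001, 4  # first laboratory
ybar2, var2, n2 = 0.17, 0.002, 5  # second laboratory

T_CRIT = 2.365  # tabulated t_{0.025} with n1 + n2 - 2 = 7 d.f.

# Pooled standard deviation: d.f.-weighted average of the two variances
sp = sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))

half_width = T_CRIT * sp * sqrt(1 / n1 + 1 / n2)
diff = ybar1 - ybar2
lo, hi = diff - half_width, diff + half_width
print(f"Sp = {sp:.2f}")
print(f"{lo:.2f} < mu1 - mu2 < {hi:.2f}")
```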

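Estimation of the difference
between two means ($\mu_1 - \mu_2$):

Case III: $\sigma_1$ and $\sigma_2$ Unknown,
and not equal

The interval is given by:

$$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
with
$$\nu = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}$$
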
Example
Consider the previous example without the
assumption of equal variances.

Using sample data,
ν = 6 (the formula gives about 6.9, which is rounded down)
Using the t distribution Table:
$t_{\alpha/2} = t_{0.025}$ = 2.447
Thus,
−0.012 < ($\mu_1 - \mu_2$) < 0.112
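
The degrees of freedom and the interval can be checked with a minimal Python sketch. Rounding ν down to the nearest integer is an assumption inferred from the value 6 used in the example; the critical value is the tabulated one, and the variable names are illustrative.

```python
from math import floor, sqrt

ybar1, var1, n1 = 0.22, 0.001, 4  # first laboratory
ybar2, var2, n2 = 0.17, 0.002, 5  # second laboratory

a, b = var1 / n1, var2 / n2       # per-sample variance of the mean

# Approximate degrees of freedom for unequal variances
nu = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
nu_used = floor(nu)               # rounded down, as in the example

T_CRIT = 2.447                    # tabulated t_{0.025} with 6 d.f.

half_width = T_CRIT * sqrt(a + b)
diff = ybar1 - ybar2
lo, hi = diff - half_width, diff + half_width
print(f"nu = {nu:.2f}, used as {nu_used}")
print(f"{lo:.3f} < mu1 - mu2 < {hi:.3f}")
```

The interval is only slightly wider than the pooled one, since losing one degree of freedom raises the critical t value from 2.365 to 2.447.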

Estimation of the difference
between two means ($\mu_1 - \mu_2$):

>>Paired Comparison Designs<<

When the same sample units are subjected
to two different conditions (treatments) and
paired observations are made on each, the
differences d1, d2, ..., dn will approximate a
normal distribution with mean $\mu_d$ and
standard deviation $\sigma_d$.

A (1 − α) confidence interval for estimating the mean
difference of the matched pairs is given
by:

$$\bar{d} - t_{\alpha/2}\frac{S_d}{\sqrt{n}} \le \mu_d \le \bar{d} + t_{\alpha/2}\frac{S_d}{\sqrt{n}}$$

Example:
It is claimed that a new diet will reduce a
person’s weight by 4.5 kg on average in
a period of 2 weeks. The weights of seven
individuals who followed this diet were
recorded before and after a 2-week period:

Subject Before After d = (B-A)
1 58.5 60.0 - 1.5
2 60.3 54.9 5.4
3 61.7 58.1 3.6
4 69.0 62.1 6.9
5 64.0 58.5 5.5
6 62.6 59.9 2.7
7 56.7 54.4 2.3

Test the manufacturer’s claim by computing
a 95% confidence interval for the mean
difference in weight.

From sample data:
$\bar{d}$ = 3.56 kg and $S_d$ = 2.776 kg

For (1 − α) = 0.95 and ν = 7 − 1 = 6:
$t_{\alpha/2}$ = 2.447
Thus,
0.99 < $\mu_d$ < 6.13

Conclusions
• At the 95% confidence level, the
mean difference is expected to lie
between 0.99 and 6.13 kg.
• Since the claimed reduction of 4.5 kg lies within this interval, there is not enough evidence to
reject the manufacturer’s claim.
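
The paired analysis can be reproduced from the table with a minimal Python sketch. The critical value is the tabulated one from the example, and the variable names are illustrative. Because the sketch does not round the mean difference to 3.56 before forming the interval, the upper limit may differ from the slide in the second decimal.

```python
from math import sqrt
from statistics import mean, stdev

# Weights (kg) of the seven subjects before and after the diet
before = [58.5, 60.3, 61.7, 69.0, 64.0, 62.6, 56.7]
after = [60.0, 54.9, 58.1, 62.1, 58.5, 59.9, 54.4]

T_CRIT = 2.447  # tabulated t_{0.025} with 7 - 1 = 6 d.f.
CLAIM = 4.5     # claimed mean weight reduction (kg)

# Work with the within-subject differences d = before - after
d = [b - a for b, a in zip(before, after)]
n = len(d)
d_bar, s_d = mean(d), stdev(d)

half_width = T_CRIT * s_d / sqrt(n)
lo, hi = d_bar - half_width, d_bar + half_width
print(f"d_bar = {d_bar:.2f} kg, S_d = {s_d:.3f} kg")
print(f"{lo:.2f} < mu_d < {hi:.2f}")
print("claim is inside the interval" if lo <= CLAIM <= hi
      else "claim is outside the interval")
```

Pairing removes the subject-to-subject variation in body weight, which is why the analysis uses the standard deviation of the differences rather than those of the two weight columns.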
