
Chapter 1
THE CONCEPT OF EXPERIMENT

An experiment starts with a statement of a problem, an answer to which is to be obtained.

A careful statement of the problem plays an important role in its solution. The statement of the problem must include reference to one or more criteria to be used in assessing the results of the study. Each such criterion is called a dependent or response variable, and there may be more than one dependent variable of interest in a given study. Knowledge of the dependent variable is also essential for the subsequent data analysis. It is also necessary to define the independent variables, or factors, that may affect the dependent variable. Some other questions should be asked before doing the experiment. For example, are the factors involved in the experiment to be held constant or to be manipulated at certain specific levels? Are the levels of the factors to be set at certain fixed values, or are such levels to be chosen at random from among all possible levels? All these points should be considered as part of the experimental phase of any research.

Steps in designing an experiment.


1) Statement of the problem.
2) Formulation of the hypothesis.
3) Devising of the experimental technique and design.
4) Examination of possible outcomes and reference back to the reasons for the inquiry, to be sure that the experiment provides the required information to an adequate extent.
5) Consideration of the possible results from the point of view of the statistical procedures that will be applied to them, to ensure that the conditions necessary for these procedures to be valid are satisfied.
6) Performance of the experiment.
7) Application of the statistical techniques to the experimental results.
8) Drawing conclusions, with measures of the reliability of estimates of any quantities that are calculated, careful consideration being given to the validity of the conclusions for the population of objects or events to which they are to apply.
9) Evaluation of the whole investigation, particularly in comparison with other investigations on the same or similar problems.

Some Definitions:-
An Experiment:- An experiment is a planned inquiry to discover new facts, or to confirm or deny the results of previous investigation.
Treatment:- A treatment is a set of operations (more or less well defined) which potentially effects a change in the experimental unit.
Experimental Unit:- An experimental unit is the smallest piece of experimental material to which one trial of a single treatment is applied.
Sampling Unit:- A sampling unit is a fraction (possibly all) of the experimental unit. It is the smallest part of the experimental material on which we make a single measurement.
Design:- A plan and a set of rules of association between the experimental units and treatments such that we can measure Yield = true value + error.
Experimental Error:- Experimental error describes the failure of two identically treated experimental units to yield identical results, or, equivalently, the measure of variation among yields on (entire) experimental units treated alike. (It is estimated from replication.)
Yield:- The yield is the quantity which is measured on the experimental material.
Block:- A group of homogeneous experimental units is called a block.
Replication:- When a treatment appears more than once in an experiment, the treatment is said to be replicated.
Random Assignment:- If treatments are assigned to a set of units in such a way that every unit is equally likely to receive any treatment, the assignment is said to be random.

THE DESIGN
The next and most important phase of a research project is the design phase. The design phase is mainly concerned with the number of observations to be taken, that is, with deciding on the size of the sample for a given experiment. Without this information, the best alternative is to take as large a sample as possible, although in practice this is usually impracticable. After the number of observations and the number of experimental units are decided, the order in which the experiments are run is of prime importance. Once a decision has been made to hold certain variables at specific levels, there are always a number of variables that can't be controlled. Randomization of the order of experimentation will tend to average out the effect of these uncontrolled variables.

THE PRINCIPLES OF THE DESIGN OF EXPERIMENTS


There are three main principles of the design of experiments which play a very important role in the collection and interpretation of data. These principles are:
1) Replication:- Replication results when more than one experimental unit receives (at random) the same treatment (we usually think in terms of the treatment set). The main idea of replication is that each treatment will be assigned to more than one experimental unit. Replication allows the experimenter to obtain an estimate of the experimental error, which is a function of the differences among observations from experimental units under identical treatments. It also increases the accuracy of the estimates of the treatment effects, because the difference in the effect of two treatments could otherwise be the result of experimental error. However, precautions should be taken to ensure that one treatment is no more likely to be favored in any replicate than another, so that the errors affecting any treatment tend to cancel out as the number of replications is increased. The number of replications depends on the degree of precision required. The more replications, the better, so far as precision of estimates is concerned, but the number can't be increased indefinitely as it increases the cost of experimentation.
Purpose of Replication
- It provides an estimate of experimental error.
- It increases precision by reducing the standard error.
- It can broaden the base for making inferences.

2) Randomization:- By randomization we mean that treatments are assigned to the units in such a way that any unit is equally likely to receive any treatment; that is, randomization means the allocation of treatments to the experimental units, and the performing of the individual runs or trials of the experiment, in such a way that every possible allocation or order has the same probability. The object of randomization is to avoid any personal bias, which may be conscious or unconscious. Statistical methods require that the errors are random variables; randomization makes this assumption valid. The way in which randomization is performed in an experiment depends on the type of design being used. (A minimal sketch of a random assignment is given after the list below.)
Reasons for Randomization
- To minimize bias (in means and variances).
- To obtain uncorrelated errors.
- To obtain a valid estimate of experimental error.
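For illustration, the following is a minimal Python sketch of a completely random assignment of treatments to experimental units; the function name, treatment labels and counts are invented for the example.

```python
import random

def random_assignment(treatments, n_reps, seed=None):
    """Assign each treatment to n_reps experimental units completely at random.

    Returns a list whose k-th entry is the treatment applied to unit k, so that
    every unit is equally likely to receive any treatment and every run order
    is equally probable.
    """
    rng = random.Random(seed)
    plan = [trt for trt in treatments for _ in range(n_reps)]  # replicated treatment set
    rng.shuffle(plan)                                          # random allocation / run order
    return plan

# Hypothetical example: 3 treatments (A, B, C), each replicated 4 times -> 12 units
print(random_assignment(["A", "B", "C"], n_reps=4, seed=1))
```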

3) Reduction of Error (Local Control):- Reduction of error refers to the amount of balancing, grouping and blocking of the experimental units; this makes the design more efficient.
Grouping:- Grouping means the placing of a set of homogeneous experimental units into groups, in order that the different groups may be subjected to different treatments.
Balancing:- Balancing means the obtaining of the experimental units, the grouping, the blocking and the assignment of the treatments to the experimental units in such a way that a balanced configuration results.
Blocking:- A block is a set of homogeneous experimental units which are less variable within the set than the experimental units are in total. Blocking is specifically useful when the number of treatments is large and it may not be possible to get the large homogeneous set of units required for the experiment. Since the experimental error arises only from the variation among the units within a replicate, the variation in all the units can be controlled by grouping the units so that the units in the same replicate are similar, thereby reducing the experimental error. Variation from one replicate to another does not contribute to the error; it is therefore important to keep the technique uniform within a replicate, and changes should be made only when moving from one replicate to another. To take full advantage of the opportunities for increased precision by grouping the units, the best criterion for grouping is to minimize the variation within a group and maximize the variation among different groups.
Blocks may or may not be of interest directly. If not, they are just a source of variation to be isolated. If they are, they still act as "error reducers" and also provide the block-by-treatment (B×T) interaction, that is, whether the effect of a treatment depends upon the block.

Reasons for Blocking
- It can increase the precision of an experiment.
- Treatments are compared under more nearly equal conditions.
- Sometimes it can increase the information from an experiment.

FIXED EFFECT MODEL


In a fixed effects model, the k levels of a factor (or k treatments) are specifically chosen by the experimenter. In this situation, hypotheses about the treatment means are tested and the conclusions apply only to the factor levels considered in the analysis.
In the fixed effects model, the assumptions are:
1) The treatment effects αi are assumed to be fixed and are deviations from the overall mean, so that

Σ(i=1..k) αi = Σ(i=1..k) (μi − μ) = 0,

where μi is the mean of the ith treatment and μ is the overall mean.
2) The εij are a random sample from a population which is normally distributed, with mean zero and a common variance σ²; i.e., εij ~ NID(0, σ²).

RANDOM EFFECT MODEL


If the k levels of a factor used in the experiment are chosen randomly from a large number of possible levels, the model is called a random effects model or a components of variance model. In such a situation, the conclusions are extended to all possible levels of the factor, whether they are involved in the experiment or not.
We are not interested in the means themselves; the goal in this case is to estimate the variation among the treatment means. The hypothesis tested is that no variability exists between treatments, i.e., H0: σ²α = 0, where the model is

yij = μ + αi + εij.

The assumptions underlying this model are:

1) The αi are considered to be random variables, independent of the errors εij, and it is assumed that αi ~ NID(0, σ²α).

2) The εij are random errors, and εij ~ NID(0, σ²).
NOTE: Random effects models are more common in sample surveys. In designed experiments the grouping categories are usually random effects, while fixed effects models as regards treatment effects are the rule. Unless otherwise specified, we will assume that the treatments in a designed experiment are fixed.

Experiment to compare several treatments (Analysis of Variance)


A hypothesis comparing the equality of two treatment means is tested using the standard normal variate or the Student t test. However, many experiments involve more than two treatments, and the experimenter is required to test the equality of all the means simultaneously. For example, an experimenter might be interested in comparing the average yields of different varieties of a crop. It might seem that this problem could be solved by performing a t-test on all possible pairs of means, but this solution would be incorrect, since it would considerably distort the type-I error. In the case of testing the equality of five means using pairwise comparisons, there are 10 possible pairs. So if the probability of correctly accepting the null hypothesis for each individual test is 1 − α = 0.95, then the probability of correctly accepting the null hypothesis for all the tests is (0.95)¹⁰ ≈ 0.60, even when the tests are independent; the type-I error thus increases substantially, from 0.05 to about 0.40. The appropriate procedure for testing the equality of several means is the ANALYSIS OF VARIANCE (ANOVA). Each observation in an experiment is thought of as the sum of components due to a general mean about which the observations are presumed to fluctuate, the effect of the treatment applied, certain environmental effects which the design of the experiment enables us to isolate, and a residual effect.
The analysis of variance partitions the total variation into sums of squares attributed to the general mean, the treatment effects, the environmental effects and the residual effects. The ANOVA provides much more than a shortcut method of securing the error sum of squares: the sum of squares due to treatments (SST) is the quantity needed for the F-test of the hypothesis that no differences exist between the effects of the treatments. The ANOVA in which only one factor is investigated is called a one-way classification, and it is called a two-way classification if two factors are investigated at the same time.
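The inflation of the type-I error described above can be checked with a few lines of Python; this is only a numerical restatement of the (0.95)¹⁰ calculation, assuming independent pairwise tests.

```python
from math import comb

alpha = 0.05
t = 5                               # number of treatment means to compare
m = comb(t, 2)                      # 10 possible pairwise tests
p_no_false_rejection = (1 - alpha) ** m
print(m, round(p_no_false_rejection, 2), round(1 - p_no_false_rejection, 2))
# -> 10 0.6 0.4 : the familywise type-I error grows from 0.05 to about 0.40
```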

One Way Classification


Suppose we have t treatments, or t levels of a single factor, that we wish to compare. Let there be n observations under each treatment; the data can be arranged in the form of a table as shown in Table I.

Table I
_______________________________________________________
Observations     1     2     3    . .    t      Total
     1          y11   y21   y31   . .   yt1
     2          y12   y22   y32   . .   yt2
     3          y13   y23   y33   . .   yt3
     .           .     .     .    . .    .
     .           .     .     .    . .    .
     n          y1n   y2n   y3n   . .   ytn
_______________________________________________________
Total           y1.   y2.   y3.   . .   yt.      y..
_______________________________________________________

The observations can be described by a linear statistical model;


yij = μ + αi + εij,   i = 1, 2, ..., t;  j = 1, 2, ..., n

where yij is the jth observation under the ith treatment, μ is the overall mean, αi denotes the effect of the ith treatment, and εij is the random error, which is assumed to be a normally and independently distributed random variable with mean 0 and constant variance σ² (i.e., the same for all treatments); that is, εij ~ NID(0, σ²).

Model I: Analysis of variance (Fixed Effects) Model


In a fixed effects model, the k levels of a factor (or k treatments) are specifically chosen by the experimenter. In this situation, hypotheses about the treatment means are tested and the conclusions apply only to the factor levels considered in the analysis. In the fixed effects model, the treatment effects τi are assumed to be deviations from the overall mean, so that

Σ(i=1..k) τi = Σ(i=1..k) (μi − μ) = 0,

where μi is the mean of the ith treatment and μ is the overall mean.

Model II: Component of variance (Random Effects) Model


If the k levels of a factor used in the experiment are chosen randomly from a large number of possible levels, the model is called a random effects model or a components of variance model. In such a situation, the conclusions are extended to all possible levels of the factor, whether they are involved in the experiment or not. In such a situation τi is considered to be a random variable, independent of the error εij, and it is assumed that τi ~ NID(0, σ²τ).
The hypothesis tested is that no variability exists between treatments, i.e., H0: σ²τ = 0.
The model to be used is determined by the experimenter's view of the experiment: either the results are pertinent only to the treatments present (Model I) or inferences are to be made to a larger population of treatments (Model II). This completes the specification of the model.

Analysis of Fixed Effect Model for One-Way Classification


Let ȳi. represent the average of the observations under the ith treatment and ȳ.. represent the average of all t × n observations. The mean of the ith treatment is μi = μ + αi. We are interested in testing the hypothesis

H0: μ1 = μ2 = . . . = μt

versus the alternative hypothesis

H1: μi ≠ μj for at least one pair (i, j),

or the given hypothesis can also be stated as

H0: α1 = α2 = . . . = αt = 0

versus H1: αi ≠ 0 for at least one value of i.

Decomposition of Sum of Squares


The variance of all nt observations is given by

Σi Σj (yij − ȳ..)² / (nt − 1),   i = 1, ..., t;  j = 1, ..., n.

The numerator of this quantity is called the total sum of squares (SSTot), which measures the total variability of the data. Now SSTot can be written as

Σi Σj (yij − ȳ..)² = Σi Σj [(yij − ȳi.) + (ȳi. − ȳ..)]²
                   = Σi Σj [(yij − ȳi.)² + (ȳi. − ȳ..)² + cross-product terms].

The cross-product terms vanish, and hence

Σi Σj (yij − ȳ..)² = n Σi (ȳi. − ȳ..)² + Σi Σj (yij − ȳi.)²,

that is,

SSTot = SST + SSE.

The quantity SST is the sum of squares of the differences between the treatment averages and the grand average, and it measures the differences between treatment means. The quantity SSE, the sum of squares of the differences of the observations within treatments from the treatment averages, measures the random error.
In the same way we can partition the total (nt − 1) degrees of freedom: SST has (t − 1) and SSE has t(n − 1) degrees of freedom.
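A minimal sketch of this decomposition for a balanced layout, with the data held as a t × n list of lists (the function and variable names are illustrative only):

```python
def one_way_ss(data):
    """Decompose SSTot into SST + SSE for a balanced one-way layout.

    data[i][j] is the j-th observation under the i-th treatment.
    """
    t, n = len(data), len(data[0])
    grand_mean = sum(sum(row) for row in data) / (t * n)
    trt_means = [sum(row) / n for row in data]
    sst = n * sum((m - grand_mean) ** 2 for m in trt_means)      # between treatments, t-1 d.f.
    sse = sum((y - trt_means[i]) ** 2
              for i, row in enumerate(data) for y in row)        # within treatments, t(n-1) d.f.
    sstot = sum((y - grand_mean) ** 2 for row in data for y in row)
    return sstot, sst, sse   # sstot == sst + sse up to rounding
```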

Expected Mean Square:


The sums of squares, when divided by their respective degrees of freedom, give the "mean squares", denoted by "MS".
Expectations for the one-way model, when treatments are fixed.
The model is

yij = μ + αi + εij,   i = 1, 2, ..., t;  j = 1, 2, ..., n.

The assumptions are Σi αi = 0, E(εij) = 0, E(εij²) = σ², with the εij uncorrelated.

Now the correction factor is

C.F. = (1/nt)(Σi Σj yij)² = (1/nt)[Σi Σj (μ + αi + εij)]²
     = (1/nt)(ntμ + n Σi αi + Σi Σj εij)²
     = (1/nt)(ntμ + Σi Σj εij)²                         [since Σi αi = 0]
     = (1/nt)[n²t²μ² + (Σi Σj εij)² + 2ntμ Σi Σj εij]
     = ntμ² + (1/nt)[Σi Σj εij² + Σ Σ(i,j)≠(g,h) εij εgh] + 2μ Σi Σj εij.

Applying expectation to the correction factor, the terms E(εij) and E(εij εgh) for (i, j) ≠ (g, h) vanish, and we get

E[C.F.] = ntμ² + σ².    (1)

The total sum of squares is SSTot = Σi Σj yij² − C.F., and

Σi Σj yij² = Σi Σj (μ + αi + εij)²
           = Σi Σj (μ² + αi² + εij² + 2μαi + 2μεij + 2αi εij).

Applying expectation and summing, we get

E[Σi Σj yij²] = ntμ² + n Σi αi² + ntσ².    (2)

Subtracting eq. (1) from eq. (2), we get

E[SSTot] = (nt − 1)σ² + n Σi αi².    (A)

For the treatment sum of squares SST = (1/n) Σi yi.² − C.F., we have

yi. = Σj yij = Σj (μ + αi + εij) = nμ + nαi + Σj εij,

(yi.)² = n²μ² + n²αi² + (Σj εij)² + 2n²μαi + 2nμ Σj εij + 2nαi Σj εij
       = n²μ² + n²αi² + Σj εij² + Σj≠h εij εih + 2n²μαi + 2nμ Σj εij + 2nαi Σj εij,

so, summing over i and using Σi αi = 0,

(1/n) Σi (yi.)² = ntμ² + n Σi αi² + (1/n) Σi Σj εij² + (1/n) Σi Σj≠h εij εih + 2μ Σi Σj εij + 2 Σi αi Σj εij.

Applying expectation, we get

E[(1/n) Σi yi.²] = ntμ² + n Σi αi² + tσ²,

and subtracting eq. (1),

E[SST] = n Σi αi² + (t − 1)σ².    (B)

Finally, E[SSE] = E[SSTot] − E[SST]; therefore, from (A) and (B),

E[SSE] = t(n − 1)σ².

The same result follows directly from the definition

SSE = Σi Σj (yij − ȳi.)².

Now

ȳi. = (1/n) Σj yij = (1/n) Σj (μ + αi + εij) = μ + αi + ε̄i..

Substituting this expression into the definition of SSE, we get

SSE = Σi Σj [(μ + αi + εij) − (μ + αi + ε̄i.)]² = Σi Σj (εij − ε̄i.)².

Taking expectation on both sides, we get

E[SSE] = Σi Σj E(εij − ε̄i.)² = Σi (n − 1)σ² = t(n − 1)σ²,

or

E[MSE] = σ².

Thus MSE is an unbiased estimate of σ².
Similarly,

SST = n Σi (ȳi. − ȳ..)²,

and

ȳ.. = (1/tn) Σi Σj yij = (1/tn) Σi Σj (μ + αi + εij) = μ + (1/t) Σi αi + ε̄...

Substituting the expressions for ȳi. and ȳ.. into SST, we get

SST = n Σi [(μ + αi + ε̄i.) − (μ + (1/t) Σi αi + ε̄..)]²
    = n Σi [(αi − ᾱ) + (ε̄i. − ε̄..)]²
    = n Σi [(αi − ᾱ)² + (ε̄i. − ε̄..)² + cross-product terms],

where ᾱ = (1/t) Σi αi. Taking expectation of both sides,

E[SST] = n E[Σi (αi − ᾱ)²] + n E[Σi (ε̄i. − ε̄..)²].

The cross-product terms vanish after taking expectation. Now if the treatment levels are fixed, then ᾱ = 0; therefore

E[SST] = n Σi αi² + (t − 1)σ²,

and thus the expected mean square is

E[MST] = [n/(t − 1)] Σi αi² + σ².

Now if our null hypothesis of no treatment effect is true, i.e., H0: αi = 0, then Σ αi² = 0 and hence E[MST] = σ². Thus MST is also an unbiased estimate of σ² if the null hypothesis of no treatment effect is true.
Now as we have assumed that εij ~ NID(0, σ²), we also have yij ~ NID(μi, σ²), i.e., yij ~ NID(μ + αi, σ²).
So SSE/σ² ~ χ²(t(n − 1)) and SST/σ² ~ χ²(t − 1) (under the null hypothesis).
Since the degrees of freedom of the two χ² variables add up to (tn − 1), the total number of degrees of freedom, Cochran's theorem implies that the two χ² variables are independent. Thus under H0: αi = 0,

MST/MSE ~ F[(t − 1), t(n − 1)].

We see that MSE is an unbiased estimate of σ², and under the null hypothesis MST is also an unbiased estimate of σ². However, if H0 is not true, then E[MST] is greater than σ², thereby giving a larger value of the F statistic. Thus a large value of F implies a false H0. Hence the critical region for the analysis of variance will be:
Reject H0 if Fcal > Fα, [(t − 1), t(n − 1)].
NOTE: By Cochran's theorem, if Xi ~ NID(0, 1), i = 1, 2, ..., n, and Σ Xi² = Q1 + Q2 + . . . + Qs, where s ≤ n and each Qi is a χ² variable with ni degrees of freedom, then the Qi are independent if and only if n1 + n2 + . . . + ns = n.
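The unbiasedness of MSE, and the fact that MST estimates the same σ² under H0, can be illustrated by simulation; the sketch below draws balanced one-way data with normal errors and no treatment effects, using made-up parameter values.

```python
import random

def average_mean_squares(t=4, n=6, sigma=2.0, reps=2000, seed=7):
    """Average MSE and MST over many simulated experiments with H0 true
    (all alpha_i = 0), illustrating E[MSE] = E[MST] = sigma^2 under H0."""
    rng = random.Random(seed)
    mse_sum = mst_sum = 0.0
    for _ in range(reps):
        data = [[rng.gauss(10.0, sigma) for _ in range(n)] for _ in range(t)]
        gm = sum(sum(r) for r in data) / (t * n)
        tm = [sum(r) / n for r in data]
        sst = n * sum((m - gm) ** 2 for m in tm)
        sse = sum((y - tm[i]) ** 2 for i, r in enumerate(data) for y in r)
        mse_sum += sse / (t * (n - 1))
        mst_sum += sst / (t - 1)
    return mse_sum / reps, mst_sum / reps

print(average_mean_squares())   # both averages should be close to sigma^2 = 4.0
```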

Estimation of Model Parameters


To find the least squares estimates of μ and αi for the one-way classification model, we minimize the error sum of squares, SSE, with respect to μ and αi.
Now from the given model, we have

εij = yij − μ − αi,

and the sum of squares is

L = Σi Σj εij² = Σi Σj (yij − μ − αi)².

Now we differentiate L with respect to μ and each αi, and the resulting equations are set equal to zero. That is,

∂L/∂μ = −2 Σi Σj (yij − μ − αi) = 0,

and, differentiating with respect to each αi,

∂L/∂α1 = −2 Σj (y1j − μ − α1) = 0
∂L/∂α2 = −2 Σj (y2j − μ − α2) = 0
 . .
∂L/∂αt = −2 Σj (ytj − μ − αt) = 0.

From the first equation we get

ntμ + n Σi αi = Σi Σj yij.

Now using the assumption Σ αi = 0 (of the fixed effects model),

ntμ = Σi Σj yij,

so that

μ̂ = ȳ...

From the equations for the αi we get

nμ + nα1 = Σj y1j,

and similarly for each treatment, so that

α̂1 = ȳ1. − ȳ..
α̂2 = ȳ2. − ȳ..
α̂3 = ȳ3. − ȳ..
 . .
α̂t = ȳt. − ȳ..
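A sketch of these least squares estimators on a balanced layout (the function name is invented; data is a t × n list of lists, as before):

```python
def fit_one_way(data):
    """Least squares estimates for the one-way fixed effects model:
    mu_hat is the grand mean and alpha_hat[i] = (treatment i mean) - (grand mean)."""
    t, n = len(data), len(data[0])
    mu_hat = sum(sum(row) for row in data) / (t * n)
    alpha_hat = [sum(row) / n - mu_hat for row in data]
    return mu_hat, alpha_hat   # the alpha_hat sum to zero, matching the side condition
```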
Confidence Interval for the ith Treatment mean
As the mean of the ith treatment is given by μi = μ + αi, the point estimator of μi would be

μ̂i = μ̂ + α̂i = ȳ.. + (ȳi. − ȳ..) = ȳi..

Now as yij ~ NID(μi, σ²), therefore ȳi. ~ NID(μi, σ²/n).
Since MSE is an unbiased estimate of σ², the variable

(ȳi. − μi) / √(MSE/n)

follows a Student t distribution with t(n − 1) degrees of freedom. Thus the 100(1 − α)% confidence interval for μi can be constructed as follows:

P[−t(α/2; t(n−1)) < t < t(α/2; t(n−1))] = 1 − α

P[−t(α/2; t(n−1)) < (ȳi. − μi)/√(MSE/n) < t(α/2; t(n−1))] = 1 − α

P[−t(α/2; t(n−1)) √(MSE/n) < ȳi. − μi < t(α/2; t(n−1)) √(MSE/n)] = 1 − α

P[ȳi. − t(α/2; t(n−1)) √(MSE/n) < μi < ȳi. + t(α/2; t(n−1)) √(MSE/n)] = 1 − α,

which is the 100(1 − α)% confidence interval for the ith treatment mean μi.

Example
An anthropologist was interested in studying physical differences, if any, among the various races of people inhabiting Hawaii. As a part of her study she obtained a random sample of eight 5-year-old girls from each of three races: Caucasian, Japanese and Chinese. She made a number of anthropometric measurements on each girl. She wanted to determine whether the Oriental races differ from the Caucasian, and whether the Oriental races differ from each other. The results of the head width measurements are given in Table II.

Table II
__________________________________
Caucasian   Japanese   Chinese
  14.20       12.85     14.15
  14.30       13.65     13.90
  15.00       13.40     13.65
  14.60       14.20     13.60
  14.55       12.75     13.20
  15.15       13.35     13.20
  14.60       12.50     14.05
  14.55       12.80     13.80
__________________________________

The anthropologist is interested in answers to the following questions:
1. Do head width means differ among the races?
2. Is there a difference between the Caucasian race and the Oriental races?
3. Do the Oriental races differ in head width?
4. Find a 95% confidence interval on the mean of the Chinese head widths.

The sums of squares in Table III are computed as

1. Correction Factor (CF) = y..²/nt = (332)²/24 = 4592.6667
2. SSTot = Σ Σ y²ij − CF = 4604.9350 − 4592.6667 = 12.2683
3. SST = (1/n) Σ y²i. − CF = (1/8)(36808.7550) − 4592.6667 = 8.4277
4. SSE = SSTot − SST = 12.2683 − 8.4277 = 3.8406

Table III
____________________________________________________
Source     d.f.     SS         MS        F-ratio
Race        2      8.4277     4.2139      23.04
Error      21      3.8406     0.1829
____________________________________________________
Total      23     12.2683
____________________________________________________

1: H0: μ1 = μ2 = μ3
F = MST/MSE = 4.2139/0.1829 = 23.04
F.05,(2,21) = 3.47, F.01,(2,21) = 5.78
Since 23.04 > 5.78, the differences among the means are highly significant (α = 0.01).

2: H0: μ1 = (1/2)(μ2 + μ3)

t = [ȳ1. − (1/2)(ȳ2. + ȳ3.)] / √(s²{[1/n1] + [1/(n2 + n3)]}) = 6.361

t.01(21) = ±2.831. Since 6.361 > 2.831, the difference between the Caucasian and Oriental head width means is highly significant (α = 0.01).

3: H0: μ2 = μ3

t = (ȳ2. − ȳ3.)/√(2s²/n) = (13.188 − 13.694)/√(2(0.1829)/8) = −2.366

t.05(21) = ±2.080, t.01(21) = ±2.831.
Since −2.831 < −2.366 < −2.080, the difference between Japanese and Chinese is significant at the 5% level but not at the 1% level.

4: The 95% confidence interval on the mean head width of Chinese girls is

ȳ3. ± t(α/2, t(n−1)) √(MSE/n) = 13.694 ± 2.080(0.1512) = 13.694 ± 0.3145,

i.e., (13.379, 14.009).
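The computations of this example can be reproduced with a short script (data from Table II; SciPy is assumed to be available for the t and F tail values):

```python
from scipy import stats

data = {
    "Caucasian": [14.20, 14.30, 15.00, 14.60, 14.55, 15.15, 14.60, 14.55],
    "Japanese":  [12.85, 13.65, 13.40, 14.20, 12.75, 13.35, 12.50, 12.80],
    "Chinese":   [14.15, 13.90, 13.65, 13.60, 13.20, 13.20, 14.05, 13.80],
}
t, n = len(data), 8
grand = sum(sum(v) for v in data.values())
cf = grand ** 2 / (t * n)                                    # correction factor
sstot = sum(y * y for v in data.values() for y in v) - cf
sst = sum(sum(v) ** 2 for v in data.values()) / n - cf
sse = sstot - sst
mst, mse = sst / (t - 1), sse / (t * (n - 1))
F = mst / mse
print(round(sst, 4), round(sse, 4), round(F, 2))             # 8.4277 3.8406 23.04
print(stats.f.sf(F, t - 1, t * (n - 1)))                     # p-value for question 1

# Question 4: 95% confidence interval for the Chinese mean head width
ybar3 = sum(data["Chinese"]) / n
half = stats.t.ppf(0.975, t * (n - 1)) * (mse / n) ** 0.5
print(round(ybar3 - half, 3), round(ybar3 + half, 3))        # about (13.379, 14.008)
```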

Statistical Analysis of Random Effect Model


The model we have is

yij = μ + αi + εij,   i = 1, 2, ..., t;  j = 1, 2, ..., n.

The hypothesis that we want to test is

H0: σ²α = 0 versus H1: σ²α > 0.

Expected Mean Square

E[MSE] = σ² (as in the fixed effects model). For the treatment sum of squares,

E[SST] = E[n Σi (ȳi. − ȳ..)²]
       = E[n Σi ((μ + αi + ε̄i.) − (μ + ᾱ + ε̄..))²]
       = E[n Σi ((αi − ᾱ) + (ε̄i. − ε̄..))²]
       = n E[Σi (αi − ᾱ)²] + n E[Σi (ε̄i. − ε̄..)²] + cross-product terms.

The cross-product terms vanish in expectation, and since the αi ~ NID(0, σ²α),

E[SST] = n(t − 1)σ²α + (t − 1)σ²,

therefore

E[MST] = nσ²α + σ².

Now

SSE/σ² ~ χ²(t(n − 1)),

and under H0 we have

SST/(nσ²α + σ²) = SST/σ² ~ χ²(t − 1).

Thus under H0,

F = [SST/σ²(t − 1)] / [SSE/σ² t(n − 1)] = [SST/(t − 1)] / [SSE/t(n − 1)] = MST/MSE ~ F[t − 1; t(n − 1)],

since under H0 both MST and MSE are unbiased estimates of σ². But if H0 is false, then E[MST] = nσ²α + σ², i.e., under H1 the expected value of the numerator is greater than the expected value of the denominator. Hence the critical region of the test will be:
Reject H0 if Fcal > Fα; [t − 1, t(n − 1)].

Estimate of σ² and σ²α


Since MSE and MST are unbiased estimates of σ² and nσ²α + σ² respectively, we take

σ̂² = MSE.

The estimate of σ²α can be obtained as follows: MST estimates σ² + nσ²α, thus

σ̂²α = (MST − MSE)/n.

In an unbalanced design, "n" is replaced by

n0 = [1/(t − 1)] [Σi ni − (Σi ni²)/(Σi ni)].
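A sketch of these estimators, including the effective replication number n0 for unbalanced data (the function names are invented, and truncating a negative variance estimate at zero is a common convention rather than part of the text):

```python
def variance_components(mst, mse, n):
    """Method-of-moments estimates: sigma2_hat = MSE and
    sigma2_alpha_hat = (MST - MSE) / n, truncated at zero if negative."""
    return mse, max((mst - mse) / n, 0.0)

def n0(group_sizes):
    """Effective 'n' for an unbalanced design:
    n0 = [sum(ni) - sum(ni**2) / sum(ni)] / (t - 1)."""
    t, total = len(group_sizes), sum(group_sizes)
    return (total - sum(ni * ni for ni in group_sizes) / total) / (t - 1)

# Balanced check: n0([4, 4, 4, 4]) == 4.0
```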

Confidence Interval for σ² and σ²α


We know that SSE/σ² ~ χ²(t(n − 1)). We can use this statistic to find the 100(1 − α)% confidence interval (C.I.) for σ². The 100(1 − α)% confidence interval for σ² is given by:

P[χ²(1 − α/2)(d.f.) < χ²(d.f.) < χ²(α/2)(d.f.)] = 1 − α

P[χ²(1 − α/2)(d.f.) < SSE/σ² < χ²(α/2)(d.f.)] = 1 − α

P[SSE/χ²(α/2)(d.f.) < σ² < SSE/χ²(1 − α/2)(d.f.)] = 1 − α,

which is the required 100(1 − α)% confidence interval for σ².

Confidence Interval for σ²α

An exact confidence interval for σ²α can't be constructed; however, a confidence interval for the ratio σ²α/(σ²α + σ²) can be determined as follows.
As SST/(nσ²α + σ²) ~ χ²(t − 1) and SSE/σ² ~ χ²(t(n − 1)), therefore

[SST/(nσ²α + σ²)(t − 1)] / [SSE/σ²(t(n − 1))] = [MST/(nσ²α + σ²)] / [MSE/σ²] ~ F[t − 1; t(n − 1)].

Thus

P[F(1 − α/2)(d.f.) ≤ F ≤ F(α/2)(d.f.)] = 1 − α

P[F(1 − α/2)(d.f.) ≤ (MST/MSE) · σ²/(nσ²α + σ²) ≤ F(α/2)(d.f.)] = 1 − α

P[(1/F(α/2)(d.f.)) (MST/MSE) ≤ (nσ²α + σ²)/σ² ≤ (1/F(1 − α/2)(d.f.)) (MST/MSE)] = 1 − α

P[(1/n)((1/F(α/2)(d.f.)) (MST/MSE) − 1) ≤ σ²α/σ² ≤ (1/n)((1/F(1 − α/2)(d.f.)) (MST/MSE) − 1)] = 1 − α.

Now the confidence interval for σ²α/(σ²α + σ²) can be found using the above interval. As L ≤ σ²α/σ² ≤ U, and we can write

σ²α/(σ²α + σ²) = (σ²α/σ²) / (1 + σ²α/σ²),

the required confidence limits will be

L/(1 + L) ≤ σ²α/(σ²α + σ²) ≤ U/(1 + U),

where

L = (1/n)[(MST/MSE)(1/F(α/2)(d.f.)) − 1],   U = (1/n)[(MST/MSE)(1/F(1 − α/2)(d.f.)) − 1].
 2   2 
Example
A textile company weaves a fabric on a large number of looms. They would like the looms to be homogeneous so that they obtain a fabric of uniform strength. To investigate the variation in strength between looms, four (4) looms are selected at random and the strength of the fabric manufactured on each loom is determined. The data obtained are given in Table IV.

Table IV
___________________________________
                   LOOMS
Observations    1    2    3    4
     1         98   91   96   95
     2         97   90   95   96
     3         99   93   97   99
     4         96   92   95   98
___________________________________

1) Test the hypothesis that there is no variation between looms.
2) Estimate the variance components σ² and σ²α.
3) Find a 95% confidence interval for the ratio σ²α/(σ²α + σ²).

The hypothesis to be tested is:
H0: σ²α = 0 vs. H1: σ²α > 0
The significance level is taken as α = 0.05. The test statistic to be used is F = MST/MSE ~ F[t − 1, t(n − 1)] under H0.
Coding the data by subtracting 95 from every observation gives Table V.

Table V
__________________________________________________
                   LOOMS
Observations    1     2     3     4     Total
     1          3    -4     1     0
     2          2    -5     0     1
     3          4    -2     2     4
     4          1    -3     0     3
__________________________________________________
yi.            10   -14     3     8       7
y²i.          100   196     9    64     369
Σj y²ij        30    54     5    26     115
__________________________________________________

Computation:
C.F. = (7)²/16 = 3.06
SSTot = 115 − 3.06 = 111.94
SST = 369/4 − 3.06 = 89.19
SSE = 111.94 − 89.19 = 22.75

The ANOVA table is given in Table VI.

Table VI
____________________________________________________
Source       d.f.     SS       MS      F-ratio
Treatment     3      89.19    29.73     15.68
Error        12      22.75     1.90
____________________________________________________
Total        15     111.94
____________________________________________________

The critical region is: Reject H0 if Fcal > F.05,[3,12] = 3.49.
1) As the F value at the 5% level of significance is significant, that is, F falls in the critical region, we reject H0 and conclude that there is variation in the average strength of the fabric produced by the different looms.
2) Now σ̂² = MSE, where MSE = 1.90; therefore σ̂² = 1.90.
Now σ̂²α = (MST − MSE)/n = (29.73 − 1.90)/4 = 6.96.
3) The confidence interval for the ratio σ²α/(σ²α + σ²) is

L/(1 + L) ≤ σ²α/(σ²α + σ²) ≤ U/(1 + U),

where L and U are as defined above.
F.025,[3,12] = 4.47 and F.975,[3,12] = 1/F.025,[12,3] = 1/14.34.
Therefore L = (1/4){(29.73/1.90)(1/4.47) − 1} = 0.625
and U = (1/4){(29.73/1.90)(14.34) − 1} = 55.85,
so L/(1 + L) = 0.625/1.625 = 0.38
and U/(1 + U) = 55.85/56.85 = 0.98.
Therefore the 95% confidence interval for the ratio σ²α/(σ²α + σ²) is

0.38 ≤ σ²α/(σ²α + σ²) ≤ 0.98.
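The looms computations can be verified with a short script (data from Table IV; SciPy is assumed to be available for the F quantiles):

```python
from scipy import stats

looms = [
    [98, 97, 99, 96],   # loom 1
    [91, 90, 93, 92],   # loom 2
    [96, 95, 97, 95],   # loom 3
    [95, 96, 99, 98],   # loom 4
]
t, n = len(looms), len(looms[0])
grand = sum(sum(r) for r in looms)
cf = grand ** 2 / (t * n)
sstot = sum(y * y for r in looms for y in r) - cf
sst = sum(sum(r) ** 2 for r in looms) / n - cf
sse = sstot - sst
mst, mse = sst / (t - 1), sse / (t * (n - 1))
print(round(sst, 2), round(sse, 2), round(mst / mse, 2))   # 89.19 22.75 15.68
print(round(mse, 2), round((mst - mse) / n, 2))            # sigma2 = 1.9, sigma2_alpha = 6.96

# 95% confidence interval for sigma2_alpha / (sigma2_alpha + sigma2)
fu = stats.f.ppf(0.975, t - 1, t * (n - 1))                # F.025[3,12] = 4.47
fl = stats.f.ppf(0.025, t - 1, t * (n - 1))                # = 1 / F.025[12,3] = 1/14.34
L = ((mst / mse) / fu - 1) / n
U = ((mst / mse) / fl - 1) / n
print(round(L / (1 + L), 3), round(U / (1 + U), 3))        # about 0.385 and 0.982
```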
