Analysing Repeated Measurements in Soil Monitoring and Experimentation
Summary
Field monitoring, leaching studies, and experimentation in soil biology are often now being done non-destructively using fixed installations so that measurements are made repeatedly on the same units. The resulting data for each unit (suction cup, lysimeter, incubation chamber) constitute a time series in which there may be autocorrelation. The usual methods of statistical analysis, such as the analysis of variance, must be modified or replaced by more suitable ones to take account of the possible correlation.
This paper describes the split-plot design of such experiments, shows how to assess the variance–covariance matrix of residuals for uniformity by the Greenhouse–Geisser statistic, and shows how to use this statistic to adjust the degrees of freedom in a formal test of significance. It also describes more recent methods. Ante-dependence analysis identifies the extent of the temporal correlation in the data and provides approximate significance tests for the treatments. Alternatively, the paper also shows how the traditional analysis of variance may be replaced by a restricted maximum likelihood analysis which gives Wald statistics.
The techniques are illustrated with data on CO2 evolved from soil incubated for 75 days in closed chambers, during which time the gas was measured on 24 occasions to give time series for three replicates of each combination of two soils (limed and unlimed) and three types of ryegrass amendment. An ante-dependence structure (extending to ninth order) weakened the usual significance test within the subunit stratum. The Wald statistics showed that there was, nevertheless, a strong interaction between the treatments and time.
Soil scientists have been analysing results from designed experiments on crop nutrition, fertilizer efficiency and soil amelioration for many years, both in the field and in the glasshouse. Yates's (1937) 'TC 35' perhaps more than any other text educated them in factorial designs so that they could investigate the effects of different kinds of treatment both alone and in combination in single experiments. Thus they might add N, P and K as fertilizers at two or more levels to the soil and measure the main effects of each on crop yield together with their interactions, i.e. the extent to which the treatments combined non-additively. There are thousands of examples in the literature.
The response to a treatment may change with time, so that time itself becomes a factor that should be incorporated into the design. For example, an investigator might set up a series of pots in a glasshouse, grow plants in them, and remove both plants and soil for analysis at intervals during the experiment. In such an experiment sampling is destructive, and any one pot, a unit of the experiment, provides only one measurement for each variable. Again, there are many examples.
Field monitoring, leaching studies and experimentation in soil biology are introducing a new kind of design in which there are fixed installations and sampling is non-destructive. Examples are the evolution of CO2 by the soil's microflora in response to a treatment for which the soil is placed in closed chambers (e.g. E.A. Webster et al., 2000), suction cups installed permanently in the field to measure the ionic concentration of the soil solution (e.g. Ridley et al., 2001), and lysimeters for leaching experiments. In all of these the measurements are made repeatedly on the same units. The resulting data for each unit then constitute a time series in which there may be autocorrelation. This calls for a more elaborate analysis than the classical analysis of variance.
Both Webster et al. and Ridley et al. recognized that they were dealing with repeated measurements and adapted their analyses to the autocorrelation. In general, however, soil scientists are unfamiliar with the proper forms of analysis for such experiments. The only text book that we know on the subject is that by Winer (1971); that targets psychologists, it is out-of-date, and it has some confusing notation.
To place the task in context we describe briefly the laboratory experiment on the decomposition of organic matter in soil by Webster et al. (2000), and we illustrate the analysis with the authors' data. The scientific questions addressed by the authors were
1 does liming enhance the ability of the soil's microbial population to decompose organic matter?,
2 if it does then to what extent?, and
3 do the treatments cause differences in behaviour over time?
To answer them the authors took samples of topsoil from a field experiment designed to determine the lime required to improve acid upland pasture in Northern England. This experiment had been laid down in 1981 at the Redesdale Experimental Husbandry Farm, Northumbria, where the soil at the time had a pH of 3.9. In 1995 the authors sampled one plot that had received 20 t CaCO3 ha⁻¹, the soil of which then had a pH of 5.2, and another that had remained unlimed and where the soil's pH was 3.5. There were thus two levels of liming, some and none, or in the terms we use below, 'limed' and 'unlimed'.
To assess the versatility of the soil's microflora in metabolizing organic matter the authors mixed into the soil from the two plots ryegrass (Lolium perenne) either fresh or after treating it with ethanol and detergent. The latter was to make the cell membranes more permeable and to remove some of the cell contents including chlorophyll, soluble sugars and amino acids. In addition, the authors left some of the soil unamended, i.e. nothing added. This gave three levels of organic amendment, which we denote 'fresh', 'treated' and 'nil'.
Subsamples of soil, three from each of the six combinations, were weighed into small glass chambers, which were placed in micro-respirometric devices so that the CO2 from them could be determined, and incubated at 15°C for 75 days. The head space of each chamber was flushed out from time to time (there were 24 occasions, which are listed in Table 1), and the amount of CO2 produced since the previous time was measured by gas chromatography. Time is thus a third factor in the experiment. Webster et al. show the responses of the six combinations of soil and ryegrass to time by plotting the
© 2002 Blackwell Science Ltd, European Journal of Soil Science, 53, 1–13
cumulative production of CO2 against time in their Figure 3. The units are in mol per chamber.

Experimental design

The experiment has three factors. Two of them, liming and ryegrass amendment, comprise a conventional 3 × 2 factorial arrangement. With threefold replication this gave 18 experimental units, which were completely randomized. The amount of CO2 produced in each chamber was observed on 24 successive occasions, though at unequal intervals, to see how the differences between treatments changed over time, and this gave 432 observations in total.
The design might seem analogous to a split-plot design, with the soil samples (chambers, units) corresponding to whole plots, and the occasions of observation to the subplots. However, there are some important differences between these two situations. For example, in split-plot experiments the allocation of the treatments to each subject is (or should be) randomized, but in an experiment with repeated measurements the time factor clearly cannot be randomized. Also, in a split-plot analysis there is the important assumption that the observations within each whole plot share an equal correlation, and this is less likely to hold with repeated measurements. Indeed, the correlation structure is one of the most important issues to consider when analysing repeated measurements. Commonly there will be strong correlations because measurements are made on the same samples, there are likely to be serial correlations due to measurements' being made successively over time, and the variance itself may change with time. As a result, conventional statistical techniques such as analysis of variance and regression cannot be used directly with repeated measurements.

Summary statistics

We might analyse data from repeated measures by first calculating a summary statistic from the observations (over time) on each unit. Such summaries can be treated like an ordinary response variate and analysed by conventional methods, such as analysis of variance, regression or generalized linear models. Estimates of error for the summary statistics are based solely on the randomization in the experimental design; they require no assumptions about or knowledge of the covariance structure of the repeated measurements. The analyses are straightforward, but they depend on the summaries' being sensible. In the example, if we were interested only in the total amount of CO2 produced in the chambers in the experiment, rather than the pattern of production over its duration, then we could analyse the totals. Note, however, that were we to choose to generate several different summaries it is unlikely that they would be independent. So we should either take care in interpreting the results or consider analysing them as a multivariate set of data, for example, by multivariate analysis of variance. Note also, however, that the measurements on different variables can be independent even when they are all taken at the same times.
As an example, Rowell & Walters (1976) suggested fitting polynomial equations to the data from each unit, and then treating the estimated coefficients of the equations as the response variables. This is unlikely to be satisfactory because of correlations among the coefficients. We could instead fit orthogonal polynomials and analyse their coefficients, but these would still be correlated (though less so) and they would be much more difficult to interpret.
Nevertheless, using summary statistics avoids problems of modelling the within-unit correlations. In fact, for polynomials there is now the alternative of fitting a random coefficient model which, as described below, estimates both the coefficients and the correlations between them. Both of these approaches, however, assume that the time response can reasonably be modelled by a polynomial, but, as Figure 1 shows, this looks unlikely for the CO2 data.

Analyses on single occasions

Often, repeated measurements are analysed using separate analyses of the data from each occasion. This is valid, but the separate tests give information only about the treatment differences at the observed times. They cannot be used to assess changes over time; the fact that the statistical significance of a treatment has changed between times does not mean that the treatment has changed significantly over time. Likewise, the appearance of similar patterns of treatment effects on several occasions does not provide independent confirmation of the effects. Essentially, the analyses are correct but not especially informative.

Analysis of variance of repeated measurements

We now consider in more detail the distinctions between repeated measurements and split-plot experiments. When measurements are repeated on the same units there is likely to be a greater correlation between observations that are made on successive occasions (points in time) than between those separated by longer times. Furthermore, the factor time cannot, by its very nature, be allocated at random to the occasions within samples. In the customary split-plot situation we can usually assume that there is an equal correlation between the subplots of each whole plot. Even if this were not so, the subplot treatment should have been allocated at random to the subplots within each whole plot, and this should guarantee the validity of the analysis by ensuring an equal expected correlation between each pair of treatment combinations within each whole plot.
Another potential problem arising from the systematic nature of the time factor is that effects arising from the 'length
4 R. Webster & R. W. Payne
of treatment time' will be confounded with any effects arising from the duration of the experiment, such as age of the microbial biomass in this instance. This does not affect the validity of the analysis, but you should bear it in mind when drawing conclusions about the effects of time. We therefore proceed with the analysis.
Let us suppose that we have n samples of soil (units, which we shall call 'Samples'), q times ('Times') and a single treatment factor with m levels. The model for this is

$$z_{ijk} = \mu + a_i + s_k + t_j + \mathit{at}_{ij} + e_{ijk}. \qquad (1)$$

In it $\mu$ is the general mean, $a_i$ is the effect of the ith treatment, $s_k$ is the (random) effect of the kth subject (unit), $t_j$ is the effect of the jth time, and $\mathit{at}_{ij}$ is the time by treatment interaction effect, denoted by Times.Treatments in Table 1. Note that the last is a single quantity although we use a double symbol to denote it. The quantity $e_{ijk}$ is a random residual attaching to an individual observation, whereas the effects $a_i$, $t_j$, and $\mathit{at}_{ij}$ are fixed – fixed by the experimenters because they were what they wanted to investigate. Treating the design as a split plot would partition the variance as in Table 1.
The formal conditions for the validity of the split-plot analysis are discussed in more detail below. Notice here, though, that any lack of uniformity of the correlations between times affects only the subplot (Times within Samples) stratum. The main plot (Samples) stratum partitions the between-sample variation, i.e. variance of the measurements totalled over the samples. This is just one of the possible summary statistics mentioned above, and this part of the analysis will be valid whatever the structure of the within-sample correlation. Further, when measurements are taken on only two occasions the analysis in the Times within Samples stratum will be valid too. There can then be only one within-sample correlation, and the analysis in the Times within Samples stratum is of another summary statistic, namely the difference between the observations at time 2 and time 1 on each sample. However, the information from Times within Samples, describing the way in which the treatment effects change differentially with time, is often what is of most interest.

Table 1 Structure for analysis of variance in split-plot experiment with a single treatment factor

Source of variation                        Degrees of freedom
Main plot (Samples) stratum                n – 1
  Treatments                               m – 1
  Residual                                 n – m
Subplot (Times within Samples) stratum     n(q – 1)
  Times                                    q – 1
  Times.Treatments                         (q – 1)(m – 1)
  Residual                                 (q – 1)(n – m)

The formal requirement for the validity of the analysis in the subplot stratum of a split-plot design is that all the normalized contrasts in that stratum have an equal variance. The only practical arrangement in the variance–covariance matrix of the residuals, C below, is one in which all the off-diagonal correlations, $\rho_{ij}$, $i \neq j$, in the correlation matrix R are equal, i.e. ($\rho_{ij} = \rho$, $i \neq j$), thus:

$$\mathbf{C} = \sigma^2_{ST}\,\mathbf{R} = \sigma^2_{ST}\begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}. \qquad (2)$$

The variance $\sigma^2_{ST}$ in this equation is estimated by the residual mean square in the subplot stratum. This pattern is known as a uniform covariance structure; equivalently, the matrix is said to show compound symmetry.
Lack of compound symmetry can, however, be accommodated in the analysis. In the usual split-plot analysis, the residual sum of squares for the subplot stratum is assumed to be distributed as $\sigma^2_{ST}\,\chi^2_{(q-1)(n-m)}$, where $\chi^2_{(q-1)(n-m)}$ has a chi-square distribution with (q – 1)(n – m) degrees of freedom. Similarly, under the assumption that there is no Times.Treatments interaction, the Times.Treatments sum of squares is assumed to be distributed as $\sigma^2_{ST}\,\chi^2_{(q-1)(m-1)}$, where $\chi^2_{(q-1)(m-1)}$ has a chi-square distribution with (q – 1)(m – 1) degrees of freedom. For the general case Greenhouse & Geisser (1959) proposed a coefficient, which we denote by $\varepsilon$, by which to multiply the nominal degrees of freedom in the subplot stratum. The coefficient was estimated by maximum likelihood, and is given by

$$\varepsilon = \frac{n^{2}\left(\bar{c}_{ii} - \bar{c}\right)^{2}}{(n-1)\left(\sum_{i=1}^{n}\sum_{k=1}^{n} c_{ik}^{2} - 2n\sum_{i=1}^{n}\bar{c}_{i}^{2} + n^{2}\bar{c}^{2}\right)}, \qquad (3)$$

in which $\bar{c}_{ii}$ is the mean of the diagonal elements of C, $\bar{c}$ is the mean of all its elements, and $\bar{c}_{i}$ is the mean of the elements in the ith row of C. In Equation (3) n stands for the order of C, which is the number of times (q in the notation above).
To test the Times.Treatments interaction in an ordinary split-plot analysis we should compute F as the ratio of the mean square for Times.Treatments to the residual subplot mean square and compare it with F on (q – 1)(m – 1) and (q – 1)(n – m) degrees of freedom. Where there are unequal correlations amongst repeated measurements, however, we apply the Greenhouse–Geisser coefficient to obtain $\varepsilon$(q – 1)(m – 1) and $\varepsilon$(q – 1)(n – m) effective degrees of freedom. For the ordinary split-plot analysis, when C has compound symmetry, $\varepsilon$ = 1. Otherwise it takes values between 1 and 1/(q – 1). The latter extreme is equivalent to 1 degree of freedom for each unit in the numerator and n – 1 degrees in the denominator, and it results in a most conservative test. Remember that when there are just two
observations on each unit, and thus one within-unit degree of freedom, the usual analysis is valid.
We now tackle the analysis of the actual experiment in which there were two treatment factors instead of just one. The model, an extension of Equation (1), is as follows:

$$z_{ihjk} = \mu + a_i + b_h + \mathit{ab}_{ih} + s_k + t_j + \mathit{at}_{ij} + \mathit{bt}_{hj} + \mathit{abt}_{ihj} + e_{ihjk}. \qquad (4)$$

We designate the effect of the ith level of treatment factor Lime by $a_i$, and we use $b_h$ to stand for the hth level of treatment factor Ryegrass. The terms $\mathit{bt}_{hj}$, $\mathit{ab}_{ih}$, and $\mathit{abt}_{ihj}$ are the additional interaction terms, and the residual is now $e_{ihjk}$. The analysis of variance (Table 2) is a little more complex, but the principles are the same. Table 3 summarizes the results.
We cannot test formally for compound symmetry here because the number of residual degrees of freedom between samples (12) is smaller than the number of times (24). However, the small value of the Greenhouse–Geisser coefficient, only 0.1998, indicates the lack of compound symmetry and leads to substantially smaller effective degrees of freedom than would be calculated from Table 2. These in turn mean that the F ratios are more probable on the null hypothesis than would be judged in the usual way, and that without adjusting the degrees of freedom we should obtain too many 'significant' results. In this instance, the effective degrees of freedom are 4.6 for Times and Times.Lime effects and 9.2 for the Times.Ryegrass and Times.Lime.Ryegrass interactions, and the effective residual degrees of freedom are 55. These still lead to probabilities less than 0.01; so all are significant in the usual sense.

Table 2 Structure for analysis of variance for the CO2 experiment

Source of variation                        Degrees of freedom
Main plot (Samples) stratum                n – 1
  Lime                                     m – 1
  Ryegrass                                 r – 1
  Lime.Ryegrass                            (m – 1)(r – 1)
  Residual                                 n – mr
Subplot (Times within Samples) stratum     n(q – 1)
  Times                                    q – 1
  Times.Lime                               (q – 1)(m – 1)
  Times.Ryegrass                           (q – 1)(r – 1)
  Times.Lime.Ryegrass                      (q – 1)(m – 1)(r – 1)
  Residual                                 (q – 1)(n – mr)

Multivariate analysis of variance

Multivariate analysis of variance is another traditional method, which provides significance tests for the treatment effects. In contrast to the usual analysis of variance, it demands no assumptions about the covariance structure and is thus very general. It is used in many settings outside repeated measurements. Its very generality, however, means that it may be less efficient than the more specialized methods. Also, because essentially it fits a different correlation for every time difference, it cannot be used when (as in our example and as is frequently the case) the number of times exceeds the number of residual degrees of freedom at each time, because the variance–covariance matrix is then singular.

Ante-dependence analysis

Ante-dependence analysis can be regarded as a generalization of multivariate analysis of variance that caters for the types of covariance structure that typically occur with repeated measurements. It contains multivariate analysis of variance as a least efficient, special case. As mentioned above, the correlation between repeated measurements tends to decrease
Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio
Samples stratum
  Lime                1                    28570.3          28570.3       40.58
  Ryegrass            2                    783977.8         391988.9      556.75
  Lime.Ryegrass       2                    15223.0          7611.5        10.81
  Residual            12                   8448.8           704.1         4.34
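The Greenhouse–Geisser adjustment described in the text can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the computation used for the paper: the function name `greenhouse_geisser_epsilon`, the 5 × 5 example matrices and the correlations 0.6 and 0.9 are ours. Only the coefficient 0.1998 and the design constants (q = 24, n = 18, m = 2, r = 3) come from the text. The function implements Equation (3) with k as the order of the matrix.

```python
import numpy as np

def greenhouse_geisser_epsilon(C):
    """Greenhouse-Geisser coefficient for a k x k variance-covariance matrix C."""
    C = np.asarray(C, dtype=float)
    k = C.shape[0]
    mean_diag = np.trace(C) / k       # mean of the diagonal elements
    grand_mean = C.mean()             # mean of all elements
    row_means = C.mean(axis=1)        # mean of the elements in each row
    numerator = k**2 * (mean_diag - grand_mean) ** 2
    denominator = (k - 1) * (
        (C**2).sum() - 2 * k * (row_means**2).sum() + k**2 * grand_mean**2
    )
    return numerator / denominator

k = 5
# Compound symmetry (illustrative: variance 2, equal correlation 0.6).
uniform = 2.0 * ((1 - 0.6) * np.eye(k) + 0.6 * np.ones((k, k)))
eps_uniform = greenhouse_geisser_epsilon(uniform)

# Correlation that decays with the lag between times (illustrative: 0.9**lag).
lags = np.abs(np.subtract.outer(np.arange(k), np.arange(k)))
eps_serial = greenhouse_geisser_epsilon(0.9 ** lags)

# Effective degrees of freedom for the CO2 experiment, from the reported coefficient.
q, n, m, r = 24, 18, 2, 3
epsilon = 0.1998
nominal_df = {
    "Times": q - 1,
    "Times.Lime": (q - 1) * (m - 1),
    "Times.Ryegrass": (q - 1) * (r - 1),
    "Times.Lime.Ryegrass": (q - 1) * (m - 1) * (r - 1),
    "Residual": (q - 1) * (n - m * r),
}
effective_df = {term: epsilon * df for term, df in nominal_df.items()}
```

With compound symmetry the coefficient equals 1, and with serially decaying correlation it falls between 1/(k – 1) and 1. Multiplying the nominal degrees of freedom of Table 2 by 0.1998 gives the effective values of about 4.6, 9.2 and 55 quoted in the text.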
Table 4 Ante-dependence structures (a) sequential comparisons, and (b) comparisons with maximum order
(a) Sequential comparisons
as the length of time (and number of time intervals) between the measurements increases. The concept of ante-dependence is based on assumptions about how many preceding measurements will contain information about the current one. Specifically, the set of variates observed at the successive times is said to have an ante-dependence structure of order r if each jth variate (j > r), given the preceding r, is independent of all further preceding variates (Gabriel, 1961, 1962). For example, if the ante-dependence structure is first order then regressing the present measurement on the earlier ones will show that only the measurement immediately preceding the present one is needed in the model. Kenward (1987) describes methods for determining the appropriate order and then using this to test for the treatment effects. To determine the order chi-square statistics are calculated to provide a sequential comparison of each order of ante-dependence with the next order (one with two, two with three, and so on), and a comparison of each order with the maximum possible order.
Table 4 lists the statistics for the CO2 data. From it we may conclude that an order of 9 would be required to represent the correlation in time. The probabilities of orders less than 9 are very small, apart from an anomalous large value (0.756) for order 2 against order 3 in the first half of the table. This allows us to simplify the structure only a little from the maximum of 11, and we can conclude that the ante-dependence structure gives little simplification in this case. More importantly, it shows, like the Greenhouse–Geisser coefficient, how much smaller are the effective degrees of freedom than the nominal ones in Table 2 and the need to calculate them before testing for significance in the Times within Samples stratum.
Having established a maximum order of ante-dependence, here 9, we can proceed with the analysis to assess the treatments. The overall tests show significant main effects and interactions. We therefore compile 'test-for-change' tables, Tables 5a, b and c.
The right-hand columns of each contain overall tests using the data up to each successive time, indicating how the weight of evidence accumulates as the time progresses. The evidence leaves us in no doubt that both the main effects, liming and
Table 5 Tests of (a) Lime, (b) Ryegrass and (c) Lime.Ryegrass for an ante-dependence structure of order 9
(a) Lime
Time   Statistic   Degrees of freedom   Probability   Statistic   Degrees of freedom   Probability
Overall test using data from all the times: Statistic 78.084, Degrees of freedom 24, Probability < 0.001.
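The definition of first-order ante-dependence given in the text can be checked numerically. The sketch below is ours and uses invented values (six equally spaced times, correlation 0.7 raised to the lag); the helper `partial_covariance` is a hypothetical name, and none of this reproduces Kenward's (1987) tests. It shows that, once the immediately preceding variate is taken into account, the last variate has no remaining covariance with earlier ones, and that a regression on all preceding variates needs only the most recent.

```python
import numpy as np

def partial_covariance(C, i, j, given):
    """Covariance of variates i and j after regressing both on the variates in `given`."""
    C = np.asarray(C, dtype=float)
    g = list(given)
    adjustment = C[i, g] @ np.linalg.solve(C[np.ix_(g, g)], C[g, j])
    return C[i, j] - adjustment

# Illustrative first-order structure: correlation phi**lag among q equally spaced times.
q, phi = 6, 0.7
lags = np.abs(np.subtract.outer(np.arange(q), np.arange(q)))
C = phi ** lags

# Order r = 1: given the variate at time q-2, the variate at time q-1 is
# uncorrelated with every earlier variate.
last = q - 1
residual_links = [
    partial_covariance(C, last, earlier, [last - 1]) for earlier in range(last - 1)
]

# Regressing the last variate on all preceding ones: only the coefficient on the
# immediately preceding variate differs from zero.
coefficients = np.linalg.solve(C[:last, :last], C[:last, last])
```

For a structure of order r the same check would condition on the r preceding variates rather than one.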
ryegrass amendment, and their interactions are highly significant over the life of the experiment.
The left-hand columns of the tables assess the information contributed by each time that is additional to that provided by earlier times (Kenward, 1987, page 303). If (as here) there is a reasonably strong correlation between the measurements then the statistics can be construed as testing for a treatment effect between pairs of successive occasions. We see, for example, from Tables 5a and b that at Times 1, 2 and 3 the effects of both liming and ryegrass were highly significant, whereas the interaction was significant at only Time 1 of these (Table 5c). Time 8 stands out in all three tables, and the effects at Time 17 are only a little less marked; we should ask why.
You can compare these results with the graphs showing the mean amounts of CO2 evolved, Figure 1(a) for Lime, (b) for Ryegrass, and (c) and (d) for the interactions. The changes are greatest prior to the occasions mentioned above, which we have marked with the black discs.
More recently, ante-dependence has become one of the structures that can be modelled as part of a REML analysis (see below), and this generates not only tests but also estimates of parameters and standard errors.

REML modelling of correlation structures

The analysis of variance presented earlier is an example of a linear mixed model. Such models have two parts. The fixed effects model in the example contains the terms in Equation (4), namely the grand mean, along with the main effects of Liming, Ryegrass and Times and their various interactions. The other two terms, $s_k$ and $e_{ihjk}$, form the error model. The example is balanced: that is, the treatments are equally replicated, and the measurements of CO2 on the samples were all made at an identical set of times. The analysis could therefore be produced by a conventional balanced ANOVA algorithm. For unbalanced linear mixed models, the equivalent
Table 5 Continued
(b) Ryegrass
Time   Statistic   Degrees of freedom   Probability   Statistic   Degrees of freedom   Probability
Overall test using data from all the times: Statistic 222.096, Degrees of freedom 48, Probability < 0.001.
method is known as REML, standing for Residual (or Restricted) Maximum Likelihood. The method produces the same information as that given by ordinary analysis of variance where this is possible in the more general, unbalanced, context, but it cannot match the more specialized ANOVA output. In particular, no analysis-of-variance table can be produced, and REML produces Wald statistics instead, as illustrated below and listed in Table 6.
An advantage of REML with repeated measurements, however, is that you can specify models to describe the correlations between the random effects – see Gilmour et al. (1995) and Chapter 5 in Payne (2000). The error model for repeated measurements has two terms: Samples, $s_k$ in Equation (4), representing the random variation between the soil samples, and Times within Samples, $e_{ihjk}$ in Equation (4), representing the variation amongst the measurements made on each sample. In conventional analyses, the individual parameters in each random term are assumed to be independent. Fitting a single parameter for each sample then imposes an equal correlation on the observations made on each sample, thereby giving the standard split-plot analysis. Instead of assuming that the parameters, Times within Samples, are independent, we could specify a model for their correlations or covariances within each subject. One possibility would be to assume an ante-dependence structure of some appropriate order as described above.
Another possibility, often used in the analysis of economic time series, would be to use an auto-regressive structure. With a first-order auto-regressive structure, one assumes that the residual $\eta_t$ at time t is given by

$$\eta_t = \phi\,\eta_{t-1} + e_t,$$

where $\phi$ is a constant parameter and $e_t$ is a random variable. The correlation of points j apart in time is then $\phi^{j}$, where $\phi$ is a parameter lying between –1 and 1. A second-order auto-regressive structure is based on the assumption that the regression involves the two previous residuals, so the residual $\eta_t$ at time t is given by

$$\eta_t = \phi_1\,\eta_{t-1} + \phi_2\,\eta_{t-2} + e_t,$$
Table 5 Continued
(c) Lime.Ryegrass
Time   Statistic   Degrees of freedom   Probability   Statistic   Degrees of freedom   Probability
Overall test using data from all the times: Statistic 167.387, Degrees of freedom 48, Probability < 0.001.
where $\phi_1$ and $\phi_2$ are constant parameters, and $e_t$ is again a random variable. The pattern of correlations is now rather more complex. The element $r_{i,j}$ of the correlation matrix R in row i and column j is given by the following three equations:

$$r_{i,i} = 1,$$
$$r_{i+1,i} = \phi_1 / (1 - \phi_2),$$
$$r_{i,j} = \phi_1 r_{i-1,j} + \phi_2 r_{i-2,j}, \qquad (5)$$

where $i > j + 1$, $-1 < \phi_1, \phi_2 < 1$, $|\phi_1 + \phi_2| < 1$, $\phi_2 - \phi_1 < 1$, and $\phi_2 > -1$.
Auto-regressive structures are sensible only when the measurements are equally (or nearly equally) spaced in time. For unequal timepoints, the equivalent to a first-order auto-regressive structure would be a power–time model in which the correlation between the observations made at times $t_1$ and $t_2$ is given by $\phi^{\Delta t}$, where $\phi$ is a parameter that is estimated in the analysis, and $\Delta t = t_1 - t_2$. Either the auto-regressive or the power–time models can be generalized to allow different amounts of residual variation at each time, and this has been done below for the CO2 data. This generalization is known as the heterogeneous power model. The covariance structure is then $\mathbf{D}^{1/2}\mathbf{R}\mathbf{D}^{1/2}$, where R is the matrix of correlations, as in Equation (2), and D is a matrix with the variances at each time in its diagonal elements and zero elsewhere.
The parameter $\phi_1$ is the estimate of the correlation parameter $\phi$, while the Scale parameters are the estimates of the variances at each time (making up the diagonal of the matrix D).
The Wald statistic to test the null hypothesis for a fixed model term, say $\mathbf{a}_i = 0$ for the ith effect, is defined by the quadratic form

$$\hat{\mathbf{a}}_i^{\mathrm{T}}\,\hat{\mathbf{V}}_i^{-1}\,\hat{\mathbf{a}}_i,$$

in which the vector $\hat{\mathbf{a}}_i$ contains the estimates of the effects, and $\hat{\mathbf{V}}_i$ is their variance–covariance matrix. In a balanced design, this corresponds to the treatment sum of squares divided by the stratum mean square. The Wald statistic would have an exact $\chi^2$ distribution if the variance parameters were known, but
Figure 1 Graphs of amount of CO2 evolved against time. The upper graphs show the amounts for the factor Lime averaged over the factor
Ryegrass (a) and the factor Ryegrass averaged over the factor Lime (b). The lower graphs show the interactions of Lime.Ryegrass, left for the
unlimed treatment (c) and right for the limed treatment (d). The black discs show the days 1, 2, 8 and 17 on which there were the largest
differences between treatments.
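The auto-regressive and power–time correlation structures described in the text can be written out as matrices. The following Python sketch is ours: the function names, the parameter values (0.6, 0.5, 0.3, 0.9) and the five sampling days are assumptions chosen for illustration, and the absolute time difference is used so that the power matrix is symmetric. The second-order matrix follows the recursion of Equation (5), which for equally spaced times depends only on the lag between occasions.

```python
import numpy as np

def ar2_correlation(q, phi1, phi2):
    """Correlation matrix for q equally spaced times under a second-order auto-regression."""
    rho = np.empty(q)                      # rho[k] is the correlation at lag k
    rho[0] = 1.0
    if q > 1:
        rho[1] = phi1 / (1.0 - phi2)       # second line of Equation (5)
    for lag in range(2, q):
        rho[lag] = phi1 * rho[lag - 1] + phi2 * rho[lag - 2]   # third line
    lags = np.abs(np.subtract.outer(np.arange(q), np.arange(q)))
    return rho[lags]

def power_correlation(times, phi):
    """Power-time correlation phi**|t1 - t2| for possibly unequal sampling times."""
    times = np.asarray(times, dtype=float)
    return phi ** np.abs(np.subtract.outer(times, times))

# With phi2 = 0 the second-order structure reduces to first order: phi1**lag.
R_ar1 = ar2_correlation(5, 0.6, 0.0)

# A second-order example with parameters that satisfy the stated constraints.
R_ar2 = ar2_correlation(5, 0.5, 0.3)

# Unequally spaced occasions (hypothetical days, not the experiment's schedule).
R_power = power_correlation([1, 2, 4, 8, 17], 0.9)
```

When the times are equally spaced at unit intervals the power–time matrix coincides with the first-order auto-regressive one.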
because they must be estimated the statistic is only asymptotically distributed as $\chi^2$. In practical terms, the $\chi^2$ values will be reliable if the residual degrees of freedom for the fixed term is large compared with its own degrees of freedom. In a balanced design, as in our example, the number of residual degrees for a fixed (or treatment) term is easy to ascertain – it is simply the number of residual degrees of freedom for the stratum where the term is estimated. Also, if the design is balanced, the Wald statistic divided by its degrees of freedom is distributed as $F_{m,n}$, where m is the number of degrees of freedom of the fixed effect, and n is the number of residual degrees of freedom for the fixed effect. For unbalanced designs the F distribution is only approximate, and it is not possible to derive the exact residual numbers of degrees of freedom.
Kenward & Roger (1997) point out that small samples and missing data are common in experiments with repeated measurements and that appropriate adjustments are desirable. They give a formula that approximates the numbers of degrees of freedom and that appears to work well in practice. The important point to remember, though, is that using $\chi^2$ tends to give significant results rather too frequently, so you need to be careful especially if the value is close to a critical value.
Table 6 presents the results of the analysis for the CO2 data. There the situation seems clear-cut: the effects of all the terms in the model appear as significant. We thus complete the
Table 6 REML variance components analysis using the heterogeneous power model

Estimated variance components for Sample
Component        8.055
Standard error   5.268

Term     Model      Parameter   Estimate   Standard error
                    σ²          1.000      Fixed
Sample   Identity   –           –          –
Time     Power      φ1          0.4323     0.1020
                    d1,1        65.43      27.99
                    d2,2        52.13      22.51
                    d3,3        97.29      40.45
                    d4,4        61.98      26.66
                    d5,5        635.8      267.8
                    d6,6        427.2      175.3
                    d7,7        146.4      59.8
                    d8,8        15.57      7.84
                    d9,9        124.8      51.9
                    d10,10      241.3      98.2
                    d11,11      72.96      30.57
                    d12,12      58.80      24.93
                    d13,13      128.3      53.9
                    d14,14      236.6      97.0
                    d15,15      378.8      157.1
                    d16,16      119.6      50.0
                    d17,17      211.4      86.9
                    d18,18      285.5      119.3
                    d19,19      288.1      119.9
                    d20,20      94.35      39.73
                    d21,21      88.13      37.02
                    d22,22      205.6      85.1
                    d23,23      42.67      18.80
                    d24,24      48.27      21.02

Note: The quantities di,i are the entries in the diagonal of matrix D.
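As an illustration of how estimates such as those in Table 6 define a covariance matrix, the following minimal Python sketch builds a heterogeneous power-model matrix for four occasions. It rests on assumptions that are ours rather than the paper's: that the diagonal entries of D act as variances, so that the covariance between occasions j and k is sqrt(d_jj d_kk) multiplied by the power parameter (written here as phi) raised to |t_j - t_k|; that the observation days are the hypothetical ones listed in the code, since the actual sampling days are not given in this section; and the function name itself.

```python
import numpy as np

def heterogeneous_power_cov(times, d, phi, sigma2=1.0):
    """Return sigma2 * D^(1/2) C D^(1/2), where C[j, k] = phi ** |t_j - t_k|.

    Assumption (not confirmed by the source): the entries of d are the
    variances on the diagonal of D.
    """
    times = np.asarray(times, dtype=float)
    d = np.asarray(d, dtype=float)
    lag = np.abs(times[:, None] - times[None, :])  # |t_j - t_k| for every pair
    corr = phi ** lag                              # power-time correlation matrix
    scale = np.sqrt(d)                             # standard deviations per occasion
    return sigma2 * scale[:, None] * corr * scale[None, :]

# First four diagonal estimates and the power parameter reported in Table 6.
d_hat = [65.43, 52.13, 97.29, 61.98]
phi_hat = 0.4323
days = [1.0, 2.0, 4.0, 8.0]  # hypothetical, unequally spaced observation days

V = heterogeneous_power_cov(days, d_hat, phi_hat)
print(np.round(V, 2))
```

The correlation decays with the time lag rather than with the number of intervening occasions, which is what distinguishes the power–time model from a first-order auto-regressive one when sampling is unequally spaced.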
REML analysis by printing a three-way table of means with standard errors for differences between pairs of means, Table 7. The analysis does not supply degrees of freedom for the standard errors, but a conservative estimate for these is given by the residual degrees of freedom from the Samples stratum in the repeated measurements analysis of variance.
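To make the contrast between the chi-square and F reference distributions concrete, here is a minimal Python sketch of the Wald test described above. The effect estimates, their variance–covariance matrix and the residual degrees of freedom are hypothetical numbers chosen for illustration, not values from the CO2 analysis, and the function name wald_test is our own. The F reference is exact only for balanced designs, as noted in the text.

```python
import numpy as np
from scipy import stats

def wald_test(effects, cov, resid_df):
    """Wald statistic a' V^{-1} a with chi-square and F reference probabilities.

    effects  : estimated effects for one fixed term (length m)
    cov      : their estimated variance-covariance matrix (m x m)
    resid_df : residual degrees of freedom n for the term
    """
    a = np.asarray(effects, dtype=float)
    V = np.asarray(cov, dtype=float)
    m = a.size
    wald = float(a @ np.linalg.solve(V, a))      # a' V^{-1} a without explicit inverse
    p_chi2 = float(stats.chi2.sf(wald, m))       # asymptotic chi-square reference
    p_f = float(stats.f.sf(wald / m, m, resid_df))  # Wald / m referred to F(m, n)
    return wald, p_chi2, p_f

# Hypothetical estimates for a term with two degrees of freedom.
effects = [2.0, -1.5]
cov = [[0.8, 0.2],
       [0.2, 0.6]]
wald, p_chi2, p_f = wald_test(effects, cov, resid_df=12)
print(f"Wald = {wald:.3f}, P(chi-square) = {p_chi2:.4f}, P(F) = {p_f:.4f}")
```

With only a few residual degrees of freedom the probability from the F distribution is noticeably larger than that from chi-square, which is the sense in which chi-square tends to give significant results too frequently.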
              No lime                  Lime
              Ryegrass                 Ryegrass
Time      1      2      3          1      2      3
Random coefficient regression

Random coefficient regression is another type of analysis that can be obtained using REML. The assumption here is that there is an underlying regression model that defines the time response within each treatment group, and that the random variability of the units that receive each treatment generates random variation about the true regression parameters. This variation is also likely to involve correlations between parameters. The method is thus similar to the analysis of polynomial contrasts over time (see 'Summary statistics' above), but the models are fitted in a single analysis which provides a simultaneous (and valid) assessment of all the coefficients. It can also be used when there are missing values.
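The covariance structure that such a model implies for the repeated measurements on one unit can be sketched as follows. This is a minimal illustration and not the REML fit itself: it assumes a straight-line response with a random intercept and a random slope for each unit, and the covariance matrix G of those two coefficients, the residual variance and the times are all hypothetical values. A REML analysis would estimate G and the residual variance from the data.

```python
import numpy as np

def random_coefficient_cov(times, G, sigma2):
    """Marginal covariance Z G Z' + sigma2 * I for one unit.

    Z has a column of ones (random intercept) and a column of times
    (random slope); G is the 2 x 2 covariance matrix of those coefficients.
    """
    times = np.asarray(times, dtype=float)
    Z = np.column_stack([np.ones_like(times), times])
    return Z @ np.asarray(G, dtype=float) @ Z.T + sigma2 * np.eye(times.size)

# Hypothetical values, for illustration only.
times = np.array([0.0, 1.0, 2.0, 4.0])
G = np.array([[4.0, -0.5],     # var(intercept), cov(intercept, slope)
              [-0.5, 0.25]])   # cov(intercept, slope), var(slope)
sigma2 = 1.0                   # residual variance about each unit's own line

V = random_coefficient_cov(times, G, sigma2)
print(np.round(V, 2))
```

The off-diagonal element of G is the correlation between parameters mentioned above; together with the slope variance it makes both the variances and the covariances of the observations change with time, without any explicit auto-regressive term.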
Summary and conclusion

Measurements from fixed installations used for monitoring soil in the field, from lysimeters and from laboratory experiments in soil biology typically derive from repeated use of the same units. They are likely to show correlations that vary according to their closeness in time. As a consequence, the usual methods of statistical analysis for assessing the effect of time and its interactions with any treatments are inappropriate.

Two broad lines of analysis have been developed successfully in recent years to tackle the problems caused by these temporal correlations. One is to analyse the ante-dependence structure in the data. This measures the strength and extent in time of the correlation and takes them into account when comparing the effects of the different treatments over time. The other uses restricted maximum likelihood (REML) in which the correlation structure over time is modelled by some function. It provides Wald statistics for assessing the treatments and their interactions with time.

We have illustrated both forms of analysis using data on the evolution of CO2 from an experiment on the effects of lime and ryegrass incorporated into soil. A strong ante-dependence structure (to ninth order) indicated how inappropriate are the usual tests for the effect of time and its interactions in the experiment, though the analysis did show that the differences in time and treatment effects and their interactions were still significant in the usual sense. Analysing the data by REML produced the Wald statistics, which are distributed approximately as chi-square, as well as tables of means and their standard errors. This showed that despite the strong correlation between times there were significant effects of time and of its interactions with the two main treatments, lime and ryegrass; again, the interactions with time were judged to be significantly different.

Soil scientists should be aware of these techniques when planning experiments that are likely to involve repeated measurements and should use them when analysing the resultant data.

Acknowledgements

We thank Dr E.A. Webster for stimulating us to think through this problem and for the data to illustrate feasible solutions and Professor M.G. Kenward for his helpful comments.

References

Gabriel, K.R. 1961. The model of ante-dependence for data of biological growth. Bulletin de l'Institut International Statistique (Paris), 39, 253–264.
Gabriel, K.R. 1962. Ante-dependence analysis of an ordered set of variables. Annals of Mathematical Statistics, 33, 201–212.
Gilmour, A.R., Thompson, R. & Cullis, B.R. 1995. Average information REML: an efficient algorithm for variance parameter estimation in linear mixed models. Biometrics, 51, 1440–1450.
Greenhouse, S.W. & Geisser, S. 1959. On methods in the analysis of profile data. Psychometrika, 24, 95–112.
Kenward, M.G. 1987. A method for comparing profiles of repeated measurements. Applied Statistics, 36, 296–308.
Kenward, M.G. & Roger, J.H. 1997. Small sample inference for fixed effects estimators from restricted maximum likelihood. Biometrics, 53, 983–997.
Payne, R.W. (ed.) 2000. The Guide to GenStat: Part 2, Statistics. VSN International, Oxford.
Ridley, A.M., White, R.E., Helyar, K.R., Morrison, G.R., Heng, L.K. & Fisher, R. 2001. Nitrate leaching loss under annual and perennial pastures with and without lime on a duplex (texture contrast) soil in humid southeastern Australia. European Journal of Soil Science, 52, 237–252.
Rowell, J.G. & Walters, R.E. 1976. Analysing data with repeated observations on each experimental unit. Journal of Agricultural Science, Cambridge, 87, 423–432.
Verbyla, A.P., Cullis, B.R., Kenward, M.G. & Welham, S.J. 1999. The analysis of designed experiments and longitudinal data using smoothing splines (with discussion). Applied Statistics, 48, 269–311.
Webster, E.A., Chudek, J.A. & Hopkins, D.W. 2000. Carbon transformations during decomposition of different components of plant leaves in soil. Soil Biology and Biochemistry, 32, 301–314.
Winer, B.J. 1971. Statistical Principles in Experimental Design, 2nd edn. McGraw-Hill, New York.
Yates, F. 1937. The Design and Analysis of Factorial Experiments. Technical Communication No 35, Imperial Bureau of Soil Science, Harpenden.