Statistical Modeling Implications For Coffee Progenies Selection
DOI 10.1007/s10681-015-1561-6
Received: 14 January 2015 / Accepted: 18 September 2015 / Published online: 5 October 2015
© Springer Science+Business Media Dordrecht 2015
V. T. Andrade (corresponding author), F. M. A. Gonçalves, J. A. R. Nunes
Departamento de Biologia, Universidade Federal de Lavras - UFLA, Campus Universitário, P. O. Box 3037, Lavras, MG CEP 37200-000, Brazil
e-mail: [email protected]

C. E. Botelho
Empresa de Pesquisa Agropecuária de Minas Gerais-EPAMIG, Unidade Regional do Sul de Minas, 176, Lavras, MG CEP 37200-000, Brazil

Abstract A reliable phenotyping and a thorough investigation of the experimental data via accurate statistical methods are key requirements for attaining selection gain. Coffee bean yield data are provided from annual harvests. The data analysis is generally performed based on total phenotypic data of entire period or in biennia using a split-plot-in-time model. An essential aspect of these data is the covariance associated with some random factors of the statistical model. The aim of this work was to evaluate different covariance matrix structures in coffee progenies bean yield modeling and their implications for prediction accuracy of progenies genotypic values and selection under different harvest data grouping strategies. We evaluated 21 S0:1 Coffea arabica L. progenies during eight harvests. The analyses were conducted considering all the harvests (annual or biennia) and focusing only on the high yield or low yield years. In each case, we modeled the residual covariance matrix (R) and the genetic covariance matrix over harvests (G). We noticed that some models are more suitable in explaining the coffee yield pattern. There were alterations in parameter estimates, prediction error variance of genotypic values, rankings and coincidence index in selecting the best progenies. The model involving annual harvests gave more information regarding the coffee progenies yield behavior in comparison to biennia.

Keywords Coffea arabica, Repeated measures, Covariance structure, Bean yield, Plant breeding

Introduction

The main target trait in coffee plant improvement is the bean yield. However, this trait has a quantitative nature, a complex genetic architecture, and it is highly influenced by the environment. Thus, the identification of superior coffee genotypes regarding breeding value is a challenge. The aspects aforementioned affect selection and lead to small or null genetic gain (Bernardo 2010) and they are particularly problematic for coffee crops since it is a perennial species, with a long and costly improvement cycle. This poses a high risk at the end of the improvement cycle of having selected a genotype that does not outperform the current cultivars. In Brazil, given the gains already obtained with the ‘Mundo Novo’ coffee (Coffea arabica L.) cultivars over half a century ago, this task becomes even more difficult (Carvalho et al. 1952).
178 Euphytica (2016) 207:177–189
The successful selection of superior genotypes requires reliable phenotyping from adequate experimentation and, additionally, the use of accurate statistical methods to analyze the phenotypic data, taking into account possible additional information such as pedigree or molecular markers, and spatial trends among plots (Smith et al. 2007; Piepho and Eckl 2014).

The genetic improvement of perennial plants requires special attention given that several measurements are taken on the same plot (Cilas et al. 2011; Liu et al. 2012). This evaluation process produces longitudinal data, whose main features are the serial correlation among these measurements and heterogeneity of variance (Wolfinger 1996; Piepho and Eckl 2014). These aspects increase the complexity of the statistical models and tend to distort the selection of superior individuals when the statistical models are not consistent with the biological nature of the data and/or the method is not sufficiently robust to estimate the parameters (Piepho et al. 2004; Smith et al. 2005; Hu and Spilke 2011).

Among the methods used in the analysis of longitudinal data, the repeated measures analysis of variance, the multivariate analysis, and the modeling of mean and covariance structures of the factors in the statistical model are the most common ones (Everitt 1999; Keselman et al. 2001; Piepho et al. 2004). The repeated measures analysis of variance makes more restrictive assumptions regarding the data covariance structure and this may affect the inferences in many situations. Multivariate analysis, although a reasonable choice, may become less robust with the excess of parameters (Littell et al. 2000; Knafl et al. 2012). The last method is commonly referred to as mixed modeling and has been considered the ideal approach in social, economic and biological sciences (Cheng et al. 2010). Given the characteristics of coffee plant genetic improvement, this technique seems very promising.

Improvement of coffee involves the evaluation of the genotypes' bean yield over the course of four harvests in each generation. The simplest approach would be to calculate the total yield per plot over the harvests and fit the same model that would be fitted for a single measure. However, this method could lead to information loss, change the genotype ranking and possibly increase the prediction error variance of the breeding values. Furthermore, this approach does not seem suitable for the coffee crop because of biannuality, the annual alternation between high and low yield (Medina Filho et al. 2007). This phenomenon probably leads to heterogeneity of variance and even determines the temporal correlation pattern. Thus, statistical models that assume constant correlations between harvests and constant variances do not seem realistic (Pinto et al. 2013; Piepho and Eckl 2014). These aspects can affect the prediction of the progenies' breeding values (White and Hodge 1988). Disregarding specific statistical models can result in misestimated parameter values and a consequently altered ranking of the evaluated progenies. Some works have shown this change in parameter estimates as a function of the covariance structure used, and a different plant selection (Apiolaza et al. 2000; Smith et al. 2007; Cilas et al. 2011; Mariguele et al. 2011; Piepho and Eckl 2014). However, its impact on the selective process has not been quantified yet.

The aim of this work was to evaluate different covariance matrix structures in bean yield modeling of coffee progenies and their implications for the prediction accuracy of progenies' genotypic values and for selection under different harvest data grouping strategies.

Materials and methods

The experiment was conducted by the Minas Gerais state coffee breeding program, Brazil, coordinated by ‘Empresa de Pesquisa Agropecuária de Minas Gerais’ (EPAMIG), at the Machado Experimental farm (21°40′ S, 45°55′ W). The experimental area has Dystroferric Red Latosol soil, with undulating relief, at an altitude of 881 m, with average annual rainfall of 1670 mm and average temperature of 21 °C.

We analyzed 21 endogamic S0:1 progenies originated from crossing the Coffea arabica L. cultivars ‘Mundo Novo’ with ‘Mundo Novo’ and ‘Mundo Novo’ with ‘Bourbon Vermelho’, developed by the ‘Instituto Agronômico de Campinas (IAC)’ coffee breeding program. The experiment was set in January 1988 and the bean yield (kg) was measured from eight annual harvests. The experimental design was a randomized complete block design (RCBD) with three replications. Each plot consisted of a line with eight plants spaced 3.0 × 1.5 m apart.
datasets. The first dataset involved all harvests; the second encompassed all harvests clustered in biennia; the third involved only the high yield harvests (Harvests 2, 4, 6 and 8); and the fourth involved only the low yield harvests (Harvests 1, 3, 5 and 7). The joint analysis of each dataset was performed by the mixed model approach according to Smith et al. (2007):

$y = Xm + Zg + e$

where $y$ is the data vector; $m$ is the vector of fixed effects comprising the main effects for blocks and harvests or biennia and their interactions added to the constant; $g$ is the vector of progeny effects for individual harvests or biennia (ordered as progenies within harvests/biennia), where $g \sim NMV(0, G)$; $e$ is the vector of residuals, where $e \sim N(0, R)$; $X$ and $Z$ are the design matrices related to fixed and random effects. The variance structure of R and G follows below. $R = R_h \otimes I_n$, where $R_h$ is the residual covariance matrix that accommodates temporal correlation (between harvests) and possibly heterogeneous variance across harvests for each plot.

genetic standard deviation in harvest h; $S_g$ is the average of genetic standard deviations of all harvests; $S_f$ is the average of phenotypic standard deviations of all harvests; $s_{fh}$ is the phenotypic standard deviation in harvest h and $y$ is the vector of original phenotypic data. This correction is an attempt to consider the genetic and residual heterogeneity. It is applied to the plot data, penalizing harvests that generate low precision estimates and benefitting harvests with high heritability. This analysis is referred to as CSh.

To evaluate the impact of the alternative models on selection, we compared the estimates of some parameters and the resulting progeny ranking based on the E-BLUPs. The E-BLUP of each progeny over the harvests was estimated according to the following index (Smith et al. 2007):

$\text{E-BLUP}_i = \sum_h w_h \tilde{g}_{ih}$

where $\tilde{g}_{ih}$ is the E-BLUP of progeny i within harvest h and $w_h$ is the weight attributed to each harvest. We used $w_h = 1/m$, where m is the number of harvests in each dataset. Furthermore, we also estimated the following parameters: progeny-mean heritability (ĥ²) and the selection gain (SG). The

due to chance. This quantity was calculated by 10 % × B. Thus, one common progeny is expected due to chance.
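The index and the coincidence comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the input is a list of per-harvest E-BLUP lists (one value per progeny), and the coincidence index is written in the commonly used form CI = 100 × (A − C)/(B − C), with A the number of progenies common to two selections, B the number selected and C the number expected to coincide by chance. The surviving text does not spell out that formula, so the chance term is left as an argument.

```python
def eblup_index(eblups_by_harvest, weights=None):
    """Combine per-harvest E-BLUPs into one value per progeny: sum_h w_h * g_ih.

    eblups_by_harvest[h][i] is the E-BLUP of progeny i in harvest h.
    With weights=None, every harvest gets w_h = 1/m (m = number of harvests).
    """
    m = len(eblups_by_harvest)
    if weights is None:
        weights = [1.0 / m] * m
    n_progenies = len(eblups_by_harvest[0])
    return [
        sum(w * harvest[i] for w, harvest in zip(weights, eblups_by_harvest))
        for i in range(n_progenies)
    ]


def top_progenies(index, k):
    """Return the set of positions of the k progenies with the largest index."""
    order = sorted(range(len(index)), key=lambda i: index[i], reverse=True)
    return set(order[:k])


def coincidence_index(n_common, n_selected, n_chance):
    """Coincidence index (%) in the assumed form 100 * (A - C) / (B - C)."""
    return 100.0 * (n_common - n_chance) / (n_selected - n_chance)
```

For two models, one would compute `top_progenies` on each model's index, count the intersection, and pass that count to `coincidence_index` together with the number selected and the chosen chance term.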
obtained from the E-BLUPs of the progenies. We observe strong genetic variance heterogeneity for the annual harvests (Fig. 1a). In addition, we notice a pronounced alternation of genetic variance amplitudes, particularly for the annual harvests 1, 3, 5, and 7, with the possibility of small or even null selective gain. This observation was corroborated by non-significant p values associated with the progenies for these harvests. This phenomenon could be related to the bean yield mean in each harvest.

Analyzing the mean yield of each harvest, we notice that the alternating pattern between high and low yield starts at the second harvest. This behavior, relatively frequent in coffee bean yield, is known as biannuality and hinders the joint analysis. The grouping of harvest data according to biennia reduced the genetic variance heterogeneity and produced a more consistent relative behavior of the progenies. In this case, only biennium 1 failed to show genetic variability (p < 0.05).

Regarding ĥ², the estimates varied from 18 % (Harvest 7) to 98 % (Harvest 8), whilst for biennia these values ranged from 32 % (Biennium 1–2) to 87 % (Biennium 7–8). High yield harvests were associated with higher heritability estimates. According to Table 1, differences in the progenies' E-BLUP rankings arise from different harvests. In the annual harvests, higher correlations were observed between the E-BLUPs of the high-average yield harvests (2, 4, 6 and 8). This tendency was also observed between the low-average yield harvests (1, 3, 5 and 7). However, we identified a higher frequency of negative correlations between the rankings of high and low average harvests.

Another important aspect observed in Table 1 is the correlation between the progenies' E-BLUPs from initial and last harvests. We noticed that the positive correlation with the last harvest is relatively high following the 1st harvest. However, only after the 4th harvest does the progeny ranking become very similar to the ranking of the 8th harvest, particularly for high-average harvests. Regarding biennia, the grouping reduced the magnitude of the alteration in the progenies ranking. Similarly to the results for the annual harvests, the high correlation with the last harvest was observed following the 1st harvest, indicating that selection could proceed after the second harvest, particularly in initial stages of a plant breeding program.

The progeny genetic value estimates and the correlations, regarding annual or biannual harvests, help to understand the behavior of the progenies and to better define the appropriate statistical models concerning the R and G matrices. Usually, the harvest data are analyzed jointly, assuming a CS structure for R and VC for G, but this assumption may not be appropriate. Thus, we analyzed alternative structures to model the R and G matrices, in an attempt to deal with the data in a more realistic way and to obtain precise estimates.

Covariance structures

We tested alternative covariance structures for matrices Rh and Gh. BIC indicated the full Toeplitz correlation structure (TOEP) for R considering the annual harvests. This matrix structure considers the existence of specific error correlations for each interval between harvests. Considering biennia, the chosen structure was CSH, which assumes equal correlations between biennia while accounting for the heterogeneity of residual variances. Regarding the high yield years, the chosen model is the autoregressive structure with heterogeneous variances (ARH), indicating different correlations among the considered
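As a rough illustration of what the named structures look like, the sketch below builds small CS, CSH, ARH and TOEP matrices and expands a harvest-level matrix as R = Rh ⊗ In. It is a plain-Python sketch, not the software used in the study: the function names are hypothetical, the parameter values in any call are made up, ARH is assumed to be first order, and TOEP is assumed to have one common variance and one correlation per lag.

```python
def cs(variance, rho, t):
    """Compound symmetry: one variance, one correlation for every pair."""
    return [[variance if i == j else rho * variance for j in range(t)]
            for i in range(t)]


def csh(sds, rho):
    """Heterogeneous compound symmetry: sd_i * sd_j * rho off the diagonal."""
    t = len(sds)
    return [[sds[i] * sds[j] * (1.0 if i == j else rho) for j in range(t)]
            for i in range(t)]


def arh(sds, rho):
    """First-order autoregressive with heterogeneous variances:
    the correlation decays as rho ** |i - j|."""
    t = len(sds)
    return [[sds[i] * sds[j] * rho ** abs(i - j) for j in range(t)]
            for i in range(t)]


def toep(variance, lag_rhos):
    """Toeplitz: a separate correlation for each lag (lag_rhos[k-1] is lag k)."""
    t = len(lag_rhos) + 1
    return [[variance * (1.0 if i == j else lag_rhos[abs(i - j) - 1])
             for j in range(t)] for i in range(t)]


def kron_identity(rh, n):
    """Expand Rh (harvest by harvest) to R = Rh (x) I_n for n plots.

    Rows and columns are ordered as plots within harvests, so the same plot
    in two harvests shares the covariance rh[a][b]; different plots share 0.
    """
    t = len(rh)
    size = t * n
    r = [[0.0] * size for _ in range(size)]
    for a in range(t):
        for b in range(t):
            for k in range(n):
                r[a * n + k][b * n + k] = rh[a][b]
    return r
```

A negative lag-1 correlation combined with a positive lag-2 correlation in `toep` is one way an alternating high and low yield pattern could show up in such a matrix, which a single-parameter CS or first-order autoregressive structure cannot represent.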
Table 4 Spearman correlations between the empirical best linear unbiased predictions (E-BLUPi), above the diagonal, and Hamblin and Zimmerman coincidence index (%), below the diagonal, for the five best progenies selected based on the E-BLUPi for different models in annual harvests
CS R^a    CSh^b    TOEP R^c    TOEP R-TOEP G^d

Table 5 Spearman correlations between the empirical best linear unbiased predictions (E-BLUPi), above the diagonal, and Hamblin and Zimmerman coincidence index (%), below the diagonal, for the five best progenies selected based on the E-BLUPi for different models in biennia
CS R^a    CSh^b    CSH R^c    CSH R-CSH G^d    CSH R-ARH G^e
data supports the correlation results. A CI equal to 100 % was obtained between annual harvests and high yield years, and equal to 48 % when selection by biennia is compared with annual harvests and high mean years. A null CI was obtained on all occasions involving the joint analysis of the low yield years (Table 8).

Table 6 Spearman correlation between the empirical best linear unbiased predictions (E-BLUPi) (above diagonal) and Hamblin and Zimmerman coincidence index (%) (below diagonal), for the five best progenies selected based on the E-BLUPi for different models in high yield harvests
                CS R^a    CSh^b     ARH R^c   ARH R-ANTE G^d
CS R            -         0.996*    0.906*    0.951*
CSh             100       -         0.903*    0.961*
ARH R           74        74        -         0.864*
ARH R-ANTE G    48        48        48        -
a Compound symmetry for the residual covariance matrix
b Compound symmetry with transformation for the variance heterogeneity
c First order autoregressive with heterogeneous variance for the residual covariance matrix
d First order autoregressive with heterogeneous variance for the residual covariance matrix and structured antedependency for the genetic covariance matrix over harvests
* Significant at the 0.05 probability level

Table 7 Spearman correlation between the empirical best linear unbiased predictions (E-BLUPi) (above diagonal) and Hamblin and Zimmerman coincidence index (%) (below diagonal), for the five best progenies selected based on the E-BLUPi for different models in low yield harvests
           CS R^a    CSh^b     ANTE R^c
CS R       -         0.989*    0.744*
CSh        74        -         0.690*
ANTE R     74        48        -
a Compound symmetry for the residual covariance matrix
b Compound symmetry with transformation for the variance heterogeneity
c Structured antedependency for the residual covariance matrix
* Significant at the 0.05 probability level

Table 8 Spearman correlation of the empirical best linear unbiased predictions (E-BLUPi) between the harvest grouping strategies (above diagonal) and coincidence index (%) in the five best progenies selection (below diagonal)
Annual    Biennium    High yield harvests    Low yield harvests

Discussion

Our study was motivated by the complexity of modeling coffee bean yield, due mainly to the alternation between high and low phenotypic means through harvests. As already mentioned, the simplest way of dealing with this kind of dataset is to fit an RCBD model to the total bean yield over harvests. This approach produced the same progeny rank and selection gain as the proposed models, so it could be used in a routine selection process. However, this analysis estimated an E-BLUP PEV six times greater than that of the best fitted models in annual harvests. Furthermore, progeny selection is based on adaptability and stability over the harvests. This sort of information can be explored by data analysis involving individual harvests.

An adequate genetic variability in the improvement population is a necessary condition to obtain genetic gain. The ĥ² found for the coffee plant was in accordance with the values found in the literature. Carvalho et al. (2012) reported an average ĥ² of 0.51 for the progenies, estimated from 123 works.

Given the coffee biannuality, a commonly adopted strategy for selection is to group years as biennia (Botelho et al. 2010). This is used to circumvent the lack of feasible procedures to analyze data with such complexity. Using this procedure, we minimize the heterogeneity of the genetic variances over the biennia and obtain higher estimates of heritability and E-BLUP rank correlations between biennia (Table 1). This is probably associated with the attenuation of the progeny-harvest interaction effect, since the simple part of the interaction is associated with the heterogeneity of genetic variances and the complex part is related to imperfect genetic correlations (Falconer and Mackay 1996). This is also shown by the ĥ² estimates and by the correlations between individual harvests, which highlight the progeny-harvest interaction and biannuality, phenomena usually described in coffee (Botelho et al. 2010; Cilas et al. 2011).
The Spearman correlations between the E-BLUPs relative to individual harvests and biennia (Table 1) suggest another important point: selection on initial harvests is feasible for coffee, in agreement with Mistro et al. (2007) and Oliveira et al. (2011). It is currently widely accepted in the coffee breeding segment that selection can be reliably done on the fourth harvest or the first major harvest (Medina Filho et al. 2007). This is reflected by the E-BLUP correlations relative to the high yield years, which rose at the fourth harvest. However, in initial stages of a plant breeding program, selection could be practiced after the second harvest. This was supported by the high correlation after the second biennium and may improve efficiency.

Coffee yield data have commonly been modeled with split-plot-in-time, an approach that assumes homogeneity of variances within the harvests and equal pair-wise correlations between harvests (Hu and Spilke 2011). However, the efficiency of this model may be reduced due to the non-sphericity of the error covariance matrix R. According to Piepho et al. (2004), sphericity is rarely achieved in perennial plants regarding bean yield and this may have important biological implications. Thus, it is important to explore alternative approaches to analyze these data.

Aiming at reliable genetic predictions, we used different structures to represent the covariance matrices R and G. It is clear that the coffee yield data need a more flexible covariance structure owing to their peculiarities. The TOEP structure for Rh and Gh, indicated for the annual harvests dataset, is like an autoregressive structure with order equal to the matrix dimension. Therefore, it differs from a first order autoregressive structure in the decay rate of these correlations (Littell et al. 2006). This seems to be precisely the case of the coffee plant bean yield biannuality, because correlations between harvests are mainly due to the preceding phenotypic mean. This structure also seems quite suitable for modeling the yield behavior of the progenies, since the covariance parameters estimated for R and G clearly represented the phenotypic pattern.

When grouping the data as biennia, we obtained the CSH structure. This structure has a different value for each diagonal element (variance) and uses the product of the standard deviations of the two harvests considered, multiplied by a constant correlation parameter, to estimate the values outside the diagonal (covariance), being less restrictive than CS. Some works with Coffea canephora showed CSH (Cecon et al. 2008), and CS and ARH (Cilas et al. 2011), as the most appropriate. Another observation is that the alteration in the covariance pattern of the character over time may lead to loss of information regarding the behavior of the progenies, and thus the grouping in biennia may not be the best choice. When only the high average years were included in the analysis, the ARH structure for the error and the ANTE structure for the Gh matrix were the right choices. These structures assume that the correlation between harvests diminishes as the inter-harvest interval increases.

To attenuate the variance and covariance heterogeneity, Resende et al. (2007) suggest that the data be multiplied by the square root of ĥ² divided by the mean value of the square roots of the heritabilities. This approach was referred to as CSh in this work. Thus, after this transformation, the CS structure could be effective and similar to the multivariate analysis, considered ideal. However, we cannot judge the superiority of its fit by likelihood-based model selection criteria.

Different covariance structures can result in different outcomes in the selection process. Alterations in parameter estimation and in selection have been reported in several works (Apiolaza et al. 2000; Mariguele et al. 2011; Hu and Spilke 2011; Piepho and Eckl 2014). Nevertheless, no quantification of the impact on selection accuracy was found.

The breeder's focus is the SG since it is the basis of the improvement strategy. We noticed that different models resulted in different ĥ², PEV and SG estimates (Tables 2, 3). The best fitted models did not always obtain the greatest estimates. However, we seek to interpret the biological reality contained in the data. Thus, even if the best fitted model produces reduced heritability, larger E-BLUP PEV and lower SG, it might be used owing to its reliability. Another point to discuss from Tables 2 and 3 is the selection gain obtained with the different datasets. Except for the low yield years, the SG estimates as a percentage of the original mean did not change significantly between the models, showing only a slight advantage for the high yield years.

Although SG estimation is important, what seems to matter in selection is the actual superiority of the progenies regarding their genetic value for bean yield.
Therefore, another major point in plant breeding is the ranking of the genetically superior progenies. The different models and data grouping strategies generated different rankings of the progenies' E-BLUPi and, consequently, different selected progenies (Tables 4, 5, 6, 7). This stresses the relevance of the models to coffee plant genetic improvement.

In the annual harvests, we observed an interesting aspect of the Rh and Gh modeling. When both matrices were modeled by TOEP, we observed an exact match with the CSh selection, which resulted in the highest SG estimate. This supports the recommendation to use compound symmetry with correction for variance heterogeneities in coffee bean yield improvement (Resende et al. 2007). On the other hand, when modeling only matrix Rh with TOEP, the match occurs for CS and the heterogeneity correction is no longer adequate. In addition, the SG was 11 % lower when modeling Rh with TOEP relative to the modeling of Gh.

As previously argued, the grouping of coffee yield data as biennia is a consensus among researchers. In this case, we noticed that the alternative models indicated in place of CS promoted greater alterations in the progenies ranking (Table 5), although there was no important difference regarding SG (Table 2). This could be interpreted as a result of the ranking alteration occurring mainly below the truncation point of the selection, with the selected progenies being a match. However, the CI estimates contradict this hypothesis, since they were also low. It is believed, then, that the ranking alteration occurred among progenies with close performance. Once again, the modeling of Gh, in addition to Rh, allowed the recovery of the information contained in this factor. If only the matrix Rh were modeled, a high concordance between the obtained selection and CS and CSh would be observed, leading to the wrong conclusion that alternative models are not effective. The similarity between CSh and CSH was reported by Resende (2007) and is confirmed by our results.

Considering the years of high yield, an interesting pattern appears. The use of a covariance structure in Rh instead of CS promoted a sharp drop in the SG estimate (Table 3). We expected that this would lead to a greater alteration in the ranking obtained by the two models, but it did not occur (Table 6). Probably, the difference resides mostly in the E-BLUPi estimates. Thus, the progenies were equally affected by the structure differentiation. We also noticed that the SG obtained by the ARH R-ANTE G approach was virtually equal to the one obtained with CS, considered the ideal structure. However, the CI between these two models was only 48 % while the correlation was 0.951. Therefore, the CIs should be evaluated along with the correlation when estimating the differences between rankings. The doubt that existed with regard to the use of CS or CSh was resolved by the CI, because it was 100 %. Given the scenario described so far, the analyses concerning the low yield years were the most surprising ones. In this case, the indicated model gave us the information contained in the data, preventing the false acceptance of the hypothesis of existence of progeny variation and harvest interaction, and showing that there is no selection gain for this grouping. Regarding the E-BLUPi, these behaviors were not significant at all, since the selection done with ANTE R differed in only one progeny in comparison to CS R (CI 74 %).

We notice, then, the impacts of the different structures of covariance matrices in coffee breeding. The coincidence indices between the models in the progenies selection reached values that affect their performance directly. If the selected progenies are not actually the superior genotypes, the selection gain can be reduced or even nullified. This is aggravated when the species has a long improvement cycle, such as coffee. The risk of obtaining a cultivar equivalent to the previous ones and wasting all the time and resources invested in the genetic improvement is real.

The ranking alteration following more appropriate models offers clues about the structure hidden in the data, keeping in mind that we do not search for the most complex model but for an informative and parsimonious one. We must stress that a model should always be verified in regard to adequacy and results (Hu and Spilke 2011). Valuable insights can emerge from the data and guide the maximization of the efficiency of the selective process in plant breeding (Resende et al. 2013).

In a practical sense, the more significant results possibly concern the data grouping. The annual harvests and the high yield years presented a full selection match (Table 8). This is relevant in a coffee breeding program routine as it saves resources: the experimental harvest could be done only in the high yield years and the trials could be commercially harvested in the low average years. In a breeding
program, there are several experiments conducted concurrently, which requires a considerable workforce, which is mostly scarce. In addition to that, if the coffee harvesting does not obey the experimental design, the coffee price per liter can be reduced and directly impact the amount of money spent per selection gain. Another point to support this recommendation is that the percentage of gain in relation to the mean before selection was higher in the high yield harvests. Oliveira et al. (2011) found similar results and stated that selection must be conducted after the first major harvest and always in high average years (Medina Filho et al. 2007). In this work, it would occur after the second harvest, which seems plausible. The mentioned authors also found high correlation between the rankings for the annual harvests and the high yield years' average, and low coincidence when the low yield harvests were included in the analysis.

Since some available models support biannuality and variance heterogeneity, the use of annual harvests is feasible. Thus, one should avoid the data grouping without concern. Although the selective gains based on the annual harvests and biennia are equal (Table 2), there was only 48 % coincidence between the selected progenies (Table 8). Therefore, these results reinforce the question of whether the biannuality should be treated by grouping the data as biennia, contrary to the common practice (Bonomo et al. 2004; Carvalho et al. 2006; Botelho et al. 2010). It is believed that the analysis conducted on annual harvests can prevent the loss of information regarding the character, since the biennia alter the data variance and covariance pattern.

Conclusion

Some random factor covariance structures are more appropriate than compound symmetry to model bean yield data in coffee breeding. Different structures alter the accuracy of the parameter estimates and the ranking of the progenies regarding their breeding values.

The analysis conducted on annual harvests can avoid the loss of information, since the biennia alter the variance and covariance pattern of the data.

Acknowledgments We are grateful to the anonymous reviewer for helpful comments; to Professor Julio Sílvio de Sousa Bueno Filho (Universidade Federal de Lavras, UFLA, Departamento de Ciências Exatas) for the heritability estimator suggestion; and to Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and Consórcio Pesquisa Café.

References

Apiolaza LA, Gilmour AR, Garrick DJ (2000) Variance modelling of longitudinal height data from a Pinus radiata progeny test. Can J For Res 30:645–654. doi:10.1139/x99-246
Bernardo R (2010) Breeding for quantitative traits in plants. Stemma Press, Woodbury
Bonomo P, Cruz CD, Viana JMS, Pereira AA, Oliveira VR, Carneiro PCS (2004) Evaluation of coffee progenies from crosses of Catuai Vermelho and Catuai Amarelo with "Hibrido de Timor" descents. (In Portuguese, with English abstract.). Bragantia 63:207–219. doi:10.1590/S0006-87052004000200006
Botelho CE, Rezende JC, Carvalho GR, Carvalho AM, Andrade VT, Barbosa CR (2010) Adaptability and phenotype stability of Arabica coffee cultivars in Minas Gerais, Brazil. (In Portuguese, with English abstract.). Pesqui Agropecu Bras 45:1404–1411. doi:10.1590/S0100-204X2010001200010
Carvalho A, Krug CA, Mendes JET, Antunes Filho H, Morais H, Aloisi Sobrinho J et al (1952) Breeding of coffee plant IV: Mundo Novo coffee cultivar. (In Portuguese, with English abstract.). Bragantia 12:97–129. doi:10.1590/S0006-87051952000200001
Carvalho GR, Mendes ANG, Bartholo GF, Cereda GJ (2006) Mundo novo coffee (Coffea arabica L.) cultivar progenies evaluation. (In Portuguese, with English abstract.). Cienc Agrotec 30:853–860. doi:10.1590/S1413-70542006000500005
Carvalho SP, Custódio TN, Baliza DP, Rezende TT (2012) Meta-analysis for heritability estimates of vegetative and reproductive traits of Coffea arabica L. (In Portuguese, with English abstract.). Semin-Cienc Agrar 33:1291–1298. doi:10.5433/1679-0359.2012v33n4p1291
Cecon PR, Silva FF, Ferreira A, Ferrão RG, Carneiro APS, Detmann E et al (2008) Repeated measure analysis in the clonal evaluation in ‘Conilon’ coffee. (In Portuguese, with English abstract.). Pesqui Agropecu Bras 43:1171–1176. doi:10.1590/S0100-204X2008000900011
Cheng J, Edwards LJ, Maldonado-Molina MM, Komro KA, Muller KE (2010) Real longitudinal data analysis for real people: building a good enough mixed model. Stat Med 29:504–520. doi:10.1002/sim.3775
Cilas C, Montagnon C, Bar-Hen A (2011) Yield stability in clones of Coffea canephora in the short and medium term: longitudinal data analyses and measures of stability over time. Tree Genet Genome 7:421–429. doi:10.1007/s11295-010-0344-4
Everitt BS (1999) Analysis of longitudinal data: beyond MANOVA. Br J Psychiatry 172:7–10. doi:10.1192/bjp.172.1.7
Falconer DS, Mackay TFC (1996) Introduction to quantitative genetics. Longmans Green, Harlow, Essex
Hamblin J, Zimmerman MJO (1986) Breeding common bean for yield mixtures. Plant Breed Rev 4:245–272. doi:10.1002/9781118061015.ch8
Hu X, Spilke J (2011) Variance–covariance structure and its influence on variety assessment in regional crop trials. Field Crop Res 120:1–8. doi:10.1016/j.fcr.2010.09.015
Keselman HJ, Algina J, Kowalchuk RK (2001) The analysis of repeated measures design: a review. Br J Math Stat Psychol 54:1–20. doi:10.1348/000711001159357
Knafl GJ, Beeber L, Schwartz TA (2012) A strategy for selecting among alternative models for continuous longitudinal data. Res Nurs Health 35:647–658. doi:10.1002/nur.21508
Littell RC, Pendergast J, Ranjini N (2000) Modeling covariance structure in the analysis of repeated measures data. Stat Med 19:1793–1819. doi:10.1002/1097-0258(20000715)19:13<1793::AID-SIM482>3.0.CO;2-Q
Littell RC, Milliken GA, Stroup WW, Wolfinger RD, Schabenberger O (2006) SAS for mixed models. SAS Institute, Cary
Liu S, Rovine MJ, Molenaar CM (2012) Selecting a linear mixed model for longitudinal data: repeated measures analysis of variance, covariance pattern model, and growth curve approaches. Psychol Methods 17:15–30. doi:10.1037/a0026971
Mariguele KH, Resende MDV, Viana JMS, Silva FF, Silva PSL, Knop FC (2011) Methods of longitudinal data analysis for the genetic improvement of sugar apple. (In Portuguese, with English abstract.). Pesqui Agropecu Bras 46:1657–1664. doi:10.1590/S0100-204X2011001200011
Medina Filho HP, Bordignon R, Guerreiro Filho O, Maluf MP, Fazuoli LC (2007) Breeding of Arabica coffee at IAC, Brazil: objectives, problems and prospects. Acta Hortic 745:393–408. doi:10.17660/ActaHortic.2007.745.25
Mistro JC, Fazuoli LC, Guerreiro Filho O, Silvarolla MB, Toma-Braghini M (2007) Determination of number of years in Arabica coffee progenies selection through repeatability. Crop Breed Appl Biotechnol 8:79–84. doi:10.12702/1984-7033.v08n01a11
Oliveira ACB, Pereira AA, Silva FL, Rezende JC, Botelho CE, Carvalho GR (2011) Prediction of genetic gains from selection in Arabica coffee progenies. Crop Breed Appl Biotechnol 11:106–113. doi:10.1590/S1984-70332011000200002
Piepho HP, Eckl T (2014) Analysis of series of variety trials with perennial crops. Grass Forage Sci 69:431–440. doi:10.1111/gfs.12054
Piepho HP, Mohring J (2007) Computing heritability and selection response from unbalanced plant breeding trials. Genetics 177:1881–1888. doi:10.1534/genetics.107.074229
Piepho HP, Buchse A, Richter C (2004) A mixed modelling approach for randomized experiments with repeated measures. J Agron Crop Sci 190:230–247. doi:10.1111/j.1439-037X.2004.00097.x
Pinto LRM, Sanches CL, Dias CT, Loguercio LL (2013) Advantages of multivariate analysis of profiles for studies with temporal variation of treatment effects in plants. Int J Plant Sci 174:85–96. doi:10.1086/668218
Resende MDV (2007) Matemática e estatística na análise de experimentos e no melhoramento genético (Mathematics and statistics in the experiment analysis and genetic improvement; in Portuguese.) Embrapa Florestas, Colombo
Resende RMS, Resende MDV, do Valle CB, Jank L, Torres Júnior RAA, Cançado LJ (2007) Selection efficiency in Brachiaria hybrids using a posteriori blocking. Crop Breed Appl Biotechnol 7:296–303. doi:10.12702/1984-7033.v07n03a09
Resende RMS, Casler MD, Resende MDV (2013) Selection methods in forage breeding: a quantitative appraisal. Crop Sci 53:1925–1936. doi:10.2135/cropsci2013.03.0143
SAS Institute (2009) User's guide: statistics. SAS Institute, Cary
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464. doi:10.2307/2958889
Smith AB, Cullis BR, Thompson R (2005) The analysis of crop cultivar breeding and evaluation trials: an overview of current mixed model approaches. J Agric Sci 143:449–462. doi:10.1017/S0021859605005587
Smith AB, Stringer JK, Wei X, Cullis BR (2007) Varietal selection for perennial crops where data relate to multiple harvests from a series of field trials. Euphytica 157:253–266. doi:10.1007/s10681-007-9418-2
White TL, Hodge GR (1988) Best linear prediction of breeding values in a forest tree improvement program. Theor Appl Genet 76:719–727. doi:10.1007/BF00303518
Wolfinger RD (1996) Heterogeneous variance: covariance structures for repeated measures. J Agric Biol Environ Stat 1:205–230