Multivariate Assumptions and Effect of Model Parameters in Path Analysis in Oat Crop
https://doi.org/10.1071/CP23135
ABSTRACT
Context. Path analysis (PA) is a widely used multivariate statistical technique. When performing PA, the effects of the parameters of the mathematical model relating to the experimental design are disregarded, working only with the average effects of the treatments. Aims. We aimed to analyse the implications of statistical assumptions, and of removing mathematical model parameters, on the PA results in oat. Methods. A field study was conducted in southern Brazil in five crop years. The experimental design employed was a two-factor 22 × 5 randomised complete block design, characterised by 22 cultivars and five fungicide applications, with three repetitions. Six explanatory variables were measured (panicle length, panicle dry mass, panicle spikelet number, panicle grain number, panicle grain dry mass, and harvest index), along with the primary variable, yield. Initially, normality and multicollinearity diagnoses were carried out and correlation coefficients were calculated. The PA was performed in three ways: traditional, with measures to address multicollinearity (ridge), and traditional with eliminating variables. Key results and conclusions. The occurrence of multicollinearity resulted in obtaining path coefficients without biological application. Removing the model’s parameters modifies the path coefficients, with average changes of 10.5% and 13.3% in the direction, and 24.7% and 23.0% in the magnitude, of the direct and indirect effects, respectively. Implications. This new approach makes it possible to remove the influences of treatments and experimental design from observations and, consequently, from path coefficients and their interpretations. Therefore, the researcher will reduce possible bias in the coefficient estimates, highlighting the real relationship between the variables, and making the results and interpretations more reliable.
*Correspondence to: Jaqueline Sgarbossa, Department of Plant Science, Federal University of Santa Maria, Santa Maria, Rio Grande do Sul, Brazil
Handling Editor: Davide Cammarano
Received: 12 May 2023
Accepted: 7 February 2024
Published: 1 March 2024
Cite this: Sgarbossa J et al. (2024) Multivariate assumptions and effect of model parameters in path analysis in oat crop. Crop & Pasture Science 75, CP23135. doi:10.1071/CP23135
© 2024 The Author(s) (or their employer(s)). Published by CSIRO Publishing.
Introduction
Oats are one of the main cereals grown in the world. In Brazil, this cereal stands out as the second most significant winter crop, showing an increase of 5.4% in the cultivation area between the 2020/2021 and 2021/2022 seasons (Conab 2021). In this regard, the state of Rio Grande do Sul accounts for about 70% of this production area (Conab 2019), and a considerable fraction of this area is cultivated with black oat (Avena strigosa L.), intended for soil cover and forage production (Moreira et al. 2008; Cassol et al. 2011). Another fraction is occupied with oats (Avena sativa L.), intended for grain production for human food or animal feed, forage, silage, hay, soil cover, and raw material for industry (Caierão et al. 2001; Kaziu et al. 2019).
Numerous studies have been conducted to enhance the oat production system, whether the breeding program is directed to the development of new cultivars of oats or black oats (Caierão et al. 2006; Mantai et al. 2017; Alessi et al. 2018; Meira et al. 2019a, 2019b, 2019c), as well as the improvement of cultivation management (Dornelles et al. 2020; Kraisig et al. 2020; Mantai et al. 2020a, 2020b; Silva et al. 2020). To obtain superior genotypes, it is essential to proceed with efficient selection, which can often be laborious
and time-consuming when performed directly on the trait. This difficulty can be overcome by selecting populations based on their yield components or other adaptive traits that indirectly increase grain yield. However, indirect selection requires a high correlation between the variable under selection and the objective variable (Falconer and Mackay 1997).
Simple correlation studies allow analysing the direction and intensity of the linear association between two variables, but do not indicate cause-and-effect relationships between the variables (Vencovski and Barriga 1992; Cruz 2005; Ferreira 2009; Sari et al. 2018). Path analysis (PA) is more appropriate for situations where the study objective encompasses more than two traits, because it provides information about the interrelationships of the traits. In this analysis, correlation coefficients are broken down into direct and indirect effects, allowing the influence of one variable over the other to be quantified (Cruz et al. 2012). Thus, statistical techniques that identify the cause-and-effect relationships between variables are fundamental because they allow the identification of characteristics that can be used in the indirect selection of plant cultivars (Cargnelutti Filho et al. 2015).
For the responses obtained through PA to be valid, statistical assumptions must be met, such as multivariate normality of the residuals and the absence of multicollinearity (Hair et al. 2009). Toebe and Cargnelutti Filho (2013a), comparing results of traditional PA with non-normal residuals against results obtained after correcting the non-normality through data transformation, found that violating the assumption generates a bias in the results, i.e. the obtained responses had no biological application. Similarly, Couto et al. (2009), working with the zucchini crop, observed the need for data transformation to proceed with PA because not meeting the assumptions distorts the results.
Furthermore, it should be considered that when performing multivariate analysis, such as PA, the parameters of the mathematical model are not considered; only the mean values of each treatment or each repetition are used, without stratifying the effects of factors (in factorial experiments), blocks (in randomised block designs), and interactions (in factorial experiments). These aspects can lead to the occurrence of bias in the results obtained. Accordingly, the following hypotheses were generated: (1) the violation of statistical assumptions generates a bias in the analysis results; (2) removing parameters from the mathematical model modifies the analysis results. To test these hypotheses, the present study aimed to analyse the effect of statistical assumptions and the removal of parameters from the mathematical model on the results of the path analysis in the oat crop.

Materials and methods

Study area and experimental design
The research was carried out with results from experiments conducted during the agricultural years 2015, 2016, 2017, 2018, and 2019, in the city of Augusto Pestana, in southern Brazil, under geographical coordinates of 28°26′30″S and 54°00′58″W, with an altitude of 400 m above sea level. According to Köppen’s climate classification, the region’s climate is Cfa, characterised by an average air temperature of 19.1°C, ranging from 0 to 38°C, and an accumulated rainfall of 2040 mm (Alvares et al. 2013). The soil of the experimental area is classified as Latossolo Vermelho distrófico típico (Oxisol) (Tedesco et al. 1995).
The experimental design employed was a two-factor 22 × 5 randomised complete block design, characterised by 22 oat cultivars: URS Altiva, URS Brava, URS Guará, URS Estampa, URS Corona, URS Torena, URS Charrua, URS Guria, URS Tarimba, URS Taura, URS 21, FAEM 007, FAEM 006, FAEM 5 Chiarasul, FAEM 4 Carlasul, Brisasul, Barbarasul, Fapa Slava, IPR Afrodite, UPFPS Farroupilha, UPFA Ouro, and UPFA Gaudéria, and five fungicide applications: 0 (no fungicide application), 1 (application performed at 60 days after emergence [DAE]), 2 (applications performed at 60 DAE and 75 DAE), 3 (applications performed at 60, 75 and 90 DAE), and 4 (applications performed at 60, 75, 90 and 105 DAE), with three repetitions.
Oat sowing in all agricultural years was performed between May 15 and June 15, following the technical recommendations for the crop. Harvest was performed from late October to early November in all agricultural years. The production performance was analysed by collecting the plants from three central rows, 5 m long, selected on the day of harvest, and quantifying the following variables: panicle length (PL, cm); panicle dry mass (PDM, g), by weighing on a precision balance; panicle spikelet number (PSN), by counting; panicle grain number (PGN), by counting; panicle grain dry mass (GDM), by weighing on a precision balance; harvest index (HI), determined by the ratio between panicle grain dry mass and panicle dry mass; and grain yield (kg ha−1), determined by weighing the grains in the plot and later extrapolating to kg ha−1.

Statistical analysis
In the statistical analyses, each agricultural year was approached individually. The univariate statistical assumptions (normality of residuals and homoscedasticity of residual variances) were initially tested for all measured variables. Subsequently, the multivariate statistical assumptions (multivariate normality and absence of multicollinearity) were also tested. For the multivariate normality diagnosis, the multivariate Shapiro–Wilk test was applied (Royston 1983). The Shapiro–Wilk test was also applied (P-value ≤ 0.05) to analyse the normality of the residuals (Shapiro and Wilk 1965) and the homogeneity of variances was assessed by Bartlett’s test (P-value ≤ 0.05) (Steel et al. 1997). The multicollinearity diagnosis was performed considering the variance inflation factor (VIF) and the condition number (CN).
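The two multicollinearity diagnostics named above can be sketched as follows. This is a minimal illustration and not the authors' R code: it assumes NumPy is available, the function name `vif_and_condition_number` is hypothetical, and the example correlation matrix is invented for demonstration. The VIF is taken as the diagonal of the inverse of the correlation matrix and the CN as the ratio of its largest to smallest eigenvalue.

```python
import numpy as np


def vif_and_condition_number(corr):
    """Return (VIF per variable, condition number) for a correlation matrix.

    Hypothetical helper for illustration only.
    """
    corr = np.asarray(corr, dtype=float)
    # VIF: diagonal elements of the inverse of the correlation matrix.
    vif = np.diag(np.linalg.inv(corr))
    # CN: ratio of the largest to the smallest eigenvalue.
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(corr)
    cn = eigenvalues[-1] / eigenvalues[0]
    return vif, cn


# Invented correlation matrix for three explanatory variables,
# two of which are almost collinear (r = 0.99).
example = [
    [1.00, 0.99, 0.50],
    [0.99, 1.00, 0.55],
    [0.50, 0.55, 1.00],
]
vif, cn = vif_and_condition_number(example)
print("VIF:", np.round(vif, 2))
print("CN:", round(float(cn), 2))
```

With only two variables correlated at r, the same function gives VIF = 1/(1 − r²) and CN = (1 + r)/(1 − r), which is a convenient sanity check on the implementation.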
Pearson correlation analysis was performed, generating the matrix of correlation coefficients. Subsequently, the CN was obtained by the ratio between the largest and smallest eigenvalue of the correlation matrix $X'X$. CN ≤ 100 indicates the occurrence of weak multicollinearity, 100 < CN < 1000 moderate to severe multicollinearity, and CN ≥ 1000 severe multicollinearity (Montgomery and Peck 1982). The VIF, on the other hand, was obtained for each variable from the diagonal of the inverse of the correlation matrix $X'X$. A VIF value > 10 indicates the occurrence of severe multicollinearity (Hair et al. 2009). The occurrence of multicollinearity between the explanatory variables was defined by obtaining values of CN ≥ 1000 and VIF > 10.
When violation of assumptions was verified, data transformation was performed employing the Box–Cox family of transformations (Box and Cox 1964). After the data transformation, the multivariate normality and multicollinearity diagnosis was performed again to check the transformation’s efficiency. For situations in which the data transformation was not effective in circumventing the violation of assumptions, we opted for the exclusion of variables and/or grouping of highly correlated variables, as recommended by Cruz et al. (2012).
The path analysis was performed for experiments from each year, with and without data transformation. The primary variable grain yield was considered as a function of the explanatory variables (PL, PDM, PSN, PGN, GDM, and HI) (Cruz et al. 2012). The direct and indirect effects of the explanatory variables on yield were estimated by means of three methods of path analysis: traditional, traditional with the elimination of variables, and under multicollinearity conditions (ridge path analysis). In all methods, it was considered that each explanatory variable has a direct effect on yield and acts indirectly through its effects on the other explanatory variables.
In the traditional analysis, the direct and indirect effects were estimated, disregarding possible multicollinearity biases. To this end, the variables were standardised, and the model of the path analysis was established:

$$\text{Yield} = \hat{\beta}_1\,\text{PL} + \hat{\beta}_2\,\text{PDM} + \hat{\beta}_3\,\text{PSN} + \hat{\beta}_4\,\text{PGN} + \hat{\beta}_5\,\text{GDM} + \hat{\beta}_6\,\text{HI} + \text{Residual}, \quad (1)$$

where $\hat{\beta}_1$, $\hat{\beta}_2$, $\hat{\beta}_3$, $\hat{\beta}_4$, $\hat{\beta}_5$ and $\hat{\beta}_6$ are the estimators of the direct effects of the variables PL, PDM, PSN, PGN, GDM and HI, respectively. Following this, by means of a system of normal equations $X'X\hat{\beta} = X'Y$, the direct and indirect effects of each explanatory variable on yield were obtained, as described by Cruz et al. (2012).
In the traditional path analysis with variable elimination, a high linear association was found between the variables PDM, GDM and HI. In all 5 years, the elimination of the PDM variable was effective in reducing multicollinearity to satisfactory levels. Thus, this explanatory variable was eliminated from the path analysis in each year and, subsequently, the direct and indirect effects of the five explanatory variables on yield were calculated, following the path analysis model:

$$\text{Yield} = \hat{\beta}_1\,\text{PL} + \hat{\beta}_2\,\text{PSN} + \hat{\beta}_3\,\text{PGN} + \hat{\beta}_4\,\text{GDM} + \hat{\beta}_5\,\text{HI} + \text{Residual}, \quad (2)$$

where $\hat{\beta}_1$, $\hat{\beta}_2$, $\hat{\beta}_3$, $\hat{\beta}_4$ and $\hat{\beta}_5$ are the estimators of the direct effects of the variables PL, PSN, PGN, GDM and HI, respectively. As in the traditional path analysis, the normal equation system $X'X\hat{\beta} = X'Y$ was used to obtain the indirect effect of each variable on the yield.
In the ridge path analysis, the six explanatory variables were kept (PL, PDM, PSN, PGN, GDM and HI), for estimating the direct and indirect effects on grain yield. However, a constant $k$ was added to the diagonal of the correlation matrix $X'X$, in order to reduce the variance associated with the least-squares estimator of the path analysis. Thus, the system of normal equations $X'X\hat{\beta} = X'Y$ became $(X'X + kI)\hat{\beta} = X'Y$, where $I$ is the identity matrix. Increasing values of the constant $k$ were tested to choose the smallest value at which the path coefficients stabilised (Cruz et al. 2012).
Additionally, the effects of mathematical model parameters (cultivar, application, and block) were isolated and removed, proceeding again with the traditional path analysis, path analysis with the elimination of variables, and ridge path analysis. These results were compared to the ones observed in the ‘traditional’ path analysis to identify whether the removal of the model factors generates changes in the results of the analysis.
The removal of the effects of the model parameters refers to the uniformity trials (without applying treatments), considering that each observation is composed of the overall mean plus the random effect of the error. The group of data in which the removal of model parameter effects was performed is referred to here as ‘predicted’, and the group of data with the maintenance of parameter effects is referred to as ‘original’. The mathematical model for two-factor experiments (fixed effect) under randomised block design is presented in Eqn 3:

$$Y_{ijk} = m + a_i + d_j + (ad)_{ij} + b_k + e_{ijk}, \quad (3)$$

where $Y_{ijk}$ is an observation in block $k$ ($k$ = 1, 2, and 3) referring to the treatment level $i$ (22 levels) of factor A (cultivar), with level $j$ (five levels) of factor D (applications); $m$ is the overall mean of the experiment; $a_i$ is the effect of level $i$ ($i$ = 1, ..., 22) of factor A; $d_j$ is the effect of level $j$ ($j$ = 1, ..., 5) of factor D; $(ad)_{ij}$ is the effect of the interaction of level $i$ of factor A with level $j$ of factor D; $b_k$ is the random effect of block $k$; $e_{ijk}$ is the random effect of experimental error.
The estimates of the model parameters $m$, $a_i$, $d_j$, $(ad)_{ij}$, and $b_k$ were obtained based on the method of least squares, according to the equations presented below:

$$\hat{m} = \frac{Y_{...}}{IJK}, \quad (4)$$
where $\hat{m}$ is the overall mean of the experiment; $Y_{...}$ is the sum of all the observations of the experiment; $I$ is the number of levels of factor A; $J$ is the number of levels of factor D; $K$ is the number of blocks.

$$\hat{a}_i = \bar{Y}_{i..} - \hat{m}, \quad (5)$$

where $\hat{a}_i$ is the effect of factor A; $\bar{Y}_{i..}$ is the mean of level $i$ of factor A; $\hat{m}$ is the overall mean of the experiment.

$$\hat{d}_j = \bar{Y}_{.j.} - \hat{m}, \quad (6)$$

where $\hat{d}_j$ is the effect of factor D; $\bar{Y}_{.j.}$ is the mean of level $j$ of factor D; $\hat{m}$ is the overall mean of the experiment.

$$\hat{b}_k = \bar{Y}_{..k} - \hat{m}, \quad (7)$$

where $\hat{b}_k$ is the effect of block; $\bar{Y}_{..k}$ is the mean of level $k$ of factor block; $\hat{m}$ is the overall mean of the experiment.

$$\widehat{(ad)}_{ij} = \bar{Y}_{ij.} - \hat{m} - \hat{a}_i - \hat{d}_j, \quad (8)$$

where $\widehat{(ad)}_{ij}$ is the effect of the interaction; $\bar{Y}_{ij.}$ is the mean of the combination of level $i$ of factor A and level $j$ of factor D; $\hat{a}_i$ is the effect of factor A; $\hat{d}_j$ is the effect of factor D.
A 5% probability level of error was adopted in all statistical analyses, and all analyses were performed using Excel software and R software (R Core Team 2021). The analyses used the packages stats (R Core Team 2021), car (Fox and Weisberg 2019), MVN (Korkmaz et al. 2014), pracma (Borchers 2021), faraway (Faraway 2016), Hmisc (Harrell 2021), biotools (Silva et al. 2017), rpanel (Bowman et al. 2007), and tkrplot (Tierney 2021).

Meteorological conditions
The data on air temperature (minimum, average, and maximum) and accumulated rainfall were obtained from a mobile automatic weather station located approximately 200 m away from the experimental area. During the oat season in 2015, an average air temperature of 17.4°C, ranging from −0.16°C to 33.0°C, and an accumulated rainfall of 798.3 mm were recorded (Fig. 1). For the 2016 growing period, the average air temperature was 16.5°C, ranging from −0.3°C to 34.7°C, with an accumulated rainfall of 711.2 mm. During the 2017 season, the average air temperature was 19.1°C, ranging from −4.2°C to 34.3°C, with an accumulated rainfall of 715.5 mm. For the 2018 growing season, the average air temperature was 16.5°C, ranging from −1.60°C to 32.30°C, with an accumulated rainfall of 687.4 mm. During 2019, the average air temperature was 17.0°C, ranging from −4.2°C to 36.1°C, with an accumulated rainfall of 645.5 mm. Oats require temperatures between 0°C and 35°C (Leite et al. 2012) for growth and development to occur. Therefore, during all crop years, the minimum and maximum temperatures exceeded or were very close to the limits established for cultivation. In the year 2015, a period with no rainfall was observed right after crop fertilisation. This interval was associated with the occurrence of high temperatures during the anthesis period, when the development of the reproductive system is particularly sensitive to water stress and high temperatures, and may have impacted the performance of the crop.

Results

Statistical assumptions
The diagnosis of univariate normality and homoscedasticity of the residual variances allowed the identification of the violation of statistical assumptions in all growing years, indicating the need for data transformation (Supplementary Table S1). When performing the data transformation, it was observed that this technique was not always effective in bypassing the violation of assumptions. However, it helped to improve the measures of normality and homoscedasticity. In general, the highest rates of assumption violation were observed in Year 2015 and the lowest rates in Year 2016.
The multivariate normality and multicollinearity diagnosis demonstrated a violation of statistical assumptions for all years under study (Supplementary Table S2). Furthermore, the variance inflation factor (VIF) and condition number (CN) statistics allowed identification of the variables that caused noise in the data, which were PDM, GDM and HI. Thus, as an initial strategy to address poor conformity to distributional assumptions, the variables that did not show univariate normality were transformed. However, the data transformation was not effective in circumventing the violation of the assumptions. As an alternative approach, the variables that showed multicollinearity (PDM, GDM and HI) were individually excluded, and again, the diagnosis of multicollinearity and multivariate normality was performed. The elimination of the PDM variable proved effective in circumventing multicollinearity (Supplementary Table S3). However, this alternative was not sufficient to obtain multivariate normality.
In general, the same trend was observed in the results of the multicollinearity and multivariate normality diagnosis between the data groups, but with divergent absolute values. In 2015, the absolute values of the VIF and CN statistics were higher for the original data group than for the predicted one. However, for the other years, the highest absolute values of all variables were obtained for the predicted data set, except for HI in 2016. Similar responses were observed when PDM exclusion was performed, and the highest absolute values of the VIF and CN statistics for all variables were obtained in the predicted data group, except for 2015 and the variable PSN in 2017.
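The ‘predicted’ data group compared above is described in the Materials and methods as the observations left after the estimated model effects are removed, so that each value is the overall mean plus the error. A minimal sketch of that step, following Eqns 4 to 8, is given below. It is an interpretation rather than the authors' code: it assumes NumPy, a balanced array of shape (cultivar, application, block), that the interaction estimate of Eqn 8 is removed together with the main effects, and the function name and simulated yields are hypothetical.

```python
import numpy as np


def remove_model_effects(y):
    """Remove cultivar, application, interaction and block effects.

    y has shape (I, J, K): cultivar x application x block (balanced design).
    Returns an array of the same shape holding overall mean + residual.
    """
    y = np.asarray(y, dtype=float)
    m = y.mean()                                        # Eqn 4: overall mean
    a = y.mean(axis=(1, 2)) - m                         # Eqn 5: cultivar effects
    d = y.mean(axis=(0, 2)) - m                         # Eqn 6: application effects
    b = y.mean(axis=(0, 1)) - m                         # Eqn 7: block effects
    ad = y.mean(axis=2) - m - a[:, None] - d[None, :]   # Eqn 8: interaction
    return (y
            - a[:, None, None]
            - d[None, :, None]
            - ad[:, :, None]
            - b[None, None, :])


# Simulated yields (kg/ha) for a 22 x 5 x 3 layout; values are invented.
rng = np.random.default_rng(0)
I, J, K = 22, 5, 3
y = (3000.0
     + rng.normal(0, 300, (I, 1, 1))    # cultivar effects
     + rng.normal(0, 200, (1, J, 1))    # application effects
     + rng.normal(0, 100, (1, 1, K))    # block effects
     + rng.normal(0, 50, (I, J, K)))    # experimental error
predicted = remove_model_effects(y)
print(predicted.shape)
```

After the removal, every cultivar × application cell mean and every block mean equals the overall mean, which is what a uniformity trial without treatments would be expected to show.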
[Figure 1: five panels, (a) to (e), each plotting rainfall (mm) and minimum and maximum air temperature (°C) against days after emergence (DAE).]
Fig. 1. Air temperature (minimum and maximum) and accumulated rainfall during the oat growing season (May–October) in 5 years: 2015 (a), 2016 (b), 2017 (c), 2018 (d), and 2019 (e).
Simple correlation
When analysing the linear relationship between the yield components of oats, considering the original and predicted data set, high rates of significant correlation between the variables were observed, regardless of the cultivation year (Table 1). In general, all traits under study showed a correlation with yield, but with r values of low magnitude for PL. The PDM showed high correlations with PGN, GDM and PSN, which is justified because these variables are panicle components. PGN and GDM also showed a substantial correlation, suggesting that panicles with higher grain numbers result in higher values of grain mass.
When reviewing individually the 105 values of correlation coefficients obtained from the combination of each pair of variables in the 5 years between data groups (original and predicted), several situations are observed. Generally, the response pattern was maintained in 79% of combinations, the direction of the response was reversed in 4.8% of combinations, and the absolute value of the coefficient changed by at least 50%, with maintenance of sign, in 16.2% of combinations. Furthermore, it should be noted that the response pattern was maintained in the years 2016, 2017, 2018, and 2019, with minor changes in the magnitude of the coefficients. In 2015, inversion of sign was observed in 19.0% of the combinations. Also in 2015, change in the absolute value of the coefficient by at least 50%, with maintenance of the sign, was observed in 66.75% of combinations. Coincidentally this year had the highest rates of violation of statistical assumptions.
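The three categories used in this comparison (pattern maintained, sign reversed, magnitude changed by at least 50% with the same sign) can be illustrated with a short sketch. The exact rule the authors applied is not spelled out, so the version below is an assumption: the change is measured relative to the absolute value of the original coefficient, and the function name `classify_change` is hypothetical. The input values are the 2015 correlations with yield reported in Table 1 for the original and predicted data.

```python
def classify_change(r_original, r_predicted, threshold=0.5):
    """Classify how a correlation coefficient changed between data groups.

    Assumed rule: a sign flip is reported first; otherwise the relative
    change in |r| (with respect to the original value) is compared with
    the threshold.
    """
    if r_original * r_predicted < 0:
        return "sign reversed"
    if r_original == 0:
        # Relative change is undefined; any non-zero value counts as a change.
        return "maintained" if r_predicted == 0 else "magnitude changed"
    relative = abs(abs(r_predicted) - abs(r_original)) / abs(r_original)
    return "magnitude changed" if relative >= threshold else "maintained"


# 2015 correlations with yield from Table 1 (original, predicted).
pairs_2015 = {
    "PL": (0.04, 0.03),
    "PDM": (0.58, 0.07),
    "PSN": (0.42, -0.16),
    "PGN": (0.48, 0.04),
    "GDM": (0.63, 0.10),
    "HI": (0.68, 0.18),
}
result = {name: classify_change(o, p) for name, (o, p) in pairs_2015.items()}
for name, label in result.items():
    print(name, label)
```

Under this assumed rule, only PL keeps its pattern, PSN reverses sign, and the other four variables change in magnitude, which is in line with the instability reported for 2015.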
Table 1. Pearson’s correlation coefficients for the yield components of white oat grown in 5 years (2015, 2016, 2017, 2018, and 2019), with original data (below the diagonal) and predicted data (above the diagonal), with n = 330.
Variables  Yield      PL         PDM       PSN       PGN       GDM       HI
2015
Yield      1          0.03n.s.   0.07n.s.  −0.16*    0.04n.s.  0.10n.s.  0.18*
PL         0.04n.s.   1          0.00n.s.  0.01n.s.  0.06n.s.  0.00n.s.  −0.02n.s.
PDM        0.58*      0.25*      1         0.40*     0.45*     0.98*     0.02n.s.
PSN        0.42*      0.33*      0.63*     1         0.69*     0.36*     −0.17*
PGN        0.48*      0.25*      0.73*     0.77*     1         0.42*     −0.11*
GDM        0.63*      0.24*      0.99*     0.61*     0.73*     1         0.20*
HI         0.68*      0.07n.s.   0.54*     0.31*     0.41*     0.64*     1
2016
Yield      1          0.18*      0.50*     0.31*     0.42*     0.53*     0.44*
PL         0.12*      1          0.50*     0.41*     0.42*     0.49*     0.14*
PDM        0.44*      0.48*      1         0.70*     0.80*     0.99*     0.28*
PSN        0.26*      0.40*      0.69*     1         0.89*     0.69*     0.23*
PGN        0.35*      0.39*      0.79*     0.88*     1         0.81*     0.31*
GDM        0.47*      0.47*      0.99*     0.69*     0.80*     1         0.40*
HI         0.41*      0.07n.s.   0.25*     0.21*     0.29*     0.36*     1
2017
Yield      1          −0.01*     0.59*     0.26*     0.55*     0.63*     0.56*
PL         0.00n.s.   1          0.25*     0.35*     0.16*     0.20*     −0.11*
PDM        0.55*      0.26*      1         0.55*     0.74*     0.99*     0.52*
PSN        0.24*      0.37*      0.54*     1         0.70*     0.51*     0.12n.s.
PGN        0.52*      0.17*      0.73*     0.69*     1         0.75*     0.50*
GDM        0.59*      0.21*      0.99*     0.51*     0.74*     1         0.65*
HI         0.54*      −0.12*     0.48*     0.11n.s.  0.47*     0.61*     1
2018
Yield      1          −0.17*     0.45*     0.23*     0.39*     0.51*     0.65*
PL         −0.10n.s.  1          0.28*     0.18*     0.15*     0.24*     −0.14*
PDM        0.42*      0.30*      1         0.49*     0.61*     0.99*     0.49*
PSN        0.21*      0.22*      0.49*     1         0.84*     0.47*     0.19*
PGN        0.37*      0.21*      0.61*     0.83*     1         0.61*     0.37*
GDM        0.48*      0.25*      0.99*     0.48*     0.61*     1         0.59*
HI         0.61*      −0.10n.s.  0.43*     0.17*     0.34*     0.55*     1
2019
Yield      1          0.19*      0.53*     0.42*     0.48*     0.56*     0.45*
PL         0.19*      1          0.51*     0.37*     0.34*     0.48*     0.03n.s.
PDM        0.48*      0.47*      1         0.63*     0.69*     0.99*     0.31*
PSN        0.39*      0.36*      0.62*     1         0.87*     0.62*     0.17*
PGN        0.43*      0.33*      0.67*     0.86*     1         0.68*     0.26*
GDM        0.52*      0.45*      0.99*     0.61*     0.67*     1         0.44*
HI         0.43*      0.02n.s.   0.29*     0.16*     0.24*     0.41*     1
PL, panicle length; PDM, panicle dry mass; PSN, panicle spikelet number; PGN, panicle grain number; GDM, panicle grain dry mass; HI, harvest index; n.s., not significant.
*Significant at 5% error probability.

Path analysis vs removal of model parameters
The values of the correlation coefficient between PL and yield of the oats in the 5 years were of low magnitude (≤0.19), indicating a low association between these variables. The other variables showed correlation coefficients with higher magnitudes, suggesting a possible cause and effect relationship, regardless of the year (Table 2). It is also important to note that these responses were maintained even after the transformation of the variables.

Table 2. Pearson’s correlation coefficients of the explanatory variables with the grain yield of oats, in 5 years (2015, 2016, 2017, 2018, and 2019), showing the effect on this statistic of three different analysis methods, and the correlation in results between those alternative methods.
Years  PL         PDM        PSN        PGN        GDM        HI
Original data
2015   0.04n.s.   0.58*      0.42*      0.48*      0.63*      0.68*
2016   0.12*      0.44*      0.26*      0.35*      0.47*      0.41*
2017   0.00n.s.   0.55*      0.24*      0.52*      0.59*      0.54*
2018   −0.10n.s.  0.42*      0.21*      0.37*      0.48*      0.61*
2019   0.19*      0.48*      0.39*      0.43*      0.52*      0.43*
Mean   0.05       0.49       0.30       0.43       0.54       0.53
Transformed data
2015   0.04n.s.   0.59*      0.43*      0.51*      0.65*      0.68*
2016   0.12*      0.44*      0.26*      −0.34*     0.47*      0.40*
2017   0.01n.s.   0.55*      0.26*      0.52*      0.59*      0.56*
2018   −0.10n.s.  0.43*      0.24*      0.38*      0.49*      0.62*
2019   −0.18*     0.48*      0.38*      0.43*      0.51*      0.44*
Mean   −0.02      0.50       0.31       0.30       0.54       0.54
Predicted data
2015   0.03n.s.   0.07n.s.   −0.16*     0.04n.s.   0.10n.s.   0.18*
2016   0.18*      0.50*      0.31*      0.42*      0.53*      0.44*
2017   −0.01n.s.  0.59*      0.26*      0.55*      0.63*      0.56*
2018   −0.17*     0.45*      0.23*      0.39*      0.51*      0.65*
2019   0.19*      0.53*      0.42*      0.48*      0.56*      0.45*
Mean   0.04       0.43       0.21       0.38       0.47       0.46
r(1)   −0.05n.s.  1.00*      0.99*      0.74n.s.   0.99*      1.00*
r(2)   0.98*      −0.50n.s.  −0.41n.s.  −0.10n.s.  −0.58n.s.  −0.27n.s.
PL, panicle length; PDM, panicle dry mass; PSN, panicle spikelet number; PGN, panicle grain number; GDM, panicle grain dry mass; HI, harvest index; r(1), Pearson’s correlation coefficient between the original and transformed data groups; r(2), Pearson’s correlation coefficient between the original and predicted data sets; n.s., not significant.
*Significant at 5% error probability.

When analysing the correlation coefficients between the explanatory variables and the yield, for the group of predicted data, a low linear association between PL and the
www.publish.csiro.au/cp Crop & Pasture Science 75 (2024) CP23135
yield was also observed, as well as specific changes in the showed a significant correlation with the response variable,
direction of the correlation (+ or −) for PSN and in the it is justifiable to carry out path analysis to stratify the
magnitude of the correlation for PDM, PGN, GDM, and HI, in direct and indirect effects of the explanatory variables on the
2015. In addition, when performing a new Pearson correla- response. Thus, it is necessary to carry out a new diagnosis of
tion between the five coefficients obtained for the group of multicollinearity, contemplating only the explanatory vari-
original data and the five coefficients obtained for the group ables and checking the possibility of bias in the path analyses.
of predicted data, we observed associations of low magnitude, The diagnosis of multicollinearity based on VIF indicated
especially for the variables PGN, HI, PSN, PDM, and GDM. that the variables PL, PSN, and PGN do not have a high
Considering the existence of a linear relationship between correlation with the other explanatory variables (VIF < 10),
the yield and most variables and that, in some years, PL regardless of the year, the data group (original or predicted),
the use of data transformation techniques, or exclusion of
Table 3. Variance Inflation Factor (VIF) for the explanatory variables of variables (Table 3). For the variables PDM, GDM, and HI,
the yield of oats cultivated in 5 years (2015, 2016, 2017, 2018, and 2019), the VIF statistic (VIF > 10) indicated the existence of
with original data, transformed data, original data excluding variables, multicollinearity, regardless of the data group or data
predicted data and predicted data excluding variables. transformation. However, the technique of eliminating the
PDM variable circumvented the violation of the statistical
Years Variables
assumption, resulting in absence of severe multicollinearity
PL PDM PSN PGN GDM HI
between the variables.
Original data Similar results were observed when performing the
2015 1.151 478.581 2.683 3.389 577.082 15.642
2016 1.337 1112.270 4.624 6.543 1204.839 18.838
2017 1.265 381.302 2.382 3.368 466.407 17.996
2018 1.212 503.389 3.397 4.116 579.004 13.077
2019 1.328 727.694 3.934 4.414 798.742 17.539
Transformed data
2015 1.160 1086.342 3.035 3.875 1351.405 40.629
2016 1.222 1079.775 3.878 4.865 1164.683 18.470
2017 1.259 468.478 2.382 3.323 578.117 22.314
2018 1.213 504.312 3.326 4.032 580.074 13.080
2019 1.310 627.500 3.938 4.416 687.984 15.131
Original data with elimination of variables
2015 1.132 – 2.678 3.388 3.154 1.749
2016 1.329 – 4.621 6.520 3.202 1.756
2017 1.249 – 2.372 3.298 2.957 1.971
2018 1.181 – 3.383 4.111 2.200 1.634
2019 1.326 – 3.934 4.371 2.423 1.279
Predicted data
2015 1.013 244.261 1.978 2.045 253.060 8.976
2016 1.369 1313.202 5.160 7.709 1440.177 24.035
2017 1.244 481.814 2.522 3.764 606.054 24.790
2018 1.228 656.422 3.689 4.492 767.099 17.150
2019 1.395 863.520 4.236 4.827 957.012 21.125
Predicted data with elimination of variables
2015 1.007 – 1.960 2.044 1.344 1.122
2016 1.352 – 5.149 7.653 3.455 1.204
2017 1.242 – 2.517 3.680 3.293 2.210
2018 1.212 – 3.686 4.491 2.403 1.838
2019 1.395 – 4.235 4.777 2.632 1.328
PL, panicle length; PDM, panicle dry mass; PSN, panicle spikelet number; PGN, panicle grain number; GDM, panicle grain dry mass; HI, harvest index.
diagnosis of multicollinearity using the condition number statistic, which indicated the existence of severe multicollinearity (CN > 1000) among the explanatory variables, regardless of the data group, or the transformation of the original data (Table 4). As observed for the VIF, the elimination of the PDM variable circumvented the problem of multicollinearity of the variables, resulting in CN values < 100.
The original data transformation technique was not effective in circumventing the multicollinearity between the explanatory variables, and in some situations it inflated the coefficients, indicating the existence of a high correlation between the explanatory variables. Thus, when performing the path analysis, we chose to use the original data, without transforming variables. Therefore, first, the path analysis was performed considering the original and predicted data groups, and later the PDM variable was eliminated in each data group, and a new path analysis was performed.
The variables PL, PDM, PSN, PGN, GDM, and HI showed explanatory power of 54.3%, 30.2%, 42.4%, 43.2% and 35.7% of the variance in oat yield, in years 2015, 2016, 2017, 2018, and 2019, respectively, for the original data group (Table 5). For the predicted data group, the variables showed
Table 4. Condition number (CN) for explanatory variables of the yield of oats cultivated in 5 years (2015, 2016, 2017, 2018, and 2019), with original data (Orig.), transformed data (Orig.t), original data excluding variables (Orig.e), predicted data (Pred.), and predicted data with exclusion of variables (Pred.e).
Years Orig. Orig.t Orig.e Pred. Pred.e
2015 3952.276 9272.453 15.574 1346.278 6.494
2016 8889.842 8522.000 31.161 10795.510 37.476
2017 3011.417 3717.600 13.471 3943.674 15.325
2018 3657.622 3663.841 17.693 4846.142 19.659
2019 5532.334 4756.321 20.711 6725.201 23.191
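The variance inflation factor (VIF) and condition number (CN) diagnostics discussed in this section can be illustrated with a minimal sketch. It assumes the usual definitions, which the text does not spell out: VIF as the diagonal of the inverse of the correlation matrix among explanatory variables, and CN as the ratio of the largest to the smallest eigenvalue of that matrix. The correlation matrix below is hypothetical, not the oat data, and the helper names are illustrative only.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment with the identity: [A | I] is reduced to [I | A^-1].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def dominant_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric positive-definite matrix (power iteration)."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # asymmetric start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = max(abs(x) for x in nxt)
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the converged vector.
    image = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(a * v for a, v in zip(image, vec)) / sum(v * v for v in vec)


def vif_and_condition_number(corr):
    """Return (list of VIFs, condition number) for a correlation matrix."""
    inverse = invert(corr)
    vif = [inverse[i][i] for i in range(len(corr))]
    largest = dominant_eigenvalue(corr)
    # The smallest eigenvalue of corr is the reciprocal of the largest of its inverse.
    smallest = 1.0 / dominant_eigenvalue(inverse)
    return vif, largest / smallest


# Hypothetical correlation matrix: three predictors, every pairwise r = 0.90.
corr = [[1.0, 0.9, 0.9], [0.9, 1.0, 0.9], [0.9, 0.9, 1.0]]
vif, cn = vif_and_condition_number(corr)
print([round(v, 2) for v in vif], round(cn, 1))
```

For this toy matrix each VIF is about 6.8 and the CN is 28. Values in the thousands, as reported for the full set of explanatory variables, arise when two predictors are almost perfectly correlated, as PDM and GDM appear to be here.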
J. Sgarbossa et al. Crop & Pasture Science 75 (2024) CP23135
Table 5. Direct and indirect effects of explanatory variables on yield of oats cultivated in 5 years (2015, 2016, 2017, 2018, and 2019), for original (Orig.)
and predicted (Pred.) data, without considering multicollinearity (n = 330).
Effects 2015 2016 2017 2018 2019
Orig. Pred. Orig. Pred. Orig. Pred. Orig. Pred. Orig. Pred.
PL
Direct on yield −0.104 0.021 −0.088 −0.082 −0.053 −0.072 −0.139 −0.176 −0.018 −0.056
Indirect via PDM 0.210 0.000 0.400 −0.181 0.372 0.366 0.662 0.644 1.573 2.560
Indirect via PSN 0.043 −0.004 −0.047 −0.069 −0.035 −0.039 −0.021 −0.012 0.023 0.027
Indirect via PGN 0.005 0.016 0.012 0.054 0.041 0.040 0.041 0.026 0.033 0.036
Indirect via GDM −0.154 0.000 −0.179 0.425 −0.261 −0.250 −0.562 −0.537 −1.442 −2.412
Indirect via HI 0.044 −0.002 0.025 0.030 −0.062 −0.058 −0.080 −0.111 0.016 0.034
r 0.044 0.032 0.122 0.177 0.003 −0.013 −0.098 −0.166 0.186 0.189
PDM
Direct on yield 0.828 0.010 0.828 −0.361 1.432 1.451 2.238 2.288 3.324 5.059
Indirect via PL −0.026 0.000 −0.042 −0.041 −0.014 −0.018 −0.041 −0.050 −0.009 −0.028
Indirect via PSN 0.082 −0.137 −0.082 −0.116 −0.051 −0.061 −0.047 −0.034 0.040 0.046
Indirect via PGN 0.014 0.116 0.024 0.102 0.170 0.183 0.122 0.104 0.067 0.071
Indirect via GDM −0.645 0.074 −0.378 0.852 −1.238 −1.228 −2.190 −2.251 −3.162 −4.935
Indirect via HI 0.328 0.003 0.090 0.061 0.248 0.262 0.339 0.394 0.221 0.318
r 0.580 0.065 0.440 0.496 0.547 0.589 0.421 0.452 0.482 0.531
PSN
Direct on yield 0.130 −0.340 −0.119 −0.166 −0.095 −0.111 −0.097 −0.069 0.065 0.073
Indirect via PL −0.034 0.000 −0.035 −0.034 −0.019 −0.025 −0.030 −0.031 −0.006 −0.021
Indirect via PDM 0.519 0.004 0.571 −0.251 0.780 0.802 1.098 1.113 2.067 3.198
Indirect via PGN 0.015 0.175 0.027 0.113 0.161 0.173 0.167 0.145 0.086 0.090
Indirect via GDM −0.400 0.027 −0.263 0.597 −0.638 −0.638 −1.056 −1.077 −1.950 −3.091
Indirect via HI 0.189 −0.023 0.075 0.048 0.055 0.062 0.131 0.154 0.124 0.175
r 0.418 −0.157 0.256 0.306 0.244 0.263 0.213 0.234 0.385 0.424
PGN
Direct on yield 0.020 0.256 0.030 0.126 0.235 0.246 0.201 0.172 0.100 0.104
Indirect via PL −0.026 0.001 −0.034 −0.035 −0.009 −0.012 −0.028 −0.027 −0.006 −0.019
Indirect via PDM 0.608 0.004 0.654 −0.290 1.039 1.080 1.359 1.390 2.233 3.470
Indirect via PSN 0.101 −0.233 −0.105 −0.149 −0.065 −0.078 −0.080 −0.058 0.056 0.063
Indirect via GDM −0.476 0.032 −0.303 0.696 −0.924 −0.938 −1.355 −1.390 −2.136 −3.406
Indirect via HI 0.253 −0.016 0.104 0.067 0.244 0.254 0.269 0.301 0.186 0.268
r 0.480 0.044 0.346 0.416 0.520 0.552 0.365 0.387 0.433 0.481
GDM
Direct on yield −0.651 0.076 −0.381 0.858 −1.257 −1.246 −2.211 −2.271 −3.194 −4.985
Indirect via PL −0.025 0.000 −0.041 −0.041 −0.011 −0.015 −0.035 −0.042 −0.008 −0.027
Indirect via PDM 0.820 0.010 0.822 −0.358 1.412 1.430 2.217 2.268 3.290 5.009
Indirect via PSN 0.080 −0.123 −0.082 −0.116 −0.048 −0.057 −0.046 −0.033 0.040 0.045
Indirect via PGN 0.014 0.108 0.024 0.102 0.173 0.185 0.123 0.105 0.067 0.071
Indirect via HI 0.393 0.028 0.131 0.085 0.319 0.329 0.427 0.480 0.320 0.448
r 0.631 0.098 0.473 0.532 0.587 0.628 0.475 0.508 0.515 0.562
HI
Direct on yield 0.612 0.142 0.363 0.214 0.520 0.503 0.780 0.810 0.772 1.027
Indirect via PL −0.007 0.000 −0.006 −0.012 0.006 0.008 0.014 0.024 0.000 −0.002
Indirect via PDM 0.443 0.000 0.206 −0.102 0.682 0.756 0.972 1.114 0.952 1.565
Indirect via PSN 0.040 0.056 −0.025 −0.038 −0.010 −0.014 −0.016 −0.013 0.010 0.012
Indirect via PGN 0.008 −0.029 0.009 0.040 0.110 0.124 0.069 0.064 0.024 0.027
Indirect via GDM −0.418 0.015 −0.138 0.341 −0.771 −0.815 −1.211 −1.345 −1.325 −2.176
r 0.677 0.183 0.410 0.444 0.538 0.564 0.609 0.653 0.433 0.454
R2 0.543 0.099 0.302 0.359 0.424 0.465 0.432 0.490 0.357 0.423
Residual 0.676 0.949 0.835 0.800 0.759 0.732 0.754 0.715 0.802 0.759
PL, panicle length; PDM, panicle dry mass; PSN, panicle spikelet number; PGN, panicle grain number; GDM, panicle grain dry mass; HI, harvest index.
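The rows of Table 5 can be read as a decomposition of each correlation coefficient: the direct effect plus the indirect effects should add up to r. The short sketch below checks this for two rows copied from the table; the function name is illustrative, and small differences are expected because the published values are rounded to three decimals.

```python
def total_correlation(direct, indirect):
    """Correlation with yield implied by a direct effect plus its indirect effects."""
    return direct + sum(indirect.values())


# PL, original data, 2015 (Table 5); reported r = 0.044.
pl_2015 = total_correlation(
    -0.104,
    {"PDM": 0.210, "PSN": 0.043, "PGN": 0.005, "GDM": -0.154, "HI": 0.044},
)

# PDM, predicted data, 2019 (Table 5); reported r = 0.531.
pdm_2019 = total_correlation(
    5.059,
    {"PL": -0.028, "PSN": 0.046, "PGN": 0.071, "GDM": -4.935, "HI": 0.318},
)

print(round(pl_2015, 3), round(pdm_2019, 3))
```

The second row shows why coefficients above unity are a warning sign: a direct effect of 5.059 is almost entirely cancelled by an indirect effect of −4.935 via GDM, leaving a correlation of only about 0.53.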
explanatory power of 9.9%, 35.9%, 46.5%, 49.0% and 42.3% of the variance in yield, for years 2015, 2016, 2017, 2018, and 2019, respectively. Thus, for years 2016, 2017, 2018, and 2019, removing the effects of the model parameters increased the explanatory capacity of the variables by an average of 15%. In contrast, for 2015, there was a reduction of 81.8% in the explanatory power of the variance in yield.
In general, the removal of the effects of the model parameters resulted in a change in the direction of the direct effects in 16.7% of the combinations and promoted changes greater than 50% in the magnitude of the direct effects in 26.7%, with maintenance in the pattern of the direct effects in 56.6% of the combinations. Another aspect to be highlighted is that once again most of the changes, whether in direction or in magnitude, corresponded to data from 2015, coinciding with the highest rates of violation of univariate assumptions.
When analysing the impacts on the indirect effects of removing the model parameters, we can observe, in general, a change in the direction of the response of the path coefficients in 18.67% of situations, a change in the magnitude (>50% of the absolute value) of the coefficients in 26.67% of cases, and maintenance of the response pattern in 54.66% of the situations. In years 2017 and 2018 the pattern of responses was maintained in more than 90% of cases, whereas in years 2015, 2016, and 2019 the pattern was maintained in only 3.3%, 33.3%, and 46.7% of the situations, respectively.
On the other hand, when observing the direct and indirect effects of the explanatory variables on the yield of oats, values for path coefficients that exceed unity (1) were obtained, regardless of the data group.
When carrying out a path analysis with the elimination of PDM, the percentage variance in oat yield explained by the other variables was 54.10%, 30.20%, 41.80%, 42.20%, and 34.20% for years 2015, 2016, 2017, 2018, and 2019, respectively, for the original data group (Table 6). Considering the group of predicted data, the variables showed explanatory power of 9.9%, 35.9%, 46.1%, 48.2%, and 39.4% of the variance in the yield for years 2015, 2016, 2017, 2018, and 2019, respectively. Thus, it can be inferred that removing the effects of the model parameters resulted in an average increase of 14.6% in the explanatory capacity of the variables, for years 2016, 2017, 2018, and 2019. However, for 2015, it resulted in a reduction of 81.5% in the ability to explain the variance in yield.
In general, removing the effects of the model parameters resulted in a change in the direction of the direct effects of the explanatory variables on yield in 8% of situations, a variation greater than 50% in the magnitude of the direct effects in 24% of situations, and maintenance of the pattern of path coefficients in 68% of cases. Regarding indirect effects, alterations were observed in the direction of the coefficients in 12% of cases, with alterations greater than 50% in the magnitude of the effects in 25% of the combinations and maintenance of the pattern of responses in 63% of the combinations. Another aspect to be highlighted is that most of the changes, whether in direction or in magnitude, were observed in 2015, coinciding with the highest rates of violation of assumptions.
Another strategy that researchers can use to overcome problems with multicollinearity is to carry out ridge path analysis. In the ridge path analysis, the variables PL, PDM, PSN, PGN, GDM, and HI explained 52.8%, 29.2%, 41.0%, 41.0%, and 33.6% of the variance in yield, for years 2015, 2016, 2017, 2018 and 2019, respectively, for the original data group (Table 7). For the predicted data group, the variables studied were able to explain 9%, 34.8%, 45.2%, 46.8%, and 38.8% of the variance in yield, for years 2015, 2016, 2017, 2018, and 2019, respectively. Thus, removing the effects of the model parameters resulted in an average increase of 14.8% in the ability of the variables to explain the variance in yield, for years 2016, 2017, 2018, and 2019. In contrast, there was a reduction of 82.9% in the explanatory power of variance in 2015.
In general, the removal of the effects of the model parameters resulted in a change in the direction of the direct effects in 6.67% of the combinations and promoted changes greater than 50% in the magnitude of these effects in 23.33%, with maintenance in the response pattern in
Table 6. Direct and indirect effects of explanatory variables on yield of oats cultivated in 5 years (2015, 2016, 2017, 2018, and 2019), for original (Orig.)
and predicted (Pred.) data, excluding variables (n = 330).
Effects 2015 2016 2017 2018 2019
Orig. Pred. Orig. Pred. Orig. Pred. Orig. Pred. Orig. Pred.
PL
Direct on yield −0.099 0.021 −0.085 −0.083 −0.043 −0.070 −0.121 −0.165 −0.012 −0.056
Indirect via PSN 0.044 −0.004 −0.047 −0.069 −0.037 −0.040 −0.018 −0.011 0.023 0.026
Indirect via PGN 0.005 0.016 0.010 0.055 0.044 0.043 0.040 0.026 0.042 0.049
Indirect via GDM 0.060 0.000 0.226 0.238 0.067 0.076 0.047 0.047 0.128 0.162
Indirect via HI 0.034 −0.002 0.017 0.037 −0.027 −0.022 −0.045 −0.063 0.006 0.009
r 0.044 0.032 0.122 0.177 0.003 −0.013 −0.098 −0.166 0.186 0.189
PSN
Direct on yield 0.133 −0.340 −0.118 −0.168 −0.102 −0.116 −0.085 −0.064 0.064 0.070
Indirect via PL −0.032 0.000 −0.034 −0.035 −0.016 −0.024 −0.026 −0.029 −0.004 −0.021
Indirect via PGN 0.016 0.175 0.023 0.115 0.175 0.186 0.161 0.146 0.108 0.124
Indirect via GDM 0.157 0.031 0.331 0.334 0.164 0.193 0.089 0.094 0.173 0.207
Indirect via HI 0.145 −0.023 0.053 0.059 0.024 0.023 0.074 0.088 0.044 0.044
r 0.418 −0.157 0.256 0.306 0.244 0.263 0.213 0.234 0.385 0.424
PGN
Direct on yield 0.021 0.256 0.027 0.129 0.254 0.265 0.194 0.173 0.126 0.143
Indirect via PL −0.024 0.001 −0.033 −0.035 −0.008 −0.011 −0.025 −0.025 −0.004 −0.019
Indirect via PSN 0.103 −0.233 −0.104 −0.150 −0.070 −0.082 −0.070 −0.054 0.055 0.061
Indirect via GDM 0.186 0.036 0.382 0.390 0.237 0.284 0.114 0.122 0.189 0.228
Indirect via HI 0.195 −0.016 0.074 0.082 0.106 0.095 0.152 0.171 0.066 0.068
r 0.480 0.044 0.346 0.416 0.520 0.552 0.365 0.387 0.433 0.481
GDM
Direct on yield 0.255 0.086 0.480 0.481 0.323 0.377 0.185 0.199 0.283 0.334
Indirect via PL −0.023 0.000 −0.040 −0.041 −0.009 −0.014 −0.031 −0.039 −0.006 −0.027
Indirect via PSN 0.082 −0.123 −0.081 −0.116 −0.052 −0.059 −0.040 −0.031 0.039 0.043
Indirect via PGN 0.015 0.108 0.021 0.104 0.187 0.200 0.119 0.106 0.084 0.098
Indirect via HI 0.303 0.028 0.094 0.104 0.139 0.124 0.242 0.273 0.114 0.114
r 0.631 0.098 0.473 0.532 0.587 0.628 0.475 0.508 0.515 0.562
HI
Direct on yield 0.471 0.140 0.259 0.262 0.226 0.189 0.442 0.461 0.275 0.261
Indirect via PL −0.007 0.000 −0.006 −0.012 0.005 0.008 0.012 0.023 0.000 −0.002
Indirect via PSN 0.041 0.056 −0.024 −0.038 −0.011 −0.014 −0.014 −0.012 0.010 0.012
Indirect via PGN 0.009 −0.029 0.008 0.040 0.120 0.134 0.067 0.064 0.030 0.037
Indirect via GDM 0.164 0.017 0.173 0.191 0.198 0.247 0.101 0.118 0.117 0.146
r 0.677 0.183 0.410 0.444 0.538 0.564 0.609 0.653 0.433 0.454
R2 0.541 0.099 0.302 0.359 0.418 0.461 0.422 0.482 0.342 0.394
Residual 0.677 0.949 0.836 0.800 0.763 0.735 0.760 0.720 0.811 0.779
PL, panicle length; PSN, panicle spikelet number; PGN, panicle grain number; GDM, panicle grain dry mass; HI, harvest index.
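The R2 and Residual rows of Table 6 can be reproduced from the other rows under the standard path-analysis identities, which are assumed here rather than quoted from the paper: R2 is the sum of each direct effect multiplied by the corresponding correlation with yield, and the residual effect is the square root of 1 − R2. The sketch also computes the relative change in R2 between the original and predicted data groups, which is how the percentage gains and losses quoted in the text can be understood. Function names are illustrative.

```python
import math


def r_squared(direct, r):
    """Coefficient of determination as the sum of direct effect x correlation."""
    return sum(d * c for d, c in zip(direct, r))


def residual_effect(r2):
    """Effect of the residual variable, sqrt(1 - R2)."""
    return math.sqrt(1.0 - r2)


def relative_change(orig, pred):
    """Relative change in R2 when moving from original to predicted data."""
    return (pred - orig) / orig


# Original data, 2015, copied from Table 6 (order: PL, PSN, PGN, GDM, HI).
direct_2015 = [-0.099, 0.133, 0.021, 0.255, 0.471]
r_2015 = [0.044, 0.418, 0.480, 0.631, 0.677]
r2_2015 = r_squared(direct_2015, r_2015)
residual_2015 = residual_effect(r2_2015)

# R2 row of Table 6 for both data groups.
r2_orig = {2015: 0.541, 2016: 0.302, 2017: 0.418, 2018: 0.422, 2019: 0.342}
r2_pred = {2015: 0.099, 2016: 0.359, 2017: 0.461, 2018: 0.482, 2019: 0.394}
changes = {year: relative_change(r2_orig[year], r2_pred[year]) for year in r2_orig}
mean_gain = sum(changes[year] for year in (2016, 2017, 2018, 2019)) / 4

print(round(r2_2015, 3), round(residual_2015, 3), round(mean_gain, 3))
```

With the rounded table values, the 2015 check recovers R2 of about 0.541 and a residual of about 0.677, and the mean gain for 2016 to 2019 is close to the 14.6% stated in the text.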
70% of the combinations. Regarding the indirect effects, the removal of the effects of the model parameters resulted in a change in the direction of the coefficients in 9.33% of the combinations, and promoted changes greater than 50% in the magnitude of the coefficients in 19.33%, with the maintenance of the pattern of indirect effects in 71.34% of the combinations. Another aspect to be highlighted is that, again, most of the changes, whether in direction or in magnitude,
Table 7. Direct and indirect effects of the explanatory variables on yield of oats cultivated in 5 years (2015, 2016, 2017, 2018, and 2019), with the
addition of k value (0.05) in the diagonal of the X’X matrix and considering the original and predicted data (n = 330).
Effects Year 2015 Year 2016 Year 2017 Year 2018 Year 2019
Orig. Pred. Orig. Pred. Orig. Pred. Orig. Pred. Orig. Pred.
PL
Direct on yield −0.093 0.022 −0.077 −0.075 −0.047 −0.071 −0.122 −0.165 −0.012 −0.052
Indirect via PDM 0.028 0.000 0.103 0.104 0.045 0.048 0.032 0.030 0.078 0.102
Indirect via PSN 0.038 −0.003 −0.038 −0.051 −0.030 −0.031 −0.013 −0.008 0.024 0.028
Indirect via PGN 0.009 0.014 0.011 0.042 0.039 0.038 0.034 0.023 0.039 0.046
Indirect via GDM 0.034 0.000 0.108 0.120 0.029 0.033 0.022 0.024 0.051 0.060
Indirect via HI 0.033 −0.002 0.019 0.040 −0.030 −0.026 −0.045 −0.062 0.006 0.009
r 0.049 0.031 0.126 0.181 0.005 −0.010 −0.092 −0.158 0.187 0.192
PDM
Direct on yield 0.111 0.028 0.214 0.208 0.172 0.192 0.108 0.106 0.164 0.201
Indirect via PL −0.024 0.000 −0.037 −0.038 −0.012 −0.018 −0.036 −0.046 −0.005 −0.026
Indirect via PSN 0.073 −0.121 −0.067 −0.085 −0.044 −0.050 −0.030 −0.022 0.042 0.047
Indirect via PGN 0.025 0.098 0.022 0.080 0.163 0.173 0.100 0.090 0.079 0.091
Indirect via GDM 0.144 0.056 0.228 0.242 0.138 0.164 0.084 0.100 0.113 0.122
Indirect via HI 0.245 0.003 0.069 0.079 0.121 0.120 0.189 0.219 0.082 0.086
r 0.574 0.064 0.429 0.486 0.538 0.580 0.415 0.447 0.474 0.521
PSN
Direct on yield 0.117 −0.299 −0.097 −0.122 −0.080 −0.090 −0.061 −0.044 0.068 0.074
Indirect via PL −0.031 0.000 −0.031 −0.031 −0.017 −0.025 −0.026 −0.029 −0.004 −0.019
Indirect via PDM 0.069 0.011 0.148 0.144 0.094 0.106 0.053 0.051 0.102 0.127
Indirect via PGN 0.027 0.148 0.025 0.089 0.154 0.163 0.137 0.125 0.101 0.115
Indirect via GDM 0.089 0.021 0.158 0.169 0.071 0.085 0.041 0.048 0.070 0.077
Indirect via HI 0.141 −0.023 0.057 0.064 0.027 0.028 0.073 0.086 0.046 0.047
r 0.413 −0.142 0.261 0.312 0.248 0.267 0.216 0.237 0.382 0.421
PGN
Direct on yield 0.034 0.216 0.028 0.099 0.225 0.232 0.165 0.149 0.117 0.132
Indirect via PL −0.023 0.001 −0.030 −0.032 −0.008 −0.011 −0.025 −0.025 −0.004 −0.018
Indirect via PDM 0.081 0.013 0.169 0.167 0.125 0.143 0.066 0.064 0.110 0.138
Indirect via PSN 0.090 −0.205 −0.085 −0.109 −0.055 −0.063 −0.051 −0.037 0.058 0.064
Indirect via GDM 0.106 0.024 0.183 0.197 0.103 0.125 0.052 0.062 0.076 0.084
Indirect via HI 0.190 −0.016 0.079 0.088 0.120 0.116 0.150 0.167 0.069 0.073
r 0.479 0.033 0.344 0.411 0.509 0.541 0.357 0.379 0.427 0.474
GDM
Direct on yield 0.145 0.057 0.229 0.243 0.140 0.166 0.085 0.101 0.114 0.123
Indirect via PL −0.022 0.000 −0.036 −0.037 −0.010 −0.014 −0.031 −0.039 −0.005 −0.025
Indirect via PDM 0.109 0.028 0.212 0.206 0.169 0.189 0.107 0.105 0.163 0.199
Indirect via PSN 0.072 −0.108 −0.067 −0.085 −0.041 −0.046 −0.029 −0.021 0.041 0.046
Indirect via PGN 0.025 0.091 0.023 0.081 0.165 0.175 0.101 0.091 0.078 0.091
Indirect via HI 0.295 0.028 0.100 0.112 0.156 0.150 0.238 0.267 0.118 0.122
r 0.624 0.095 0.461 0.519 0.580 0.619 0.471 0.503 0.510 0.555
HI
Direct on yield 0.459 0.140 0.277 0.281 0.254 0.230 0.435 0.451 0.286 0.279
Indirect via PL −0.007 0.000 −0.005 −0.011 0.006 0.008 0.012 0.022 0.000 −0.002
Indirect via PDM 0.059 0.001 0.053 0.059 0.082 0.100 0.047 0.051 0.047 0.062
Indirect via PSN 0.036 0.049 −0.020 −0.028 −0.009 −0.011 −0.010 −0.008 0.011 0.013
Indirect via PGN 0.014 −0.025 0.008 0.031 0.106 0.117 0.057 0.055 0.028 0.035
Indirect via GDM 0.093 0.011 0.083 0.097 0.086 0.109 0.046 0.060 0.047 0.054
r 0.655 0.176 0.396 0.429 0.525 0.552 0.587 0.631 0.419 0.440
R2 0.528 0.090 0.292 0.348 0.410 0.452 0.410 0.468 0.336 0.388
Residual 0.687 0.954 0.842 0.808 0.768 0.740 0.768 0.730 0.815 0.782
k 0.050
PL, panicle length; PDM, panicle dry mass; PSN, panicle spikelet number; PGN, panicle grain number; GDM, panicle grain dry mass; HI, harvest index.
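The ridge adjustment described in the caption of Table 7 (adding k to the diagonal of the X'X matrix) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: X'X is taken to be the correlation matrix among explanatory variables, direct effects are obtained by solving (X'X + kI)b = X'Y, and the indirect effect of variable i via variable j is r_ij × b_j. The three-variable correlation values are hypothetical and chosen so that two predictors are nearly collinear.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][j] * solution[j] for j in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def path_coefficients(corr_xx, corr_xy, k=0.0):
    """Direct effects from (corr_xx + k*I) b = corr_xy, plus the indirect-effect matrix."""
    n = len(corr_xy)
    ridge = [[corr_xx[i][j] + (k if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    direct = solve(ridge, corr_xy)
    # indirect[i][j]: effect of variable i on the response via variable j.
    indirect = [[corr_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    return direct, indirect


# Hypothetical correlations: predictors 1 and 2 are nearly collinear (r = 0.98).
corr_xx = [[1.00, 0.98, 0.50], [0.98, 1.00, 0.50], [0.50, 0.50, 1.00]]
corr_xy = [0.58, 0.63, 0.68]

plain_direct, _ = path_coefficients(corr_xx, corr_xy, k=0.0)
ridge_direct, ridge_indirect = path_coefficients(corr_xx, corr_xy, k=0.05)
print([round(b, 3) for b in plain_direct])
print([round(b, 3) for b in ridge_direct])
```

Without the adjustment the two collinear predictors receive large direct effects of opposite sign, one of them above unity. With k = 0.05 all direct effects fall below 1 in absolute value, the same qualitative change seen between Tables 5 and 7.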
related to estimates for 2015, coinciding with the highest rates of violation of assumptions.
In order to furnish a more precise analysis of the effect of removing model parameters upon the magnitude and direction of the coefficients, a Pearson’s correlation was calculated comparing these coefficients under different methods of analysis. The correlation explored the relationship between estimates derived from traditional path analysis, from traditional path analysis with the elimination of variables, and from path analysis with measures to address multicollinearity (ridge). Both direct and indirect effects of each variable were examined in this manner, and a comparison was also made between the data groups (original and predicted). For the variables PL, PGN, and HI, correlations ranging from moderate to strong (0.59 – 0.80) were obtained, regardless of the type of path analysis performed (Table 8). For the variables GDM and PDM, correlations from strong to very strong (≥0.71 to ≤0.96) were obtained, regardless of the type of path analysis. However, for PSN, there was no significant correlation between the path coefficients considering the original and predicted data groups, regardless of the type of path analysis performed.
Table 8. Pearson’s correlation coefficients between path coefficients (direct and indirect effects) obtained between data groups (original and predicted), for traditional path analysis (without considering multicollinearity), traditional path analysis with elimination of variables and path analysis with measures to address multicollinearity (ridge).
Explanatory variables Traditional^A With elimination^B Ridge^A
PL 0.68* 0.64* 0.65*
PDM 0.95* – 0.88*
PSN 0.01 n.s. −0.01 n.s. −0.04 n.s.
PGN 0.59* 0.64* 0.72*
GDM 0.96* 0.86* 0.82*
HI 0.80* 0.71* 0.68*
PL, panicle length; PDM, panicle dry mass; PSN, panicle spikelet number; PGN, panicle grain number; GDM, panicle grain dry mass; HI, harvest index.
^A n = 30.
^B n = 25.
Interpretation of path analysis in different data groups
When analysing the cause-effect relationships for the original data group vs the predicted data group, it was identified for the PL variable in years 2015, 2016, 2017, and 2019 that the direct effects are in the opposite direction to the correlation coefficients, with low magnitude values, regardless of the data group, suggesting that the correlation is caused by indirect effects (Table 5). Only for year 2018 do the correlation coefficient and the direct effects show similar sign and magnitude, that is, in this year the direct effects explain the true association between the traits. Regarding the indirect effects, there were very low magnitudes in the coefficients, confirming the absence of cause-effect of this variable on the yield.
For the PDM variable, in the original data group, positive direct effects of high magnitude were obtained, indicating that the direct effects explain the true association between the variables, regardless of the year. Similar responses were observed for the predicted data group, except for years 2015 and 2016. For 2015, the correlation coefficient and direct effect values were negligible. For 2016, the association between variables can be attributed to indirect effects. When analysing the indirect effects in 2016, we can observe a negative association of greater magnitude via GDM, PGN, and PSN.
For PSN, correlation coefficients and direct effects of the same sign were observed for years 2015 and 2019, regardless of the data group, but with negligible direct effects. In years 2016, 2017, and 2018, correlation coefficients of low magnitude and negative direct effects of low magnitude were obtained, indicating that the true association with yield can be explained by indirect effects. Indirect effects were negligible.
For PGN, the direct effects were of the same sign as the
correlation coefficients, but with magnitudes lower than 0.26, regardless of the data group or year, indicating that the true association of this variable with yield may be related to indirect effects. The coefficients referring to indirect effects were also of low magnitude (<0.19), with the highest values observed via PSN, PDM, and GDM.
For GDM, in the original data group, the direct effects were in the opposite direction and of large magnitude compared to the correlation coefficients, indicating that the direct effects must be considered in the analysis. When analysing the indirect effects, there was a large negative association via PDM, PGN, PSN, and HI. Similar results are observed for the predicted data group, in years 2015, 2017, 2018, and 2019. On the other hand, year 2016 suggested a cause-effect relationship between GDM and yield, via direct effect. For HI, correlation coefficients and direct effects with similar direction and magnitude were observed, regardless of the data group or year, suggesting that the true association between the variables is explained by the direct effects. In addition, there was an indirect contribution of GDM to these correlation coefficients (between HI and yield).
When interpreting the cause-effect relationships for the original and predicted data groups with variable elimination, it was observed for the PL variable that the direct effects and correlation coefficients were of low magnitude or showed a reverse direction, regardless of the group of variables, data or year, a condition that suggests that the true association between the variables must be explained by indirect effects (Table 6). In turn, the indirect association effects were of low magnitude, indicating the absence of a cause-effect relationship between PL and yield. For PSN, we can observe an inverse and low magnitude relationship between the correlation coefficients and the direct effects, regardless of the data group or year, suggesting that the true association between the variables was explained by the indirect effects. When analysing the indirect effects, in general, the coefficients were of low magnitude, indicating that there was no association between PSN and yield.
For the PGN variable, both the direct effects and the correlation coefficients were positive, but of low magnitude (<0.27), regardless of the data group and year, so the indirect effects must be considered to explain the true association between the variables. For the indirect effects, most of the PGN vs yield relationship was explained via PSN and PGN, but with path coefficients that did not exceed 0.20, indicating low magnitude. When considering the direct effects and the correlation coefficient for GDM, there were similar positive directions and magnitude, regardless of data group or year. However, the values of direct effects were 44% and 36.7% lower than the correlation coefficients, for the original and predicted data groups, respectively. Thus, a fraction of the association between the variables can be attributed to indirect effects, via PGN and HI.
Similar responses were observed for HI, that is, regardless of the year or data group, the correlation coefficients and direct effects were of the same sign and magnitude, so the direct effects explain a good part of these relationships. However, the indirect effects estimated for GDM also contributed to these relationships, a situation that can be understood because the HI is obtained by the ratio between GDM and PDM.
When interpreting the ridge path analysis for the original vs predicted data groups, correlation coefficients and direct effects of reverse direction or very low magnitudes were obtained for PL, regardless of the data group or year, suggesting that indirect effects explain the true association between the variables (Table 7). When analysing the indirect effects, we can verify that coefficients of low magnitudes were obtained, indicating once again a low association between PL and yield. For PDM, correlation coefficients and direct effects of the same sign, and with magnitudes that can be considered similar, were observed. This suggests a direct relationship between the variables. However, when considering the indirect effects, the correlation can also be explained via GDM and PGN, relationships that arise because both variables are components of the panicle and contribute to obtaining the absolute values of this trait.
In general, for PSN, the direct effects showed values with reverse direction or very low magnitudes, in relation to the correlation coefficients, regardless of the data group or year. This condition suggests that the true association between the variables is explained by indirect effects. When analysing the indirect effects, values with very low magnitude were identified (<0.11), suggesting the absence of a cause-effect relationship between these variables. For PGN, we can verify direct effects in the same direction as the correlation coefficients, regardless of the data group or year. However, only in years 2017, 2018, and 2019 were the direct effects of a magnitude that can be considered somewhat interesting. In these years, we can also verify indirect effects via PSN, GDM, and PDM.
For GDM, the correlation coefficients and direct effects were in the same direction, but with lower magnitudes, regardless of the data group or year, suggesting that the association can be explained by indirect effects. For the indirect effects, the path coefficients also did not exceed values of 0.24 but suggest that part of the correlation can be explained via PDM, GDM, and PSN. When analysing the correlation coefficients and the direct effects for HI, similar directions and values were identified, evidencing that the direct effects explain the association between the variables. However, when analysing the path coefficients for the indirect effects, a contribution of PGN and PDM toward these correlations was observed, a situation that arises due to the relationship between these variables for the calculation of HI.
Discussion
Statistical assumptions and simple correlation
The diagnosis of univariate normality and homoscedasticity of the residual variances allowed the identification of the
violation of statistical assumptions in all growing years. The use of the data transformation technique helped to improve the metrics of normality and homoscedasticity. However, it was not always effective in bypassing the violation of assumptions. The improvement in the statistics of the assumption tests is associated with the intrinsic characteristics involved in the development of the data transformation technique, which is based on selecting a lambda value that maximises the likelihood and minimises the residual (Kutner et al. 2004).
When analysing the linear relationships between the yield components, considering the data groups studied, high rates of significant correlations were observed between the variables, regardless of the growth year. The high number of significant linear combinations is related to the number of observations, a situation that leads to significance even for correlation coefficients of low magnitude (Lúcio et al. 2013).
In general, all traits under study showed a correlation with yield, but with r values of low magnitude for PL, indicating that individually this variable has a lower influence on yield (Sari et al. 2017). The PDM showed high correlations with PGN, GDM and PSN, because these variables are panicle components. PGN and GDM also showed a good correlation, suggesting that panicles with higher grain numbers result in higher values of grain mass. Kaziu et al. (2019), studying the linear relationships of yield components of oats, found a strong to very strong association of yield with panicle mass, harvest index, and panicle grain number and a weak association with panicle length (r = 0.31), similar to the results found in this study. On the other hand, Dumlupinar et al. (2011), analysing the correlation between the yield components of different oat genotypes, showed that PDM and PGN were highly correlated with each other, as well as the thousand grain mass (TGM) vs PDM, PDM vs yield and yield vs TGM, showing that grain mass plays a determining role for yield.
Benin et al. (2003), analysing the linear relationship between the yield characteristics of oats, found that panicle mass, the number of panicles per plant, and the average mass of grains have a positive correlation with the yield of the crop. Similar responses were observed by Caierão et al. (2001). The authors found a positive correlation between the number of grains per panicle and grain mass and yield, demonstrating that they can be used in indirect selection for yield. In a study with the black oat crop, Meira et al. (2019a) found a strong positive correlation between PGN and HI and yield, thus demonstrating that increases in yield occur as the number and mass of grains per plant increase.
When reviewing individually the correlation coefficients obtained from the combination of each pair of variables in the 5 years between the data groups, the response pattern was maintained in years 2016, 2017, 2018, and 2019, with minor changes in the magnitude of the coefficients. In contrast, the most significant divergences between the coefficients were obtained in 2015. These responses can be attributed to the extreme weather conditions in critical periods of the crop, which altered its development and yield.
Path analysis
Pearson’s correlation indicated the existence of a linear relationship between the explanatory variables and yield, regardless of the growth year and data group. When carrying out a new diagnosis of multicollinearity, considering only the group of explanatory variables, a violation of the assumption was observed in all the scenarios studied. The variable elimination technique was an effective alternative to overcome the problems related to multicollinearity. Meira et al. (2019c), when working with a crop of black oats, also reported the occurrence of severe multicollinearity in the correlation matrix between the explanatory traits, and the authors chose to eliminate the variables to circumvent the multicollinearity. Studies in the literature demonstrate the need to eliminate some explanatory variables to properly carry out path analysis with corn (Toebe and Cargnelutti Filho 2013a, 2013b; Olivoto et al. 2017; Toebe et al. 2017a, 2017b), tomato (Rodrigues et al. 2010; Sari et al. 2017, 2018), pepper (Carvalho et al. 1999; Moreira et al. 2022) and jabuticaba (Salla et al. 2015) crops.
In general, the removal of the model parameters promoted changes in the capacity to explain the variance in yield by the independent variables. Furthermore, changes in the direction and magnitude of path coefficients (direct and indirect) were observed in all years and types of path analysis performed, especially for 2015. This is a response resulting from the variation in the meteorological conditions of this year compared to the others. In addition, when path analysis was performed without using techniques to circumvent the multicollinearity between the explanatory variables, values for path coefficients that exceeded unity (1) were observed, regardless of the data group. This situation is indicative of the existence of bias in the analyses, such as multicollinearity, aspects that can impact the biological application of the results (Olivoto et al. 2017; Toebe et al. 2017b). These observations corroborate what was verified in the multicollinearity diagnosis, which identified the assumption violation.
Considering that to overcome the problems with multicollinearity, it was necessary to eliminate the variable PDM and that studies suggest that this trait has a high correlation with yield and a good prospect of indirect selection via PDM (Caierão et al. 2001, 2006; Mantai et al. 2020a) and that the grain mass corresponds to about 80% to 85% of the panicle mass (Caierão et al. 2001), ridge path analysis was performed. However, for ridge path analysis, the removal of model parameters also resulted in changes in the ability to explain variance in yield, as well as in the magnitude and direction of path coefficients.
In general, when analysing cause-effect relationships, considering the types of path analysis performed, the years and the data groups, there was no cause–effect relationship
of the PL and PSN variables with yield. The results obtained in this study are
similar to those found in the literature, regardless of the data group or type
of path analysis performed. Benin et al. (2003) reported that the variables
panicle mass, number of panicles per plant, and average grain mass have the
greatest direct and indirect effects on yield, and are thus the main
characteristics to be considered for the selection of superior genotypes for
white oat. Meira et al. (2019a) found no direct effect of panicle length on
black oat yield, but a moderate indirect positive effect via the number of
grains per panicle. The authors also observed that the number of grains per
panicle exerts a direct positive influence of high magnitude on the yield.
Moradi et al. (2005) and Bibi et al. (2012) found that the number of grains per
panicle has a greater direct effect on oat yield.

Conclusions

The occurrence of multicollinearity among the explanatory variables resulted in
path coefficients with magnitudes that exceed unity and without biological
application. Removing the model’s parameters modified the path coefficients,
with average changes of 10.5% and 13.3% in the direction of the associations
and 24.7% and 23.0% in the magnitude of the direct and indirect effects,
respectively, regardless of the type of path analysis performed.

The indirect selection for the white oat yield based on the harvest index,
considering the grain and panicle dry mass, is an interesting alternative for
selecting cultivars with higher production potential.

Supplementary material

Supplementary material is available online.

References

Alessi O, Dornelles EF, Mamann ÂTWd, Kraisig AR, Henrichsen L, Marolli A, Pansera V, Silva JAGd (2018) Aplicação de modelos de regressão e de adaptabilidade e estabilidade na identificação de cultivares de aveia branca com maior resistência genética a doenças foliares. Proceeding Series of the Brazilian Society of Computational and Applied Mathematics 6, 1–7. doi:10.5540/03.2018.006.02.0424
Alvares CA, Stape JL, Sentelhas PC, de Moraes Gonçalves JL, Sparovek G (2013) Köppen’s climate classification map for Brazil. Meteorologische Zeitschrift 22, 711–728. doi:10.1127/0941-2948/2013/0507
Benin G, Carvalho FIFd, Oliveira ACd, Marchioro VS, Lorencetti C, Kurek AJ, Silva JAG, Cargnin A, Simoni D (2003) Estimativas de correlações e coeficientes de trilha como critérios de seleção para rendimento de grãos em aveia. Revista Brasileira de Agrociência 9, 9–16.
Bibi A, Shahzad AN, Sadaqat HA, Tahir MHN, Fatima B (2012) Genetic characterization and inheritance studies of oats (Avena sativa L.) for green fodder yield. International Journal of Biology, Pharmacy and Allied Sciences 1, 450–460.
Borchers HW (2021) pracma: practical numerical math functions. CRAN. R-project. Available at https://cran.r-project.org/package=pracma
Bowman A, Crawford E, Alexander G, Bowman RW (2007) rpanel: simple interactive controls for R functions using the tcltk package. Journal of Statistical Software 17, 1–18. doi:10.18637/jss.v017.i09
Box GEP, Cox DR (1964) An analysis of transformations. Journal of the Royal Statistical Society: Series B (Methodological) 26, 211–243. doi:10.1111/j.2517-6161.1964.tb00553.x
Caierão E, Carvalho FIFd, Pacheco MT, Lonrecetti C, Marchioro VS, Silva JG (2001) Seleção indireta em aveia para o incremento no rendimento de grãos. Ciência Rural 31, 231–236. doi:10.1590/S0103-84782001000200007
Caierão E, Carvalho FIFd, Floss EL (2006) Seleção indireta para o incremento do rendimento de grãos em aveia. Ciência Rural 36, 1126–1131. doi:10.1590/S0103-84782006000400013
Cargnelutti Filho A, Toebe M, Alves BM, Burin C, Santos GOd, Facco G, Neu IMM (2015) Relações lineares entre caracteres de aveia preta. Ciência Rural 45, 985–992. doi:10.1590/0103-8478cr20140500
Carvalho CGPd, Oliveira VR, Cruz CD, Casali VWD (1999) Análise de trilha sob multicolinearidade em pimentão. Pesquisa Agropecuária Brasileira 34, 603–613. doi:10.1590/S0100-204X1999000400011
Cassol LC, Piva JT, Soares AB, Assmann AL (2011) Produtividade e composição estrutural de aveia e azevém submetidos a épocas de corte e adubação nitrogenada. Revista Ceres 58, 438–443. doi:10.1590/S0034-737X2011000400006
Conab (2019) Acompanhamento da safra brasileira de grãos: safra 2019/20 - Terceiro levantamento. 3. v. 7. Companhia Nacional de Abastecimento, Brasília, DF. Available at https://www.conab.gov.br/info-agro/safras/graos/boletim-da-safra-de-graos?start=50 [accessed 21 January 2020]
Conab (2021) Acompanhamento da safra brasileira de grãos: safra 2021/22 - Segundo levantamento. 2. v. 9. Companhia Nacional de Abastecimento, Brasília, DF. Available at https://www.conab.gov.br/info-agro/safras/graos/boletim-da-safra-de-graos?start=20 [accessed 15 December 2021]
Couto MRM, Lúcio AD, Lopes SJ, Carpes RH (2009) Transformações de dados em experimentos com abobrinha italiana em ambiente protegido. Ciência Rural 39, 1701–1707. doi:10.1590/S0103-84782009005000110
Cruz CD (2005) ‘Princípios de Genética Quantitativa.’ (UFV: Viçosa - MG, Brazil)
Cruz CD, Regazzi AJ, Carneiro PCS (2012) ‘Modelos biométricos aplicados ao melhoramento genético.’ (UFV: Viçosa, Brazil)
Dornelles EF, Silva JAGd, Carvalho IR, Alessi O, Pansera V, Lautenchleger F, Stumm EMF, Carbonera R, Bárta RL, Tisott JV (2020) Resistance of oat cultivars to reduction in fungicide use and a longer interval from application to harvest to promote food security. Genetics and Molecular Research 19, 1–12. doi:10.4238/gmr18542
Dumlupinar Z, Maral H, Kara R, Dokuyucu T, Akkaya A (2011) Evaluation of Turkish oat landraces based on grain yield, yield components and some quality traits. Turkish Journal of Field Crops 16, 190–196.
Falconer DS, Mackay TF (1997) ‘Introduction to quantitative genetics.’ (Pearson Education India: London)
Faraway J (2016) faraway: functions and datasets for books by Julian Faraway. Cran. R-project. Available at https://cran.r-project.org/package=faraway
Ferreira DF (2009) ‘Estatística Básica.’ (UFLA: Lavras, Brazil)
Fox J, Weisberg S (2019) ‘An R companion to applied regression.’ (Sage)
Hair JF, Black WC, Babin BJ, Anderson RE, Tatham RL (2009) ‘Análise multivariada de dados.’ (Bookman)
Harrell FE (2021) Frank E Harrell. CRAN. R-project. Available at https://cran.r-project.org/package=Hmisc
Kaziu I, Kashta F, Celami A (2019) Estimation of grain yield, grain components and correlations between them in some oat cultivars. Albanian Journal of Agricultural Sciences 18, 13–19. Available at https://www.proquest.com/scholarly-journals/estimation-grain-yield-components-correlations/docview/2317872862/se-2
Korkmaz S, Goksuluk D, Zararsiz G (2014) MVN: an R package for assessing multivariate normality. The R Journal 6, 151–162. doi:10.32614/RJ-2014-031
Kraisig AR, Silva JAGd, Carvalho IR, Mamann ÂTWD, Corso JS, Norbert L (2020) Time of nitrogen supply in yield, industrial and chemical quality of oat grains. Revista Brasileira de Engenharia Agrícola e Ambiental 24, 700–706. doi:10.1590/1807-1929/agriambi.v24n10p700-706
Kutner MH, Nachtsheim CJ, Neter J (2004) ‘Applied linear regression models.’ (McGraw-Hill: Boston, MA, USA)
Leite JGDB, Federizzi LC, Bergamaschi H (2012) Mudanças climáticas e seus possíveis impactos aos sistemas agrícolas no Sul do Brasil.
Agrária 7, 337–343. Available at http://www.agraria.pro.br/ojs32/index.php/RBCA/article/view/v5i3a1239/993
Lúcio AD, Storck L, Krause W, Gonçalves RQ, Nied AH (2013) Relações entre os caracteres de maracujazeiro-azedo. Ciência Rural 43, 225–232. doi:10.1590/S0103-84782013000200006
Mantai RD, Silva JAGd, Marolli A, Mamann ÂTWd, Sawicki S, Krüger CAMB (2017) Simulation of oat development cycle by photoperiod and temperature. Revista Brasileira de Engenharia Agrícola e Ambiental 21, 3–8. doi:10.1590/1807-1929/agriambi.v21n1p3-8
Mantai RD, Silva JAGd, Binelo MO, Sausen ATZR, Rossi DS, Corso JS (2020a) Nitrogen management in the relationships between oat inflorescence components and productivity. Revista Brasileira de Engenharia Agrícola e Ambiental 24, 385–393. doi:10.1590/1807-1929/agriambi.v24n6p385-393
Mantai RD, Silva JAGd, Scremin OB, Carvalho IR, Magano DA, Fachinetto JM, Lautenchleger F, Rosa JAd, Peter CL, Berlezi JD, Babeski CM (2020b) Nitrogen levels in oat grains and its relation to productivity. Genetics and Molecular Research 19, 1–13. doi:10.4238/gmr18569
Meira D, Meier C, Olivoto T, Follmann DN, Rigatti A, Lunkes A, Marchioro VS, Souza VQd (2019a) Multivariate analysis reveals genetic divergence and promising traits for indirect selection in black oat. Revista Brasileira de Ciências Agrárias 14, 1–7. doi:10.5039/agraria.v14i4a6514
Meira D, Meier C, Olivoto T, Nardino M, Klein LA, Moro ED, Fassini F, Marchioro VS, Souza VQd (2019b) Estimates of genetic parameters between and within black oat populations. Bragantia 78, 43–51. doi:10.1590/1678-4499.2018116
Meira D, Meier C, Olivoto T, Nardino M, Rigatti A, Klein LA, Caron BO, Marchioro VS, Souza VQD (2019c) Phenotypic variance of black oat growing in crop seasons reveals genetic effects predominance. Anais Da Academia Brasileira De Ciências 91, e20180036. doi:10.1590/0001-3765201920180036
Montgomery DC, Peck EA (1982) ‘Introduction to linear regression analysis.’ (John Wiley and Sons, Inc.: New York, NY, USA)
Moradi M, Rezai A, Arzani A (2005) Path analysis for yield and related traits in oats. Journal of Science and Technology of Agriculture and Natural Resources 9, 173–180.
Moreira FB, Cecato U, Prado INd, Wada FY, Rêgo FCdA, Nascimento WGd (2008) Avaliação de aveia preta cv Iapar 61 submetida a níveis crescentes de nitrogênio em área proveniente de cultura de soja. Acta Scientiarum. Animal Sciences 23, 815–821. doi:10.4025/actascianimsci.v23i0.2608
Moreira SO, Gonçalves LSA, Rodrigues R, Sudré CP, Júnior ATdA, Medeiros AM (2022) Correlações e análise de trilha sob multicolinearidade em linhas recombinadas de pimenta (Capsicum annuum L.). Revista Brasileira de Ciências Agrárias 8, 15–20. doi:10.5039/agraria.v8i1a1726
Olivoto T, de Souza VQ, Nardino M, Carvalho IR, Ferrari M, de Pelegrin AJ, Szareski VJ, Schmidt D (2017) Multicollinearity in path analysis: a simple method to reduce its effects. Agronomy Journal 109, 131–142. doi:10.2134/agronj2016.04.0196
R Core Team (2021) R: a language and environment for statistical computing. Available at https://www.r-project.org/
Rodrigues GB, Marim BG, Silva DJHd, Mattedi AP, Almeida VdS (2010) Análise de trilha de componentes de produção primários e secundários em tomateiro do grupo Salada. Pesquisa Agropecuária Brasileira 45, 155–162. doi:10.1590/S0100-204X2010000200006
Royston JP (1983) Some techniques for assessing multivariate normality based on the Shapiro-Wilk W. Journal of the Royal Statistical Society Series C (Applied Statistics) 32, 121–133. doi:10.2307/2347291
Salla VP, Danner MA, Citadin I, Sasso SAZ, Donazzolo J, Gil BV (2015) Análise de trilha em caracteres de frutos de jabuticabeira. Pesquisa Agropecuária Brasileira 50, 218–223. doi:10.1590/S0100-204X2015000300005
Sari BG, Lúcio AD, Santana CS, Lopes SJ (2017) Linear relationships between cherry tomato traits. Ciência Rural 47, e20160666. doi:10.1590/0103-8478cr20160666
Sari BG, Lúcio AD, Olivoto T, Krysczun DK, Tischler AL, Drebes L (2018) Interference of sample size on multicollinearity diagnosis in path analysis. Pesquisa Agropecuária Brasileira 53, 769–773. doi:10.1590/s0100-204x2018000600014
Shapiro SS, Wilk MB (1965) An analysis of variance test for normality (complete samples). Biometrika 52, 591–611. doi:10.1093/biomet/52.3-4.591
Silva ARd, Malafaia G, Menezes IPP (2017) biotools: an R function to predict spatial gene diversity via an individual-based approach. Genetics and Molecular Research 16, 1–6. doi:10.4238/gmr16029655
Silva JAGd, Mamann ÂTWd, Scremin OB, Carvalho IR, Pereira LM, Lima ARCd, Lautenchleger F, Basso NCF, Argenta CV, Berlezi JD, Porazzi FU, Matter EM, Norbert L (2020) Biostimulants in the indicators of yield and industrial and chemical quality of oat grains. Journal of Agricultural Studies 8, 68–87. doi:10.5296/jas.v8i2.15728
Steel RGD, Torrie JH, Dickey DA (1997) ‘Principles and procedures of statistics: a biometrical approach.’ (MH Book: New York, NY, USA)
Tedesco MJ, Gianello C, Bissani CA, Bohnen H, Volkweiss SJ (1995) ‘Análise de solo, plantas e outros materiais.’ (UFRGS: Porto Alegre, Brazil)
Tierney L (2021) tkrplot: TK Rplot. CRAN. R-project. Available at https://cran.r-project.org/package=tkrplot
Toebe M, Cargnelutti Filho A (2013a) Não normalidade multivariada e multicolinearidade na análise de trilha em milho. Pesquisa Agropecuária Brasileira 48, 466–477. doi:10.1590/S0100-204X2013000500002
Toebe M, Cargnelutti Filho A (2013b) Multicollinearity in path analysis of maize (Zea mays L.). Journal of Cereal Science 57, 453–462. doi:10.1016/j.jcs.2013.01.014
Toebe M, Cargnelutti Filho A, Storck L, Lúcio AD (2017a) Sample size for estimation of direct effects in path analysis of corn. Genetics and Molecular Research 16, 1–23. doi:10.4238/gmr16029523
Toebe M, Cargnelutti Filho A, Storck L, Lúcio AD (2017b) Direct effects on scenarios and types of path analyses in corn hybrids. Genetics and Molecular Research 16, 1–15. doi:10.4238/gmr16019529
Vencovski R, Barriga P (1992) ‘Genética biométrica no fitomelhoramento.’ (Sociedade Brasileira de Genetica: Ribeirão Preto - SP, Brazil)
Data availability. The data that support this study will be shared upon reasonable request to the corresponding author.
Conflicts of interest. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence
the work reported in this paper.
Declaration of funding. This research was funded by Coordination for Improvement of Higher Education Personnel (Capes-Brazil) (Finance code 001, Process
N°.88887.499817/2020-00).
Acknowledgements. We thank the members of the research groups on Technical Systems of Agricultural Production at the Regional University of Northwestern
Rio Grande do Sul State and the research group on Agricultural Experimentation at the Federal University of Santa Maria for help in this project. This paper forms
part of the PhD thesis of Jaqueline Sgarbossa (2023).
Author affiliations
(A) Department of Plant Science, Federal University of Santa Maria, Santa Maria, Rio Grande do Sul, Brazil.
(B) Department of Agronomy, Regional University of Northwestern Rio Grande do Sul State, Ijuí, Rio Grande do Sul, Brazil.
(C) Department of Agronomy and Natural Sciences, Federal University of Santa Maria Campus Frederico Westphalen, Frederico Westphalen, Rio Grande do Sul, Brazil.
(D) Federal University of Pampa, Itaqui, Rio Grande do Sul, Brazil.
(E) Department of Plant Science, Federal University of Santa Catarina, Florianópolis, Santa Catarina, Brazil.
(F) Department of Forest Science, Federal University of Paraná, Curitiba, Paraná, Brazil.