A Comparison of PLS and ML Bootstrapping Techniques in SEM: A Monte Carlo Study
Pratyush N. Sharma
Joseph M. Katz Graduate School of Business, University of Pittsburgh
e-mail: [email protected]

Kevin H. Kim
School of Education and Joseph M. Katz Graduate School of Business, University of Pittsburgh
e-mail: [email protected]

1 Introduction

Structural Equation Modeling techniques, such as covariance-based SEM (CBSEM) and Partial Least Squares based SEM (PLS), have gained enormous popularity as key multivariate analysis methods in empirical research in the past
few years. These techniques have been applied in diverse disciplines such as management information systems (MIS) (Ringle, Sarstedt, & Straub, 2012), marketing (Reinartz, Haenlein, & Henseler, 2009), and psychology (MacCallum & Austin, 2000). While the two techniques share many similarities, they differ substantially, especially in the estimation approaches they utilize. CBSEM focuses on estimating a set of model parameters so that the theoretical covariance matrix implied by the system of structural equations is as close as possible to the sample covariance matrix (Reinartz et al., 2009). One of the most common estimation methods in CBSEM is Maximum Likelihood (ML), which assumes multivariate normality and large-sample theory. However, since researchers often work with relatively small samples from non-normal populations, bootstrap re-sampling
offers a viable alternative (Nevitt & Hancock, 2001). Unlike CBSEM, PLS does not work with latent variables; rather, it works with block variables, and estimates model parameters to maximize the variance explained for all endogenous constructs through a series of ordinary least squares regressions (Reinartz et al., 2009). Thus, PLS-based structural equation models do not assume normality, and hence employ bootstrapping to obtain standard errors for hypothesis testing. Instead, they assume that the sample distribution is a reasonable representation of the intended population distribution (Hair, Ringle, & Sarstedt, 2011).
Bootstrapping is a nonparametric approach to statistical inference that, unlike traditional methods, does not make distributional assumptions about the parameters. Bootstrapping draws conclusions about the characteristics of a population strictly from the sample at hand, rather than making unrealistic assumptions about the population. That is, given the absence of information about the population, the sample is assumed to be the best estimate of the population. Hence, bootstrapping has advantages in situations where there is weak or no statistical theory about the distribution of a parameter, or when the underlying distributional assumptions needed for valid parametric inference are violated (Mooney, 1996).
Bootstrapping estimates the empirical sampling distribution of a parameter by re-sampling from a sample with replacement. Although each re-sample has the same number of elements as the original sample, the replacement method ensures that each of these re-samples is likely to be slightly and randomly different from the original sample (Mooney & Duval, 1993). If the sample is a good approximation of the population, then bootstrapping will provide a good approximation of the sampling distribution of the parameter. This necessitates a sufficiently large and unbiased sample. Unsurprisingly, researchers have cautioned against blind faith in bootstrapping and advocated careful investigation of its behavior, especially under conditions of insufficient sample size (Ichikawa & Konishi, 1995; Yung & Bentler, 1994).
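The re-sampling scheme just described can be sketched in a few lines. The following is a minimal Python/NumPy illustration, not the study's actual code (the references suggest the simulations were run in R); the sample and the statistic (the mean) are hypothetical stand-ins for any bootstrapped quantity.

```python
import numpy as np

rng = np.random.default_rng(42)

def naive_bootstrap(sample, statistic, n_boot=2000, rng=rng):
    """Collect the statistic from n_boot re-samples drawn with replacement."""
    n = len(sample)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # n index draws, with replacement
        reps[b] = statistic(sample[idx])
    return reps

# Hypothetical sample: for the mean, the bootstrap standard error should
# approximate the textbook formula s / sqrt(n).
sample = rng.normal(loc=1.0, scale=2.0, size=50)
boot_means = naive_bootstrap(sample, np.mean)
boot_se = boot_means.std(ddof=1)
formula_se = sample.std(ddof=1) / np.sqrt(len(sample))
```

The spread of `boot_means` is the empirical sampling distribution; its standard deviation serves as the standard error used in hypothesis testing.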
In structural equation models, bootstrapping allows for significance testing of a statistic (θ), such as a path coefficient or a factor loading. Such significance tests analyze the probability of observing a statistic of that size or larger when the null hypothesis H0 : θ = 0 is true. However, Bollen and Stine (1992) have argued that while such a naïve bootstrap procedure works well in many cases, it can fail if the sample that is used to generate bootstrap samples does not represent the population. Under naïve bootstrapping, the mean of the bootstrap population (i.e., the average of the observed sample) is unlikely to be equal to zero. In such cases, the bootstrap samples are drawn from a population for which the null hypothesis does not hold, regardless of whether H0 holds for the unknown population from which the original sample was drawn. Hence, the bootstrap values of the test statistic are likely to reject H0 too often (Bollen & Stine, 1992). This is most likely for misspecified models, or when the true population model is unknown. As a remedy, Bollen and Stine proposed a simple transformation of the data that seeks to make the null hypothesis true under bootstrap re-sampling by centering the data around the sample mean. Re-sampling from the centered values forces the mean of the bootstrap population to be zero so that H0 holds, resulting in fewer Type I errors (Bollen & Stine, 1992).
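The centering remedy can be illustrated for the simplest possible case, a bootstrap test of H0 : μ = 0 for a mean. This is only a schematic analogue of the Bollen-Stine procedure for covariance structures, and all values here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.3, scale=1.0, size=40)  # true mean unknown to the analyst

def t_stat(x):
    """Studentized mean: the statistic whose null distribution we bootstrap."""
    return x.mean() / (x.std(ddof=1) / np.sqrt(len(x)))

t_obs = t_stat(sample)

# Centering makes H0 true in the bootstrap population: re-samples are then
# drawn from data whose mean is exactly zero.
centered = sample - sample.mean()
n_boot = 2000
t_boot = np.empty(n_boot)
for b in range(n_boot):
    resample = rng.choice(centered, size=len(centered), replace=True)
    t_boot[b] = t_stat(resample)

# p-value: how often a null bootstrap statistic is at least as extreme as t_obs
p_value = np.mean(np.abs(t_boot) >= abs(t_obs))
```

Without the centering step, the bootstrap statistics would scatter around the observed (nonzero) mean rather than around zero, and the test would reject H0 too often, which is exactly the failure mode Bollen and Stine describe.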
Given the reliance of CBSEM and PLS on bootstrapping under most conditions, we argue that researchers need a better understanding of bootstrapping behavior, especially under the limiting conditions of sample size, distributional assumptions, and model misspecification. We also seek to contribute to the ongoing debate in the MIS and marketing literatures about the use of PLS under conditions of insufficient sample sizes and distributional assumption violations (Ringle et al., 2012; Reinartz et al., 2009; Marcoulides & Saunders, 2006; Hair et al., 2011; Marcoulides, Chin, & Saunders, 2009). While there are a few studies comparing CBSEM and PLS under various sets of design factors (Reinartz et al., 2009; Areskoug, 1982; Goodhue, Lewis, & Thompson, 2006), none of them has focused on the bootstrapping behaviors of the estimation methods used. Our goal is to provide researchers with some additional guidelines based on bootstrapping behavior when choosing between CBSEM and PLS. We conducted a Monte Carlo study to evaluate the efficiency and accuracy of model parameter recovery by naïve bootstrapping in PLS, and by ML and Bollen-Stine bootstrapping in CBSEM. Specifically, our research question is: In terms of the efficiency and accuracy of model parameter recovery, how does naïve bootstrapping in PLS compare to ML and Bollen-Stine bootstrapping in SEM across various conditions of sample size and distributional assumptions? We analyzed this question using a mixed ANOVA design.
2 Method
[Fig. 1 Population model: two exogenous constructs ξ1 and ξ2, each measured by three indicators (x1–x3 and x4–x6, with measurement errors δ1–δ6) loading at .60, predict the endogenous construct η1 (with disturbance ζ1) through structural paths of .30; η1 is measured by three indicators y1–y3 (errors ε1–ε3) loading at .60.]
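One replication's data could be generated from this population model roughly as follows. This is a sketch, not the study's generation code: the excerpt does not state the correlation between ξ1 and ξ2 or the variances, so the sketch assumes standardized, uncorrelated exogenous constructs and unit-variance indicators.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate(n, loading=0.6, path=0.3, rng=rng):
    """Draw one sample of size n from the Fig. 1 population model
    (assumption: xi1 and xi2 uncorrelated, all variables standardized)."""
    xi1 = rng.normal(size=n)
    xi2 = rng.normal(size=n)
    # Disturbance variance keeps Var(eta1) = 1 when xi1, xi2 are uncorrelated
    zeta = rng.normal(scale=np.sqrt(1 - 2 * path**2), size=n)
    eta1 = path * xi1 + path * xi2 + zeta
    err_sd = np.sqrt(1 - loading**2)  # indicator error SD for unit variance
    x = np.column_stack([loading * lv + rng.normal(scale=err_sd, size=n)
                         for lv in (xi1, xi1, xi1, xi2, xi2, xi2)])
    y = np.column_stack([loading * eta1 + rng.normal(scale=err_sd, size=n)
                         for _ in range(3)])
    return x, y

x, y = simulate(100_000)
```

Under these assumptions the implied correlation between any x indicator and any y indicator is .6 × .3 × .6 ≈ .108, which a large simulated sample should reproduce closely.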
3 Results
In order to analyze whether there were any differences among the techniques in terms of achieving proper solutions, we checked for instances of non-convergence (i.e., the model did not converge within 500 iterations) and for the sign of the variance estimates. Table 1 presents the frequency pattern of non-convergence of ML-based CBSEM, PLS, and the three bootstrap techniques. We found that for ML-based CBSEM all non-convergences occurred at sample size 50, whereas PLS always converged. Bootstrap techniques produced more non-convergences and
Next, we analyzed the accuracy and efficiency of the bootstrap techniques. Since there were no model misspecifications in this study, we expected that ML and Bollen-Stine bootstrapping in SEM would result in similar accuracy and efficiency of parameter recovery. The ANOVA results showed that the bias, RMSD, and the averages of the standard deviations of the parameter estimates for both ML and Bollen-Stine bootstraps in CBSEM were similar, confirming our expectations. However, these estimates differed significantly when compared to naïve bootstrapping in PLS (Table 2). In terms of measurement model accuracy, we found that the mean bias and RMSD values for the retrieved factor loading estimates in naïve PLS bootstrapping were larger than in both ML and Bollen-Stine SEM bootstraps. Surprisingly, we found that as sample size increased, PLS bias increased, suggesting that naïve PLS bootstrapping overestimated the factor loadings at larger sample sizes. However, we also found that in general the naïve PLS bootstrap had larger bias
Table 2 Mean bias and RMSD of the measurement and structural models by sample size and estimation method. Since ML and Bollen-Stine SEM bootstrap values were similar, we only present ML bootstrap values.

                       Measurement Model          Structural Model
Sample Size  Method    Bias    SE    RMSD   SE    Bias    SE    RMSD   SE
 50          ML         .008  .002   .163  .002    .048  .012    .210  .007
             PLS        .115  .003   .214  .003   −.032  .003    .121  .004
100          ML         .000  .001   .103  .002    .009  .010    .147  .006
             PLS        .134  .002   .184  .002   −.060  .003    .097  .003
150          ML         .000  .001   .083  .002   −.008  .009    .111  .006
             PLS        .143  .002   .174  .002   −.076  .003    .100  .003
200          ML         .000  .001   .071  .002    .005  .009    .097  .006
             PLS        .148  .002   .167  .002   −.080  .003    .097  .003
500          ML        −.001  .001   .044  .002    .000  .009    .064  .006
             PLS        .153  .002   .160  .002   −.087  .003    .094  .003
but smaller RMSD than both the ML and Bollen-Stine bootstraps for the structural model estimates (i.e., the regression coefficients among latent variables). The RMSD values for the structural model suggested that the naïve PLS bootstrap outperformed the ML and Bollen-Stine SEM bootstraps up to a sample size of 200, after which the situation was reversed. The effect of the distributional conditions on bootstrap accuracy and efficiency was not significant.
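The accuracy measures above can be made concrete. Reading bias as the mean deviation of the estimates from the true parameter and RMSD as the root mean squared deviation across replications (a standard reading, assumed here rather than stated in the excerpt), a minimal sketch with hypothetical estimates is:

```python
import numpy as np

def bias_and_rmsd(estimates, true_value):
    """Bias: mean deviation from the true parameter (direction of error).
    RMSD: root mean squared deviation (combines bias and sampling spread)."""
    estimates = np.asarray(estimates)
    bias = estimates.mean() - true_value
    rmsd = np.sqrt(np.mean((estimates - true_value) ** 2))
    return bias, rmsd

# Hypothetical: 1000 replications of a loading estimator whose true value is
# .60, with a slight upward bias of .02 and sampling SD of .05.
rng = np.random.default_rng(1)
estimates = rng.normal(loc=0.62, scale=0.05, size=1000)
bias, rmsd = bias_and_rmsd(estimates, 0.60)
```

Because RMSD folds both systematic bias and random spread into one number, an estimator (like the naïve PLS bootstrap here) can show larger bias yet smaller RMSD than a less biased but noisier competitor.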
In terms of measurement model efficiency, we found that the mean standard errors of the ML and Bollen-Stine SEM bootstraps were smaller than those of the naïve PLS bootstrap up to a sample size of 200 (Table 3). However, at a sample size of 500, the naïve PLS bootstrap was as efficient as the ML and Bollen-Stine bootstraps. For structural model efficiency, we found that the naïve PLS bootstrap outperformed the ML and Bollen-Stine SEM bootstraps at all levels of sample size.
Table 3 Mean standard errors of the measurement and structural models by sample size and estimation method.

                       Measurement Model   Structural Model
Sample Size  Method     M      SE           M      SE
 50          ML         .181  .002          .276  .006
             PLS        .215  .005          .146  .003
100          ML         .107  .002          .183  .006
             PLS        .127  .005          .090  .003
150          ML         .085  .002          .146  .006
             PLS        .099  .005          .076  .003
200          ML         .074  .002          .123  .006
             PLS        .076  .005          .067  .003
500          ML         .045  .002          .079  .006
             PLS        .045  .005          .044  .003
4 Discussion
4.1 Conclusion
PLS outperformed ML at smaller sample sizes: not only did it always converge, it also produced smaller bias and RMSD than ML. However, as sample size increased, the difference between the techniques disappeared, and at larger sample sizes ML produced smaller bias and RMSD than PLS. PLS was more accurate than ML at reproducing the structural parameters, but less efficient for the measurement parameters. Hence, at large sample sizes ML is preferred over PLS, but at small sample sizes a researcher might benefit from using PLS over ML.
References
Areskoug, B. (1982). The first canonical correlation: Theoretical PLS analysis and simulation experiments. In K. G. Jöreskog & H. Wold (Eds.), Systems under indirect observation: Causality, structure, prediction. Amsterdam: North Holland.
Bollen, K. A., & Stine, R. A. (1992). Bootstrapping goodness-of-fit measures in structural equation modeling. Sociological Methods and Research, 21(2), 205-229.
Diamantopoulos, A., & Siguaw, J. A. (2006). Formative versus reflective indicators in organizational measure development: A comparison and empirical illustration. British Journal of Management, 17(4), 263-282.
Fleishman, A. I. (1978). A method for simulating non-normal distributions. Psychometrika, 43(4), 521-532.
Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13(3), 465-486.
Goodhue, D., Lewis, W., & Thompson, R. (2006). PLS, small sample size and statistical power in MIS research. IEEE Computer Society.
Hair, J. F., Ringle, C. M., & Sarstedt, M. (2011). PLS-SEM: Indeed a silver bullet. Journal of Marketing Theory and Practice, 19(2), 139-151.
Ichikawa, M., & Konishi, S. (1995). Application of bootstrap methods in factor analysis. Psychometrika, 60, 77-93.
MacCallum, R. C., & Austin, J. T. (2000). Applications of structural equation modeling in psychological research. Annual Review of Psychology, 51(1), 201-226.
Marcoulides, G. A., Chin, W., & Saunders, C. (2009). Foreword: A critical look at partial least squares modeling. MIS Quarterly, 33(1), 171-175.
Marcoulides, G. A., & Saunders, C. (2006). PLS: A silver bullet? A commentary on sample size issues in PLS modeling. MIS Quarterly, 30(2), 3-10.
Monecke, A. (n.d.). semPLS: An R package for structural equation models using partial least squares.
Mooney, C. (1996). Bootstrap statistical inference: Examples and evaluations for political science. American Journal of Political Science, 40(2), 570-602.
Mooney, C., & Duval, R. (1993). Bootstrapping: A nonparametric approach to statistical inference. Newbury Park, CA: Sage.
Nevitt, J., & Hancock, G. R. (2001). Performance of bootstrapping approaches to model test statistics and parameter standard error estimation in structural equation modeling. Structural Equation Modeling, 8(3), 353-377.
R Development Core Team. (n.d.). R: A language and environment for statistical computing.
Reinartz, W. J., Haenlein, M., & Henseler, J. (2009). An empirical comparison of the efficacy of covariance-based and variance-based SEM. International Journal of Research in Marketing, 26(4), 332-344.
Ringle, C. M., Sarstedt, M., & Straub, D. (2012). A critical look at the use of PLS-SEM in MIS Quarterly. MIS Quarterly, 36(1), 3-14.
Vale, C. D., & Maurelli, V. A. (1983). Simulating multivariate nonnormal distributions. Psychometrika, 48, 465-471.
Yung, Y. F., & Bentler, P. M. (1994). Bootstrap-corrected ADF test statistics in covariance structure analysis. British Journal of Mathematical and Statistical Psychology, 47(1), 63-84.