Tourism Sector, Travel Agencies, and Transport Suppliers: Comparison of Different Estimators in The Structural Equation Modeling
Abstract. The paper addresses the effect of external integration (EI) with transport suppliers on the efficiency
of travel agencies in tourism sector supply chains. The main aim is to compare different estimation
methods used in structural equation modeling (SEM), which is applied to discover possible relationships between EIs and efficiencies. The latter are calculated by means of data envelopment analysis (DEA). While
designing the structural equation model, exploratory and confirmatory factor analyses are also used as
preliminary statistical procedures. For the estimation of the parameters of the SEM model, three different methods are
explained, analyzed, and compared: the maximum likelihood (ML) method, the Bayesian Markov Chain Monte Carlo
(BMCMC) method, and the unweighted least squares (ULS) method. The study reveals that all three estimation methods
yield comparable parameter estimates. The results also provide evidence of good model fit.
Besides, the research confirms that stronger external integration with transport providers
leads to increased efficiency of travel agencies, which might be an interesting finding for operational
management.
Key words: Tourism Sector, Structural Equation Modeling, Estimation Methods, External Integration, Efficiency
of Travel Agencies.
I. INTRODUCTION
Structural equation modeling (SEM) is a family of advanced statistical tools for modeling the
relationships between different types of variables. It can handle a large number of
exogenous and endogenous variables, as well as unobserved latent variables (factors, constructs) expressed by linear combinations of the measured indicator (manifest, item) variables
[1, 2]. Since some of the variables involved in SEM are latent, structural equation modeling is
sometimes also called latent variable modeling. One of the primary goals of SEM is the estimation
of causal effects between the addressed variables. For this reason, SEM has also been
referred to as causal path modeling [1]. The latter represents an extension of multiple regression
analysis and provides an efficient framework for modeling the complex causal relationships among
multiple variables [3].
Another name for SEM is covariance structure modeling, since it investigates
particular covariance and correlation patterns among the treated variables, and
covariance analysis methods are used for SEM estimation [1, 2, 3]. The different names for SEM
are all consistent with Bollen's definition from 1989, which proposed that SEM is based on
three main analytical methodologies: (1) path analysis, (2) latent variable analysis and modeling,
and (3) covariance estimation methods [2, 4].
In statistical terms, SEM can be treated as an integration, generalization, and extension of
familiar general linear statistical models such as multiple regression, analysis of variance
(ANOVA), and factor analysis [1]. Owing to the broad spectrum of covariance analysis methods that
can provide accurate estimates, SEM can be applied to different types of data, such as
continuous, ordinal, longitudinal, cross-sectional, and so on [1, 2, 3, 5].
Since SEM is a confirmatory type of modeling, we can say, in simplified terms, that it combines
multiple regression analysis, simultaneous equation models, and confirmatory factor analysis (CFA)
into a comprehensive statistical modeling framework [1, 2, 3]. An SEM model comprises two main
submodels, the measurement submodel and the structural submodel, which can be estimated simultaneously [2]. This
means that the relations between observed indicators and latent variables (measurement
part) and the causal relations among the latent variables themselves (structural part) can
be evaluated in a single model [1, 3]. The only condition for simultaneous estimation of both
submodels is the use of a full-information estimation method [1, 3].
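As an illustration of the two submodels, one common way to write them is sketched here. The LISREL-style notation is an assumption and is not taken from this paper: $x$ and $y$ denote observed indicators, $\xi$ and $\eta$ the exogenous and endogenous latent variables, $\Lambda_x$ and $\Lambda_y$ the loading matrices, $B$ and $\Gamma$ the path coefficient matrices, and $\delta$, $\varepsilon$, $\zeta$ the error terms.

```latex
\begin{aligned}
x    &= \Lambda_x \, \xi + \delta         && \text{(measurement submodel, exogenous side)} \\
y    &= \Lambda_y \, \eta + \varepsilon   && \text{(measurement submodel, endogenous side)} \\
\eta &= B \, \eta + \Gamma \, \xi + \zeta && \text{(structural submodel)}
\end{aligned}
```

In the setting of this paper, $\xi$ would plausibly correspond to the external integration factors and $\eta$ to the efficiency factor.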
Applications of SEM that concentrate exclusively on the relationships between latent variables
and their observed indicators are usually referred to as CFA [1]. In this case, the SEM model
comprises only one part, the measurement submodel. In contrast to exploratory factor analysis
(EFA), CFA investigates how well the measured indicators characterize the unobserved latent
variables. For this purpose, statistical tests are applied, while the factor structures are
hypothesized in advance and then verified empirically. Although EFA does not test a particular
theory, but only derives a factor structure from the data, it can still be a useful preliminary
guideline for a subsequent CFA. In this way, the nature of the latent variables can be inspected,
and a preliminary insight into the relationships between factors and their measured indicators can be
gained [3, 6].
One of the main aims of SEM is to explore whether the hypothesized theoretical model
is consistent with the measured data [1]. For this purpose, different goodness-of-fit (GOF)
indices are used to verify whether a model defined by the researcher is consistent with the variance-covariance patterns in the data [2]. In other words, the GOF indices help us to
identify the level of plausibility and adequacy of the assumed relationships between the variables
addressed in the SEM model [1, 3].
According to [1, 2], the most frequently used estimation methods in SEM are the
maximum likelihood (ML) method, the generalized least squares (GLS) method, the unweighted least
squares (ULS) method, weighted least squares (WLS) methods, also called asymptotically
distribution-free (ADF) methods, and Bayesian Markov Chain Monte Carlo (BMCMC) methods. All of the
mentioned estimators have their advantages and weaknesses, related to issues such as normality
violations, adequacy of the sample size, and so on.
The ML estimator is probably the most popular among researchers because it is fairly robust
against violations of normality conditions [2]. Its solution maximizes the probability that the observed
covariances are drawn from a population whose variances and covariances are produced by the
process implied by the model, where a multivariate normal distribution is assumed [2].
Surprisingly, despite the extensive use of SEM in many areas, relatively little research
has been reported in the scholarly literature on comparing the results achieved by several
different estimation methods applied to the same real cases [7]. This is particularly true for the
tourism sector and tourism supply chains, which are addressed in this study. More precisely, the
paper investigates the possible impact of external integration (EI) with transport
providers on the efficiency (EFF) of travel agencies. For this purpose, the SEM model is constructed
on the basis of questionnaire data collected in a survey conducted among
selected Croatian agencies.
The main aim of the SEM model is to identify how differences in the level of integration
with the different kinds of suppliers (water, air, bus, and rail) influence the efficiency of the
agencies. From the existing literature, it appears that practically no similar research has been
done in the field. However, as noted in some previous studies [8-14], stronger integration of members
in tourism supply chains leads to improved performance and higher quality of
services. Since there is a considerable gap in the existing literature regarding research
of this kind, we believe that the findings of this study represent a major contribution
of this paper.
The efficiencies of the travel agencies are calculated by means of data envelopment
analysis (DEA). During the construction of the SEM model, EFA and CFA are also employed
as preliminary stages. Namely, it is recommended to conduct the CFA alone before estimating
the entire SEM model, since the measurement part must first be statistically evaluated separately.
The reason is to verify independently whether the hypothesized factor model, reflected in its indicators,
adequately fits the real data.
For the estimation of the parameters of the given SEM model, three different estimators, the ML estimator,
the BMCMC estimator, and the ULS estimator, are applied. The characteristics and
properties of these estimators are briefly outlined. Afterward, the estimation results achieved by
these estimators are compared and discussed.
When the SEM model is finally estimated and appropriately evaluated, it can be used to identify
the relationships between the measured item indicators and the latent EI factors. The causal
directional paths from these factors to the latent factor EFF can also be investigated. For these
calculations, the program package IBM SPSS V21 and its extension AMOS were applied.
Afterward, during the estimation process, the main issue is to find model parameters such that
the model-implied variances and covariances are as close as possible to the observed sample
variances and covariances [2].
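The idea of bringing the model-implied covariance matrix close to the sample covariance matrix can be sketched with the standard ML and ULS discrepancy functions. The code below is a minimal illustration restricted to 2x2 matrices; the helper names and the numerical matrices are hypothetical and are not taken from the study.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace2(m):
    return m[0][0] + m[1][1]

def f_ml(s, sigma):
    """ML discrepancy: ln|Sigma| + tr(S Sigma^-1) - ln|S| - p, with p = 2."""
    p = 2
    return (math.log(det2(sigma)) + trace2(matmul2(s, inv2(sigma)))
            - math.log(det2(s)) - p)

def f_uls(s, sigma):
    """ULS discrepancy: 0.5 * tr[(S - Sigma)^2]."""
    diff = [[s[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
    return 0.5 * trace2(matmul2(diff, diff))

# Hypothetical sample covariance matrix and two candidate model-implied matrices.
S = [[1.00, 0.45], [0.45, 1.00]]
sigma_good = [[1.00, 0.44], [0.44, 1.00]]
sigma_poor = [[1.00, 0.10], [0.10, 1.00]]

for name, sigma in [("good", sigma_good), ("poor", sigma_poor)]:
    print(name, round(f_ml(S, sigma), 5), round(f_uls(S, sigma), 5))
```

Both functions are zero when the implied matrix equals the sample matrix and grow as the two diverge; an estimator searches for the parameter values that minimize the chosen function.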
Fig. 2 shows the estimation methods most frequently used in the SEM modeling process. The
details of these methods can be found in the scholarly literature [1, 3, 6, 15]. In what
follows, the main properties of the corresponding estimators are explained.
inspection of the overall fit of the model to the data by means of the $\chi^2$ test. Moreover,
in this case, the asymptotic variance-covariance matrix of the ML estimator also
provides standard error estimates, which enable us to conduct significance
tests [1]. On the contrary, when the measured data are substantially non-normal, the calculated
$\chi^2$
different assumptions about the observed variables. Some of them can cope effectively with the
non-normal character of the variables, while others were developed specifically for
categorical variables. The main properties of these estimators can be briefly summarized as follows.
The GLS estimator minimizes the so-called weighted residual function by means of different
iterative algorithms. This estimator assumes multivariate normality of the data with no excessive
kurtosis. It is also characterized as an asymptotically unbiased, consistent, efficient, and normally
distributed full-information estimator.
The ULS estimator makes no assumptions about the distribution of the measured variables. In
general, it is less efficient than the maximum likelihood estimator. It has one specific
requirement: all indicators must be measured on the same scale.
WLS (ADF) estimators are also insensitive to the distributional properties of the measured
variables. When the asymptotic covariance matrix is applied here, these estimators also involve fourth-order moments around the mean, which are included in the estimation in addition to the second-order
moments. For adequate estimation, these methods require a large sample
size. Since the full weight matrix must be inverted here, they are computationally very expensive
estimators.
DWLS estimators are very useful in the case of significantly non-normal ordinal variables, when we
are also dealing with so-called polychoric correlations between the categorical variables. To avoid
the computational cost of WLS estimators, DWLS estimators are usually a better
choice [1].
WLSM and WLSMV estimators were specially designed for
variables of a categorical nature. Here, the corrected $\chi^2$ test statistic is also available [6].
Among the estimation methods based on polychoric correlations, the WLSMV
method has been shown to yield better results than the WLS and WLSM estimators in type I error
control [1].
BMCMC estimators are very appropriate for noticeably non-normal categorical variables.
These estimators do not require any assumption about the asymptotic normality of the estimated
parameters. The reason is that Bayesian credibility intervals rely only on percentiles of the
posterior distribution, which is not limited to any fixed form [1, 3].
BMCMC estimators are a modern alternative to other SEM estimation methods
[1, 5, 16]. They use Markov Chain Monte Carlo (MCMC) procedures for the gradual reduction of the
uncertainty in the parameter estimates [5]. Since these methods are insensitive to normality
issues, they can be used to check the correctness of the results of other, more classical
SEM estimation methods, particularly in the case of ordinal and slightly non-normal data [5].
Besides, several studies have reported that BMCMC estimators can provide more
accurate estimates for smaller sample sizes than some other estimation methods, such as
the ML method [17, 18]. In general, the main property of these methods is the capability
of incorporating prior knowledge about the parameters, together with the fact that the modeling process
does not depend on asymptotic theory [19]. This property becomes particularly
important in the case of small sample sizes and ordinal or markedly non-normal data.
The main idea of Bayesian estimation is that every parameter is treated
as a random variable with an associated probability distribution. The assumed prior probability
distribution is then combined with the empirical information carried in the sample data by
means of Bayes' theorem, which gives the posterior distribution [16].
The uncertainty in the estimated parameters is afterward progressively reduced by the
generation of new data, which are produced from the original sample by using the MCMC
procedure [5, 16]. The latter draws repeated samples on the basis of the given dataset and generates
a large number of estimates for each model parameter. In this way, the posterior distributions of those
parameters can also be derived, and the mean values of the posterior distributions can be used as the
parameter estimates [16].
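The Bayesian logic described above (prior combined with data, many MCMC draws, posterior mean as the estimate, percentile-based credibility interval) can be sketched on a toy problem. The example below is not the AMOS procedure used in the study: it is a plain random-walk Metropolis sampler for a single mean parameter, and the data values, the known standard deviation, the prior, and the tuning constants are all hypothetical.

```python
import math
import random
import statistics

def log_posterior(mu, data, sigma, prior_mean, prior_sd):
    """Unnormalized log posterior: normal likelihood (known sigma) plus normal prior."""
    log_lik = -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2
    log_prior = -0.5 * ((mu - prior_mean) / prior_sd) ** 2
    return log_lik + log_prior

def metropolis(data, sigma=0.5, prior_mean=0.0, prior_sd=10.0,
               n_draws=20000, burn_in=2000, step=0.3, seed=1):
    """Random-walk Metropolis sampler; returns the draws kept after burn-in."""
    rng = random.Random(seed)
    mu = prior_mean
    current = log_posterior(mu, data, sigma, prior_mean, prior_sd)
    draws = []
    for i in range(n_draws + burn_in):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data, sigma, prior_mean, prior_sd)
        log_ratio = candidate - current
        # Accept uphill moves always, downhill moves with probability exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            mu, current = proposal, candidate
        if i >= burn_in:
            draws.append(mu)
    return draws

data = [4.1, 3.8, 4.4, 4.0, 3.9]          # hypothetical observations
draws = metropolis(data)
posterior_mean = statistics.fmean(draws)   # used as the parameter estimate
sorted_draws = sorted(draws)
lower = sorted_draws[int(0.025 * len(draws))]   # 2.5th percentile
upper = sorted_draws[int(0.975 * len(draws))]   # 97.5th percentile
print(posterior_mean, lower, upper)
```

The interval bounds come directly from the percentiles of the draws, so no normality assumption about the estimate is needed, which is the property the text emphasizes.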
Since the maximum likelihood estimator provides a large number of GOF indices, it is usually
desirable to use it in the estimation process. As Olsson and his colleagues suggest [20], it is convenient
to employ several estimators (for example, the ML, WLS, and GLS estimators)
and then investigate whether all of them provide similar estimation results. If so, we have
additional confirmation that the model structure is correctly identified and that the parameter
estimates are accurate enough. Such logic was used, for example, in [21], where the authors applied
two estimators, the ML estimator and the ULS estimator. Similarly, following the suggestions of Byrne [5],
the researchers in study [22] compared the results of the ML and BMCMC estimators, where the Bayesian
method was used to reaffirm the results of the ML estimator.
In our case, we have used three estimators, the ML estimator, the ULS estimator, and the BMCMC estimator,
which made it possible to compare the calculated parameter estimates of the SEM
model.
IV. CONCEPTUAL FRAMEWORK, HYPOTHESIZED MODEL, AND SURVEY
A conceptual framework of the hypothesized model is depicted in Fig. 3. Data collection was
carried out by means of a survey among 671 travel agencies located along the North-East coast of the Adriatic Sea. The questionnaire was divided into two parts. The first one was
related to the external integration indicators for each type of transport supplier. It consisted
of 11 ordinal variables, as follows: $W_i = Q_i^W$, $i = 1, \dots, 11$ (water suppliers), $A_i = Q_i^A$, $i = 1, \dots, 11$ (air
suppliers), $B_i = Q_i^B$, $i = 1, \dots, 11$
measures were needed to evaluate the behavioral magnitudes of external integration with
transport suppliers and encompassed the following crucial EI dimensions: interaction, consultation,
and collaboration [10].
The ordinal variables $W_i$, $A_i$, $B_i$, $R_i$ were created by interviewing the managers of the travel
agencies. They were asked to rate the level of their relationships with transport suppliers on an
ordinal scale from 1 (zero cooperation) to 5 (total cooperation). The structure of the survey questions in
the first part of the questionnaire is shown in Fig. 3.
The second part of the questionnaire consisted of five indicators. They can be called the
agencies' inner variables, denoted by $x_i$, $i = 1, 2, 3$, and $y_j$, $j = 1, 2$, where $x_i$ refer to input variables,
while $y_j$ refer to output variables. The meaning of these variables can be seen in Fig. 3. These
variables were needed to calculate the efficiencies of the travel agencies, similarly to the
approach reported in study [23].
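To give a feel for how input and output variables translate into a relative efficiency score, the following sketch covers only the simplest special case of DEA, with one input and one output, where the constant-returns (CCR) efficiency reduces to each unit's output/input ratio divided by the best ratio observed. The study itself uses three inputs and two outputs, which requires solving a linear program per agency; the function name and the numbers below are hypothetical.

```python
def relative_efficiency(inputs, outputs):
    """One-input, one-output DEA special case: ratio of each unit to the best unit."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Hypothetical agencies: one input (e.g. staff) and one output (e.g. revenue).
agency_inputs = [10, 20, 15]
agency_outputs = [100, 150, 150]

scores = relative_efficiency(agency_inputs, agency_outputs)
print(scores)
```

Units with a score of 1 lie on the efficient frontier, while lower scores indicate how far a unit falls short of the best observed practice.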
From Fig. 3 it can be seen that we formulated four main hypotheses $H_i$, $i = 1, 2, 3, 4$, which
state that the external integrations with transport suppliers, denoted by $EI_i$, $i = 1, 2, 3, 4$, have a
certain influence on the efficiency (EFF) of the agencies. Additionally, six sub-hypotheses were
formulated, which suppose that the external integrations are also interrelated among
themselves (their connections are not shown in Fig. 3).
Figure 4. The main methodologies used in the SEM modeling process for the case of agencies
VI.
| | $EI_W$ | $EI_A$ | $EI_B$ | $EI_R$ |
|---|---|---|---|---|
| Cronbach Alpha | 0.960 | 0.972 | 0.921 | 0.947 |
| Eigenvalues | 9.951 | 5.495 | 2.725 | 2.346 |
| % of Variance | 39.805 | 21.980 | 10.902 | 9.383 |
| Cumulative % | 39.805 | 61.785 | 72.687 | 82.069 |
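For readers who want to see how a Cronbach alpha value of the kind reported above is obtained, the following is a minimal sketch based on the standard formula $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$, where $k$ is the number of items, $s_i^2$ the item variances, and $s_T^2$ the variance of the summed score. The item responses are hypothetical 1 to 5 ratings, not the survey data.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, all lists of equal length (one entry per respondent)."""
    k = len(items)
    item_variances = sum(statistics.variance(scores) for scores in items)
    totals = [sum(responses) for responses in zip(*items)]  # summed score per respondent
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Hypothetical ratings of three items by seven respondents.
item_1 = [1, 2, 3, 4, 5, 4, 3]
item_2 = [2, 2, 3, 5, 5, 4, 3]
item_3 = [1, 3, 3, 4, 4, 5, 2]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(round(alpha, 3))
```

Values close to 1 indicate that the items of a factor vary together, which is the sense in which the reported alphas above 0.9 point to high internal consistency.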
| Factor | CR | AVE |
|---|---|---|
| $EI_A$ | 0.969 | 0.819 |
| $EI_W$ | 0.949 | 0.791 |
| $EI_B$ | 0.883 | 0.717 |
| $EI_R$ | 0.947 | 0.819 |
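Composite reliability (CR) and average variance extracted (AVE) are usually computed from the standardized factor loadings, following the Fornell and Larcker approach [34]. The sketch below uses the standard formulas; the loadings are hypothetical and are not the loadings estimated in the study.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    error_variance = sum(1 - loading ** 2 for loading in loadings)
    return total ** 2 / (total ** 2 + error_variance)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)

# Hypothetical standardized loadings of four indicators on one factor.
loadings = [0.9, 0.9, 0.9, 0.9]

cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(round(cr, 3), round(ave, 3))
```

Commonly cited rules of thumb ask for CR above 0.7 and AVE above 0.5; all four factors in the table above would satisfy both.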
model structure. This is logical since our primary interest in this study was to reveal and estimate the
causal paths between the integration factors $EI_W$, $EI_A$, $EI_B$, $EI_R$ on one side and the efficiency
factor EFF on the other side.
Table 3 shows the comparison of the estimation results achieved by all three estimators: ML, ULS, and
BMCMC. These results refer to the standardized weights of the causal paths between the
integration factors and the efficiency factor, as well as to the standardized weights of the correlations
among the integration factors themselves.
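Table 3. Comparison of achieved estimation results (Maximum likelihood, Bayesian estimation, Unweighted least squares)

| Standardized weight on | Type of relation | ML estimator | BMCMC estimator | ULS estimator | Significance |
|---|---|---|---|---|---|
| $EI_W \rightarrow EFF$ | Causal path | 0.421 | 0.41 | 0.408 | yes |
| $EI_A \rightarrow EFF$ | Causal path | 0.499 | 0.477 | 0.395 | yes |
| $EI_B \rightarrow EFF$ | Causal path | -0.148 | -0.13 | -0.037 | no |
| $EI_R \rightarrow EFF$ | Causal path | -0.009 | -0.024 | -0.05 | no |
| $EI_W \leftrightarrow EI_B$ | Correlation | 0.538 | 0.537 | 0.429 | yes |
| $EI_W \leftrightarrow EI_R$ | Correlation | 0.484 | 0.486 | 0.412 | yes |
| $EI_A \leftrightarrow EI_B$ | Correlation | 0.502 | 0.53 | 0.464 | yes |
| $EI_B \leftrightarrow EI_R$ | Correlation | 0.39 | 0.348 | 0.292 | yes |
| $EI_W \leftrightarrow EI_A$ | Correlation | 0.166 | 0.153 | 0.177 | no |
| $EI_A \leftrightarrow EI_R$ | Correlation | 0.165 | 0.158 | 0.225 | no |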
Careful observation of the results in Table 3 leads us to the following conclusions for all three
estimators:
1. The weights of two causal paths and four correlations have been estimated as positive and
statistically significant.
2. Roughly speaking, all estimators have provided similar results for those weights
that were significant. This is particularly true for the ML and BMCMC estimators, which gave
quite comparable results. The results achieved by the ULS estimator diverge slightly from the results
of the other two estimators but are still comparable to them.
3. Since the results of all three estimators are sufficiently close to each other, we can derive similar
conclusions on their basis. Additionally, we can simultaneously apply the GOF indices of all
addressed estimators in order to carry out as reliable a model fit evaluation as possible.
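The closeness of the three estimators can be quantified directly from the significant rows of Table 3. The short sketch below computes the largest absolute difference between the ML weights and the BMCMC and ULS weights; the dictionary keys are merely labels chosen here for readability.

```python
# Standardized weights of the six significant relations in Table 3: (ML, BMCMC, ULS).
significant_weights = {
    "EI_W -> EFF":   (0.421, 0.410, 0.408),
    "EI_A -> EFF":   (0.499, 0.477, 0.395),
    "EI_W <-> EI_B": (0.538, 0.537, 0.429),
    "EI_W <-> EI_R": (0.484, 0.486, 0.412),
    "EI_A <-> EI_B": (0.502, 0.530, 0.464),
    "EI_B <-> EI_R": (0.390, 0.348, 0.292),
}

max_diff_ml_bmcmc = max(abs(ml - bmcmc) for ml, bmcmc, _ in significant_weights.values())
max_diff_ml_uls = max(abs(ml - uls) for ml, _, uls in significant_weights.values())

print(round(max_diff_ml_bmcmc, 3))  # largest ML vs. BMCMC gap
print(round(max_diff_ml_uls, 3))    # largest ML vs. ULS gap
```

The largest ML versus BMCMC gap is 0.042, whereas the largest ML versus ULS gap is 0.109, which is consistent with the observation that the ULS results diverge somewhat more.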
Table 4 shows the GOF indices obtained for the ML estimator. Their values were
compared with the required threshold values given in the literature [1, 5, 6, 15]. Since the
comparison gave adequate results, it was concluded that the overall SEM model provides a
reasonably good fit to the data. The same conclusion was derived from the
GOF indices related to the ULS and BMCMC estimators.
Table 4. GOF indices for developed SEM model (Maximum likelihood case)

| Fit index | Value for ML estimator |
|---|---|
| $\chi^2$ | 152.329 |
| $\chi^2 / df$ | 1.058 |
| RMSEA | 0.031 |
| NFI | 0.911 |
| NNFI (TLI) | 0.993 |
| CFI | 0.995 |
| IFI | 0.995 |
| SRMR | 0.0602 |
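Two of the indices in Table 4 follow from the $\chi^2$ statistic by simple arithmetic, as sketched below. The degrees of freedom are not reported in the surviving text; the value 144 is an assumption inferred from $152.329 / 1.058 \approx 144$. The sample size passed to the RMSEA function is purely hypothetical, so the printed RMSEA is not expected to reproduce the value in the table. The formula used, $\sqrt{\max(\chi^2 - df, 0) / (df\,(N - 1))}$, is one common definition; some programs use $N$ instead of $N - 1$.

```python
import math

def chi2_df_ratio(chi2, df):
    """Normed chi-square: chi-square divided by its degrees of freedom."""
    return chi2 / df

def rmsea(chi2, df, n):
    """Root mean square error of approximation for sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

chi2 = 152.329        # from Table 4
df = 144              # assumption inferred from the reported chi2/df ratio
n_hypothetical = 200  # hypothetical sample size, not taken from the paper

ratio = chi2_df_ratio(chi2, df)
print(round(ratio, 3))
print(round(rmsea(chi2, df, n_hypothetical), 4))
```

When $\chi^2$ does not exceed the degrees of freedom, this definition returns an RMSEA of zero.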
Based on the estimation results presented in Table 3, Fig. 5 can be created, which corresponds to
the conceptual framework shown in Fig. 3. Fig. 5 represents the final estimated SEM model, where
the retained indicator items are also depicted. Dashed lines refer to the causal paths or
correlations with statistically insignificant weights (cf. Table 3).
The achieved results imply that the effect of the factors $EI_B$ and $EI_R$ on the efficiency EFF cannot be
supported, so the hypotheses $H_3$ and $H_4$ are rejected. On the other hand, the first two
hypotheses, $H_1$ and $H_2$, can evidently be accepted, which implies that the effect of the factors $EI_W$
13. P. A. Phillips, Hotel performance and competitive advantage: a contingency approach, International Journal of Hospitality Management, vol. 11, pp. 359-65, 1999.
14. L. Enz, L. Canina, and K. Walsh, Hotel industry average: an inaccurate tool for measuring performance, The Cornell Hotel and Restaurant Administration Quarterly, vol. 42, no. 6, pp. 22-32, 2001.
15. R. B. Kline, Principles and Practice of Structural Equation Modeling, 3rd ed., The Guilford Press: New York, 2011.
16. J. L. Arbuckle, IBM SPSS Amos 20 User's Guide, IBM, Amos Development Corporation: Chicago, n.d.
17. J. Evermann and M. Tate, Bayesian Structural Equation Models for Cumulative Theory Building in Information Systems: A Brief Tutorial Using BUGS and R, Communications of the Association for Information Systems, vol. 34, pp. 1481-1515, 2014.
18. T. Asparouhov and B. Muthén, Bayesian analysis using Mplus, Technical appendix. Los Angeles, Muthén & Muthén, 2010.
19. P. Congdon, Applied Bayesian Modeling, West Sussex, England: John Wiley & Sons, 2003.
20. U. H. Olsson, T. Foss, S. V. Troye, and R. D. Howell, The Performance of ML, GLS, and WLS Estimation in Structural Equation Modeling Under Conditions of Misspecification and Nonnormality, Structural Equation Modeling, vol. 7, no. 4, pp. 557-595, 2000.
21. E. Penelo, C. Viladrich, and J. M. Domènech, Perceived parental rearing style in childhood: internal structure and concurrent validity on the Egna Minnen Beträffande Uppfostran-Child Version in clinical settings, Comprehensive Psychiatry, vol. 51, pp. 434-442, 2010.
22. M. R. A. Hamid, Z. Mustafa, F. Idris, M. Abdullah, and N. R. M. Suradi, Measuring Value-Based Productivity: A Confirmatory Factor Analytic (CFA) Approach, International Journal of Business and Social Science, vol. 2, no. 6, April 2011.
23. R. Fuentes, Efficiency of travel agencies: A case study of Alicante, Spain, Tourism Management, vol. 32, no. 2, pp. 75-87, 2011.
24. W. W. Cooper, L. M. Seiford, and K. Tone, Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software, Massachusetts, USA: Kluwer Academic Publishers, 2006.
25. J. B. Ullman, Structural equation modeling: Reviewing the basics and moving forward, Journal of Personality Assessment, vol. 87, no. 1, pp. 35-50, 2006.
26. R. Weston and P. A. Gore, A Brief Guide to Structural Equation Modeling, The Counseling Psychologist, vol. 34, no. 5, pp. 719-751, 2006.
27. C. P. Chou and P. M. Bentler, Estimates and tests in structural equation modelling, In R. Hoyle, Structural equation modeling: Concepts, issues, and applications, Thousand Oaks, CA: Sage, pp. 37-55, 1995.
28. X. Zhai, A. M. M. Liu, and R. Fellows, Human Resource Practices in Chinese Construction Organizations: Development of a Measurement Scale, International Journal of Architecture, Engineering and Construction, vol. 2, no. 3, pp. 170-183, 2013.
29. M. Lei and R. G. Lomax, The effect of varying degrees of nonnormality in structural equation modelling, Struct. Equ. Modeling, vol. 12, pp. 1-27, 2005.
30. P. J. Curran, S. G. West, and J. F. Finch, The Robustness of Test Statistics to Nonnormality and Specification Error in Confirmatory Factor Analysis, Psychol. Methods, vol. 1, pp. 16-29, 1996.
31. M. Sahin, A. Todiras, P. Nijkamp, B. Neuts, and C. Behrens, A Structural Equations Model for Assessing the Economic Performance of High-Tech Entrepreneurs, In R. Capello and T. P. Dentinho, Globalization Trends and Regional Development, vol. 48, 2013.
32. M. T. Frohlich and R. Westbrook, Arcs of integration: an international study of supply chain strategies, Journal of Operations Management, vol. 19, pp. 185-200, 2001.
33. D. A. Kenny, Measuring Model Fit, Retrieved on February 6, 2014, from http://davidakenny.net/cm/fit.htm
34. C. Fornell and D. F. Larcker, Evaluating structural equation models with unobservable variables and measurement error, Journal of Marketing Research, vol. 18, pp. 39-50, 1981.
AUTHORS
A. Nataša Kovačić, PhD, is a Higher Assistant at the Faculty of Tourism and Hospitality
Management, University of Rijeka, Opatija, Croatia (e-mail: [email protected]).
B. Darja Topolšek, PhD, is an Assistant Professor at the Faculty of Logistics, University of Maribor,
Celje, Slovenia (e-mail: darja.topolsek@um.si).
C. Dejan Dragan, PhD, is an Assistant Professor at the Faculty of Logistics, University of Maribor,
Celje, Slovenia (e-mail: [email protected]).
Manuscript received by 21 April 2015. [21 April 2015]
Published as submitted by the author(s).