Bayesian Network Prediction Model
Article info

Keywords: Tourism management; Loyalty; Bayesian networks; Linear structural relation model

Abstract

For effective Bayesian networks (BN) prediction with prior knowledge, this study proposes an integrated BN mechanism that adopts linear structural relation model (LISREL) to examine the belief or causal relationships which are subsequently used as the BN network structure for predicting tourism loyalty. Four hundred and fifty-two valid samples were collected from tourists with the tour experience of the Toyugi hot spring resort, Taiwan. The proposed mechanism is compared with back-propagation neural networks (BPN) or classification and regression trees (CART) for 10-fold cross-validation. The results indicate that our approach is able to produce effective prediction outcomes.
© 2009 Elsevier Ltd. All rights reserved.
0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2009.04.010
C.-I Hsu et al. / Expert Systems with Applications 36 (2009) 11760–11763 11761
and promotion willingness can lead to revenue growth of the firms and the increase of market share; (2) reduction of costs; (3) increase of employees’ work satisfaction (Jacoby, 1994). To increase customer loyalty, companies need positive customer relationship management (CRM). CRM means that an enterprise identifies customers’ real needs with the support of processes and technology, and improves its products and services in order to enhance customer loyalty (Kalakota & Robinson, 1999). Spengler (1999) also suggested that CRM integrates planning, marketing and customer service through information technology, and provides customized services to increase customer loyalty and corporate operational benefits. In addition, Hui, Wan, and Ho (2007) indicated that the characteristics of tourist attractions, such as interesting cultures, attractive urban sightseeing, interesting night life and attractive natural and scenic aspects, might increase customer satisfaction and willingness to revisit. According to the literature review discussed above, this research proposes three factors which might increase tourism loyalty: customer service, technology-supported web function, and local characteristics.

Three factors that influence travelers’ tourism loyalty are proposed based on the literature review: (1) Customer Service (CS): the service consumers receive from employees; (2) Web Function (WF): the functions provided by the tour web site; and (3) Local Characteristics (LC): the consumer’s perception of the local tourism characteristics. The outcome is Tourism Loyalty (TL): the degree of a tourist’s loyalty to revisiting a destination. This study suggests that the more favorably a tourist perceives the customer service, web function and local characteristics of the destination, the greater his or her tourism loyalty, which refers to repeat visits, willingness to revisit, and willingness to recommend the destination to others. Therefore, H1–H3 are established as below:

H1: Customer service positively influences tourism loyalty.
H2: Web function positively influences tourism loyalty.
H3: Local characteristics positively influence tourism loyalty.

All questionnaire items are shown in Table 1. Each tourist was asked to rate on a scale of 1–5 his or her degree of agreement with each item.

Table 1
The questionnaire items.

3. The integrated Bayesian network mechanism

To overcome the difficulty of constructing a BN structure when learning from data, this study proposes a novel approach that combines LISREL and BN to predict a tourist’s loyalty level. LISREL is an advanced statistical technique used in the social and behavioral sciences to verify hypothesized relationships, but it is seldom combined with other machine-learning algorithms. This study uses LISREL to aid BN in discovering a suitable network architecture for prediction.

3.1. LISREL

LISREL is a structural equation modeling (SEM) technique which combines the concepts of both factor analysis and path analysis. It is especially appropriate for analyzing data in social and behavioral research fields. While multiple regression can estimate the parameters of only one linear equation at a time, LISREL can simultaneously process multiple sets of variable relationships to estimate the parameters in an entire system of linear equations in a model. The LISREL model and equations are shown in Fig. 1, in which E is the disturbance; η is the vector of endogenous latent variables; α is the intercept; B is the matrix of regression coefficients for the endogenous latent variables; Γ is the matrix of regression coefficients for the exogenous latent variables; ξ is the vector of exogenous latent variables; and ζ is the vector of latent disturbances. The analysis consists of two steps: (1) measurement model analysis, which aims to analyze the loading relationships between latent variables and their corresponding observable variables, and (2) structural model analysis, which aims to analyze the hypothesized relationships among latent variables.

[Fig. 1: path diagram of the LISREL model; only fragments survive extraction (error terms E1–E3 and observed indicators CS1–CS3 attached to the latent variable CS).]

3.2. BN

A BN is a graphical model of variables and their relationships based on probability theory. It is also called a belief network or causal network. A BN uses prior probabilities and probabilities in sample space to estimate posterior probabilities. In a BN graph, the nodes and the arrows between them form a directed acyclic graph (DAG) (Niedermayer, 1998). Each parent node represents the cause of an event, a child node represents the outcome, and an arrow represents the causal relation. As shown in Fig. 2, the set of parent nodes of a node TL is denoted by parents(TL) and the joint distribution of the node values can be written as the product of the local distributions of each node and its parents.

Advantages of BN include the ability to analyze problems with incomplete data and to combine domain knowledge and data (Hackerman & Wellman, 1995). However, without prior understanding or knowledge about the problem domain, the required significant computational effort of an NP-hard task in exploring a
previously unknown network is costly and inefficient (Niedermayer, 1998).

BN is becoming an increasingly important solution for practical problems in the field of Artificial Intelligence (Korb & Nicholson, 2003). Applications of BN include predicting maintenance project delays based on specialists’ experience (de Melo & Sanchez, 2008), detecting firms that issue fraudulent financial statements (FFS) and identifying factors associated with FFS (Kirkos, Spathis, & Manolopoulos, 2007), and other problem domains.

[Fig. 2 diagram: parent nodes CS, WF, and LC, each with an arrow to the child node TL.]

$$P(CS, WF, LC, TL) = \prod P(TL \mid \mathrm{parents}(TL))$$

Fig. 2. BN: predicting the outcome level.

4. The experiment results

4.1. Sample and exploratory factor analysis

This study developed a questionnaire based on the proposed items and delivered it to 550 travelers with tourism experience of the Toyugi hot spring resort of the farmers’ association, Taitung County, Taiwan. Four hundred and ninety-eight travelers returned the questionnaire. Four hundred and fifty-two valid samples remained after discarding responses with more than three unanswered questions. The effective response rate was 82.2%.

The resultant sample was then randomly split into two sub-samples. The first subset was used for exploratory factor analysis (EFA) to identify the factor structure hidden in the data collected. The other subset was used for LISREL analysis. SPSS 13.0 for Windows was used for the EFA. As shown in Table 2, using the eigenvalue rule, a 4-factor structure emerged.

Table 2
The result of EFA.

Factor                 Item   1      2      3      4
Customer service       CS1    .796   .128   .014   .181
                       CS2    .692   .284   .061   .065
                       CS3    .648   .380   .301   .123
                       CS4    .751   .081   .281   .159
                       CS5    .700   .188   .364   .141
Web function           WF1    .274   .794   .106   .251
                       WF2    .191   .895   .155   .142
                       WF3    .258   .742   .259   .220
Local characteristics  LC1    .131   .205   .764   .257
                       LC2    .287   .108   .649   .262
                       LC3    .158   .142   .766   .113
Tourism loyalty        TL1    .044   .256   .240   .701
                       TL2    .163   .191   .145   .821
                       TL3    .265   .100   .211   .769

The reliability and validity are examined as shown in Table 3. The Cronbach’s α values for all constructs are above the 0.7 level (Cronbach, 1947). We examined convergent validity with the composite reliability (CR) and the average variance extracted (AVE) by the constructs. A CR greater than 0.60 is preferred (Fornell & Larcker, 1981) and all constructs in this study met this requirement. The AVE for all constructs in this study exceeded the preferred 0.5 (Fornell & Larcker, 1981).

Table 3
Cronbach’s α, CR, and AVE.

      α       CR      AVE
CS    0.849   0.899   0.642
WF    0.873   0.785   0.549
LC    0.731   0.764   0.520
TL    0.770   0.875   0.700

4.2. Verifying relationships in the research model

We employed LISREL to examine the measurement and structural models. The goodness-of-fit criteria applied to both models are as follows: χ²/df is suggested to be smaller than 2 for good fit (Carmines & McIver, 1981) and smaller than 3 for acceptable fit (Chin & Todd, 1995); CFI is suggested to be greater than 0.95 for good fit (Bentler, 1995); NFI and NNFI greater than 0.9 for good fit (Hu & Jen, 2005); IFI greater than 0.9 for good fit (Hu & Jen, 2005); GFI and AGFI greater than 0.9 for good fit and 0.8 for acceptable fit (Subhash, 1996); PGFI greater than 0.5 for good fit (Mulaik, James, Van Altine, Bennett, & Stilwell, 1989); SRMR smaller than 0.08 for good fit (Hu & Jen, 2005); and RMSEA smaller than 0.05 for good fit and 0.08 for acceptable fit (McDonald & Ho, 2002). The goodness-of-fit indexes for both measurement and structural models are acceptable as shown in Table 4.

As shown in Table 5, all path coefficients are significant at the 0.05 level in the structural model. The results indicate that the hypothesized relationships are supported. The squared multiple correlation (SMC) is also reported; its value of 0.73 indicates the proportion of variance in tourism loyalty explained by the model.

4.3. Predicting the level of tourism loyalty

The above section described how LISREL analysis was employed to verify the hypothesized relationships in the proposed research model. Subsequently, this study adopts the supported relationships as the BN network structure to predict a tourist’s loyalty level. Based on the LISREL analysis results, the nodes CS, WF, and LC represent the antecedents of the outcome node TL. The input data for each node is the average value of the corresponding questionnaire items. The BN software Netica 2.05 (NORSYS, 2000) was used to construct the probability model using the 452 valid samples. The same software was used to predict the outcome. All the learning/testing results were obtained using 10-fold cross-validation.

For comparison purposes, the BPN and CART modules in the SPSS-Clementine 8.1 software package were used to predict the same outcome TL. The system’s default control parameters were adopted. The input variables are the same constructs as in the research model: CS, WF, and LC.
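The prediction step described above can be illustrated with a minimal Python sketch. It is not the Netica procedure used in the study. It assumes that the averaged 1–5 scores are discretized into three levels at hypothetical cut points (2.5 and 3.5), that the conditional probability table P(TL | CS, WF, LC) is estimated by counting with Laplace smoothing, and that the data are synthetic. Under the structure CS, WF, LC → TL, the joint distribution factorizes as P(CS) P(WF) P(LC) P(TL | CS, WF, LC), so with all three parents observed the predicted loyalty level is simply the most probable TL state in the corresponding row of the table.

```python
import numpy as np

LEVELS = 3  # assumed discretization: 0 = low, 1 = medium, 2 = high


def discretize(scores):
    """Map averaged 1-5 item scores to ordinal levels (cut points are assumptions)."""
    return np.digitize(scores, bins=[2.5, 3.5])


def fit_cpt(parents, child, alpha=1.0):
    """Estimate P(TL | CS, WF, LC) by counting, with Laplace smoothing alpha."""
    counts = np.full((LEVELS, LEVELS, LEVELS, LEVELS), alpha)
    for (cs, wf, lc), tl in zip(parents, child):
        counts[cs, wf, lc, tl] += 1
    return counts / counts.sum(axis=-1, keepdims=True)


def predict(cpt, parents):
    """Return the most probable TL level for each observed (CS, WF, LC) row."""
    return np.array([cpt[cs, wf, lc].argmax() for cs, wf, lc in parents])


def cross_validate(parents, child, k=10, seed=0):
    """Mean accuracy over k folds; the CPT is refit on each training split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(child)), k)
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        cpt = fit_cpt(parents[train], child[train])
        accuracies.append(np.mean(predict(cpt, parents[test]) == child[test]))
    return float(np.mean(accuracies))


# Synthetic stand-in for the survey data (NOT the study's 452 responses).
rng = np.random.default_rng(42)
n = 452
raw = rng.uniform(1, 5, size=(n, 3))  # averaged CS, WF, LC scores
tl_raw = np.clip(raw.mean(axis=1) + rng.normal(0, 0.4, n), 1, 5)
parents = discretize(raw)
child = discretize(tl_raw)

accuracy = cross_validate(parents, child)
print(f"mean 10-fold accuracy: {accuracy:.3f}")
```

The same `cross_validate` loop could wrap any other classifier, which is how a like-for-like comparison with BPN or CART on identical folds might be organized.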
Table 4
The goodness-of-fit indexes.