Knowledge Graph Completion For Activity Recommendation in Business Process Modeling
https://doi.org/10.1007/s13218-024-00880-7
TECHNICAL CONTRIBUTION
Abstract
Activity recommendation is an approach to assist process modelers by recommending suitable activities to be inserted at a user-defined position. In this paper, we suggest approaching activity recommendation as a knowledge graph completion task. We convert business process models into knowledge graphs through various translation methods and apply embedding- and rule-based knowledge graph completion techniques to the translated models. Our experimental evaluation reveals that generic knowledge graph completion methods do not perform well on the given task. They lack the flexibility to capture complex regularities that can be learned using a rule-based approach specifically designed for activity recommendation.
¹ This problem is also referred to as link prediction.
2.1 Activity Recommendation Methods

The first methods designed for activity recommendation were mostly based on graph mining techniques [5–7]. Later, Wang et al. developed an embedding-based approach called RLRecommender [15], which embeds activities and relations between them into a continuous low-dimensional space. As shown in [8], the performance of RLRecommender is comparably low, since the recommendations for an unlabeled activity are only based on one related activity in the process model. Therefore, the method generates different recommendations depending on the chosen related activity that is used to determine the recommendations for the unlabeled activity. In other words, RLRecommender lacks an aggregation method which combines all possible recommendations given the process model under development in one recommendation list.

In [8], the authors presented a rule-based activity recommendation approach. The proposed method learns rules that describe regularities in the use of labels in the given process model repository. For this purpose, the authors defined various rule types that capture different co-occurrence and structural patterns. Then the method applies the learned rules to the model under development, making use of the full given context. In an extensive experimental study, the rule-based method outperformed a variety of other approaches [15–17].

All approaches mentioned so far are limited to recommending only those labels that have previously been used in the models stored in the process model repository. In an extension of the rule-based approach proposed in [8], the authors developed in [9] a method that can recommend labels that have never been seen before. This approach combines components of the labels that are already known to create new labels. In some cases, this results in the recommendation of labels that do not appear as a whole within the given repository. A similar method is presented in [18], where process element sequences are converted into text paragraphs. These textual data are then represented using sentence embeddings, which are learned text representations capturing semantic information as numerical vectors. In [10], the authors went a step further and presented an approach that is built on a language model. This approach is capable of recommending completely new labels where not even parts of these labels have been used in the process model repository previously.

With respect to our research question, we have to limit ourselves to approaches that predict only labels that have previously been used in the repository. We have to stick to this constraint as knowledge base completion methods are also limited to predicting only those entities as candidates that are used in the given knowledge graph, i.e., that are elements from E. Thus, we use only RLRecommender [15] and the rule-based method proposed in [8] as methods that have specifically been developed for the activity recommendation problem.

2.2 Knowledge Graph Completion Methods

We compare the selected methods for activity recommendation against some of the best and most prominent methods for solving the general problem of knowledge base completion. In particular, we focus on several knowledge graph embedding models and one rule-based system. To explain these methods, we first introduce the notion of a knowledge graph and describe the knowledge graph completion problem.

A knowledge graph G = {(e, r, e′) | e, e′ ∈ E ∧ r ∈ R} is a set of triples. E denotes a set of entities. These entities can be persons, locations, organizations or, in our case, activities and their labels. R is a set of relations that might contain relations such as worksFor or locatesIn. From a logical point of view, a relation is a binary predicate and a triple (e, r, e′) is an atomic fact that expresses that e is in relation r to e′. Knowledge graphs are used to store our knowledge about a certain domain in a simple formal representation.

Given that our knowledge about a certain domain is usually incomplete, we can assume that the knowledge graph itself is also incomplete. Knowledge graph completion deals with this problem in terms of completion tasks. A completion task (or completion query) asks to find a correct candidate for the question mark in an incomplete triple (e, r, ?). The answer to such a query is a ranking of candidates. The higher the correct candidate is ranked, the better. We introduce the metrics that are usually used to quantify the quality of a ranking in Sect. 2.3.
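To make these notions concrete, the following minimal Python sketch (our own illustration; the entity and relation names are invented) represents a knowledge graph as a set of triples and answers a completion query (e, r, ?) by ranking candidate objects with a placeholder scoring function.

```python
# A knowledge graph as a set of (subject, relation, object) triples.
# Entity and relation names are invented for illustration only.
kg = {
    ("anna", "worksFor", "acme"),
    ("acme", "locatedIn", "berlin"),
    ("anna", "livesIn", "berlin"),
}

entities = {e for (s, _, o) in kg for e in (s, o)}

def answer_query(subject, relation, score):
    """Answer the completion query (subject, relation, ?):
    rank all entities as candidate objects by a model-specific score."""
    return sorted(entities, key=lambda cand: score(subject, relation, cand), reverse=True)

# With a constant score every candidate ties; a real completion
# method would supply a learned scoring function instead.
print(answer_query("anna", "worksFor", lambda s, r, o: 0.0))
```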
The field of knowledge graph completion has for a long time been dominated by knowledge graph embedding models. The standard evaluation protocol for knowledge graph completion was already proposed in 2013 in the paper that also introduced the well-known model TransE [12]. TransE is a model that belongs to the family of translational models. Given a triple (e1, r, e2), in a (pure) translational model, the embedding of a relation r is used as a means to map (translate) the subject e1 into the object e2. Formally, we have e1* + r* = e2* if the embedding function is denoted by *. Another well-known translational model that has proven to achieve very good results is RotatE [19]. Here the translation can be understood as a rotation in the complex embedding space. We included both TransE and RotatE in our experiments. Specifically, the RotatE model maps the entities and relations to the vector space and defines each relation as a rotation from the source entity to the target entity.
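As an illustration of how such translational models score a candidate triple, the sketch below implements simplified versions of the TransE and RotatE scoring functions on randomly initialized toy embeddings; it is only a sketch and not the training code used in our experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # toy embedding dimensionality

# Randomly initialized embeddings stand in for trained ones.
e1, e2 = rng.normal(size=dim), rng.normal(size=dim)   # entity embeddings
r_trans = rng.normal(size=dim)                        # relation as a translation vector
r_phase = rng.uniform(0, 2 * np.pi, size=dim)         # relation as rotation angles

def transe_score(h, r, t):
    # TransE: a triple is plausible if h* + r* is close to t* (negative distance).
    return -np.linalg.norm(h + r - t)

def rotate_score(h, angles, t):
    # RotatE: the relation acts as an element-wise rotation in complex space
    # (real-valued toy embeddings are cast to complex here for brevity).
    rotation = np.exp(1j * angles)
    return -np.linalg.norm(h.astype(complex) * rotation - t.astype(complex))

print(transe_score(e1, r_trans, e2), rotate_score(e1, r_phase, e2))
```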
In addition to these models, we included two factorization models (DistMult [13] and TuckER [20]) and one hierarchical Transformer model (HittER [21]).
These knowledge graph embedding (KGE) models differ in terms of how they combine entity and relation embeddings to capture the existence or absence of edges within the given graph. While DistMult uses a simple bilinear interaction between entities and relations [13], TuckER offers the advantages of parameter sharing across various relations and the decoupling of entity and relation embedding dimensions [20]. Finally, HittER employs a hierarchical Transformer model to capture the interaction between entity and relation embeddings.

As an alternative to the embedding-based models, we also consider AnyBURL [22, 23], which is a rule learner specifically designed for the knowledge graph completion problem. AnyBURL has been shown to perform on the same level as current state-of-the-art KGE models [11]. Unlike embedding-based models, AnyBURL is a symbolic model that offers interpretability by providing the rules contributing to its predictions. In our experiments, we have incorporated AnyBURL as an additional general method for the knowledge graph completion task.

We have chosen the knowledge graph embedding models TransE, DistMult, TuckER, HittER, and RotatE for the following reasons: TransE and DistMult are probably the models that are used most often in different application settings. They can be understood as classical models. TuckER, HittER and RotatE are younger models that have achieved very good results (see an overview in [11]). AnyBURL is a rule-based model that is also known to perform very well. All of these models have in common that they are generic knowledge graph completion models. None of these models has been specifically designed for activity label recommendation. Within this paper, we want to find out to what extent these generic models can be used to solve label recommendation tasks. To understand how well these generic models perform, we compare them against the current state-of-the-art approach for label recommendation described in [8]. This approach is a rule learning approach that is similar to AnyBURL. However, the supported rule types have specifically been designed for activity label recommendation. Additionally, we include RLRecommender [15], which is also a non-generic label recommendation approach that internally uses embeddings in a rather specific way. Both approaches work directly on the given process models. To apply the generic models, we first need to translate the given process model into a knowledge graph. We explain this translation in Sect. 3.2.

2.3 Evaluation Criteria

The evaluation of activity label recommendation should consider different aspects, which should be reflected in the criteria used to evaluate and compare recommendation techniques. To illustrate these, reconsider the example from Fig. 1. The most important aspect is related to the quality of the predictions. In Fig. 1, three possible candidates are shown to the user, followed by the Show more option. These candidates are the top-3 candidates of a (in most cases) relatively long ranking of candidates. The predictive quality of an approach is usually quantified by the positions of the correct candidates. With respect to our use case, positions #1, #2 and #3 are displayed to the user, while a position lower in the ranking requires an additional interaction with the user interface to be displayed. Candidates that are ranked very low will not be displayed to the user at all.

In [12], the authors proposed the Hits@k measure to quantify the quality of predictions, where k is usually set to 1 or 10, resulting in the Hits@1 and Hits@10 measures. Hits@10, for example, captures the fraction of hits in the top-10 recommendations, i.e., the fraction of cases where the label that was actually used in a process model is among the ten recommendations that the employed method considers most likely.

The Hits@10 metric distinguishes only between candidates ranked within the top-10 and those that are ranked lower. It does not differentiate between a correct candidate ranked at #2 and a correct candidate ranked at #9. To account for these differences, the MRR (mean reciprocal rank) has been proposed [24], which has become the most important metric in knowledge base completion and link prediction [11]. The reciprocal rank of a recommendation list has the value 0 if the actually chosen activity is not in the provided list and 1/p otherwise, where p denotes the position of the hit in the list. The MRR is the mean of the reciprocal ranks of all generated recommendation lists. Within our experiments, we consider a recommendation list of length 10 to compute the MRR, which is a close approximation of the MRR based on the full ranking. This approach is more realistic within our use case, as the list of recommendations shown to the user has to be limited. Within our example, one can imagine that first the top-3 candidates are shown and, by pressing Show more, this list is enlarged to display the top-10.
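As a concrete illustration of these two measures, the following sketch computes Hits@10 and the MRR over length-10 recommendation lists from the positions of the correct labels (the ranks used here are hypothetical).

```python
def hits_at_k(ranks, k=10):
    """Fraction of recommendation lists whose correct label appears in the top k.
    A rank of None means the correct label is not in the list at all."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)

def mrr(ranks, list_length=10):
    """Mean reciprocal rank over lists of a fixed length:
    1/p if the correct label is at position p <= list_length, else 0."""
    return sum(1 / r if r is not None and r <= list_length else 0.0 for r in ranks) / len(ranks)

# Hypothetical positions of the correct label in four recommendation lists.
ranks = [1, 4, None, 12]
print(hits_at_k(ranks), mrr(ranks))  # 0.5 and (1 + 1/4) / 4 = 0.3125
```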
However, the predictive quality of the recommended activities is only one aspect. Another aspect is related to the runtime of the algorithms that compute the recommendations. If we are concerned with a user interface as depicted in Fig. 1, the time required to make a prediction should be limited to avoid waiting time for the user. Within this paper, we will not focus on an experimental analysis of runtimes. Instead, we refer to runtimes reported and discussed in [11]. If we look at the runtimes reported there, a trained model can make a single prediction within several milliseconds, no matter whether these models belong to the family of embedding-based models or rule-based models. However, there is a significant difference between both model families: embedding-based models need to include the process model under development in the embedding. This means that each edit operation performed by the user requires retraining or updating [25] the embeddings. While an update operation requires acceptable computational resources compared to a full training run, it is unclear to what extent predictive quality remains stable.
Finally, a user might also be interested in the reason or an explanation for a recommendation. Symbolic approaches are usually better suited to deliver these explanations in terms of the rules that fired [26]. While there are also approaches to explain embedding-based models, more computational effort is required and it can be doubted that these explanations are as appropriate as the direct symbolic explanations of a rule-based approach [27]. We will not further discuss this issue within this paper, where we focus mainly on the predictive quality of the models that we analyze in our experiments.

Given a process model under development, the activity recommendation problem is concerned with recommending suitable activities to extend the model at a user-defined position. The position of the activity that has to be recommended is given by the activity that was last added to the model. Therefore, the activity recommendation problem breaks down to finding a suitable label for the last added, and so far unlabeled, activity node ñ.

Definition 2 (Activity recommendation problem) Let 𝓑 be a set of business process graphs and L𝓑 the set of activity labels that are used in 𝓑. Let B = (N, E, λ) be a given business process graph under development, where each node n ∈ N except one node ñ is labeled, i.e., λ(n) is given for all n ∈ N∖{ñ}. The activity recommendation problem is to find a suitable label λ(ñ) ∈ L𝓑 for ñ.
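A minimal sketch of the setting in Definition 2 (node identifiers and most labels are invented; the label names follow the example used later in Fig. 3): a business process graph under development with exactly one unlabeled activity node ñ, for which a label from the repository has to be recommended.

```python
# Business process graph under development: nodes, edges and a partial labeling.
nodes = ["n1", "n2", "n3", "n4"]
edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n4")]
labeling = {"n1": "Register", "n2": "Enter Personal Details", "n3": "Upload Documents"}
# n4 is the node that was added last and is still unlabeled.

# Labels observed in the process model repository (the candidate set).
repository_labels = {"Register", "Enter Personal Details",
                     "Upload Documents", "Search for Units Available"}

# The unlabeled node ñ is the single node without an entry in the labeling function.
unlabeled = [n for n in nodes if n not in labeling]
assert len(unlabeled) == 1
n_tilde = unlabeled[0]

# The activity recommendation problem: pick a suitable label from the repository for ñ.
print(f"Recommend a label for {n_tilde} from {sorted(repository_labels)}")
```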
After the translation, the activity recommendation problem corresponds to the completion task (ñ, hasLabel, ?), where ñ denotes the unlabeled node in the process model under development for which we want to suggest a suitable label. Once we have applied one of the translation approaches, we can use a general knowledge graph completion technique to solve such a completion task.

We can directly apply any standard knowledge graph completion approach on the outcome of our translation approaches. However, we have to be aware that there are some specifics in our setting that might require us to modify the standard training procedure to achieve good results. While our graph contains two or three relations, depending on the translation approach, we are only interested in completion tasks of the form (ñ, hasLabel, ?). This means that we are always concerned with the hasLabel relation and, moreover, we are only asking for possible objects (tails) given an activity ñ as subject (head). This query direction is sometimes called tail-direction.

Training a knowledge graph embedding model usually involves exploring a hyperparameter search space to identify the optimal or, at the very least, a proficient hyperparameter configuration. This process typically relies on a validation set. Adhering to this conventional approach, we have generated a validation set that resembles the test set by exclusively containing hasLabel triples. Note that in most standard evaluation datasets, the validation set contains triples for each of the relations that appear in the training set.

The validation set is also used to stop the training process after several epochs to avoid overfitting to the training data. This is usually done by analysing the development of the mean reciprocal rank (MRR) against the validation set. If the MRR no longer improves (for several epochs), the training phase ends and the model that has been learned is used in the prediction phase. As we know that we are only interested in the tail-direction, we use the MRR in tail-direction as the stopping criterion.

If we take a look at the example in Fig. 2, it becomes obvious that the hasLabel triples are only a small fraction of all triples. This means that the embeddings of the entities are only to a limited degree determined by the activity labels. In [30], the authors argued that a similar setting occurs quite often in down-stream applications of knowledge base completion, which they refer to as the recommendation case. They propose a specific training strategy that is based on the idea of generating additional negative examples for the target relation that has to be predicted. Unfortunately, we were not able to re-implement the proposed approach. Instead, we simply increased the number of hasLabel triples by adding for each hasLabel triple a copy of that triple. During training, this results in twice as many hasLabel updates based on the positive and negative examples derived from the training triples. In our experiments, we report on the results for the original dataset and the dataset that we created with this augmentation strategy. We refer to the latter as the augmented dataset.
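The augmentation strategy described above can be sketched as follows; since the training procedure iterates over a list of triples, duplicating a triple doubles the number of updates it triggers (the triples shown are illustrative).

```python
# Training triples obtained from one of the translation approaches (illustrative).
train_triples = [
    ("n1", "followedBy", "n2"),
    ("n1", "hasLabel", "Register"),
    ("n2", "hasLabel", "Enter Personal Details"),
]

# Augmentation: add a copy of every hasLabel triple, so that twice as many
# positive (and sampled negative) training examples involve the hasLabel relation.
augmented = train_triples + [t for t in train_triples if t[1] == "hasLabel"]

print(len(train_triples), len(augmented))  # 3 -> 5
```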
In this section, we report on the design and results of our experimental study. We investigate the performance of different approaches for applying existing knowledge graph completion methods that have not specifically been designed for activity recommendation. Additionally, we compare the approaches to the embedding-based activity recommendation technique RLRecommender [15] and to the rule-based method presented in [8]. By analyzing the resulting predictions in detail, we detect an important reason for negative results and apply a post-hoc filtering technique to resolve it.

4.1 Process Model Repository

The Business Process Management Academic Initiative (BPMAI) [31] and SAP Signavio Academic Models (SAP-SAM) [32] datasets are the most representative collections of process models publicly available. SAP-SAM and BPMAI are collections of models that were created over the course of roughly a decade on a platform that researchers, teachers, and students could use to create business (process) models. As both datasets are created by humans, they provide the closest representation of typical modeling projects. Both have similar characteristics, with SAP-SAM being an essentially larger version of BPMAI.²

We chose to use the BPMAI dataset for our experiments for two primary reasons. First, it is more widely employed in previous research [8, 33] for the label prediction task. Within this paper, we analyze for the first time to what extent standard KGE approaches can be used for label prediction. Thus, we focus only on results for a process model repository that has already been used for evaluating the activity recommendation problem. Second, the size of the BPMAI dataset is more manageable, allowing us to conduct more in-depth experiments, including hyperparameter tuning for knowledge graph embedding (KGE) models, thus providing more insights than could be achieved by using the SAP-SAM dataset.³

² While the SAP-SAM dataset contains more than one million business (process) models, its subset, the BPMAI dataset, includes roughly 30 thousand models.

³ Note that we performed the experiments on two separate computers: an Intel® Xeon® CPU E5-2640 v4 @ 2.40 GHz and an Intel® Xeon® Silver 4114 CPU @ 2.20 GHz. The runtimes of the evaluated KGE models remain in a reasonable range of at most 48 h for the hyperparameter search given a particular translation approach.
Contrary to previous work, we only use the last revisions of the BPMN 2.0 models in the collection. In contrast to using all revisions, we thus represent the possible case that the recommendation methods sometimes have only few or even no domain-specific reference models available for the suggestion of activities. This makes the datasets more realistic and increases the hardness of the test. Moreover, out of all last-revision BPMN 2.0 models, we use only those with 3 to 50 activities and English labels. This choice results in a process model repository consisting of 3,688 process models. On average, the processes involve 14.3 activities, while the standard deviation equals 8.3.

4.2 Evaluation Setup

We employed an 80%-10%-10% data split to separate the process model repository into train, validation and test splits. For the experiments, we create one recommendation task for every process model in the validation and test split.

Evaluation procedure. We want to evaluate the approaches on realistic recommendation tasks. Therefore, we use an evaluation procedure in which we simulate the current status of a process model under development from a given business process graph by specifying the amount of information that is available for the recommendation. The basic idea is to remove some of the nodes and all edges connected to these nodes from a given business process graph while treating the remaining graph as the intermediate result of a modeling process. The employed evaluation procedure is based on breadth-first search. In this procedure, we randomly select one activity, which is neither a source nor a sink node, as the one to be predicted. During the evaluation, we therefore filter out processes which do not have a chain of at least three different activities when executed. Then we hide the label of the chosen activity and determine the shortest path s from a source node to the activity. After that, we hide all other activities that are not on a path of length s starting from a source node, while the remaining activities and the edges between them serve as a context for the recommendation task.
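The sketch below illustrates one possible reading of this procedure using networkx (used here purely for illustration; the original implementation may differ): a random non-source, non-sink activity is chosen as the prediction target, its shortest distance s from a source node is determined, and only activities reachable from a source within s steps are kept as context.

```python
import random
import networkx as nx

def build_recommendation_task(graph: nx.DiGraph, seed: int = 0):
    """Simulate a process model under development for one evaluation task."""
    rng = random.Random(seed)
    sources = [n for n in graph if graph.in_degree(n) == 0]

    # Choose a random activity that is neither a source nor a sink node.
    candidates = [n for n in graph if graph.in_degree(n) > 0 and graph.out_degree(n) > 0]
    target = rng.choice(candidates)  # its label will be hidden and predicted

    # Shortest path length s from a source node to the chosen activity.
    s = min(nx.shortest_path_length(graph, src, target)
            for src in sources if nx.has_path(graph, src, target))

    # Hide all activities that cannot be reached from a source within s steps;
    # the remaining nodes and the edges between them serve as the context.
    keep = {n for src in sources
            for n in nx.single_source_shortest_path_length(graph, src, cutoff=s)}
    return target, graph.subgraph(keep).copy()

example = nx.DiGraph([("start", "a"), ("a", "b"), ("b", "c"), ("c", "end")])
hidden_activity, context = build_recommendation_task(example)
print(hidden_activity, sorted(context.nodes))
```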
Evaluated methods. The evaluated methods encompass the rule learner AnyBURL [22] and five KGE models (also referred to as embedding-based methods). To apply the knowledge graph embedding models, we use the PyTorch-based library libKGE [34]. The selected KGE models encompass TransE [12] and RotatE [19] from the translational family, along with two factorization models: DistMult [13] and TuckER [20]. Additionally, we incorporate HittER [21], which utilizes the Transformer model.

For the KGE model HittER, we opt for the context-independent Transformer model available in libKGE. This variant comprises a three-layer entity Transformer, excluding the six-layer context Transformer [21]. We refer to this model in the following as HittER*. We also tested other popular KGE models (ComplEx [35] and ConvE [36]), but they yielded comparatively poor results that we do not report here.

Additionally, the results of the rule-based activity recommendation method [8] and the results of the embedding-based technique RLRecommender [15] are included in our evaluation. Both techniques have been developed with a special focus on the activity recommendation problem. One of our main goals within this paper is to find out whether these specialized techniques perform worse or better compared to the general knowledge graph completion methods.

Datasets. We take each of the three translation approaches defined in Sect. 3.2 along with their augmented counterparts (see Sect. 3.3) to construct six different knowledge graphs for our experiments. Each of these knowledge graphs captures the triples obtained from process models in the training split (with or without augmentation), complemented with the triples describing the context of the process models under development from the validation and test splits.

Table 1 Statistics for the six knowledge graphs used to evaluate knowledge graph embedding models. With “T” and “A” as the employed translation and augmentation approaches, “Entities” as the number of entities in the knowledge graph, “Rel.” as the number of relation types, and “Train”, “Valid.”, “Test” as the number of triples per split

Table 1 shows the statistics of these graphs: the number of entities, relation types, and number of triples in the training, validation, and test sets. While translation approaches 1 and 2a yield 70,876 entities, which is the sum of the total number of activity nodes (48,677) and activity labels (22,199), approach 2b also considers processes as entities, which adds the total number of processes in the evaluation (3,672) to the sum. The relations followedBy and hasLabel are the basis for all approaches. In the training set, there exist 57,241 instances of the followedBy relation across all datasets. Additionally, there are 47,909 instances of the hasLabel relation in the datasets that are not augmented; the count of this relation doubles in the augmented datasets.
RLRecommender, which was published in 2018, has an MRR of 23.8% and a Hits@10 score of 35.1%. The latter means that in roughly one third of all recommendations the correct one is among the top-10 list of recommendations. These results have clearly been topped by the rule-based approach proposed in 2021 (see [8]), with an MRR of 41.4% and a Hits@10 score of 47.5%, which means that in nearly half of all recommendation tasks a correct recommendation is among the top-10 list. This is a clear improvement and divides the space of possible results into three areas: (i) outcomes inferior to RLRecommender, (ii) outcomes at least comparable to RLRecommender but falling short of the state-of-the-art rule-based approach (highlighted in bold in Table 2), and (iii) outcomes as accurate as or potentially outperforming the state-of-the-art rule-based approach (highlighted in bold and underlined in Table 2).

When looking at the results obtained by the general knowledge graph completion methods, we observe that the Hits@10 numbers are at least 10 percentage points worse than those achieved by the rule-based activity recommendation approach, and the MRR scores are also worse. This means that no combination of tested methods and translation approaches works well for the activity recommendation problem. While some of the KGE models demonstrate Hits@10 scores that are better than those achieved by RLRecommender, it is evident that both specialized methods, particularly the rule-based approach, significantly outperform the general knowledge graph completion methods.

In terms of Hits@10, TuckER stands out as the most proficient model. AnyBURL and DistMult achieve similar yet slightly worse results. These top three models yield results at least on par with RLRecommender. Notably, AnyBURL and DistMult exhibit robust stability, consistently delivering comparable outcomes across the various translation approaches. In contrast, TuckER's performance experiences a decline specifically when employing translation approach 2a. Surprisingly, providing more information through inSameProcess or inProcess relations does not always translate into higher Hits@10 figures for TuckER. This trend extends beyond TuckER, encompassing all knowledge graph completion methods, with the exception of DistMult. In the case of DistMult, the incorporation of supplementary information through inSameProcess or inProcess relations consistently leads to enhanced performance, observed through improvements in both Hits@10 and MRR metrics.

While some of the general knowledge graph completion methods exhibit Hits@10 results comparable to RLRecommender, they perform surprisingly badly with respect to the MRR metric. DistMult emerges as the most successful model, followed by AnyBURL and TuckER. However, none of these three models manages to surpass RLRecommender, and evidently the performance gap between these models and the rule-based approach is quite significant.

Fig. 3 An example of a process model where the completion task (n4, hasLabel, ?) has to be solved, given that Search for Units Available is the correct answer, and the top-10 recommendations of TransE in combination with translation approach 1

Examination of the Predictions. To spot the reason for these unexpected results, we looked at some concrete cases. Figure 3 shows a process model in the validation set, where the activity Search for Units Available has been randomly selected as the one to be predicted. The use of translation approach 1 combined with TransE yields the top-10 recommendations shown on the right.

The correct activity Search for Units Available is at position eight of the recommendation list, which means that the MRR would be 1/8 = 0.125, if this was the only completion task of the whole evaluation. Surprisingly, some of the items in the recommendation list are not labels but activity nodes of the given process model, i.e., n3 and n4, as well as activity labels like Register or Enter Personal Details that have already been used in the process model. Clearly, the recommendation of activity nodes is not useful, since we are interested in the prediction of activity labels. Also, it is likely that activity labels that have already been inserted into the process model are not added a second time. Therefore, we decided to apply a post-processing step in which we filter out recommendations other than labels, i.e., activity nodes and processes, as well as activity labels that are already present in the given process model. In the example of Fig. 3, this means that the correct prediction moves from position eight to position four of the recommendation list. This corresponds to an MRR improvement from 0.125 to 0.25.
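This post-processing amounts to a simple filter over the ranked candidates. The sketch below, with a candidate list loosely modeled after the example in Fig. 3 (only Register, Enter Personal Details and the correct label are taken from the paper; the remaining names are invented), removes activity nodes, processes, and labels that are already used in the model, and reports the resulting change of the reciprocal rank.

```python
def filter_ranking(ranking, activity_labels, used_labels):
    """Keep only candidates that are activity labels and not yet used
    in the process model under development."""
    return [c for c in ranking if c in activity_labels and c not in used_labels]

# Ranked candidate list, loosely modeled after Fig. 3.
ranking = ["n3", "Register", "n4", "Enter Personal Details",
           "Pay Fees", "Check Availability", "Confirm Booking",
           "Search for Units Available"]
activity_labels = set(ranking) - {"n3", "n4"}          # n3 and n4 are activity nodes
used_labels = {"Register", "Enter Personal Details"}   # already present in the model

filtered = filter_ranking(ranking, activity_labels, used_labels)
old_rank = ranking.index("Search for Units Available") + 1   # position 8
new_rank = filtered.index("Search for Units Available") + 1  # position 4
print(old_rank, new_rank, 1 / old_rank, 1 / new_rank)        # 8 4 0.125 0.25
```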
Results with Post-processing. We applied this post-processing to all our experimental results. This resulted in the significant improvements shown in Table 3. The percentages in brackets indicate the improvement of the associated Hits@10 or MRR numbers in comparison to the results without post-processing.
It seems that the general knowledge graph completion methods are not capable of distinguishing between different types of entities. This has already been observed in [38], where the authors argue that especially knowledge graph embedding methods often violate constraints on the domain and range of relations. In our case, this corresponds to a missing distinction between activity nodes and activity labels. Moreover, none of the approaches, including the rule learner AnyBURL, was able to learn that a label that has already been used in the process under development will not be used for another activity node in the same process.

If we now compare the results of the general knowledge graph completion methods with post-processing to the results of the specialized methods (i.e., RLRecommender and the rule-based activity recommendation), we observe that the gap between general and specialized methods has become less significant after post-processing. Except for HittER*, every general knowledge graph completion technique demonstrates results that are, at the very least, comparable to those achieved by RLRecommender.

As described in Sect. 2, RLRecommender [15] is based on a rather specific approach to using embeddings, in which only one related activity in the process model is used for the recommendation of an activity. In contrast, knowledge graph embedding (KGE) models do not face such constraints and, as a result, have the potential to surpass RLRecommender in performance. This is particularly evident in the case of translation approach 2b, where models such as TuckER, DistMult and TransE exhibit superior performance in Hits@10 and MRR.

The post-processing step notably enhances the performance of the TransE model, leading to its superior results in the Hits@10 metric. However, the TransE results strongly depend on the translation approach. A similar observation applies to TuckER, which excels in terms of MRR. In contrast, both DistMult and AnyBURL showcase greater stability, consistently delivering comparable results across the various translation approaches.
5 Discussion and Practical Implications

In this section, we analyse the performance of the generic knowledge graph completion methods in more detail. Our goal is to explore why these methods do not reach state-of-the-art performance and to discuss the practical implications of our findings.

Based on the results provided in Table 3, translation approach 2b proves to be the most effective for the KGE models TransE, TuckER and DistMult. This outcome underscores the value of the additional explicit information presented in triples featuring the inProcess relation. In contrast, approach 2a works comparably poorly in combination with the embedding-based methods. One reason for this could be that in approach 2a the nodes are strongly interconnected via the relation inSameProcess. Thus, the interconnection of the nodes via inSameProcess is more prominent than via the relation followedBy. This can be disadvantageous, since the co-occurrence patterns depicted by inSameProcess are often less relevant than the structural patterns captured by followedBy, which avoid recommending activities that have high co-occurrence statistics but are not relevant at the current model position.

Unlike the embedding-based methods, AnyBURL achieves the best results when using translation approach 2a. While this approach only needs one triple (m, inSameProcess, n) to express that two nodes m and n are in a process p, approach 2b needs the two triples (m, inProcess, p) and (n, inProcess, p). This has a direct impact on the regularities that can be captured by AnyBURL. A rule such as

hasLabel(X, register) ← inSameProcess(X, Y), hasLabel(Y, upload documents)

is within the supported language bias, while the equivalent rule will have a body length of three based on translation approach 2b and is thus out of scope.
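For illustration, the corresponding regularity over translation approach 2b requires three body atoms (our own rendering of the equivalent rule, not taken verbatim from the original paper):

hasLabel(X, register) ← inProcess(X, P), inProcess(Y, P), hasLabel(Y, upload documents)

and therefore falls outside the supported language bias.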
Overall, the rule-based method [8], which has specifically been designed for activity recommendation, surpasses the performance of the best general knowledge graph completion methods, exhibiting a minimum improvement of 5 % in Hits@10 and 9 % in MRR. These significant differences illustrate that a general knowledge graph completion method cannot compete with an approach which has specifically been designed for activity recommendation in business process modeling. This remains the case despite our thorough exploration of various translation approaches, augmentation of the knowledge graph, and the development of a problem-specific filtering as a post-processing step to increase the quality of the results.

The specific regularities crucial for effective activity recommendations appear to exert only a limited influence on the resulting embedding space. These specifics are reflected in the types of rules supported by the rule-based recommendation method [8], which are also more expressive than the general rule types supported by AnyBURL. We can conclude that general methods for knowledge graph completion are not flexible enough to adapt to the given problem, resulting in a relatively low prediction quality.

The limited predictive accuracy of knowledge graph embedding (KGE) models stems from their focus solely on semantic evidence, disregarding local evidence such as subgraphs around query triples. These models learn triples independently, as they learn an embedding for each entity and relation. Therefore, they may overlook sequential information inherent in triples [39]. In our knowledge graph, the followedBy relation is limited to activity nodes. Consequently, it is impossible for KGE models to capture similar sequential relations between activity label nodes, because information about such sequential relationships is only preserved along the paths between activity label nodes. Therefore, crucial information about the sequential order of activity labels may be overlooked by KGE models. To address this, path-based knowledge graph completion methods like NBFNet [40], A∗Net [41], and RED-GNN [39] can be applied to the activity recommendation problem in future research. These models predict relations between entities by employing message-passing neural networks [42] over all paths between them. Like rule-based methods such as AnyBURL, they provide interpretability by identifying the most influential paths for predicting the label of a newly added activity node in the process model.

KGE models also face the challenge of overfitting, evident in differences between the loss function values for training, validation, and test triples. To understand this limitation, we compared our study's datasets with common benchmarks for knowledge graph completion. Our datasets have notably fewer relation types and are sparser. For example, while FB15K-237 [43] and Yago3-10 [44] contain 237 and 37 relation types, ours only have 2 or 3. Similarly, the ratio of triples to entities is much lower in our datasets compared to the benchmark datasets.

Due to the sparse nature of our datasets and their limited relation types, general knowledge graph embedding (KGE) models exhibit excessive flexibility in embedding entities in the vector space. Consequently, entity embeddings are susceptible to initialization and randomness and, more importantly, are prone to overfitting. Notably, the overfitting issue is pronounced for more flexible models like HittER* and RotatE. In contrast, a simpler model like TransE tends to yield superior results. Additionally, we found that KGE models with smaller embedding sizes often perform better. Leveraging larger training datasets, such as the SAP-SAM dataset [32], can potentially reduce overfitting risks when training general KGE models for activity recommendation tasks. Furthermore, investigating alternative translation methods, including inverse relationships like precededBy and behavioural relationships such as indirectFollowedBy, holds significance for future research. These additional relations could augment knowledge graph density and informativeness, consequently diminishing the likelihood of overfitting.

We showed that generic knowledge graph completion methods do not match the performance of specialized rule-based approaches as outlined in [8]. However, our findings have broader implications. They suggest greater potential for adapting knowledge graph completion methods in the business process domain, particularly in process modeling. Moreover, our results indicate that knowledge graph embedding (KGE) models may underperform on sparse knowledge graphs with few relation types. This underscores the need for further research on KGE models tailored to different characteristics of knowledge graphs beyond those found in benchmark datasets.

6 Conclusion and Future Work

In this paper, we presented different approaches to use knowledge graph completion methods for activity recommendation in business process modeling. A problem-specific filtering as a post-processing step improved the quality of the predictions. However, the rule-based activity recommendation method [8] still worked better than the application of various general knowledge graph completion methods, which revealed their lack of flexibility to adapt to the given problem. In summary, we conclude that the use of generic knowledge graph completion methods is not a good choice for solving the activity recommendation problem. Our empirical results indicate that the use of the problem-specific rule-based approach proposed in [8] is currently the best solution for any application, e.g., activity recommendation in a process model editor (see Fig. 1), that requires presenting ranked recommendations to a user.

A key limitation of the general knowledge graph completion methods employed in this study is their lack of explicit encoding of local sub-graphs between entity pairs. Consequently, these models may struggle to capture intricate regularities crucial for determining the label of the unlabeled activity node. A potential remedy to tackle this challenge involves the use of Neural Bellman-Ford Networks, in which the representation of a pair of entities is defined as the generalized sum of all the path representations connecting them [40]. Similar path-based knowledge graph completion approaches like A∗Net [41] and RED-GNN [39] can be applied to the activity recommendation problem, offering several advantages. Firstly, these approaches can learn not only from semantic evidence but also from local evidence provided by relational paths between entities [41]. Secondly, they identify influential paths between entities for predicting labels, enhancing explainability. Thirdly, they feature a lower number of model parameters compared to KGE models, potentially reducing the risk of overfitting and improving predictive accuracy [40]. Lastly, path-based methods can generalize to unseen entities, as path semantics are determined solely by relations rather than entities [41]. To implement path-based approaches, the knowledge graph representation of the process model repository needs augmentation with inverse relations (e.g., precededBy) and identity relations (i.e., self-loops for all entities). Besides path-based approaches, alternative techniques working on (RDF) knowledge graphs, e.g., Graph Convolutional Networks [45] or RDF2Vec [46], could be tested in the future.

Another direction for future research is related to the current state of the process model under development that should be taken into account as a context for the recommendation of an activity. When putting the context into the training set, as done for the embedding-based methods, the KGE models have to be retrained after every activity that has been added to the model under development. This is very impractical for the application of such methods for real-time recommendations, as the training takes too much time. In future work, we would like to explore ways of avoiding the need for complete retraining when the process model under development has been extended, as has, for example, been done by Song et al. [47] for evolving knowledge graphs and translation-based embeddings.

Last but not least, it is important to acknowledge that our current exploration has focused exclusively on co-occurrence and structural patterns, a choice negatively influenced by the inherent sparseness of the knowledge graphs. In future research, we intend to incorporate textual patterns, potentially leveraging approaches such as KG-BERT [48].

Reproducibility: Our source code and the configuration files for the different knowledge graph completion methods are publicly accessible at https://github.com/keyvan-amiri/KGE-ActivityRecommendation.

Funding Open Access funding enabled and organized by Projekt DEAL.

Data availability The Business Process Management Academic Initiative (BPMAI) dataset is publicly accessible at https://zenodo.org/records/3758705, while the different knowledge graphs obtained from this dataset (as described in this paper) are also accessible at our project repository: https://github.com/keyvan-amiri/KGE-ActivityRecommendation.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Dumas M, La Rosa M, Mendling J, Reijers HA (2013) Fundamentals of Business Process Management. Springer, Berlin
2. Frederiks PJ, Van der Weide TP (2006) Information modeling: The process and the required competencies of its participants. DKE 58(1):4–20
3. Friedrich F, Mendling J, Puhlmann F (2011) Process model generation from natural language text. CAiSE. pp. 482–496. Springer
4. Fellmann M, Zarvic N, Metzger D, Koschmider A (2015) Requirements catalog for business process modeling recommender systems. WI. pp. 393–407
5. Cao B, Yin J, Deng S, Wang D, Wu Z (2012) Graph-based workflow recommendation: on improving business process modeling. CIKM. pp. 1527–1531. ACM
6. Deng S, Wang D, Li Y, Cao B, Yin J, Wu Z, Zhou M (2017) A recommendation system to facilitate business process modeling. IEEE Trans Cybern 47(6):1380–1394
7. Li Y, Cao B, Xu L, Yin J, Deng S, Yin Y, Wu Z (2014) An efficient recommendation method for improving business process modeling. IEEE Trans Indust Inf 10(1):502–513
8. Sola D, Meilicke C, Van der Aa H, Stuckenschmidt H (2021) A rule-based recommendation approach for business process modeling. In: CAiSE. Springer
9. Sola D, van der Aa H, Meilicke C, Stuckenschmidt H (2022) Exploiting label semantics for rule-based activity recommendation in business process modeling. Inf Syst 108:102049
10. Sola D, van der Aa H, Meilicke C, Stuckenschmidt H (2023) Activity recommendation for business process modeling with pre-trained language models. In: European Semantic Web Conference. pp. 316–334. Springer
11. Rossi A, Barbosa D, Firmani D, Matinata A, Merialdo P (2021) Knowledge graph embedding for link prediction: A comparative analysis. ACM Trans Knowl Discov Data (TKDD) 15(2):1–49
12. Bordes A, Usunier N, García-Durán A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. NIPS. pp. 2787–2795
13. Yang B, tau Yih W, He X, Gao J, Deng L (2015) Embedding entities and relations for learning and inference in knowledge bases. ICLR (Poster)
14. Sola D, Meilicke C, van der Aa H, Stuckenschmidt H (2022) On the use of knowledge graph completion methods for activity recommendation in business process modeling. In: Marrella A, Weber B (eds) Business Process Management Workshops. Springer International Publishing, Cham, pp 5–17
15. Wang H, Wen L, Lin L, Wang J (2018) RLRecommender: A representation-learning-based recommendation method for business process modeling. ICSOC. pp. 478–486. Springer
16. Jannach D, Fischer S (2014) Recommendation-based modeling support for data mining processes. RecSys. pp. 337–340
17. Jannach D, Jugovac M, Lerche L (2016) Supporting the design of machine learning workflows with a recommendation system. ACM TiiS 6(1):1–35
18. Goldstein M, González-Álvarez C (2021) Augmenting modelers with semantic autocompletion of processes. In: Polyvyanyy A, Wynn MT, Van Looy A, Reichert M (eds) Business Process Management Forum. Springer International Publishing, pp 20–36
19. Sun Z, Deng ZH, Nie JY, Tang J (2019) RotatE: Knowledge graph embedding by relational rotation in complex space. arXiv:1902.10197 [cs, stat]
20. Wang Y, Broscheit S, Gemulla R (2019) A relational Tucker decomposition for multi-relational link prediction. arXiv:1902.00898 [cs, stat]
21. Chen S, Liu X, Gao J, Jiao J, Zhang R, Ji Y (2021) HittER: Hierarchical Transformers for knowledge graph embeddings. arXiv:2008.12813 [cs]
22. Meilicke C, Chekol MW, Ruffinelli D, Stuckenschmidt H (2019) Anytime bottom-up rule learning for knowledge graph completion. IJCAI. pp. 3137–3143. AAAI Press
23. Meilicke C, Chekol MW, Betz P, Fink M, Stuckenschmidt H (2023) Anytime bottom-up rule learning for large-scale knowledge graph completion. The VLDB Journal pp. 1–31
24. Toutanova K, Chen D (2015) Observed versus latent features for knowledge base and text inference. In: Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality. pp. 57–66
25. Hamaguchi T, Oiwa H, Shimbo M, Matsumoto Y (2017) Knowledge transfer for out-of-knowledge-base entities: A graph neural network approach. arXiv:1706.05674
26. Schramm S, Wehner C, Schmid U (2023) Comprehensible artificial intelligence on knowledge graphs: A survey. J Web Semant 79:100806
27. Betz P, Meilicke C, Stuckenschmidt H (2022) Adversarial explanations for knowledge graph embeddings. IJCAI 2022:2820–2826
28. Fahland D, Lübke D, Mendling J, Reijers H, Weber B, Weidlich M, Zugal S (2009) Declarative versus imperative process modeling languages: The issue of understandability. In: Halpin T, Krogstie J, Nurcan S, Proper E, Schmidt R, Soffer P, Ukor R (eds) Enterprise, Business-Process and Information Systems Modeling. Springer, pp 353–366
29. Dijkman RM, Dumas M, García-Bañuelos L (2009) Graph matching algorithms for business process model similarity search. BPM. vol. 5701, pp. 48–63. Springer
30. Hubert N, Monnin P, Brun A, Monticolo D (2022) New strategies for learning knowledge graph embeddings: The recommendation case. In: International Conference on Knowledge Engineering and Knowledge Management. pp. 66–80. Springer
31. Model collection of the BPM Academic Initiative, http://bpmai.org/
32. Sola D, Warmuth C, Schäfer B, Badakhshan P, Rehse JR, Kampik T (2022) SAP Signavio Academic Models: a large process model dataset. In: International Conference on Process Mining. pp. 453–465. Springer
33. Sola D (2020) Towards a rule-based recommendation approach for business process modeling. In: ICSOC PhD Symposium. Springer
34. Broscheit S, Ruffinelli D, Kochsiek A, Betz P, Gemulla R (2020) LibKGE - A knowledge graph embedding library for reproducible research. System Demonstrations, EMNLP, pp 165–174
35. Trouillon T, Welbl J, Riedel S, Gaussier É, Bouchard G (2016) Complex embeddings for simple link prediction. In: International Conference on Machine Learning. pp. 2071–2080. PMLR
36. Dettmers T, Minervini P, Stenetorp P, Riedel S (2018) Convolutional 2D knowledge graph embeddings. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 32
37. Ruffinelli D, Broscheit S, Gemulla R (2020) You CAN teach an old dog new tricks! On training knowledge graph embeddings. ICLR. OpenReview.net
38. Hubert N, Monnin P, Brun A, Monticolo D (2023) Sem@k: Is my knowledge graph embedding model semantic-aware? arXiv:2301.05601
39. Zhang Y, Yao Q (2022) Knowledge graph reasoning with relational digraph. In: Proceedings of the ACM Web Conference 2022. pp. 912–924. WWW '22, Association for Computing Machinery. https://doi.org/10.1145/3485447.3512008
40. Zhu Z, Zhang Z, Xhonneux LP, Tang J (2021) Neural Bellman-Ford Networks: A general graph neural network framework for link prediction. In: Advances in Neural Information Processing Systems. vol. 34, pp. 29476–29490. Curran Associates, Inc
41. Zhu Z, Yuan X, Galkin M, Xhonneux LP, Zhang M, Gazeau M, Tang J (2023) A*Net: A scalable path-based reasoning approach for knowledge graphs. Advances in Neural Information Processing Systems 36:59323–59336
42. Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. In: Precup D, Teh YW (eds) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 70, pp. 1263–1272. PMLR
43. Toutanova K, Chen D (2015) Observed versus latent features for knowledge base and text inference. In: Allauzen A, Grefenstette E, Hermann KM, Larochelle H, Yih SWt (eds) Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality. pp. 57–66. Association for Computational Linguistics
44. Rebele T, Suchanek F, Hoffart J, Biega J, Kuzey E, Weikum G (2016) YAGO: A multilingual knowledge base from Wikipedia, WordNet, and GeoNames. In: Groth P, Simperl E, Gray A, Sabou M, Krötzsch M, Lecue F, Flöck F, Gil Y (eds) The Semantic Web - ISWC 2016. Springer International Publishing, pp 177–185
45. Schlichtkrull M, Kipf TN, Bloem P, van den Berg R, Titov I, Welling M (2018) Modeling relational data with graph convolutional networks. In: ISWC. pp. 593–607. Springer
46. Ristoski P, Paulheim H (2016) RDF2Vec: RDF graph embeddings for data mining. ISWC. pp. 498–514. Springer
47. Song HJ, Park SB (2018) Enriching translation-based knowledge graph embeddings through continual learning. IEEE Access 6:60489–60497
48. Yao L, Mao C, Luo Y (2019) KG-BERT: BERT for knowledge graph completion. CoRR abs/1909.03193