Design and Implementation of Process Mining System Based on α-Algorithm

Yong Ling¹, Liqun Zhang², Meng Gao³

School of Computer Science and Technology, Shandong University, 250101, Jinan, P.R. China
¹lingyong@mail.sdu.edu.cn ²zhanglq@sdu.edu.cn ³gaomengde@gmail.com

Abstract

In a Service-Oriented Architecture (SOA), analyzing business process activities and redesigning processes are important concerns. Creating a process model is a time-consuming task that requires the participation of experts. As a "Business Process Reengineering" technology, process mining is an important way of monitoring business activities and improving the efficiency of workflow modeling, and it is an emerging research field. In this paper, we design and implement a graphical, interactive process mining system based on the α-algorithm. Its core idea is to extract information about processes from the workflow log; we then validate the mining result.

1. Introduction

Traditional modeling approaches face new challenges, and how to create a business process model correctly and efficiently has become an important open issue.

Over its life cycle, the traditional modeling approach exposes the following problems. First, modeling a workflow is far from trivial: it requires deep knowledge of the business process at hand and of the workflow language being used. Second, complex modeling consumes a great deal of manpower and time, and a workflow model constructed this way inevitably contains human oversights and errors. Lastly, the traditional approach focuses on the design and configuration phases; less attention is paid to the enactment phase, and few organizations systematically collect run-time data to be analyzed as input for redesign (i.e., the diagnosis phase is typically missing). To solve these problems, we use process mining as the method of distilling a structured process description from a set of real executions. In this paper, we design and implement such a process mining system.

The core idea of the process mining system is to use the workflow log, gathered from information systems as the processes take place, to construct the process model. As a technology for business process redesign (BPR) [1], it has the potential to mine enterprise-level information systems. The workflow log contains "sufficient" information about the workflow process as it is actually being executed. Process mining techniques can therefore be used to create a feedback loop that adapts the workflow model to changing circumstances and detects imperfections in the design.

2. The idea of system implementation

2.1. Language model

Since the mid-1990s, researchers have proposed many modeling languages for business processes. Representative model definition languages include the ADONIS modeling language, Petri nets and some others. Aalst presented the WF-net (workflow net) on the basis of Petri nets, building on the effectiveness and activity of Petri nets. Moreover, Aalst defined the safety and robustness of a workflow, which has had extensive influence. Compared with other modeling languages, WF-nets are particularly well suited to business processes. This paper discusses process mining on the basis of WF-nets. Figure 1 shows an example of a WF-net.

Figure 1. An example of WF-net
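To make the notion of a WF-net more concrete, the following minimal Java sketch stores places, transitions and arcs and checks two structural properties commonly required of workflow nets: exactly one source place and exactly one sink place. The paper does not restate these properties, so they are taken here from the usual workflow-net definition. The class name `WfNetSketch`, its methods, and the example net are illustrative assumptions; the example net is hypothetical and is not the net of Figure 1.

```java
import java.util.*;

// Illustrative sketch only: a bare-bones workflow-net structure.
// All names are assumptions; the example net is NOT the net of Figure 1.
final class WfNetSketch {
    final Set<String> places = new LinkedHashSet<>();
    final Set<String> transitions = new LinkedHashSet<>();
    // Arcs, stored as: node -> nodes it points to.
    final Map<String, Set<String>> successors = new HashMap<>();

    void addPlace(String p) { places.add(p); }
    void addTransition(String t) { transitions.add(t); }

    // Arcs may only connect a place to a transition or a transition to a place.
    void addArc(String from, String to) {
        boolean ok = (places.contains(from) && transitions.contains(to))
                  || (transitions.contains(from) && places.contains(to));
        if (!ok) throw new IllegalArgumentException("invalid arc " + from + " -> " + to);
        successors.computeIfAbsent(from, k -> new LinkedHashSet<>()).add(to);
    }

    // Places without incoming arcs (a workflow net is expected to have exactly one).
    List<String> sourcePlaces() {
        Set<String> targets = new HashSet<>();
        successors.values().forEach(targets::addAll);
        List<String> result = new ArrayList<>();
        for (String p : places) if (!targets.contains(p)) result.add(p);
        return result;
    }

    // Places without outgoing arcs (a workflow net is expected to have exactly one).
    List<String> sinkPlaces() {
        List<String> result = new ArrayList<>();
        for (String p : places) if (!successors.containsKey(p)) result.add(p);
        return result;
    }

    // Hypothetical sequential net: i -> A -> p1 -> B -> o.
    static WfNetSketch example() {
        WfNetSketch net = new WfNetSketch();
        for (String p : new String[] {"i", "p1", "o"}) net.addPlace(p);
        for (String t : new String[] {"A", "B"}) net.addTransition(t);
        net.addArc("i", "A");
        net.addArc("A", "p1");
        net.addArc("p1", "B");
        net.addArc("B", "o");
        return net;
    }

    public static void main(String[] args) {
        WfNetSketch net = example();
        System.out.println("source places: " + net.sourcePlaces());
        System.out.println("sink places:   " + net.sinkPlaces());
    }
}
```

The sketch checks structure only; behavioral properties of the net are outside its scope.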

The 3rd International Conference on Innovative Computing Information and Control (ICICIC'08)
Authorized licensed use limited to: Shandong University of Science and Technology. Downloaded on March 03,2024 at 09:10:07 UTC from IEEE Xplore. Restrictions apply.
978-0-7695-3161-8/08 $25.00 © 2008 IEEE
2.2. Language model of mining α-algorithm

The α-algorithm [2] takes a workflow log as input and returns a Petri net as output. The log must ensure that: (1) each task refers to a case (i.e., a workflow instance); (2) each task has a sponsor; (3) each task records an initiation time and a termination time. The time records ensure that the tasks in the log are totally ordered. A specific example is given in Table 1 in Section 3. Based on this kind of log, Definition 1 [2] divides the relations between tasks into four ordering relations: $>_W$, $\rightarrow_W$, $\#_W$ and $\|_W$.

Definition 1 (Log-based ordering relations). Let $W$ be a workflow log over $T$, i.e., $W \in \mathcal{P}(T^*)$. Let $a, b \in T$:
(1) $a >_W b$ if and only if there is a trace $s = t_1 t_2 \ldots t_{n-1} t_n$ and $i \in \{1, \ldots, n-1\}$ such that $s \in W$, $a = t_i$ and $b = t_{i+1}$;
(2) $a \rightarrow_W b$ if and only if $a >_W b$ and $b \not>_W a$;
(3) $a \#_W b$ if and only if $a \not>_W b$ and $b \not>_W a$;
(4) $a \|_W b$ if and only if $a >_W b$ and $b >_W a$.

3. Implementation of process mining system

Process mining can display how tasks are actually processed, which enables real-time monitoring of changes in business processes and improves their flexibility. Process mining involves close interaction with users, so good interactivity in the system is indispensable. To address this issue, the system takes two measures: (1) it uses a dedicated node layout algorithm and lets the user adjust the positions of the nodes and edit their properties; (2) it provides a method to measure the quality of the mining result and help the user complete the final design.

3.1. The architecture of system

The process mining system is implemented in Java. After pretreatment, the workflow log is passed to the core of the system, which is based on the α-algorithm. The system provides an interactive GUI module that integrates the node layout algorithm, defines and uses a file in WFML format to store the mining result, and finally offers a method to validate the model. Figure 2 shows the architecture of the system.

[Figure 2 components: Process Log, Pretreatment, The core of process mining, Node layout, Result view, GUI module, Introduce external WFML document, Output WFML document, Model validation, External WFML document, WFML document]

Figure 2. The architecture of system

3.2. Workflow log

The core of the process mining system is to extract information about processes from the workflow log and construct a process model. Any information system using transactional systems such as ERP, CRM, Outlook, or workflow management systems offers this information in some form. For example, Table 1 is a workflow log. For every case, some tasks have been executed, and every task has a record of the person who performed it and of its times. Later we will demonstrate the result of mining this log.

Table 1. A workflow log

Case    Task    Sponsor  Initiation Time  Termination Time
Case 1  Task A  Kate     07-5-8:10.01     07-5-8:10.05
Case 2  Task A  Kate     07-5-8:10.05     07-5-8:10.09
Case 3  Task A  Mike     07-5-8:10.16     07-5-8:10.21
Case 1  Task B  Pete     07-5-8:10.22     07-5-8:10.23
Case 1  Task C  Sue      07-5-8:10.26     07-5-8:10.40
Case 2  Task B  Pete     07-5-8:10.42     07-5-8:10.51
Case 1  Task D  Carol    07-5-8:10.51     07-5-8:10.57
Case 2  Task D  Carol    07-5-8:11.01     07-5-8:11.05
Case 1  Task E  Frank    07-5-8:11.05     07-5-8:11.17
Case 2  Task C  Sue      07-5-8:11.17     07-5-8:11.28
Case 3  Task F  Jones    07-5-8:11.29     07-5-8:11.35
Case 2  Task E  Frank    07-5-8:11.35     07-5-8:11.44
Case 3  Task G  Jones    07-5-8:11.46     07-5-8:12.02

3.3. Pretreatment of workflow log

The information system can generate a workflow log
with noise, such as a large number of duplicate tasks that do not correspond to the same case. Before extracting information from the event data in the workflow log, we should put the log into a standard form, then examine the validity of the data and remove redundant information. For an actual workflow log generated by an information system, inspecting and filtering the event data is tedious and complex work; besides that, human factors or other abnormal factors can make the workflow log incomplete. To simplify the problem, we assume that the workflow log is ideal, that is, it contains no noise and includes sufficient information.

3.4. Calculation of relations between two tasks

After pretreatment we obtain the standard workflow log, from which we calculate the relations between all tasks. Let $W$ be a workflow log over $T$. For any $a, b \in T$: $a \rightarrow_W b$, or $a \rightarrow_W^{-1} b$, or $a \#_W b$, or $a \|_W b$. Moreover, the relations $\rightarrow_W$, $\rightarrow_W^{-1}$, $\#_W$ and $\|_W$ are mutually exclusive and partition $T \times T$ [3]. To find a process model on the basis of a workflow log, the log should be analyzed for causal dependencies; e.g., if a task is always followed by another task, it is likely that there is a causal relation between the two tasks. Note that

$\rightarrow_W = (>_W \setminus >_W^{-1})$, $\rightarrow_W^{-1} = (>_W^{-1} \setminus >_W)$, $\#_W = (T \times T) \setminus (>_W \cup >_W^{-1})$, $\|_W = (>_W \cap >_W^{-1})$.

The relations $\rightarrow_W$, $\rightarrow_W^{-1}$, $\#_W$ and $\|_W$ are crucial information for any process mining algorithm. Since these relations can be derived from $>_W$, we assume the log to be complete with respect to this relation (i.e., if one task can follow another task directly, then the log should have registered this potential behavior) [3]. We investigate the relation between the causal relations detected in the log (i.e., $\rightarrow_W$) and the presence of places connecting transitions. Places are created based on the $\rightarrow_W$ and $\#_W$ relations [4].

3.5. Generation of process model

The computation above is the prerequisite of the mining algorithm. To implement the algorithm, we define some important data structures below.

1. LinkedQueue Class
The LinkedQueue class is a linked queue. It supports the basic queue operations and the search for specified data, and it can be used to compare the values of nodes; unlike an array, its length need not be defined in advance. It is the most fundamental and most important data structure in our algorithm.

2. Transition Class
It describes a transition in the WF-net. Each object stores an id and a unique name. Note that a task is represented by the Transition class, and that it is convenient to store all transitions in a LinkedQueue.

3. Place Class
It describes a place in the WF-net. This class is similar to the Transition class, so we do not elaborate.

4. CausalRelationBag Class

    class CausalRelationBag {
        // records the left (cause) transitions
        public LinkedQueue leftQueue = new LinkedQueue();
        // records the right (effect) transitions
        public LinkedQueue rightQueue = new LinkedQueue();
        public boolean fire = false;
        public boolean usable = true;
    }

The CausalRelationBag class defines a causal relation: leftQueue holds the causes whose occurrence can lead to the occurrence of rightQueue. The fields fire and usable are used to remove duplicate or no longer used data.

With the data structures described and the relations calculated above, the α-algorithm can be implemented easily.

Input: workflow log after pretreatment
Output: the linked queue including all places and transitions generated by mining the workflow log

    ActiveNodes Mining(File log) {
      (1) setOfTasks = the set of all tasks;
      (2) listOfFirst = the set of first nodes in all traces;
      (3) listOfLast = the set of last nodes in all traces;
      (4) for (each node a_i in setOfTasks) {
            let subTasks[k] be an arbitrary subset of setOfTasks;
            if (every a_j in subTasks[k] meets a_i ->W a_j,
                and any a_j1, a_j2 in subTasks[k] meet a_j1 #W a_j2) {
              leftSide.leftQueue.Add(a_i);
              leftSide.rightQueue.Add(subTasks[k]);
            }
          }
      (5) Combine(leftSide);
          // Remove the duplicate data in leftSide; merge the data in
          // leftSide.leftQueue that meet a_i1 #W a_i2 and correspond to the same subset
          // of rightQueue.
      (6) for (each node a_i in setOfTasks) {
            let subTasks[k] be an arbitrary subset of setOfTasks;
            if (every a_j in subTasks[k] meets a_j ->W a_i,
                and any a_j1, a_j2 in subTasks[k] meet a_j1 #W a_j2) {
              rightSide.leftQueue.Add(subTasks[k]);
              rightSide.rightQueue.Add(a_i);
            }
          }
      (7) Combine(rightSide);
          // Remove the duplicate data in rightSide; merge the data in
          // rightSide.rightQueue that meet a_i1 #W a_i2 and correspond to the same subset of
          // leftQueue.
      (8) Combination(leftSide, rightSide);
          // Merge leftSide and rightSide and remove duplicate elements. Search for
          // all entries with the same leftQueue/rightQueue; if one of their
          // corresponding rightQueue/leftQueue entries is contained in another, remove
          // the contained entry, otherwise leave them unchanged. The results are the
          // most simplified meta groups.
    }

Each of the most simplified meta groups represents a place that records its input and output transitions. From the places and transitions we can easily extract the directed arcs between them, and then use the layout algorithm and the GUI module to interact with the users. Figure 3 shows the process model mined from the log in Table 1.

Figure 3. The process model

4. Results Verification and Quality Metrics

The model validation module of our system can not only verify the actual results mined from the workflow log, but also assess the similarities and discrepancies between the process model and the processes that actually take place, which provides guidance. The quality of the model depends on how well it describes the executed process; similarly, whether the processes execute appropriately should be judged by the degree of consistency between the two. A discrepancy between the log and the model means that some improvement is needed.

To validate the actual effect, we designed a process simulation generator that produces log files as the input for process mining, and then compared the mined model with the original one. Comparing the model in Figure 3 with the original model, we can see that they express the same content. To some extent, the system is accurate in mining causal relations and concurrent relations and has good application value. In fact, the system can mine the structural relations in a workflow log when the log is noise-free and complete. At the same time, problems such as non-free-choice constructs and logs with noise need to be studied further.

5. Conclusions

In practice, process mining can solve some traditional problems in the workflow life cycle and reduce the workload of complex modeling, thereby supporting process improvement, the smooth progress of BPR and effective diagnosis. The system presented in this paper uses the workflow log to extract process information and construct the corresponding process model; at the same time, it can help business staff construct new processes based on requirements. At present, testing of the system's commercial usefulness is under way.

6. References:

[1] Van der Aalst W M P, Weijters A J M M, Process mining: a research agenda, Computers in Industry [J]. 2004, 53(3): 231-244.

[2] Van der Aalst W M P, Weijters A J M M, Maruster L, Workflow mining: discovering process models from event logs [C]. IEEE Transactions on Knowledge and Data Engineering, Eindhoven, 2002: 101-132.

[3] Van der Aalst W M P, Van Dongen B F, Herbst J, Maruster L, Schimm G, Weijters A J M M, Workflow mining: A survey of issues and approaches, Elsevier B.V., 2003.

[4] De Medeiros A K A, Van der Aalst W M P, Weijters A J M M, Workflow Mining: Current Status and Future Directions. R. Meersman et al. (Eds.), Springer-Verlag Berlin Heidelberg, 2003.
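As a supplementary illustration of Definition 1 (Section 2.2) and the relation calculation of Section 3.4, the following Java sketch derives the four log-based ordering relations from the log in Table 1. It assumes that each case's trace is obtained by sorting its tasks by initiation time, which gives ABCDE for Case 1, ABDCE for Case 2 and AFG for Case 3. The class `OrderingRelations` and its method names are hypothetical and are not part of the system described in the paper.

```java
import java.util.*;

// Sketch only: log-based ordering relations of Definition 1 for the log of Table 1.
// Class and method names are illustrative assumptions.
final class OrderingRelations {
    // Directly-follows pairs (a >_W b), encoded as "a>b".
    private final Set<String> follows = new HashSet<>();

    OrderingRelations(List<List<String>> traces) {
        for (List<String> trace : traces) {
            for (int i = 0; i + 1 < trace.size(); i++) {
                follows.add(trace.get(i) + ">" + trace.get(i + 1));
            }
        }
    }

    // True when a >_W b, i.e. b directly follows a in at least one trace.
    boolean directlyFollows(String a, String b) {
        return follows.contains(a + ">" + b);
    }

    // Returns "->" (causal), "<-" (inverse causal), "||" (parallel) or "#" (unrelated).
    String relation(String a, String b) {
        boolean ab = directlyFollows(a, b);
        boolean ba = directlyFollows(b, a);
        if (ab && ba) return "||";
        if (ab) return "->";
        if (ba) return "<-";
        return "#";
    }

    // Traces of Table 1, assuming tasks of each case are ordered by initiation time.
    static List<List<String>> table1Traces() {
        return Arrays.asList(
            Arrays.asList("A", "B", "C", "D", "E"),  // Case 1
            Arrays.asList("A", "B", "D", "C", "E"),  // Case 2
            Arrays.asList("A", "F", "G"));           // Case 3
    }

    public static void main(String[] args) {
        OrderingRelations r = new OrderingRelations(table1Traces());
        List<String> tasks = Arrays.asList("A", "B", "C", "D", "E", "F", "G");
        // Print one row per task with its relation to every other task.
        System.out.println("   " + String.join("  ", tasks));
        for (String a : tasks) {
            StringBuilder row = new StringBuilder(a + " ");
            for (String b : tasks) {
                row.append(String.format("%3s", r.relation(a, b)));
            }
            System.out.println(row);
        }
    }
}
```

Because all four relations are derived from the directly-follows pairs alone, the sketch needs only one pass over the traces, which matches the observation in Section 3.4 that the relations can be derived from $>_W$.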

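The place-generation step of Section 3.5 can also be sketched in Java. The version below is a deliberately simple stand-in: it enumerates all pairs of task subsets by brute force and keeps the maximal ones, instead of the leftSide/rightSide Combine procedure of the paper's pseudocode. It follows the textbook formulation of the α-algorithm, in which a place connects an input set to an output set when every input task causally precedes every output task and the tasks within each set are pairwise unrelated. The traces are those of Table 1, assuming tasks are ordered by initiation time; the class `AlphaPlaces` and its methods are hypothetical names. The source and sink places are not generated.

```java
import java.util.*;

// Sketch only: brute-force place construction for the log of Table 1.
// Names are illustrative; this is not the paper's Combine-based procedure.
final class AlphaPlaces {
    static final String[] TASKS = {"A", "B", "C", "D", "E", "F", "G"};
    static final String[][] TRACES = {
        {"A", "B", "C", "D", "E"}, {"A", "B", "D", "C", "E"}, {"A", "F", "G"}};

    // f[i][j] is true when TASKS[i] >_W TASKS[j].
    static boolean[][] directlyFollows() {
        List<String> index = Arrays.asList(TASKS);
        boolean[][] f = new boolean[TASKS.length][TASKS.length];
        for (String[] t : TRACES)
            for (int i = 0; i + 1 < t.length; i++)
                f[index.indexOf(t[i])][index.indexOf(t[i + 1])] = true;
        return f;
    }

    // True when all tasks in the bitmask set are pairwise unrelated (#_W).
    static boolean pairwiseUnrelated(int set, boolean[][] f) {
        for (int i = 0; i < TASKS.length; i++)
            for (int j = 0; j < TASKS.length; j++)
                if ((set >> i & 1) == 1 && (set >> j & 1) == 1 && (f[i][j] || f[j][i]))
                    return false;
        return true;
    }

    // True when a ->_W b for every a in the input set and b in the output set.
    static boolean allCausal(int in, int out, boolean[][] f) {
        for (int i = 0; i < TASKS.length; i++)
            for (int j = 0; j < TASKS.length; j++)
                if ((in >> i & 1) == 1 && (out >> j & 1) == 1 && !(f[i][j] && !f[j][i]))
                    return false;
        return true;
    }

    static String names(int set) {
        StringJoiner sj = new StringJoiner(",", "{", "}");
        for (int i = 0; i < TASKS.length; i++)
            if ((set >> i & 1) == 1) sj.add(TASKS[i]);
        return sj.toString();
    }

    // Returns the maximal (input set, output set) pairs, one per internal place.
    static List<String> minePlaces() {
        boolean[][] f = directlyFollows();
        int n = 1 << TASKS.length;
        List<int[]> candidates = new ArrayList<>();
        for (int in = 1; in < n; in++) {
            if (!pairwiseUnrelated(in, f)) continue;
            for (int out = 1; out < n; out++)
                if (pairwiseUnrelated(out, f) && allCausal(in, out, f))
                    candidates.add(new int[] {in, out});
        }
        List<String> places = new ArrayList<>();
        for (int[] c : candidates) {
            boolean maximal = true;
            for (int[] d : candidates) {
                boolean contained = (c[0] & d[0]) == c[0] && (c[1] & d[1]) == c[1];
                if (contained && (c[0] != d[0] || c[1] != d[1])) { maximal = false; break; }
            }
            if (maximal) places.add("(" + names(c[0]) + "," + names(c[1]) + ")");
        }
        return places;
    }

    public static void main(String[] args) {
        minePlaces().forEach(System.out::println);
    }
}
```

Subset enumeration is exponential in the number of tasks, so this form is only suitable for small examples such as Table 1; it is meant to show which pairs become places, not how a production miner would compute them.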