Abstract
Purpose
Process mining provides a new means to improve processes in a variety of application domains. The purpose of this paper is to abstract a process model and then use the models discovered by process mining to optimize processes via predictions.
Design/methodology/approach
The paper divides the process model into a combination of “pair-adjacent activities” and “pair-adjacent persons” in the event logs. First, two new handover process models based on an adjacency matrix are proposed. Second, by adding the stage, frequency, and time of every activity or person to the matrix, two further handover prediction process models based on a stage adjacency matrix are proposed. Third, the conditional probability of moving from each stage to the next is computed from the frequencies. Finally, real data are used to analyze and demonstrate the practicality and effectiveness of the proposed handover optimization process.
Findings
The process model can be extended with information to predict what will actually happen, how likely it is to reach the next activity, who will do this activity, and the corresponding probability if several people execute the same activity.
Originality/value
The contribution of this paper is to predict what will actually happen, how likely it is to reach the following activities or persons in the next stage, how soon they will be reached (by calculating all the possible interval times via different traces), who will do each activity, and the corresponding probability if several people execute the same activity.
Citation
Liu, J., Liu, P., Liu, S., Ma, Y. and Yang, W. (2013), "Handover optimization in business processes via prediction", Kybernetes, Vol. 42 No. 7, pp. 1101-1127. https://doi.org/10.1108/K-11-2012-0107
Publisher
Emerald Group Publishing Limited
Copyright © 2013, Emerald Group Publishing Limited
1 Introduction
Process mining (Agrawal et al., 1998; Cook and Wolf, 1998; Datta, 1998) provides a new means to improve processes in a variety of application domains. There are two main drivers for this new technology (Van der Aalst et al., 2011). On the one hand, more and more events are being recorded, thus providing detailed information about the history of processes. On the other hand, vendors of business process management (BPM) and business intelligence (BI) software have been promising miracles. The process model can be extended with information to answer several questions: what will actually happen, how likely it is to reach the next activity, how soon running instances will reach this activity, who will execute this activity, what the corresponding probability is if several people execute the same activity, and how work is handed over from one person to another. If managers have this basic but important handover information, they can optimize BPM much more effectively. But how can we obtain this business intelligence and help managers to optimize? One viable approach is to use the historical data of an enterprise to abstract a process model, and then use the models discovered by process mining to optimize via predictions.
Process mining (Alves de Medeiros et al., 2007; Buijs, 2010) is an emerging discipline providing comprehensive sets of tools to provide fact-based insights and to support process improvements. This paper presents new process models based on an adjacency matrix that can help organizations to uncover their actual business processes. The discovered process models can be extended with information to predict what will happen (giving advice on the possibilities for all activities or persons in the next stage), how likely it is (predicting the probabilities of reaching all following activities or persons in the next stage), how soon (predicting the possible interval times via different traces to reach all the following activities or persons in the next stage), who will execute this activity, the corresponding probability if several people execute the same activity, and how work is handed over from one person to another.
Recently, more and more information about processes has become available, recorded by information systems in the form of so-called “event logs” (Van Dongen et al., 2008; Van der Aalst et al., 2011; Van der Aalst et al., 2003). Information systems are becoming more intertwined with the operational processes they support. As a result, multitudes of events are recorded by information systems today. Many information systems that support business processes, such as enterprise resource planning (ERP), workflow management (WFM), BPM, customer relationship management (CRM), supply chain management (SCM), product data management (PDM), and B2B (business to business) systems, are characterized by event logs (Van Dongen et al., 2008; Van der Aalst et al., 2011). Nevertheless, it is challenging to extract valuable intelligence from these data. The goal of process mining is to use event data to extract process-related information, e.g. to discover a process model by analyzing events recorded automatically by an enterprise system. A lot of research has been done on process mining and on extracting process models from event logs of actual execution instances (Gunther and Van der Aalst, 2007; Schonenberg et al., 2008; Van Dongen and Van der Aalst, 2005). Process mining provides a new means to improve process management in a variety of application domains. There are two main drivers for this new technology. First, recorded events provide detailed information about the history of processes. Despite the omnipresence of event data, most organizations diagnose problems based on fiction rather than facts (Dustdar et al., 2005). Second, vendors of BPM and BI software have made ambitious promises. Although BPM and BI technologies have received a lot of attention, they have not lived up to the expectations raised by academics, consultants, and software vendors (Van der Aalst et al., 2011; Dustdar et al., 2005).
Process mining (Van der Aalst et al., 2011; PROM, 2009; Bezerra and Wainer, 2013) is an emerging research area that has opened up new possibilities for knowledge discovery. First, it enables a better understanding of the process (Tan et al., 2006). Second, it helps in comparing the actual process with the conceptualized process, so that discrepancies can be found and suitable modifications made. The process model extracted from actual logs may differ drastically from the idealized process, suggesting that the idealized process is not being enforced, and the extracted model provides guidance for modifying the idealized process (Van der Aalst, 2007). Third, one may improve an existing process by examining the process model for opportunities to exploit parallelism and so increase throughput (Rozinat and Van der Aalst, 2008). Finally, it is also possible to run queries against a process model (Klein and Bernstein, 2004).
In this paper, we study handover optimization in business processes through prediction based on process mining. To do this, we need to do two things:
extract the right handover process model; and
use the model to optimize the handover process.
Our approach extracts a handover process model such as the one in Figure 1, applied in a health care insurance enterprise. For example, Sean's friend invites her to a business party by e-mail. At the same time, a customer comes to the insurance enterprise for “pay compensation”. Via our process model, Sean knows that she has a 1/3 probability of executing the activity ET in 776 (652+124) minutes or EC in 2,249 (652+1,597) minutes, both of which happen at stage three. If the activity EC is executed within 537.333 minutes by Mike or Ellen, then Sean can give her friend a clear answer that she will attend the party on time. If EC does not happen but CT happens within 652 minutes, then Sean has a probability of 1/2 of doing EC in 124 minutes and a probability of 1/2 of doing ET in 1,597 minutes. If Sean does not receive any material for EC within 124 minutes, then she has 1,473 (1,597-124) minutes free while waiting for Mike to finish the activity CT, so she can attend the business party during this free time. If CT also does not happen within 652 minutes, the process will reach ET, executed by Sue, in 732 (1,384-652) minutes, and Sean will be free unless another customer comes to the insurance enterprise requesting “pay compensation”.
The following two figures compare, for Sean, the waiting time and the probability of giving a clear answer under our prediction model and under the existing model based on sequence abstraction (Van der Aalst et al., 2011; Van der Aalst, 2011).
As Figures 3 and 4 show, our method is better than the existing approach for time, probability, and resource prediction when there are several branches at a state. Whether or not there are branches at a state, both approaches give the same activity prediction result. Our method also helps to reduce the overhead caused by the various unmerged points, which is a scalability problem of the existing methods. One only needs to add features such as organization, resource, and cost for every activity to the adjacency matrix.
The primary research implication of this paper is that our handover process model can be used to predict the next activity, the probability of reaching this activity, and the probability of who will execute it, and then to optimize the handover business process through prediction. We use the average value function presented in Van der Aalst et al. (2011) to predict the interval time (IT), i.e. the time from the present to the next activity, which is similar to the sojourn times (e.g. the average time spent in a particular state) proposed in Van der Aalst et al. (2011). The elapsed times (e.g. the average time to reach a particular state) and the remaining times (from every state to the end) can easily be added to the matrix and model. However, these two times (remaining and elapsed times) are not the focus of this paper.
In subsequent sections, we explain our approach in detail based on real data. In Section 2, related work is discussed. Section 3 introduces event logs and presents a simple example process that is used throughout this paper. Then, we will introduce our approach and extract the corresponding handover process model for process mining from log instances based on adjacency matrix and stage adjacency matrix in Section 4. Later, Section 5 gives further discussion and analysis using real data for “pay compensation” of an insurance enterprise in Agrawal et al. (1998), and describes our approach for the work handover process in business based on process mining, as well as insights. Section 6 compares our approach with sequence abstraction by using synthetic data and real data, respectively. Finally, Section 7 discusses the conclusion and future work.
2 Related work
In recent years, there has been considerable interest in process mining algorithms that extract process models from logs based on observed events (Dustdar et al., 2005; Van der Aalst et al., 2011). Generally speaking, event logs can be used to conduct three types of process mining (Dustdar et al., 2005; Van der Aalst et al., 2011). The first type of process mining is discovery. A discovery technique takes an event log and produces a model without using any a-priori information. An example is the α-algorithm (Agrawal et al., 1998), which takes an event log and produces a Petri net explaining the behavior recorded in the log. The α-algorithm is able to construct the Petri net automatically (Schonenberg et al., 2008; Van der Aalst et al., 2004) without using any additional knowledge. If the event log contains information about resources, one can also discover resource-related models, e.g. a social network showing how people work together in an organization. The second type of process mining is conformance. Conformance checking can be used to check whether reality, as recorded in the log, conforms to the model and vice versa. An example is the conformance checking algorithm described in Cook and Wolf (1998). For instance, there may be a process model indicating that purchase orders of more than one million Euro require two checks. Analysis of the event log will show whether this rule is followed or not. Another example is the checking of the so-called “four-eyes” principle, which states that particular activities should not be executed by one and the same person. By scanning the event log using a model specifying these requirements, one can discover potential cases of fraud. Hence, conformance checking may be used to detect deviations. The third type of process mining is enhancement. The idea is to extend or improve an existing process model using information about the actual process recorded in some event log.
Whereas conformance checking measures the alignment between model and reality, this third type of process mining aims at changing or extending the a-priori model. One type of enhancement is repair, i.e. modifying the model to better reflect reality. For example, if two activities are modeled sequentially but in reality can happen in any order, then the model may be corrected to reflect this. Another type of enhancement is extension, i.e. adding a new perspective to the process model by cross-correlating it with the log. An example is the extension of a process model with performance data.
An example is a recommendation service presented in 2008 (PROM). This service gives advice on the following activity with the highest probability; it merely advises what to do next. Another example is the cycle time prediction method presented by Wen et al. (2006), which uses non-parametric regression to predict the completion time of partially executed process instances. Using the same regression technique, it can estimate whether activities will be executed in the future and, if so, the time it takes to reach them. An annotated transition system, which represents an abstraction of the process with time annotations, was presented by Van der Aalst (2011). It can predict the completion time of running instances as events happen. The annotated transition system is similar to the cycle time prediction. However, it can overcome some limitations of the cycle time prediction, such as not reducing to a simple heuristic or regression model. At the same time, the annotated transition system cannot predict the probability that each different trace reaches the next activity. The α-algorithm (Van der Aalst, 2011; Agrawal et al., 1998) was one of the first process discovery algorithms that could adequately deal with concurrency. However, the α-algorithm should not be seen as a very practical mining technique, as it has problems with noise, infrequent/incomplete behavior, and complex routing constructs. It is simple, and many of its ideas have been embedded in more complex and robust techniques. The α-algorithm (Agrawal et al., 1998) has several limitations. For example, the original α-algorithm has problems dealing with short loops, i.e. loops of length one or two, although it has no problems mining loops of length three or more. Another limitation of the α-algorithm is that frequencies are not taken into account.
The approach presented in this paper differs from the existing approaches in various ways. First, unlike the α-algorithm or the annotated transition system, it constructs an adjacency matrix based on the pair-adjacent activities or pair-adjacent persons of the event logs. Two new kinds of handover process model are extracted from the adjacency matrix and the stage adjacency matrix. Second, the handover process model can be used to predict, at each state, the activity, the probability, the time, and how work is handed over from one person to another.
3 Event logs
An event log (Van der Aalst et al., 2004) records the events that occur in a certain process for a certain case. It is not always available, but event-related information is often stored in the database of the information system. There are two event log formats: MXML and XES. The MXML event log format was created by Van Dongen and Van der Aalst (2005). The goal of the format is to standardize the way of storing event log information. The XES event log format was created by Buijs (2010). More extensive and up-to-date information regarding this format can be found at http://code.deckfour.org.xes.
The event log fragment in MXML format below (the example reflects the pay compensation process) shows part of an MXML event log with the header. It depicts an event log for pay compensation in a health-care insurance enterprise and contains five different kinds of information (activity, time stamp, event type, resource, and cost). Figure 3 shows the history of the pay compensation process. From the figure we can see the following. The first activity is “register request”: its time stamp is “2010-12-30T11:02:00”, its cost is “50”, and its resource is “Pete”. The second activity is “examine thoroughly”: its time stamp is “2010-12-31T10:06:00”, its cost is “400”, and its resource is “Sue”. The last activity is “reject request”: its time stamp is “2011-01-07T14:24:00.000”, its cost is “200”, and its resource is also “Pete”.
<Process id="running-example.mxml" description="Converted to MXML by Fluxicon Nitro">
  <ProcessInstance id="1">
    <AuditTrailEntry>
      <Data>
        <Attribute name="Activity">register request</Attribute>
        <Attribute name="Resource">Pete</Attribute>
        <Attribute name="Costs">50</Attribute>
      </Data>
      <WorkflowModelElement>register request</WorkflowModelElement>
      <EventType>complete</EventType>
      <Timestamp>2010-12-30T11:02:00.000+01:00</Timestamp>
      <Originator>Pete</Originator>
    </AuditTrailEntry>
    …………
    <AuditTrailEntry>
      <Data>
        <Attribute name="Activity">reject request</Attribute>
        <Attribute name="Resource">Pete</Attribute>
        <Attribute name="Costs">200</Attribute>
      </Data>
      <WorkflowModelElement>reject request</WorkflowModelElement>
      <EventType>complete</EventType>
      <Timestamp>2011-01-07T14:24:00.000+01:00</Timestamp>
      <Originator>Pete</Originator>
    </AuditTrailEntry>
  </ProcessInstance>
</Process>
</WorkflowLog>
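As an illustration of how such a log could be read programmatically, the following is a minimal sketch that assumes a well-formed MXML document and uses only Python's standard library. The string `SAMPLE` is a hypothetical, shortened log assembled from the two entries shown above, and the function name `read_events` is ours, not part of any process mining tool.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical, well-formed miniature log built from the two entries in the fragment.
SAMPLE = """<WorkflowLog>
<Process id="running-example.mxml">
<ProcessInstance id="1">
<AuditTrailEntry>
<WorkflowModelElement>register request</WorkflowModelElement>
<EventType>complete</EventType>
<Timestamp>2010-12-30T11:02:00.000+01:00</Timestamp>
<Originator>Pete</Originator>
</AuditTrailEntry>
<AuditTrailEntry>
<WorkflowModelElement>reject request</WorkflowModelElement>
<EventType>complete</EventType>
<Timestamp>2011-01-07T14:24:00.000+01:00</Timestamp>
<Originator>Pete</Originator>
</AuditTrailEntry>
</ProcessInstance>
</Process>
</WorkflowLog>"""


def read_events(mxml_text):
    """Return {case id: [(activity, resource, timestamp), ...]} in log order."""
    root = ET.fromstring(mxml_text)
    cases = {}
    for instance in root.iter("ProcessInstance"):
        events = []
        for entry in instance.iter("AuditTrailEntry"):
            activity = entry.findtext("WorkflowModelElement")
            resource = entry.findtext("Originator")
            stamp = datetime.fromisoformat(entry.findtext("Timestamp"))
            events.append((activity, resource, stamp))
        cases[instance.get("id")] = events
    return cases


events = read_events(SAMPLE)
```

Each case then becomes an ordered list of (activity, resource, timestamp) triples, which is the only information the adjacency-based models in this paper rely on.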
From the event log fragment in MXML format (the example reflects the pay compensation process), we can obtain the trace as a combination of “register request-examine thoroughly”, “examine thoroughly-check ticket”, “check ticket-decide”, and “decide-reject request”. We can then extract the process model of this log. We also know the next activity once we have the first activity of any of these “pair-adjacent activities”. For example, when the activity “register request” is finished, the next activity is “examine thoroughly”. We can also compute the interval time needed to reach the next activity from the previous activity using the time stamps in the log, e.g. when the activity “register request” is finished, it takes 1,384 min to reach the next activity, “examine thoroughly”. Likewise, the trace of the activity handover is a combination of “Pete-Sue”, “Sue-Mike”, “Mike-Sara”, and “Sara-Pete”. In the next section, we demonstrate the details of our approach for process mining and prediction.
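The pair-adjacent decomposition and the interval time can be sketched in a few lines of Python. This is only an illustration: the helper names `adjacent_pairs` and `interval_minutes` are ours, and the time stamps are the two quoted in the text for “register request” and “examine thoroughly”.

```python
from datetime import datetime


def adjacent_pairs(sequence):
    """Split a trace into its pair-adjacent elements."""
    return list(zip(sequence, sequence[1:]))


def interval_minutes(earlier, later):
    """Interval time between two events, in minutes."""
    return (later - earlier).total_seconds() / 60


# Trace of case 1 as described in the text.
activities = ["register request", "examine thoroughly", "check ticket",
              "decide", "reject request"]
persons = ["Pete", "Sue", "Mike", "Sara", "Pete"]

activity_pairs = adjacent_pairs(activities)
person_pairs = adjacent_pairs(persons)

# Time stamps of the first two activities, as quoted in the text.
registered = datetime(2010, 12, 30, 11, 2)
examined = datetime(2010, 12, 31, 10, 6)
wait = interval_minutes(registered, examined)  # 1384.0 minutes
```

The same `adjacent_pairs` helper serves both views of the log, which is why the activity-based and person-based models can share one construction.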
4 Activity adjacency matrix and handover process model generation
In this section, we focus on the activity and divide the process model into a combination of “pair-adjacent activities” in the event logs. Then, we abstract a process model through the adjacency matrix. The process model can be extended with information to predict what will actually happen, how likely it is to reach the next activity, who will execute this activity, and the corresponding probability if several people execute the same activity. In this paper, we also assume that all the activities in the event logs are independent of previous work (Wen et al., 2006).
4.1 Activity adjacency matrix generation
In order to describe the details of the algorithm, we give an example of possible abstractions (Van der Aalst, 2011). Table I is an example log, which includes six cases. Each case corresponds to a trace represented as a sequence of activities with timestamps, resources, and costs.
We will first show how to construct an adjacency matrix based on a case. Then we will show how to get the adjacency matrix for several cases. Finally, we will extract the process model based on the adjacency matrix.
Algorithm details
We can divide the trace of every case in the logs into a combination of “pair-adjacent activities”. The process model is then a combination of the “pair-adjacent activities” in the event logs. In this paper, a new process model based on adjacency matrix abstraction is proposed. The algorithm first builds an adjacency matrix. This is an (N+2)×(N+2) adjacency matrix for the N activities (events) in the log plus two artificially added activities, START and END. There are eight activities in Table I: “REG R”, “ET”, “CT”, “D”, “REJ R”, “EC”, “PC”, and “REI R”. Adding the two artificial activities “START” and “END”, we get a 10×10 adjacency matrix for the event logs in Table I.
For each case, if a pair of consecutive (adjacent) activities (i, j) occurs n times in the log, adjacency matrix[i][j] is set to [n]; that is, the number in [ ] is the frequency of the pair of adjacent activities (i, j). At the same time, we add <m> (name) at the corresponding location of the matrix. The number in < > at location (i, j) is the frequency with which activity j is executed by the person named in ( ). There is an adjacency relationship between activities i and j, so we know that the following activity is j when activity i has happened. Therefore, we can predict the following activity through the adjacency matrix. Using the same method, we get the adjacency matrices (a)-(f) for cases 1-6 in Table I, respectively.
Putting these six adjacency matrices together, we get prediction adjacency matrix 1 for all the cases in Table I. In prediction adjacency matrix 1, for every pair of adjacent activities (i, j) in the log, adjacency matrix[i][j] is set to n, where n is the sum of the frequencies at the same location in the six per-case adjacency matrices. We thus get the activity adjacency matrix for these six cases (42 events/activities) in Table II.
From activity adjacency matrix 1, we know that when a customer calls (or writes an e-mail or mails some materials to) an insurance company for “pay compensation” (PC), he or she first needs a “register request” (REG R) from the service staff of this company. Pete, Mike, or Ellen will execute this activity (job). The probabilities that Pete, Mike, and Ellen execute the activity REG R are 3/6, 2/6, and 1/6, respectively. When REG R is finished, there are three different activities in the next stage: “check ticket” (CT), “examine thoroughly” (ET), and “examine casually” (EC). The probability of reaching ET is 1/6, the probability of reaching EC is 3/6, and the probability of reaching CT is 2/6. REJ R and PC are the two activities at the last stage, and both lead to the artificial activity END.
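The construction and merging of the per-case matrices can be sketched as follows. This is a minimal illustration under stated assumptions: Table I is not reproduced here, so the first case is the trace described in Section 3 and the second case is hypothetical, and the names `case_matrix`, `merge`, and `dense` are ours. The `[n]` entries are stored in `freq` and the `<m> (name)` entries in `who`.

```python
from collections import Counter

START, END = "START", "END"


def case_matrix(trace):
    """Count pair-adjacent activities in one case; trace is [(activity, person), ...]."""
    names = [START] + [activity for activity, _ in trace] + [END]
    freq = Counter(zip(names, names[1:]))
    # For each pair (i, j), count who executed the follower activity j.
    who = Counter()
    for (prev, _), (nxt, person) in zip([(START, None)] + trace, trace):
        who[(prev, nxt, person)] += 1
    return freq, who


def merge(cases):
    """Sum the per-case matrices location by location."""
    total_freq, total_who = Counter(), Counter()
    for trace in cases:
        freq, who = case_matrix(trace)
        total_freq += freq
        total_who += who
    return total_freq, total_who


def dense(freq):
    """Lay the counts out as an (N+2) x (N+2) matrix with START first and END last."""
    inner = sorted({a for pair in freq for a in pair} - {START, END})
    labels = [START] + inner + [END]
    index = {name: k for k, name in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for (i, j), n in freq.items():
        matrix[index[i]][index[j]] = n
    return labels, matrix


# Case 1 follows the trace in Section 3; case 2 is hypothetical illustration data.
cases = [
    [("REG R", "Pete"), ("ET", "Sue"), ("CT", "Mike"), ("D", "Sara"), ("REJ R", "Pete")],
    [("REG R", "Mike"), ("CT", "Mike"), ("EC", "Sean"), ("D", "Sara"), ("PC", "Ellen")],
]
freq, who = merge(cases)
labels, matrix = dense(freq)
```

With these two cases there are seven distinct activities, so `dense` yields a 9×9 matrix; with all eight activities of Table I it would be the 10×10 matrix described above.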
We get the handover prediction process model for Table I (six cases, 42 events/activities) based on activity adjacency matrix abstraction in Figure 5.
From Figure 5, we know that the first activity is REG R. Pete, Mike, or Ellen will execute this activity (job). The probabilities that Pete, Mike, and Ellen do the activity REG R are 3/6, 2/6, and 1/6, respectively. There is a 3/6 probability of reaching EC when REG R is finished. The activity EC will be executed by Mike, Ellen, Sean, or Sue, with probabilities 3/6, 1/6, 1/6, and 1/6, respectively.
This prediction process model is built from the prediction adjacency matrix, in which every activity appears only once. Consequently, the number of activities in this model is the same as the number of activities in the table. We use the model in Figure 5 to make predictions. However, from the process model in Figure 5 we cannot decide whether there is a loop between CT and ET, between CT and EC, etc. Huang and Kumar (2009, 2011) presented a quality metric for process models that relies on a badness score, which measures the badness of a model. Specifically, a model with more self-loop and loop structures is worse than a model with fewer. In the next section, we propose a new method to address the problem of possible self-loops and loops in the process model.
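The probabilities quoted above are row-wise relative frequencies of the adjacency matrix. As a small sketch, assuming the counts implied by the text (six cases, all starting with REG R) and using a helper name of our own, `branch_probabilities`:

```python
from fractions import Fraction


def branch_probabilities(counts):
    """Turn the frequencies in one matrix row into probabilities."""
    total = sum(counts.values())
    return {name: Fraction(n, total) for name, n in counts.items()}


# Counts implied by the text for the six cases of Table I.
next_after_reg_r = {"ET": 1, "EC": 3, "CT": 2}
who_does_reg_r = {"Pete": 3, "Mike": 2, "Ellen": 1}

p_next = branch_probabilities(next_after_reg_r)
p_who = branch_probabilities(who_does_reg_r)
```

`Fraction` reduces automatically, so 3/6 is reported as 1/2 and 2/6 as 1/3; the values are the same as those in the text.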
4.2 Activity stage adjacency matrix
In this paper, we add the stage (order) of every activity to the adjacency matrices (a)-(f) to build a stage adjacency matrix. We then extract a new process model based on the stage adjacency matrix, which avoids possible loops and self-loops. Below, we give the details of how to construct the stage adjacency matrix.
Algorithm details
We add the stage of activity i of each pair of adjacent activities (i, j) to prediction adjacency matrix 1. The number in { } at location (i, j) of the matrix is the stage of activity i for the pair of adjacent activities (i, j). The number in [ ] at location (i, j) is the frequency of the pair of adjacent activities (i, j) when activity i is at stage { }. The number in < > at location (i, j) is the frequency with which activity j is executed by the person in ( ) when activity i is at stage { }. When two adjacent activities i and j appear at several places in a trace, we attach the corresponding information to the related stage in order to differentiate them. Using the same method, we get the stage adjacency matrices (a)-(f) for cases 1-6, respectively. Putting these six stage adjacency matrices together, we get stage prediction adjacency matrix 2 for all the cases in Table I. In stage adjacency matrix 2, for every pair of stage-adjacent activities (i, j) in the logs, adjacency matrix[i][j] is set to n at stage m, where n is the sum of the frequencies at the same location, and with the same stage value in { }, in the six per-case matrices. We thus get the stage adjacency matrix for these six cases in Table III.
According to stage adjacency matrix 2, the activity REI R was executed by Sara twice at stage 4 and once at stage 8. The activity PC was executed by Mike once at stage 4, by Ellen once at stage 4, and by Ellen once at stage 8.
We add the interval time (IT), i.e. the time needed to reach the next activity from activity i, to the adjacency matrix at (i, j) in order to make time predictions. In this paper, we use the average value as the time prediction function, as proposed in Van der Aalst et al. (2011). The IT in activity stage adjacency matrix 2 is the average of the ITs at the same location in the six per-case matrices, executed by the same person in ( ) and with the same stage value in { }. “START” and “END” are two artificial activities added when we build the adjacency matrix, which means they do not exist in the real data. Thus, the interval time from “START” to the activity at the first stage, and from the activity at the last stage to “END”, is zero. Adding the time information to the activity stage adjacency matrix gives the activity stage prediction adjacency matrix. From it, we get the handover prediction process model for Table I (six cases, 42 events/activities) based on activity stage prediction adjacency matrix abstraction.
Via Figure 6, we know that when a customer calls (or writes an e-mail or mails some materials to) an insurance company for “pay compensation” (PC), he or she first needs a “register request” (REG R) from the service staff of this company. Pete, Mike, or Ellen will do this activity (job). The probabilities that Pete, Mike, and Ellen do the activity REG R are 3/6, 2/6, and 1/6, respectively. When REG R is finished at the first stage, there are three different activities in the next stage: “check ticket” (CT), “examine thoroughly” (ET), and “examine casually” (EC). The probability of reaching ET is 1/6, with a corresponding time of 1,384 minutes when it happens. The probability of reaching EC is 3/6, with a corresponding time of 537.333 minutes when it happens. Similarly, the probability of reaching CT is 2/6, with a corresponding time of 652 minutes when it happens. If the activity EC has not happened 537.333 minutes after REG R, then the probability becomes 2/3 of reaching EC executed by Sean 114.667 minutes later. Otherwise, the probability becomes 100 percent of reaching ET in 732 minutes. The manager should call Sue during this time; if she is away on a business trip, has had a sudden accident, or is in hospital with a cold, then the manager can call Sean, who does the same activity in the third stage, to stand in temporarily. Via this prediction process model, we also know that if “examine thoroughly” (ET) happens at stage two, then the next activity cannot be the ET or EC executed by Sean at stage three.
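To illustrate why the stage annotation removes loops, the following sketch keys every entry by (stage of i, i, j) instead of (i, j). It assumes that START sits at stage 0, so that the first real activity is at stage 1; the trace `looping_case` is hypothetical and the function names are ours.

```python
from collections import Counter


def stage_pairs(trace):
    """Yield (stage of i, i, j, person executing j) for one case."""
    names = ["START"] + [activity for activity, _ in trace] + ["END"]
    people = [None] + [person for _, person in trace] + [None]
    for stage in range(len(names) - 1):
        yield stage, names[stage], names[stage + 1], people[stage + 1]


def stage_matrix(cases):
    """Aggregate the [n] and <m> (name) entries per stage over all cases."""
    freq, who = Counter(), Counter()
    for trace in cases:
        for stage, i, j, person in stage_pairs(trace):
            freq[(stage, i, j)] += 1
            if person is not None:
                who[(stage, i, j, person)] += 1
    return freq, who


# Hypothetical case in which "reinitiate request" sends the work back to CT.
looping_case = [("REG R", "Pete"), ("CT", "Mike"), ("D", "Sara"),
                ("REI R", "Sara"), ("CT", "Mike"), ("D", "Sara"), ("PC", "Ellen")]
freq, who = stage_matrix([looping_case])
```

The pair (CT, D) occurs twice in this trace, but it is stored once under stage 2 and once under stage 5, so the extracted model unrolls the loop into distinct stages rather than drawing an edge back to CT.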
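The averaging step can be sketched as below. The observations are made-up numbers chosen only to show the grouping, not values from Table I, and the function name `average_intervals` is ours. Entries are grouped by stage, activity pair, and person, and pairs involving the artificial START or END are given an interval time of zero, as described above.

```python
from collections import defaultdict
from statistics import mean


def average_intervals(observations):
    """Average the interval times per (stage, i, j, person)."""
    grouped = defaultdict(list)
    for stage, i, j, person, minutes in observations:
        # The artificial START and END activities contribute no waiting time.
        if i == "START" or j == "END":
            minutes = 0.0
        grouped[(stage, i, j, person)].append(minutes)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical observations: (stage of i, i, j, person executing j, minutes).
observations = [
    (0, "START", "REG R", "Pete", 0.0),
    (1, "REG R", "EC", "Mike", 500.0),
    (1, "REG R", "EC", "Mike", 600.0),
    (1, "REG R", "CT", "Mike", 652.0),
]
interval_time = average_intervals(observations)
```

Other prediction functions (median, minimum, and so on) could be substituted for `mean` without changing the structure of the matrix.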
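The step-by-step update of the probabilities in this example can be read as an elimination and renormalization: once the average interval time of a branch has passed without that activity occurring, the branch is dropped and the remaining probabilities are rescaled. The sketch below encodes that reading, which is our interpretation of the paragraph rather than a procedure spelled out in it; the function name `remaining_branches` is ours, and the probabilities and times are those quoted for the stage after REG R.

```python
from fractions import Fraction


def remaining_branches(branches, elapsed):
    """Drop branches whose expected time has passed and renormalize the rest.

    branches maps activity -> (probability, expected minutes after REG R).
    Returns activity -> (updated probability, minutes still to wait).
    """
    alive = {a: (p, t) for a, (p, t) in branches.items() if t > elapsed}
    total = sum(p for p, _ in alive.values())
    return {a: (p / total, t - elapsed) for a, (p, t) in alive.items()}


after_reg_r = {
    "ET": (Fraction(1, 6), 1384.0),
    "EC": (Fraction(3, 6), 537.333),
    "CT": (Fraction(2, 6), 652.0),
}

step1 = remaining_branches(after_reg_r, 537.333)  # the 537.333-minute branch did not occur
step2 = remaining_branches(after_reg_r, 652.0)    # the 652-minute branch did not occur either
```

Under this reading, after 537.333 minutes the 652-minute branch has probability 2/3 with about 114.667 minutes left, and after 652 minutes only ET remains, with probability 1 and 732 minutes left, which agrees with the figures in the text.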
Consider again the example from the introduction. Sean's friend invites her to a business party by e-mail. At the same time, a customer comes to the insurance company for “pay compensation”. Via our process model, Sean knows that she has a 1/3 probability of executing the activity ET in 776 (652+124) minutes or EC in 2,249 (652+1,597) minutes, both of which happen at stage three. If the activity EC is executed within 537.333 min by Mike or Ellen, then Sean can give her friend a clear answer that she will attend the party on time. If EC does not happen but CT happens within 652 min, then Sean has a probability of 1/2 of doing EC in 124 min and a probability of 1/2 of doing ET in 1,597 min. If Sean does not receive any material for EC within 124 minutes, then she has 1,473 (1,597-124) minutes free while waiting for Mike to finish the activity CT, so she can attend the business party during this free time. If CT also does not happen within 652 minutes, the process will reach ET, executed by Sue, in 732 (1,384-652) minutes, and Sean will be free unless another customer comes to this company requesting “pay compensation”.
Only Sara executes the activity D. When the process reaches “decide” (D), executed by Sara, if it does not reach “reinitiate request” within 895 min, then there is a 1/2 probability of reaching “reject” (REJ) in 3,084 min. If REJ does not happen within 3,084 min, the probability becomes 100 percent of reaching the activity PC, executed by Ellen or Mike, in 5,425 min. At the same time, once the process reaches D, executed by Sara, the customer can get a clear answer, “reject” or “pay compensation”, to his or her request in 3,084 minutes.
5 Resource adjacency matrix and handover process model generation
In Section 4, we divided the trace of every case in the logs into a combination of “pair-adjacent activities”. In this section, we focus on the person and divide the process model into a combination of “pair-adjacent persons” in the event logs.
5.1 Resource adjacency matrix generation
We use the resource (person) information to build an adjacency matrix. Then we add information such as activities, time, frequency, and stage to the matrix. This is an (M+2)×(M+2) adjacency matrix for the M persons in the event logs plus two artificially added persons, Null and Entity. There are six persons in Table I: “Sue”, “Pete”, “Ellen”, “Mike”, “Sara”, and “Sean”. Adding the two artificial persons “Null” and “Entity”, we get an 8×8 adjacency matrix for the event logs in Table I.
Algorithm details
For each case, the frequency n of every pair of consecutive (adjacent) persons (i, j) in the log is recorded by setting adjacency matrix[i][j] to [n]. The number in [ ] is the frequency of the pair of adjacent persons (i, j). There is an adjacency relationship between persons i and j, so we know that the next person is j once person i has executed the corresponding activity. Therefore, we can predict the next person through the prediction adjacency matrix. Using the same method, we get the prediction adjacency matrices (a)-(f) for cases 1-6 in Table I, respectively. We put these six adjacency matrices together and get prediction adjacency matrix 3 for all the cases. In prediction adjacency matrix 3, for every pair of adjacent persons (i, j) in the log, adjacency matrix[i][j] is set to n (n is the sum of the frequencies at the same location of the previous six adjacency matrices, one per case). We get the resource adjacency matrix for these six cases (42 events/activities) in Table IV.
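The counting step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `resource_adjacency` and `as_matrix` are hypothetical, and the three traces are made-up examples, not the six cases of Table I.

```python
from collections import Counter

NULL, ENTITY = "Null", "Entity"  # the two artificial persons


def resource_adjacency(traces):
    """Count how often person i is directly followed by person j."""
    counts = Counter()
    for persons in traces:
        # Pad every case with the artificial start and end persons.
        padded = [NULL, *persons, ENTITY]
        for i, j in zip(padded, padded[1:]):
            counts[(i, j)] += 1
    return counts


def as_matrix(counts, persons):
    """Lay the pair counts out as an (N+2) x (N+2) matrix."""
    order = [NULL, *persons, ENTITY]
    return [[counts.get((i, j), 0) for j in order] for i in order]


# Hypothetical person traces (one list per case), for illustration only.
traces = [
    ["Pete", "Mike", "Sara", "Ellen"],
    ["Pete", "Sue", "Sara", "Mike"],
    ["Mike", "Mike", "Sean", "Sara", "Ellen"],
]
counts = resource_adjacency(traces)
matrix = as_matrix(counts, ["Sue", "Pete", "Ellen", "Mike", "Sara", "Sean"])
```

Summing the per-case matrices is the same as counting pairs over all cases at once, which is why a single `Counter` suffices here. A self-handover such as (Mike, Mike) shows up on the diagonal and corresponds to a self-loop in the extracted model.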
By using the same method in Section 3, we get the handover prediction process model for Table I (six cases, 42 events/activities) based on prediction adjacency matrix 3 abstraction (Figure 7).
We know that when a customer calls (or writes an e-mail, or mails some materials to) an insurance company for “pay compensation” (PC), the materials will first reach Pete, Mike, or Ellen through Figure 4. The probability of reaching Pete, Mike, and Ellen is 3/6, 2/6, and 1/6, respectively. If the materials first reach Pete, then they will reach Mike, Sue, or Sara once Pete has finished his work, with probability 1/3, 1/6, and 1/2, respectively. If the materials first reach Mike, then they will reach Pete, Ellen, Sara, Sean, or Mike himself once Mike has finished his work, with probability 1/9, 2/9, 3/9, 2/9, and 1/9, respectively. Obviously, there are several loops and self-loops in Figure 6. We will use the same method as before, adding the stage of every person into resource adjacency matrix 3, to avoid the loops and improve the quality of the prediction.
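The probabilities quoted above are row-normalised frequencies. The sketch below shows the calculation with exact fractions. The function name is hypothetical, and the counts are assumptions chosen only to reproduce the stated probabilities, since the actual frequencies are in Table IV.

```python
from fractions import Fraction


def next_person_probabilities(row):
    """Turn one matrix row of handover frequencies into probabilities."""
    total = sum(row.values())
    if total == 0:
        return {}
    return {person: Fraction(n, total) for person, n in row.items() if n}


# Assumed frequencies, consistent with the probabilities stated in the text.
pete_row = {"Mike": 2, "Sue": 1, "Sara": 3}
mike_row = {"Pete": 1, "Ellen": 2, "Sara": 3, "Sean": 2, "Mike": 1}

pete_probs = next_person_probabilities(pete_row)
mike_probs = next_person_probabilities(mike_row)
```

With these counts, `pete_probs` gives 1/3, 1/6, and 1/2 for Mike, Sue, and Sara, and each row of probabilities sums to one.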
5.2 Resource stage adjacency matrix generation
In this paper, we add the stage (order) of every person into the prediction adjacency matrix and build a resource stage prediction adjacency matrix. Then we extract a new process model based on the stage prediction adjacency matrix, which avoids possible loops and self-loops. Below we give the algorithm details for constructing the stage prediction adjacency matrix.
Algorithm details
We add the stage of person i of each pair of adjacent persons (i, j) into resource adjacency matrix 3. The number in { } located at (i, j) in the matrix is the stage of person i for the pair of adjacent persons (i, j). The number in [ ] located at (i, j) is the frequency of the pair (i, j) when person i is at stage { }. The number in < > located at (i, j) is the frequency of the activity in ( ) executed by person j when person i is at stage { }. When two adjacent persons i and j appear at several places in a trace, we attach the corresponding information to the related stage to differentiate the occurrences. Using the same method, we get the resource stage adjacency matrices (a)-(f) for cases 1-6 in Table I, respectively.
Putting these six resource stage adjacency matrices together, we get resource stage adjacency matrix 4 for all the cases in Table I. In resource stage adjacency matrix 4, for every pair of stage-adjacent persons (i, j) in the logs, adjacency matrix[i][j] is set to frequency n and stage m (n is the sum of the frequencies at the same location of the previous six adjacency matrices, over the cases with the same stage value in { }). Then, we get the resource stage prediction adjacency matrix for these six cases in Table V.
As in Section 4, we add the interval time (IT), that is, the time to reach the next person from person i, into the adjacency matrix at (i, j) to make a time prediction. IT in resource stage adjacency matrix 4 is the average of the ITs at the same location of the previous six adjacency matrices, taken over entries with the same activity in ( ) and the same stage value in { }. “Null” and “Entity” are two artificial persons added when we build the adjacency matrix, which means that they do not exist in the real data. Thus, the interval times from “○” to the persons at the first stage, and from the persons at the last stage to “Entity”, are zero. Adding this time information to the resource stage adjacency matrix gives the resource stage prediction adjacency matrix.
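One possible way to hold the stage, frequency, and interval-time annotations is a dictionary keyed by (stage of person i, person i, person j, activity of person j). The sketch below is an assumption about the data layout, not the paper's own structure. The function name, the activity labels "X", "Y", "Z", "W", and the two cases are all hypothetical.

```python
from collections import defaultdict


def stage_adjacency(traces):
    """Build {(stage, i, j, activity_of_j): (frequency, average interval)}.

    Each case is a list of (person, activity, minute) events in time order.
    The stage is the position of the handing-over person i (0 for Null).
    """
    cells = defaultdict(lambda: [0, 0.0])  # key -> [frequency, summed interval]
    for case in traces:
        prev_person, prev_time = "Null", None
        for stage, (person, activity, minute) in enumerate(case):
            # The interval from the artificial person Null is zero by definition.
            interval = 0.0 if prev_time is None else minute - prev_time
            cell = cells[(stage, prev_person, person, activity)]
            cell[0] += 1
            cell[1] += interval
            prev_person, prev_time = person, minute
        # Close the case with the artificial person Entity (interval zero).
        cells[(len(case), prev_person, "Entity", None)][0] += 1
    return {key: (n, total / n) for key, (n, total) in cells.items()}


# Hypothetical cases with made-up activity labels and timestamps in minutes.
traces = [
    [("Pete", "X", 0), ("Mike", "Y", 600), ("Sean", "Z", 1500)],
    [("Pete", "X", 0), ("Mike", "Y", 700), ("Sara", "W", 900)],
]
stage_matrix = stage_adjacency(traces)
```

Because the stage is part of the key, a pair of persons that occurs at two different positions in a trace lands in two different cells, which is what removes loops and self-loops from the extracted model.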
We get the handover prediction process model for Table I (six cases, 42 events/activities), based on the resource stage adjacency matrix 4 abstraction as follows (Figure 8).
We know that when a customer calls (or writes an e-mail, or mails some materials to) an insurance company for “pay compensation” (PC), the materials will reach Pete, Mike, or Ellen at the first stage through Figure 4. The probability of reaching Pete, Mike, and Ellen is 3/6, 2/6, and 1/6, respectively.
Three different persons, Mike, Ellen, and Sue, may execute an activity at stage two once activity REG R has been finished at the first stage by Pete, Mike, or Ellen. If the first activity is executed by Mike, then the following activity will be executed by Ellen or Mike: the probability of reaching Mike is 1/2, with a corresponding time of 64 min, and the probability of reaching Ellen is also 1/2, with a corresponding time of 40 min. If the first activity is executed by Ellen, then the following activity will be executed by Mike, with a corresponding time of 1,514 minutes. If the first activity is executed by Pete, then the following activity will be executed by Mike or Sue: the probability of reaching Mike is 2/3, with a corresponding time of 649 minutes, and the probability of reaching Sue is 1/3, with a corresponding time of 1,384 minutes.
Sean's friend invites her by e-mail to attend a business party. At the same time, a customer comes to the insurance company for “pay compensation”. Via our process model, Sean knows that she will do activity ET or CT in the third stage. If the first activity is executed by Pete, Sean will execute activity EC or CT in 1,509.5 (649+860.5) minutes. But if the case does not reach Mike within 649 minutes, Sean can give her friend a clear answer that she will attend the party in time. If the first activity is executed by Ellen, Sean will execute activity EC or CT in 2,374.5 (1,514+860.5) minutes. If the party finishes within 2,374.5 minutes, then Sean can also give her friend a clear answer that she will attend the party in time. If the first activity is executed by Mike, Sean will execute activity EC or CT in 900.5 (40+860.5) minutes. But if the case does not reach Mike within 40 min, then Sean can also give her friend a clear answer that she will attend the party in time.
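The times quoted for Sean are sums of the interval times along one handover path. A short sketch of that arithmetic follows, using the figures from the text. The variable and function names are hypothetical, and the 860.5-minute second leg is simply the value that appears in the sums above.

```python
# Interval time (minutes) from the first-stage person to stage two,
# as used in the sums quoted in the text.
first_leg = {"Pete": 649, "Ellen": 1514, "Mike": 40}

# Interval time (minutes) from stage two to Sean at stage three.
SECOND_LEG_TO_SEAN = 860.5


def time_until_sean(first_person):
    """Total predicted time before the case reaches Sean."""
    return first_leg[first_person] + SECOND_LEG_TO_SEAN


predicted = {person: time_until_sean(person) for person in first_leg}
# predicted == {"Pete": 1509.5, "Ellen": 2374.5, "Mike": 900.5}
```

Comparing an appointment's end time against `predicted[first_person]` is the kind of check the scenario describes.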
6 Experiments with synthetic data and real data
6.1 Experiment with synthetic data
In this section, we report experiments in which we use our algorithm to generate process models from synthetic logs and then compare the prediction results between our model and the model based on sequence abstraction. In this experiment, we use the event log in Table VI.
In Table VI each line corresponds to a trace represented as a sequence of activities with timestamps. In Section 4, we get the stage adjacency matrix and the prediction model based on the stage adjacency matrix. Using the PROM (www.processmining.org/prom/downloads), we get the prediction process model based on sequence abstraction for the event logs. The prediction results between our method and the existing method are shown in Table VII.
In this paper, we use the average value as the prediction function (Van der Aalst et al., 2011). There are 17 different states in Table VII. If a state has several branches in the next stage, we give several different prediction times (one prediction time per branch), whereas the existing method gives just one prediction time whether or not there are branches.
For example, using our method to make a prediction, we will know there are two activities in the next stage at state 2 in Table VII when activity A has finished:
The probability is (1/5) of reaching activity C, and the corresponding remaining time is 44.
The probability is (4/5) of reaching activity B, and the corresponding remaining time is 31.
With the existing method, when activity A has finished, the remaining time to reach the end is 33.6. We can recover this value from our prediction results: (44*1+31*4)/5=33.6. This means that we can obtain the same prediction result as the existing method when activity A is at stage 1. However, our per-branch prediction results cannot be obtained by using the existing method. So, for time prediction, if there are branches at a state, using our method to make a prediction is better than using the existing method (Song and Van der Aalst, 2007). If there are no branches at a state, we get the same prediction results.
Of the 17 states, our method and the sequence abstraction method give the same time prediction wherever there are no branches. There are 10 such states: states 1, 3, 5, 7, 8, 9, 10, 11, 13, and 15 (that is, Start, activity C at stage 2, activity D at stage 3, activity E at stage 3, activity D at stage 4, activity C at stage 4, activity B at stage 4, activity E at stage 4, activity E at stage 5, and activity D at stage 7). Where there are branches, our method gives several time prediction values (one per branch), unlike the existing method, which gives only one prediction value per state whether or not there are branches. As in the example above, at the states with branches (activity A at stage 1, activity B at stage 2, activity C at stage 3, and activity C at stage 6), using our approach to make a prediction is better than using the existing approach.
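The relation between the per-branch predictions and the single value of the existing method is a frequency-weighted average. The sketch below reproduces the state 2 example from the text. The function name is hypothetical.

```python
from fractions import Fraction


def expected_remaining_time(branches):
    """Frequency-weighted mean of per-branch remaining times.

    branches maps the next activity to (frequency, remaining time).
    """
    total = sum(freq for freq, _ in branches.values())
    return sum(freq * time for freq, time in branches.values()) / total


# State 2 (activity A finished): branch C has frequency 1 and remaining
# time 44; branch B has frequency 4 and remaining time 31.
after_a = {"C": (1, 44), "B": (4, 31)}

probabilities = {act: Fraction(freq, 5) for act, (freq, _) in after_a.items()}
single_value = expected_remaining_time(after_a)  # (44*1 + 31*4) / 5
```

The single value can always be derived from the per-branch values and their frequencies, but the per-branch values cannot be recovered from the single value, which is the asymmetry the paragraph above relies on.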
At state 12 (activity C at stage 5), our method gives one time prediction value, but the existing method gives two different time prediction values. For this state, the prefix activities are {A, B, C, C, C} and {A, B, C, B, C}, and the remaining times based on the sequence abstraction are 20 and 11, respectively, whereas we get only one time prediction value, 15.5, based on the stage adjacency matrix abstraction. Under the sequence abstraction, there are two traces, namely (A, B, C, C, C, C) and (A, B, C, B, C, C). The fourth activity is the first activity that differs between these two traces, so the traces branch after the third activity and are never merged again under the sequence abstraction. Under our method, these two traces also split into two branches after the third activity (stage 3), but they merge into one trace again at the fifth activity (stage 5), because the transition from the fifth activity (stage 5) to the sixth activity (stage 6) is between the same two activities (C, C) in both traces. In general, when the sequence abstraction is used to extract the process model, traces with different prefix activities are divided into different branches and are never merged again after the first state that has branches. When the stage adjacency matrix abstraction is used, several traces are merged into one trace whenever they share the same activity transition between two stages, regardless of whether there are branches before this stage. So, we get only one time prediction value when activity C is at stage 5. If we use the frequency in the stage adjacency matrix and the time prediction values in Figure 4, we get (20*1+11*1)/2=15.5.
When the prefix activities are A, B, C, B, C, C, after these six activities finish, if the next activity is E, the remaining time is 11, and the frequency is one. If the next activity is D, the remaining time is 9, and the frequency is one. Both of these predictions are contained in the prediction results of our approach. In the log, when the prefix activities are A, B, C, C, C, C, the next activity after these six activities is E rather than D, and when the prefix activities are A, B, C, B, C, C, the next activity is D rather than E. With our approach, however, the possibility of reaching activity D (computed from the frequency value in [ ]) is not zero after the six activities A, B, C, C, C, C have finished. Because the predicted possibility covers only the current activity transition between two stages, our model cannot capture the dependence of the following activity on the whole sequence of previous activities (which may differ before the merging stage). On the other hand, our approach helps to reduce the overhead caused by the many non-merged points, which may be a scalability problem for the existing methods. The same reasoning explains states 16 and 17, where activity E is at stages 7 and 8, respectively.
In addition, using our method we can predict which activity (or activities) will happen in the next stage and the possibility (computed from the frequency) of reaching each of them, as well as the remaining time to reach the end. The existing method (Van der Aalst et al., 2011) can only predict the remaining time once one or several activities have happened. That means that the prediction result of the proposed approach goes one step further than that of the existing method.
Several time prediction values, one per branch, are more informative than a single time prediction value at a state. Therefore, when a state has several branches, our method is better than the existing approach. Tables VIII and IX show the comparison between the results of our method and those of the annotated transition systems (based on sequence abstraction) for the synthetic example.
The existing method cannot predict probability. For time prediction, 58.82 percent of the results are the same as with the existing approach, and 35.29 percent of the results are better than with the existing approach. Only 5.88 percent of the results are better with the existing approach than with our approach. For activity prediction, 94.12 percent of the results are the same as with the existing approach, and 8.33 percent of the results are better than with the existing approach, for the experiment in Table VI (five cases, 27 events/activities).
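The difference between the two abstractions at state 12 can be illustrated by grouping the same observations under two different state keys. This is a simplified sketch: it assumes a state is identified by the full prefix under the sequence abstraction and by (stage, current activity) under the stage abstraction, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (prefix, remaining time) pairs for the two traces discussed in the text.
observations = [
    (("A", "B", "C", "C", "C"), 20),
    (("A", "B", "C", "B", "C"), 11),
]


def sequence_state(prefix):
    """Sequence abstraction: the whole prefix identifies the state."""
    return prefix


def stage_state(prefix):
    """Stage abstraction (simplified): position and current activity only."""
    return (len(prefix), prefix[-1])


def predict(observations, abstraction):
    """Average the remaining times of all observations that share a state."""
    buckets = defaultdict(list)
    for prefix, remaining in observations:
        buckets[abstraction(prefix)].append(remaining)
    return {state: mean(times) for state, times in buckets.items()}


by_sequence = predict(observations, sequence_state)
by_stage = predict(observations, stage_state)
```

Here `by_sequence` keeps two states with remaining times 20 and 11, while `by_stage` merges them into the single state (5, "C") with the averaged value 15.5. The merge loses the distinction between the two prefixes but keeps the number of states smaller.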
6.2 Experiment with real data
To evaluate the practical applicability of the algorithm, we use US patent application process data (see http://portal.uspto.gov/external/portal/pair) and discuss our experiments with this data. We collect transaction data for patents of category 435 (Chemistry: molecular biology and microbiology) under the US Patent Classification issued between 2000 and 2005. There are a total of 31,682 patent transaction instances containing 518 unique events/activities. As an initial step toward building a process model for all 518 events/activities, we focus our experiments on the top 10 most frequent tasks. These tasks help to identify the highest-level structure of the underlying process model with manageable complexity.
We use 20 cases containing 291 events/activities to extract the prediction process model based on the prediction stage adjacency matrix abstraction. The comparison between the results of our method and those of the annotated transition systems is shown in Table X.
Via our approach we can predict the probability of reaching every activity at the next stage. The existing approach cannot predict this probability. For time prediction, 63.44 percent of the results are the same as with the existing approach, and 31.26 percent of the results are better than with the existing approach. Only 5.30 percent of the results are better with the existing approach than with our approach. For activity prediction, 94.70 percent of the results are the same as with the existing approach, and 5.30 percent of the results are better than with the existing approach, for the experiment using the real data (20 cases, 291 events/activities).
6.3 Experiments with data in Table I
We use the data in Table I (six cases, 42 events/activities) to evaluate the practical applicability of the algorithm and discuss our experiments with this data. Via our approach we can predict both the probability of reaching every activity at the next stage and the resource that will execute it. The existing approach cannot predict the probability or the resource. Therefore, judged by the prediction results, the approach based on the stage adjacency matrix abstraction performs better than approaches based on sequence abstraction (annotated transition system).
7 Conclusion and further work
In this paper, we developed a new approach for analyzing event logs and extracting process models based on the adjacency matrix and the stage adjacency matrix. We focus on a combination of “pair-adjacent activities” and “pair-adjacent persons” derived from the adjacency relationships observed in event logs, and then construct the adjacency matrix and the stage adjacency matrix. We have done two things:
extracted two different kinds of handover process model; and
used the models to optimize the handover process.
In this paper, we have evaluated our approach based on stage adjacency matrix abstraction using real data for the “pay compensation” process of a health-care insurance enterprise. Four process models illustrate our approach. We can also predict the future by using the prediction adjacency matrix and the stage prediction adjacency matrix. For time prediction, our method is better than the existing approach when there are several branches at a state, whereas the existing approach is better when several branches merge at a state. For activity prediction, the two give the same result whether or not there are branches at a state, but the existing approach is again better where several branches merge. The prediction of interval time could be much more complex than this paper presents. It would be necessary to consider the probability distributions that govern individual processing times, and the joint distribution of those processing times. However, our approach helps to reduce the overhead caused by the many non-merged points, which may be a scalability problem for the existing methods. We also know that our approach improves when more data is available for time prediction. In this paper, we first abstract the handover process, and propose a method for the handover of work from one person to another.
In future work, we will improve the time prediction accuracy, study high-accuracy time prediction functions, predict the delay from every state to its subordinate states, predict the remaining time until completion, and analyze the Poisson distribution for all the events in the process models. We will also search for and use reasonable methods to integrate and optimize the information flow, logistics, currency flow, etc. using our business model. Furthermore, we will build a new optimal handover business process model. Finally, we will use the information in the prediction process model, such as time, people, activity, and probability, to abstract the social networks.
Corresponding author
Jian Liu can be contacted at: [email protected]
References
Agrawal, R., Gunopulos, D. and Leymann, F. (1998), “Mining process models from workflow logs”, Sixth International Conference on Extending Database Technology, Proceedings on LNCS, Vol. 1377, Springer, Berlin, pp. 467-483.
Alves de Medeiros, A.K., Weijters, A.J.M.M. and Van der Aalst, W.M.P. (2007), “Genetic process mining: an experimental evaluation”, Data Mining and Knowledge Discovery, Vol. 14 No. 2, pp. 245-304.
Bezerra, F. and Wainer, J. (2013), “Algorithms for anomaly detection of traces in logs of process aware information systems”, Information Systems, Vol. 38 No. 1, pp. 33-44.
Buijs, J.C.A.M. (2010), Mapping Data Sources to XES in a Generic Way, Eindhoven University of Technology, Eindhoven.
Cook, J.E. and Wolf, A.L. (1998), “Discovering models of software processes from event-based data”, ACM Transactions on Software Engineering and Methodology, Vol. 7 No. 3, pp. 215-249.
Datta, A. (1998), “Automating the discovery of As-Is business process models: probabilistic and algorithmic approaches”, Information Systems Research, Vol. 9 No. 3, pp. 275-301.
Dustdar, S., Hoffmann, T. and Van der Aalst, W.M.P. (2005), “Mining of ad-hoc business process with TeamLog”, Data & Knowledge Engineering, Vol. 55 No. 2, pp. 129-158.
Gunther, C.W. and Van der Aalst, W.M.P. (2007), “Finding structure in unstructured processes: the case for process mining”, Proceedings of the 7th International Conference on Application of Concurrency to System Design (ACSD 2007), Bratislava, Slovak Republic, 10-13 July, IEEE, Piscataway, NJ, pp. 3-12.
Huang, Z. and Kumar, A. (2009), “New quality metrics for evaluating process models”, Proceedings of the 4th WBPM, Stockholm, AAAI Press, Menlo Park, CA, pp. 52-57.
Huang, Z. and Kumar, A. (2011), “A study of quality and accuracy trade-offs in process mining”, INFORMS Journal on Computing, Vol. 10 No. 3, pp. 1-18.
Klein, M. and Bernstein, A. (2004), “Towards high-precision service retrieval”, IEEE Internet Computing, Vol. 8 No. 1, pp. 30-36.
PROM (2009), The Process Mining Group, Mathematics and Computer Science Department, Eindhoven University of Technology, available at: www.processmining.org/prom/downloads.
Rozinat, A. and Van der Aalst, W.M.P. (2008), “Conformance checking of processes based on monitoring real behavior”, Information Systems, Vol. 33 No. 1, pp. 64-95.
Schonenberg, H., Weber, B., Van Dongen, B.F. and Van der Aalst, W.M.P. (2008), “Supporting flexible processes from recommendations based on history”, International Conference on Business Process Management (BPM 2008), LNCS, Springer, Berlin, pp. 51-66.
Song, M. and Van der Aalst, W.M.P. (2007), “Supporting process mining by showing events at a glance”, Proceedings of the 7th Annual Workshop Information and Technology Systems, Montreal.
Tan, P.N., Steinbach, M. and Kumar, V. (2006), Introduction to Data Mining, Addison Wesley, Reading, MA.
Van der Aalst, W.M.P. (2007), “Exploring the CSCW spectrum using process mining”, Advanced Engineering Informatics, Vol. 21 No. 4, pp. 191-199.
Van der Aalst, W.M.P. (2011), Process Mining: Discovery, Conformance and Enhancement of Business Processes, Springer, Berlin.
Van der Aalst, W.M.P., Schonenberg, M.H. and Song, M. (2011), “Time prediction based on process mining”, Information Systems, Vol. 36 No. 2, pp. 450-475.
Van der Aalst, W.M.P., Weijters, A.J.M.M. and Maruster, L. (2004), “Workflow mining: discovering process models from event logs”, IEEE Transactions on Knowledge and Data Engineering, Vol. 16 No. 9, pp. 1128-1142.
Van der Aalst, W.M.P., Van Dongen, B.F., Herbst, J., Maruster, L., Schimm, G. and Weijters, A.J.M.M. (2003), “Workflow mining: a survey of issues and approaches”, Data & Knowledge Engineering, Vol. 47 No. 2, pp. 237-267.
Van Dongen, B.F. and Van der Aalst, W.M.P. (2005), “A meta model for process mining data”, Conference on Advanced Information Systems Engineering, Porto, Portugal, Vol. 161, 10, 11, 83.
Van Dongen, B.F., Crooy, R.A. and Van der Aalst, W.M.P. (2008), “Cycle time prediction: when will this case finally be finished?”, Proceedings of the 16th International Conference on Cooperative Information Systems, CoopIS 2008, OTM 2008, Part I, LNCS, Vol. 5331, Springer, Berlin, pp. 319-336.
Wen, L., Wang, J. and Sun, J. (2006), “Detecting implicit dependencies between tasks from event logs”, Asia-Pacific Web Conference on Frontiers of WWW Research and Development, LNCS, Vol. 3841, Springer, Berlin, pp. 591-603.
Further Reading
Mans, R.S., Russell, N.C., Van der Aalst, W.M.P., Moleman, A.J. and Bakker, P.J.M. (2010), “Schedule-aware workflow management systems”, Transactions on Petri Nets and Other Models of Concurrency IV, LNCS, Vol. 6550, Springer, Berlin, pp. 121-143.
Van der Werf, J.M.E.M., Van Dongen, B.F., Hurkens, C.A.J. and Serebrenik, A. (2008), “Process discovery using integer linear programming”, Proceedings of the 29th International Conference on Applications and Theory of Petri Nets (Petri Nets 2008), LNCS, Vol. 5062, Springer, Berlin.
Weijters, A.J.M.M. and Van der Aalst, W.M.P. (2003), “Rediscovering workflow models from event-based data using little thumb”, Integrated Computer-Aided Engineering, Vol. 10 No. 2, pp. 151-162.
Acknowledgements
This research is supported by the Key Project of Science Foundation of China (No. 70931002), Major Project of Social Science Foundation of China (No. 10zd&014), Fundamental Research Funds for the Central Universities (No. 30920130132014), China Postdoctoral Science Foundation funded project (No. 2013M530261), and Jiangsu Planned Projects for Postdoctoral Research Funds (No. 1301108C). The authors would like to thank the editor and reviewer for their detailed and constructive comments.