Alpha Process Mining Algorithm
Acknowledgement: Materials on these slides are adapted from "Process-oriented System Analysis & Process mining"
Definitions
Let T be a set of activities (tasks) and T* the set
of all sequences of arbitrary length over T. Then:
σ ∈ T* is called an execution sequence if all activities
in σ belong to the same process instance
W ⊆ T* is called an execution log (workflow log)
Assumptions
Each activity of the process model appears at least
once in the log
Each direct neighbor relation between activities is
represented at least once in the log
Execution Logs
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D
Execution Logs
Execution sequences:
Case 1: ABCD
Case 2: ACBD
Case 3: ABCD
Case 4: ACBD
Case 5: EF
Resulting workflow log:
W = {ABCD, ACBD, EF}
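The step from the raw log to the workflow log can be sketched in Python. This is a minimal illustration rather than part of the original slides: the names `events` and `execution_sequences` are assumptions made for this example, and tasks are assumed to be single-letter labels so that a sequence can be written as a string.

```python
from collections import defaultdict

# Event log from the slide, in recorded order: (case id, task).
events = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]

def execution_sequences(events):
    """Group events by case, keeping the recorded order within each case."""
    traces = defaultdict(str)
    for case, task in events:
        traces[case] += task
    return dict(traces)

sequences = execution_sequences(events)
# The workflow log is a set, so repeated sequences collapse into one entry.
workflow_log = set(sequences.values())

print(sequences)             # {1: 'ABCD', 2: 'ACBD', 3: 'ABCD', 4: 'ACBD', 5: 'EF'}
print(sorted(workflow_log))  # ['ABCD', 'ACBD', 'EF']
```

Cases 1 and 3 (and likewise 2 and 4) yield the same sequence, which is why W has only three elements for five cases.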
Order relations
Log-based order relations for pairs of activities
a, b ∈ T in a workflow log W:
Direct successor
a >w b, i.e. in some execution sequence of W, b directly follows a
Causality
a →w b, i.e. a >w b and not b >w a
Concurrency
a ║w b, i.e. a >w b and b >w a
Exclusiveness
a #w b, i.e. not a >w b and not b >w a
(activity pairs which never directly succeed each other)
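The four relations can be derived mechanically from W. The following Python sketch assumes the workflow log is given as a set of strings with one character per task; the function name `order_relations` is hypothetical and chosen only for this example.

```python
from itertools import product

def order_relations(log):
    """Derive the four log-based relations from a set of trace strings."""
    tasks = sorted({task for trace in log for task in trace})
    # a > b: b directly follows a in at least one trace
    succ = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}
    # a -> b: a > b holds, but b > a does not
    causal = {(a, b) for a, b in succ if (b, a) not in succ}
    # a || b: both a > b and b > a hold
    parallel = {(a, b) for a, b in succ if (b, a) in succ}
    # a # b: neither a > b nor b > a holds
    exclusive = {(a, b) for a, b in product(tasks, repeat=2)
                 if (a, b) not in succ and (b, a) not in succ}
    return succ, causal, parallel, exclusive

succ, causal, parallel, exclusive = order_relations({"ABCD", "ACBD", "EF"})
print(sorted(causal))    # [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D'), ('E', 'F')]
print(sorted(parallel))  # [('B', 'C'), ('C', 'B')]
```

Under these definitions a task that never directly follows itself is exclusive with itself, so `exclusive` also contains pairs such as ('A', 'A').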
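Execution log analysis
Idea (a): [figure: sequence, a followed by b] ⇔ a→b
α-Algorithm
Idea (b): [figure: XOR-split after a, leading to b or c] ⇔ a→b, a→c and b # c
Idea (c): [figure: XOR-join of b and c before d] ⇔ b→d, c→d and b # c
Idea (d): [figure: AND-split after a, leading to b and c] ⇔ a→b, a→c and b || c
Idea (e): [figure: AND-join of b and c before d] ⇔ b→d, c→d and b || c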
The Alpha-Algorithm (simplified)
1. Identify the set of all tasks in the log as TW.
2. Identify the set of all tasks that have been observed as the first task
in some case as TI.
3. Identify the set of all tasks that have been observed as the last task
in some case as TO.
4. Identify the set of all connections to be potentially represented in the
process model as a set XW. Add the following elements to XW:
a. Pattern (a): all pairs for which a→b holds.
b. Pattern (b): all triples for which a→(b#c) holds.
c. Pattern (c): all triples for which (b#c)→d holds.
Note that triples for which Pattern (d) a→(b||c) or Pattern (e)
(b||c)→d holds are not included in XW.
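Steps 1 to 4 can be sketched as follows. This is an illustrative reading of the simplified algorithm, not a reference implementation: the function name `steps_1_to_4` and the tagged-tuple encoding of XW (`("a", a, b)`, `("b", a, (b, c))`, `("c", (b, c), d)`) are assumptions made for this example, and the log `{"ABD", "ACD"}` is a hypothetical one chosen because it contains an exclusive choice.

```python
from itertools import combinations

def steps_1_to_4(log):
    """Return (TW, TI, TO, XW) for a set of non-empty trace strings."""
    t_w = {task for trace in log for task in trace}   # step 1: all tasks
    t_i = {trace[0] for trace in log}                 # step 2: first tasks
    t_o = {trace[-1] for trace in log}                # step 3: last tasks

    succ = {(x, y) for trace in log for x, y in zip(trace, trace[1:])}
    causal = {(x, y) for x, y in succ if (y, x) not in succ}

    def exclusive(b, c):
        return (b, c) not in succ and (c, b) not in succ

    # Step 4, pattern (a): every causal pair a -> b
    x_w = {("a", a, b) for a, b in causal}
    for b, c in combinations(sorted(t_w), 2):
        if not exclusive(b, c):
            continue  # b || c or a direct-successor pair: patterns (d)/(e) are skipped
        for a in t_w:
            if (a, b) in causal and (a, c) in causal:
                x_w.add(("b", a, (b, c)))             # pattern (b): a -> (b # c)
        for d in t_w:
            if (b, d) in causal and (c, d) in causal:
                x_w.add(("c", (b, c), d))             # pattern (c): (b # c) -> d
    return t_w, t_i, t_o, x_w

t_w, t_i, t_o, x_w = steps_1_to_4({"ABD", "ACD"})
for element in sorted(x_w):
    print(element)
```

For this hypothetical log, XW holds the four causal pairs plus the triples A→(B#C) and (B#C)→D. For the slide's log W = {ABCD, ACBD, EF}, B and C are concurrent, so XW contains only the five causal pairs.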
The Alpha-Algorithm (cont.)
5. Construct the set YW as a subset of XW by:
a. Eliminating a→b and a→c if there exists some a→(b#c).
b. Eliminating b→d and c→d if there exists some (b#c)→d.
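6. Connect start and end events in the following way:
a. If there are multiple tasks in the set TI of first tasks, draw a start
event leading to an XOR-split that connects to every task in TI. Otherwise,
connect the start event directly to the only first task.
b. For each task in the set TO of last tasks, add an end event and draw an
arc from the task to the end event.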
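Step 5 can be sketched as a simple filter over XW. The tagged-tuple encoding (`("a", a, b)` for a→b, `("b", a, (b, c))` for a→(b#c), `("c", (b, c), d)` for (b#c)→d) and the name `reduce_to_y_w` are assumptions made for this example. Rule b is taken here to remove b→d and c→d, the two pairs that (b#c)→d subsumes.

```python
def reduce_to_y_w(x_w):
    """Drop every pair that is already covered by a triple in XW."""
    redundant = set()
    for kind, left, right in x_w:
        if kind == "b":        # a -> (b # c) covers a -> b and a -> c
            a, (b, c) = left, right
            redundant |= {("a", a, b), ("a", a, c)}
        elif kind == "c":      # (b # c) -> d covers b -> d and c -> d
            (b, c), d = left, right
            redundant |= {("a", b, d), ("a", c, d)}
    return x_w - redundant

# XW for the hypothetical log {ABD, ACD}: four pairs and two triples.
x_w = {
    ("a", "A", "B"), ("a", "A", "C"), ("a", "B", "D"), ("a", "C", "D"),
    ("b", "A", ("B", "C")), ("c", ("B", "C"), "D"),
}
y_w = reduce_to_y_w(x_w)
print(sorted(y_w))  # [('b', 'A', ('B', 'C')), ('c', ('B', 'C'), 'D')]
```

When XW contains no triples, as for the slide's log W = {ABCD, ACBD, EF}, nothing is eliminated and YW equals XW.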
The Alpha-Algorithm (cont.)
7. Construct the flow arcs in the following way:
a. Pattern (a): For each a→b in YW, draw an arc from a to b.
b. Pattern (b): For each a→(b#c) in YW, draw an arc from a to an XOR-
split, and from there to b and c.
c. Pattern (c): For each (b#c)→d in YW, draw an arc from b and c to an
XOR-join, and from there to d.
d. Pattern (d) and (e): If a task in the so constructed process model has
multiple incoming or multiple outgoing arcs, bundle these arcs with an
AND-split or AND-join, respectively.
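Step 7 can be sketched as follows, applied to YW for the slide's log W = {ABCD, ACBD, EF}. The encoding is an assumption made for this example: YW elements are tagged tuples (`("a", a, b)`, `("b", a, (b, c))`, `("c", (b, c), d)`), gateways are represented as strings such as `"AND-split(A)"`, and task names are assumed to contain no parentheses. The function names are hypothetical.

```python
from collections import defaultdict

def is_task(node):
    # Assumption: gateway names contain "(", task names do not.
    return "(" not in node

def bundle_with_and(arcs):
    """Step 7d: insert AND gateways where a task has several arcs."""
    arcs = set(arcs)
    outgoing = defaultdict(set)
    for src, dst in arcs:
        outgoing[src].add(dst)
    for task, targets in outgoing.items():
        if is_task(task) and len(targets) > 1:
            gw = f"AND-split({task})"
            arcs -= {(task, t) for t in targets}
            arcs |= {(task, gw)} | {(gw, t) for t in targets}
    # Recompute incoming arcs after the split pass, then add the joins.
    incoming = defaultdict(set)
    for src, dst in arcs:
        incoming[dst].add(src)
    for task, sources in incoming.items():
        if is_task(task) and len(sources) > 1:
            gw = f"AND-join({task})"
            arcs -= {(s, task) for s in sources}
            arcs |= {(s, gw) for s in sources} | {(gw, task)}
    return arcs

def build_arcs(y_w):
    """Steps 7a to 7c, followed by the AND bundling of step 7d."""
    arcs = set()
    for kind, left, right in y_w:
        if kind == "a":                      # a -> b
            arcs.add((left, right))
        elif kind == "b":                    # a -> (b # c)
            gw = f"XOR-split({left})"
            arcs |= {(left, gw)} | {(gw, t) for t in right}
        elif kind == "c":                    # (b # c) -> d
            gw = f"XOR-join({right})"
            arcs |= {(t, gw) for t in left} | {(gw, right)}
    return bundle_with_and(arcs)

y_w = {("a", "A", "B"), ("a", "A", "C"), ("a", "B", "D"),
       ("a", "C", "D"), ("a", "E", "F")}
for arc in sorted(build_arcs(y_w)):
    print(arc)
```

For this YW, A receives an AND-split towards B and C, D receives an AND-join from B and C, and E→F stays a plain arc. Start and end events are left out of the sketch.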