DM Lect 5 - Sequence & Stream Mining
Mining
• Frequent Patterns Mining
  - Association Rules: Apriori, FP-Growth, ECLAT
  - Sequence Mining: GSP, SPADE, PrefixSpan
• Clustering
• Classification
Element vs Event
• Element: one itemset of a sequence, e.g. {abc}
• Event: a single item inside an element, e.g. a
Prefix Projection
PrefixSpan - Example
• 1. Find length-1 sequential patterns:
id Sequence
10 <{a}{abc}{ac}{d}{cf}>
20 <{ad}{c}{bc}{ae}>
30 <{ef}{ab}{df}{c}{b}>
40 <{e}{g}{af}{c}{b}{c}>
Event    a  b  c  d  e  f  g
Support  4  4  4  3  3  3  1
Frequent Events:
<a>,<b>,<c>,<d>,<e>,<f>
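
Step 1 can be checked with a few lines of Python. This is a minimal sketch, not the lecture's implementation: each element is written as a string of events, and `MIN_SUPPORT = 2` is an assumed threshold, since the slide does not state one.

```python
from collections import Counter

# Sequence database from the slide; each element is a string of events.
db = {
    10: ["a", "abc", "ac", "d", "cf"],
    20: ["ad", "c", "bc", "ae"],
    30: ["ef", "ab", "df", "c", "b"],
    40: ["e", "g", "af", "c", "b", "c"],
}

MIN_SUPPORT = 2  # assumed threshold, not given on the slide


def event_support(database):
    """Count, for every event, the number of sequences that contain it."""
    support = Counter()
    for elements in database.values():
        # A sequence contributes at most 1 to the support of an event.
        support.update(set("".join(elements)))
    return support


support = event_support(db)
frequent = sorted(e for e, n in support.items() if n >= MIN_SUPPORT)
print(dict(sorted(support.items())))
print(frequent)
```

Support is counted per sequence, not per occurrence, which is why `a` has support 4 even though it appears several times in sequence 10.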
PrefixSpan - Example
• 2. Divide search space

id   Sequence
10   <{a}{abc}{ac}{d}{cf}>
20   <{ad}{c}{bc}{ae}>
30   <{ef}{ab}{df}{c}{b}>
40   <{e}{g}{af}{c}{b}{c}>

Prefix <a>:
<{abc}{ac}{d}{cf}>
<{_d}{c}{bc}{ae}>
<{_b}{df}{c}{b}>
<{_f}{c}{b}{c}>

Prefix <b>:
<{_c}{ac}{d}{cf}>
<{_c}{ae}>
<{df}{c}{b}>
<{c}>

Prefix <c>:
<{ac}{d}{cf}>
<{bc}{ae}>
<{b}>
<{b}{c}>

Prefix <d>:
<{cf}>
<{c}{bc}{ae}>
<{_f}{c}{b}>

Prefix <e>:
<{_f}{ab}{df}{c}{b}>
<{af}{c}{b}{c}>

Prefix <f>:
<{ab}{df}{c}{b}>
<{c}{b}{c}>
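
The projected databases can be reproduced with a short sketch. The helper names (`drop_infrequent`, `project`, `projected_db`) are hypothetical, and two assumptions are made: events inside an element are kept in alphabetical order, and the infrequent event g is removed before projecting, which is what the `<e>` projection on the slide suggests.

```python
# Sequence database from the slide; each element is a string of events.
db = {
    10: ["a", "abc", "ac", "d", "cf"],
    20: ["ad", "c", "bc", "ae"],
    30: ["ef", "ab", "df", "c", "b"],
    40: ["e", "g", "af", "c", "b", "c"],
}
FREQUENT = set("abcdef")  # frequent events from step 1 (g is dropped)


def drop_infrequent(sequence):
    """Remove infrequent events and any element that becomes empty."""
    cleaned = ["".join(x for x in element if x in FREQUENT) for element in sequence]
    return [e for e in cleaned if e]


def project(sequence, item):
    """Suffix of `sequence` after the first occurrence of `item`, or None if absent."""
    for i, element in enumerate(sequence):
        if item in element:
            # Events after `item` in the same element are kept as a "_" element.
            rest = element[element.index(item) + 1:]
            head = ["_" + rest] if rest else []
            return head + sequence[i + 1:]
    return None


def projected_db(item):
    """All non-empty suffixes for the single-event prefix <item>."""
    out = []
    for sequence in db.values():
        suffix = project(drop_infrequent(sequence), item)
        if suffix:  # skip sequences without `item` and empty suffixes
            out.append("<" + "".join("{" + e + "}" for e in suffix) + ">")
    return out


for item in "abcdef":
    print(item, projected_db(item))
```

Sequences in which the prefix event is the very last event (for example e in sequence 20) produce an empty suffix and are left out, matching the slide.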
PrefixSpan – Example
Prefix <d>:
<{cf}>
<{c}{bc}{ae}>
<{_f}{c}{b}>

Frequent Sequences: <{d}{b}>, <{d}{c}>
PrefixSpan – Example
• Continue with frequent sequences:
Prefix <d>:
<{cf}>
<{c}{bc}{ae}>
<{_f}{c}{b}>

Prefix <{d}{b}>:
<{_c}{ae}>

Prefix <{d}{c}>:
<{bc}{ae}>
<{b}>

Event    b  a  e  c
Support  2  1  1  1

Frequent: <{d}{c}{b}>

Prefix <{d}{c}{b}>:
<{ae}>
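
The recursion on the `<d>` branch can be sketched as follows. This is a simplified variant, not full PrefixSpan: it only grows a pattern by appending a new single-event element (so itemset extensions such as `<{d}{bc}>` are ignored), it stores positions instead of copying suffixes, and `MIN_SUPPORT = 2` is an assumed threshold. The names `advance` and `grow` are hypothetical.

```python
from collections import Counter

db = [
    ["a", "abc", "ac", "d", "cf"],
    ["ad", "c", "bc", "ae"],
    ["ef", "ab", "df", "c", "b"],
    ["e", "g", "af", "c", "b", "c"],
]
MIN_SUPPORT = 2  # assumed threshold, not given on the slide


def advance(pointers, item):
    """Move each (sequence, position) pointer to the next element containing `item`."""
    out = []
    for s, pos in pointers:
        for j in range(pos + 1, len(db[s])):
            if item in db[s][j]:
                out.append((s, j))
                break  # sequences without a later `item` are dropped
    return out


def grow(prefix, pointers, results):
    """Record `prefix`, then extend it with every frequent following event."""
    results[prefix] = len(pointers)
    counts = Counter()
    for s, pos in pointers:
        # Events in elements strictly after the last matched element.
        counts.update(set("".join(db[s][pos + 1:])))
    for item in sorted(counts):
        if counts[item] >= MIN_SUPPORT:
            grow(prefix + (item,), advance(pointers, item), results)


results = {}
start = [(s, -1) for s in range(len(db))]
grow(("d",), advance(start, "d"), results)
for pattern, sup in results.items():
    print("<" + "".join("{" + e + "}" for e in pattern) + ">", sup)
```

Under these assumptions the branch yields `<{d}>`, `<{d}{b}>`, `<{d}{c}>` and `<{d}{c}{b}>`, which agrees with the frequent sequences listed on the slides.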
PrefixSpan – Example
• Find combinations with c:
Prefix <c>:
<{ac}{d}{cf}>
<{bc}{ae}>
<{b}>
<{b}{c}>

Event    a  b  c  d  e  f
Support  2  3  3  1  1  1

Frequent: <{c}{a}>, <{c}{b}>, <{c}{c}>
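
The counts for the `<c>` branch can be verified directly on the projected database listed above. A small sketch, again with an assumed `MIN_SUPPORT = 2`:

```python
from collections import Counter

# <c>-projected database as listed on the slide.
c_projected = [
    ["ac", "d", "cf"],
    ["bc", "ae"],
    ["b"],
    ["b", "c"],
]
MIN_SUPPORT = 2  # assumed threshold, not given on the slide

counts = Counter()
for suffix in c_projected:
    # Each suffix contributes at most 1 per event.
    counts.update(set("".join(suffix)))

extensions = ["<{c}{%s}>" % e for e in sorted(counts) if counts[e] >= MIN_SUPPORT]
print(dict(sorted(counts.items())))
print(extensions)
```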
[Flowchart: counting itemsets over a stream. Recoverable labels: Generate subsets; Count frequency; Itemset exists?; Increase frequency; f + ∆ ≤ b?; Delete itemset; f ≥ β?; Insert into list with ∆ = b − β; All subsets processed?; End of bucket?; Delete items with f + ∆ ≤ b]
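
The flowchart labels above (f, ∆, b, β) appear consistent with a Lossy Counting style scheme for itemsets that processes β buckets at a time. The sketch below is one possible reading of it, not the lecture's definitive procedure. Assumptions: `bucket_width` transactions form a bucket, a trailing incomplete batch is ignored, and subsets are capped at `max_size` events to keep the example small. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def subsets(transaction, max_size):
    """Generate all itemsets of up to `max_size` events from one transaction."""
    items = sorted(set(transaction))
    for k in range(1, max_size + 1):
        yield from combinations(items, k)


def process_batch(entries, batch, b, beta, max_size):
    """Update `entries` ({itemset: (f, delta)}) with one batch of beta buckets."""
    counts = Counter()
    for transaction in batch:
        counts.update(subsets(transaction, max_size))  # count frequency
    known = list(entries)  # itemsets that existed before this batch
    for itemset, f in counts.items():
        if itemset in entries:          # itemset exists: increase frequency
            old_f, delta = entries[itemset]
            entries[itemset] = (old_f + f, delta)
        elif f >= beta:                 # new itemset: insert with delta = b - beta
            entries[itemset] = (f, b - beta)
    for itemset in known:               # delete old itemsets with f + delta <= b
        f, delta = entries[itemset]
        if f + delta <= b:
            del entries[itemset]


def lossy_count_itemsets(stream, bucket_width, beta, max_size=2):
    entries, batch, b = {}, [], 0
    for transaction in stream:
        batch.append(transaction)
        if len(batch) == bucket_width * beta:
            b += beta  # id of the current bucket after this batch
            process_batch(entries, batch, b, beta, max_size)
            batch = []
    return entries  # a trailing partial batch is ignored in this sketch


stream = ["ab", "ab", "abc", "b", "ab", "ac", "ab", "abc"]
result = lossy_count_itemsets(stream, bucket_width=2, beta=2)
for itemset, (f, delta) in sorted(result.items()):
    print(itemset, f, delta)
```

In this run `('c',)` enters the list only in the second batch, so it is stored with f = 2 and ∆ = 2: its true count (3) lies between f and f + ∆.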
Thank You