Lecture 13 14 FP

The FP-growth algorithm addresses the bottleneck of frequent-pattern mining by avoiding multiple database scans and candidate generation, using a compact FP-tree structure instead. It employs a divide-and-conquer methodology to efficiently mine frequent patterns by recursively constructing conditional FP-trees. The FP-tree preserves complete information while reducing irrelevant data, making it a powerful tool for frequent pattern mining.


FP-GROWTH ALGORITHM
Bottleneck of Frequent-pattern Mining
• Multiple database scans are costly
• Mining long patterns needs many scanning passes and generates lots of candidates
  - To find the frequent itemset $i_1 i_2 \dots i_{100}$:
    - # of scans: 100
    - # of candidates: $\binom{100}{1} + \binom{100}{2} + \dots + \binom{100}{100} = 2^{100} - 1 \approx 1.27 \times 10^{30}$
• Bottleneck: candidate-generation-and-test
• Can we avoid candidate generation?
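
The candidate count above can be checked directly. This is a small sketch, not part of the original slides; the variable names are illustrative.

```python
from math import comb

n = 100  # length of the frequent itemset i_1 ... i_100

# Every non-empty subset of a frequent itemset is itself frequent,
# so candidate-generation-and-test has to enumerate all of them.
num_candidates = sum(comb(n, k) for k in range(1, n + 1))

assert num_candidates == 2**n - 1
print(f"{num_candidates:.3e}")  # 1.268e+30
```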

Md. Manowarul Islam, Dept. Of CSE, JnU


FP-growth: Mining Frequent Patterns
• Compress a large database into a compact Frequent-Pattern tree (FP-tree) structure
  - Highly condensed, but complete for frequent-pattern mining
  - Avoids costly database scans
• Develop an efficient, FP-tree-based frequent-pattern mining method
  - A divide-and-conquer methodology: decompose mining tasks into smaller ones
  - Avoid candidate generation: sub-database test only!


FP-tree Construction (min_support = 3)

TID   Items bought                 (Ordered) frequent items
100   {f, a, c, d, g, i, m, p}     {f, c, a, m, p}
200   {a, b, c, f, l, m, o}        {f, c, a, b, m}
300   {b, f, h, j, o}              {f, b}
400   {b, c, k, s, p}              {c, b, p}
500   {a, f, c, e, l, p, m, n}     {f, c, a, m, p}

Item  Frequency
f     4
c     4
a     3
b     3
m     3
p     3

Steps:
1. Scan the DB once and find the frequent 1-itemsets (single-item patterns)
2. Order the frequent items in descending order of their frequency
3. Scan the DB again and construct the FP-tree
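
Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the slides do not say how ties between equally frequent items are broken, so `SLIDE_TIE_ORDER` is an assumption that simply copies the order of the frequency table above.

```python
from collections import Counter

transactions = {
    100: ["f", "a", "c", "d", "g", "i", "m", "p"],
    200: ["a", "b", "c", "f", "l", "m", "o"],
    300: ["b", "f", "h", "j", "o"],
    400: ["b", "c", "k", "s", "p"],
    500: ["a", "f", "c", "e", "l", "p", "m", "n"],
}
MIN_SUPPORT = 3

# Step 1: one scan to count every item, then keep the frequent 1-itemsets.
counts = Counter(item for items in transactions.values() for item in items)
frequent = {item: n for item, n in counts.items() if n >= MIN_SUPPORT}

# Step 2: frequency-descending order.
# Assumption: ties follow the order shown in the slide's frequency table.
SLIDE_TIE_ORDER = "fcabmp"
order = sorted(frequent, key=lambda i: (-frequent[i], SLIDE_TIE_ORDER.index(i)))

# Drop infrequent items and reorder each transaction (input to step 3).
ordered = {
    tid: [item for item in order if item in items]
    for tid, items in transactions.items()
}

for tid, items in ordered.items():
    print(tid, items)
```

The reordered transactions match the "(ordered) frequent items" column, for example `['c', 'b', 'p']` for TID 400.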

FP-tree Construction (continued, min_support = 3)

TID   Frequent items (ordered)
100   {f, c, a, m, p}
200   {f, c, a, b, m}
300   {f, b}
400   {c, b, p}
500   {f, c, a, m, p}

After inserting TID 100:
root
  f:1
    c:1
      a:1
        m:1
          p:1

After inserting TID 200:
root
  f:2
    c:2
      a:2
        m:1
          p:1
        b:1
          m:1

After inserting TIDs 300 and 400:
root
  f:3
    c:2
      a:2
        m:1
          p:1
        b:1
          m:1
    b:1
  c:1
    b:1
      p:1

After inserting TID 500 (final FP-tree):
root
  f:4
    c:3
      a:3
        m:2
          p:2
        b:1
          m:1
    b:1
  c:1
    b:1
      p:1

Header table (item, frequency, head of node-links):
Item  Frequency
f     4
c     4
a     3
b     3
m     3
p     3
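
The second scan can be sketched as follows. This is a minimal illustration rather than the lecturer's implementation: the `Node` class, `build_fp_tree`, and `render` are hypothetical names, and the header table stores a plain list of nodes per item in place of a linked chain of node-links.

```python
class Node:
    """One FP-tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_transactions):
    root = Node(None)
    header = {}  # item -> list of nodes (stands in for the node-link chain)
    for items in ordered_transactions:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                # No shared prefix from here on: start a new branch.
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def render(node, depth=0):
    """Return the tree as indented 'item:count' lines."""
    label = "root" if node.item is None else f"{node.item}:{node.count}"
    lines = ["  " * depth + label]
    for child in node.children.values():
        lines.extend(render(child, depth + 1))
    return lines


ordered = [list("fcamp"), list("fcabm"), list("fb"), list("cbp"), list("fcamp")]
root, header = build_fp_tree(ordered)
print("\n".join(render(root)))
```

Run on the five ordered transactions, the sketch prints the same final tree as above, and summing the counts in `header["p"]` gives 2 + 1 = 3.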


Benefits of the FP-tree Structure

• Completeness
  - Never breaks a long pattern of any transaction
  - Preserves complete information for frequent-pattern mining
• Compactness
  - Reduces irrelevant information: infrequent items are gone
  - Frequency-descending ordering: more frequent items are more likely to be shared
  - Never larger than the original database (not counting node-links and counts)
  - Example: for the Connect-4 DB, the compression ratio could be over 100
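
The compactness claim can be checked on the running example without building the tree. The sketch below relies on one observation: every distinct prefix of an ordered transaction corresponds to exactly one FP-tree node. The variable names are illustrative.

```python
ordered = [list("fcamp"), list("fcabm"), list("fb"), list("cbp"), list("fcamp")]

# Each distinct prefix of an ordered transaction corresponds to one tree node.
nodes = {tuple(t[:k]) for t in ordered for k in range(1, len(t) + 1)}
item_occurrences = sum(len(t) for t in ordered)

print(len(nodes), item_occurrences)  # 11 20
```

The tree needs 11 nodes to represent the 20 item occurrences of the filtered database, because shared prefixes such as f, c, a are stored once.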


Mining Frequent Patterns Using the FP-tree
• General idea (divide-and-conquer)
  - Recursively grow frequent-pattern paths using the FP-tree
• Method
  - For each item, construct its conditional pattern base, and then its conditional FP-tree
  - Repeat the process on each newly created conditional FP-tree
  - Stop when the resulting FP-tree is empty or contains only one path (a single path generates all the combinations of its sub-paths, each of which is a frequent pattern)
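
The method above can be sketched as a short recursive function. This is an illustrative sketch under stated assumptions, not the lecturer's code: `Node`, `build_tree`, and `fp_growth` are hypothetical names, ties in the frequency ordering are broken alphabetically (so the tree shape can differ from the slides while the mined patterns stay the same), and the single-path shortcut is omitted because plain recursion produces the same result.

```python
from collections import defaultdict


class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}


def build_tree(weighted_paths, min_support):
    """Build an FP-tree from (items, count) pairs and return its header table."""
    support = defaultdict(int)
    for items, count in weighted_paths:
        for item in items:
            support[item] += count
    keep = {i for i, s in support.items() if s >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, count in weighted_paths:
        # Frequency-descending order; alphabetical tie-break is an assumption.
        ordered = sorted((i for i in items if i in keep),
                         key=lambda i: (-support[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += count
    return header


def fp_growth(weighted_paths, min_support, suffix=()):
    """Yield (pattern, support) for every frequent pattern ending in suffix."""
    header = build_tree(weighted_paths, min_support)
    for item, nodes in header.items():
        pattern = (item,) + suffix
        yield frozenset(pattern), sum(n.count for n in nodes)
        # Conditional pattern base: the prefix path above each node of `item`.
        base = []
        for n in nodes:
            path, p = [], n.parent
            while p.item is not None:
                path.append(p.item)
                p = p.parent
            if path:
                base.append((path, n.count))
        # Recurse on the conditional FP-tree built from that base.
        yield from fp_growth(base, min_support, pattern)


db = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
patterns = dict(fp_growth([(list(t), 1) for t in db], min_support=3))
print(len(patterns))  # 18
```

On the five-transaction example with min_support = 3 the sketch finds 18 frequent patterns, including {f, c, a, m} with support 3 and {c, p} with support 3.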


Mining Frequent Patterns Using the FP-tree

Bottom-up traversal of the tree: first the itemsets ending in p, then those ending in m, and so on. Each group is a suffix-based class.

FP-tree:
root
  f:4
    c:3
      a:3
        m:2
          p:2
        b:1
          m:1
    b:1
  c:1
    b:1
      p:1


Mining Frequent Patterns Using the FP-tree: item p
• Construct the prefix-path subtree for p: find all prefix paths that contain p

Prefix paths for p:
root
  f:4
    c:3
      a:3
        m:2
          p:2
  c:1
    b:1
      p:1

• Compute the support of p (minsup = 3): follow the node-links while summing the counts: 2 + 1 = 3 >= 3, so p is frequent

Conditional pattern base for p:
root
  f:2
    c:2
      a:2
        m:2
  c:1
    b:1

• Prune infrequent items: in the conditional FP-tree some nodes may have support less than minsup
  - m needs to be pruned (m = 2)
  - a needs to be pruned (a = 2)
  - b needs to be pruned (b = 1)
  - f needs to be pruned (f = 2)

Conditional FP-tree for p:
root
  c:3

All frequent patterns that include p: {p, cp}
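
The pruning step for p can be reproduced from its conditional pattern base alone. The sketch below takes the two prefix paths read off the tree as given data; the variable names are illustrative.

```python
from collections import Counter

MIN_SUPPORT = 3

# Prefix paths of p read off the FP-tree, each with the count of its p node.
pattern_base_p = [(("f", "c", "a", "m"), 2), (("c", "b"), 1)]

# Support of each item within the conditional pattern base.
support = Counter()
for path, count in pattern_base_p:
    for item in path:
        support[item] += count

pruned = {i: s for i, s in support.items() if s < MIN_SUPPORT}
kept = {i: s for i, s in support.items() if s >= MIN_SUPPORT}

# p itself is frequent; each surviving item extends the suffix p.
patterns = [("p",)] + [(i, "p") for i in kept]
print(pruned)    # {'f': 2, 'a': 2, 'm': 2, 'b': 1}
print(kept)      # {'c': 3}
print(patterns)  # [('p',), ('c', 'p')]
```

Only c reaches the minimum support (2 from the first path plus 1 from the second), which is why the conditional FP-tree for p is the single node c:3.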


Mining Frequent Patterns Using the FP-tree: item m
• Construct the prefix-path subtree for m: find all prefix paths that contain m

Prefix paths for m:
root
  f:4
    c:3
      a:3
        m:2
        b:1
          m:1

• Compute the support of m (minsup = 3): follow the node-links while summing the counts: 2 + 1 = 3 >= 3, so m is frequent

Conditional pattern base for m:
root
  f:3
    c:3
      a:3
        b:1

• Prune infrequent items
  - b needs to be pruned (b = 1)

Conditional FP-tree for m:
root
  f:3
    c:3
      a:3

Conditional FP-tree for am:
root
  f:3
    c:3

Conditional FP-tree for cam:
root
  f:3

All frequent patterns that include m: {m, fm, cm, am, fcm, fam, cam, fcam}
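
Because the conditional FP-tree for m is a single path, its patterns can be listed by enumerating the sub-paths instead of recursing. A minimal sketch follows; the variable names are illustrative. Every node on this path has count 3, so every generated pattern has support 3; in general the support is the smallest count among the chosen nodes.

```python
from itertools import combinations

single_path = ["f", "c", "a"]  # conditional FP-tree for m: f:3 -> c:3 -> a:3
suffix = "m"

# Every subset of the path (including the empty one) joined with the suffix.
patterns = [
    "".join(subset) + suffix
    for r in range(len(single_path) + 1)
    for subset in combinations(single_path, r)
]
print(patterns)
# ['m', 'fm', 'cm', 'am', 'fcm', 'fam', 'cam', 'fcam']
```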


Mining Frequent Patterns Using the FP-tree: item b
• Construct the prefix-path subtree for b: find all prefix paths that contain b

Prefix paths for b:
root
  f:4
    c:3
      a:3
        b:1
    b:1
  c:1
    b:1

• Compute the support of b (minsup = 3): follow the node-links while summing the counts: 1 + 1 + 1 = 3 >= 3, so b is frequent

Conditional pattern base for b:
root
  f:2
    c:1
      a:1
  c:1

• Prune infrequent items
  - f needs to be pruned (f = 2)
  - c needs to be pruned (c = 2)
  - a needs to be pruned (a = 1)

Conditional FP-tree for b: none (every item is pruned)

All frequent patterns that include b: {b}


Mining Frequent Patterns Using the FP-tree: item a
• Construct the prefix-path subtree for a: find all prefix paths that contain a

Prefix path for a:
root
  f:4
    c:3
      a:3

• Compute the support of a (minsup = 3): follow the node-links while summing the counts: 3 >= 3, so a is frequent

Conditional pattern base for a:
root
  f:3
    c:3

• Prune infrequent items: nothing is pruned (f = 3 and c = 3)

Conditional FP-tree for a:
root
  f:3
    c:3

All frequent patterns that include a: {a, fa, ca, fca}


Mining Frequent Patterns Using the FP-tree: item c
• Construct the prefix-path subtree for c: find all prefix paths that contain c

Prefix paths for c:
root
  f:4
    c:3
  c:1

• Compute the support of c (minsup = 3): follow the node-links while summing the counts: 3 + 1 = 4 >= 3, so c is frequent

Conditional pattern base for c:
root
  f:3

• Prune infrequent items: nothing is pruned (f = 3)

Conditional FP-tree for c:
root
  f:3

All frequent patterns that include c: {c, fc}


Mining Frequent Patterns Using the FP-tree: item f
• Construct the prefix-path subtree for f: find all prefix paths that contain f

Prefix path for f:
root
  f:4

• Compute the support of f (minsup = 3): follow the node-links while summing the counts: 4 >= 3, so f is frequent

Conditional pattern base for f: empty
Conditional FP-tree for f: empty

All frequent patterns that include f: {f}

Done.


Conditional Pattern Bases for the Example

Item  Conditional pattern base        Conditional FP-tree
p     {(fcam:2), (cb:1)}              {(c:3)}|p
m     {(fca:2), (fcab:1)}             {(f:3, c:3, a:3)}|m
b     {(fca:1), (f:1), (c:1)}         Empty
a     {(fc:3)}                        {(f:3, c:3)}|a
c     {(f:3)}                         {(f:3)}|c
f     Empty                           Empty
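
The conditional pattern bases in this table can be cross-checked with a short sketch. It is a shortcut under a stated assumption: instead of following node-links in the tree, it reads each item's prefix directly from the ordered transactions, which gives the same bases because every tree path is a shared transaction prefix. The function name is hypothetical.

```python
from collections import Counter

ordered = [list("fcamp"), list("fcabm"), list("fb"), list("cbp"), list("fcamp")]


def conditional_pattern_base(item):
    """Count the non-empty prefixes that precede `item` in the ordered transactions."""
    base = Counter()
    for items in ordered:
        if item in items:
            prefix = "".join(items[: items.index(item)])
            if prefix:
                base[prefix] += 1
    return dict(base)


for item in "pmbacf":
    print(item, conditional_pattern_base(item))
```

For example, the sketch gives `{'fcam': 2, 'cb': 1}` for p and an empty base for f, in agreement with the table.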


Thank you
