FP-Tree
Presented By:
Xun Luo and Shun Liang
04/07/2005
Outline
Introduction
Constructing FP-Tree
Example 1
Performance Evaluation
Discussions
Introduction
Terminology
Apriori-like Algorithms
Generate-and-Test
Cost Bottleneck
Terminology
Item set
A set of items: I = {a1, a2, …, am}
Transaction database
DB = <T1, T2, …, Tn>
Pattern
A set of items: A
Support
The number of transactions containing A in DB
Frequent pattern
A’s support ≥ minimum support threshold ξ
Frequent Pattern Mining Problem
The problem of finding the complete set of frequent patterns
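The definitions above can be sketched in Python. The toy DB and pattern below are illustrative examples of mine, not taken from the paper:

```python
# Support of a pattern A = the number of transactions in DB containing A.
# Toy data for illustration (hypothetical, not Example 1's database).
DB = [{"f", "a", "c", "m", "p"}, {"f", "c", "b"}, {"c", "b", "p"}]
A = {"c", "p"}                            # the pattern

support = sum(1 for T in DB if A <= T)    # A <= T tests "A is a subset of T"
min_sup = 2                               # minimum support threshold (xi)
is_frequent = support >= min_sup
print(support, is_frequent)               # 2 True -> A is a frequent pattern
```

The frequent pattern mining problem is then: enumerate every pattern A with `support >= min_sup`.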
Apriori-like Algorithms
Algorithm
Anti-Monotone Heuristic
If any length-k pattern is not frequent in the database, its length-(k+1) super-pattern
can never be frequent
Generating the candidate set
Testing the candidate set via pattern matching
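The generate-and-test loop above can be sketched as follows. This is a minimal illustration on toy data (my own example); real Apriori implementations add hash-tree and other optimizations, and the repeated DB scans in the test step are exactly the cost bottleneck:

```python
from itertools import combinations

# Toy database (hypothetical, for illustration only).
DB = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
min_sup = 2

def support(pattern):
    # one pass over DB per candidate set: the cost bottleneck
    return sum(1 for T in DB if pattern <= T)

# L1: frequent length-1 patterns
items = {i for T in DB for i in T}
Lk = {frozenset({i}) for i in items if support(frozenset({i})) >= min_sup}

all_frequent = set(Lk)
k = 1
while Lk:
    k += 1
    # generate length-k candidates by joining frequent (k-1)-patterns
    candidates = {a | b for a in Lk for b in Lk if len(a | b) == k}
    # prune by the anti-monotone heuristic: every (k-1)-subset must be frequent
    candidates = {c for c in candidates
                  if all(frozenset(s) in Lk for s in combinations(c, k - 1))}
    # test the surviving candidates against the database
    Lk = {c for c in candidates if support(c) >= min_sup}
    all_frequent |= Lk
```

On this toy DB the loop stops at k = 3: {a,b,c} survives pruning but fails the support test.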
FP-Tree and FP-Growth Algorithm
FP-Tree: Frequent Pattern Tree
Compact representation of the DB without information loss.
Easy to traverse, can quickly find out patterns associated with
a certain item.
Well-ordered by item frequency.
FP-Growth Algorithm
Start mining from length-1 patterns
Recursively, for each frequent item: construct its conditional pattern base, then its conditional FP-tree, and mine it
FP-Tree Definition
Three components:
One root: labeled as “null”
A set of item prefix subtrees
A frequent-item header table (each entry: item, head of node-links)
[Figure: example FP-tree — root “null”; paths f:4 → c:3 → a:3 → (m:2 → p:2 and b:1 → m:1), f:4 → b:1, and c:1 → b:1 → p:1; header table lists f, c, a, b, m, p, each with node-links into the tree]
FP-Tree Definition (cont.)
Each node in the item prefix subtree consists of
three fields:
item-name
count
node-link
Each entry in the header table consists of two fields: item-name and head of node-link
Example 1: FP-Tree Construction
The transaction database used (first two columns only):
TID Items Bought (Ordered) Frequent Items
100 f,a,c,d,g,i,m,p f,c,a,m,p
200 a,b,c,f,l,m,o f,c,a,b,m
300 b,f,h,j,o f,b
400 b,c,k,s,p c,b,p
500 a,f,c,e,l,p,m,n f,c,a,m,p
Example 1 (cont.)
First Scan: //count and sort
count the frequency of each item, keep items meeting the minimum support (here ξ = 3), and sort them in descending frequency order: f:4, c:4, a:3, b:3, m:3, p:3
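The first scan can be sketched as follows (transactions transcribed from Example 1's table; the order among items with equal counts is not fixed by the algorithm):

```python
from collections import Counter

# Example 1's transactions, as item lists.
DB = [list("facdgimp"), list("abcflmo"), list("bfhjo"),
      list("bcksp"), list("afcelpmn")]
min_sup = 3                                   # threshold xi

# first scan: count the frequency of each item
freq = Counter(item for T in DB for item in T)
# keep items meeting the threshold, in descending frequency order (the F-list)
flist = [i for i, c in freq.most_common() if c >= min_sup]
print(flist)   # f:4, c:4, a:3, b:3, m:3, p:3 (ties may order differently)
```

Items below the threshold (d, g, i, l, o, …) are discarded; they cannot appear in any frequent pattern.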
Example 1 (cont.)
Second Scan: //create the tree and header table
create the root, label it as “null”
for each transaction, insert its sorted frequent items as a path from the root, sharing common prefixes and incrementing counts
link all nodes carrying the same item via node-links from the header table
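The insertion step of the second scan can be sketched as below; the dict-based node layout is an assumption of this sketch, not the paper's representation:

```python
# Insert one transaction's frequent items (already in F-list order) into the
# tree, sharing common prefixes and extending node-links in the header table.

def insert_transaction(root, header, items):
    node = root
    for item in items:
        child = node["children"].get(item)
        if child is None:
            child = {"item": item, "count": 0, "children": {}}
            node["children"][item] = child
            header.setdefault(item, []).append(child)  # extend item's node-link
        child["count"] += 1
        node = child

root = {"item": None, "count": 0, "children": {}}   # the "null" root
header = {}
insert_transaction(root, header, ["f", "c", "a", "m", "p"])  # transaction 100
insert_transaction(root, header, ["f", "c", "a", "b", "m"])  # transaction 200
print(root["children"]["f"]["count"])  # 2: both transactions share prefix f,c,a
```

After the second insert, the shared prefix f → c → a carries count 2, and m now has two nodes on its node-link (one under a, one under b).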
Example 1 (cont.)
The building process of the tree
[Figure: (1) create root; (2) after transaction 1 (f,c,a,m,p): f:1 → c:1 → a:1 → m:1 → p:1; (3) after transaction 2 (f,c,a,b,m): f:2 → c:2 → a:2 → (m:1 → p:1 and b:1 → m:1); (4) after transaction 3 (f,b): f:3 gains child b:1]
Example 1 (cont.)
The building process of the tree (cont.)
[Figure: after transaction 4 (c,b,p), the root gains a second branch c:1 → b:1 → p:1 (tree rooted with f:3); after transaction 5 (f,c,a,m,p), counts along f → c → a → m → p are incremented, giving the final tree with f:4, c:3, a:3, m:2, p:2]
FP-Tree Properties
Completeness
Each transaction that contains frequent items is
mapped to a path.
Prefix sharing does not cause path ambiguity, as
only a path starting from the root represents a transaction.
Compactness
Number of nodes bounded by overall occurrence of
frequent items.
Height of tree bounded by maximal number of
frequent items in any transaction.
FP-Tree Properties (cont.)
Traversal Friendly (for mining task)
For any frequent item ai, all the possible frequent patterns that contain ai can be obtained by following ai’s node-links.
This property is important for divide-and-conquer:
it assures the soundness and completeness of problem reduction.
Outline
Introduction
Constructing FP-Tree
Example 1
Performance Evaluation
Discussions
FP-Growth Algorithm
Functionality:
Mining frequent patterns using FP-Tree generated before
Input:
FP-tree constructed earlier
Main algorithm:
Call FP-growth(FP-tree, null)
FP-growth(Tree, α)
Procedure FP-growth(Tree, α)
{
  if (Tree contains only a single path P)
  { for each combination β of the nodes in P
    { generate pattern β ∪ α;
      β.support = min(support of all nodes in β);
    }
  }
  else // Tree contains more than one path
  { for each ai in the header of Tree
    { generate pattern β = ai ∪ α;
      β.support = ai.support;
      construct β’s conditional pattern base;
      construct β’s conditional FP-tree Treeβ;
      if (Treeβ ≠ Φ)
        FP-growth(Treeβ, β);
    }
  }
}
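The pseudocode above can be sketched as a runnable Python program. This is a simplified variant of my own: instead of projecting along node-links, it rebuilds each conditional FP-tree from an explicit list of prefix paths, and it skips the single-path special case (the recursion produces the same result). The names `Node`, `build_tree`, and `fp_growth` are assumptions of this sketch, not the paper's:

```python
from collections import defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(transactions, min_sup):
    # first scan: item frequencies
    freq = defaultdict(int)
    for t in transactions:
        for item in t:
            freq[item] += 1
    keep = {i for i, c in freq.items() if c >= min_sup}
    # second scan: insert each transaction's frequent items in F-list order
    root, header = Node(None, None), defaultdict(list)
    for t in transactions:
        node = root
        for item in sorted((i for i in t if i in keep),
                           key=lambda i: (-freq[i], i)):
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])  # node-link
            node = node.children[item]
            node.count += 1
    return header

def fp_growth(transactions, min_sup, suffix=frozenset()):
    """Return {pattern: support} for all frequent patterns."""
    header = build_tree(transactions, min_sup)
    patterns = {}
    for item, links in header.items():
        pattern = suffix | {item}
        patterns[pattern] = sum(n.count for n in links)
        # conditional pattern base: each node's prefix path, count times
        base = []
        for n in links:
            path, p = [], n.parent
            while p.item is not None:
                path.append(p.item)
                p = p.parent
            base.extend([path] * n.count)
        patterns.update(fp_growth(base, min_sup, pattern))  # recurse
    return patterns

if __name__ == "__main__":
    db = [list("facdgimp"), list("abcflmo"), list("bfhjo"),
          list("bcksp"), list("afcelpmn")]
    print(len(fp_growth(db, 3)))   # 18 frequent patterns for Example 1, xi = 3
```

On Example 1's database with ξ = 3 this yields 18 frequent patterns, including (fcam:3) and (cp:3), matching the hand-traced results in Example 2 below.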
Example 2
Start from the bottom of the header table: node p
Two paths (transformed prefix paths)
p’s conditional pattern base
{(f:2, c:2, a:2, m:2), (c:1, b:1)}
p’s conditional FP-tree
Only one branch (c:3)
pattern (cp:3)
Patterns:
(p:3)
(cp:3)
[Figure: the FP-tree and header table from Example 1]
Example 2 (cont.)
Continue with node m
Two paths
m’s conditional pattern base
{(f:2, c:2, a:2), (f:1, c:1, a:1, b:1)}
m’s conditional FP-tree:
(f:3, c:3, a:3)
Call mine(<f:3, c:3, a:3> | m)
Patterns:
(m:3)
see next slide
mine(<f:3, c:3, a:3> | m)
node a:
(am:3)
call mine(<f:3, c:3> | am)
(cam:3)
call mine(<f:3> | cam)
(fcam:3)
(fam:3)
node c:
(cm:3)
call mine(<f:3> | cm)
(fcm:3)
node f:
(fm:3)
[Figure: conditional FP-tree of “m”: f:3 → c:3 → a:3]
All the patterns: (m:3, am:3, cm:3, fm:3, cam:3, fam:3, fcm:3, fcam:3)
Conclusion: A single-path FP-Tree can be mined by outputting all the
combinations of the items in the path.
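The conclusion above can be checked with a short sketch; `path` encodes m's conditional FP-tree, and the variable names are mine:

```python
from itertools import combinations

# Mine a single-path FP-tree by enumerating every non-empty combination of
# its items; each combination's support is the minimum count along it.
path = [("f", 3), ("c", 3), ("a", 3)]        # m's conditional FP-tree
counts = dict(path)

patterns = {}
for r in range(1, len(path) + 1):
    for combo in combinations(counts, r):
        patterns[frozenset(combo)] = min(counts[i] for i in combo)
print(len(patterns))   # 7 combinations: f, c, a, fc, fa, ca, fca (all :3)
```

Appending the suffix m to each combination gives exactly the eight patterns of the slide: (m:3), (fm:3), (cm:3), (am:3), (fcm:3), (fam:3), (cam:3), (fcam:3).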
Example 2 (cont.)
Continue with node b
Three paths
b’s conditional pattern base
{(f:1, c:1, a:1), (f:1), (c:1)}
b’s conditional FP-tree
Φ
Patterns:
(b:3)
Example 2 (cont.)
Continue with node a
One path
a’s conditional pattern base
{(f:3, c:3)}
a’s conditional FP-tree
{(f:3, c:3)}
Patterns:
(a:3)
(ca:3)
(fa:3)
(fca:3)
Example 2 (cont.)
Continue with node c
Two paths
c’s conditional pattern base
{(f:3)}
c’s conditional FP-tree
{(f:3)}
Patterns:
(c:4)
(fc:3)
Example 2 (cont.)
Continue with node f
One path
f’s conditional pattern base
Φ
f’s conditional FP-tree
Φ
Patterns:
(f:4)
Example 2 (cont.)
Final results:
item  conditional pattern base                  conditional FP-tree
p     {(f:2, c:2, a:2, m:2), (c:1, b:1)}        {(c:3)} | p
m     {(f:2, c:2, a:2), (f:1, c:1, a:1, b:1)}   {(f:3, c:3, a:3)} | m
b     {(f:1, c:1, a:1), (f:1), (c:1)}           Φ
a     {(f:3, c:3)}                              {(f:3, c:3)} | a
c     {(f:3)}                                   {(f:3)} | c
f     Φ                                         Φ
FP-Growth Properties
Property 3.2 : Prefix path property
To calculate the frequent patterns for a node ai in a path P, only the prefix subpath of node ai in P needs to be accumulated, and the frequency count of every node in the prefix path should carry the same count as node ai.
Lemma 3.1 : Fragment growth
Let α be an itemset in DB, B be α’s conditional pattern base, and β be an itemset in B. Then the support of α ∪ β in DB is equivalent to the support of β in B.
FP-Growth Properties (cont.)
Corollary 3.1 (Pattern growth)
Let α be a frequent itemset in DB, B be α’s conditional pattern base, and β be an itemset in B. Then α ∪ β is frequent in DB if and only if β is frequent in B.
Lemma 3.2 (Single FP-tree path pattern generation)
Suppose an FP-tree T has a single path P. The complete set of the frequent patterns of T can be generated by the enumeration of all the combinations of the subpaths of P, with the support being the minimum support of the items contained in the subpath.
Outline
Introduction
Constructing FP-Tree
Example 1
Performance Evaluation
Discussions
Performance Evaluation:
FP-Tree vs. Apriori
Scalability with Support Threshold
Performance Evaluation:
FP-Tree vs. Apriori (Cont.)
Runtime per frequent itemset actually decreases as
the support threshold decreases.
Performance Evaluation:
FP-Tree vs. Apriori (Cont.)
Scalability with DB size.
Outline
Introduction
Constructing FP-Tree
Example 1
Performance Evaluation
Discussions
Discussions
When database is extremely large.
Use FP-Tree on projected databases.
Materialization of an FP-Tree
Construct it once, independently of queries,
with a minimum support threshold low enough to fit the majority of expected queries.
Incremental updates of an FP-Tree.
Record the frequency count of every item, frequent or not.
Control rebuilding of the tree by a watermark on the support threshold.
Thank you!
Q & A.