
Association Rule Mining-III

Maximal vs Closed Frequent Itemsets


• An itemset X is a max-pattern, or maximal frequent itemset, if X is
frequent and there exists no frequent super-pattern Y ⊃ X.
• An itemset X is closed if X is frequent and there exists no
super-pattern Y ⊃ X with the same support as X.

• Closed frequent itemsets are lossless: the support of any frequent
itemset can be deduced from the closed frequent itemsets (a
brute-force sketch of both notions follows).
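The two definitions are easy to confuse, so here is a minimal
brute-force sketch in Python (not from the slides; the toy database
and all names are illustrative) that computes both sets:

from itertools import combinations

# Toy transactional database (illustrative only, not the slides' T)
db = [{'a', 'b', 'c'}, {'a', 'b'}, {'a', 'b'}, {'a', 'c'}]
min_sup = 2

def support(itemset):
    # Number of transactions containing every item of the itemset
    return sum(itemset <= t for t in db)

# Enumerate all frequent itemsets by brute force (fine for toy data)
items = sorted(set().union(*db))
frequent = {frozenset(c): support(frozenset(c))
            for k in range(1, len(items) + 1)
            for c in combinations(items, k)
            if support(frozenset(c)) >= min_sup}

# Closed: frequent, and no super-pattern has the SAME support
closed = {X for X, s in frequent.items()
          if not any(X < Y and frequent[Y] == s for Y in frequent)}

# Maximal: frequent, and no super-pattern is frequent at all
maximal = {X for X in frequent if not any(X < Y for Y in frequent)}

print('closed :', sorted(map(sorted, closed)))   # {a}, {a,b}, {a,c}
print('maximal:', sorted(map(sorted, maximal)))  # {a,b}, {a,c}

Every maximal itemset is closed, but not conversely: here {a} is
closed (no superset has its support of 4) yet not maximal.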
Frequent Pattern Growth Method
FP-growth (1)
• Example:
– Given a minimum support threshold of 3 and the transactional
database T:
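(The database itself appears as a figure on the original slide; the
transactions below are reconstructed from the support counts and
FP-tree paths used later, and match the classic example of Han and
Kamber, reference 2 below.)
TID 100: f, a, c, d, g, i, m, p
TID 200: a, b, c, f, l, m, o
TID 300: b, f, h, j, o
TID 400: b, c, k, s, p
TID 500: a, f, c, e, l, p, m, n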
FP-growth (2)
• The first scan of database T derives the list L of frequent
items, sorted in descending order of support:
L = {(f, 4), (c, 4), (a, 3), (b, 3), (m, 3), (p, 3)}
– The root of the tree, labeled ROOT, is created.

• Scan the database T a second time.


– The scan of the first transaction leads to the construction of
the first branch of the FP-tree:
{(f, 1), (c, 1), (a, 1), (m, 1), (p, 1)}
– The second transaction shares the common prefix f, c, a with
this branch, so the counts along the prefix are incremented and
a new sub-branch {(b, 1), (m, 1)} is attached under a; the first
branch becomes:
{(f, 2), (c, 2), (a, 2), (m, 1), (p, 1)}
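The node structure and the insertion step can be sketched in a few
lines of Python (the class and function names, FPNode and insert,
are my own, not from the slides); each transaction is inserted with
its frequent items ordered by L and infrequent items dropped:

class FPNode:
    # One FP-tree node: an item label, a count, and child links
    def __init__(self, item, count=0, parent=None):
        self.item, self.count, self.parent = item, count, parent
        self.children = {}

def insert(node, ordered_items, count=1):
    # Walk or extend the tree along the transaction's ordered items
    for item in ordered_items:
        if item not in node.children:       # new suffix: grow a branch
            node.children[item] = FPNode(item, parent=node)
        node = node.children[item]
        node.count += count                 # shared prefix: bump count

root = FPNode(None)                         # the ROOT node of the slide
for txn in (['f', 'c', 'a', 'm', 'p'],      # TID 100
            ['f', 'c', 'a', 'b', 'm'],      # TID 200
            ['f', 'b'],                     # TID 300
            ['c', 'b', 'p'],                # TID 400
            ['f', 'c', 'a', 'm', 'p']):     # TID 500
    insert(root, txn)

After the first two transactions the tree holds exactly the branches
shown above; after all five it contains the paths used in the
following slides.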
FP-growth (3)
• To facilitate tree traversal, an item header table is built, in
which each item of the list L is linked, through node-links, to
the nodes of the FP-tree carrying that item.
• According to the list of frequent items L, the complete set of
frequent itemsets can be divided into subsets (6 for our
example) without overlap; a sketch of the header table follows
this list:
1. frequent itemsets having item p (the end of the list L);
2. the itemsets having item m but no p;
3. the frequent itemsets with b and without both m and p;
4. the itemsets with a and without b, m, and p;
5. the itemsets with c and without a, b, m, and p;
6. the large itemsets only with f.
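Continuing the sketch above, the header table can be built with a
simple traversal (a minimal stand-in for the node-link chains on
the slide):

def build_header_table(root):
    # Map each item to all FP-tree nodes carrying it (the node-links)
    header, stack = {}, [root]
    while stack:
        node = stack.pop()
        if node.item is not None:
            header.setdefault(node.item, []).append(node)
        stack.extend(node.children.values())
    return header

header = build_header_table(root)
# header['p'] now reaches every p-node directly, which is exactly
# what partition 1 (itemsets containing p) needs.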
FP-growth (4)
• For our example, two paths containing item p are selected in the
FP-tree:
{(f, 4), (c, 3), (a, 3), (m, 2), (p, 2)} and
{(c, 1), (b, 1), (p, 1)}
– The samples accumulated for the frequent item p (each prefix
path weighted by p's count on that path) are
{(f, 2), (c, 2), (a, 2), (m, 2), (p, 2)} and
{(c, 1), (b, 1), (p, 1)}
– Under the threshold (3), only c qualifies (2 + 1 = 3), so the
only frequent itemset found in this subset is
{(c, 3), (p, 3)}, or simplified {c, p}
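The same computation in the running sketch (again with my own
helper names): collect p's prefix paths weighted by p's node
counts, then keep the items that reach the threshold:

from collections import Counter

def conditional_pattern_base(header, item):
    # For each node of `item`, walk up to ROOT and record the prefix
    # path, weighted by that node's count (the accumulated samples)
    base = []
    for node in header[item]:
        path, up = [], node.parent
        while up is not None and up.item is not None:
            path.append(up.item)
            up = up.parent
        base.append((path, node.count))
    return base

def frequent_in_base(base, min_sup):
    counts = Counter()
    for path, count in base:
        counts.update({item: count for item in path})
    return {item: c for item, c in counts.items() if c >= min_sup}

base_p = conditional_pattern_base(header, 'p')  # f-c-a-m (x2), c-b (x1)
print(frequent_in_base(base_p, 3))              # {'c': 3}  ->  {c, p}

The full algorithm then recurses on a conditional FP-tree built from
this base, but with a single frequent item c the recursion stops
immediately and {c, p} is the only result.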
FP-growth (5)
• The next subset of frequent itemsets consists of those with m
and without p.
– The FP-tree yields the paths:
{(f, 4), (c, 3), (a, 3), (m, 2)} and
{(f, 4), (c, 3), (a, 3), (b, 1), (m, 1)}
– with the corresponding accumulated samples:
{(f, 2), (c, 2), (a, 2), (m, 2)} and
{(f, 1), (c, 1), (a, 1), (b, 1), (m, 1)}
– Analyzing these samples (f, c, and a each reach support
2 + 1 = 3), we discover the frequent itemset
{(f, 3), (c, 3), (a, 3), (m, 3)}, or simplified {f, c, a, m}
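With the helpers above, the same step for m is one call (the full
method recursively mines m's conditional FP-tree; for this base a
flat count already exposes the result):

base_m = conditional_pattern_base(header, 'm')  # f-c-a (x2), f-c-a-b (x1)
print(frequent_in_base(base_m, 3))              # a, c, f each with count 3
# So {f, c, a, m} is frequent. The recursion on m's conditional
# FP-tree also yields the sub-itemsets containing m, e.g. {f, m},
# {c, m}, {a, m}, {f, c, m}, ...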
FP-growth (6)
• Repeating the same process for subsets 3) to 6) of our example,
additional frequent itemsets could be mined.
• In our example these are the itemsets {f, c, a} and {f, c}, but
they are already subsets of the frequent itemset {f, c, a, m}.
• Therefore the final solution of the FP-growth method is the set
of frequent itemsets, which in our example is
{{c, p}, {f, c, a, m}}
(together with their subsets and the single frequent items of L).
• The FP-growth algorithm is typically about an order of magnitude
faster than the Apriori algorithm.
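In practice the algorithm is available off the shelf; here is a
short sketch using the mlxtend library (assuming it is installed),
reproducing the slide's example with min_support = 3/5:

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

transactions = [['f', 'a', 'c', 'd', 'g', 'i', 'm', 'p'],
                ['a', 'b', 'c', 'f', 'l', 'm', 'o'],
                ['b', 'f', 'h', 'j', 'o'],
                ['b', 'c', 'k', 's', 'p'],
                ['a', 'f', 'c', 'e', 'l', 'p', 'm', 'n']]

# One-hot encode the transactions into a boolean DataFrame
te = TransactionEncoder()
df = pd.DataFrame(te.fit_transform(transactions), columns=te.columns_)

# min_support is a fraction here: 3 of 5 transactions = 0.6
print(fpgrowth(df, min_support=0.6, use_colnames=True))

Note that the library reports every frequent itemset (all subsets
of {f, c, a, m} and {c, p}, plus the single item b), whereas the
slides quote only the largest ones.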
References
You may follow the listed books for further reading.
1. Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin
Kumar, Pearson Education.
2. Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, 2nd
Edition, Morgan Kaufmann Publisher.
3. Data Warehousing Fundamentals for IT Professionals, Paulraj Ponniah,
Second Edition, Wiley India.
4. Introduction to Machine Learning with Python, A. C. Muller and S. Guido,
O’Reilly.
5. Data Mining: A Tutorial Based Primer, Richard Roiger, Michael Geatz,
Pearson Education.
6. Introduction to Data Mining with Case Studies, G.K. Gupta, PHI.
