Association Analysis-Part2 Notes

The document discusses techniques for efficient candidate generation and support counting in the Apriori algorithm for frequent itemset mining. It describes: 1) how candidate k-itemsets are generated by merging frequent (k-1)-itemsets, 2) how candidate pruning eliminates candidate itemsets that cannot be frequent because one of their subsets is infrequent, and 3) how a hash tree can be used to count support efficiently by hashing the itemsets of each transaction to the leaf nodes that store candidates, in a single scan of the transactions.

Association Analysis-part2

Candidate Generation and Pruning


• The apriori-gen function generates candidate itemsets by
  performing the following two operations:
  1. Candidate Generation. This operation generates new candidate
     k-itemsets based on the frequent (k-1)-itemsets found in the
     previous iteration.
  2. Candidate Pruning. This operation eliminates some of the
     candidate k-itemsets using the support-based pruning strategy.
Candidate Generation: Brute-force method
Candidate Generation: Merge Fk-1 and F1 itemsets
Candidate Generation: Fk-1 x Fk-1 Method

• Merge two frequent (k-1)-itemsets if their first (k-2) items are
  identical.

• F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE}
  – Merge(ABC, ABD) = ABCD
  – Merge(ABC, ABE) = ABCE
  – Merge(ABD, ABE) = ABDE
  – Do not merge (ABD, ACD) because they share only a prefix of
    length 1 instead of length 2.
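The merge step might be sketched in Python as follows, assuming itemsets are represented as sorted tuples; the function name generate_candidates is an illustrative choice, not part of the original pseudocode.

    from itertools import combinations

    def generate_candidates(freq_k_minus_1, k):
        """F(k-1) x F(k-1) method: merge two frequent (k-1)-itemsets
        whose first (k-2) items are identical."""
        itemsets = sorted(tuple(sorted(s)) for s in freq_k_minus_1)
        candidates = []
        for a, b in combinations(itemsets, 2):
            # a and b are sorted, so a shared (k-2)-prefix is a direct comparison
            if a[:k - 2] == b[:k - 2]:
                candidates.append(tuple(sorted(set(a) | set(b))))
        return candidates

    # Example from the slide: F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE}
    F3 = ["ABC", "ABD", "ABE", "ACD", "BCD", "BDE", "CDE"]
    print(generate_candidates(F3, 4))
    # [('A', 'B', 'C', 'D'), ('A', 'B', 'C', 'E'), ('A', 'B', 'D', 'E')]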
Candidate Pruning

• Let F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE} be the set of
  frequent 3-itemsets.

• L4 = {ABCD, ABCE, ABDE} is the set of candidate 4-itemsets
  generated (from the previous slide).

• Candidate pruning
  – Prune ABCE because ACE and BCE are infrequent
  – Prune ABDE because ADE is infrequent

• After candidate pruning: L4 = {ABCD}
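The pruning step on the same example could look like the following sketch; the candidate set the slide calls L4 appears here as the variable C4, and prune_candidates is an illustrative name.

    from itertools import combinations

    def prune_candidates(candidates, freq_k_minus_1):
        """Support-based pruning: keep a candidate k-itemset only if every
        one of its (k-1)-subsets is itself frequent."""
        frequent = {tuple(sorted(s)) for s in freq_k_minus_1}
        kept = []
        for c in candidates:
            if all(sub in frequent for sub in combinations(c, len(c) - 1)):
                kept.append(c)
        return kept

    F3 = ["ABC", "ABD", "ABE", "ACD", "BCD", "BDE", "CDE"]
    C4 = [("A", "B", "C", "D"), ("A", "B", "C", "E"), ("A", "B", "D", "E")]
    print(prune_candidates(C4, F3))
    # [('A', 'B', 'C', 'D')]  -- ABCE is dropped because ACE and BCE are
    #                            infrequent, ABDE because ADE is infrequent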


Alternate Fk-1 x Fk-1 Method

• Merge two frequent (k-1)-itemsets if the last (k-2) items of the
  first one are identical to the first (k-2) items of the second.

• F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE}
  – Merge(ABC, BCD) = ABCD
  – Merge(ABD, BDE) = ABDE
  – Merge(ACD, CDE) = ACDE
  – Merge(BCD, CDE) = BCDE
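A sketch of the alternate merge rule under the same sorted-tuple representation; the only change from the earlier merge is that the suffix of the first itemset is compared with the prefix of the second, so ordered pairs are considered.

    from itertools import permutations

    def generate_candidates_alt(freq_k_minus_1, k):
        """Alternate F(k-1) x F(k-1) method: merge itemsets a and b when
        the last (k-2) items of a equal the first (k-2) items of b."""
        itemsets = [tuple(sorted(s)) for s in freq_k_minus_1]
        candidates = set()
        for a, b in permutations(itemsets, 2):   # ordered pairs, a != b
            if a[1:] == b[:k - 2]:               # suffix of a vs. prefix of b
                candidates.add(tuple(sorted(set(a) | set(b))))
        return sorted(candidates)

    F3 = ["ABC", "ABD", "ABE", "ACD", "BCD", "BDE", "CDE"]
    print(generate_candidates_alt(F3, 4))
    # [('A', 'B', 'C', 'D'), ('A', 'B', 'D', 'E'),
    #  ('A', 'C', 'D', 'E'), ('B', 'C', 'D', 'E')]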
Candidate Pruning for Alternate Fk-1 x Fk-1 Method

• Let F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE} be the set of
  frequent 3-itemsets.

• L4 = {ABCD, ABDE, ACDE, BCDE} is the set of candidate 4-itemsets
  generated (from the previous slide).

• Candidate pruning
  – Prune ABDE because ADE is infrequent
  – Prune ACDE because ACE and ADE are infrequent
  – Prune BCDE because BCE is infrequent

• After candidate pruning: L4 = {ABCD}
Support Counting

• One approach to support counting is to compare each transaction
  against every candidate itemset and update the support counts of
  the candidates contained in the transaction.

• This approach is computationally expensive, especially when the
  numbers of transactions and candidate itemsets are large.

• An alternative approach is to enumerate the itemsets contained in
  each transaction and use them to update the support counts of
  their respective candidate itemsets.
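A short sketch contrasting the two approaches just described, assuming transactions are lists of items and candidate k-itemsets are sorted tuples; the function names are illustrative.

    from itertools import combinations
    from collections import defaultdict

    def count_support_brute_force(transactions, candidates):
        """Compare every transaction against every candidate itemset."""
        counts = {c: 0 for c in candidates}
        for t in transactions:
            t_set = set(t)
            for c in candidates:
                if set(c) <= t_set:          # candidate contained in transaction
                    counts[c] += 1
        return counts

    def count_support_by_enumeration(transactions, candidates, k):
        """Enumerate the k-subsets of each transaction and update the
        counts of those subsets that are candidates."""
        counts = defaultdict(int)
        candidate_set = set(candidates)
        for t in transactions:
            for subset in combinations(sorted(t), k):
                if subset in candidate_set:
                    counts[subset] += 1
        return counts

    transactions = [[1, 2, 3, 5, 6], [1, 4, 5], [2, 3, 4]]
    candidates = [(1, 2, 5), (2, 3, 4), (3, 5, 6)]
    print(count_support_brute_force(transactions, candidates))
    print(dict(count_support_by_enumeration(transactions, candidates, 3)))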
Support Counting Using a Hash Tree

[Figure: a candidate hash tree built for 15 candidate 3-itemsets:
{1,4,5}, {1,2,4}, {4,5,7}, {1,2,5}, {4,5,8}, {1,5,9}, {1,3,6}, {2,3,4},
{5,6,7}, {3,4,5}, {3,5,6}, {3,5,7}, {6,8,9}, {3,6,7}, {3,6,8}.
The hash function sends items 1, 4, 7 to one branch, items 2, 5, 8 to a
second branch, and items 3, 6, 9 to a third. This slide highlights the
subtree reached by hashing on item 1, 4 or 7 at the root.]
Support Counting Using a Hash Tree

[Figure: the same candidate hash tree, first highlighting the subtree
reached by hashing on item 2, 5 or 8 at the root, then the subtree
reached by hashing on item 3, 6 or 9.]
Support Counting Using a Hash Tree

[Figure: matching the transaction {1, 2, 3, 5, 6} against the hash tree.
At the root level the transaction is partitioned into 1 + {2,3,5,6},
2 + {3,5,6} and 3 + {5,6}, and each partition is hashed on its leading
item.]
Support Counting Using a Hash Tree

[Figure: continuing the traversal for transaction {1, 2, 3, 5, 6}. At the
second level the partitions are split further, e.g. 1 2 + {3,5,6},
1 3 + {5,6} and 1 5 + {6}, until leaf nodes are reached and the candidates
stored there are compared against the transaction. In this example the
transaction is matched against 11 out of the 15 candidates.]
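The figures above trace the textbook hash tree. Below is a minimal, simplified Python sketch of the same idea: it assumes the hash function h(item) = item mod 3 (so items 1, 4, 7, items 2, 5, 8, and items 3, 6, 9 land in three separate branches), a leaf capacity of three itemsets, and 3-itemset candidates. The Node class, the method names, and the exact shape of the resulting tree are illustrative choices and will not reproduce the figure exactly, so the number of candidate comparisons may differ from the 11-out-of-15 quoted above.

    MAX_LEAF_SIZE = 3   # assumed leaf capacity
    K = 3               # candidates are 3-itemsets

    def h(item):
        """Assumed hash function: items 1,4,7 / 2,5,8 / 3,6,9 fall into
        three different buckets, as in the figure."""
        return item % 3

    class Node:
        def __init__(self):
            self.children = {}   # bucket -> child Node (internal node)
            self.itemsets = []   # candidate itemsets stored here (leaf node)

        def insert(self, itemset, depth=0):
            if self.children:                    # internal node: hash and descend
                bucket = h(itemset[depth])
                self.children.setdefault(bucket, Node()).insert(itemset, depth + 1)
                return
            self.itemsets.append(itemset)        # leaf node
            if len(self.itemsets) > MAX_LEAF_SIZE and depth < K:
                stored, self.itemsets = self.itemsets, []
                for s in stored:                 # split the overfull leaf
                    bucket = h(s[depth])
                    self.children.setdefault(bucket, Node()).insert(s, depth + 1)

        def match(self, transaction, start, matched):
            """Collect the candidates in this subtree that are contained in
            the (sorted) transaction.  'matched' is a set, so a leaf reached
            along several paths still contributes each candidate only once."""
            if self.children:                    # hash every remaining item
                for i in range(start, len(transaction)):
                    child = self.children.get(h(transaction[i]))
                    if child is not None:
                        child.match(transaction, i + 1, matched)
            else:                                # leaf: compare candidates with t
                t_set = set(transaction)
                for s in self.itemsets:
                    if set(s) <= t_set:
                        matched.add(s)

    # The 15 candidate 3-itemsets from the figure
    candidates = [(1, 4, 5), (1, 2, 4), (4, 5, 7), (1, 2, 5), (4, 5, 8),
                  (1, 5, 9), (1, 3, 6), (2, 3, 4), (5, 6, 7), (3, 4, 5),
                  (3, 5, 6), (3, 5, 7), (6, 8, 9), (3, 6, 7), (3, 6, 8)]
    root = Node()
    for c in candidates:
        root.insert(c)

    # Count support for the transaction from the slides in a single pass
    counts = {c: 0 for c in candidates}
    for t in [[1, 2, 3, 5, 6]]:
        matched = set()
        root.match(sorted(t), 0, matched)
        for c in matched:
            counts[c] += 1

    print([c for c, n in counts.items() if n > 0])
    # [(1, 2, 5), (1, 3, 6), (3, 5, 6)] -- the candidates contained in {1,2,3,5,6}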
Rule Generation


Confidence-based Pruning
