Module1 Part2

Uploaded by

amvarshney123

Mining Association Rules

Association rule mining

• Proposed by Agrawal et al. in 1993.
• An important data mining model, studied extensively by the database and data mining community.
• Assumes all data are categorical.
• Initially used for market basket analysis, to find how items purchased by customers are related.

Bread → Milk [sup = 5%, conf = 100%]

The model: data

• I = {i1, i2, …, im}: a set of items.
• Transaction t: a set of items, with t ⊆ I.
• Transaction database T: a set of transactions, T = {t1, t2, …, tn}.

Transaction data: supermarket data

• Market basket transactions:
    t1: {bread, cheese, milk}
    t2: {apple, eggs, salt, yogurt}
    … …
    tn: {biscuit, eggs, milk}
• Concepts:
  - An item: an item/article in a basket.
  - I: the set of all items sold in the store.
  - A transaction: the items purchased in a basket; it may have a TID (transaction ID).
  - A transactional dataset: a set of transactions.

The model: rules

• A transaction t contains X, a set of items (itemset) in I, if X ⊆ t.
• An association rule is an implication of the form
    X → Y, where X, Y ⊂ I, and X ∩ Y = ∅.
• An itemset is a set of items.
  - E.g., X = {milk, bread, cereal} is an itemset.
• A k-itemset is an itemset with k items.
  - E.g., {milk, bread, cereal} is a 3-itemset.

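The definitions above map directly onto set operations. The following is a minimal Python sketch, assuming itemsets are stored as `frozenset` objects; the item universe `I` and the helper names `contains` and `is_rule` are illustrative and do not come from the slides.

```python
# Assumed item universe I (hypothetical values for illustration).
I = frozenset({"milk", "bread", "cereal", "eggs", "cheese"})

def contains(t, X):
    """A transaction t contains itemset X if X is a subset of t."""
    return X <= t

def is_rule(X, Y, items=I):
    """X -> Y is well formed when X and Y are subsets of I and share no item."""
    return X <= items and Y <= items and X.isdisjoint(Y)

t = frozenset({"milk", "bread", "cereal", "eggs"})   # one transaction
X = frozenset({"milk", "bread", "cereal"})           # a 3-itemset

print(contains(t, X))   # True: every item of X is in t
print(len(X))           # 3, so X is a 3-itemset
print(is_rule(frozenset({"bread"}), frozenset({"milk"})))            # True
print(is_rule(frozenset({"bread"}), frozenset({"bread", "milk"})))   # False: X and Y overlap
```
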
Rule strength measures

• Support: the rule holds with support sup in T (the transaction data set) if sup% of transactions contain X ∪ Y.
  - sup = Pr(X ∪ Y).
• Confidence: the rule holds in T with confidence conf if conf% of transactions that contain X also contain Y.
  - conf = Pr(Y | X).
• An association rule is a pattern stating that when X occurs, Y occurs with a certain probability.

Support and Confidence

• Support count: the support count of an itemset X, denoted X.count, in a data set T is the number of transactions in T that contain X. Assume T has n transactions.
• Then,
    support = (X ∪ Y).count / n
    confidence = (X ∪ Y).count / X.count

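The two formulas can be written as small functions over a list of transactions. This is a sketch under assumptions: the function names and the four-transaction data set `T` are made up for illustration, and `confidence` assumes X occurs in at least one transaction (otherwise it would divide by zero).

```python
def support_count(itemset, transactions):
    """X.count: number of transactions that contain every item of X."""
    return sum(1 for t in transactions if itemset <= t)

def support(X, Y, transactions):
    """support(X -> Y) = (X ∪ Y).count / n"""
    return support_count(X | Y, transactions) / len(transactions)

def confidence(X, Y, transactions):
    """confidence(X -> Y) = (X ∪ Y).count / X.count (assumes X.count > 0)."""
    return support_count(X | Y, transactions) / support_count(X, transactions)

# Hypothetical data set with n = 4 transactions.
T = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread"},
    {"eggs"},
]

X, Y = {"bread"}, {"milk"}
print(support(X, Y, T))      # 0.5: two of four transactions contain bread and milk
print(confidence(X, Y, T))   # two of the three bread transactions also contain milk
```
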
Goal and key features

• Goal: find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf).
• Key features:
  - Completeness: find all rules.
  - No target item(s) on the right-hand side.
  - Mining with data on hard disk (not in memory).

An example

• Transaction data:
    t1: Beef, Chicken, Milk
    t2: Beef, Cheese
    t3: Cheese, Boots
    t4: Beef, Chicken, Cheese
    t5: Beef, Chicken, Clothes, Cheese, Milk
    t6: Chicken, Clothes, Milk
    t7: Chicken, Milk, Clothes
• Assume:
    minsup = 30%
    minconf = 80%
• An example frequent itemset:
    {Chicken, Clothes, Milk} [sup = 3/7]
• Association rules from the itemset:
    Clothes → Milk, Chicken [sup = 3/7, conf = 3/3]
    … …
    Clothes, Chicken → Milk [sup = 3/7, conf = 3/3]

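The support and confidence values in this example can be checked mechanically. The sketch below encodes the seven transactions as Python sets and uses exact fractions; the helper names `count` and `rule_stats` are illustrative only.

```python
from fractions import Fraction

# The seven transactions t1..t7 from the example.
T = [
    {"Beef", "Chicken", "Milk"},
    {"Beef", "Cheese"},
    {"Cheese", "Boots"},
    {"Beef", "Chicken", "Cheese"},
    {"Beef", "Chicken", "Clothes", "Cheese", "Milk"},
    {"Chicken", "Clothes", "Milk"},
    {"Chicken", "Milk", "Clothes"},
]

def count(itemset):
    """Support count: number of transactions containing every item in itemset."""
    return sum(1 for t in T if itemset <= t)

def rule_stats(X, Y):
    """Return (support, confidence) of X -> Y as exact fractions."""
    both = count(X | Y)
    return Fraction(both, len(T)), Fraction(both, count(X))

sup, conf = rule_stats({"Clothes"}, {"Milk", "Chicken"})
print(sup, conf)  # 3/7 1

# Both thresholds from the example are met.
print(sup >= Fraction(30, 100), conf >= Fraction(80, 100))  # True True
```
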
Transaction data representation

• A simplistic view of shopping baskets.
• Some important information is not considered, e.g.,
  - the quantity of each item purchased, and
  - the price paid.

Many mining algorithms

• There are a large number of them.
• They use different strategies and data structures.
• Their resulting sets of rules are all the same.
  - Given a transaction data set T, a minimum support, and a minimum confidence, the set of association rules existing in T is uniquely determined.
• Any algorithm should find the same set of rules, although computational efficiency and memory requirements may differ.
• We study only one: the Apriori algorithm.

The Apriori algorithm

• Probably the best-known algorithm.
• Two steps:
  - Find all itemsets that have minimum support (frequent itemsets, also called large itemsets).
  - Use frequent itemsets to generate rules.
• E.g., a frequent itemset
    {Chicken, Clothes, Milk} [sup = 3/7]
  and one rule from the frequent itemset
    Clothes → Milk, Chicken [sup = 3/7, conf = 3/3]

Step 1: Mining all frequent itemsets

• A frequent itemset is an itemset whose support is ≥ minsup.
• Key idea: the apriori property (downward closure property): every subset of a frequent itemset is also a frequent itemset.

    ABC   ABD   ACD   BCD

    AB   AC   AD   BC   BD   CD

    A   B   C   D

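The downward closure property can be observed on any data set: a subset of an itemset is contained in at least as many transactions as the itemset itself. The sketch below checks this by brute force over items A to D; the five transactions and the threshold `minsup_count` are hypothetical values chosen for illustration.

```python
from itertools import combinations

# Hypothetical transactions over the items A, B, C, D.
T = [
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"A", "B", "C", "D"},
    {"B", "C"},
    {"A", "C", "D"},
]
minsup_count = 2  # assumed threshold, expressed as a support count

def count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in T if set(itemset) <= t)

items = sorted(set().union(*T))

# Brute force: test every non-empty itemset over the item universe.
frequent = {
    frozenset(c)
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
    if count(c) >= minsup_count
}

def nonempty_proper_subsets(X):
    """All non-empty proper subsets of X."""
    xs = sorted(X)
    return [frozenset(c) for k in range(1, len(xs)) for c in combinations(xs, k)]

# Every subset of a frequent itemset is itself frequent.
closed = all(s in frequent for X in frequent for s in nonempty_proper_subsets(X))
print(closed)  # True
```

Apriori uses the contrapositive of this property: if any subset of a candidate is infrequent, the candidate cannot be frequent and need not be counted.
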
The Algorithm

• Iterative algorithm (also called level-wise search): find all 1-item frequent itemsets, then all 2-item frequent itemsets, and so on.
  - In each iteration k, only consider itemsets that contain some frequent (k-1)-itemset.
• Find frequent itemsets of size 1: F1.
• From k = 2:
  - Ck = candidates of size k: those itemsets of size k that could be frequent, given Fk-1.
  - Fk = those itemsets that are actually frequent, Fk ⊆ Ck (need to scan the database once).

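The level-wise search can be sketched in a few lines of Python. This is an illustrative in-memory version under assumptions, not the slides' exact procedure: the function name `apriori`, the inner helper `scan`, and the way candidates are formed (unions of two frequent (k-1)-itemsets, kept only if all their (k-1)-subsets are frequent) are choices made here. It is run on the seven-transaction example with minsup = 30%.

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Return {itemset: support count} for all frequent itemsets (level-wise)."""
    transactions = [frozenset(t) for t in transactions]
    min_count = minsup * len(transactions)

    def scan(candidates):
        # One pass over the data: count each candidate, keep the frequent ones.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_count}

    # F1: frequent 1-itemsets.
    items = set().union(*transactions)
    level = scan({frozenset([i]) for i in items})
    frequent = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Ck: size-k unions of two frequent (k-1)-itemsets ...
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # ... whose (k-1)-subsets are all frequent (downward closure).
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = scan(candidates)  # Fk
        frequent.update(level)
        k += 1
    return frequent

T = [
    {"Beef", "Chicken", "Milk"},
    {"Beef", "Cheese"},
    {"Cheese", "Boots"},
    {"Beef", "Chicken", "Cheese"},
    {"Beef", "Chicken", "Clothes", "Cheese", "Milk"},
    {"Chicken", "Clothes", "Milk"},
    {"Chicken", "Milk", "Clothes"},
]
F = apriori(T, minsup=0.30)
print(F[frozenset({"Chicken", "Clothes", "Milk"})])  # 3
```

Each call to `scan` corresponds to one pass over the database, so the number of passes grows with the size of the largest frequent itemset.
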
Example: finding frequent itemsets (minsup = 0.5)

Dataset T
    TID    Items
    T100   1, 3, 4
    T200   2, 3, 5
    T300   1, 2, 3, 5
    T400   2, 5

Notation: itemset:count

1. scan T → C1: {1}:2, {2}:3, {3}:3, {4}:1, {5}:3
          → F1: {1}:2, {2}:3, {3}:3, {5}:3
          → C2: {1,2}, {1,3}, {1,5}, {2,3}, {2,5}, {3,5}
2. scan T → C2: {1,2}:1, {1,3}:2, {1,5}:1, {2,3}:2, {2,5}:3, {3,5}:2
          → F2: {1,3}:2, {2,3}:2, {2,5}:3, {3,5}:2
          → C3: {2,3,5}
3. scan T → C3: {2,3,5}:2 → F3: {2,3,5}

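The example does not spell out how C2 and C3 are derived from F1 and F2. One common formulation, assumed here rather than taken from the slides, is a join step followed by a prune step: two sorted (k-1)-itemsets are merged when they agree on their first k-2 items, and the result is discarded if any of its (k-1)-subsets is not frequent. The function name `candidate_gen` is illustrative.

```python
from itertools import combinations

def candidate_gen(F_prev, k):
    """Build Ck from F(k-1): join step followed by prune step."""
    prev = sorted(tuple(sorted(f)) for f in F_prev)
    prev_set = set(prev)
    candidates = []
    for a, b in combinations(prev, 2):
        # Join: merge two (k-1)-itemsets that agree on their first k-2 items.
        if a[:-1] == b[:-1]:
            c = a + (b[-1],)
            # Prune: keep c only if every (k-1)-subset is frequent.
            if all(s in prev_set for s in combinations(c, k - 1)):
                candidates.append(c)
    return candidates

F1 = [(1,), (2,), (3,), (5,)]
F2 = [(1, 3), (2, 3), (2, 5), (3, 5)]

print(candidate_gen(F1, 2))  # [(1, 2), (1, 3), (1, 5), (2, 3), (2, 5), (3, 5)]
print(candidate_gen(F2, 3))  # [(2, 3, 5)]
```

Itemsets such as {1,2,3} or {1,3,5} never reach the counting scan, because {1,2} and {1,5} are not in F2.
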
Step 2: Generating rules from frequent itemsets

• Frequent itemsets ≠ association rules.
• One more step is needed to generate association rules.
• For each frequent itemset X,
    for each proper nonempty subset A of X:
    - Let B = X - A.
    - A → B is an association rule if confidence(A → B) ≥ minconf, where
        support(A → B) = support(A ∪ B) = support(X)
        confidence(A → B) = support(A ∪ B) / support(A)

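The rule-generation loop can be sketched as follows. The function name `gen_rules` and the dictionary `support_count` are illustrative; the counts are those found for dataset T in the frequent-itemset example, and minconf = 0.8 is an assumed threshold.

```python
from itertools import combinations

def gen_rules(X, support_count, minconf):
    """Return (A, B, confidence) for every rule A -> B derived from itemset X."""
    X = frozenset(X)
    rules = []
    for r in range(1, len(X)):                  # sizes of proper nonempty subsets
        for A in combinations(sorted(X), r):
            A = frozenset(A)
            B = X - A
            # confidence(A -> B) = support(X) / support(A)
            conf = support_count[X] / support_count[A]
            if conf >= minconf:
                rules.append((A, B, conf))
    return rules

# Support counts from the frequent-itemset example (dataset T, 4 transactions).
support_count = {
    frozenset({2}): 3, frozenset({3}): 3, frozenset({5}): 3,
    frozenset({2, 3}): 2, frozenset({2, 5}): 3, frozenset({3, 5}): 2,
    frozenset({2, 3, 5}): 2,
}

for A, B, conf in gen_rules({2, 3, 5}, support_count, minconf=0.8):
    print(sorted(A), "->", sorted(B), conf)
# [2, 3] -> [5] 1.0
# [3, 5] -> [2] 1.0
```

No further database scan is needed at this stage, since every subset of a frequent itemset is itself frequent and its count was recorded in Step 1.
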
Generating rules: an example

• Suppose {2,3,4} is frequent, with sup = 50%.
  - Proper nonempty subsets: {2,3}, {2,4}, {3,4}, {2}, {3}, {4}, with sup = 50%, 50%, 75%, 75%, 75%, 75% respectively.
  - These generate the following association rules:
      2,3 → 4, confidence = 100%
      2,4 → 3, confidence = 100%
      3,4 → 2, confidence = 67%
      2 → 3,4, confidence = 67%
      3 → 2,4, confidence = 67%
      4 → 2,3, confidence = 67%
  - All rules have support = 50%.
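As a quick arithmetic check, each confidence above is the support of {2,3,4} divided by the support of the rule's left-hand side. The short sketch below recomputes the six values from the supports stated in the example; the variable names are illustrative.

```python
# Supports as given in the example.
sup = {
    frozenset({2, 3, 4}): 0.50,
    frozenset({2, 3}): 0.50,
    frozenset({2, 4}): 0.50,
    frozenset({3, 4}): 0.75,
    frozenset({2}): 0.75,
    frozenset({3}): 0.75,
    frozenset({4}): 0.75,
}
X = frozenset({2, 3, 4})

# confidence(A -> X - A) = sup(X) / sup(A), formatted as a whole percentage.
confidences = {A: f"{sup[X] / sup[A]:.0%}" for A in sup if A != X}

for A, conf in confidences.items():
    print(sorted(A), "->", sorted(X - A), conf)
```
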
