Lecture 5
■ Data Mining
■ Association rule mining
■ Apriori method
■ Some other methods – a brief overview
■ We will look at these in more detail later.
* Data Mining: Association Rules
Association Rules: Basic Concepts
• Given: (1) database of transactions, (2) each transaction
is a list of items (purchased by a customer in a visit)
• Find: all rules that correlate the presence of one set of
items with that of another set of items
– E.g., 98% of people who buy a laptop also buy antivirus
software.
Association Rule
in a database.
■ Let the items be {A, B, C, …}.
rules.
For rule A ⇾ C :
support = 0.5 (or 50%)
confidence = 0.666 (or 66.6%)
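The two figures above can be checked with a short Python sketch. The four-transaction database below is hypothetical, chosen only so that the rule A ⇾ C comes out with the same support and confidence as on the slide. The helper names `support` and `confidence` are illustrative as well.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Among transactions containing `lhs`, the fraction that also contain `rhs`."""
    both = set(lhs) | set(rhs)
    return support(both, transactions) / support(lhs, transactions)


# Hypothetical database (an assumption, not taken from the slides).
transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]

rule_support = support({"A", "C"}, transactions)          # 2 of 4 transactions
rule_confidence = confidence({"A"}, {"C"}, transactions)  # 2 of the 3 containing A

print(f"support    = {rule_support:.3f}")
print(f"confidence = {rule_confidence:.3f}")
```

Here support counts the transactions containing both A and C, while confidence divides that count by the number of transactions containing A, which is why the rule C ⇾ A would generally have a different confidence.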
having X.
itemsets.
Find L_1;
for (k = 1; L_k != ∅; k++) do begin
    C_{k+1} = candidates generated from L_k
    for each transaction t in database do
        increment the count of all candidates in C_{k+1} that are contained in t
    L_{k+1} = candidates in C_{k+1} with min_support
end
return ∪_k L_k
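The loop above could be sketched in Python roughly as follows. This is a minimal illustration, not an optimized implementation: the function name `apriori`, the representation of itemsets as `frozenset`s, and the small database at the end are all assumptions. Candidates are generated by taking unions of two frequent k-itemsets, then dropping any candidate that has an infrequent k-subset.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for all itemsets with support >= min_support.

    `min_support` is a fraction of the number of transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # One database scan: count every candidate contained in each transaction.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}

    # Find L_1: the frequent single items.
    level = frequent({frozenset([item]) for t in transactions for item in t})
    result = {}
    k = 1
    while level:  # L_k != empty set
        result.update(level)
        # C_{k+1}: unions of two frequent k-itemsets that contain exactly k+1 items.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k + 1}
        # Keep only candidates whose k-subsets are all frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k))}
        level = frequent(candidates)  # L_{k+1}
        k += 1
    return result  # union of all L_k


# Hypothetical database, used only to exercise the function.
example = apriori([{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}], 0.5)
for itemset, count in sorted(example.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Each pass of the `while` loop corresponds to one scan of the database, so the number of scans equals the size of the largest frequent itemset plus one.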
[Figure: Apriori example. Scanning database D produces C2 and L2, then C3 and L3.]
For the transaction database below, let the minimum support be 50%. Find all
frequent itemsets using the Apriori algorithm.
At each stage, show the candidates generated and describe how the
Apriori property is used to prune the candidate set.
TID   Items
100   A, B, C, D, E
101   A, B, C, D, F
102   B, C, F
103   A, C, F, G
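One way to check a hand-worked answer to this exercise is the Python trace below. It is a sketch under a few assumptions: the helper names (`count`, `join`, `show`) are illustrative, and the join step merges two frequent k-itemsets that share their first k-1 items in sorted order, which is a common way to generate candidates. With four transactions, a minimum support of 50% means an itemset must occur in at least 2 transactions.

```python
from itertools import combinations

# Transaction database from the exercise.
db = {100: set("ABCDE"), 101: set("ABCDF"), 102: set("BCF"), 103: set("ACFG")}
MIN_COUNT = 2  # 50% of 4 transactions


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= items for items in db.values())


def join(level):
    """Join step: merge frequent k-itemsets that share their first k-1 items."""
    ordered = sorted(tuple(sorted(s)) for s in level)
    return {frozenset(a + b[-1:])
            for a, b in combinations(ordered, 2) if a[:-1] == b[:-1]}


def show(itemsets):
    """Readable form, e.g. {A, C} -> 'AC'."""
    return sorted("".join(sorted(s)) for s in itemsets)


all_items = {frozenset(i) for items in db.values() for i in items}
level = {c for c in all_items if count(c) >= MIN_COUNT}
frequent = {1: level}  # k -> L_k
pruned = {}            # k -> size-k candidates removed by the Apriori property
k = 1
while level:
    joined = join(level)
    # Apriori property: a (k+1)-itemset with an infrequent k-subset cannot be frequent,
    # so it is discarded without scanning the database.
    pruned[k + 1] = {c for c in joined
                     if any(frozenset(s) not in level for s in combinations(c, k))}
    candidates = joined - pruned[k + 1]
    level = {c for c in candidates if count(c) >= MIN_COUNT}
    k += 1
    if level:
        frequent[k] = level

for size, itemsets in frequent.items():
    print(f"L{size}: {show(itemsets)}")
    print(f"    pruned before counting: {show(pruned.get(size, set()))}")
```

Run on this database, the trace reports L1 = {A, B, C, D, F}; nine frequent pairs in L2 (every pair over those items except DF); L3 = {ABC, ABD, ACD, ACF, BCD, BCF}, with ADF, BDF and CDF pruned because DF is infrequent; and L4 = {ABCD}, with ACDF and BCDF pruned. The candidate ABF survives pruning but is rejected after counting, since it occurs only in transaction 101.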