Module 5.1 - Association Rule Mining, Apriori Algorithm, Data Mining, Support, Confidence, Examples
Module 5
Mining Frequent Patterns, Association and
Correlations
Frequent Itemset Mining: Market Basket Analysis
▪ Frequent patterns can be represented in the form of association rules.
Basic Terms of Association Rule Mining
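▪ Absolute support (support count): the number of transactions that contain the itemset.
▪ Relative support: the fraction of all transactions that contain the itemset.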
Support and confidence
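• support(A ⇒ B) = P(A ∪ B): the fraction of transactions that contain both A and B.
• confidence(A ⇒ B) = P(B | A) = support_count(A ∪ B) / support_count(A).
• Rules that satisfy both a minimum support threshold (min_sup) and a minimum
confidence threshold (min_conf) are called strong.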
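As a minimal sketch of these measures, the snippet below computes absolute support, relative support, and confidence on the four-transaction TDB database used in the Apriori example of this module. The function names (`support_count`, `support`, `confidence`, `is_strong`) are illustrative choices, not part of any library.

```python
# Toy database: the four TDB transactions from the Apriori example in this module.
transactions = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
]

def support_count(itemset, transactions):
    """Absolute support: number of transactions containing every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Relative support: fraction of transactions containing itemset."""
    return support_count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """confidence(A => B) = support_count(A u B) / support_count(A)."""
    base = support_count(antecedent, transactions)
    if base == 0:
        return 0.0  # the rule never applies, so treat its confidence as 0
    both = support_count(set(antecedent) | set(consequent), transactions)
    return both / base

def is_strong(antecedent, consequent, transactions, min_sup, min_conf):
    """A rule is strong if it meets both the support and confidence thresholds."""
    joint = set(antecedent) | set(consequent)
    return (support(joint, transactions) >= min_sup
            and confidence(antecedent, consequent, transactions) >= min_conf)

print(support_count({"B", "E"}, transactions))   # transactions containing both B and E
print(confidence({"B"}, {"E"}, transactions))    # confidence of B => E
```

Here `min_sup` is taken as a relative threshold (a fraction between 0 and 1); an absolute count would work equally well if `support_count` were compared instead.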
Association Rule Mining steps
■ In general, association rule mining can be viewed as a two-step process:
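1. Find all frequent itemsets: By definition, each of these itemsets will occur at
least as frequently as a predetermined minimum support count, min_sup.
2. Generate strong association rules from the frequent itemsets: By definition, these
rules must satisfy both minimum support and minimum confidence.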
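The second step of the process, deriving strong rules once the frequent itemsets are known, can be sketched as follows. The `frequent` dictionary holds the support counts of the frequent itemsets of the four-transaction TDB example in this module (minimum support count 2); `generate_rules` is a hypothetical helper name, and this is one straightforward way to enumerate rules rather than a tuned implementation.

```python
from itertools import combinations

# Support counts of the frequent itemsets in the TDB example (min support count = 2).
frequent = {
    frozenset("A"): 2, frozenset("B"): 3, frozenset("C"): 3, frozenset("E"): 3,
    frozenset("AC"): 2, frozenset("BC"): 2, frozenset("BE"): 3, frozenset("CE"): 2,
    frozenset("BCE"): 2,
}

def generate_rules(frequent, min_conf):
    """Return (antecedent, consequent, confidence) for every rule meeting min_conf."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                # Every subset of a frequent itemset is frequent, so the lookup succeeds.
                conf = count / frequent[antecedent]
                if conf >= min_conf:
                    rules.append((antecedent, consequent, conf))
    return rules

for a, c, conf in generate_rules(frequent, min_conf=1.0):
    print(sorted(a), "=>", sorted(c), conf)
```

Because each rule is built from an itemset that is already frequent, it meets the support threshold automatically; only the confidence check remains.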
Frequent Itemset, Closed Itemset and Maximal
Frequent Itemset
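To make the three notions concrete, here is a small sketch that classifies frequent itemsets as closed or maximal. It uses the standard definitions: a frequent itemset is closed if no proper superset has the same support count, and maximal if no proper superset is frequent. The data are the frequent itemsets of the TDB example in this module; the function names are illustrative.

```python
# Frequent itemsets of the TDB example with their support counts (min support count = 2).
frequent = {
    frozenset("A"): 2, frozenset("B"): 3, frozenset("C"): 3, frozenset("E"): 3,
    frozenset("AC"): 2, frozenset("BC"): 2, frozenset("BE"): 3, frozenset("CE"): 2,
    frozenset("BCE"): 2,
}

def closed_itemsets(frequent):
    """Frequent itemsets with no proper superset of equal support count.

    A superset with the same count would itself be frequent, so it is enough
    to search among the frequent itemsets.
    """
    return {
        s for s, count in frequent.items()
        if not any(s < t and count == t_count for t, t_count in frequent.items())
    }

def maximal_itemsets(frequent):
    """Frequent itemsets with no frequent proper superset."""
    return {s for s in frequent if not any(s < t for t in frequent)}

print(sorted("".join(sorted(s)) for s in closed_itemsets(frequent)))
print(sorted("".join(sorted(s)) for s in maximal_itemsets(frequent)))
```

Every maximal frequent itemset is also closed, which the two result sets reflect.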
Downward Closure Property of Frequent Patterns
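The downward closure (Apriori) property says that every subset of a frequent itemset is also frequent; equivalently, a candidate with any infrequent subset cannot be frequent. A minimal sketch of the resulting pruning test is shown below, using the frequent 2-itemsets of the TDB example in this module. The helper name `has_infrequent_subset` is an illustrative choice.

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_prev):
    """True if some (k-1)-subset of the k-item candidate is not frequent."""
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_prev
        for subset in combinations(candidate, k - 1)
    )

# Frequent 2-itemsets (L2) of the TDB example with minimum support count 2.
L2 = {frozenset("AC"), frozenset("BC"), frozenset("BE"), frozenset("CE")}

# {B, C, E} survives: BC, BE and CE are all frequent.
print(has_infrequent_subset(frozenset("BCE"), L2))
# {A, B, C} is pruned without scanning the database: AB is not frequent.
print(has_infrequent_subset(frozenset("ABC"), L2))
```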
Example: Frequent Itemset, Closed Itemset and
Maximal Frequent Itemset
2-itemsets
Is C closed, maximal?
Is D closed, maximal?
3-itemsets
Is AB closed, maximal?
Is AC closed, maximal?
Is AD frequent?
4-itemsets
Is ABC closed, maximal?
Example
Frequent Itemset Mining Methods:
Apriori Algorithm: Finding Frequent Itemsets by Confined Candidate Generation
“How is the Apriori property used in the
algorithm?”
The Apriori Algorithm
■ Pseudo-code:
    Ck: candidate itemsets of size k
    Lk: frequent itemsets of size k

    L1 = {frequent items};
    for (k = 1; Lk != ∅; k++) do begin
        Ck+1 = candidates generated from Lk;
        for each transaction t in database do
            increment the count of all candidates in Ck+1 that are contained in t
        Lk+1 = candidates in Ck+1 with min_support
    end
    return ∪k Lk;
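The pseudo-code above can be turned into a short runnable sketch. The version below follows the same level-wise loop (generate candidates from Lk, count them in one database scan, keep those meeting the minimum support count). The join step used here, unioning pairs of frequent k-itemsets and pruning by the Apriori property, is one simple way to generate candidates and is an assumption of this sketch, not necessarily the exact join used on the slides.

```python
from itertools import combinations

def apriori(transactions, min_sup):
    """Return {itemset: support_count} for all itemsets with count >= min_sup."""
    transactions = [frozenset(t) for t in transactions]

    # L1: count single items in one scan.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_sup}
    result = dict(current)

    k = 1
    while current:
        # Candidate generation: join frequent k-itemsets into (k+1)-itemsets and
        # prune any candidate that has an infrequent k-subset.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in current for sub in combinations(union, k)
            ):
                candidates.add(union)

        # One scan of the database to count the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_sup}
        result.update(current)
        k += 1
    return result

tdb = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
for itemset, count in sorted(apriori(tdb, 2).items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

Run on the TDB database with a minimum support count of 2, this reproduces the L1, L2 and L3 sets of the worked example in this module.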
Data Mining: Concepts and Techniques
The Apriori Algorithm: An Example
Supmin = 2
C: Candidate Itemset
L: Frequent Itemset

Database TDB
Tid   Items
10    A, C, D
20    B, C, E
30    A, B, C, E
40    B, E

C1 (after 1st scan)
Itemset   sup
{A}       2
{B}       3
{C}       3
{D}       1
{E}       3

L1
Itemset   sup
{A}       2
{B}       3
{C}       3
{E}       3

C2 (generated from L1)
Itemset
{A, B}
{A, C}
{A, E}
{B, C}
{B, E}
{C, E}

C2 (after 2nd scan)
Itemset   sup
{A, B}    1
{A, C}    2
{A, E}    1
{B, C}    2
{B, E}    3
{C, E}    2

L2
Itemset   sup
{A, C}    2
{B, C}    2
{B, E}    3
{C, E}    2

C3 (generated from L2)
Itemset
{B, C, E}

L3 (after 3rd scan)
Itemset     sup
{B, C, E}   2
Example: Apriori Algorithm
Exercise
Exercise 2
Apriori Algorithm
Improving Efficiency of Apriori
Algorithm
Improving Efficiency of Apriori
Algorithm: Hash Based Technique
Improving Efficiency of Apriori
Algorithm: Transaction Reduction
Improving Efficiency of Apriori
Algorithm: Partitioning
Improving Efficiency of Apriori
Algorithm: DIC (Dynamic Itemset Counting)
e. Sampling
■ The fundamental idea of the sampling approach is to select a random
sample S of the given data D, and then search for frequent itemsets in
S rather than D.
■ The sample size of S is such that the search for frequent itemsets in S
can be completed in main memory, and therefore only one scan of the
transactions in S is needed overall.
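A minimal sketch of the sampling idea follows: draw a random sample S from D and mine S instead of D. Two points are assumptions of this sketch rather than statements from the slide: the `slack` factor, which lowers the support threshold on the sample to reduce the chance of missing itemsets that are frequent in D, and the brute-force miner `frequent_itemsets`, which stands in for Apriori only to keep the example short.

```python
import random
from itertools import combinations

def frequent_itemsets(transactions, min_rel_sup, max_size=3):
    """Brute-force miner for tiny data: {itemset: relative support}."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for cand in combinations(items, k):
            cand = frozenset(cand)
            sup = sum(1 for t in transactions if cand <= t) / n
            if sup >= min_rel_sup:
                result[cand] = sup
    return result

def mine_on_sample(transactions, sample_size, min_rel_sup, slack=0.9, seed=0):
    """Mine a random sample S of D, using a slightly lowered support threshold."""
    rng = random.Random(seed)              # fixed seed so the sketch is repeatable
    sample = rng.sample(transactions, sample_size)
    return frequent_itemsets(sample, min_rel_sup * slack)

# D: the four TDB transactions repeated, standing in for a large database.
D = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}] * 25

approx = mine_on_sample(D, sample_size=20, min_rel_sup=0.5)
exact = frequent_itemsets(D, 0.5)
print(len(approx), "itemsets found in S;", len(exact), "itemsets frequent in D")
```

Because S is only a sample, its result is an approximation: some itemsets frequent in D may be missed, and some found in S may not be frequent in D, so a check against the full database is needed when exact results are required.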
Applications Of Apriori Algorithm
Application of Apriori Algorithm in the Real World
1. Education Field
Students in a school differ in many attributes, such as age, gender, name, and
parents' names. Apriori can be used to mine such student records for attributes
that frequently occur together.
2. Medical Field
A hospital admits many patients, each with a different disease, a different
treatment, different characteristics, and a different medical history. Apriori can
be used to analyze the patient database so that the information of different
patients is not mixed up.
4. New Technology Business Firms
Apriori is used by companies such as Flipkart and Amazon, which record the
products purchased by each customer, to power recommender systems, and by
Google for autocomplete features.
5. Offices
Offices record a large number of day-to-day transactions for the sale and purchase
of goods and services, including transactions with creditors. Apriori can be used
to analyze these transactions so that the records remain unambiguous.
6. Mobile e-Commerce
FP-Growth (Frequent Pattern Growth Algorithm)
Conditional pattern base (CPB): a reduced version of the transaction database that
consists of the prefix paths in the FP-tree co-occurring with a particular suffix item.
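A conditional pattern base can be illustrated without building the full FP-tree. The sketch below orders the frequent items of each transaction by descending global frequency (as FP-tree construction does) and collects, for a chosen item, the prefixes that precede it together with their counts. Skipping the tree and breaking frequency ties alphabetically are simplifications assumed for this example; the function name is illustrative.

```python
from collections import Counter

def conditional_pattern_base(transactions, item, min_sup):
    """Return {prefix_path: count} for the given suffix item."""
    # Global item counts; infrequent items are dropped as in FP-tree construction.
    counts = Counter(i for t in transactions for i in set(t))
    kept = sorted(
        (i for i, c in counts.items() if c >= min_sup),
        key=lambda i: (-counts[i], i),   # descending frequency, ties alphabetical
    )
    rank = {i: position for position, i in enumerate(kept)}

    base = Counter()
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        if item in ordered:
            prefix = tuple(ordered[:ordered.index(item)])
            if prefix:                   # an empty prefix contributes nothing
                base[prefix] += 1
    return dict(base)

tdb = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
print(conditional_pattern_base(tdb, "E", min_sup=2))
print(conditional_pattern_base(tdb, "A", min_sup=2))
```

With this ordering (B, C, E, A), item E is preceded by the paths (B, C) twice and (B) once, which is what its conditional pattern base records.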
ECLAT (Equivalence Class Transformation)
Vertical Apriori
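ECLAT works on the vertical data format: each item is mapped to the set of transaction IDs (its tidset) that contain it, and the support of a larger itemset is the size of the intersection of tidsets. The following is a minimal depth-first sketch under that idea, run on the TDB database from the Apriori example in this module; the function and variable names are illustrative.

```python
def eclat(transactions, min_sup):
    """transactions: {tid: items}. Returns {itemset: support_count}."""
    # Convert the horizontal database to vertical format: item -> tidset.
    tidsets = {}
    for tid, items in transactions.items():
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    result = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs, each frequent when added to prefix.
        for index, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            result[itemset] = len(tids)
            narrower = []
            for other, other_tids in candidates[index + 1:]:
                shared = tids & other_tids        # tidset intersection
                if len(shared) >= min_sup:
                    narrower.append((other, shared))
            if narrower:
                extend(itemset, narrower)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_sup),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return result

tdb = {10: {"A", "C", "D"}, 20: {"B", "C", "E"}, 30: {"A", "B", "C", "E"}, 40: {"B", "E"}}
for itemset, count in sorted(eclat(tdb, 2).items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

No database rescans are needed after the initial conversion, since every support count comes from a set intersection.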
Single Dimensional Association Rules
Multidimensional Association Rules