Session 7
Today's Objectives
• Support
• Confidence
• Lift
• Leverage
• Conviction
Association Model: Problem Statement
Apriori Procedure
Key Concepts:
• Frequent Itemsets: The sets of items that satisfy the minimum support (the set of frequent k-itemsets is denoted Lk).
• Join Operation: To find Lk, a set of candidate k-itemsets (Ck) is generated by joining Lk-1 with itself.
• Apriori Property: Any subset of a frequent itemset must also be frequent.
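As a rough illustration of these three concepts, the sketch below shows one Apriori pass in Python: joining Lk-1 with itself (join step), pruning candidates that violate the Apriori property, and keeping only the candidates that meet the minimum support count. The function and variable names are assumptions for illustration, not the lecture's reference implementation; itemsets are represented as frozensets.

from itertools import combinations

def apriori_pass(transactions, prev_frequent, k, min_count):
    """One illustrative Apriori pass; prev_frequent is a set of frozensets of size k-1."""
    # Join step: union pairs of (k-1)-itemsets whose union has exactly k items.
    candidates = set()
    for a in prev_frequent:
        for b in prev_frequent:
            union = a | b
            if len(union) == k:
                candidates.add(union)
    # Prune step (Apriori property): drop candidates with an infrequent (k-1)-subset.
    pruned = {c for c in candidates
              if all(frozenset(s) in prev_frequent for s in combinations(c, k - 1))}
    # Count support by scanning the transaction database, then keep frequent candidates.
    counts = {c: sum(1 for t in transactions if c <= t) for c in pruned}
    return {c for c, n in counts.items() if n >= min_count}

A full run would start from L1 (the frequent 1-itemsets) and repeat this pass with k = 2, 3, ... until no further frequent itemsets are found.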
Understanding Apriori through an Example
• We first have to find the frequent itemsets using the Apriori algorithm.
• Then, association rules will be generated using the minimum support and minimum confidence thresholds.

Transactions (TIDs T104–T108 shown; earlier rows are not visible on this slide):

TID   | Items
T104  | I1, I3
T105  | I2, I3
T106  | I1, I3
T107  | I1, I2, I3, I5
T108  | I1, I2, I3

[Flow: scan D to count C1 → filter by min. support to get L1 → join to get C2 → ...]
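As a concrete sketch of the first step (C1 → L1), the snippet below counts item supports over the transactions listed above and keeps those that reach a minimum support count of 2; the restricted transaction list and the threshold are assumptions used only for illustration.

from collections import Counter

# Transactions exactly as listed above (only the TIDs visible on this slide).
transactions = {
    "T104": {"I1", "I3"},
    "T105": {"I2", "I3"},
    "T106": {"I1", "I3"},
    "T107": {"I1", "I2", "I3", "I5"},
    "T108": {"I1", "I2", "I3"},
}
min_count = 2  # assumed minimum support count, for illustration only

# C1: count every individual item across all transactions.
c1 = Counter(item for items in transactions.values() for item in items)
# L1: keep only the items whose support count meets the threshold.
l1 = {item: n for item, n in c1.items() if n >= min_count}
print(l1)  # on this subset: I1 -> 4, I2 -> 3, I3 -> 5 (I5 is dropped)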
Step 3: Generating 3-itemset Frequent Pattern
L2 (frequent 2-itemsets):

Itemset  | Sup. Count
{I1, I2} | 4
{I1, I3} | 4
{I1, I5} | 2
{I2, I3} | 4
{I2, I4} | 2
{I2, I5} | 2

• Generate the candidate set C3 using L2 (join step). The condition for joining Lk-1 with Lk-1 is that the two itemsets have (k-2) items in common; so here, for L2, the first item should match (a general observation).
• The generation of the set of candidate 3-itemsets, C3, involves use of the Apriori property.
• C3 = L2 Join L2 = {{I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}, {I2, I4, I5}}.
• If instead we join every pair of 2-itemsets that share any item, we get C3 = {{I1, I2, I3}, {I1, I2, I5}, {I1, I2, I4}, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}, {I2, I4, I5}}.
• Now the join step is complete, and the prune step will be used to reduce the size of C3. The prune step helps to avoid heavy computation due to a large Ck.
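The join condition can be made concrete with a short sketch. Assuming, for illustration only, that each itemset is kept as a sorted tuple, the standard Apriori join merges two (k-1)-itemsets only when their first (k-2) items agree:

def apriori_join(prev_frequent, k):
    """Join step sketch: merge sorted (k-1)-itemsets whose first k-2 items match."""
    prev = sorted(prev_frequent)
    candidates = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            if prev[i][:k - 2] == prev[j][:k - 2]:   # first k-2 items must agree
                candidates.append(tuple(sorted(set(prev[i]) | set(prev[j]))))
    return candidates

L2 = [("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
      ("I2", "I3"), ("I2", "I4"), ("I2", "I5")]
print(apriori_join(L2, 3))
# [('I1','I2','I3'), ('I1','I2','I5'), ('I1','I3','I5'),
#  ('I2','I3','I4'), ('I2','I3','I5'), ('I2','I4','I5')]

This reproduces the six candidate 3-itemsets given by the join condition above.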
Step 3: Generating 3-itemset Frequent Pattern [Cont.]
• Prune step: by the Apriori property, any candidate in C3 with a 2-item subset that is not in L2 is removed. {I3, I5}, {I3, I4} and {I4, I5} are not in L2, so {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5} and {I2, I4, I5} are pruned, leaving C3 = {{I1, I2, I3}, {I1, I2, I5}}.
• Scan D for the count of each remaining candidate, then compare each candidate's support count with the minimum support count:

C3:
Itemset      | Sup. Count
{I1, I2, I3} | 2
{I1, I2, I5} | 2

Both candidates meet the minimum support count, so:

L3:
Itemset      | Sup. Count
{I1, I2, I3} | 2
{I1, I2, I5} | 2
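A matching sketch of the prune and scan steps, continuing the tuple representation used in the join sketch above (the helper names are assumptions, not the lecture's code):

from itertools import combinations

def apriori_prune(candidates, prev_frequent):
    """Drop any candidate that has a (k-1)-subset not present in L(k-1)."""
    prev = set(prev_frequent)
    return [c for c in candidates
            if all(sub in prev for sub in combinations(c, len(c) - 1))]

def support_counts(candidates, transactions):
    """Scan the database D once and count the transactions containing each candidate."""
    return {c: sum(1 for t in transactions if set(c) <= t) for c in candidates}

Applied to the six joined candidates with the L2 table above, apriori_prune keeps only ('I1', 'I2', 'I3') and ('I1', 'I2', 'I5'); counting their support over D and comparing with the minimum support count then yields the L3 shown.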
• Back to the example:
We had L = {{I1}, {I2}, {I3}, {I4}, {I5}, {I1,I2}, {I1,I3}, {I1,I5}, {I2,I3}, {I2,I4}, {I2,I5}, {I1,I2,I3}, {I1,I2,I5}}.
– Let's take l = {I1, I2, I5}.
– Its nonempty subsets are {I1,I2}, {I1,I5}, {I2,I5}, {I1}, {I2}, {I5}.
Therefore, the set of all frequent itemsets is {A}, {B}, {D}, {A,B}, {A,D}, {B,D}, {A,B,D}.
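The rule-generation step just described can be sketched as follows: split each frequent itemset l into a nonempty subset s (the antecedent) and l − s (the consequent), and keep the rules whose confidence meets the threshold. In the sketch below, the 1-itemset support counts are assumed placeholder values (they are not shown on this slide); the 2- and 3-itemset counts come from the running example.

from itertools import combinations

def rules_from_itemset(l, support, min_conf):
    """Generate rules s -> (l - s) for every nonempty proper subset s of l.
    confidence(s -> l - s) = support(l) / support(s).  Illustrative sketch only."""
    l = frozenset(l)
    rules = []
    for r in range(1, len(l)):
        for s in combinations(l, r):
            s = frozenset(s)
            conf = support[l] / support[s]
            if conf >= min_conf:
                rules.append((set(s), set(l - s), conf))
    return rules

support = {
    frozenset({"I1"}): 6, frozenset({"I2"}): 7, frozenset({"I5"}): 2,  # assumed 1-itemset counts
    frozenset({"I1", "I2"}): 4, frozenset({"I1", "I5"}): 2, frozenset({"I2", "I5"}): 2,
    frozenset({"I1", "I2", "I5"}): 2,
}
print(rules_from_itemset({"I1", "I2", "I5"}, support, min_conf=0.7))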
An Example:
min_sup = 50% (2), min_conf = 80%: generate strong association rules.

TID  | Items bought
T100 | Sugar (A), Egg (C), Butter (D)
T200 | Milk (B), Egg (C), Bread (E)
T300 | Sugar (A), Milk (B), Egg (C), Bread (E)
T400 | Milk (B), Bread (E)
Candidate rules to check:
• Sugar -> Egg
• Milk -> Bread
• Bread -> Milk
• Milk, Egg -> Bread
• Egg, Bread -> Milk
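To check these candidates against the thresholds, the following sketch computes support and confidence for each rule over the four transactions above (an illustration only; the rule list and the 50%/80% thresholds are taken from the slide):

transactions = [
    {"Sugar", "Egg", "Butter"},          # T100
    {"Milk", "Egg", "Bread"},            # T200
    {"Sugar", "Milk", "Egg", "Bread"},   # T300
    {"Milk", "Bread"},                   # T400
]
rules = [
    ({"Sugar"}, {"Egg"}),
    ({"Milk"}, {"Bread"}),
    ({"Bread"}, {"Milk"}),
    ({"Milk", "Egg"}, {"Bread"}),
    ({"Egg", "Bread"}, {"Milk"}),
]
min_sup, min_conf = 0.5, 0.8
n = len(transactions)
for lhs, rhs in rules:
    sup_lhs = sum(1 for t in transactions if lhs <= t)
    sup_both = sum(1 for t in transactions if lhs | rhs <= t)
    support, confidence = sup_both / n, sup_both / sup_lhs
    strong = support >= min_sup and confidence >= min_conf
    print(lhs, "->", rhs, f"support={support:.0%} confidence={confidence:.0%}",
          "STRONG" if strong else "")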
Another example: minimum support count = 2, minimum confidence threshold = 80%.
Transaction ID | Items
T100           | M, O, N, K, E, Y
T200           | D, O, N, K, E, Y
T300           | M, A, K, E
T400           | M, U, C, K, Y
T500           | C, O, O, K, I, E
3-itemset | Count | Support
{O, K, E} | 3     | 60%
{K, E, Y} | 2     | 40%
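The two counts above can be reproduced with a short sketch over the listed transactions (an illustrative check; the candidate 3-itemsets are the ones shown on the slide):

transactions = [
    set("MONKEY"),   # T100
    set("DONKEY"),   # T200
    set("MAKE"),     # T300
    set("MUCKY"),    # T400
    set("COOKIE"),   # T500
]
for candidate in ({"O", "K", "E"}, {"K", "E", "Y"}):
    count = sum(1 for t in transactions if candidate <= t)
    print(sorted(candidate), count, f"{count / len(transactions):.0%}")
# prints counts 3 (60%) and 2 (40%), matching the table above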
Question: Assume that the confidence of the decision rule 1 -> 2 is 100%. Is the confidence of the decision rule 2 -> 1 also 100%? Give an example of data to justify your answer.

Transaction | Items
1           | 1, 2
2           | 1, 2, 3
3           | 2, 3
4           | 1, 2, 4
5           | 2, 3, 4
Answer: No. In the data above, the confidence of 1 -> 2 is 100% (all 3 transactions containing item 1 also contain item 2), while the confidence of 2 -> 1 is only 60% (item 1 appears in 3 of the 5 transactions that contain item 2).
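A tiny sketch to verify the two confidence values on the five transactions above:

transactions = [{1, 2}, {1, 2, 3}, {2, 3}, {1, 2, 4}, {2, 3, 4}]

def confidence(antecedent, consequent, transactions):
    """confidence(A -> B) = count(transactions with A and B) / count(transactions with A)."""
    has_a = [t for t in transactions if antecedent <= t]
    return sum(1 for t in has_a if consequent <= t) / len(has_a)

print(confidence({1}, {2}, transactions))  # 1.0 -> 100%
print(confidence({2}, {1}, transactions))  # 0.6 -> 60%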
End