Apriori Algorithm Example Problems
The Apriori algorithm uses frequent itemsets to generate association rules. It is based on the
principle that every subset of a frequent itemset must itself be frequent. A frequent itemset is an
itemset whose support value is greater than a threshold value (the minimum support).
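To make the definition concrete, here is a minimal Python sketch of support counting. The
transaction list is a hypothetical toy dataset (not taken from the original tables), chosen only to
be consistent with the support counts quoted later in Example-1.

    def support_count(itemset, transactions):
        # Support count = number of transactions that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t)

    # Hypothetical toy transactions (an assumption for illustration).
    transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}, {1, 3, 5}]
    print(support_count({1, 3}, transactions))     # 3
    print(support_count({1, 3, 5}, transactions))  # 2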
Example-1:
Iteration 1: Let’s assume the minimum support count is 2. Create the itemsets of size 1 and
calculate their support values.
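A sketch of this first pass (building C1 and filtering it down to F1), assuming transactions are
given as sets of items as in the snippet above:

    from collections import Counter

    def frequent_1_itemsets(transactions, min_support=2):
        # Count each individual item across all transactions, then keep
        # only the items whose count meets the minimum support.
        counts = Counter(item for t in transactions for item in t)
        return {frozenset([item]): c for item, c in counts.items() if c >= min_support}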
Iteration 2: Next, create the itemsets of size 2 and calculate their support values. Itemsets with
a support value less than 2 are eliminated again; in this case, {1,2}. Now, let’s understand what
pruning is and how it makes Apriori one of the best algorithms for finding frequent itemsets.
Pruning: We divide each itemset in C3 into its subsets and eliminate any candidate that has a
subset with a support value less than 2.
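A minimal sketch of this prune step, assuming candidates are given as frozensets and
prev_frequent holds the frequent (k-1)-itemsets found in the previous iteration:

    from itertools import combinations

    def prune(candidates, prev_frequent):
        # Keep a candidate only if every (k-1)-item subset of it is frequent;
        # by the Apriori principle, anything else cannot be frequent.
        return [c for c in candidates
                if all(frozenset(s) in prev_frequent
                       for s in combinations(c, len(c) - 1))]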
Iteration 3: We will discard {1,2,3} and {1,2,5}, as they both contain {1,2}, which is not
frequent. This pruning is the main highlight of the Apriori algorithm.
Iteration 4: Joining the itemsets in F3 gives the candidate {1,2,3,5}. Since the support of this
itemset is less than 2, we stop here, and the final frequent itemset we have is F3.
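For completeness, here is a sketch of the join step that produced that last candidate (the
function name is an assumption for illustration); joining the two itemsets in F3 yields
{1,2,3,5}, which is then rejected on support:

    def join(prev_frequent, k):
        # Candidate k-itemsets: unions of two frequent (k-1)-itemsets
        # whose union has exactly k items.
        frequents = list(prev_frequent)
        return {a | b for a in frequents for b in frequents if len(a | b) == k}

    f3 = {frozenset({1, 3, 5}), frozenset({2, 3, 5})}
    print(join(f3, 4))  # {frozenset({1, 2, 3, 5})}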
Note: We have not calculated any confidence values yet.
For I = {1,3,5}, the subsets are {1,3}, {1,5}, {3,5}, {1}, {3}, {5}.
For I = {2,3,5}, the subsets are {2,3}, {2,5}, {3,5}, {2}, {3}, {5}.
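These subset listings can be generated mechanically; a small sketch:

    from itertools import combinations

    def proper_subsets(itemset):
        # All non-empty proper subsets of the itemset.
        items = sorted(itemset)
        return [set(c) for r in range(1, len(items))
                for c in combinations(items, r)]

    print(proper_subsets({1, 3, 5}))
    # [{1}, {3}, {5}, {1, 3}, {1, 5}, {3, 5}]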
Applying Rules: We will now create rules from the itemsets in F3 and apply them. Let’s assume the
minimum confidence value is 60%.
For every frequent itemset I and every non-empty proper subset S of I, create the rule
S –> (I – S) (meaning S recommends I – S)
and select it if support(I) / support(S) >= the minimum confidence value.
Rules for I = {1,3,5}:
Rule 1: {1,3} –> ({1,3,5} – {1,3}) means 1 & 3 –> 5
Confidence = support(1,3,5)/support(1,3) = 2/3 = 66.66% > 60%
Hence, Rule 1 is selected.
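Putting the rule check into code, here is a sketch. It assumes a support dictionary (keyed by
frozenset) filled in during the earlier passes; with support({1,3}) = 3 and support({1,3,5}) = 2
as quoted above, it reproduces Rule 1’s confidence of 2/3.

    from itertools import combinations

    def rules_from(itemset, support, min_conf=0.6):
        # Yield every rule S -> (I - S) whose confidence meets min_conf.
        # `support` maps frozensets to support counts from the earlier passes.
        items = frozenset(itemset)
        for r in range(1, len(items)):
            for s in combinations(items, r):
                s = frozenset(s)
                conf = support[items] / support[s]
                if conf >= min_conf:
                    yield set(s), set(items - s), conf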
Example-2:
We are assuming that the minimum support count is 2 and the minimum confidence is 50%.
Step 1: Create a table containing the support count of every item present in the transaction
database.
We will compare each item’s support count with the minimum support count we have set. If an
item’s support count is less than the minimum support count, we remove that item.
Step 2: Create itemsets with 2 items. Since I4 was discarded in the previous step, we do not take
any superset containing I4.
Now, remove all itemsets whose support count is less than the minimum support count. The final
dataset will be:
Step 3: Find supersets with 3 items from the sets present in the last dataset. Check whether all
subsets of each itemset are frequent, and remove the infrequent ones.
In this case, if we select {I1, I2, I3}, we must have all of its subsets, that is,
{I1, I2}, {I2, I3}, and {I1, I3}. But we do not have {I1, I3} in our dataset, so {I1, I2, I3} is
removed. The same is true for {I1, I3, I5} and {I2, I3, I5}.
Step 4: As we have discovered all the frequent itemsets, we will generate the strong association
rules. For that, we have to calculate the confidence of each rule:
confidence(A –> B) = support(A ∪ B) / support(A).
Since all these association rules have confidence ≥ 50%, all of them can be considered strong
association rules.
Step 5: We will calculate the lift for all the strong association rules:
lift(A –> B) = confidence(A –> B) / support(B). A lift greater than 1 indicates a positive
association between A and B.
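A one-line sketch of the lift computation (the parameter names are assumptions for illustration):

    def lift(confidence, consequent_count, n_transactions):
        # lift(A -> B) = confidence(A -> B) / P(B); a value above 1 suggests
        # a positive association, 1 independence, and below 1 a negative one.
        return confidence / (consequent_count / n_transactions)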