Apriori Algorithm Numerical Example
The Apriori algorithm is one of the first algorithms used for association rule mining. In
this article, we will discuss the basics of the apriori algorithm using a numerical example.
Here, we will use a sample transaction dataset to obtain frequent itemsets and create
association rules using the apriori algorithm. We will also discuss the advantages and
disadvantages of the apriori algorithm.
Before proceeding with this article, I suggest you read this article on association rule mining. It will help you understand the basic terms used here. You can also read this article on market basket analysis for an overview of how association rule mining is used in different industries.
To generate association rules using the apriori algorithm, we use the following steps. First, we scan the transaction dataset and create candidate itemsets containing a single item.
After creating the candidate sets, we use the minimum support count to select the frequent itemsets containing one item. Once we get the frequent itemsets with one item, we iteratively join them to create larger sets containing 2, 3, 4, 5, or more items. Here, we generate the candidate itemsets containing k items by joining frequent itemsets of size k-1 that have k-2 items in common. This process is repeated until no new frequent itemsets can be generated.
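As a minimal sketch of this join step (the function name and representation of itemsets as frozensets are my own choices, not from any particular library), two frequent (k-1)-itemsets are joined whenever their union has exactly k items, which is the same as saying they share k-2 items:

```python
def join_step(frequent_prev, k):
    """Join frequent (k-1)-itemsets to create candidate k-itemsets.

    Two (k-1)-itemsets whose union has exactly k items share exactly
    k-2 items, which is the join condition described above.
    Itemsets are represented as frozensets so they can live in sets.
    """
    prev = sorted(frequent_prev, key=sorted)  # stable order for the pairwise join
    candidates = set()
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            union = prev[i] | prev[j]
            if len(union) == k:
                candidates.add(union)
    return candidates
```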
We use pruning to minimize the time taken to execute the apriori algorithm. After creating candidate itemsets of k items, we use the following steps to prune the candidate set.
For each candidate itemset having k items, we check whether each of its subsets having k-1 items is a frequent itemset. If yes, the candidate is kept for generating frequent itemsets. Otherwise, we reject or prune the candidates having subsets that are not frequent itemsets.
After calculating the support count of each candidate itemset, we reject the itemsets
having a support count less than the minimum support count. The rest of the itemsets
are considered frequent itemsets.
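These two steps, subset-based pruning and support-based filtering, can be sketched as below (again, the function names are my own):

```python
from itertools import combinations

def prune_step(candidates, frequent_prev, k):
    """Discard candidate k-itemsets that have a (k-1)-item subset
    which is not a frequent itemset."""
    return {c for c in candidates
            if all(frozenset(s) in frequent_prev
                   for s in combinations(c, k - 1))}

def frequent_step(candidates, transactions, min_support_count):
    """Scan the transactions and keep the candidates whose support
    count meets the minimum support count."""
    return {c for c in candidates
            if sum(1 for t in transactions if c <= t) >= min_support_count}
```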
After generating the frequent itemsets having k items, we create candidate itemsets having k+1 items, perform pruning, a database scan, and frequent itemset generation to obtain the frequent itemsets having k+1 items. We iterate through these steps until no more frequent itemsets can be generated.
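Putting the pieces together, the level-wise iteration can be sketched as follows; this reuses the join_step, prune_step, and frequent_step functions from the sketches above:

```python
def apriori_frequent_itemsets(transactions, min_support_count):
    """Generate frequent itemsets level by level until no new ones appear."""
    # Frequent 1-itemsets: every distinct item, filtered by support count.
    items = {frozenset({item}) for t in transactions for item in t}
    current = frequent_step(items, transactions, min_support_count)
    all_frequent = set(current)
    k = 2
    while current:
        candidates = join_step(current, k)                # join
        candidates = prune_step(candidates, current, k)   # prune
        current = frequent_step(candidates, transactions, min_support_count)  # scan + filter
        all_frequent |= current
        k += 1
    return all_frequent
```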
After creating the frequent itemsets, we will create association rules. If we have a frequent itemset {I}, we can create association rules of the form {S} -> {I - S}. Here, {S} is a non-empty proper subset of the frequent itemset {I}.
After generating the association rules, we use the minimum confidence and lift to identify the association rules that are useful for making business decisions.
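A minimal sketch of rule generation from a single frequent itemset, computing confidence and lift along the way (the function name and structure are my own, not a library API):

```python
from itertools import combinations

def generate_rules(itemset, transactions, min_confidence):
    """Create rules {S} -> {I - S} for every non-empty proper subset S
    of a frequent itemset I, keeping rules that meet min_confidence."""
    n = len(transactions)

    def support(s):
        # Fraction of transactions containing all items of s.
        return sum(1 for t in transactions if s <= t) / n

    items = frozenset(itemset)
    rules = []
    for size in range(1, len(items)):
        for antecedent in map(frozenset, combinations(items, size)):
            consequent = items - antecedent
            confidence = support(items) / support(antecedent)
            lift = confidence / support(consequent)
            if confidence >= min_confidence:
                rules.append((set(antecedent), set(consequent), confidence, lift))
    return rules
```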
Transaction ID Items
T1 I1, I3, I4
T2 I2, I3, I5, I6
T3 I1, I2, I3, I5
T4 I2, I5
T5 I1, I3, I5
The above dataset for the apriori algorithm numerical example contains five transactions having transaction IDs T1, T2, T3, T4, and T5. The transactions contain six different items, namely I1, I2, I3, I4, I5, and I6. Let us now use the apriori algorithm to find association rules from the above dataset. For our numerical example, we will use a minimum support count of 2 and a minimum confidence of 75 percent.
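If you want to follow along in code, the dataset and thresholds can be encoded as below (the variable names are my own):

```python
# The five example transactions encoded as Python sets.
transactions = [
    {'I1', 'I3', 'I4'},        # T1
    {'I2', 'I3', 'I5', 'I6'},  # T2
    {'I1', 'I2', 'I3', 'I5'},  # T3
    {'I2', 'I5'},              # T4
    {'I1', 'I3', 'I5'},        # T5
]
MIN_SUPPORT_COUNT = 2   # minimum support count
MIN_CONFIDENCE = 0.75   # minimum confidence of 75 percent

# Cross-check with the sketches above: prints every frequent itemset found.
for itemset in sorted(apriori_frequent_itemsets(transactions, MIN_SUPPORT_COUNT),
                      key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset))
```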
To help us calculate the support of the itemsets, we will create a matrix representing the
presence of items in a transaction as shown below.
I1 I2 I3 I4 I5 I6
T1 1 0 1 1 0 0
T2 0 1 1 0 1 1
T3 1 1 1 0 1 0
T4 0 1 0 0 1 0
T5 1 0 1 0 1 0
Transaction Matrix
The above matrix contains items on the horizontal axis and transaction IDs on the vertical axis. If an item is present in a transaction, the corresponding cell is set to 1. Otherwise, it is set to 0. We will use this matrix to calculate the support count of itemsets, as it is easier to scan than the transaction dataset.
To calculate the support count of any given itemset using the above matrix, we will
find the number of rows in which all the items in the given itemset are set to 1.
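As a minimal sketch of this idea (assuming pandas is installed), we can store the matrix in a DataFrame and count the rows in which every item of an itemset is set to 1:

```python
import pandas as pd

# The binary transaction matrix from the table above.
matrix = pd.DataFrame(
    [[1, 0, 1, 1, 0, 0],
     [0, 1, 1, 0, 1, 1],
     [1, 1, 1, 0, 1, 0],
     [0, 1, 0, 0, 1, 0],
     [1, 0, 1, 0, 1, 0]],
    index=['T1', 'T2', 'T3', 'T4', 'T5'],
    columns=['I1', 'I2', 'I3', 'I4', 'I5', 'I6'])

def support_count(itemset):
    """Count the rows in which every item of the itemset is set to 1."""
    return int(matrix[list(itemset)].all(axis=1).sum())

print(support_count({'I1', 'I3'}))  # 3
```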
Itemset Support Count
{I1} 3
{I2} 3
{I3} 4
{I4} 1
{I5} 4
{I6} 1
The above table contains the support count of the candidate itemsets with one item. Here, you can observe that the itemsets {I4} and {I6} have a support count of 1, which is less than the minimum support count of 2. Hence, we will omit these itemsets from the above table. After this, we get the table containing the frequent itemsets with a single item, as shown below.
Itemset Support Count
{I1} 3
{I2} 3
{I3} 4
{I5} 4
Now that we have created the frequent itemsets containing a single item, we will move on to calculating the frequent itemsets with two items. By joining the frequent 1-itemsets, we get the following candidate itemsets with two items.
{I1, I2}, {I1, I3}, {I1, I5}, {I2, I3}, {I2, I5}, {I3, I5}
After creating the itemsets with two items, we need to prune the itemsets having subsets that are not frequent itemsets. As {I1}, {I2}, {I3}, and {I5} are all frequent itemsets, no itemsets will be removed from the above list while pruning.
As the next step, we will scan the transaction matrix and calculate the support count of each candidate itemset having two items. The result is tabulated below.
Itemset Support Count
{I1, I2} 1
{I1, I3} 3
{I1, I5} 2
{I2, I3} 2
{I2, I5} 3
{I3, I5} 3
In the above candidate itemset table, you can observe that the itemset {I1, I2} has a support count of 1, which is less than the minimum support count. Hence, we will remove this itemset from the table and obtain the table containing the frequent itemsets with two items, as shown below.
Itemset Support Count
{I1, I3} 3
{I1, I5} 2
{I2, I3} 2
{I2, I5} 3
{I3, I5} 3
Now, we have calculated frequent itemsets with two items. Let us calculate the frequent
itemsets with three items.
To calculate the frequent itemsets with three items, we first need to calculate the
candidate set. For this, let us first join the frequent itemsets with two items and create
itemsets with three items as shown below.
{I1, I3, I5}, {I1, I2, I3}, {I1, I2, I5}, {I2, I3, I5}
On the above itemsets, we will perform pruning to remove any itemset that has a subset that is not a frequent itemset. For this, we will create the subsets of two items for each itemset and check whether they are frequent itemsets. All the subsets of the above itemsets are tabulated below.
Itemset 2-Item Subsets All Subsets Frequent?
{I1, I3, I5} {I1, I3}, {I1, I5}, {I3, I5} Yes
{I1, I2, I3} {I1, I2}, {I1, I3}, {I2, I3} No
{I1, I2, I5} {I1, I2}, {I1, I5}, {I2, I5} No
{I2, I3, I5} {I2, I3}, {I2, I5}, {I3, I5} Yes
In the table, you can observe that the itemsets {I1, I2, I3} and {I1, I2, I5} contain the itemset {I1, I2}, which is not a frequent itemset. Hence, we will prune the itemsets {I1, I2, I3} and {I1, I2, I5}. After this, we are left with {I1, I3, I5} and {I2, I3, I5} as the candidate itemsets having three items. Let us calculate their support count.
Itemset Support Count
{I1, I3, I5} 2
{I2, I3, I5} 2
In the above table, both itemsets have a support count of 2, which is equal to the minimum support count. Hence, both itemsets will be considered frequent itemsets.
Next, we join the frequent itemsets with three items to create candidate itemsets with four items. Joining {I1, I3, I5} and {I2, I3, I5} gives the itemset {I1, I2, I3, I5}. This itemset has four subsets with three elements, i.e. {I2, I3, I5}, {I1, I3, I5}, {I1, I2, I5}, and {I1, I2, I3}. Of these, {I1, I2, I5} and {I1, I2, I3} are not frequent itemsets. Hence, we will prune the itemset {I1, I2, I3, I5}. Thus, we have no candidate itemsets with four items, and the process of frequent itemset generation stops here.
Now, let us tabulate all the frequent itemsets created in this numerical example on the
apriori algorithm.
Itemset Support Count
{I1} 3
{I2} 3
{I3} 4
{I5} 4
{I1, I3} 3
{I1, I5} 2
{I2, I3} 2
{I2, I5} 3
{I3, I5} 3
{I1, I3, I5} 2
{I2, I3, I5} 2
The above table contains all the frequent itemsets in the given transaction data. We will
now create association rules using the frequent itemsets.
Frequent Itemset Association Rules
{I1} x
{I2} x
{I3} x
{I5} x
{I1, I3} {I1}->{I3}, {I3}->{I1}
{I1, I5} {I1}->{I5}, {I5}->{I1}
{I2, I3} {I2}->{I3}, {I3}->{I2}
{I2, I5} {I2}->{I5}, {I5}->{I2}
{I3, I5} {I3}->{I5}, {I5}->{I3}
{I2, I3, I5} {I2}->{I3, I5}, {I3}->{I2, I5}, {I5}->{I2, I3}, {I2, I3}->{I5}, {I2, I5}->{I3}, {I3, I5}->{I2}
{I1, I3, I5} {I1}->{I3, I5}, {I3}->{I1, I5}, {I5}->{I1, I3}, {I1, I3}->{I5}, {I1, I5}->{I3}, {I3, I5}->{I1}
Here, x indicates that no rule can be formed from a single-item itemset.
Association rules
You can observe that we got 22 association rules from the frequent itemsets. Now, we will calculate the confidence of each association rule to find the most important ones. I have already discussed how to calculate the confidence of a given association rule in the article on association rule mining. For example, the confidence of {I1}->{I3} is support count({I1, I3}) / support count({I1}) = 3/3 = 100%. The confidence of each rule is tabulated below.
Rule Confidence
{I1}->{I3} 100%
{I3}->{I1} 75%
{I1}->{I5} 66.67%
{I5}->{I1} 50%
{I2}->{I3} 66.67%
{I3}->{I2} 50%
{I2}->{I5} 100%
{I5}->{I2} 75%
{I3}->{I5} 75%
{I5}->{I3} 75%
{I2}->{I3, I5} 66.67%
{I3}->{I2, I5} 50%
{I5}->{I2, I3} 50%
{I2, I3}->{I5} 100%
{I2, I5}->{I3} 66.67%
{I3, I5}->{I2} 66.67%
{I1}->{I3, I5} 66.67%
{I3}->{I1, I5} 50%
{I5}->{I1, I3} 50%
{I1, I3}->{I5} 66.67%
{I1, I5}->{I3} 100%
{I3, I5}->{I1} 66.67%
In the above table, we have calculated the confidence of all the association rules. Now, we set the minimum confidence threshold to 75%. Hence, we will discard all the association rules having a confidence of less than 75%. Finally, we get the following association rules.
Rule Confidence
{I1}->{I3} 100%
{I3}->{I1} 75%
{I2}->{I5} 100%
{I5}->{I2} 75%
{I3}->{I5} 75%
{I5}->{I3} 75%
{I2, I3}->{I5} 100%
{I1, I5}->{I3} 100%
Thus, we have calculated the association rules with their confidence using the apriori
algorithm numerical example.
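If you want to cross-check these hand calculations programmatically, libraries such as mlxtend implement the apriori algorithm. A minimal sketch is shown below, assuming mlxtend and pandas are installed; note that the exact signature of association_rules can vary slightly between mlxtend versions:

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# One-hot encoded transaction matrix from the example above.
df = pd.DataFrame(
    [[1, 0, 1, 1, 0, 0],
     [0, 1, 1, 0, 1, 1],
     [1, 1, 1, 0, 1, 0],
     [0, 1, 0, 0, 1, 0],
     [1, 0, 1, 0, 1, 0]],
    columns=['I1', 'I2', 'I3', 'I4', 'I5', 'I6']).astype(bool)

# A support count of 2 out of 5 transactions is a relative support of 0.4.
frequent = apriori(df, min_support=0.4, use_colnames=True)
rules = association_rules(frequent, metric='confidence', min_threshold=0.75)
print(rules[['antecedents', 'consequents', 'confidence', 'lift']])
```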
Also, you need to select the minimum support of the itemsets and the minimum confidence of the association rules very carefully. If you choose a very low minimum support, the algorithm will generate too many frequent itemsets and execution will be very slow. On the other hand, if you choose a very high minimum support count, you might miss useful itemsets and get very few association rules.
Conclusion
In this article, we discussed the apriori algorithm in data mining. Here, we worked through a numerical example using the apriori algorithm to find frequent itemsets and association rules from a transaction dataset. To learn more data mining concepts, you can read this article on KNN classification numerical example. You might also like this article on how to find clusters from a dendrogram in Python.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!