

Apriori Algorithm Numerical Example


By Aditya March 29, 2023


The Apriori algorithm is one of the first algorithms used for association rule mining. In
this article, we will discuss the basics of the apriori algorithm using a numerical example.
Here, we will use a sample transaction dataset to obtain frequent itemsets and create
association rules using the apriori algorithm. We will also discuss the advantages and
disadvantages of the apriori algorithm.

Table of Contents

1. What is the Apriori Algorithm?

2. A Step-by-Step Explanation of the Apriori Algorithm


1. Step 1: Candidate Generation in Apriori Algorithm

2. Step 2: Pruning in Apriori Algorithm

3. Step 3: Database Scan and Frequent Itemset Generation

4. Step 4: Association Rule Generation in Apriori Algorithm

3. Apriori Algorithm Numerical Example


1. Create Frequent Itemsets With One Item

2. Create Frequent Itemsets With Two Items

3. Calculate Frequent Itemsets With Three Items

4. Calculate Frequent Itemsets With Four Items

4. Generate Association Rules From Frequent Itemsets

5. Advantages and Disadvantages of the Apriori Algorithm


6. Conclusion

Before proceeding with this article, I suggest you read this article on association rule
mining. It will help you understand the basic terms used in this article. You can also
read this article on market basket analysis for an overview of how association rule
mining is used in different industries.

What is the Apriori Algorithm?


The Apriori algorithm is a frequent pattern mining algorithm used in market basket
analysis. We use the apriori algorithm to generate frequent itemsets in a transaction
dataset. It is an iterative algorithm: it starts with itemsets containing a single item
and uses them to build bigger frequent itemsets.

To generate association rules using the apriori algorithm, we use the following steps.

First, we generate frequent itemsets.

Next, we use the frequent itemsets to generate association rules.
Finally, we select the important association rules by eliminating a subset of the
generated association rules using confidence and lift.

A Step-by-Step Explanation of the Apriori Algorithm

To use the apriori algorithm for association rule mining, we first define the minimum
support, confidence, and lift. After this, we use candidate generation, pruning, database
scan, and rule generation to create association rules. All the steps are discussed below.

Step 1: Candidate Generation in Apriori Algorithm


We use the candidate generation step to generate itemsets that can possibly be frequent
itemsets. For this, we start by creating sets containing single items. If there are N items in
the dataset, we create N candidate sets.


After creating the candidate sets, we use the minimum support count to select frequent
itemsets containing one item. Once we get the frequent itemsets with one item, we
iteratively join them to create larger sets containing 2, 3, 4, 5, or more items. In
general, we generate the candidate itemsets containing k items by joining pairs of
frequent itemsets with k-1 items that share k-2 items. This process is repeated until
no new frequent itemsets can be generated.
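
To make the join step concrete, here is a minimal Python sketch (the function name and the use of frozensets are our own choices, not from a specific library):

```python
def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets to build candidate k-itemsets.

    Two (k-1)-itemsets are joined only when their union has exactly
    k items, i.e. they agree on k-2 items and differ in one item.
    """
    itemsets = list(frequent_prev)
    candidates = set()
    for i in range(len(itemsets)):
        for j in range(i + 1, len(itemsets)):
            union = itemsets[i] | itemsets[j]
            if len(union) == k:
                candidates.add(union)
    return candidates
```

For example, joining the frequent 1-itemsets {I1} and {I3} gives the candidate 2-itemset {I1, I3}.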

Step 2: Pruning in Apriori Algorithm


The pruning step in the apriori algorithm is based on the concept that every subset of a
frequent itemset must also be a frequent itemset. In other words, if an itemset has even
one subset that is not a frequent itemset, the itemset itself cannot be a frequent
itemset.

We use pruning to minimize the time taken in executing the apriori algorithm. After
creating candidate itemsets of k items, we use the following steps to prune the candidate set.

For each candidate set having k items, we check whether each of its subsets having k-1
items is a frequent itemset. If yes, the candidate set is considered for generating
frequent itemsets. Otherwise, we reject or prune the candidate itemsets having subsets
that are not frequent itemsets.
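
Continuing the sketch above (itemsets represented as frozensets), the pruning check can look like this:

```python
def prune(candidates, frequent_prev):
    """Keep a candidate k-itemset only if every one of its
    (k-1)-item subsets is a known frequent itemset."""
    kept = set()
    for candidate in candidates:
        # Removing one item at a time enumerates all (k-1)-subsets.
        if all(candidate - {item} in frequent_prev for item in candidate):
            kept.add(candidate)
    return kept
```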

Step 3: Database Scan and Frequent Itemset Generation


After pruning, we check the support count of the remaining candidate itemsets. For this,
we scan the transaction dataset to find the support of each candidate itemset.

After calculating the support count of each candidate itemset, we reject the itemsets
having a support count less than the minimum support count. The rest of the itemsets
are considered frequent itemsets.

After generating the frequent itemsets having k items, we create candidate itemsets
having k+1 items, perform pruning, database scan, and then frequent itemset generation
to generate frequent itemsets having k+1 items. We iterate through these processes
until we cannot generate more frequent itemsets.
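
A hedged sketch of the database scan, again with transactions and candidates represented as frozensets:

```python
def count_support(transactions, candidates):
    """One scan of the database: count, for each candidate itemset,
    the number of transactions containing all of its items."""
    counts = {candidate: 0 for candidate in candidates}
    for transaction in transactions:
        for candidate in candidates:
            if candidate <= transaction:  # subset test
                counts[candidate] += 1
    return counts

def keep_frequent(counts, min_support_count):
    """Discard candidates below the minimum support count."""
    return {c: n for c, n in counts.items() if n >= min_support_count}
```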

Step 4: Association Rule Generation in Apriori Algorithm



After creating frequent itemsets, we will create association rules. If we have a frequent
itemset {I}, we can create association rules in the form {S} -> {I - S}. Here, {S} is a
subset of the frequent itemset {I}.

After generating the association rules, we use the minimum confidence and lift thresholds
to select the association rules that are useful for business decisions.
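
A sketch of rule generation from a single frequent itemset is shown below. It assumes a `support` dictionary that maps every frequent itemset (as a frozenset) to its support count, which the earlier steps would have produced:

```python
from itertools import combinations

def rules_from_itemset(itemset, support):
    """Emit every rule {S} -> {I - S} from a frequent itemset I, with
    confidence(S -> I - S) = support(I) / support(S)."""
    rules = []
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            s = frozenset(antecedent)
            confidence = support[itemset] / support[s]
            rules.append((s, itemset - s, confidence))
    return rules
```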

Apriori Algorithm Numerical Example


By now, we have discussed the definition of and the procedure for the apriori algorithm.
Let us now work through a numerical example to understand the algorithm better. For the
numerical example, we will use the following dataset.

Transaction ID Items

T1 I1, I3, I4

T2 I2, I3, I5, I6

T3 I1, I2, I3, I5

T4 I2, I5

T5 I1, I3, I5

Dataset for Apriori Algorithm

The above dataset for the apriori algorithm numerical example contains five transactions
with transaction IDs T1, T2, T3, T4, and T5. The transactions contain six different
items, namely I1, I2, I3, I4, I5, and I6. Let us now use the apriori algorithm to find
association rules from the above dataset. For our numerical example, we will use a
minimum support count of 2 and a minimum confidence of 75 percent.

To help us calculate the support of the itemsets, we will create a matrix representing the
presence of items in a transaction as shown below.

I1 I2 I3 I4 I5 I6


T1 1 0 1 1 0 0

T2 0 1 1 0 1 1

T3 1 1 1 0 1 0

T4 0 1 0 0 1 0

T5 1 0 1 0 1 0

Transaction Matrix

The above matrix contains Items on the horizontal axis and transaction IDs on the vertical
axis. If an item is present in a transaction, the corresponding cell is set to 1. Otherwise, it
is set to 0. We will use this matrix to calculate the support count of itemsets as it is easier
to scan this matrix compared to the transaction dataset.

To calculate the support count of any given itemset using the above matrix, we will
find the number of rows in which all the items in the given itemset are set to 1.
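
If you want to automate this check, the matrix above can be encoded as a pandas DataFrame (assuming the third-party pandas library; this snippet is a sketch, not part of the original example):

```python
import pandas as pd

# The transaction matrix above, one row per transaction.
matrix = pd.DataFrame(
    [[1, 0, 1, 1, 0, 0],
     [0, 1, 1, 0, 1, 1],
     [1, 1, 1, 0, 1, 0],
     [0, 1, 0, 0, 1, 0],
     [1, 0, 1, 0, 1, 0]],
    index=["T1", "T2", "T3", "T4", "T5"],
    columns=["I1", "I2", "I3", "I4", "I5", "I6"],
)

def support_count(itemset):
    """Count the rows in which every item of the itemset is 1."""
    return int(matrix[list(itemset)].all(axis=1).sum())

print(support_count({"I1", "I3"}))  # 3, matching the tables below
```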

Create Frequent Itemsets With One Item


The Apriori algorithm starts by creating candidate itemsets with one item. For this, let us
calculate the support count of each item.

Itemset Support Count

{I1} 3

{I2} 3

{I3} 4

{I4} 1

{I5} 4

{I6} 1


Candidate Itemsets with one item

The above table contains the support count of candidate itemsets with one item. Here,
you can observe that the itemsets {I4} and {I6} have a support count of 1, which is less
than the minimum support count of 2. Hence, we will omit these itemsets from the table.
After this, we will get the table containing frequent itemsets with a single item, as
shown below.

Itemset Support Count

{I1} 3

{I2} 3

{I3} 4

{I5} 4

Frequent itemsets with one item

Now that we have created the frequent itemsets containing a single item, we will move on
to calculating the frequent itemsets with two items.

Create Frequent Itemsets With Two Items


To create frequent itemsets with two items, we will first create the candidate itemset with
two items. For this, we will join all the frequent itemsets with one item with each other.
After joining, we will get the following itemsets.

{I1,I2}, {I1,I3}, {I1,I5},{I2,I3},{I2,I5}, and {I3,I5}

After creating the itemsets with two items, we need to prune the itemsets having subsets
that are not frequent itemsets. As {I1}, {I2}, {I3}, and {I5} are all frequent itemsets,
no itemsets will be removed from the above list while pruning.

As the next step, we will calculate the support count of each candidate itemset having
two items. The result is tabulated below.


Itemset Support Count

{I1,I2} 1

{I1,I3} 3

{I1,I5} 2

{I2,I3} 2

{I2,I5} 3

{I3,I5} 3

Candidate itemsets with two items

In the above candidate itemsets, you can observe that the itemset {I1, I2} has a support
count of 1, which is less than the minimum support count. Hence, we will remove this
itemset from the table and obtain the table containing frequent itemsets with two items
as shown below.

Itemset Support Count

{I1,I3} 3

{I1,I5} 2

{I2,I3} 2

{I2,I5} 3

{I3,I5} 3

Frequent itemsets with two items

Now, we have calculated frequent itemsets with two items. Let us calculate the frequent
itemsets with three items.

Calculate Frequent Itemsets With Three Items



To calculate the frequent itemsets with three items, we first need to calculate the
candidate set. For this, let us first join the frequent itemsets with two items and create
itemsets with three items as shown below.

{I1, I3, I5}, {I1, I2, I3}, {I1, I2, I5}, {I2, I3, I5}

On the above itemsets, we will perform pruning to remove any itemset that has a subset
that is not a frequent itemset. For this, we will create subsets of 2 items for each itemset
and check if they are frequent itemsets or not. All the subsets of the above itemsets are
tabulated below.

Itemset Subsets All the subsets are frequent itemsets?

{I1, I3, I5} {I1, I3},{I1, I5},{I3, I5} Yes

{I1, I2, I3} {I1, I2}, {I1, I3}, {I2, I3} No

{I1, I2, I5} {I1, I2},{I1, I5},{I2, I5} No

{I2, I3, I5} {I2, I3}, {I2, I5}, {I3, I5} Yes

Pruning itemsets with three items

In the table, you can observe that the itemsets {I1, I2, I3} and {I1, I2, I5} contain the
itemset {I1, I2}, which is not a frequent itemset. Hence, we will prune the itemsets
{I1, I2, I3} and {I1, I2, I5}. After this, we are left with {I1, I3, I5} and {I2, I3, I5}
as the candidate itemsets with three items. Let us calculate their support count.

Itemset Support Count

{I2, I3, I5} 2

{I1, I3, I5} 2

Candidate itemsets with three items


In the above table, both itemsets have a support count of 2, which is equal to the
minimum support count. Hence, both itemsets will be considered frequent itemsets.

Itemset Support Count

{I2, I3, I5} 2

{I1, I3, I5} 2

Frequent itemsets with three items

Calculate Frequent Itemsets With Four Items


Now, we will calculate the frequent itemsets with four items. For this, we will first join the
items in the frequent itemsets with three items to create itemsets with four items. We will
get only one itemset as shown below.

{I1, I2, I3, I5}

The above itemset has four subsets with three items each, i.e., {I2, I3, I5}, {I1, I3, I5},
{I1, I2, I5}, and {I1, I2, I3}. Among these, {I1, I2, I5} and {I1, I2, I3} are not frequent
itemsets. Hence, we will prune the itemset {I1, I2, I3, I5}. Thus, we have no candidate
itemsets with four items, and the process of frequent itemset generation stops here.

Now, let us tabulate all the frequent itemsets created in this numerical example on the
apriori algorithm.

Itemset Support Count

{I1} 3

{I2} 3

{I3} 4

{I5} 4


{I1,I3} 3

{I1,I5} 2

{I2,I3} 2

{I2,I5} 3

{I3,I5} 3

{I2, I3, I5} 2

{I1, I3, I5} 2

Frequent itemsets in the dataset

The above table contains all the frequent itemsets in the given transaction data. We will
now create association rules using the frequent itemsets.

Generate Association Rules From Frequent Itemsets

To create association rules from the frequent itemsets, we consider each itemset in turn.
We then select a subset of the itemset as the antecedent of the association rule and make
the remaining items the consequent. All the association rules that can be generated from
the frequent itemsets are shown in the following table.

Itemset Association Rules

{I1} x

{I2} x

{I3} x

{I5} x

{I1,I3} {I1}->{I3}, {I3}->{I1}


{I1,I5} {I1}->{I5}, {I5}->{I1}

{I2,I3} {I2}->{I3}, {I3}->{I2}

{I2,I5} {I2}->{I5}, {I5}->{I2}

{I3,I5} {I3}->{I5}, {I5}->{I3}

{I2, I3, I5} {I2}->{I3, I5}, {I3}->{I2, I5}, {I5}->{I2, I3}, {I2, I3}->{I5}, {I2, I5}->{I3}, {I3, I5}->{I2}

{I1, I3, I5} {I1}->{I3, I5}, {I3}->{I1, I5}, {I5}->{I1, I3}, {I1, I3}->{I5}, {I1, I5}->{I3}, {I3, I5}->{I1}

Association rules
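
As a quick sanity check, the six rules listed for {I2, I3, I5} can be enumerated with a few lines of Python (a sketch using the standard library):

```python
from itertools import combinations

itemset = {"I2", "I3", "I5"}
for r in range(1, len(itemset)):
    for antecedent in combinations(sorted(itemset), r):
        consequent = itemset - set(antecedent)
        print(set(antecedent), "->", consequent)  # prints all 6 rules
```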

You can observe that we got 22 association rules from the frequent itemsets. Now, we
will calculate the confidence of each association rule to find the most important
association rules. I have already discussed how to calculate the confidence of a given
association rule in the article on association rule mining.
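
For instance, using the support counts tabulated earlier, confidence({I1}->{I3}) =
support({I1, I3}) / support({I1}) = 3/3 = 100%, while confidence({I3}->{I1}) =
support({I1, I3}) / support({I3}) = 3/4 = 75%. The confidence of each rule is
tabulated below.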

Association Rule Confidence

{I1}->{I3} 100%

{I3}->{I1} 75%

{I2}->{I3} 66.67%

{I3}->{I2} 50%

{I1}->{I5} 66.67%

{I5}->{I1} 50%

{I2}->{I5} 100%

{I5}->{I2} 75%

{I3}->{I5} 75%


{I5}->{I3} 75%

{I2}->{I3, I5} 66.67%

{I3}->{I2, I5} 50%

{I5}->{I2, I3} 50%

{I2, I3}->{I5} 100%

{I2, I5}->{I3} 66.67%

{I3, I5}->{I2} 66.67%

{I1}->{I3, I5} 66.67%

{I3}->{I1, I5} 50%

{I5}->{I1, I3} 50%

{I1, I3}->{I5} 66.67%

{I1, I5}->{I3} 100%

{I3, I5}->{I1} 66.67%

Association rules with confidence

In the above table, we have calculated the confidence of all the association rules.
Recall that we set the minimum confidence threshold to 75%. Hence, we will delete all the
association rules having a confidence of less than 75%. Finally, we will get the following
association rules.

Association Rule Confidence

{I1}->{I3} 100%

{I3}->{I1} 75%

{I2}->{I5} 100%

{I5}->{I2} 75%


{I3}->{I5} 75%

{I5}->{I3} 75%

{I2, I3}->{I5} 100%

{I1, I5}->{I3} 100%

Association rules with confidence of more than 75%

Thus, we have calculated the association rules with their confidence using the apriori
algorithm numerical example.
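
If you want to reproduce this whole example programmatically, one option is the third-party mlxtend library (shown here as a sketch, assuming mlxtend is installed; it is not part of the original worked example):

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# The one-hot transaction matrix from the example above.
df = pd.DataFrame(
    [[1, 0, 1, 1, 0, 0],
     [0, 1, 1, 0, 1, 1],
     [1, 1, 1, 0, 1, 0],
     [0, 1, 0, 0, 1, 0],
     [1, 0, 1, 0, 1, 0]],
    columns=["I1", "I2", "I3", "I4", "I5", "I6"],
).astype(bool)

# A minimum support count of 2 out of 5 transactions is a relative
# support of 0.4; the minimum confidence is 75%.
frequent = apriori(df, min_support=0.4, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.75)
print(rules[["antecedents", "consequents", "confidence"]])
```

With these thresholds, the output should match the eight rules in the table above.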

Advantages and Disadvantages of the Apriori Algorithm

The Apriori algorithm is an easy-to-understand algorithm. It is an iterative algorithm,
and you can easily learn and implement it. However, it has very high computational
complexity due to repeated scans of the entire database.

Also, you need to select the minimum support of the itemsets and the confidence of the
association rules very carefully. If you choose a very low minimum support, the algorithm
will generate too many frequent itemsets and execution will be very slow. On the other
hand, if you choose a very high minimum support count, you might discard itemsets that
are actually interesting and end up with few or no useful rules.

Conclusion
In this article, we discussed the apriori algorithm in data mining. Here, we worked
through a numerical example using the apriori algorithm to find frequent itemsets and
association rules from a transaction dataset. To learn more data mining concepts, you can
read this article on KNN classification with a numerical example. You might also like
this article on how to find clusters from a dendrogram in Python.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!