EXP6
Experiment No:6
Date of Performance:
Date of Submission:
Theory:
Association rule mining is a technique to identify frequent patterns and associations among a set
of items. In retail data, the process of identifying an association between products/items is
called association rule mining. Many algorithms have been developed to implement association rule
mining, and the Apriori algorithm is one of the most popular and widely used among them. Let us
discuss what the Apriori algorithm is.
Let us try and understand the working of an Apriori algorithm with the help of a very famous
business scenario, market basket analysis.
Here is a dataset consisting of six transactions in an hour. Each transaction is a combination of 0s
and 1s, where 0 represents the absence of an item and 1 represents the presence of it.
Mahavir Education Trust's
Shah & Anchor Kutchhi Engineering College
Chembur, Mumbai 400 088
UG Program in Electronics & Computer Science
We can find multiple rules from this scenario. For example, in a transaction of wine, chips, and
bread, if wine and chips are bought, then customers also buy bread.
{wine, chips} => {bread}
In order to select the interesting rules out of multiple possible rules from this small business
scenario, we will be using the following measures:
● Support
● Confidence
● Lift
● Conviction
Support
Support of item x is nothing but the ratio of the number of transactions in which item x appears
to the total number of transactions.
i.e.,
Support(wine) = (number of transactions containing wine) / (total number of transactions)
Support(wine) = 4 / 6 = 0.66667
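The support calculation can be sketched in Python as follows. The transaction list is a hypothetical reconstruction, chosen only so that its item counts agree with the frequency tables in this write-up (wine in 4 of 6 transactions); it is not the original dataset, and the function name `support` is illustrative.

```python
# Hypothetical six-transaction dataset (an assumption, not the original table).
# It is built so that item and pair counts match the tables in this experiment.
transactions = [
    {"bread", "milk"},
    {"wine", "chips", "milk"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "bread", "milk"},
    {"chips"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(round(support({"wine"}, transactions), 5))  # 0.66667
```

The same function works for itemsets of any size, for example `support({"wine", "milk"}, transactions)`.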
Confidence
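Confidence (x => y) signifies the likelihood of item y being purchased when item x is
purchased. This measure takes into account the popularity of item x, but not that of item y.
i.e.,
Confidence(x => y) = Support(x ∪ y) / Support(x)
where Support(x ∪ y) is the support of the transactions that contain both x and y.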
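As a small worked example, confidence is the support of x and y together divided by the support of x, which reduces to a ratio of counts. The sketch below uses the rule bread => wine with counts taken from the frequency tables later in this write-up (bread: 4, wine and bread together: 3); the rule choice and the function name are illustrative assumptions.

```python
# Counts read from the frequency tables of this experiment (6 transactions).
count_bread = 4        # transactions containing bread
count_bread_wine = 3   # transactions containing both bread and wine


def confidence(count_xy, count_x):
    """Confidence(x => y) = count(x and y) / count(x)."""
    return count_xy / count_x


print(confidence(count_bread_wine, count_bread))  # 0.75
```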
Lift
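Lift (x => y) measures the ‘interestingness’ of a rule: how much more likely item y is to be
purchased when item x is purchased than it is to be purchased in general. Unlike confidence
(x => y), this measure takes into account the popularity of the item y.
i.e.,
Lift(x => y) = Support(x ∪ y) / (Support(x) × Support(y))
A lift greater than 1 means that x and y are bought together more often than would be expected
if they were independent.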
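A minimal sketch of the lift calculation, again for the illustrative rule bread => wine, assuming the counts from the frequency tables in this write-up (bread: 4, wine: 4, both together: 3, out of 6 transactions). The function name and argument order are assumptions made for this example.

```python
n_transactions = 6
count_bread, count_wine, count_bread_wine = 4, 4, 3


def lift(count_xy, count_x, count_y, n):
    """Lift(x => y) = Support(x and y) / (Support(x) * Support(y))."""
    return (count_xy / n) / ((count_x / n) * (count_y / n))


print(round(lift(count_bread_wine, count_bread, count_wine, n_transactions), 3))  # 1.125
```

A value above 1, as here, suggests bread buyers pick wine slightly more often than shoppers overall.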
Conviction
Conviction of a rule can be defined as follows:
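conv(x => y) = (1 - Support(y)) / (1 - Confidence(x => y))
i.e., conviction compares how often x would occur without y if the two were independent with
how often the rule x => y actually fails.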
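Using the standard definition conv(x => y) = (1 - Support(y)) / (1 - Confidence(x => y)), a short sketch for the illustrative rule bread => wine looks like this. The inputs (Support(wine) = 4/6, Confidence(bread => wine) = 3/4) come from the frequency tables in this write-up; treating a confidence of exactly 1 as infinite conviction is a convention assumed here.

```python
import math


def conviction(support_y, confidence_xy):
    """conv(x => y) = (1 - Support(y)) / (1 - Confidence(x => y))."""
    if confidence_xy == 1.0:
        # The rule never fails in the data, so the ratio is unbounded.
        return math.inf
    return (1 - support_y) / (1 - confidence_xy)


print(round(conviction(4 / 6, 3 / 4), 4))  # 1.3333
```

For a rule such as wine => milk, where every wine transaction also contains milk, the function returns `math.inf`.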
Now that we have the measures for finding the interesting rules, let us go back to the example.
We fix the support threshold at 50 per cent.
Step 1: Create a frequency table of all the items that occur in all transactions
Item Frequency
Wine 4
Chips 4
Bread 4
Milk 5
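Support threshold = 50% of 6 transactions = 3
Step 2: Keep only the significant items, i.e. those that pass the support threshold
Item Frequency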
Wine 4
Chips 4
Bread 4
Milk 5
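The counting and filtering above can be sketched with `collections.Counter`. The transaction list is a hypothetical reconstruction (an assumption, not the original table) chosen so that the counts match the frequency table, and `min_count = 3` mirrors the stated support threshold.

```python
from collections import Counter

# Hypothetical transactions whose item counts match the table above.
transactions = [
    {"bread", "milk"},
    {"wine", "chips", "milk"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "bread", "milk"},
    {"chips"},
]

# Count how many transactions each item appears in.
item_counts = Counter(item for t in transactions for item in t)

# Keep the items whose count reaches the threshold.
min_count = 3
significant = {item: c for item, c in item_counts.items() if c >= min_count}

print(sorted(significant.items()))
# [('bread', 4), ('chips', 4), ('milk', 5), ('wine', 4)]
```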
Step 3: From the significant items, make possible pairs irrespective of the order
Item Frequency
Wine, Chips 3
Wine, Bread 3
Wine, Milk 4
Chips, Bread 2
Chips, Milk 3
Bread, Milk 4
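Pair generation and counting can be sketched with `itertools.combinations`, which yields each unordered pair once. As before, the transaction list is a hypothetical reconstruction (an assumption) chosen so that the pair counts agree with the table above.

```python
from itertools import combinations

# Hypothetical transactions whose pair counts match the table above.
transactions = [
    {"bread", "milk"},
    {"wine", "chips", "milk"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "bread", "milk"},
    {"chips"},
]

items = ["wine", "chips", "bread", "milk"]

# For every unordered pair, count the transactions containing both items.
pair_counts = {
    pair: sum(1 for t in transactions if set(pair) <= t)
    for pair in combinations(items, 2)
}

for pair, count in pair_counts.items():
    print(pair, count)
```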
Step 4: Again, find the significant item pairs based on the support threshold (the pairs kept
below are those bought together in more than three transactions)
Item Frequency
Wine, Milk 4
Bread, Milk 4
Step 5: Now, make a set of three items that are bought together based on the significant
items from Step 4
Item Frequency
{Wine, Bread, Milk} is the only significant itemset we get from the given data. In real-world
scenarios, however, we would have dozens of items to build rules from, and we might then have to
build itemsets of four or five items.
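The way Step 5 builds a three-item set from the significant pairs can be sketched as a join: take the union of two pairs that share an item. This is only a minimal sketch of candidate generation under that assumption; a full Apriori implementation would also check each candidate against the transaction data before accepting it, which is omitted here. The function name `join_candidates` is hypothetical.

```python
from itertools import combinations

# Significant pairs from Step 4.
significant_pairs = [frozenset({"wine", "milk"}), frozenset({"bread", "milk"})]


def join_candidates(itemsets, size):
    """Union every two itemsets and keep the unions with exactly `size` items."""
    candidates = set()
    for a, b in combinations(itemsets, 2):
        union = a | b
        if len(union) == size:
            candidates.add(union)
    return candidates


candidates = join_candidates(significant_pairs, 3)
print([sorted(c) for c in candidates])  # [['bread', 'milk', 'wine']]
```

Calling the same function with larger itemsets and `size=4` or `size=5` gives the longer candidates mentioned above.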
● The support value for the first rule is 0.5. This number is calculated by dividing the
number of transactions containing ‘Milk’, ‘Bread’, and ‘Butter’ by the total number of
transactions.
● The confidence level for the rule is 0.846, which shows that out of all the transactions
that contain both ‘Milk’ and ‘Bread’, 84.6% contain ‘Butter’ too.
● The lift of 1.241 tells us that ‘Butter’ is 1.241 times more likely to be bought by the
customers who buy both ‘Milk’ and ‘Bread’ than by customers in general.
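The three figures are linked: lift equals confidence divided by the support of the consequent. As a quick consistency check, the sketch below derives the support of ‘Butter’ implied by the reported confidence and lift. That value is inferred from the two numbers and is not stated in the output itself.

```python
# Values reported for the first rule, {Milk, Bread} => {Butter}.
confidence = 0.846
lift = 1.241

# lift = confidence / Support(Butter)  =>  Support(Butter) = confidence / lift
implied_support_butter = confidence / lift

print(round(implied_support_butter, 3))  # 0.682
```

In other words, the reported figures imply that ‘Butter’ appears in roughly 68% of all transactions.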
OUTPUT: