Ex. 9 Association Rule Learning Using Apriori Algorithm

The document describes using the Apriori algorithm to generate association rules from transactional data. It discusses key concepts like support, confidence and frequent itemsets. The algorithm is applied to sample retail data and top association rules meeting minimum support and confidence thresholds are outputted.

Uploaded by

Toygj
Copyright
© All Rights Reserved

Ex. 9 Association Rule Learning using Apriori Algorithm

April 15, 2024



Date: 01/04/2024 Reg No: URK21CS7010
AIM: For the given dataset, generate association rules using the Apriori algorithm with
support threshold = 8% and confidence threshold = 50%.
Description:
Association Rule Learning: A branch of machine learning that deals with discovering interesting
relationships or associations between variables in large datasets. It aims to uncover patterns,
correlations, and dependencies among items.
Apriori Algorithm: An algorithm used for frequent itemset mining and association rule learning.
It works by iteratively finding frequent itemsets in a dataset and generating association rules based
on those itemsets. The key idea is to prune the search space efficiently by leveraging the Apriori
principle, which states that if an itemset is infrequent, all its supersets are also infrequent.
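The Apriori principle can be checked on a small example. The sketch below uses hypothetical toy transactions (not the contents of Shop2.csv), an illustrative helper named support, and an assumed threshold of 0.4; it only shows that a superset of an infrequent itemset is itself infrequent.

```python
# Hypothetical toy transactions for illustration (not the contents of Shop2.csv).
transactions = [
    {"Butter", "Cheese", "Lassi"},
    {"Butter", "Cheese"},
    {"Butter", "Ghee"},
    {"Coffee Powder", "Ghee"},
    {"Butter", "Cheese", "Ghee"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


min_support = 0.4  # assumed threshold for this toy example

subset = {"Lassi"}
superset = {"Butter", "Lassi"}

# A superset can only occur in transactions that also contain the subset,
# so its support can never be larger than the subset's support.
print(f"support({sorted(subset)}) = {support(subset, transactions):.2f}")
print(f"support({sorted(superset)}) = {support(superset, transactions):.2f}")
print(support(subset, transactions) < min_support)    # the subset is infrequent
print(support(superset, transactions) < min_support)  # so the superset is too
```

Because of this property, the algorithm never needs to count any itemset that contains an itemset already found to be infrequent.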
Support: In association rule learning, support measures the frequency of occurrence of an itemset
in the dataset. It is calculated as the proportion of transactions that contain the itemset.
Confidence: Quantifies the strength of association between the antecedent and consequent in an
association rule. It represents the conditional probability of the consequent given the antecedent and
is calculated as the ratio of the support of the combined itemset to the support of the antecedent.
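The two measures defined above can be computed directly from their definitions. This is a minimal sketch on hypothetical toy transactions (not Shop2.csv); the function names count, support and confidence are illustrative and are not part of apyori.

```python
# Hypothetical toy transactions for illustration (not the contents of Shop2.csv).
transactions = [
    {"Butter", "Cheese", "Lassi"},
    {"Butter", "Cheese"},
    {"Butter", "Ghee"},
    {"Coffee Powder", "Ghee"},
    {"Butter", "Cheese", "Ghee"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions)


def support(itemset, transactions):
    """Proportion of transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = frozenset(antecedent) | frozenset(consequent)
    # Dividing the raw counts gives the same ratio as dividing the two supports.
    return count(both, transactions) / count(antecedent, transactions)


print(f"{support({'Butter', 'Cheese'}, transactions):.2f}")       # 0.60
print(f"{confidence({'Butter'}, {'Cheese'}, transactions):.2f}")  # 0.75
print(f"{confidence({'Cheese'}, {'Butter'}, transactions):.2f}")  # 1.00
```

The last two lines show that confidence is not symmetric: the rule Cheese -> Butter holds in every toy transaction that contains Cheese, while Butter -> Cheese holds in three of the four that contain Butter.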
Support Threshold: A predetermined minimum level of support that an itemset must meet to be
considered frequent. It serves as a filtering criterion to identify itemsets that occur sufficiently often
in the dataset.
Confidence Threshold: A predefined minimum confidence value that association rules must satisfy
to be considered significant or interesting. It helps in selecting rules that have a strong likelihood
of being true or useful.
Candidate Set: In the Apriori algorithm, the candidate set is the set of itemsets generated at
each iteration. Candidates of size k are produced by joining frequent itemsets of size k-1 from
the previous iteration and pruning any candidate that has an infrequent subset. The remaining
candidates are then counted, and those below the support threshold are discarded.
Frequency Set: The set of itemsets that satisfy the minimum support threshold. These are the
frequent itemsets discovered during the execution of the Apriori algorithm and serve as the basis
for generating association rules.
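The relationship between the candidate set and the frequency set at each iteration can be sketched in plain Python. This is an illustrative level-wise implementation under stated assumptions, not the code used in this exercise: the function apriori_levels, the toy transactions and the 0.4 threshold are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy transactions for illustration (not the contents of Shop2.csv).
transactions = [
    {"Butter", "Cheese", "Lassi"},
    {"Butter", "Cheese"},
    {"Butter", "Ghee"},
    {"Coffee Powder", "Ghee"},
    {"Butter", "Cheese", "Ghee"},
]


def apriori_levels(transactions, min_support):
    """Yield (k, candidate set, frequent set with supports) for k = 1, 2, ..."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    k = 1
    while candidates:
        # Count every candidate, then keep those that meet the threshold.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        frequent = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        yield k, candidates, frequent
        # Join step: unions of two frequent k-itemsets that contain k+1 items.
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k + 1}
        # Prune step: keep a candidate only if all of its k-item subsets are frequent.
        candidates = sorted(
            (c for c in joined
             if all(frozenset(s) in frequent for s in combinations(c, k))),
            key=sorted,
        )
        k += 1


for k, candidates, frequent in apriori_levels(transactions, min_support=0.4):
    print(f"Iteration {k}:")
    print("  Candidate set:", [sorted(c) for c in candidates])
    print("  Frequent set: ", {tuple(sorted(c)): round(s, 2) for c, s in frequent.items()})
```

On this toy data the loop stops after two iterations: the only possible 3-item candidate, {Butter, Cheese, Ghee}, is pruned because its subset {Cheese, Ghee} is not frequent.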

Shop2.csv

1. Read the data

[83]: import pandas as pd


data = pd.read_csv("Shop2.csv")
print(data.head())

TID Item
0 1 Lassi,Coffee Powder,Butter,Yougurt,Ghee,Cheese
1 2 Ghee,Coffee Powder
2 3 Lassi,Tea Powder,Butter,Cheese
3 4 Cheese,Tea Powder,Panner,Coffee Powder,Butter,…
4 5 Cheese,Yougurt,Coffee Powder,Sugar,Butter,Sweet
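Each Item value is a single comma-separated string, so it has to be split into a list of items before mining. The sketch below uses the first three rows printed above as inline strings rather than reading Shop2.csv; the strip call is a defensive assumption in case the file contains spaces around the commas.

```python
# The first three 'Item' values from the output above, as plain strings.
rows = [
    "Lassi,Coffee Powder,Butter,Yougurt,Ghee,Cheese",
    "Ghee,Coffee Powder",
    "Lassi,Tea Powder,Butter,Cheese",
]

# Split each row on commas, trim whitespace, and drop empty entries.
transactions = [
    [item.strip() for item in row.split(",") if item.strip()]
    for row in rows
]

for tid, items in enumerate(transactions, start=1):
    print(tid, items)
```

With a DataFrame, the same split would be applied to each value of the Item column.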
2. Display the candidate set and frequency set for every iteration

[84]: from apyori import apriori

def display_candidate_and_frequency_sets(data, min_support, max_rows=5):
    # Each row's 'Item' value is a comma-separated string; split it into a list.
    transactions = [row['Item'].split(',') for _, row in data.iterrows()]

    # apyori yields one record per frequent itemset.
    # use_colnames is an mlxtend option, not an apyori one, so it is omitted.
    results = apriori(transactions, min_support=min_support)

    print("Candidate Sets and Frequent Itemsets L-sets 1134:")

    rows_printed = 0
    for i, result in enumerate(results):
        if rows_printed >= max_rows:
            break
        print(f"Iteration {i+1}:")
        if isinstance(result, list):
            for frequent_itemset in result:
                if rows_printed >= max_rows:
                    break
                print(f" Frequent Itemset: {frequent_itemset}")
                rows_printed += 1
        else:
            # First ordered statistic of the record: for a single item,
            # items_base is empty and the confidence equals the support.
            stat = result.ordered_statistics[0]
            print(f" Association Rule 1134: {result.items} -> {stat.items_base}")
            print(f" Confidence: {stat.confidence:.2f}, Support: {result.support:.2f}")
            rows_printed += 1

3. Display the association rules

[85]: data = pd.read_csv("Shop2.csv")


min_support = 0.08
display_candidate_and_frequency_sets(data, min_support)

Candidate Sets and Frequent Itemsets L-sets 1134:


Iteration 1:
Association Rule 1134: frozenset({'Bread'}) -> frozenset()

Confidence: 0.24, Support: 0.24
Iteration 2:
Association Rule 1134: frozenset({'Butter'}) -> frozenset()
Confidence: 0.56, Support: 0.56
Iteration 3:
Association Rule 1134: frozenset({'Cheese'}) -> frozenset()
Confidence: 0.44, Support: 0.44
Iteration 4:
Association Rule 1134: frozenset({'Coffee Powder'}) -> frozenset()
Confidence: 0.64, Support: 0.64
Iteration 5:
Association Rule 1134: frozenset({'Ghee'}) -> frozenset()
Confidence: 0.44, Support: 0.44

[70]: transactions = [row['Item'].split(',') for _, row in data.iterrows()]

[71]: min_support = 0.08
min_confidence = 0.5

[72]: results = apriori(transactions, min_support=min_support,
                  min_confidence=min_confidence)

[ ]: top_5_results = sorted(results,
                       key=lambda x: x.ordered_statistics[0].confidence,
                       reverse=True)[:5]

4. Find all the rules of these subsets that have higher confidence value

[86]: print("\nTop 5 Association Rules based on Confidence 1134:")


for result in top_5_results:
head = list(result.items)[0]
print(f" {head} -> {result.ordered_statistics[0].items_base}")
print(f" Confidence: {result.ordered_statistics[0].confidence:.2f},␣
↪Support: {result.support:.2f}")

Top 5 Association Rules based on Confidence 1134:


Bread -> frozenset({'Bread', 'Cheese'})
Confidence: 1.00, Support: 0.08
Ghee -> frozenset({'Ghee', 'Bread'})
Confidence: 1.00, Support: 0.08
Bread -> frozenset({'Bread', 'Lassi'})
Confidence: 1.00, Support: 0.08
Ghee -> frozenset({'Ghee', 'Bread'})
Confidence: 1.00, Support: 0.08
Ghee -> frozenset({'Ghee', 'Bread'})
Confidence: 1.00, Support: 0.08
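A rule is normally read as antecedent -> consequent; in apyori these correspond to the items_base and items_add fields of each ordered statistic. The sketch below shows, in plain Python, how every rule of a frequent itemset can be enumerated and ranked by confidence. It is an illustration under stated assumptions, not the code that produced the output above: the function rules_from_itemset and the toy transactions are hypothetical.

```python
from itertools import combinations

# Hypothetical toy transactions for illustration (not the contents of Shop2.csv).
transactions = [
    {"Butter", "Cheese", "Lassi"},
    {"Butter", "Cheese"},
    {"Butter", "Ghee"},
    {"Coffee Powder", "Ghee"},
    {"Butter", "Cheese", "Ghee"},
]


def rules_from_itemset(itemset, transactions, min_confidence):
    """Return (antecedent, consequent, confidence) tuples, highest confidence first."""
    itemset = frozenset(itemset)
    count_all = sum(itemset <= t for t in transactions)
    rules = []
    # Every non-empty proper subset of the itemset can act as an antecedent.
    for r in range(1, len(itemset)):
        for combo in combinations(sorted(itemset), r):
            antecedent = frozenset(combo)
            count_a = sum(antecedent <= t for t in transactions)
            if count_a == 0:
                continue  # antecedent never occurs, so confidence is undefined
            conf = count_all / count_a
            if conf >= min_confidence:
                rules.append((antecedent, itemset - antecedent, conf))
    return sorted(rules, key=lambda rule: rule[2], reverse=True)


for antecedent, consequent, conf in rules_from_itemset(
        {"Butter", "Cheese"}, transactions, min_confidence=0.5):
    print(f"{sorted(antecedent)} -> {sorted(consequent)}  Confidence: {conf:.2f}")
```

Printing the antecedent and consequent as disjoint sets keeps each rule readable in the usual "if antecedent then consequent" sense.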
Result: The top 5 association rules meeting the specified support and confidence thresholds on
the given dataset have been printed and the output is verified.
