Data Mining Activity 2-1

Uploaded by

Suraj Kumar

Data Visualization Of Given Data Set

(Item Dataset Using Apriori Algorithm)

Name: Rishab Ashok Bhadoriya

Class: TYBCA(sci)
Roll No: 81

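1. Load and Preprocess the Data: The dataset will be prepared in a format suitable for the Apriori
algorithm. This typically starts from a list of transactions, which is then one-hot encoded into a
boolean table with one row per transaction and one column per item.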

2. Apply the Apriori Algorithm: We will use the apriori function from the mlxtend library to extract
frequent itemsets.

3. Generate Association Rules: From the frequent itemsets, we generate association rules together
with their support, confidence, and lift.

4. Visualize the Results: We'll use libraries like matplotlib and seaborn to create visualizations such as
bar plots for frequent itemsets and scatter plots for the association rules.
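
As a rough illustration of the preprocessing in step 1, the sketch below turns a list of transactions into one boolean row per transaction. The sample baskets and the helper name one_hot_encode are made up for this example and are not part of the assignment; mlxtend also provides a TransactionEncoder class for the same job.

```python
# Hypothetical sample transactions; replace with your own data.
transactions = [
    ["Item1", "Item2"],
    ["Item2", "Item3"],
    ["Item2"],
]


def one_hot_encode(transactions):
    """Return (items, rows): sorted item names and one boolean row per transaction."""
    items = sorted({item for basket in transactions for item in basket})
    rows = [[item in basket for item in items] for basket in transactions]
    return items, rows


items, rows = one_hot_encode(transactions)
print(items)
for row in rows:
    print(row)
```

Passing the result to pd.DataFrame(rows, columns=items) should give a table in the shape the apriori function expects.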

Python Code Example

# Install required libraries if not already installed
# !pip install mlxtend

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
import matplotlib.pyplot as plt
import seaborn as sns

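# Step 1: Load dataset (example data for transactions)
# This is a simple dataset of transactions, replace with your own dataset.
# Assumed reading of the sample: each item maps to the IDs of the transactions that contain it.
data = {'Item1': [1, 2],
        'Item2': [2, 3, 4, 5],
        'Item3': [2, 3],
        'Item4': [1],
        'Item5': [1, 2, 3]}

# Build the binary matrix apriori expects: one row per transaction, one boolean column per item
transaction_ids = sorted({tid for tids in data.values() for tid in tids})
df = pd.DataFrame({item: [tid in tids for tid in transaction_ids]
                   for item, tids in data.items()},
                  index=transaction_ids)

# Step 2: Apply Apriori algorithm
# Set min_support to a value based on your needs.
# With this small sample, 0.6 keeps only single items, so no rules can be formed;
# 0.4 also keeps item pairs.
frequent_itemsets = apriori(df, min_support=0.4, use_colnames=True)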

# Step 3: Generate Association Rules
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)

# Step 4: Visualization
# Visualize the frequent itemsets as a bar plot
plt.figure(figsize=(10, 6))
# Turn each frozenset into a readable label with a consistent item order
frequent_itemsets['itemsets'] = frequent_itemsets['itemsets'].apply(lambda x: ', '.join(sorted(x)))
sns.barplot(x='support', y='itemsets',
            data=frequent_itemsets.sort_values(by='support', ascending=False))
plt.title('Frequent Itemsets')
plt.xlabel('Support')
plt.ylabel('Itemsets')
plt.show()

# Visualize the rules based on support, confidence, and lift
plt.figure(figsize=(10, 6))
sns.scatterplot(x='support', y='confidence', size='lift', hue='lift', data=rules,
                sizes=(40, 400), palette='viridis')
plt.title('Association Rules')
plt.xlabel('Support')
plt.ylabel('Confidence')
plt.legend(loc='upper right', bbox_to_anchor=(1.2, 1))
plt.show()

# Print rules for review
print(rules)

Steps Breakdown:

1. Dataset: The dataset (df) is created as a binary matrix where each row is a transaction and each
column represents an item. Replace it with your dataset.
2. Apriori Algorithm: We run the Apriori algorithm using mlxtend to get the frequent itemsets with a
minimum support threshold.

3. Association Rules: The association rules are extracted based on the frequent itemsets, and metrics
like lift, support, and confidence are calculated.

4. Visualization:

A bar plot of frequent itemsets with their support values.

A scatter plot to visualize the association rules, with support on the x-axis, confidence on the y-axis,
and the size and color representing the lift.
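
To make the three rule metrics from step 3 concrete, here is a minimal sketch that computes them by hand for a single rule. The five sample transactions and the helper names support and rule_metrics are assumptions for illustration only; the sketch uses plain Python rather than mlxtend.

```python
# Hypothetical five-transaction sample, each transaction given as a set of items.
transactions = [
    {"Item1", "Item4", "Item5"},
    {"Item1", "Item2", "Item3", "Item5"},
    {"Item2", "Item3", "Item5"},
    {"Item2"},
    {"Item2"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent), transactions)
    s_antecedent = support(antecedent, transactions)
    s_consequent = support(consequent, transactions)
    confidence = s_both / s_antecedent   # how often the rule holds when it applies
    lift = confidence / s_consequent     # confidence relative to the consequent's base rate
    return {"support": s_both, "confidence": confidence, "lift": lift}


metrics = rule_metrics({"Item3"}, {"Item2"}, transactions)
print(metrics)
```

A lift above 1 suggests the antecedent and consequent occur together more often than they would if they were independent, which is why the code above filters rules with min_threshold=1 on lift.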

Make sure to adjust the dataset and the parameters (min_support, min_threshold, etc.) to suit your
specific needs.
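
One way to get a feel for min_support is to count how many itemsets survive at different thresholds. The sketch below is a brute-force enumeration over a made-up five-transaction sample, not Apriori's pruned search and not the mlxtend implementation; the function name count_frequent_itemsets is hypothetical. It is only practical for very small item counts.

```python
from itertools import combinations

# Hypothetical five-transaction sample, each transaction given as a set of items.
transactions = [
    {"Item1", "Item4", "Item5"},
    {"Item1", "Item2", "Item3", "Item5"},
    {"Item2", "Item3", "Item5"},
    {"Item2"},
    {"Item2"},
]


def count_frequent_itemsets(transactions, min_support):
    """Count itemsets whose support is at least min_support (brute force)."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    count = 0
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            hits = sum(set(candidate) <= basket for basket in transactions)
            if hits / n >= min_support:
                count += 1
    return count


for threshold in (0.6, 0.4):
    print(threshold, count_frequent_itemsets(transactions, threshold))
```

Lowering the threshold admits more and larger itemsets, which in turn gives the rule-generation step more candidates to work with.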
