Data Mining Experiment 4

The document outlines an experiment applying the Apriori algorithm for association rule mining using supermarket data in .arff format. It details the steps to generate frequent itemsets and rules based on specified support, confidence, and lift metrics, along with explanations of key concepts like support, confidence, and lift. The process includes using Weka software to analyze the dataset and visualize results.

Uploaded by

kulsooom456

Department of Computer Science and Engineering

Roll No : 160622733182
Name : Tabasum Syed Tajamul

Experiment No. 4: Apply the Apriori Algorithm

Date: 03/03/2025
Aim: Create the following supermarket data in .arff format
4(a) Apply the apriori algorithm with support = 0.2, confidence = 0.5 & generate 5
frequent itemsets and rules
4(b) Apply the apriori algorithm with support = 0.2, lift = 0.5 & generate 5 frequent
patterns and rules

(a) Apply the apriori algorithm with support = 0.2, confidence = 0.5 & generate 5 frequent
itemsets and rules

Description:
Association Rule Mining: Association Rule Mining is a data mining technique used to identify
relationships between items in large datasets. It helps uncover patterns, such as which products
are frequently bought together in a store. Key metrics include support, which measures how
often an itemset appears in transactions, confidence, which indicates the likelihood of one item
appearing when another does, and lift, which evaluates the strength of an association beyond
random chance.
For example, a supermarket may discover that 80% of customers who buy bread also purchase
butter. This insight can help businesses optimize product placement and marketing strategies.
Popular algorithms for association rule mining include Apriori, which generates frequent
itemsets iteratively, and FP-Growth, which builds a tree structure to find patterns more
efficiently.
Market Basket Analysis: Market Basket Analysis is a data mining technique used to identify
patterns in customer purchasing behavior. It helps businesses understand which products are
frequently bought together, enabling better decision-making in sales, marketing, and inventory
management. MBA uses association rule mining to discover relationships between items in
transaction data.
Frequent Itemset: An itemset is a collection of one or more items. An itemset is frequent if its
support, the proportion of transactions that contain every item in the set, meets or exceeds a
predefined minimum support threshold. In association rule mining, frequent itemsets are the
starting point from which rules are generated.
Support: The proportion of transactions that contain a particular item or itemset. It helps identify
frequently bought items.
Formula:
\[ \mathrm{Support}(X) = \frac{\text{Transactions containing } X}{\text{Total Transactions}} \]

Stanley College of Engineering and Technology for Women

Confidence: The probability that a customer who buys item X also buys item Y. It measures the
reliability of the association rule.
Formula:
\[ \mathrm{Confidence}(X \rightarrow Y) = \frac{\mathrm{Support}(X \cup Y)}{\mathrm{Support}(X)} \]
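Lift: Measures how much more often X and Y are bought together than would be expected if they
were purchased independently. A lift above 1 indicates a positive association, a lift of 1
indicates independence, and a lift below 1 indicates a negative association.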
Formula:
\[ \mathrm{Lift}(X \rightarrow Y) = \frac{\mathrm{Confidence}(X \rightarrow Y)}{\mathrm{Support}(Y)} \]
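The three metrics defined above can be checked by hand on the five transactions of Table 1. The following is a minimal Python sketch, not part of the Weka workflow used in this experiment; the function names `support`, `confidence`, and `lift` are illustrative choices, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Transactions copied from Table 1 of this report.
TRANSACTIONS = [
    {"bread", "cheese", "juice"},
    {"bread", "egg", "juice", "yogurt"},
    {"cheese", "yogurt"},
    {"bread", "cheese", "egg", "yogurt"},
    {"egg", "juice"},
]


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(x, y, transactions=TRANSACTIONS):
    """Support(X ∪ Y) / Support(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def lift(x, y, transactions=TRANSACTIONS):
    """Confidence(X → Y) / Support(Y)."""
    return confidence(x, y, transactions) / support(y, transactions)


print(support({"bread"}))                 # 3/5
print(confidence({"bread"}, {"yogurt"}))  # 2/3
print(lift({"bread"}, {"yogurt"}))        # 10/9
```

For the rule bread → yogurt, bread appears in 3 of 5 transactions and bread with yogurt in 2 of 5, which gives a confidence of 2/3 and a lift of 10/9, slightly above 1.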
Algorithm Apriori:

1) Collect the dataset: Gather transactional data where each transaction contains a set of
items.
2) Generate frequent 1-itemsets (L1): Compute support for individual items and discard
those below the minimum support threshold.
3) Generate k-itemsets iteratively:
● Use frequent (k-1)-itemsets (Lk-1) to generate candidate k-itemsets (Ck).
● Prune every candidate that has an infrequent (k-1)-subset, then compute support for the
remaining candidates in Ck.
● Retain itemsets meeting the minimum support threshold, forming Lk.
4) Repeat step 3 until no more frequent itemsets can be generated.
5) Extract association rules from frequent itemsets and evaluate their strength using
confidence, keeping those above the minimum confidence threshold.
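The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Weka's implementation: the names `apriori` and `association_rules` are hypothetical, transactions are assumed to be Python sets, and a fixed minimum support is used (Weka's Apriori instead lowers the minimum support iteratively until it finds the requested number of rules).

```python
import math
from itertools import combinations

# Transactions copied from Table 1 of this report.
TRANSACTIONS = [
    {"bread", "cheese", "juice"},
    {"bread", "egg", "juice", "yogurt"},
    {"cheese", "yogurt"},
    {"bread", "cheese", "egg", "yogurt"},
    {"egg", "juice"},
]


def apriori(transactions, min_support):
    """Return {itemset: transaction count} for every frequent itemset."""
    n = len(transactions)
    # Smallest count that satisfies the support threshold (guarding float error).
    min_count = math.ceil(min_support * n - 1e-9)

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Step 2: frequent 1-itemsets (L1).
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {s: count(s) for s in singles}
    current = {s: c for s, c in current.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:  # Step 4: stop when no frequent k-itemsets remain.
        prev = set(current)
        # Step 3a: join frequent (k-1)-itemsets into candidate k-itemsets (Ck).
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Step 3b: prune candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Step 3c: count support and keep those meeting the threshold (Lk).
        current = {c: count(c) for c in candidates}
        current = {c: v for c, v in current.items() if v >= min_count}
        frequent.update(current)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Step 5: list (antecedent, consequent, confidence) for strong rules."""
    rules = []
    for itemset, cnt in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent.
                conf = cnt / frequent[a]
                if conf >= min_confidence:
                    rules.append((a, itemset - a, conf))
    return rules


frequent = apriori(TRANSACTIONS, min_support=0.2)
for a, c, conf in association_rules(frequent, min_confidence=0.5):
    print(sorted(a), "->", sorted(c), round(conf, 2))
```

With only five transactions, a minimum support of 0.2 corresponds to a single transaction, so every itemset that occurs at least once is frequent. Storing counts rather than ratios keeps the confidence computation a simple division of two integers.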
Results:
1) Open notepad
2) Enter the dataset as follows:

Figure 1: Notepad - supermarket.arff file

Table 1: Transaction Dataset

Tid   Itemset
T1    {bread, cheese, juice}
T2    {bread, egg, juice, yogurt}
T3    {cheese, yogurt}
T4    {bread, cheese, egg, yogurt}
T5    {egg, juice}

3) Save the file in .arff format (supermarket.arff)


4) Open Weka environment, start Weka Explorer

Figure 2: Weka Environment

Figure 3: Weka Explorer

5) Open file, choose path - supermarket.arff

Figure 4: Open supermarket.arff

Figure 5: supermarket.arff

Figure 6: Visualization of all Attributes

6) To view the data, click Edit

Figure 7: Data Viewer


7) After loading the file, choose the Associate tab in the Weka Explorer window.
8) Under the Associate tab, click Choose and select the Apriori algorithm as shown below.

Figure 8: Selecting Apriori Algorithm for Association Rule Mining

Figure 9: Apriori Association Rule Mining Interface


9) Change the parameters as follows (set metricType as Confidence) and click OK

Figure 10: Weka Apriori Algorithm Configuration Window

Figure 11: Start the associator


10) The output is represented as shown below

Figure 12: Apriori Algorithm Results
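The contents of supermarket.arff appear in this report only as a screenshot (Figure 1). As a hedged illustration of what such a file could look like, the Python sketch below writes the Table 1 transactions as ARFF text. The encoding is an assumption: each item becomes a single-valued nominal attribute `{t}`, with `?` (missing) marking items that are absent, which is one common way to present market-basket data to Weka so that only purchases are mined. The function name `to_arff` is hypothetical, and the author's actual file may use a different encoding.

```python
# Transactions copied from Table 1 of this report.
TRANSACTIONS = [
    {"bread", "cheese", "juice"},
    {"bread", "egg", "juice", "yogurt"},
    {"cheese", "yogurt"},
    {"bread", "cheese", "egg", "yogurt"},
    {"egg", "juice"},
]


def to_arff(transactions, relation="supermarket"):
    """Render transactions as ARFF text, one nominal {t} attribute per item."""
    items = sorted({item for t in transactions for item in t})
    lines = [f"@relation {relation}", ""]
    # Assumed encoding: 't' means the item was bought, '?' means it was not.
    lines += [f"@attribute {item} {{t}}" for item in items]
    lines += ["", "@data"]
    for t in transactions:
        lines.append(",".join("t" if item in t else "?" for item in items))
    return "\n".join(lines) + "\n"


print(to_arff(TRANSACTIONS))
```

Writing the returned string to a file named supermarket.arff would produce a file that can be opened in step 5.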

(b) Apply the apriori algorithm with support = 0.2, lift = 0.5 & generate 5 frequent patterns and
rules
Results:
1) Open notepad
2) Enter the dataset as follows:

Figure 13: Notepad - supermarket.arff file


3) Save the file in .arff format (supermarket.arff)
4) Open Weka environment, start Weka Explorer

Figure 14: Weka Environment

Figure 15: Weka Explorer


5) Open file, choose path - supermarket.arff

Figure 16: Open supermarket.arff

Figure 17: supermarket.arff

Figure 18: Visualization of all Attributes


6) To view the data, click Edit

Figure 19: Data Viewer


7) After loading the file, choose the Associate tab in the Weka Explorer window.
8) Under the Associate tab, click Choose and select the Apriori algorithm as shown below.

Figure 20: Selecting Apriori Algorithm for Association Rule Mining

Figure 21: Apriori Association Rule Mining Interface


9) Change the parameters as follows (set metricType as Lift) and click OK

Figure 22: Weka Apriori Algorithm Configuration Window

Figure 23: Start the associator


10) The output is represented as shown below

Figure 24: Apriori Algorithm Results
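As an independent cross-check of part (b), the rules that satisfy support ≥ 0.2 and lift ≥ 0.5 on Table 1 can be enumerated directly. The Python sketch below is an assumption-laden illustration rather than a reproduction of Weka's output: the function name `rules_by_lift` is hypothetical, it brute-forces all item combinations (feasible only for a tiny dataset), and Weka may order or truncate its rules differently.

```python
from fractions import Fraction
from itertools import combinations

# Transactions copied from Table 1 of this report.
TRANSACTIONS = [
    {"bread", "cheese", "juice"},
    {"bread", "egg", "juice", "yogurt"},
    {"cheese", "yogurt"},
    {"bread", "cheese", "egg", "yogurt"},
    {"egg", "juice"},
]


def rules_by_lift(transactions, min_support, min_lift):
    """Return (antecedent, consequent, lift) tuples sorted by descending lift."""
    n = len(transactions)
    # Convert thresholds via str so that 0.2 becomes exactly 1/5.
    min_support = Fraction(str(min_support))
    min_lift = Fraction(str(min_lift))
    items = sorted({item for t in transactions for item in t})

    def sup(itemset):
        return Fraction(sum(1 for t in transactions if itemset <= t), n)

    found = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            whole = frozenset(combo)
            s = sup(whole)
            if s < min_support:
                continue
            # Split the frequent itemset into antecedent X and consequent Y.
            for r in range(1, size):
                for ante in combinations(combo, r):
                    x = frozenset(ante)
                    y = whole - x
                    # Equivalent to Confidence(X -> Y) / Support(Y).
                    value = s / (sup(x) * sup(y))
                    if value >= min_lift:
                        found.append((sorted(x), sorted(y), value))
    found.sort(key=lambda rule: rule[2], reverse=True)
    return found


# Show the five rules with the highest lift.
for x, y, value in rules_by_lift(TRANSACTIONS, 0.2, 0.5)[:5]:
    print(x, "->", y, "lift =", value)
```

Lift is symmetric in X and Y, so each rule appears together with its reverse at the same lift value. A threshold of 0.5 is permissive, since it also admits rules with lift below 1, which indicate a negative association.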
