DATA WAREHOUSE AND MINING
LECTURE 8: ASSOCIATION RULE MINING
INTRODUCTION
◎ Association Rule Mining (ARM) is a key data mining technique for uncovering
hidden patterns and relationships in large datasets. It is typically applied
when we want to identify interesting and useful relationships between
variables in a dataset.
◎ Goal of ARM: To discover rules that indicate associations, for example, "if a
customer buys item A, they are likely to buy item B."
MARKET BASKET ANALYSIS
◎ Market Basket Analysis (MBA) is a primary application of Association Rule
Mining. It analyzes transaction data to find associations between different
products that customers buy together.
◎ Example: In a retail environment, customers who buy milk often buy bread as
well. These associations help improve product placement and marketing
strategies.
◎ Purpose of MBA:
◉ Improve cross-selling (suggesting related products).
◉ Help in store layout decisions (placing related items close to each other).
◉ Build recommendation engines for online retail.
◎ Example in Action: An online bookstore might recommend a title with "You may
also like..." based on a customer's previous purchases; such suggestions are
driven by MBA techniques.
APRIORI ALGORITHM AND FREQUENT ITEMSETS
◎ The Apriori Algorithm is one of the most popular algorithms used for
Association Rule Mining. It finds frequent itemsets (itemsets whose support
meets a minimum threshold) level by level: it first finds frequent single
items, then extends them to larger itemsets, pruning candidates with the
property that every subset of a frequent itemset must itself be frequent.
◎ Example: In a dataset of 100 transactions where the itemset {bread, butter}
appears in 60 of them, its support is 60%; with a minimum support threshold of
50%, the itemset is considered frequent.
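◎ As a minimal illustration of support counting, the Python sketch below checks an itemset against a minimum support threshold; the five toy transactions and the 50% threshold are assumed for this example only, not taken from any real dataset.

```python
# Minimal sketch: counting support for an itemset (assumed toy data).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

min_support = 0.5  # assumed threshold, mirroring the 50% example above

s = support({"bread", "butter"}, transactions)
print(s)                 # 0.6 -> {bread, butter} appears in 3 of 5 transactions
print(s >= min_support)  # True -> the itemset counts as frequent
```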
GENERATING ASSOCIATION RULES
◎ Once we have frequent itemsets, we can generate association rules. These
are rules of the form {A} → {B}, meaning "if A happens, B is likely to happen."
◎ Confidence and Lift are used to evaluate the strength of these rules.
◎ Example Rule: In retail, the rule {bread} → {butter} suggests that customers
who buy bread are likely to buy butter. The strength of this rule is determined
by its confidence and lift.
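◎ The sketch below derives both rule directions from the frequent itemset {bread, butter} and scores them with confidence and lift; the support values are assumed purely for illustration.

```python
# Minimal sketch: derive and score rules from the frequent itemset {bread, butter}.
# The support values below are assumed for illustration only.
support = {
    frozenset({"bread"}): 0.60,
    frozenset({"butter"}): 0.40,
    frozenset({"bread", "butter"}): 0.30,
}

def confidence(antecedent, consequent):
    # P(consequent | antecedent) = Support(A ∪ B) / Support(A)
    return support[antecedent | consequent] / support[antecedent]

def lift(antecedent, consequent):
    # Confidence divided by the consequent's baseline support.
    return confidence(antecedent, consequent) / support[consequent]

# Both rule directions that can be generated from {bread, butter}.
for a, b in [({"bread"}, {"butter"}), ({"butter"}, {"bread"})]:
    a, b = frozenset(a), frozenset(b)
    print(sorted(a), "->", sorted(b),
          "confidence:", round(confidence(a, b), 2),
          "lift:", round(lift(a, b), 2))
# ['bread'] -> ['butter'] confidence: 0.5  lift: 1.25
# ['butter'] -> ['bread'] confidence: 0.75 lift: 1.25
```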
MEASURES OF RULE INTERESTINGNESS
◎ To determine whether an association rule is interesting, we need to evaluate
it using three key metrics:
◉ Support: The frequency or probability of an itemset occurring in the
dataset.
• Formula: Support(A) = P(A), where A is an itemset.
• Example: If 100 transactions are recorded, and itemset {bread,
butter} appears in 30 transactions, its support is 30%.
◉ Confidence: How often the rule's consequent appears in transactions that
contain its antecedent.
• Formula: Confidence(A → B) = P(B | A) = Support(A ∪ B) / Support(A).
◉ Lift: How much more likely B is given A, compared with B's baseline
frequency.
• Formula: Lift(A → B) = Confidence(A → B) / Support(B); a lift above 1
indicates a positive association.
◎ Let’s look at how these metrics work together to help evaluate rules.
◎ Support: How frequently an itemset occurs across transactions.
◎ Confidence: How likely it is that item B is bought given that item A was
purchased.
◎ Steps Taken:
◉ Step 1: Identify frequent itemsets like {bread}, {butter}, {bread, butter}.
◉ Step 2: Generate rules like {bread} → {butter} and calculate support,
confidence, and lift.
◉ Step 3: Interpret the results: a lift of 1.5 means customers who buy bread
are 50% more likely to buy butter than average (see the worked sketch after
this slide).
◎ Outcome: The store places bread and butter close to each other to increase
combined sales.
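◎ As a check on the "50% more likely" interpretation in Step 3, here is a small worked sketch; the 200-transaction total and the individual counts are assumed values chosen so the lift comes out to 1.5, not real store data.

```python
# Worked check with assumed counts (numbers invented so the lift works out to 1.5).
n_total = 200
n_bread = 80          # transactions containing bread
n_butter = 60         # transactions containing butter
n_both = 36           # transactions containing both bread and butter

support_both = n_both / n_total                   # 0.18
confidence = n_both / n_bread                     # 0.45 = P(butter | bread)
lift = (n_both * n_total) / (n_bread * n_butter)  # confidence / P(butter) = 1.5

# A lift of 1.5 means bread buyers purchase butter at 1.5x the baseline rate,
# i.e. they are 50% more likely to buy butter than an average customer.
print(support_both, confidence, lift)             # 0.18 0.45 1.5
```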
APRIORI ALGORITHM STEP-BY-STEP EXAMPLE
◎ We walk through a simple dataset to show how the Apriori algorithm works.
◎ Step 1: Count individual item frequencies and keep the items that meet the
minimum support.
◎ Step 2: Generate candidate 2-itemsets from the frequent 1-itemsets and keep
the frequent ones.
◎ Step 3: Repeat the process for 3-itemsets, 4-itemsets, etc., until no further
frequent itemsets are found.
◎ Step 4: Generate association rules from these frequent itemsets and evaluate
them.
◎ Example Dataset: Transaction data for 5 items in a supermarket.
◎ Practical Exercise: Manually calculate support and confidence for a couple of
simple rules (a code sketch of the full process follows below).
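◎ Below is a compact sketch of the level-wise Apriori loop over a small assumed dataset (five items, seven transactions); the item names, baskets, and threshold are invented for the exercise, not real supermarket data.

```python
# Level-wise Apriori sketch over an assumed 5-item, 7-transaction dataset.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "milk", "jam"},
]
min_support = 3 / 7  # assumed threshold: appear in at least 3 of the 7 baskets

def support(itemset):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

# Step 1: count single items and keep the frequent 1-itemsets.
items = {item for t in transactions for item in t}
levels = [{frozenset({i}) for i in items if support(frozenset({i})) >= min_support}]

# Steps 2-3: join frequent (k-1)-itemsets into k-itemset candidates, keep the
# frequent ones, and repeat until no new frequent itemsets are found.
k = 2
while levels[-1]:
    candidates = {a | b for a in levels[-1] for b in levels[-1] if len(a | b) == k}
    levels.append({c for c in candidates if support(c) >= min_support})
    k += 1

# Step 4 would generate rules from these frequent itemsets (see earlier sketches).
for size, level in enumerate(levels, start=1):
    for itemset in sorted(level, key=sorted):
        print(size, sorted(itemset), round(support(itemset), 2))
```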
APPLICATIONS OF ASSOCIATION RULE MINING
◎ Fraud Detection: Banks and financial institutions use association rule mining
to detect unusual combinations of transactions that might indicate fraud.
◎ Retail Recommendations: "People who bought a digital camera also bought a
camera bag" is a typical rule used to drive cross-selling suggestions.
ADVANTAGES AND LIMITATIONS
◎ Advantages:
◉ Automatic Pattern Discovery: No need for a human-specified hypothesis;
patterns are found directly from the data.
◉ Actionable Insights: The discovered patterns are easy to interpret and can guide
decisions.
◉ Scalable: With appropriate support thresholds, it can be applied to large
transactional datasets.
◎ Limitations:
◉ High Computational Cost: Candidate generation and repeated dataset scans
become expensive, especially with large datasets and low support thresholds.
◉ Too Many Rules: Can generate a lot of rules, not all of which are useful.
◉ Threshold Setting: Deciding on the right support and confidence
thresholds can be challenging.
CONCLUSION
◎ Association Rule Mining, particularly through the Apriori algorithm, is a
crucial tool for discovering valuable patterns in large datasets.
◎ The metrics support, confidence, and lift are essential for evaluating the
relevance and strength of these patterns.
◎ Despite its challenges, ARM provides actionable insights that drive better
decision-making.