Tutorial 04

The document discusses association rule mining. It defines an association rule as predicting the occurrence of an item based on other items in a transaction. Support and confidence are the key metrics for evaluating rules, with support measuring the fraction of transactions containing both items and confidence measuring how often one item appears given the other. An example applies the Apriori algorithm to sample transaction data to find all frequent itemsets with support above 0.2, then lists the association rules generated from the most frequent itemset along with their support and confidence values.

Uploaded by

Nehal Patodi

Data Mining – Association Rules

• Review Questions
  ◦ Question 1: Data Mining and Metrics
• Algorithm Questions
  ◦ Question 2: Applying Apriori Algorithm
  ◦ Question 3: Finding Association Rules

• What is an Association Rule?
Given a set of records (transactions), each of which
contains some number of items from a given collection,
an association rule is a dependency rule that predicts
the occurrence of an item based on the occurrences of
other items in the transaction. For example, if every
transaction that contains Coke also contains Milk,
then the rule {Coke} -> {Milk} holds (however, not
necessarily the other way around, since not every
transaction that contains Milk also contains Coke).

• What are the metrics for evaluating
association rules?
The evaluation metrics for an association rule
X -> Y are “Support” (s) and “Confidence” (c).
Support is the fraction of transactions that contain
both X and Y. Confidence measures how often the
items in Y appear in transactions that contain X.

For example, given the following table, these are the
support and confidence values:

TID   Items
1     Bread, Milk
2     Bread, Diaper, Beer, Eggs
3     Milk, Diaper, Beer, Coke
4     Bread, Milk, Diaper, Beer
5     Bread, Milk, Diaper, Coke

Example Association Rule: {Milk, Diaper} => {Beer}

s = count(Milk, Diaper, Beer) / Total Transactions
  = 2/5
  = 0.4
c = count(Milk, Diaper, Beer) / count(Milk, Diaper)
  = 2/3
  = 0.67
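
The two calculations above can be reproduced with a minimal Python sketch. The helper names `support_count`, `support`, and `confidence` are illustrative choices, not part of the tutorial; the transactions are the five rows of the example table.

```python
# Transactions from the example table, one set of items per TID.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return support_count(set(x) | set(y), transactions) / len(transactions)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    both = support_count(set(x) | set(y), transactions)
    return both / support_count(x, transactions)


s = support({"Milk", "Diaper"}, {"Beer"}, transactions)
c = confidence({"Milk", "Diaper"}, {"Beer"}, transactions)
print(f"s = {s:.2f}, c = {c:.2f}")  # s = 0.40, c = 0.67
```

Representing each transaction as a set keeps the "contains every item" check to a single subset comparison.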

• Apply the Apriori algorithm to find all itemsets with
support >= 0.2 from the following data:

Transaction   Items in Transaction
1             Milk, Bread, Eggs
2             Milk, Juice
3             Juice, Butter
4             Milk, Bread, Eggs
5             Coffee, Eggs
6             Coffee
7             Coffee, Juice
8             Milk, Bread, Cookies, Eggs
9             Cookies, Butter
10            Milk, Bread

• Apriori Principle Step 1: Count the occurrences
of each single item:

Itemset   Count
Milk      5
Bread     4
Eggs      4
Juice     3
Butter    2
Coffee    3
Cookies   2

*Note: since there are 10 transactions, a support of
0.2 means an itemset must appear in at least 2
transactions. Every single item therefore qualifies.
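
As a rough check of the Step 1 counts, the single-item tally can be sketched with `collections.Counter`. The variable names here are illustrative; the data is the ten-transaction table from Question 2.

```python
from collections import Counter

# The ten transactions from Question 2.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]

min_support = 0.2
n = len(transactions)

# Each transaction is a set, so an item is counted at most once per transaction.
item_counts = Counter(item for t in transactions for item in t)

# Keep only the items whose support (count / n) reaches the threshold.
frequent_items = {item: c for item, c in item_counts.items() if c / n >= min_support}

for item, c in sorted(item_counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{item:8s} {c}")
```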

• Apriori Principle Step 2: Count the occurrences of
each pair of items and keep the frequent pairs
(count >= 2). Pairs that never occur together are
not listed:

Itemset           Count   Frequent?
Milk, Bread       4       yes
Milk, Eggs        3       yes
Milk, Juice       1       no
Milk, Cookies     1       no
Bread, Eggs       3       yes
Bread, Cookies    1       no
Eggs, Coffee      1       no
Eggs, Cookies     1       no
Juice, Butter     1       no
Juice, Coffee     1       no
Butter, Cookies   1       no

• Apriori Principle Step 3: Count the occurrences of
each set of 3 items built from the frequent pairs:

Itemset             Count
Milk, Bread, Eggs   3

Therefore, the largest frequent itemset is
{Milk, Bread, Eggs}. The frequent itemsets overall are
the seven single items, the pairs {Milk, Bread},
{Milk, Eggs} and {Bread, Eggs}, and {Milk, Bread, Eggs}.
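
The three steps can be put together as a level-wise loop. The following is a minimal sketch of Apriori under simple assumptions (in-memory sets, a hypothetical function name `apriori`), not an optimized or reference implementation.

```python
from itertools import combinations

# The ten transactions from Question 2.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]


def apriori(transactions, min_support):
    """Return {frozenset: count} for every itemset with support >= min_support."""
    n = len(transactions)

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {c: count(c) for c in singles}
    current = {c: cnt for c, cnt in current.items() if cnt / n >= min_support}
    frequent = dict(current)

    size = 2
    while current:
        # Join step: union pairs of frequent (size-1)-itemsets into candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, size - 1))
        }
        counted = {c: count(c) for c in candidates}
        current = {c: cnt for c, cnt in counted.items() if cnt / n >= min_support}
        frequent.update(current)
        size += 1
    return frequent


frequent = apriori(transactions, min_support=0.2)
for itemset, cnt in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), cnt)
```

The prune step is where the Apriori principle is applied: a candidate is only counted against the data if all of its smaller subsets were already found to be frequent.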

• Using the data set in Question 2 and the frequent
itemset {Milk, Bread, Eggs}, find all the association
rules with support >= 0.2 and confidence >= 0.8.

• Take “{Milk, Bread} -> {Eggs}”, where {Milk, Bread}
is X and {Eggs} is Y.
• Support = count(X and Y together) / (total number
of transactions)
• Confidence = count(X and Y together) / count(X)
• To do this, we check each way of splitting a
frequent itemset into X and Y.
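
One way to enumerate the candidate rules is to take every non-empty proper subset of the itemset as X and the remaining items as Y. The sketch below uses a hypothetical helper `candidate_rules`; note that it also produces the single-item antecedents such as {Milk} -> {Bread, Eggs}, which the worked slides do not list.

```python
from itertools import combinations


def candidate_rules(itemset):
    """Yield every (X, Y) split of `itemset` with X and Y non-empty and disjoint."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            x = frozenset(antecedent)
            yield x, frozenset(items) - x


rules = list(candidate_rules({"Milk", "Bread", "Eggs"}))
for x, y in rules:
    print(sorted(x), "->", sorted(y))
```

An itemset with k items yields 2^k - 2 candidate rules, so the 3-itemset here gives 6.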

Association Rules for {Milk, Bread, Eggs}:

{Milk, Bread} -> {Eggs}
Support = 3/10 = 0.3
Confidence = 3/4 = 0.75

{Milk, Eggs} -> {Bread}
Support = 3/10 = 0.3
Confidence = 3/3 = 1

{Eggs, Bread} -> {Milk}
Support = 3/10 = 0.3
Confidence = 3/3 = 1

Association Rules for {Milk, Bread}:

{Milk} -> {Bread}
Support = 4/10 = 0.4
Confidence = 4/5 = 0.8

{Bread} -> {Milk}
Support = 4/10 = 0.4
Confidence = 4/4 = 1

21
Association Rules for {Milk, Eggs}:
Transaction Items in

{Milk} -> {Eggs} 1


Transaction
Milk, Bread, Eggs
Support = 2 Milk, Juice
3 Juice, Butter
Confidence = 4 Milk, Bread, Eggs
5 Coffee, Eggs
6 Coffee
{Eggs} -> {Milk} 7 Coffee, Juice
8 Milk, Bread,
Support = Cookies, Eggs
9 Cookies, Butter
Confidence = 10 Milk, Bread

22
Association Rules for {Milk, Eggs}:
Transaction Items in

{Milk} -> {Eggs} 1


Transaction
Milk, Bread, Eggs
Support = 3/10 = 0.3 2 Milk, Juice
3 Juice, Butter
Confidence = 3/5 = 0.6 4 Milk, Bread, Eggs
5 Coffee, Eggs
6 Coffee
{Eggs} -> {Milk} 7 Coffee, Juice
8 Milk, Bread,
Support = Cookies, Eggs
9 Cookies, Butter
Confidence = 10 Milk, Bread

23
Association Rules for {Milk, Eggs}:
Transaction Items in

{Milk} -> {Eggs} 1


Transaction
Milk, Bread, Eggs
Support = 3/10 = 0.25 2 Milk, Juice
3 Juice, Butter
Confidence = 3/5 = 0.6 4 Milk, Bread, Eggs
5 Coffee, Eggs
6 Coffee
{Eggs} -> {Milk} 7 Coffee, Juice
8 Milk, Bread,
Support = 3/10 = 0.3 Cookies, Eggs
9 Cookies, Butter
Confidence = 3/4 = 0.75 10 Milk, Bread

24
Association Rules for {Bread Eggs}:
Transaction Items in

{Bread} -> {Eggs} 1


Transaction
Milk, Bread, Eggs
Support = 2 Milk, Juice
3 Juice, Butter
Confidence = 4 Milk, Bread, Eggs
5 Coffee, Eggs
6 Coffee
{Eggs} -> {Bread} 7 Coffee, Juice
8 Milk, Bread,
Support = Cookies, Eggs
9 Cookies, Butter
Confidence = 10 Milk, Bread

25
Association Rules for {Bread Eggs}:
Transaction Items in

{Bread} -> {Eggs} 1


Transaction
Milk, Bread, Eggs
Support = 3/10 = 0.3 2 Milk, Juice
3 Juice, Butter
Confidence = 3/4 = 0.75 4 Milk, Bread, Eggs
5 Coffee, Eggs
6 Coffee
{Eggs} -> {Bread} 7 Coffee, Juice
8 Milk, Bread,
Support = Cookies, Eggs
9 Cookies, Butter
Confidence = 10 Milk, Bread

26
Association Rules for {Bread Eggs}:
Transaction Items in

{Bread} -> {Eggs} 1


Transaction
Milk, Bread, Eggs
Support = 3/10 = 0.25 2 Milk, Juice
3 Juice, Butter
Confidence = 3/4 = 0.75 4 Milk, Bread, Eggs
5 Coffee, Eggs
6 Coffee
{Eggs} -> {Bread} 7 Coffee, Juice
8 Milk, Bread,
Support = 3/10 = 0.3 Cookies, Eggs
9 Cookies, Butter
Confidence = 3/4 = 0.75 10 Milk, Bread

Therefore, the only association rules that satisfy the
restrictions of support >= 0.2 and confidence >= 0.8
are:

• {Milk, Eggs} -> {Bread} (s=0.3, c=1)
• {Eggs, Bread} -> {Milk} (s=0.3, c=1)
• {Milk} -> {Bread} (s=0.4, c=0.8)
• {Bread} -> {Milk} (s=0.4, c=1)
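
The final selection can be cross-checked with a short Python sketch that scores every candidate rule from the frequent itemsets of size two or more and keeps those meeting both thresholds. The names `count` and `selected` are illustrative, and the list of frequent itemsets is taken from the Question 2 result rather than recomputed.

```python
from itertools import combinations

# The ten transactions from Question 2.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Juice"},
    {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"},
    {"Coffee", "Eggs"},
    {"Coffee"},
    {"Coffee", "Juice"},
    {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"},
    {"Milk", "Bread"},
]
n = len(transactions)


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


# Frequent itemsets with at least two items, as found in Question 2.
frequent_itemsets = [
    {"Milk", "Bread"},
    {"Milk", "Eggs"},
    {"Bread", "Eggs"},
    {"Milk", "Bread", "Eggs"},
]

selected = []
for itemset in frequent_itemsets:
    items = sorted(itemset)
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            x = set(antecedent)
            y = itemset - x
            s = count(itemset) / n          # support of X -> Y
            c = count(itemset) / count(x)   # confidence of X -> Y
            if s >= 0.2 and c >= 0.8:
                selected.append((sorted(x), sorted(y), s, round(c, 2)))

for x, y, s, c in selected:
    print(x, "->", y, f"(s={s}, c={c})")
```

Under these assumptions the sketch keeps four rules, and the single-item antecedents of {Milk, Bread, Eggs} fall below the confidence threshold.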
