Part 4: Mining Frequent Patterns

Data mining

INF 489

DATA MINING
Concepts and Techniques
Jiawei Han | Micheline Kamber | Jian Pei

Instructor: Dr. Mohamed H. Farrag
Course: Data Mining, Ch4: Association Analysis: Basic Concepts


Textbooks

Main textbook:
Data Mining: Concepts and Techniques (3rd ed.)
Jiawei Han, Micheline Kamber, and Jian Pei
University of Illinois at Urbana-Champaign &
Simon Fraser University

Introduction to Data Mining
2nd Edition
Tan, Steinbach, Karpatne, Kumar

Modified for Introduction to Data Mining by Dr. Mohamed H. Farrag


Chapter 4
Association Analysis: Basic Concepts
Market Basket Analysis
• Given:
  • A database of customer transactions (e.g., shopping baskets), where each
    transaction is a set of items (e.g., products)
• Find:
  • Groups of items which are frequently purchased together

(Figure: a cash receipt and the corresponding transaction, e.g. {A, B, C})


Market Basket Analysis
• Extract information on purchasing behavior
  —"IF buys beer and sausage, THEN also buys mustard with high
   probability"
• Actionable information: can suggest...
  —New store layouts and product assortments
  —Which products to put on promotion
• The MBA approach is applicable whenever a customer
  purchases multiple things in proximity
  —Credit cards
  —Services of telecommunication companies
  —Banking services
  —Medical treatments


Association Rules: Basics
• Association rule mining:
  —Finding frequent patterns, associations, correlations, or
   causal structures among sets of items or objects in
   transaction databases, relational databases, and other
   information repositories.
• Comprehensibility: Simple to understand
• Utilizability: Provide actionable information
• Efficiency: Efficient discovery algorithms exist
• Applications:
  —Market basket data analysis, cross-marketing, catalog
   design, loss-leader analysis, clustering, classification, etc.
• Given a set of transactions, find rules that will predict the
  occurrence of an item based on the occurrences of other
  items in the transaction

Market-Basket transactions:

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example of Association Rules:
{Diaper} —> {Beer},
{Milk, Bread} —> {Eggs, Coke},
{Beer, Bread} —> {Milk}

Implication means co-occurrence, not causality!
Definition: Frequent Itemset

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

• Itemset
  —A collection of one or more items
   • Example: {Milk, Bread, Diaper}
  —k-itemset
   • An itemset that contains k items
• Support count (σ)
  —Frequency of occurrence of an itemset
  —E.g. σ({Milk, Bread, Diaper}) = 2
• Support (s)
  —Fraction of transactions that contain an itemset
  —E.g. s({Milk, Bread, Diaper}) = 2/5
• Frequent Itemset
  —An itemset whose support is greater than or equal to a minsup
   threshold
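The two definitions above can be sketched directly in Python; this is an illustrative snippet (the helper names `support_count` and `support` are mine, not from the slides), using the five-transaction table:

```python
# Minimal sketch of the support-count and support definitions above.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset, transactions):
    """sigma(X): number of transactions that contain every item of X."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """s(X): fraction of transactions that contain X."""
    return support_count(itemset, transactions) / len(transactions)

print(support_count({"Milk", "Bread", "Diaper"}, transactions))  # 2
print(support({"Milk", "Bread", "Diaper"}, transactions))        # 0.4
```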
Definition: Association Rule

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

• Association Rule
  —Find all the rules of the form X —> Y, where X
   and Y are itemsets
  —With minimum confidence and support
  • Example: {Milk, Diaper} —> {Beer}
• Rule Evaluation Metrics
  —Support (s)
   • Fraction of transactions that contain
     both X and Y: P(X ∪ Y)
  —Confidence (c)
   • Measures how often items in Y
     appear in transactions that
     contain X: P(Y|X)

Example: {Milk, Diaper} —> {Beer}
  s = σ(Milk, Diaper, Beer) / |T| = 2/5 = 0.4
  c = σ(Milk, Diaper, Beer) / σ(Milk, Diaper) = 2/3 = 0.67
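A quick check of the example numbers, as an illustrative sketch (the helper `sigma` is my own name for the support count):

```python
# Computing s and c for the rule {Milk, Diaper} -> {Beer}.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def sigma(itemset):
    """Support count: transactions containing all items of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

X, Y = {"Milk", "Diaper"}, {"Beer"}
s = sigma(X | Y) / len(transactions)  # support: 2/5 = 0.4
c = sigma(X | Y) / sigma(X)           # confidence: 2/3, about 0.67
print(round(s, 2), round(c, 2))
```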


Association Rules: Basics

• Support: denotes the frequency of the rule within
  transactions.
    support(A => B [s, c]) = p(A ∪ B) = support({A,B})

• Confidence: denotes the percentage of transactions
  containing A which also contain B.
    confidence(A => B [s, c]) = p(B|A) = p(A ∪ B) / p(A)
                              = support({A,B}) / support({A})
Association Rules: Basics
• Typical representation formats for association rules:

  —diapers => beer [0.5%, 60%]

  —buys:diapers => buys:beer [0.5%, 60%]

  —"IF buys diapers, THEN buys beer in 60% of the cases. Diapers and
   beer are bought together in 0.5% of the rows in the database."

• Other representations (used in Han's book):

  —buys(x, "diapers") => buys(x, "beer") [0.5%, 60%]
  —major(x, "CS") ∧ takes(x, "DB") => grade(x, "A") [1%, 75%]
Association Rules: Basics

"IF buys diapers,
 THEN buys beer
 in 60% of the cases
 in 0.5% of the rows"

1. Antecedent, left-hand side (LHS), body
2. Consequent, right-hand side (RHS), head
3. Support, frequency ("in how big a part of the data the things on the
   left- and right-hand sides occur together")
4. Confidence, strength ("if the left-hand side occurs, how likely
   the right-hand side occurs")
Association Rule Mining Task

• Given a set of transactions T, the goal of association rule
  mining is to find all rules having
  —support ≥ minsup threshold
  —confidence ≥ minconf threshold
• Brute-force approach:
  —List all possible association rules
  —Compute the support and confidence for each rule
  —Prune rules that fail the minsup and minconf thresholds
  => Computationally prohibitive!
Association Rules: Basics
• Given: (1) database of transactions, (2) each transaction
  is a list of items bought (purchased by a customer in a
  visit)

  TID  Items bought
  100  A, B, C
  200  A, C
  300  A, D
  400  B, E, F

• Find: all rules with minimum support and confidence
• If min. support 50% and min. confidence 50%, then
  A => C [50%, 66.6%],  C => A [50%, 100%]
Mining Association Rules

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example of Rules:
{Milk, Diaper} —> {Beer}   (s=0.4, c=0.67)
{Milk, Beer} —> {Diaper}   (s=0.4, c=1.0)
{Diaper, Beer} —> {Milk}   (s=0.4, c=0.67)
{Beer} —> {Milk, Diaper}   (s=0.4, c=0.67)
{Diaper} —> {Milk, Beer}   (s=0.4, c=0.5)
{Milk} —> {Diaper, Beer}   (s=0.4, c=0.5)

Observations:
• All the above rules are binary partitions of the same itemset:
  {Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support but
  can have different confidence
• Thus, we may decouple the support and confidence requirements
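The observation that the six rules share one support but differ in confidence can be verified with a short sketch (names are illustrative, not slide code):

```python
# Enumerate every binary partition of {Milk, Diaper, Beer} as a rule.
from itertools import combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def sigma(itemset):
    return sum(1 for t in transactions if itemset <= t)

itemset = {"Milk", "Diaper", "Beer"}
rules = []
for r in range(1, len(itemset)):
    for lhs in combinations(sorted(itemset), r):
        lhs = set(lhs)
        rhs = itemset - lhs
        s = sigma(itemset) / len(transactions)   # same for every rule
        c = sigma(itemset) / sigma(lhs)          # varies with the LHS
        rules.append((tuple(sorted(lhs)), tuple(sorted(rhs)), s, round(c, 2)))

for rule in rules:
    print(rule)
```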
Mining Association Rules

• Two-step approach:
  1. Frequent Itemset Generation
     —Generate all itemsets whose support ≥ minsup
  2. Rule Generation
     —Generate high-confidence rules from each frequent itemset,
      where each rule is a binary partitioning of a frequent itemset

• Frequent itemset generation is still computationally expensive
Frequent Itemset Generation

(Figure: the lattice of all itemsets over d items)
Given d items, there are 2^d possible candidate itemsets.
Frequent Itemset Generation

• Brute-force approach:
  —Each itemset in the lattice is a candidate frequent itemset
  —Count the support of each candidate by scanning the database

  Transactions (N)               List of Candidates (M)
  TID  Items
  1    Bread, Milk
  2    Bread, Diaper, Beer, Eggs
  3    Milk, Diaper, Beer, Coke
  4    Bread, Milk, Diaper, Beer
  5    Bread, Milk, Diaper, Coke

  —Match each transaction against every candidate
  —Complexity ~ O(NMw), where w is the maximum transaction width
   => Expensive since M = 2^d
Computational Complexity

• Given d unique items:
  —Total number of itemsets = 2^d
  —Total number of possible association rules:

    R = Σ_{k=1}^{d-1} [ C(d, k) × Σ_{j=1}^{d-k} C(d-k, j) ]
      = 3^d - 2^(d+1) + 1

    If d = 6, R = 602 rules
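The closed form can be checked numerically; `rule_count` below is an illustrative helper that counts rules directly from the double sum (choose a k-item LHS, then a non-empty RHS from the remaining d-k items):

```python
# Verify R = 3^d - 2^(d+1) + 1 against the direct double sum.
from math import comb

def rule_count(d):
    return sum(comb(d, k) * sum(comb(d - k, j) for j in range(1, d - k + 1))
               for k in range(1, d))

print(rule_count(6))        # 602
print(3**6 - 2**7 + 1)      # closed form, also 602
```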
Frequent Itemset Generation Strategies

• Reduce the number of candidates (M)
  —Complete search: M = 2^d
  —Use pruning techniques to reduce M

• Reduce the number of transactions (N)
  —Reduce size of N as the size of the itemset increases
  —Used by DHP and vertical-based mining algorithms

• Reduce the number of comparisons (NM)
  —Use efficient data structures to store the candidates or
   transactions
  —No need to match every candidate against every transaction
Reducing Number of Candidates

• Apriori principle:
  —If an itemset is frequent, then all of its subsets must
   also be frequent

• The Apriori principle holds due to the following property of the
  support measure:

    ∀X,Y: (X ⊆ Y) => s(X) ≥ s(Y)

  —Support of an itemset never exceeds the support of its
   subsets
  —This is known as the anti-monotone property of support
Illustrating Apriori Principle

(Figure: itemset lattice in which the supersets of an infrequent itemset
are pruned)

Association Rule Generation
• Association rule mining is a two-step process:

STEP 1: Find the frequent itemsets: the sets of
items that have minimum support.
  —So-called Apriori trick: a subset of a frequent itemset must also be a
   frequent itemset:
   • i.e., if {A, B} is a frequent itemset, both {A} and {B} should be
     frequent itemsets
  —Iteratively find frequent itemsets with size from 1 to k (k-itemset)

STEP 2: Use the frequent itemsets to generate
association rules.
Frequent Sets with Apriori
• Join Step: C_k is generated by joining L_{k-1} with itself
• Prune Step: Any (k-1)-itemset that is not frequent cannot be a subset
  of a frequent k-itemset

Pseudo-code:
  C_k: candidate itemsets of size k; L_k: frequent itemsets of size k
  L_1 = {frequent items};
  for (k = 1; L_k != ∅; k++) do begin
      C_{k+1} = candidates generated from L_k;
      for each transaction t in database do
          increment the count of all candidates in C_{k+1}
          that are contained in t
      L_{k+1} = candidates in C_{k+1} with min_support
  end
  return ∪_k L_k;
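The pseudo-code can be turned into a small runnable sketch (my own illustrative implementation, assuming a minimum support *count* rather than a fraction):

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Level-wise search following the pseudo-code above: start from
    frequent 1-itemsets, then repeatedly join, prune, and count."""
    transactions = [frozenset(t) for t in transactions]

    def count(c):
        return sum(1 for t in transactions if c <= t)

    items = {i for t in transactions for i in t}
    L = {frozenset([i]) for i in items if count(frozenset([i])) >= min_count}
    frequent = set(L)
    k = 1
    while L:
        # Join step: unions of frequent k-itemsets that form (k+1)-itemsets
        C = {a | b for a in L for b in L if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent
        C = {c for c in C if all(frozenset(s) in L for s in combinations(c, k))}
        # Count step: keep the candidates that meet the support threshold
        L = {c for c in C if count(c) >= min_count}
        frequent |= L
        k += 1
    return frequent

# The four-transaction database used in the worked example later in the deck:
F = apriori([{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}], min_count=2)
print(sorted(tuple(sorted(s)) for s in F))
```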
Apriori Algorithm
—F_k: frequent k-itemsets
—L_k: candidate k-itemsets

• Algorithm
  —Let k = 1
  —Generate F_1 = {frequent 1-itemsets}
  —Repeat until F_k is empty
   • Candidate Generation: Generate L_{k+1} from F_k
   • Candidate Pruning: Prune candidate itemsets in L_{k+1} containing
     subsets of length k that are infrequent
   • Support Counting: Count the support of each candidate in L_{k+1} by
     scanning the DB
   • Candidate Elimination: Eliminate candidates in L_{k+1} that are
     infrequent, leaving only those that are frequent => F_{k+1}
Apriori Candidate Generation
• The Apriori principle:
  Any subset of a frequent itemset must be frequent
• L3 = {abc, abd, acd, ace, bcd}
• Self-joining: L3 * L3
  —abcd from abc and abd
  —acde from acd and ace
• Pruning:
  —acde is removed because ade is not in L3
• C4 = {abcd}
Candidate Generation: F_{k-1} x F_{k-1} Method

• Merge two frequent (k-1)-itemsets if their first (k-2) items are identical

• F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE}
  —Merge(ABC, ABD) = ABCD
  —Merge(ABC, ABE) = ABCE
  —Merge(ABD, ABE) = ABDE

  —Do not merge(ABD, ACD) because they share only a prefix of length 1
   instead of length 2
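The merge condition can be sketched as follows (illustrative code; itemsets are kept as sorted tuples, so "first k-2 items identical" is just the tuple minus its last element):

```python
# F_{k-1} x F_{k-1} join: merge two (k-1)-itemsets sharing all but the
# last item; for sorted tuples the new item is the second tuple's last.
F3 = [("A", "B", "C"), ("A", "B", "D"), ("A", "B", "E"), ("A", "C", "D"),
      ("B", "C", "D"), ("B", "D", "E"), ("C", "D", "E")]

def merge_step(Fk):
    """Join lexicographically sorted (k-1)-itemsets into k-candidates."""
    out = []
    for i in range(len(Fk)):
        for j in range(i + 1, len(Fk)):
            a, b = Fk[i], Fk[j]
            if a[:-1] == b[:-1]:
                out.append(a + (b[-1],))
    return out

print(merge_step(F3))   # ABCD, ABCE, ABDE as on the slide
```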
Candidate Pruning

• Let F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE} be the set of frequent 3-
  itemsets

• L4 = {ABCD, ABCE, ABDE} is the set of candidate 4-itemsets generated
  (from the previous slide)

• Candidate pruning
  —Prune ABCE because ACE and BCE are infrequent
  —Prune ABDE because ADE is infrequent

• After candidate pruning: L4 = {ABCD}
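The pruning step, as an illustrative sketch on the same F3 and candidate set: a candidate 4-itemset survives only if all of its 3-item subsets are frequent.

```python
# Subset-based candidate pruning for the example above.
from itertools import combinations

F3 = {("A", "B", "C"), ("A", "B", "D"), ("A", "B", "E"), ("A", "C", "D"),
      ("B", "C", "D"), ("B", "D", "E"), ("C", "D", "E")}
candidates = [("A", "B", "C", "D"), ("A", "B", "C", "E"), ("A", "B", "D", "E")]

def prune(candidates, F_prev):
    """Keep candidates whose every (k-1)-subset is in F_prev."""
    k = len(candidates[0]) - 1
    return [c for c in candidates
            if all(s in F_prev for s in combinations(c, k))]

print(prune(candidates, F3))   # only ABCD survives
```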
Alternate F_{k-1} x F_{k-1} Method

• Merge two frequent (k-1)-itemsets if the last (k-2) items of the first one
  are identical to the first (k-2) items of the second.

• F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE}
  —Merge(ABC, BCD) = ABCD
  —Merge(ABD, BDE) = ABDE
  —Merge(ACD, CDE) = ACDE
  —Merge(BCD, CDE) = BCDE
Candidate Pruning for the Alternate F_{k-1} x F_{k-1} Method

• Let F3 = {ABC, ABD, ABE, ACD, BCD, BDE, CDE} be the set of frequent 3-
  itemsets

• L4 = {ABCD, ABDE, ACDE, BCDE} is the set of candidate 4-itemsets
  generated (from the previous slide)
• Candidate pruning
  —Prune ABDE because ADE is infrequent
  —Prune ACDE because ACE and ADE are infrequent
  —Prune BCDE because BCE is infrequent
• After candidate pruning: L4 = {ABCD}
Rule Generation

• Given a frequent itemset L, find all non-empty subsets f ⊂ L such that
  f —> L − f satisfies the minimum confidence requirement
  —If {A,B,C,D} is a frequent itemset, candidate rules:
    ABC —> D,  ABD —> C,  ACD —> B,  BCD —> A,
    A —> BCD,  B —> ACD,  C —> ABD,  D —> ABC,
    AB —> CD,  AC —> BD,  AD —> BC,  BC —> AD,
    BD —> AC,  CD —> AB

• If |L| = k, then there are 2^k - 2 candidate association rules (ignoring
  L —> ∅ and ∅ —> L)
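Enumerating the candidate rules confirms the 2^k - 2 count for k = 4 (illustrative sketch):

```python
# Every non-empty proper subset of L is a candidate LHS; the RHS is the rest.
from itertools import combinations

L = ("A", "B", "C", "D")
rules = [(lhs, tuple(i for i in L if i not in lhs))
         for r in range(1, len(L))          # skips the empty LHS and empty RHS
         for lhs in combinations(L, r)]

print(len(rules))   # 2**4 - 2 = 14
print(rules[0])     # the first rule: A -> BCD
```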
Rule Generation

• In general, confidence does not have an anti-monotone property
    c(ABC —> D) can be larger or smaller than c(AB —> D)

• But the confidence of rules generated from the same itemset has an anti-
  monotone property
  —E.g., suppose {A,B,C,D} is a frequent 4-itemset:

    c(ABC —> D) ≥ c(AB —> CD) ≥ c(A —> BCD)

  —Confidence is anti-monotone w.r.t. the number of items on the RHS of the rule
Rule Generation for the Apriori Algorithm

(Figure: lattice of rules; once a rule has low confidence, the rules
below it in the lattice are pruned)


Apriori Example (1/6)

Database D:
TID  Items
100  1 3 4
200  2 3 5
300  1 2 3 5
400  2 5

Scan D to get C1 (candidate 1-itemsets with counts), then L1
(frequent ones, min support count 2):

C1: {1}:2  {2}:3  {3}:3  {4}:1  {5}:3
L1: {1}:2  {2}:3  {3}:3  {5}:3
C2 (joined from L1) with counts, then L2:

C2: {1 2}:1  {1 3}:2  {1 5}:1  {2 3}:2  {2 5}:3  {3 5}:2
L2: {1 3}:2  {2 3}:2  {2 5}:3  {3 5}:2
C3 and L3:

C3: {2 3 5}
L3: {2 3 5}:2
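The counts in this trace can be re-checked against the database (the minimum support count of 2 is inferred from {4} being dropped):

```python
# Recomputing the C1/C2/C3 support counts for the example database.
db = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

def count(itemset):
    return sum(1 for t in db if itemset <= t)

c1 = {i: count({i}) for i in (1, 2, 3, 4, 5)}
print(c1)                              # {1: 2, 2: 3, 3: 3, 4: 1, 5: 3}
print(count({1, 3}), count({1, 2}))    # 2 1, so {1 2} is dropped from C2
print(count({2, 3, 5}))                # 2
```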
Search Space of {1 2 3 4 5}

(Figure: the lattice of all itemsets over items 1-5, from the single
items up to {1 2 3 4 5})
Apriori Trick on Level 1

(Figure: the same lattice, with the supersets of infrequent 1-itemsets
pruned)
Apriori Trick on Level 2

(Figure: the same lattice, with the supersets of infrequent 2-itemsets
pruned)


Review Questions


Question 1: Applying the Apriori Algorithm

Apply the Apriori algorithm to find all itemsets with
support >= 0.2 from the following data:

Transaction  Items in Transaction
1            Milk, Bread, Eggs
2            Milk, Juice
3            Juice, Butter
4            Milk, Bread, Eggs
5            Coffee, Eggs
6            Coffee
7            Coffee, Juice
8            Milk, Bread, Cookies, Eggs
9            Cookies, Butter
10           Milk, Bread
Question 1: Applying the Apriori Algorithm

Apriori Step 1: Count up the occurrences of each 1-itemset:

Itemset   Count
Milk      5
Bread     4
Eggs      4
Juice     3
Butter    2
Coffee    3
Cookies   2

*Note: since there are 10 transactions, 0.2 support means an itemset
must appear at least twice in the list.
Question 1: Applying the Apriori Algorithm

Apriori Step 2: Count the candidate 2-itemsets; the frequent ones
(count >= 2) are:

Itemset        Count
Milk, Bread    4
Milk, Eggs     3
Bread, Eggs    3

All other candidate pairs occur at most once and are pruned.
Question 1: Applying the Apriori Algorithm

Apriori Step 3: Count the candidate 3-itemsets; the frequent ones are:

Itemset              Count
Milk, Bread, Eggs    3

Therefore, the largest frequent itemset is {Milk, Bread, Eggs}.
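The result can be double-checked by brute force; this illustrative sketch counts every candidate itemset rather than pruning as Apriori would, which is affordable with only 7 distinct items:

```python
# Brute-force frequent-itemset check for Question 1 (support >= 0.2).
from itertools import combinations

data = [
    {"Milk", "Bread", "Eggs"}, {"Milk", "Juice"}, {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"}, {"Coffee", "Eggs"}, {"Coffee"},
    {"Coffee", "Juice"}, {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"}, {"Milk", "Bread"},
]

items = sorted({i for t in data for i in t})
frequent = {}
for k in range(1, len(items) + 1):
    for cand in combinations(items, k):
        n = sum(1 for t in data if set(cand) <= t)
        if n / len(data) >= 0.2:
            frequent[cand] = n

print(frequent[("Bread", "Eggs", "Milk")])   # 3
print(len(frequent))                         # 7 singletons + 3 pairs + 1 triple
```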
Question 2: Applying the Apriori Algorithm

Using the data set in Question 1 (and its frequent itemset
{Milk, Bread, Eggs}), find all the association rules with support
>= 0.2 and confidence >= 0.8.

Example: "{Milk, Bread} —> {Eggs}", where {Milk, Bread} is X and
{Eggs} is Y.
Support = |{transactions containing X and Y}| / |transactions|
Confidence = |{transactions containing X and Y}| / |{transactions containing X}|
To do this, we check each candidate partition of the frequent itemsets
into association rules.
Question 2: Applying the Apriori Algorithm

Association Rules for {Milk, Bread, Eggs}:

{Milk, Bread} —> {Eggs}
  Support = 3/10 = 0.3
  Confidence = 3/4 = 0.75

{Milk, Eggs} —> {Bread}
  Support = 3/10 = 0.3
  Confidence = 3/3 = 1

{Eggs, Bread} —> {Milk}
  Support = 3/10 = 0.3
  Confidence = 3/3 = 1
Question 2: Applying the Apriori Algorithm

Association Rules for {Milk, Bread}:

{Milk} —> {Bread}
  Support = 4/10 = 0.4
  Confidence = 4/5 = 0.8

{Bread} —> {Milk}
  Support = 4/10 = 0.4
  Confidence = 4/4 = 1
Question 2: Applying the Apriori Algorithm

Association Rules for {Milk, Eggs}:

{Milk} —> {Eggs}
  Support = 3/10 = 0.3
  Confidence = 3/5 = 0.6

{Eggs} —> {Milk}
  Support = 3/10 = 0.3
  Confidence = 3/4 = 0.75
Question 2: Applying the Apriori Algorithm

Association Rules for {Bread, Eggs}:

{Bread} —> {Eggs}
  Support = 3/10 = 0.3
  Confidence = 3/4 = 0.75

{Eggs} —> {Bread}
  Support = 3/10 = 0.3
  Confidence = 3/4 = 0.75
Question 2: Applying the Apriori Algorithm

Therefore, the only association rules that satisfy the
restriction of having support >= 0.2 and confidence
>= 0.8 are:

{Milk, Eggs} —> {Bread}  (s=0.3, c=1)
{Eggs, Bread} —> {Milk}  (s=0.3, c=1)
{Milk} —> {Bread}        (s=0.4, c=0.8)
{Bread} —> {Milk}        (s=0.4, c=1)
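The whole exercise can be recomputed end-to-end with an illustrative sketch: it enumerates every frequent itemset of size 2 or more and keeps the rules whose confidence reaches 0.8.

```python
# Recomputing Question 2's answer from the raw transactions.
from itertools import combinations

data = [
    {"Milk", "Bread", "Eggs"}, {"Milk", "Juice"}, {"Juice", "Butter"},
    {"Milk", "Bread", "Eggs"}, {"Coffee", "Eggs"}, {"Coffee"},
    {"Coffee", "Juice"}, {"Milk", "Bread", "Cookies", "Eggs"},
    {"Cookies", "Butter"}, {"Milk", "Bread"},
]
n = len(data)

def sigma(itemset):
    return sum(1 for t in data if itemset <= t)

items = sorted({i for t in data for i in t})
rules = []
for k in range(2, len(items) + 1):
    for cand in combinations(items, k):
        iset = frozenset(cand)
        if sigma(iset) / n < 0.2:
            continue                      # itemset not frequent: skip
        for r in range(1, k):
            for lhs in combinations(cand, r):
                lhs = frozenset(lhs)
                s = sigma(iset) / n
                c = sigma(iset) / sigma(lhs)
                if c >= 0.8:
                    rules.append((tuple(sorted(lhs)),
                                  tuple(sorted(iset - lhs)), s, c))

for rule in sorted(rules):
    print(rule)   # exactly the four rules listed above
```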
