DWDM Lecture Notes U-4

Uploaded by

harshale13


UNIT-IV

Association Rule Mining:

Association rule mining is a popular and well-researched method for discovering
interesting relations between variables in large databases.
It is intended to identify strong rules discovered in databases using different measures
of interestingness.
Based on the concept of strong rules, Rakesh Agrawal et al. introduced association rules.
Problem Definition:
The problem of association rule mining is defined as:

Let $I = \{i_1, i_2, \ldots, i_n\}$ be a set of $n$ binary attributes called items.

Let $D = \{t_1, t_2, \ldots, t_m\}$ be a set of transactions called the database.

Each transaction in $D$ has a unique transaction ID and contains a subset of the items in $I$.
A rule is defined as an implication of the form $X \Rightarrow Y$,
where $X, Y \subseteq I$ and $X \cap Y = \emptyset$.
The sets of items (for short, itemsets) $X$ and $Y$ are called the antecedent (left-hand side, or LHS) and
consequent (right-hand side, or RHS) of the rule, respectively.
Example:
To illustrate the concepts, we use a small example from the supermarket domain. The set of
items is $I = \{\text{milk}, \text{bread}, \text{butter}, \text{beer}\}$ and a small database containing the items (1
codes presence and 0 absence of an item in a transaction) is shown in the table.

An example rule for the supermarket could be $\{\text{butter}, \text{bread}\} \Rightarrow \{\text{milk}\}$, meaning that if
butter and bread are bought, customers also buy milk.

Example database with 4 items and 5 transactions

Transaction ID   milk   bread   butter   beer

1                1      1       0        0
2                0      0       1        0
3                0      0       0        1
4                1      1       1        0
5                0      1       0        0

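Important concepts of Association Rule Mining:

The support $\mathrm{supp}(X)$ of an itemset $X$ is defined as the proportion of transactions in the
data set which contain the itemset. In the example database, the itemset
$\{\text{milk}, \text{bread}, \text{butter}\}$ has a support of $1/5 = 0.2$ since it occurs in 20% of
all transactions (1 out of 5 transactions).

The confidence of a rule is defined as

$$\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}.$$

For example, the rule $\{\text{butter}, \text{bread}\} \Rightarrow \{\text{milk}\}$ has a confidence of
$0.2 / 0.2 = 1.0$ in the database, which means that for 100% of the transactions
containing butter and bread the rule is correct (100% of the times a customer buys butter
and bread, milk is bought as well). Confidence can be interpreted as an estimate of the
probability $P(Y \mid X)$, the probability of finding the RHS of the rule in transactions
under the condition that these transactions also contain the LHS.

The lift of a rule is defined as

$$\mathrm{lift}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X) \times \mathrm{supp}(Y)},$$

or the ratio of the observed support to that expected if X and Y were independent. The
rule $\{\text{milk}, \text{bread}\} \Rightarrow \{\text{butter}\}$ has a lift of $0.2 / (0.4 \times 0.4) = 1.25$.

The conviction of a rule is defined as

$$\mathrm{conv}(X \Rightarrow Y) = \frac{1 - \mathrm{supp}(Y)}{1 - \mathrm{conf}(X \Rightarrow Y)}.$$

The rule $\{\text{milk}, \text{bread}\} \Rightarrow \{\text{butter}\}$ has a conviction of $(1 - 0.4) / (1 - 0.5) = 1.2$,
and can be interpreted as the ratio of the expected frequency that X occurs without Y
(that is to say, the frequency that the rule makes an incorrect prediction) if X and Y were
independent divided by the observed frequency of incorrect predictions.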
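
The four measures can be checked against the five-transaction supermarket table with a short Python sketch. The function names (`supp`, `conf`, `lift`, `conviction`) are chosen here for illustration and use the standard definitions: confidence is supp(X ∪ Y) / supp(X), lift is supp(X ∪ Y) / (supp(X) · supp(Y)), and conviction is (1 − supp(Y)) / (1 − conf(X ⇒ Y)). Exact fractions are used so the results can be compared with hand calculation.

```python
from fractions import Fraction

# The example database: one set of items per transaction (rows 1..5 of the table).
transactions = [
    {"milk", "bread"},
    {"butter"},
    {"beer"},
    {"milk", "bread", "butter"},
    {"bread"},
]

def supp(itemset):
    """Proportion of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def conf(lhs, rhs):
    """supp(LHS u RHS) / supp(LHS)."""
    return supp(set(lhs) | set(rhs)) / supp(lhs)

def lift(lhs, rhs):
    """Observed support divided by the support expected under independence."""
    return supp(set(lhs) | set(rhs)) / (supp(lhs) * supp(rhs))

def conviction(lhs, rhs):
    """(1 - supp(RHS)) / (1 - conf); undefined (None) when the rule never fails."""
    c = conf(lhs, rhs)
    if c == 1:
        return None
    return (1 - supp(rhs)) / (1 - c)

print(supp({"milk", "bread", "butter"}))        # 1/5
print(conf({"butter", "bread"}, {"milk"}))      # 1
print(lift({"milk", "bread"}, {"butter"}))      # 5/4
print(conviction({"milk", "bread"}, {"butter"}))  # 6/5
```

Note that conviction is undefined for a rule with confidence 1, such as {butter, bread} ⇒ {milk} in this table, because the denominator becomes zero; the sketch returns `None` in that case.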

Market basket analysis:

This process analyzes customer buying habits by finding associations between the different items
that customers place in their shopping baskets. The discovery of such associations can help
retailers develop marketing strategies by gaining insight into which items are frequently
purchased together by customers. For instance, if customers are buying milk, how likely are they
to also buy bread (and what kind of bread) on the same trip to the supermarket? Such information
can lead to increased sales by helping retailers do selective marketing and plan their shelf space.

Example:

If customers who purchase computers also tend to buy antivirus software at the same time, then
placing the hardware display close to the software display may help increase the sales of both
items. In an alternative strategy, placing hardware and software at opposite ends of the store may
entice customers who purchase such items to pick up other items along the way. For instance,
after deciding on an expensive computer, a customer may observe security systems for sale while
heading toward the software display to purchase antivirus software and may decide to purchase a
home security system as well. Market basket analysis can also help retailers plan which items to
put on sale at reduced prices. If customers tend to purchase computers and printers together,
then having a sale on printers may encourage the sale of printers as well as computers.

Frequent Pattern Mining:

Frequent pattern mining can be classified in various ways, based on the following criteria:

1. Based on the completeness of patterns to be mined:

We can mine the complete set of frequent itemsets, the closed frequent itemsets, and
the maximal frequent itemsets, given a minimum support threshold.
We can also mine constrained frequent itemsets, approximate frequent
itemsets, near-match frequent itemsets, top-k frequent itemsets, and so on.

2. Based on the levels of abstraction involved in the rule set:

Some methods for association rule mining can find rules at differing levels of abstraction.

For example, suppose that a set of association rules mined includes the following
rules, where X is a variable representing a customer:

buys(X, "computer") => buys(X, "HP printer") (1)

buys(X, "laptop computer") => buys(X, "HP printer") (2)

In rules (1) and (2), the items bought are referenced at different levels of abstraction (e.g.,
"computer" is a higher-level abstraction of "laptop computer").
3. Based on the number of data dimensions involved in the rule:

If the items or attributes in an association rule reference only one dimension, then it is
a single-dimensional association rule.
buys(X, "computer") => buys(X, "antivirus software")

If a rule references two or more dimensions, such as the dimensions age, income, and
buys, then it is a multidimensional association rule. The following rule is an example of a
multidimensional rule:
age(X, "30…39") ^ income(X, "42K…48K") => buys(X, "high resolution TV")

4. Based on the types of values handled in the rule:
If a rule involves associations between the presence or absence of items, it is a
Boolean association rule.
If a rule describes associations between quantitative items or attributes, then it is
a quantitative association rule.

5. Based on the kinds of rules to be mined:

Frequent pattern analysis can generate various kinds of rules and other
interesting relationships.
Association rule mining can generate a large number of rules, many of which
are redundant or do not indicate a correlation relationship among itemsets.
The discovered associations can be further analyzed to uncover statistical
correlations, leading to correlation rules.

6. Based on the kinds of patterns to be mined:

Many kinds of frequent patterns can be mined from different kinds of data sets.
Sequential pattern mining searches for frequent subsequences in a sequence data
set, where a sequence records an ordering of events.
For example, with sequential pattern mining, we can study the order in which items are
frequently purchased. For instance, customers may tend to first buy a PC, followed by a
digital camera, and then a memory card.
Structured pattern mining searches for frequent substructures in a structured data set.
Single items are the simplest form of structure.
Each element of an itemset may contain a subsequence, a subtree, and so on.
Therefore, structured pattern mining can be considered as the most general
form of frequent pattern mining.

Efficient Frequent Itemset Mining Methods:
Finding Frequent Itemsets Using Candidate Generation: The Apriori Algorithm

Apriori is a seminal algorithm proposed by R. Agrawal and R. Srikant in 1994 for mining
frequent itemsets for Boolean association rules.
The name of the algorithm is based on the fact that the algorithm uses prior knowledge of
frequent itemset properties.
Apriori employs an iterative approach known as a level-wise search, where k-itemsets are
used to explore (k+1)-itemsets.
First, the set of frequent 1-itemsets is found by scanning the database to accumulate the
count for each item, and collecting those items that satisfy minimum support. The
resulting set is denoted L1. Next, L1 is used to find L2, the set of frequent 2-itemsets,
which is used to find L3, and so on, until no more frequent k-itemsets can be found.
Finding each Lk requires one full scan of the database.
Apriori follows a two-step process consisting of a join step and a prune step.

Example:

TID List of item IDs

T100 I1, I2, I5
T200 I2, I4
T300 I2, I3
T400 I1, I2, I4
T500 I1, I3
T600 I2, I3
T700 I1, I3
T800 I1, I2, I3, I5
T900 I1, I2, I3

There are nine transactions in this database, that is, |D| = 9.

Steps:
1. In the first iteration of the algorithm, each item is a member of the set of candidate
1-itemsets, C1. The algorithm simply scans all of the transactions in order to count the number of
occurrences of each item.
2. Suppose that the minimum support count required is 2, that is, min_sup = 2. The set of
frequent 1-itemsets, L1, can then be determined. It consists of the candidate 1-itemsets
satisfying minimum support. In our example, all of the candidates in C1 satisfy minimum
support.
3. To discover the set of frequent 2-itemsets, L2, the algorithm joins L1 with itself
to generate a candidate set of 2-itemsets, C2. No candidates are removed from C2 during the
prune step because each subset of the candidates is also frequent.
4. Next, the transactions in D are scanned and the support count of each candidate itemset in C2 is
accumulated.
5. The set of frequent 2-itemsets, L2, is then determined, consisting of those candidate
2-itemsets in C2 having minimum support.
6. The set of candidate 3-itemsets, C3, is generated next. From the join step, we first get C3 = L2 x
L2 = {{I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}, {I2, I4, I5}}. Based on the
Apriori property that all subsets of a frequent itemset must also be frequent, we can determine
that the four latter candidates cannot possibly be frequent. They are therefore pruned, leaving
C3 = {{I1, I2, I3}, {I1, I2, I5}}.
7. The transactions in D are scanned in order to determine L3, consisting of
those candidate 3-itemsets in C3 having minimum support.
8. The algorithm uses L3 x L3 to generate a candidate set of 4-itemsets, C4. The join yields
{I1, I2, I3, I5}, but this itemset is pruned because its subset {I2, I3, I5} is not frequent.
C4 is therefore empty, and the algorithm terminates.
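
The level-wise search traced in the steps above can be sketched in Python on the nine-transaction database with a minimum support count of 2. This is a minimal illustration, not an optimized implementation: the function names (`count_support`, `apriori_gen`, `apriori`) are chosen here, and the join is simplified to "any two frequent (k-1)-itemsets whose union has k items", which produces the same candidates as the usual prefix-based join once the prune step has been applied.

```python
from itertools import combinations

# Transactions from the example table (TID -> set of item IDs).
D = {
    "T100": {"I1", "I2", "I5"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}
MIN_SUP = 2  # minimum support count, as in step 2

def count_support(candidates, db):
    """One full scan of the database: count the transactions containing each candidate."""
    return {c: sum(1 for t in db.values() if c <= t) for c in candidates}

def apriori_gen(prev_frequent, k):
    """Join L(k-1) with itself, then prune candidates with an infrequent (k-1)-subset."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:          # join step (simplified, see lead-in)
                continue
            if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                candidates.add(union)    # survived the prune step
    return candidates

def apriori(db, min_sup):
    """Return [L1, L2, ...], each a dict mapping a frequent itemset to its support count."""
    c1 = {frozenset([item]) for t in db.values() for item in t}
    level = {c: n for c, n in count_support(c1, db).items() if n >= min_sup}
    levels, k = [], 2
    while level:
        levels.append(level)
        counts = count_support(apriori_gen(level, k), db)
        level = {c: n for c, n in counts.items() if n >= min_sup}
        k += 1
    return levels

L = apriori(D, MIN_SUP)
for k, level in enumerate(L, start=1):
    print(f"L{k}:", sorted((sorted(c), n) for c, n in level.items()))
```

On this database the sketch stops after L3, which contains {I1, I2, I3} and {I1, I2, I5}, each with a support count of 2.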

Generating Association Rules from Frequent Itemsets:
Once the frequent itemsets from transactions in a database D have been found, it
is straightforward to generate strong association rules from them.

Example:
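A common way to generate the rules, sketched here under assumptions that the notes do not state, is to split each frequent itemset l into a nonempty proper subset s and the remainder l − s, and to keep the rule s ⇒ (l − s) when support_count(l) / support_count(s) reaches a minimum confidence threshold. The 70% threshold and the function name `rules_from` are illustrative choices; the support counts are those of the nine-transaction example database.

```python
from itertools import combinations

# Support counts of {I1, I2, I5} and its subsets in the nine-transaction example.
support_count = {
    frozenset({"I1"}): 6,
    frozenset({"I2"}): 7,
    frozenset({"I5"}): 2,
    frozenset({"I1", "I2"}): 4,
    frozenset({"I1", "I5"}): 2,
    frozenset({"I2", "I5"}): 2,
    frozenset({"I1", "I2", "I5"}): 2,
}
MIN_CONF = 0.7  # assumed threshold, for illustration only

def rules_from(itemset, min_conf):
    """Return (antecedent, consequent, confidence) for every rule meeting min_conf."""
    rules = []
    for size in range(1, len(itemset)):
        for subset in combinations(sorted(itemset), size):
            lhs = frozenset(subset)
            confidence = support_count[itemset] / support_count[lhs]
            if confidence >= min_conf:
                rhs = itemset - lhs
                rules.append((tuple(sorted(lhs)), tuple(sorted(rhs)), confidence))
    return rules

strong = rules_from(frozenset({"I1", "I2", "I5"}), MIN_CONF)
for lhs, rhs, confidence in strong:
    print(f"{set(lhs)} => {set(rhs)}  confidence = {confidence:.0%}")
```

With these counts, three of the six possible rules reach the assumed threshold: I5 ⇒ I1 ^ I2, I1 ^ I5 ⇒ I2, and I2 ^ I5 ⇒ I1, each with 100% confidence. Every rule generated this way already satisfies minimum support, because it is built from a frequent itemset.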

Mining Multilevel Association Rules:

For many applications, it is difficult to find strong associations among data items at
low or primitive levels of abstraction due to the sparsity of data at those levels.
Strong associations discovered at high levels of abstraction may represent
commonsense knowledge.
Therefore, data mining systems should provide capabilities for mining association
rules at multiple levels of abstraction, with sufficient flexibility for easy traversal
among different abstraction spaces.
Association rules generated from mining data at multiple levels of abstraction are called
multiple-level or multilevel association rules.
Multilevel association rules can be mined efficiently using concept hierarchies under a
support-confidence framework.
In general, a top-down strategy is employed, where counts are accumulated for the
calculation of frequent itemsets at each concept level, starting at concept level 1 and
working downward in the hierarchy toward the more specific concept levels, until no
more frequent itemsets can be found.

A concept hierarchy defines a sequence of mappings from a set of low-level concepts to
higher-level, more general concepts. Data can be generalized by replacing low-level
concepts within the data by their higher-level concepts, or ancestors, from a concept hierarchy.

The concept hierarchy has five levels, respectively referred to as levels 0 to 4, starting with
level 0 at the root node for all.

Here, Level 1 includes computer, software, printer & camera, and computer accessory.
Level 2 includes laptop computer, desktop computer, office software, and antivirus software.
Level 3 includes IBM desktop computer, . . . , Microsoft office software, and so on.
Level 4 is the most specific abstraction level of this hierarchy.
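
Generalizing data with a concept hierarchy can be sketched as a lookup from each concept to its parent. The fragment below covers only a few of the concepts named above, and the specific parent links (for example, "IBM desktop computer" under "desktop computer") as well as the names `PARENT`, `level_of`, and `generalize` are assumptions made for illustration.

```python
# Hypothetical fragment of the hierarchy: child concept -> parent concept.
PARENT = {
    "IBM desktop computer": "desktop computer",        # level 3 -> level 2
    "Microsoft office software": "office software",    # level 3 -> level 2
    "laptop computer": "computer",                      # level 2 -> level 1
    "desktop computer": "computer",
    "office software": "software",
    "antivirus software": "software",
    "computer": "all",                                  # level 1 -> level 0 (root)
    "software": "all",
}

def level_of(concept):
    """Number of steps from the concept up to the root node 'all' (level 0)."""
    depth = 0
    while concept != "all":
        concept = PARENT[concept]
        depth += 1
    return depth

def generalize(concept, target_level):
    """Replace a concept by its ancestor at target_level (unchanged if already that general)."""
    while level_of(concept) > target_level:
        concept = PARENT[concept]
    return concept

transaction = {"IBM desktop computer", "antivirus software"}
print({generalize(c, 2) for c in transaction})  # the transaction seen at level 2
print({generalize(c, 1) for c in transaction})  # the transaction seen at level 1
```

Mining at a given level then amounts to generalizing every transaction to that level before counting support.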

2.5.1 Approaches for Mining Multilevel Association Rules:

1. Uniform Minimum Support:
The same minimum support threshold is used when mining at each level of abstraction.
When a uniform minimum support threshold is used, the search procedure is simplified.
The method is also simple in that users are required to specify only one minimum
support threshold.
The uniform support approach, however, has some difficulties. It is unlikely that items at
lower levels of abstraction will occur as frequently as those at higher levels of
abstraction.
If the minimum support threshold is set too high, it could miss some meaningful
associations occurring at low abstraction levels. If the threshold is set too low, it may
generate many uninteresting associations occurring at high abstraction levels.

2. Reduced Minimum Support:

Each level of abstraction has its own minimum support threshold.
The deeper the level of abstraction, the smaller the corresponding threshold is.
For example, the minimum support thresholds for levels 1 and 2 are 5% and
3%, respectively. In this way, "computer," "laptop computer," and "desktop computer" are all
considered frequent.
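
The difference between a uniform and a reduced threshold can be seen in a small sketch. The 5% and 3% thresholds come from the example above; the item supports (10%, 6%, and 4%) are not given in these notes and are assumed purely for illustration, as is the function name `frequent_items`.

```python
# Assumed supports, for illustration only (not taken from the notes).
support = {"computer": 0.10, "laptop computer": 0.06, "desktop computer": 0.04}
# Abstraction level of each item in the concept hierarchy.
level = {"computer": 1, "laptop computer": 2, "desktop computer": 2}

uniform = {1: 0.05, 2: 0.05}   # same threshold at every level
reduced = {1: 0.05, 2: 0.03}   # smaller threshold at the deeper level

def frequent_items(min_sup_by_level):
    """Items whose support meets the threshold of their own abstraction level."""
    return {item for item, s in support.items() if s >= min_sup_by_level[level[item]]}

print(sorted(frequent_items(uniform)))  # desktop computer is missed at 5%
print(sorted(frequent_items(reduced)))  # all three items are frequent
```

Under the assumed supports, the uniform 5% threshold drops "desktop computer", while the reduced 3% threshold at level 2 keeps it.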

3. Group-Based Minimum Support:

Because users or experts often have insight as to which groups are more important than
others, it is sometimes more desirable to set up user-specific, item-based, or group-based minimum
support thresholds when mining multilevel rules.
For example, a user could set up the minimum support thresholds based on product price, or
on items of interest, such as by setting particularly low support thresholds for laptop
computers and flash drives in order to pay particular attention to the association patterns
containing items in these categories.

Mining Multidimensional Association Rules from Relational Databases and Data Warehouses:

A single-dimensional or intradimensional association rule contains a single distinct
predicate (e.g., buys) with multiple occurrences, i.e., the predicate occurs more than once
within the rule.

buys(X, "digital camera") => buys(X, "HP printer")

Association rules that involve two or more dimensions or predicates can be
referred to as multidimensional association rules.

age(X, "20…29") ^ occupation(X, "student") => buys(X, "laptop")
The above rule contains three predicates (age, occupation, and buys), each of which occurs
only once in the rule. Hence, we say that it has no repeated predicates.
Multidimensional association rules with no repeated predicates are called
interdimensional association rules.
We can also mine multidimensional association rules with repeated predicates, which
contain multiple occurrences of some predicates. These rules are called
hybrid-dimensional association rules. An example of such a rule is the following, where the
predicate buys is repeated:
age(X, "20…29") ^ buys(X, "laptop") => buys(X, "HP printer")
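
The three categories depend only on which predicates a rule uses and whether any of them repeats, which the following sketch makes explicit. The function name `classify` and the list-of-predicate-names representation are illustrative assumptions.

```python
def classify(predicates):
    """Classify a rule from the predicate names of its antecedent and consequent."""
    distinct = set(predicates)
    if len(distinct) == 1:
        return "single-dimensional (intradimensional)"
    if len(distinct) == len(predicates):
        return "interdimensional"       # several predicates, none repeated
    return "hybrid-dimensional"         # several predicates, at least one repeated

# buys(X, "digital camera") => buys(X, "HP printer")
print(classify(["buys", "buys"]))
# age(X, "20…29") ^ occupation(X, "student") => buys(X, "laptop")
print(classify(["age", "occupation", "buys"]))
# age(X, "20…29") ^ buys(X, "laptop") => buys(X, "HP printer")
print(classify(["age", "buys", "buys"]))
```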

Mining Quantitative Association Rules:

Quantitative association rules are multidimensional association rules in which the numeric
attributes are dynamically discretized during the mining process so as to satisfy some mining
criteria, such as maximizing the confidence or compactness of the rules mined.
In this section, we focus specifically on how to mine quantitative association rules having
two quantitative attributes on the left-hand side of the rule and one categorical attribute on
the right-hand side of the rule. That is,
Aquan1 ^ Aquan2 => Acat
where Aquan1 and Aquan2 are tests on quantitative attribute intervals, and
Acat tests a categorical attribute from the task-relevant data.
Such rules have been referred to as two-dimensional quantitative association rules,
because they contain two quantitative dimensions.
For instance, suppose you are curious about the association relationship between pairs of
quantitative attributes, like customer age and income, and the type of television (such as
high-definition TV, i.e., HDTV) that customers like to buy.
An example of such a 2-D quantitative association rule is
age(X, "30…39") ^ income(X, "42K…48K") => buys(X, "HDTV")
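
Once the intervals are fixed, the support and confidence of a 2-D quantitative rule are computed like those of any other rule. The sketch below evaluates the rule above on a handful of hypothetical customer records; the records, the function names, and the use of fixed rather than dynamically determined intervals are all simplifying assumptions.

```python
# Hypothetical customer records: (age, income in dollars, item bought).
customers = [
    (34, 45_000, "HDTV"),
    (37, 47_000, "HDTV"),
    (31, 43_000, "laptop"),
    (52, 46_000, "HDTV"),
    (28, 30_000, "laptop"),
]

def in_interval(value, low, high):
    """Test a quantitative attribute against a closed interval [low, high]."""
    return low <= value <= high

def rule_stats(records, age_range, income_range, item):
    """Support and confidence of: age in age_range ^ income in income_range => buys item."""
    lhs = [r for r in records
           if in_interval(r[0], *age_range) and in_interval(r[1], *income_range)]
    both = [r for r in lhs if r[2] == item]
    support = len(both) / len(records)
    confidence = len(both) / len(lhs) if lhs else 0.0
    return support, confidence

support, confidence = rule_stats(customers, (30, 39), (42_000, 48_000), "HDTV")
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```

A mining algorithm would search over candidate intervals and keep those that give the best rules; this sketch only scores one fixed choice.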

From Association Mining to Correlation Analysis:
A correlation measure can be used to augment the support-confidence framework for
association rules. This leads to correlation rules of the form
A => B [support, confidence, correlation]
That is, a correlation rule is measured not only by its support and confidence but also by
the correlation between itemsets A and B. There are many different correlation
measures from which to choose. In this section, we study various correlation measures
to determine which would be good for mining large data sets.

Lift is a simple correlation measure that is given as follows. The occurrence of
itemset A is independent of the occurrence of itemset B if $P(A \cup B) = P(A)P(B)$;
otherwise, itemsets A and B are dependent and correlated as events. This definition
can easily be extended to more than two itemsets.

The lift between the occurrence of A and B can be measured by computing

$$\mathrm{lift}(A, B) = \frac{P(A \cup B)}{P(A)P(B)}.$$

If lift(A, B) is less than 1, then the occurrence of A is negatively correlated with
the occurrence of B.
If the resulting value is greater than 1, then A and B are positively correlated, meaning
that the occurrence of one makes the occurrence of the other more likely.
If the resulting value is equal to 1, then A and B are independent and there is no
correlation between them.
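
The three cases can be illustrated by computing lift directly from transaction counts, using lift(A, B) = P(A and B together) / (P(A) P(B)). The counts below are hypothetical and chosen only to produce one example of each case; the function names `lift` and `interpret` are likewise illustrative.

```python
from fractions import Fraction

def lift(n_total, n_a, n_b, n_ab):
    """Lift from counts: transactions in total, with A, with B, and with both A and B."""
    p_a = Fraction(n_a, n_total)
    p_b = Fraction(n_b, n_total)
    p_ab = Fraction(n_ab, n_total)
    return p_ab / (p_a * p_b)

def interpret(value):
    """Map a lift value to the three cases described above."""
    if value < 1:
        return "negatively correlated"
    if value > 1:
        return "positively correlated"
    return "independent"

# Hypothetical counts: (total, with A, with B, with both).
for counts in [(10_000, 6_000, 7_500, 4_000), (100, 50, 40, 20), (100, 20, 20, 10)]:
    value = lift(*counts)
    print(counts, float(value), interpret(value))
```

In the first case A and B appear together in 40% of transactions, which looks frequent, yet the lift is 8/9 because 45% would be expected if they were independent. This is the kind of situation in which support and confidence alone can be misleading.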
