Data Mining Unit 2
Association rule learning is an unsupervised learning technique that checks for the dependency of one data item on another and maps them accordingly so that the relationships can be exploited profitably. It tries to find interesting relations or associations among the variables of a dataset, using different rule measures to discover those relations in the database.
Association rule learning is one of the important concepts of machine learning, and it is employed in market basket analysis, Web usage mining, continuous production, and so on. Market basket analysis is a technique used by large retailers to discover associations between items. We can understand it with the example of a supermarket, where products that are frequently purchased together are placed near each other.
For example, if a customer buys bread, he is also likely to buy butter, eggs, or milk, so these products are stored on the same shelf or nearby.
Consider the example transactions below:
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
Before we start defining the rule, let us first see the basic definitions.
Frequent Itemset – An itemset whose support is greater than or equal to a minsup threshold.
Association Rule – An implication expression of the form X => Y, where X and Y are two disjoint itemsets.
Support(s) –
The number of transactions that include all the items in both {X} and {Y}, expressed as a percentage of the total number of transactions. It is a measure of how frequently the collection of items occurs together across all transactions.
Support(X => Y) = \sigma(X \cup Y) \div N, where N is the total number of transactions.
It is interpreted as the fraction of transactions that contain both X and Y.
Confidence(c) –
It is the ratio of the number of transactions that include all items in both {X} and {Y} to the number of transactions that include all items in {X}.
Conf(X => Y) = Supp(X \cup Y) \div Supp(X)
It measures how often the items in Y appear in transactions that also contain the items in X.
Lift(l) –
The lift of the rule X => Y is the confidence of the rule divided by the expected confidence, assuming that the itemsets X and Y are independent of each other. The expected confidence is simply the frequency (support) of {Y}.
Lift(X => Y) = Conf(X => Y) \div Supp(Y)
A lift value near 1 indicates that X and Y appear together about as often as expected; a value greater than 1 means they appear together more often than expected, and a value less than 1 means they appear together less often than expected. Greater lift values indicate a stronger association.
Association rule learning is also used for catalog design, loss-leader analysis, and many other applications.
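To make these three measures concrete, here is a small Python sketch (an illustration added for this unit, not part of the source material) that computes support, confidence, and lift for the rule {Diaper} => {Beer} using the five example transactions listed above.

# Support, confidence and lift for a rule X => Y, computed on the
# five example transactions from the table above.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support(itemset):
    # Fraction of transactions that contain every item in `itemset`.
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(X, Y):
    # Conf(X => Y) = Supp(X u Y) / Supp(X)
    return support(X | Y) / support(X)

def lift(X, Y):
    # Lift(X => Y) = Conf(X => Y) / Supp(Y)
    return confidence(X, Y) / support(Y)

X, Y = {"Diaper"}, {"Beer"}
print("Support:", support(X | Y))       # 3/5 = 0.6
print("Confidence:", confidence(X, Y))  # 0.6 / 0.8 = 0.75
print("Lift:", lift(X, Y))              # 0.75 / 0.6 = 1.25

The lift of 1.25 is greater than 1, so in this small example Diaper and Beer appear together somewhat more often than would be expected if they were independent.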
Association Analysis
Association analysis is the task of finding interesting relationships in large datasets. These
interesting relationships can take two forms: frequent item sets or association rules.
Frequent item sets are a collection of items that frequently occur together. The second way
to view interesting relationships is association rules. Association rules suggest that a
strong relationship exists between two items. I’ll illustrate these two concepts with an
example. A list of transactions from a grocery store is shown in figure 11.1.
Figure 11.1: A simple list of transactions from a natural foods grocery store called Hole Foods
Frequent item sets are lists of items that commonly appear together. One example from
figure 11.1 is {wine, diapers, soy milk}. (Recall that sets are denoted by a pair of brackets
{}). From the dataset we can also find an association rule such as diapers → wine. This
means that if someone buys diapers, there’s a good chance they’ll buy wine. With the
frequent item sets and association rules, retailers have a much better understanding of
their customers. Although common examples of association analysis are from the retail
industry, it can be applied to a number of other industries, such as website traffic analysis
and medicine.
Association Analysis:
Basic Concepts and Algorithms
Many business enterprises accumulate large quantities of data from their day-to-day
operations. For example, huge amounts of customer purchase data are collected daily at
the checkout counters of grocery stores. Table 6.1 illustrates an example of such data,
commonly known as market basket transactions. Each row in this table corresponds to a
transaction, which contains a unique identifier labelled TID and a set of items bought by a
given customer. Retailers are interested in analysing the data to learn about the
purchasing behaviour of their customers. Such valuable information can be used to support
a variety of business-related applications such as marketing promotions, inventory
management, and customer relationship management.
For example, the following rule can be extracted from such data:
{Diapers} → {Beer}
The rule suggests that a strong relationship exists between the sale of diapers and beer
because many customers who buy diapers also buy beer. Retailers can use this type of rule to help them identify new opportunities for cross-selling products to their customers.
Besides market basket data, association analysis is also applicable to other application
domains such as bioinformatics, medical diagnosis, Web mining, and scientific data
analysis. In the analysis of Earth science data, for example, the association patterns may
reveal interesting connections among the ocean, land, and atmospheric processes. Such
information may help Earth scientists develop a better understanding of how the different
elements of the Earth system interact with each other. Even though the techniques
presented here are generally applicable to a wider variety of data sets, for illustrative
purposes, our discussion will focus mainly on market basket data.
There are two key issues that need to be addressed when applying association
analysis to market basket data. First, discovering patterns from a large transaction data set
can be computationally expensive. Second, some of the discovered patterns are potentially
spurious because they may happen simply by chance. The remainder of this chapter is
organized around these two issues. The first part of the chapter is devoted to explaining
the basic concepts of association analysis and the algorithms used to efficiently mine such
patterns. The second part of the chapter deals with the issue of evaluating the discovered
patterns in order to prevent the generation of spurious results.
Problem Definition
This section reviews the basic terminology used in association analysis and
presents a formal description of the task.
Itemset and Support Count. Let I = {i1, i2, . . . , id} be the set of all items in a market basket data set and T = {t1, t2, . . . , tN} be the set of all transactions. Each transaction ti contains a subset of items chosen from I. In association analysis, a collection of zero or more items is termed an itemset. If an itemset contains k items, it is called a k-itemset. For instance, {Beer, Diapers, Milk} is an example of a 3-itemset. The null (or empty) set is an itemset that does not contain any items. An important property of an itemset is its support count, which refers to the number of transactions that contain the itemset; it is written σ(X), the same σ used in the support formula above.
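As a small illustration (a sketch added for this unit, not part of the source text), the support count σ(X), that is, the number of transactions that contain an itemset X, can be computed directly from a transaction list such as the one in the earlier table:

from itertools import combinations

# The five example transactions from the earlier table.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset):
    # sigma(X): number of transactions that contain every item of X.
    return sum(1 for t in transactions if set(itemset) <= t)

# Support count of the 3-itemset {Beer, Diaper, Milk}.
print(support_count({"Beer", "Diaper", "Milk"}))  # 2

# Support counts of all 2-itemsets that actually occur in the data.
items = sorted(set().union(*transactions))
for pair in combinations(items, 2):
    c = support_count(pair)
    if c > 0:
        print(pair, c)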
Apriori Algorithm
The Apriori algorithm is used to calculate association rules between objects, that is, to describe how two or more objects are related to one another. In other words, we can say that the Apriori algorithm is an association rule learning method that analyzes whether people who bought product A also bought product B.
The primary objective of the Apriori algorithm is to create association rules between different objects. The association rule describes how two or more objects are related to one another. The Apriori algorithm is also widely used for frequent pattern mining. Generally, you operate the Apriori algorithm on a database that consists of a huge number of transactions.
Let's understand the Apriori algorithm with the help of an example: suppose you go to Big Bazar and buy different products. Analyzing such purchases helps the customers buy their products with ease and increases the sales performance of Big Bazar.
How does the Apriori Algorithm work in Data Mining?
We will understand this algorithm with the help of an example
Consider a Big Bazar scenario where the product set is P = {Rice, Pulse, Oil, Milk, Apple}.
The database comprises six transactions where 1 represents the presence of the product
and 0 represents the absence of the product.
Step 1:
Make a frequency table of all the products that appear in the transactions. Then shortlist the table, keeping only those products whose support is above the 50 percent threshold, that is, products that appear in more than half of the six transactions. This gives the first frequency table.
Step 2:
Create pairs of the shortlisted products, such as RP, RO, RM, PO, PM, and OM, and count how often each pair occurs. This gives the pair frequency table.
Step 3:
Apply the same 50 percent support threshold and keep only the pairs that clear it. In our case, these are the pairs that occur in more than three of the six transactions.
Step 4:
Now, look for sets of three products that the customers buy together, built by combining the frequent pairs. We get the resulting combinations.
Step 5:
Calculate the frequency of the three-product itemsets found in Step 4, and you will get the resulting frequency table.
We have considered an easy example to discuss the Apriori algorithm in data mining. In
reality, you find thousands of such combinations.
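The step-by-step procedure above can also be expressed as a short program. The following is a simplified Apriori sketch in Python; because this unit does not reproduce the actual Big Bazar transaction table, the six baskets below are hypothetical stand-ins, and the 50 percent support threshold matches the walkthrough.

# Simplified Apriori pass over hypothetical transactions drawn from
# P = {Rice, Pulse, Oil, Milk, Apple}; the real Big Bazar table is not
# reproduced in this unit, so these six baskets are stand-ins.
transactions = [
    {"Rice", "Pulse", "Oil"},
    {"Rice", "Pulse", "Milk"},
    {"Rice", "Oil", "Milk"},
    {"Pulse", "Oil", "Milk", "Apple"},
    {"Rice", "Pulse", "Oil", "Milk"},
    {"Rice", "Pulse", "Oil", "Apple"},
]
N = len(transactions)
min_support = 0.5  # 50 percent threshold, as in the walkthrough

def support(itemset):
    # Fraction of transactions containing every item of `itemset`.
    return sum(1 for t in transactions if itemset <= t) / N

# Step 1: frequent 1-itemsets (single products above the threshold).
items = sorted(set().union(*transactions))
frequent = [frozenset([i]) for i in items if support({i}) >= min_support]
all_frequent = list(frequent)

# Steps 2-4: join frequent k-itemsets into candidate (k+1)-itemsets
# and keep only those whose support clears the threshold.
k = 1
while frequent:
    k += 1
    candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
    frequent = [c for c in candidates if support(c) >= min_support]
    all_frequent.extend(frequent)

# Step 5: report every frequent itemset with its support.
for itemset in all_frequent:
    print(sorted(itemset), round(support(itemset), 2))

With these stand-in transactions, the sketch finds the frequent single products, the frequent pairs, and one frequent three-product set, mirroring Steps 1 to 5 of the walkthrough.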
__________________________________________________________