Assignment 2

Uploaded by

Dina subharithi

SETHU INSTITUTE OF TECHNOLOGY

(An Autonomous Institution | Accredited with ‘A’ Grade by NAAC)


PULLOOR, KARIAPATTI – 626 115.

DEPARTMENT OF COMPUTER SCIENCE AND BUSINESS SYSTEM

2023-2024 EVEN SEMESTER

YEAR & SEM / SEC: IV / VIII / A    BATCH: 2020-2024

19UCB911 – DATA SCIENCE FOR ENGINEERS

Assignment II

Marks: 50

Questions II

Explain the Apriori algorithm in detail and write its function.

Rubrics for Evaluation

S. No | Elements / Grade | Good (5) | Average (3) | Needs Improvement (1)
1 | Definition of Apriori Algorithm (5 Marks) | Well clear definition | Definition not clear | Definition is not correct
2 | Property involved in algorithm (5 Marks) | Correct properties | Properties are not clear or proper and explanation is also not clear | Lack in detailed explanation of property
3 | Work (5 Marks) | How does it work in premises | Explanation is also not clear | Explanation not clear
4 | Steps involved (Algorithm) (5 Marks) | Step-by-step process with sequential order | Process but it is not sequential order | Process is not properly correct
5 | Applications, advantages and disadvantages (5 Marks) | Summarize the 5 to 10 points you think are most important | Below 6 points | Only two and the below

COs to be attained: CO4, CO5 & CO6

Answer:

APRIORI ALGORITHM

Apriori is a widely used data mining algorithm. It identifies the itemsets that occur
most frequently in a dataset and the meaningful associations among them. For
example, the products that customers buy together in a shop can serve as the input
transactions.

Effective Market Basket Analysis is important because it lets customers purchase
their items more conveniently, which in turn increases sales. Apriori has also been
applied in healthcare to help detect adverse drug reactions: association rules are
generated that identify which combinations of drugs and patient characteristics are
associated with adverse drug reactions.

Apriori Property

In 1994, R. Agrawal and R. Srikant proposed the Apriori algorithm for finding the
frequent itemsets in a dataset for Boolean association rules. The algorithm is called
Apriori because it uses prior knowledge of frequent itemset properties. It follows an
iterative, level-wise approach in which frequent k-itemsets are used to find frequent
(k+1)-itemsets.

An important property, called the Apriori property, improves the efficiency of the
level-wise generation of frequent itemsets: every non-empty subset of a frequent
itemset must also be frequent. Equivalently, if an itemset is infrequent, all of its
supersets are infrequent too, so they can be discarded without being counted, which
shrinks the search space.
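
As a minimal sketch of how this property is used for pruning, the Python snippet below checks whether a candidate itemset has any subset, one item smaller, that is missing from the frequent itemsets of the previous level. The function name and the item labels A to D are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_smaller):
    """Return True if any (k-1)-item subset of the k-item candidate is not frequent."""
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_smaller
        for subset in combinations(candidate, k - 1)
    )


# Hypothetical frequent 2-itemsets from the previous level.
frequent_pairs = {
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B", "C"}),
    frozenset({"B", "D"}),
}

# {A, B, C}: all of AB, AC, BC are frequent, so the candidate is kept.
keep_abc = not has_infrequent_subset(frozenset({"A", "B", "C"}), frequent_pairs)
# {B, C, D}: CD is not frequent, so the candidate can be discarded without counting.
keep_bcd = not has_infrequent_subset(frozenset({"B", "C", "D"}), frequent_pairs)
print(keep_abc, keep_bcd)
```

A candidate that fails this check never needs a database scan, which is where the saving comes from.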

How Does the Apriori Algorithm Work?


The Apriori algorithm rests on a simple premise: an itemset is considered frequent
when its support meets or exceeds a chosen threshold. Only items and itemsets that
reach this minimum support are treated as significant. The steps are as follows.

• Step 1: List all the items that appear across the transactions and create a
frequency table.
• Step 2: Set the minimum support. Only items whose support is greater than or
equal to this threshold are significant.
• Step 3: Form all possible pairs of the significant items, keeping in mind that
AB and BA are the same pair.
• Step 4: Count the number of transactions in which each pair appears.
• Step 5: Only the pairs that meet the minimum support are significant.
• Step 6: To find sets of three items that may be bought together, apply a rule
known as self-join. Suppose the significant pairs are OP, OB, PB, and PM. Look
for two pairs that start with the same letter and combine them:

1. OP and OB give OPB.
2. PB and PM give PBM.

• Step 7: Apply the threshold again to the three-item sets to obtain the significant
itemsets.
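
The steps above can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: the transactions are made up, the item labels O, P, B, and M simply reuse the letters from Step 6, and support is expressed as a raw transaction count (`min_count`). For brevity the join step takes the union of any two frequent k-itemsets instead of matching on a shared prefix; after pruning with the Apriori property the surviving candidates are the same.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # One pass over the data: count each candidate, keep those meeting min_count.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    items = {item for t in transactions for item in t}
    current = count({frozenset([item]) for item in items})  # frequent 1-itemsets
    frequent = dict(current)
    k = 1
    while current:
        k += 1
        # Join: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = count(candidates)
        frequent.update(current)
    return frequent


# Hypothetical transactions; each set is one customer's basket.
transactions = [
    {"O", "P", "B"}, {"O", "P", "B", "M"}, {"P", "B", "M"}, {"O", "P"},
    {"O", "B"}, {"P", "M"}, {"O", "P", "B"},
]
frequent = apriori(transactions, min_count=3)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), n)
```

With these made-up transactions and a minimum count of 3, the frequent pairs are OP, OB, PB, and PM, and OPB is the only frequent three-item set. PBM is discarded because its subset BM is not frequent.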

Steps for Apriori Algorithm

The Apriori algorithm has the following steps:

• Step 1: Compute the support of the itemsets in the transactional database, and
choose the minimum support and minimum confidence.
• Step 2: Keep every itemset whose support is greater than or equal to the
minimum support.
• Step 3: From these itemsets, find all rules whose confidence is greater than or
equal to the minimum confidence.
• Step 4: Sort the rules in descending order of strength (for example, by lift).

Methods to Improve Apriori Efficiency

The algorithm's efficiency may be improved in a variety of ways.

• Hash-Based Technique

This method uses a hash-based structure called a hash table to generate the
k-itemsets and their corresponding counts. The table is built using a hash function.
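
One common way to apply this idea, sketched here under assumptions, is to hash every pair seen during a scan into a small table of bucket counters and to keep only the pairs whose bucket total reaches the minimum count. The toy hash function, the bucket count of 7, and the transactions are all hypothetical. Because different pairs can share a bucket, the filter may let some infrequent pairs through, but it never drops a frequent one.

```python
from collections import Counter
from itertools import combinations


def bucket_of(pair, n_buckets):
    # Toy deterministic hash: sum of character codes modulo the table size.
    return sum(ord(ch) for item in pair for ch in item) % n_buckets


def hash_filter_pairs(transactions, min_count, n_buckets=7):
    """Return the candidate pairs whose hash bucket count reaches min_count."""
    buckets = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            buckets[bucket_of(pair, n_buckets)] += 1
    all_items = sorted({item for t in transactions for item in t})
    return [
        pair for pair in combinations(all_items, 2)
        if buckets[bucket_of(pair, n_buckets)] >= min_count
    ]


# Hypothetical transactions.
transactions = [
    {"O", "P", "B"}, {"O", "P", "B", "M"}, {"P", "B", "M"}, {"O", "P"},
    {"O", "B"}, {"P", "M"}, {"O", "P", "B"},
]
survivors = hash_filter_pairs(transactions, min_count=3)
print(survivors)
```

The surviving pairs still need an exact count afterwards; the hash table only reduces how many candidates reach that stage.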

• Transaction Reduction

This method reduces the number of transactions scanned in each iteration.
Transactions that contain no frequent itemsets are marked or removed.
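
A minimal sketch of this idea: a transaction that contains none of the frequent k-itemsets cannot contain a frequent (k+1)-itemset, so it can be left out of later scans. The function name and the data below are hypothetical.

```python
def reduce_transactions(transactions, frequent_k):
    """Keep only the transactions that contain at least one frequent k-itemset."""
    return [t for t in transactions if any(itemset <= t for itemset in frequent_k)]


# Hypothetical frequent pairs and transactions.
frequent_pairs = [frozenset("OP"), frozenset("OB"), frozenset("PB"), frozenset("PM")]
transactions = [
    frozenset("OPB"),
    frozenset("OM"),   # contains no frequent pair, so it is dropped
    frozenset("PM"),
    frozenset("B"),    # too small to contain any pair, so it is dropped
]
reduced = reduce_transactions(transactions, frequent_pairs)
print(len(transactions), "->", len(reduced))
```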

• Partitioning

This method needs only two database scans to mine the frequent itemsets. It relies
on the fact that any itemset that is potentially frequent in the whole database must
be frequent in at least one of the database's partitions.
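
The two scans can be sketched as below, under stated assumptions: the minimum support is given as a fraction (`min_fraction`), each partition is mined with a brute-force helper that is only practical for toy data, and the transactions are made up. The first scan collects the itemsets that are frequent within some partition; the second scan counts those candidates over the whole database.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def local_frequent(partition, min_fraction):
    """Brute-force miner for one partition (only suitable for toy data)."""
    counts = Counter(
        frozenset(subset)
        for t in partition
        for size in range(1, len(t) + 1)
        for subset in combinations(sorted(t), size)
    )
    threshold = ceil(min_fraction * len(partition))
    return {itemset for itemset, n in counts.items() if n >= threshold}


def partitioned_frequent(transactions, min_fraction, n_parts):
    size = ceil(len(transactions) / n_parts)
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    # Scan 1: any itemset frequent in some partition becomes a global candidate.
    candidates = set().union(*(local_frequent(p, min_fraction) for p in parts))
    # Scan 2: count the candidates over the full database and apply the threshold.
    threshold = ceil(min_fraction * len(transactions))
    counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
    return {c: n for c, n in counts.items() if n >= threshold}


# Hypothetical transactions.
transactions = [
    frozenset("OPB"), frozenset("OPBM"), frozenset("PBM"), frozenset("OP"),
    frozenset("OB"), frozenset("PM"), frozenset("OPB"),
]
result = partitioned_frequent(transactions, min_fraction=0.4, n_parts=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), n)
```

In practice each partition would be sized to fit in memory and mined with Apriori itself rather than by enumerating subsets.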

• Sampling

A random sample S is drawn from database D, and frequent itemsets are then
searched for within S. Some globally frequent itemsets may be missed this way.
Lowering the minimum support (min_sup) used on the sample reduces that risk.
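
A rough sketch of sampling follows, with several assumptions: the `slack` factor that lowers the threshold on the sample, the fixed random seed, the size cap `max_size`, and the brute-force helper are all illustrative choices, and the transactions are made up. Candidates found in the sample are verified with one scan of the full database, so the result never contains an infrequent itemset, although it may miss a frequent one.

```python
import random
from collections import Counter
from itertools import combinations


def frequent_in(transactions, min_fraction, max_size=3):
    """Brute-force frequent itemsets up to max_size items (toy data only)."""
    counts = Counter(
        frozenset(subset)
        for t in transactions
        for size in range(1, min(len(t), max_size) + 1)
        for subset in combinations(sorted(t), size)
    )
    return {s for s, n in counts.items() if n >= min_fraction * len(transactions)}


def sample_then_verify(database, min_fraction, sample_size, slack=0.8, seed=0):
    rng = random.Random(seed)
    sample = rng.sample(database, sample_size)
    # Mine the sample with a lowered threshold to reduce the chance of misses.
    candidates = frequent_in(sample, min_fraction * slack)
    # One scan of the full database keeps only the truly frequent candidates.
    needed = min_fraction * len(database)
    return {c for c in candidates if sum(1 for t in database if c <= t) >= needed}


# Hypothetical database.
database = [
    frozenset("OPB"), frozenset("OPBM"), frozenset("PBM"), frozenset("OP"),
    frozenset("OB"), frozenset("PM"), frozenset("OPB"),
]
found = sample_then_verify(database, min_fraction=0.4, sample_size=5)
print(sorted("".join(sorted(s)) for s in found))
```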

• Dynamic Itemset Counting

This technique can add new candidate itemsets at any marked start point of the
database during a scan, instead of only at the beginning of a new scan.

Advantages of Apriori

• The algorithm is easy to understand.
• The join and prune steps are easy to implement on large itemsets in large
databases.

Disadvantages of Apriori

• It requires heavy computation when the itemsets are very large and the
minimum support is kept very low.
• The entire database must be scanned, once for every level of the search.

Applications of Apriori Algorithm

Apriori is used in the following fields:

• Education

Association rules can be extracted from data on admitted students, based on their
characteristics and specializations.

• Medical

Patient databases can be analyzed, for example, to find associations in patient
records.

• Forestry

Forest fire data can be analyzed to study the frequency and intensity of forest fires.

• Autocomplete Tool

Apriori is reported to be used by a number of companies, for example in Amazon's
recommender system and Google's autocomplete tool.
