MBA: Market Basket Analysis Using Frequent Pattern Mining Techniques

Article in International Journal on Recent and Innovation Trends in Computing and Communication · April 2023
DOI: 10.17762/ijritcc.v11i5s.6591

All content following this page was uploaded by Mohammad Abu Kausar on 01 June 2023.


International Journal on Recent and Innovation Trends in Computing and Communication
ISSN: 2321-8169 Volume: 11 Issue: 5s
DOI: https://doi.org/10.17762/ijritcc.v11i5s.6591
Article Received: 24 February 2023 Revised: 25 March 2023 Accepted: 12 April 2023
___________________________________________________________________________________________________________________

MBA: Market Basket Analysis Using Frequent Pattern Mining Techniques
Sallam Osman Fageeri1*, Mohammad Abu Kausar2, Arockiasamy Soosaimanickam3
1,2,3Department of Information Systems, College of EMIS, University of Nizwa
Nizwa, Sultanate of Oman
Sallam, Kausar, [email protected]

Abstract—Market Basket Analysis (MBA) is a data mining technique that uses frequent pattern mining algorithms to discover patterns of co-occurrence among items that are frequently purchased together. It is commonly used in retail and e-commerce to generate association rules that describe the relationships between items, and to make recommendations to customers based on their previous purchases. MBA is a powerful tool for generating insights that can improve sales and marketing strategies. Although numerous works have addressed the computational cost of discovering frequent itemsets, the problem still needs further exploration and development. In this paper, we introduce an efficient Bitwise-Based data structure technique for mining frequent patterns in large-scale databases. The algorithm scans the original database once, using a Bitwise-Based data representation together with a vertical database layout, in contrast to the well-known Apriori and FP-Growth algorithms. The Bitwise-Based technique avoids multiple passes over the original database and hence reduces the execution time. Extensive experiments have been carried out to validate our technique, which outperforms Apriori, Éclat, FP-growth, and H-mine in terms of execution time for Market Basket Analysis.
Keywords- Market Basket; Bitwise-Based; Frequent Patterns; Support; Confidence

IJRITCC | May 2023, Available @ http://www.ijritcc.org

I. INTRODUCTION

Data mining, also known as knowledge discovery in databases (KDD), aims to extract valuable, useful, and understandable knowledge from available databases in order to discover the most relevant and interesting patterns and trends. It is a collection of exploration techniques based on advanced analytical methods and tools for handling large amounts of data. These techniques can find novel patterns that may help an enterprise understand its business better and support forecasting [1-3].

Nowadays, data mining is used in many different fields, such as health care [4-6], decision support systems [5], telecommunication networks [7], crime investigation [8], intrusion detection [9], and log file analysis [10]. Association rule mining is commonly used in data mining, for example to investigate the correlations among the products that customers purchase together during a visit to a store. An association rule is normally denoted X → Y, where X and Y are two sample products, and the rule states that if item X is purchased, item Y is also likely to be purchased. Two parameters are used to measure an association rule: support is the criterion for generating frequent itemsets, and confidence is the criterion for extracting association rules from the frequent itemsets that the support step produces. Association rule mining was suggested by Agrawal et al. (1993) [11], and its two main steps are: (1) generate frequent itemsets based on minimum support, and (2) generate association rules from those itemsets based on minimum confidence.

Mining frequent itemsets and association rules requires satisfying two interestingness measures, the minimum support and minimum confidence thresholds, which are specified by the user. The discovered frequent items depend on the specified support value: a low support value usually leads to the discovery of a large number of frequent items. Furthermore, by the Apriori property, every subset of a frequent itemset is also frequent. Hence, specifying suitable support and confidence values is important.

Support- In [11], support is defined as a measure of the occurrence of an item or set of items among the total number of transactions. In other words, support counts how many times an item or set of items appears in a set of transactions. An item or set of items is called frequent (or large) if its support is at least the minimum support. Using probability, we can formulate support as:

$$\mathrm{support}(A \rightarrow B) = P(A \cup B) \qquad (1)$$

where A and B represent itemsets in a database D, and P(A ∪ B) is the fraction of transactions in D that contain every item of both A and B.

Confidence- is used to measure the strength of the association between itemsets [12]. Confidence is the probability that an item B occurs in a transaction that also contains A. In other
words, confidence is the conditional probability of B given A. The formula of confidence is:

$$\mathrm{confidence}(A \rightarrow B) = P(B \mid A) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)} \qquad (2)$$

An association rule can be written as an implication X → Y, where X and Y are sets of items drawn from the itemset I, that is, X ⊆ I, Y ⊆ I, and X ∩ Y = ∅. The expression means that if a transaction T contains the items in X, it also tends to contain the items in Y. An illustration of such a rule might be: "60% of the transactions that contain milk also contain sugar; 40% of all transactions contain the two items together". Here 60% is the confidence of the rule and 40% is its support. The idea of mining association rules from a set of items originates from market-basket data analysis, where the interest lies in rules that describe customers' buying behavior.

II. LITERATURE REVIEW

The Apriori algorithm, introduced in [11], was the first frequent pattern and association rule mining algorithm. To control the exponential growth of candidate itemsets, it uses support-based pruning, which rests on the Apriori principle. Apriori discovers frequent patterns in database transactions by making multiple passes over the database, which leads to an iterative, level-wise exploration of the search space. The algorithm starts by generating the candidate 1-itemsets C1: it scans the transactions and counts each item individually. Based on the given support, the frequent 1-itemsets are determined as L1. In the next step the algorithm discovers L2, the set of frequent 2-itemsets. It joins L1 with itself (L1*L1) to generate the candidate 2-itemsets C2, and each itemset in C2 whose support is at least the specified minimum support is added to L2. To determine the frequent 3-itemsets L3, the algorithm joins L2 with itself (L2*L2) to obtain the candidate 3-itemsets C3, and the minimum support threshold again determines L3. The iteration continues in this way until no further combinations can be generated.

To overcome the Apriori bottleneck, extensive improvements have been introduced, such as the sampling approach [13], incremental mining [14], dynamic itemset counting [15], hashing [16], parallel and distributed mining [17-20], and partitioning [21]. A tight upper bound on the number of candidate patterns that the level-wise method can generate was derived in [22]; this result can be effective in reducing the number of database passes. DHP, proposed by Park et al. [23], improves the efficiency of finding frequent 2-itemsets by adopting a hash technique, and it also improves the process of creating candidate itemsets. DIC, proposed by Brin et al. [24], can add candidate itemsets dynamically at different points during a database scan. These algorithms all have an advantage over Apriori, but they still spend a great deal of time scanning the database and generating and testing candidate itemsets.

The Éclat (Equivalence CLASS Transformation) algorithm [25] uses a vertical data representation and a divide-and-conquer approach. The vertical representation lends itself to parallel processing of the search space with depth-first generation of frequent itemsets. Éclat is considered the first algorithm to generate frequent itemsets in a vertical format using only one pass over the database, although the vertical representation usually has to reside in memory. It does not fully exploit the downward closure property of support because it does not use step (2) of Apriori, the pruning step; in addition, it generates a candidate (k+1)-itemset if two of its k-subsets are frequent, resulting in a larger number of candidates than Apriori [26].

As an alternative to the candidate generation-and-test bottleneck of Apriori and Apriori-like algorithms, Han et al. [27] proposed the pattern-growth approach for mining frequent patterns. One of the fastest mining methods for frequent itemsets is the well-known FP-growth algorithm [27]. FP-growth uses an effective and compact data structure known as the Frequent Pattern tree (FP-tree). Unlike Apriori, which generates and tests candidates, FP-growth grows patterns to obtain frequent itemsets. It divides the compressed database into a set of conditional databases, and all frequent itemsets are then generated from these conditional databases. The FP-tree is constructed by first building a header table, which contains each item name together with a link for that item. Link entries in the header table are initialized to null; when an item is added to the tree for the first time, its entry is updated. The root node, marked "null", is also created. Child nodes are then attached during the database scan. Paths that share the same prefix are searched first. If a path shares a prefix with the items of a transaction, the counts along the shared prefix are increased by one in the FP-tree; the remaining items, which do not belong to the shared prefix, are attached in order below the last node of the prefix and their counters are set to one. If, on the other hand, the transaction's items share no prefix with any path, they are attached in order to the root node. In the end, every transaction in the database is represented by a path in the FP-tree.

Based on FP-growth, many improved algorithms have been proposed [28-32]. Jian Pei and Jiawei Han [33] proposed the H-mine algorithm, which achieves high performance and very small space overhead by taking advantage of the H-struct data structure and re-adjusting the links when mining different "projected" databases. Yahan Hu et al. [34] proposed the MIS-tree structure, an FP-tree-like structure, together with a high-performance algorithm called CFP-growth for mining association rules with multiple minimum supports. Pei et al. [35] developed the H-mine algorithm, which discovers all the frequent itemsets in a given transactional database. It uses a simple and novel hyper-linked data structure, H-struct, and a new mining algorithm, H-mine, which dynamically adjusts links during mining to take advantage of that data structure. H-mine can be scaled up to very large databases using database partitioning; moreover, one of its distinct features is its very limited memory cost.

Apriori and FP-growth both have limitations, especially when the number of attributes is very high and the minimum support is very low. Based on an analysis of the advantages and disadvantages of existing algorithms, we propose an efficient Bitwise-Based data structure and algorithm for frequent pattern and association rule mining. Compared with Apriori-like algorithms and the FP-tree algorithm, which need m+ and two database scans respectively, the Bitwise-Based technique scans the database only once and does not need to generate candidate itemsets, and all FCIs can be found by creating a comparative mask from the frequent 1-itemsets. The Bitwise-Based algorithm performs better when mining association rules with a low minimum support from databases with a large number of attributes and a large number of transactions [36-37].

The rest of this paper is organized as follows. The problem statement is given in Section 3. We describe the proposed algorithm in Section 4. In Section 5, an empirical evaluation of our approach on synthetic and real datasets is presented. Finally, the conclusion is presented in Section 6.

III. PROBLEM STATEMENT

In [11], the problem of mining association rules over a set of database transactions is introduced as follows. Let I = {i1, i2, ..., in} be a set of n literals or binary attributes called items. Let D = {t1, t2, ..., tm} be a set of transactions called the database. Each transaction in D has a unique transaction ID and contains a subset of the items in I. A rule is defined as an implication of the form X → Y where X, Y ⊆ I and X ∩ Y = ∅. The problem is to:

• Find all the frequent itemsets in database D, that is, the itemsets whose support is greater than or equal to the predefined minimum support threshold.
• Use the frequent itemsets to generate the possible associations between database items whose confidence is greater than or equal to the predefined minimum confidence.

IV. MATERIAL AND METHOD

The algorithm consists of four procedures. The first procedure, getItemSet, gets the list of items in the database and generates the binary representation of the database. The second procedure, getFrequentItems, scans the database, counts the frequency of every item in the item set, and keeps in the frequent item list those items whose support is greater than or equal to the minimum support. The third procedure, generateFrequentItemSets, generates the lists of all frequent itemsets that meet the minimum support. Finally, the fourth procedure, generateAssociations, generates all the association rules from the frequent itemsets produced by the previous procedure.

The algorithm uses an empty bitSet whose size equals the size of the itemSet list. We use this empty bitSet as a mask. First, we set the bit that corresponds to the first item in the frequent items list, frequentItems. Then we loop through the remaining items in frequentItems, setting the bit of each one in turn together with the first bit that is already set. This means that at any time two bits are set in the mask. We then AND the mask with every bitTransaction in BitSetDB. If the result of the AND equals the mask, the items appear together in the transaction and their support is increased by one. We do this for all remaining combinations. If the support of a pair of items is greater than or equal to minsup, the pair is added to the frequent hash table, FreqHT. Once the associations of two items have been generated, we record the size of the hash table. We then repeat the same steps for three items: we set the bits that correspond to the first two frequent items and then set each of the remaining items in turn, so that three bits, corresponding to three items in the frequent items list, are set. Each time, the mask is ANDed as before with all bitTransactions in BitSetDB, which gives the support of all combinations of size three. Before generating combinations of length four, we check the size of FreqHT. If the size is the same as the size recorded before, no frequent combination of size three was found, so we stop and display the frequent itemsets of length two together with their associations. If the size has grown, we proceed to find associations of four items, and the process is repeated.

Consider an example database of transactions, shown in Table 1. There are 6 items and 5 transactions in the database, with TIDs 100, 200, ..., 500. In this example we want to discover the frequent itemsets that satisfy a minimum support count of 3.
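As a minimal sketch of the binary representation and the mask test described above, the following Python uses plain integers as bitsets. The function names `get_item_set` and `get_frequent_items` mirror getItemSet and getFrequentItems from the text, but their bodies, the helpers `make_mask` and `matching_tids`, and the bit ordering (item i stored at bit i, items in first-seen order) are assumptions made for illustration rather than the authors' implementation. The five transactions are those of Table 1.

```python
from typing import Dict, List, Tuple

# Transactions of the worked example (TID -> purchased items), as in Table 1.
TRANSACTIONS: Dict[int, List[str]] = {
    100: ["Bread", "Milk", "Tissue"],
    200: ["Bread", "Tissue", "Juice", "Eggs"],
    300: ["Milk", "Tissue", "Juice", "Yoghurt"],
    400: ["Bread", "Milk", "Tissue", "Juice"],
    500: ["Bread", "Milk", "Tissue", "Yoghurt"],
}
MIN_SUPPORT_COUNT = 3  # minimum support count used in the example


def make_mask(item_set: List[str], items: List[str]) -> int:
    """Build a bitset (int) with one bit set per listed item."""
    mask = 0
    for item in items:
        mask |= 1 << item_set.index(item)
    return mask


def get_item_set(
    transactions: Dict[int, List[str]],
) -> Tuple[List[str], Dict[int, int]]:
    """Return the item list (first-seen order) and the binary database.

    Bit i of a transaction's bitset is 1 when item_set[i] occurs in it
    (an assumed bit ordering).
    """
    item_set: List[str] = []
    for items in transactions.values():
        for item in items:
            if item not in item_set:
                item_set.append(item)
    bitset_db = {
        tid: make_mask(item_set, items) for tid, items in transactions.items()
    }
    return item_set, bitset_db


def get_frequent_items(
    item_set: List[str], bitset_db: Dict[int, int], min_count: int
) -> Tuple[List[int], List[str]]:
    """Count every item over the binary database and keep the frequent ones."""
    freq = [0] * len(item_set)
    for bits in bitset_db.values():
        for i in range(len(item_set)):
            if (bits >> i) & 1:  # bit i set: item i appears in this transaction
                freq[i] += 1
    frequent = [item_set[i] for i, count in enumerate(freq) if count >= min_count]
    return freq, frequent


def matching_tids(mask: int, bitset_db: Dict[int, int]) -> List[int]:
    """TIDs whose bitset contains every bit of the mask (AND equals mask)."""
    return [tid for tid, bits in bitset_db.items() if (bits & mask) == mask]
```

For instance, `matching_tids(make_mask(item_set, ["Bread", "Juice"]), bitset_db)` lists the transactions in which Bread and Juice occur together, so the length of that list is the support count of the pair.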

TABLE 1: SAMPLE OF DATABASE TRANSACTIONS
TID   ITEMS USED
100   Bread, Milk, Tissue
200   Bread, Tissue, Juice, Eggs
300   Milk, Tissue, Juice, Yoghurt
400   Bread, Milk, Tissue, Juice
500   Bread, Milk, Tissue, Yoghurt

In the first procedure, the algorithm gets all the items in the database and stores them in the itemSet list. It then generates the corresponding binary database, in which each transaction is recorded as a sequence of bits. A bit set to 1 means that the item is present in the transaction; a bit set to 0 means that the item is not present, as shown in Table 2.

TABLE 2. BITWISE-BASED DATA REPRESENTATION
TID   Bread  Milk  Tissue  Juice  Eggs  Yoghurt
100   1      1     1       0      0     0
200   1      0     1       1      1     0
300   0      1     1       1      0     1
400   1      1     1       1      0     0
500   1      1     1       0      0     1

During the database transformation we simultaneously calculate the frequency of every item in the itemSet and store the frequency of each item at its corresponding index in the array Freq. The procedure scans the binary database BitSetDB. The current binary transaction is retrieved and stored in a temporary BitSet (b), and the bits of b are checked. If a bit is set to 1, meaning that its corresponding item appears in this transaction, the equivalent index in Freq is incremented by one. Once this step finishes, the algorithm iterates through all elements of Freq, and if the count of an element meets the specified minimum support threshold, the index of the element, which corresponds to an item in the itemSet, is added to the frequentItems list. This step is illustrated in Table 3.

TABLE 3: THE FIRST STEP PROCESS
Items     Juice  Bread  Yoghurt  Tissue  Milk  Eggs
Freq      3      4      2        5       4     1
Support   0.6    0.8    0.4      1.0     0.8   0.2

From the output of the first step, we proceed to discover the frequent itemsets of size 2. First, we check whether the first step produced any frequent items. If it did, the procedure continues to the second part and calculates the frequent itemsets of size ≥ 2. If it did not, the procedure terminates immediately and the algorithm reports that no frequent itemsets were found.

To calculate the frequent itemsets of size 2, we use a mask to form each combination of two items. The mask is an initially empty bitSet whose size equals the size of itemSet. Based on the output in Table 3, we can design the first mask, shown in Table 4.

TABLE 4: MASK BITSET REPRESENTATION
ITEMS  Bread  Milk  Tissue  Juice  Eggs  Yoghurt  Support
Mask   1      0     0       1      0     0        2

When we AND the mask with the binary BitSetDB, the mask matches two of the database transactions, as shown in Table 5.

TABLE 5: ITEMSETS OCCURRENCE COUNT
TID   Bread  Milk  Tissue  Juice  Eggs  Yoghurt  Occurrences
100   1      1     1       0      0     0        x
200   1      0     1       1      1     0        √
300   0      1     1       1      0     1        x
400   1      1     1       1      0     0        √
500   1      1     1       0      0     1        x

As shown by this first AND between the mask BitSet and BitSetDB, the two items Bread and Juice occur together twice in the whole set of transactions (Table 5).

We then continue with the remaining combinations, as shown in Table 6.

TABLE 6: REMAINING COMBINATION OF ITEMSET >=2
ITEMS  Bread  Milk  Tissue  Juice  Eggs  Yoghurt  Support
Mask   0      0     1       1      0     0        3
Mask   0      1     0       1      0     0        2
Mask   1      0     1       0      0     0        4
Mask   1      1     0       0      0     0        3
Mask   0      1     1       0      0     0        4
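The pairwise masks of Tables 4 to 6 extend naturally to larger itemsets. The sketch below is one possible Python reading of generateFrequentItemSets and generateAssociations: it tries every k-combination of the frequent items, stops as soon as a level adds nothing to the frequent hash table, and then derives rules by confidence. Enumerating all combinations with `itertools.combinations`, the integer bitsets, the helper names `to_mask` and `support_count`, and the `min_conf` parameter are assumptions for illustration, since the paper does not give code.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple

ITEMS = ["Bread", "Milk", "Tissue", "Juice", "Eggs", "Yoghurt"]
TRANSACTIONS = [
    ["Bread", "Milk", "Tissue"],
    ["Bread", "Tissue", "Juice", "Eggs"],
    ["Milk", "Tissue", "Juice", "Yoghurt"],
    ["Bread", "Milk", "Tissue", "Juice"],
    ["Bread", "Milk", "Tissue", "Yoghurt"],
]
BIT = {item: 1 << i for i, item in enumerate(ITEMS)}  # assumed bit layout


def to_mask(items: Iterable[str]) -> int:
    """Set one bit per item."""
    mask = 0
    for item in items:
        mask |= BIT[item]
    return mask


BITSET_DB = [to_mask(t) for t in TRANSACTIONS]  # built once from the database


def support_count(mask: int) -> int:
    """Number of transactions for which (transaction AND mask) == mask."""
    return sum(1 for bits in BITSET_DB if (bits & mask) == mask)


def generate_frequent_itemsets(min_count: int) -> Dict[FrozenSet[str], int]:
    """Level-wise mask search; freq_ht plays the role of FreqHT in the text."""
    freq_ht: Dict[FrozenSet[str], int] = {}
    frequent_items: List[str] = []
    for item in ITEMS:
        count = support_count(BIT[item])
        if count >= min_count:
            frequent_items.append(item)
            freq_ht[frozenset([item])] = count
    k = 2
    while k <= len(frequent_items):
        size_before = len(freq_ht)
        for combo in combinations(frequent_items, k):
            count = support_count(to_mask(combo))
            if count >= min_count:
                freq_ht[frozenset(combo)] = count
        if len(freq_ht) == size_before:  # nothing new at this level: stop
            break
        k += 1
    return freq_ht


def generate_associations(
    freq_ht: Dict[FrozenSet[str], int], min_conf: float
) -> List[Tuple[Tuple[str, ...], Tuple[str, ...], float]]:
    """Rules X -> Y with confidence = support(X u Y) / support(X) >= min_conf."""
    rules = []
    for itemset, count in freq_ht.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                # Every subset of a frequent itemset is frequent, so the
                # left-hand side is guaranteed to be present in freq_ht.
                conf = count / freq_ht[frozenset(lhs)]
                if conf >= min_conf:
                    rhs = tuple(sorted(itemset - frozenset(lhs)))
                    rules.append((lhs, rhs, conf))
    return rules
```

For the five example transactions and a minimum support count of 3, `generate_frequent_itemsets(3)` returns nine itemsets, which agrees with Table 10. The confidence threshold passed to `generate_associations` is left to the user; the paper's example does not fix one.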

Based on the previous step we get the output shown in Table 7.

TABLE 7. FREQUENT ITEMSETS OF SIZE 2
Itemsets   Juice, Tissue   Bread, Tissue   Bread, Milk   Tissue, Milk
Freq       3               4               3             4

We then apply masks to the combinations of more than two items, as shown in Table 8.

TABLE 8. COMBINATION OF ITEMSETS > TWO ITEMSETS
ITEMS  Bread  Milk  Tissue  Juice  Eggs  Yoghurt  Support
Mask   (three bits set)                           2
Mask   1      1     1       0      0     0        3

Based on the previous step, the final maximal frequent itemset is Bread, Milk, and Tissue, as shown in Table 9.

TABLE 9. THE MAXIMAL FREQUENT ITEMSETS
Itemsets   Bread, Milk, Tissue
Freq       3

The frequent itemsets for the full set of transactions are presented in Table 10.

TABLE 10: THE FREQUENT ITEMSETS AND THEIR SUPPORTS
Itemsets              Support
Bread                 4
Milk                  4
Tissue                5
Juice                 3
Bread, Milk           3
Bread, Tissue         4
Milk, Tissue          4
Tissue, Juice         3
Bread, Milk, Tissue   3

V. EXPERIMENTAL RESULT

To prove the efficiency of the Bitwise-Based algorithm, the experiments were conducted on a computer with an Intel® Core™ i5 CPU at 2.4 GHz and 2 GB of RAM. Figure (1) and figure (2) compare the execution time for mining frequent patterns on the synthetic dataset T20I6D100K, produced by the QUEST generator from IBM's Almaden research lab, and on the real dataset Mushroom, which is publicly available in the FIMI dataset repository. The datasets differ in transaction size, number of items, and other characteristics. In each graph the x axis shows the support value specified by the user and the y axis shows the execution time in milliseconds.

[Figure (1) Execution time using T20I6D100K dataset. Series: Apriori, Eclat, FP-Growth, H-mine, Binary. x axis: Support % (0.09, 0.07, 0.05, 0.03, 0.01); y axis: Execution time (ms), 0 to 6000.]

[Figure (2) Execution time using Mushroom dataset. Series: Apriori, Eclat, FP-Growth, H-mine, Binary. x axis: Support % (0.9, 0.8, 0.7, 0.6, 0.5); y axis: Execution time (ms), 0 to 2000.]

In figure (1), the execution times of all the algorithms are close to one another, except for Apriori. The reason is that Apriori uses the candidate generate-and-test approach, whose execution time suffers as the support value decreases. In figure (2), the Bitwise-Based algorithm outperforms all other algorithms. The reason is that it uses the vertical data layout, which calculates the support of the frequent patterns efficiently and thereby reduces the execution time. The benefit of the vertical layout also appears in Éclat, which comes second in execution time among all the algorithms.

Significance of Execution Time Benchmarking

A Student t-test was conducted using Matlab 2012b to verify the significance of the obtained execution times. The t-test reported a significant reduction in execution time when using the Bitwise-Based approach (mean=657.2 and standard deviation=382.83) against the best results recorded by the Eclat algorithm (mean=1188 and standard deviation=259.36) on the T20I6D100K dataset, which contains 1000 items and 100000
database transactions, using α=0.05. Table 11 summarizes the [3] A. Bansal and M. R. Rastogi, "LEARNING BEHAVIOR OF
obtained t-test results for T20I6D100K dataset. Here the t column ANALYSIS OF HIGHER STUDIES USING DATA
refers to the generated t-test value, while the p column refers to MINING," International Journal of Advanced Research in
Computer Engineering & Technology (IJARCET), vol. 1,
the probability of the t value of the t-test. The significance of
pp. pp: 80-84, 2012.
the t-test depends on both α and p. If the value of p is less than
[4] S. U. Kumar, H. H. Inbarani, and S. S. Kumar, "Bijective soft
the value of α then the t-test reports a significant result and set based classification of medical data," in Pattern
hence the h column which refers to the test hypothesis will be 1 Recognition, Informatics and Medical Engineering
which means rejecting the Null hypothesis and accepting the (PRIME), 2013 International Conference on, 2013, pp. 517-
alternative hypothesis; otherwise h will be 0 which means non- 521.
significant result. [5] R. Al Iqbal, "Hybrid clinical decision support system: An
automated diagnostic system for rural Bangladesh," in
TABLE 11: T-TEST RESULTS FOR T20I6D100K DATASET (Α=0.05)
Informatics, Electronics & Vision (ICIEV), 2012
Test Sample Bitwise-Based Éclat
t p h International Conference on, 2012, pp. 76-81.
T20I6D100 𝑥̅ sd 𝑥̅ sd
K dataset 657. 382.8 118 259.3 - 0.037 1 [6] B. Milovic, "Prediction and decision making in Health Care
2 3 8 6 2.566 0 using Data Mining," International Journal of Public Health
7 Science (IJPHS), vol. 1, pp. 69-78, 2012.
[7] M. V. Joseph, "Data Mining and Business Intelligence
The student t-test has also been conducted to verify the Applications in Telecommunication Industry," International
Journal of Engineering and Advanced Technology (IJEAT),
significant of the obtained results of execution times in Retail
vol. 2, pp. 525-528, 2013.
dataset. The t-test has reported a significant reduction in
[8] R. Sujatha and D. Ezhilmaran, "A Proposal for Analysis of
execution times when using Bitwise-Based approach Crime Based on Socio–Economic Impact using Data Mining
(mean=790 and standard deviation=232.59) against the best Techniques," International Journal of Societal Applications
results recorded by Éclat algorithm (mean=1360 and standard of Computer Science, vol. 2, pp. 229-231, 2013.
deviation=289.65) using α=0.05. Table 4.8 summarizes the [9] A. Chauhan, G. Mishra, and G. Kumar, Survey on Data
obtained t-test results for Retail dataset. mining Techniques in Intrusion Detection: Lap Lambert
Academic Publ, 2012.
TABLE T-TEST RESULTS FOR RETAIL DATASET (Α=0.01) [10] S. O. Fageeri, R. Ahmad, and B. Baharum, "A Log File
Test Bitwise-Based Eclat Analysis Technique Using Binary-Based Approach," in
Sample t p h Proceedings of the First International Conference on
Retail 𝑥̅ Sd 𝑥̅ sd Advanced Data and Information Engineering (DaEng-
dataset 790 232.59 1360 289.65 - 0.0096 1 2013), 2014, pp. 3-11.
3.4310 [11] R. Agrawal, T. Imieliński, and A. Swami, "Mining
association rules between sets of items in large databases,"
VI. CONCLUSIONS in ACM SIGMOD Record, 1993, pp. 207-216.
Overall, this paper applies market basket analysis (MBA), a powerful technique for identifying patterns of co-occurrence among items in transactional data, to generate insights that can help businesses improve their sales and marketing strategies. We introduce an efficient bitwise-based data structure technique for mining frequent patterns in large-scale databases. The algorithm scans the original database only once, using a bitwise-based data representation together with a vertical database layout. Compared to the well-known Éclat and FP-Growth algorithms, the bitwise-based technique mitigates the problem of multiple passes over the original database and hence reduces the execution time.

REFERENCES
[1] P. B. Jensen, L. J. Jensen, and S. Brunak, "Mining electronic health records: towards better research applications and clinical care," Nature Reviews Genetics, vol. 13, pp. 395-405, 2012.
[2] R. Gupta, "Analysis and design of data mining techniques for prevention and detection of financial frauds," 2013.
[12] M. M. Mazid, A. Shawkat Ali, and K. S. Tickle, "Finding a unique association rule mining algorithm based on data characteristics," in Electrical and Computer Engineering, 2008. ICECE 2008. International Conference on, 2008, pp. 902-908.
[13] H. Toivonen, "Sampling large databases for association rules," in VLDB, 1996, pp. 134-145.
[14] H. Cheng, X. Yan, and J. Han, "IncSpan: incremental mining of sequential patterns in large database," in Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, 2004, pp. 527-532.
[15] S. Brin, R. Motwani, J. D. Ullman, and S. Tsur, "Dynamic itemset counting and implication rules for market basket data," in ACM SIGMOD Record, 1997, pp. 255-264.
[16] J. S. Park, M.-S. Chen, and P. S. Yu, "Using a hash-based method with transaction trimming for mining association rules," Knowledge and Data Engineering, IEEE Transactions on, vol. 9, pp. 813-825, 1997.
[17] J. S. Park, M.-S. Chen, and P. S. Yu, "Efficient parallel data mining for association rules," in Proceedings of the fourth
international conference on Information and knowledge management, 1995, pp. 31-36.
[18] R. Agrawal and J. C. Shafer, "Parallel mining of association rules," Knowledge and Data Engineering, IEEE Transactions on, vol. 8, pp. 962-969, 1996.
[19] D. W. Cheung, J. Han, V. T. Ng, A. W. Fu, and Y. Fu, "A fast distributed algorithm for mining association rules," in Parallel and Distributed Information Systems, 1996., Fourth International Conference on, 1996, pp. 31-42.
[20] M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li, "Parallel algorithms for discovery of association rules," Data Mining and Knowledge Discovery, vol. 1, pp. 343-373, 1997.
[21] A. Savasere, E. R. Omiecinski, and S. B. Navathe, "An efficient algorithm for mining association rules in large databases," 1995.
[22] F. Geerts, B. Goethals, and J. Van den Bussche, "A tight upper bound on the number of candidate patterns," in Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, 2001, pp. 155-162.
[23] J. S. Park, M.-S. Chen, and P. S. Yu, "An effective hash-based algorithm for mining association rules," vol. 24, ACM, 1995.
[24] S. Brin, R. Motwani, J. D. Ullman, and S. Tsur, "Dynamic itemset counting and implication rules for market basket data," in ACM SIGMOD Record, 1997, pp. 255-264.
[25] M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li, "New algorithms for fast discovery of association rules," in KDD, 1997, pp. 283-286.
[26] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules," in Proc. 20th Int. Conf. Very Large Data Bases, VLDB, 1994, pp. 487-499.
[27] J. Han, J. Pei, and Y. Yin, "Mining frequent patterns without candidate generation," in ACM SIGMOD Record, 2000, pp. 1-12.
[28] F.-Y. Deng and Z.-Y. Liu, "An ameliorating FP-growth algorithm based on patterns-matrix," Journal of Xiamen University (Natural Science), vol. 5, 2005.
[29] Z. Y. LüHongbing, "An incremental updating algorithm to mine association rules based on frequent pattern growth," Computer Engineering and Applications, vol. 26, p. 055, 2004.
[30] K. Wang, L. Tang, J. Han, and J. Liu, "Top down FP-growth for association rule mining," Springer, 2002.
[31] Y. Qiu, Y.-J. Lan, and Q.-S. Xie, "An improved algorithm of mining from FP-tree," in Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on, 2004, pp. 1665-1670.
[32] A. Pietracaprina, "Mining frequent itemsets using patricia tries," 2003.
[33] J.-W. Han, J. Pei, and X.-F. Yan, "From sequential pattern mining to structured pattern mining: a pattern-growth approach," Journal of Computer Science and Technology, vol. 19, pp. 257-279, 2004.
[34] Y.-H. Hu and Y.-L. Chen, "Mining association rules with multiple minimum supports: a new mining algorithm and a support tuning mechanism," Decision Support Systems, vol. 42, pp. 1-24, 2006.
[35] J. Pei, J. Han, H. Lu, S. Nishio, S. Tang, and D. Yang, "H-mine: Hyper-structure mining of frequent patterns in large databases," in Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, 2001, pp. 441-448.
[36] S. O. Fageeri, S. M. E. Hossain, S. Arockiasamy, and T. Y. Al-Salmi, "High-utility pattern mining using ULB-Miner," in Sustainable Advanced Computing, S. Aurelia, S. S. Hiremath, K. Subramanian, and S. K. Biswas, Eds. Lecture Notes in Electrical Engineering, vol. 840, 2022.
[37] S. Fageeri, R. Ahmad, and H. Alhussian, "An efficient algorithm for mining frequent itemsets and association rules," in Implementations and Applications of Machine Learning, S. Subair and C. Thron, Eds. Studies in Computational Intelligence, vol. 782. Cham: Springer, 2020.


