Market Basket Analysis Using Frequent Pattern Mining Techniques
Article in International Journal on Recent and Innovation Trends in Computing and Communication · April 2023
DOI: 10.17762/ijritcc.v11i5s.6591
All content following this page was uploaded by Mohammad Abu Kausar on 01 June 2023.
Abstract—Market Basket Analysis (MBA) is a data mining technique that uses frequent pattern mining algorithms to discover patterns of co-occurrence among items that are frequently purchased together. It is commonly used in retail and e-commerce businesses to generate association rules that describe the relationships between different items, and to make recommendations to customers based on their previous purchases. MBA is a powerful tool for identifying patterns of co-occurrence and generating insights that can improve sales and marketing strategies. Although numerous works have addressed the computational cost of discovering frequent itemsets, the problem still needs more exploration and development. In this paper, we introduce an efficient Bitwise-Based data structure technique for mining frequent patterns in large-scale databases. In contrast to the well-known Apriori and FP-Growth algorithms, the algorithm scans the original database once, using the Bitwise-Based data representation as well as a vertical database layout. The Bitwise-Based technique avoids multiple passes over the original database and hence minimizes the execution time. Extensive experiments have been carried out to validate our technique, which outperforms Apriori, Éclat, FP-growth, and H-mine in terms of execution time for Market Basket Analysis.
Keywords- Market Basket; Bitwise-Based; Frequent Patterns; Support; Confidence
IJRITCC | May 2023, Available @ http://www.ijritcc.org
International Journal on Recent and Innovation Trends in Computing and Communication
ISSN: 2321-8169 Volume: 11 Issue: 5s
DOI: https://doi.org/10.17762/ijritcc.v11i5s.6591
Article Received: 24 February 2023 Revised: 25 March 2023 Accepted: 12 April 2023
___________________________________________________________________________________________________________________
words, the confidence is used to explore the conditional probability of the used items. The formula of confidence is

$$\mathrm{confidence}(X \rightarrow Y) = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)} \qquad (2)$$

An association rule can be written as an implication X → Y, where X and Y are sets of items from the itemset I, with X ⊆ I, Y ⊆ I, and X ∩ Y = ∅. The expression means that if a transaction T contains the items in X, it also tends to contain the items in Y. An illustration of such a rule might be that "60% of the transactions that contain milk also contain sugar; 40% of all transactions contain the two items together". Here 60% is the confidence of the rule, and 40% is the support of the rule. The idea of mining association rules from a set of items originates from the analysis of market-basket data, where the interest lies in rules that describe customers' interest in buying product items.

II. LITERATURE REVIEW

The Apriori algorithm, introduced in [11], was the first frequent pattern and association rule mining algorithm. To control the exponential growth of candidate itemsets, the algorithm uses support-based pruning, the concept behind the Apriori principle. The Apriori algorithm [11] is used to discover the frequent patterns in database transactions. It makes multiple passes over the database, which leads it to employ an iterative, level-wise search of the search space. Apriori starts by generating the candidate 1-itemsets C1: the algorithm scans the transactions and counts each item individually. Based on the given support, the frequent 1-itemsets are determined as L1. In the next step the algorithm discovers L2, the set of frequent 2-itemsets, using L1*L1 to generate the candidate 2-itemsets C2. Each itemset in C2 that meets the specified minimum support is added to the frequent 2-itemsets L2. To determine the frequent 3-itemsets L3, the algorithm uses L2*L2 to generate the candidate 3-itemsets C3, and L3 is determined from the minimum support threshold. The process continues in this way until no more combinations can be generated.

To overcome the Apriori bottleneck, extensive improvements have been introduced, such as the sampling approach [13], incremental mining [14], dynamic itemset counting [15], the hashing technique [16], parallel and distributed mining [17-20], and the partition technique [21]. A tight upper bound on the number of candidate patterns that can be generated by the level-wise method is derived in [22]; the result can be used to reduce the number of database passes. DHP, proposed by Park et al. [23], improves the efficiency of finding frequent 2-itemsets by adopting a hash technique, and this method also improves the process of creating candidate itemsets. DIC, proposed by Brin et al. [24], can append candidate itemsets dynamically at different stages of a database scan. The above algorithms all have an advantage over Apriori, but they still spend a great deal of time scanning the database and finding and testing candidate itemsets.

The Éclat (Equivalence CLASS Transformation) algorithm [25] uses a vertical data representation and a divide-and-conquer approach. The vertical representation benefits parallel processing of the search space using depth-first generation of frequent itemsets. Éclat is considered the first algorithm to generate frequent itemsets in vertical format using only one pass over the database, although the vertical data representation usually has to reside in memory. It does not fully exploit the downward closure property of support, as it does not utilize step (2) of Apriori, the pruning step; in addition, it generates a candidate (k+1)-itemset if two of its k-subsets are frequent, resulting in a larger number of candidates compared to Apriori [26].

As an alternative solution to the candidate generation and testing bottleneck of Apriori and Apriori-like algorithms, Han et al. [27] proposed the pattern-growth approach for mining frequent patterns. One of the fastest mining methods for frequent itemsets is the well-known FP-growth algorithm [27]. FP-growth utilizes an effective and compact data structure known as the Frequent Pattern tree (FP-tree). Unlike the Apriori algorithm, which uses candidate generation and testing, FP-growth follows a pattern-growth approach to obtain frequent itemsets. FP-growth divides the compact database into a set of conditional databases, and all frequent itemsets are then generated from the conditional databases. The FP-tree is constructed by first building a header table. The header table contains each item name together with a corresponding link. Link entries in the header table are initialized to null; when an item is added to the tree for the first time, its corresponding entry is updated. The root node, marked as "null", is also constructed. Child nodes are then attached through a database scan. Paths that share the same prefix are searched first. If a path exists that shares a prefix with the items of a transaction, the counts of the shared prefix are increased by one in the FP-tree; the rest of the items, which do not belong to the shared prefix, are attached in order below the last shared node and their counters are set to one. If, on the other hand, the transaction items do not share a prefix with any path, they are attached in order to the root node. Lastly, every path in the FP-tree corresponds to a transaction in the database.

Based on FP-growth, many improved algorithms have been proposed [28-32]. Jian Pei and Jiawei Han [33] proposed the H-mine algorithm, which has high performance and very small space overhead by taking advantage of the H-struct data structure and re-adjusting the links when mining different "projected" databases. Yahan Hu et al. [34] proposed the MIS-tree structure, which is an FP-tree-like structure, and also proposed a high-performance algorithm called CFP-growth for mining association rules with multiple minimum supports. Pei et al. [35] developed the H-mine algorithm, which discovers all the frequent itemsets in a given transactional database. The algorithm uses a simple and novel hyper-linked data structure, H-struct, and a new mining algorithm, H-mine, which dynamically adjusts links during the mining task to take advantage of this data structure. H-mine can be scaled up to very large databases using database partitioning; moreover, one of its distinct features is a very limited memory cost.

Apriori and FP-growth both have limitations, especially when the number of attributes is very high and the minimum support degree is very low. Based on an analysis of the advantages and disadvantages of existing algorithms, we propose an efficient Bitwise-Based data structure and algorithm for frequent pattern and association rule mining. Compared to Apriori-like algorithms and the FP-tree algorithm, which scan the database multiple times and twice respectively, the Bitwise-Based technique scans the database only once, does not need to generate candidate itemsets, and finds all FCIs by creating a comparative mask from the frequent 1-itemsets. The Bitwise-Based algorithm performs better when mining association rules with a low minimum support degree from databases with a large number of attributes and a large number of transactions [36-37].

The rest of this paper is organized as follows. The problem statement is given in Section 3. We describe the proposed algorithm in Section 4. In Section 5, an empirical evaluation of our approach using synthetic and real datasets is presented. Finally, the conclusion is presented in Section 6.

III. PROBLEM STATEMENT

In [11] the problem of mining association rules over a set of database transactions is introduced as follows. Let I = {i1, i2, ..., in} be a set of n literals or binary attributes called items. Let D = {t1, t2, ..., tn} be a set of transactions called the database of transactions. Each transaction in D has a unique identifier, the transaction ID, and contains a subset of the items in I. A rule is defined as an implication of the form X → Y where X, Y ⊆ I and X ∩ Y = ∅. The problem is to:

• Find all the frequent itemsets in database D, that is, the itemsets whose support is greater than or equal to the predefined minimum threshold.
• Use the frequent itemsets to generate the possible combinations and associations between database items whose confidence is greater than or equal to the predefined minimum confidence.

IV. MATERIAL AND METHOD

The algorithm contains four procedures. The first procedure is getItemSet, which gets the list of items in the database and generates the binary representation of the database. The second procedure is getFrequentItems, which scans the database, counts the frequency of all items in the item set, and keeps in the frequent item list those items whose support is greater than or equal to the minimum support. The third procedure is generateFrequentItemSets, which generates the lists of all frequent itemsets that pass the minimum support. Finally, the fourth procedure is generateAssociations, which generates all the association rules based on the frequent itemsets generated by the previous procedure.

The algorithm uses an empty bitSet whose size equals that of the itemSet list. We use this empty bitSet as a mask. First, we set the bit that corresponds to the first item in the frequent items list, frequentItems. Then we loop through the remaining items in frequentItems, setting the corresponding bit of each one in turn together with the first one that is already set. This means that at each step two bits are set in the mask. Then we AND the mask with all bitTransactions in BitSetDB. If the result of the AND equals the mask, then both items appear in the transaction and their support is increased by one. We do this for all remaining combinations. If the support of the paired items is greater than or equal to minsup, they are added to the frequent hash table, FreqHT. Once the associations of two items have been generated, we record the size of the hash table. We then repeat the same steps for three items: we set the bits that correspond to the first two frequent items and alternate setting the remaining items one at a time, so that three bits are set, corresponding to three items in the frequent items list. Each time, the mask is ANDed as before with all bitTransactions in BitSetDB, and we get the support of all combinations of size three. Before starting on combinations of length four, we check the size of FreqHT: if it is the same as the size recorded before, no frequent combinations of size three were found, so we stop and display the frequent itemsets of length two as well as their associations. If the size has grown, we proceed to find associations of four items, and the process is repeated.

Consider the example database of transactions in Table 1. There are 6 items and 5 transactions in the database, with TIDs 100, 200, ..., 500. In this example we want to discover the frequent itemsets that satisfy a minimum support count of 3.
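The four procedures of Section IV can be sketched in Python on the sample transactions of Table 1. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `get_item_set`, `mine_frequent_itemsets` and `generate_associations`, the use of Python integers as bit sets, and the enumeration of masks with `itertools.combinations` are all illustrative choices, since the text does not specify how masks of size three and above are enumerated. The sketch encodes the original transactions once, but re-reads the in-memory bit sets for every mask.

```python
from itertools import combinations

# Sample database from Table 1 (TID -> items).
TRANSACTIONS = {
    100: ["Bread", "Milk", "Tissue"],
    200: ["Bread", "Tissue", "Juice", "Eggs"],
    300: ["Milk", "Tissue", "Juice", "Yoghurt"],
    400: ["Bread", "Milk", "Tissue", "Juice"],
    500: ["Bread", "Milk", "Tissue", "Yoghurt"],
}


def get_item_set(transactions):
    """Return the item list, a bit per item, and one integer bit set per transaction."""
    items = sorted({item for t in transactions.values() for item in t})
    bit = {item: 1 << i for i, item in enumerate(items)}
    bitset_db = [sum(bit[item] for item in set(t)) for t in transactions.values()]
    return items, bit, bitset_db


def mine_frequent_itemsets(transactions, minsup):
    """Return {itemset tuple: support count} for all itemsets with count >= minsup."""
    items, bit, bitset_db = get_item_set(transactions)
    freq_ht = {}  # plays the role of the hash table FreqHT (assumed to be a dict here)
    for item in items:
        count = sum(1 for t in bitset_db if t & bit[item])
        if count >= minsup:
            freq_ht[(item,)] = count
    frequent_items = [key[0] for key in freq_ht]

    k = 2
    while k <= len(frequent_items):
        size_before = len(freq_ht)
        for combo in combinations(frequent_items, k):
            mask = 0
            for item in combo:
                mask |= bit[item]
            # A transaction contains the itemset when (transaction AND mask) == mask.
            support = sum(1 for t in bitset_db if (t & mask) == mask)
            if support >= minsup:
                freq_ht[combo] = support
        if len(freq_ht) == size_before:
            break  # the table did not grow, so no frequent k-itemsets exist
        k += 1
    return freq_ht


def generate_associations(freq_ht, minconf):
    """Return (antecedent, consequent, confidence) for rules with confidence >= minconf."""
    rules = []
    for itemset, support in freq_ht.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                consequent = tuple(i for i in itemset if i not in antecedent)
                confidence = support / freq_ht[antecedent]
                if confidence >= minconf:
                    rules.append((antecedent, consequent, confidence))
    return rules


if __name__ == "__main__":
    frequent = mine_frequent_itemsets(TRANSACTIONS, minsup=3)
    for itemset, support in frequent.items():
        print(itemset, support)
    for rule in generate_associations(frequent, minconf=0.8):
        print(rule)
```

Stopping when the table size is unchanged mirrors the FreqHT size check described above; the confidence of a rule is computed as the support of the whole itemset divided by the support of its antecedent.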
TABLE 1: SAMPLE OF DATABASE TRANSACTIONS
TID   ITEMS USED
100   Bread Milk Tissue
200   Bread Tissue Juice Eggs
300   Milk Tissue Juice Yoghurt
400   Bread Milk Tissue Juice
500   Bread Milk Tissue Yoghurt

Based on the first procedure, the algorithm gets all the items in the database and stores them in the itemSet list. The algorithm then generates the corresponding binary database, in which every transaction is recorded as a sequence of bits, each either 0 or 1. A 1 means that the item is present in the transaction; a 0 means that the item is not present in the transaction (Table 2).

TABLE 2: BITWISE-BASED DATA REPRESENTATION
ITEMS   Bread   Milk   Tissue   Juice   Eggs   Yoghurt
100     1       1      1        0       0      0
200     1       0      1        1       1      0
300     0       1      1        1       0      1
400     1       1      1        1       0      0
500     1       1      1        0       0      1

During the database transformation we simultaneously calculate the frequency of all items in the itemSet and store the frequency of each item at its corresponding index in the array Freq. The procedure starts by scanning the binary database BitSetDB. The current binary transaction is retrieved and stored in a temporary BitSet (b). Then we check the bits of b. If a bit is set to 1, meaning that its corresponding item appears in this transaction, its equivalent index in the array Freq is incremented by one. Once this step finishes, the algorithm iterates through all elements of Freq, and if the count of an element passes the specified minimum support threshold, the index of the element, which corresponds to an item in the itemSet, is added to the frequentItems list. This step is illustrated in Table 3.

TABLE 3: THE FIRST STEP PROCESS
Items     Juice   Bread   Yoghurt   Tissue   Milk   Eggs
Freq      3       4       2         5        4      1
Support   0.6     0.8     0.4       1.0      0.8    0.2

From the output of the first step, we proceed to discover the frequent itemsets of size 2. The important point is that we first have to check whether the output of the first part contains frequent itemsets. If frequent itemsets that can be used in the second step were found, the procedure continues to the second part and calculates the frequent itemsets of size ≥2. If no frequent itemsets that can be used to discover the frequent itemsets of size 2 were found in the first part, the procedure terminates and returns immediately, reporting that no frequent itemsets were found.

To calculate the frequent itemsets of size 2, we use a mask to form combinations of two items. This is done using an empty bitSet of size itemSet, which serves as the mask (Table 4). From the previous step we can design our mask based on the output of Table 3.

TABLE 4: MASK BITSET REPRESENTATION
ITEMS   Bread   Milk   Tissue   Juice   Eggs   Yoghurt   Support
Mask    1       0      0        1       0      0         2

When we AND the mask with the binary BitSetDB, the mask matches two transactions in the database (Table 5).

TABLE 5: ITEMSETS OCCURRENCE COUNT
ITEMS   Bread   Milk   Tissue   Juice   Eggs   Yoghurt   Occurrences
100     1       1      1        0       0      0         x
200     1       0      1        1       1      0         √
300     0       1      1        1       0      1         x
400     1       1      1        1       0      0         √
500     1       1      1        0       0      1         x

As shown by the first AND between the mask BitSet and BitSetDB, the two items Bread and Juice occur together two times in the whole set of transactions (Table 5).

We then continue with the remaining combinations, as shown in Table 6.

TABLE 6: REMAINING COMBINATION OF ITEMSET >=2
ITEMS   Bread   Milk   Tissue   Juice   Eggs   Yoghurt   Support
Mask    0       0      1        1       0      0         3
Mask    0       1      0        1       0      0         2
Mask    1       0      1        0       0      0         4
Mask    1       1      0        0       0      0         3
Mask    0       1      1        0       0      0         4
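The pairwise mask test of Tables 4 to 6 can be traced with a few lines of Python. The sketch below is illustrative only: the bit-string encoding, the helper names `mask_for` and `matching_tids`, and taking the frequent items in the order in which Table 3 lists them are assumptions made for this example.

```python
from itertools import combinations

# Column order of Table 2; each transaction is written as a bit string in this order.
COLUMNS = ["Bread", "Milk", "Tissue", "Juice", "Eggs", "Yoghurt"]
BITSET_DB = {
    100: "111000",
    200: "101110",
    300: "011101",
    400: "111100",
    500: "111001",
}

# Items with frequency >= 3 in Table 3, in the order that table lists them.
FREQUENT_ITEMS = ["Juice", "Bread", "Tissue", "Milk"]


def mask_for(items):
    """Return a bit string with a 1 under every item in `items` and 0 elsewhere."""
    return "".join("1" if column in items else "0" for column in COLUMNS)


def matching_tids(mask):
    """Return the TIDs whose bit string, ANDed with the mask, equals the mask."""
    m = int(mask, 2)
    return [tid for tid, row in BITSET_DB.items() if (int(row, 2) & m) == m]


if __name__ == "__main__":
    for pair in combinations(FREQUENT_ITEMS, 2):
        mask = mask_for(pair)
        tids = matching_tids(mask)
        print(pair, mask, tids, len(tids))
```

With a minimum support count of 3, the pairs that survive this test are the ones listed in Table 7.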
Based on the previous step we obtain the output shown in Table 7.

TABLE 7: FREQUENT ITEMSETS OF SIZE 2
Itemsets       Freq
Juice Tissue   3
Bread Tissue   4
Bread Milk     3
Tissue Milk    4

Again we apply the mask to the remaining part, this time for combinations of more than two items, as shown in Table 8.

TABLE 8: COMBINATION OF ITEMSETS > TWO ITEMSETS
ITEMS   Bread   Milk   Tissue   Juice   Eggs   Yoghurt   Support
Mask    1 1 1 (bit positions not recoverable)            2
Mask    1       1      1        0       0      0         3

Based on the previous step, the final maximal frequent itemset is Bread, Milk, and Tissue, as shown in Table 9.

TABLE 9
Itemsets            Freq
Bread               4
Milk                4
Tissue              5
Juice               3
Bread Milk          3
Bread Tissue        4
Milk Tissue         4
Tissue Juice        3
Bread Milk Tissue   3

V. EXPERIMENTAL RESULT

To prove the efficiency of the Bitwise-Based algorithm, the experiments were conducted on a computer with an Intel Core i5 CPU at 2.4 GHz and 2 GB of RAM. Graph 1 and graph 2, in Figure (1) and Figure (2) respectively, show the comparison of the execution time for mining frequent patterns using the synthetic dataset T20I6D100K, provided by the QUEST generator from IBM's Almaden research lab, and the real dataset, Mushroom, which is publicly available in the FIMI dataset repository. These datasets differ in transaction size, item size, and other characteristics. The x axis of each graph shows the support percentage specified by the user, and the y axis shows the execution time in milliseconds.

[Figure (1): Execution time using T20I6D100K dataset. Series: Apriori, Eclat, FP-Growth, H-mine, Binary. x axis: Support % (0.09, 0.07, 0.05, 0.03, 0.01); y axis: Execution time (ms), 0 to 6000.]

[Figure (2): caption not recovered. Series: Apriori, Eclat, FP-Growth, H-mine, Binary. y axis: Execution time (ms).]

In Figure (1), the execution times of all the mentioned algorithms are close to one another, except for Apriori. The reason is that Apriori uses the candidate generation and testing approach, whose execution time suffers as the support value decreases. In Figure (2), the Bitwise-Based algorithm outperforms all other algorithms, because it uses the vertical data layout, which efficiently calculates the support of the frequent patterns and thereby reduces the execution time. The benefits of the vertical layout also appear in Éclat, which comes second in execution time among all the algorithms.

Significance of Execution Time Benchmarking. A Student's t-test has been conducted using Matlab 2012b to verify the significance of the obtained execution times. The t-test has reported a significant reduction in execution times when using the Bitwise-Based approach (mean=657.2 and standard deviation=382.83) against the best results recorded by the Eclat algorithm (mean=1188 and standard deviation=259.36) on the T20I6D100K dataset, which contains 1000 items and 100000
database transactions, using α=0.05. Table 11 summarizes the obtained t-test results for the T20I6D100K dataset. Here the t column refers to the generated t-test value, while the p column refers to the probability of the t value of the t-test. The significance of the t-test depends on both α and p. If the value of p is less than the value of α, the t-test reports a significant result, and hence the h column, which refers to the test hypothesis, is 1, which means rejecting the null hypothesis and accepting the alternative hypothesis; otherwise h is 0, which means a non-significant result.

TABLE 11: T-TEST RESULTS FOR T20I6D100K DATASET (α=0.05)
Test Sample          Bitwise-Based       Éclat              t         p        h
                     x̄        sd         x̄       sd
T20I6D100K dataset   657.2    382.83     1188    259.36     -2.5667   0.0370   1

The Student's t-test has also been conducted to verify the significance of the execution times obtained on the Retail dataset. The t-test has reported a significant reduction in execution times when using the Bitwise-Based approach (mean=790 and standard deviation=232.59) against the best results recorded by the Éclat algorithm (mean=1360 and standard deviation=289.65) using α=0.05. The following table summarizes the obtained t-test results for the Retail dataset.

TABLE: T-TEST RESULTS FOR RETAIL DATASET (α=0.01)
Test Sample      Bitwise-Based       Éclat              t         p        h
                 x̄        sd         x̄       sd
Retail dataset   790      232.59     1360    289.65     -3.4310   0.0096   1

VI. CONCLUSIONS

Overall, this paper uses Market Basket Analysis (MBA), a powerful technique for identifying patterns of co-occurrence among items in transactional data, to generate insights that can help businesses improve their sales and marketing strategies. We introduce an efficient Bitwise-Based data structure technique for mining frequent patterns in large-scale databases. The algorithm scans the original database once, using the Bitwise-Based data representation as well as a vertical database layout. Compared to the well-known Éclat and FP-Growth algorithms, the Bitwise-Based technique avoids multiple passes over the original database and hence minimizes the execution time.

REFERENCES
[1] P. B. Jensen, L. J. Jensen, and S. Brunak, "Mining electronic health records: towards better research applications and clinical care," Nature Reviews Genetics, vol. 13, pp. 395-405, 2012.
[2] R. Gupta, "Analysis and design of data mining techniques for prevention and detection of financial frauds," 2013.
[3] A. Bansal and M. R. Rastogi, "Learning behavior of analysis of higher studies using data mining," International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), vol. 1, pp. 80-84, 2012.
[4] S. U. Kumar, H. H. Inbarani, and S. S. Kumar, "Bijective soft set based classification of medical data," in Pattern Recognition, Informatics and Medical Engineering (PRIME), 2013 International Conference on, 2013, pp. 517-521.
[5] R. Al Iqbal, "Hybrid clinical decision support system: An automated diagnostic system for rural Bangladesh," in Informatics, Electronics & Vision (ICIEV), 2012 International Conference on, 2012, pp. 76-81.
[6] B. Milovic, "Prediction and decision making in Health Care using Data Mining," International Journal of Public Health Science (IJPHS), vol. 1, pp. 69-78, 2012.
[7] M. V. Joseph, "Data Mining and Business Intelligence Applications in Telecommunication Industry," International Journal of Engineering and Advanced Technology (IJEAT), vol. 2, pp. 525-528, 2013.
[8] R. Sujatha and D. Ezhilmaran, "A Proposal for Analysis of Crime Based on Socio–Economic Impact using Data Mining Techniques," International Journal of Societal Applications of Computer Science, vol. 2, pp. 229-231, 2013.
[9] A. Chauhan, G. Mishra, and G. Kumar, Survey on Data mining Techniques in Intrusion Detection: Lap Lambert Academic Publ, 2012.
[10] S. O. Fageeri, R. Ahmad, and B. Baharum, "A Log File Analysis Technique Using Binary-Based Approach," in Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013), 2014, pp. 3-11.
[11] R. Agrawal, T. Imieliński, and A. Swami, "Mining association rules between sets of items in large databases," in ACM SIGMOD Record, 1993, pp. 207-216.
[12] M. M. Mazid, A. Shawkat Ali, and K. S. Tickle, "Finding a unique association rule mining algorithm based on data characteristics," in Electrical and Computer Engineering, 2008. ICECE 2008. International Conference on, 2008, pp. 902-908.
[13] H. Toivonen, "Sampling large databases for association rules," in VLDB, 1996, pp. 134-145.
[14] H. Cheng, X. Yan, and J. Han, "IncSpan: incremental mining of sequential patterns in large database," in Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, 2004, pp. 527-532.
[15] S. Brin, R. Motwani, J. D. Ullman, and S. Tsur, "Dynamic itemset counting and implication rules for market basket data," in ACM SIGMOD Record, 1997, pp. 255-264.
[16] J. S. Park, M.-S. Chen, and P. S. Yu, "Using a hash-based method with transaction trimming for mining association rules," Knowledge and Data Engineering, IEEE Transactions on, vol. 9, pp. 813-825, 1997.
[17] J. S. Park, M.-S. Chen, and P. S. Yu, "Efficient parallel data mining for association rules," in Proceedings of the fourth
international conference on Information and knowledge management, 1995, pp. 31-36.
[18] R. Agrawal and J. C. Shafer, "Parallel mining of association rules," Knowledge and Data Engineering, IEEE Transactions on, vol. 8, pp. 962-969, 1996.
[19] D. W. Cheung, J. Han, V. T. Ng, A. W. Fu, and Y. Fu, "A fast distributed algorithm for mining association rules," in Parallel and Distributed Information Systems, 1996., Fourth International Conference on, 1996, pp. 31-42.
[20] M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li, "Parallel algorithms for discovery of association rules," Data mining and knowledge discovery, vol. 1, pp. 343-373, 1997.
[21] A. Savasere, E. R. Omiecinski, and S. B. Navathe, "An efficient algorithm for mining association rules in large databases," 1995.
[22] F. Geerts, B. Goethals, and J. Van den Bussche, "A tight upper bound on the number of candidate patterns," in Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, 2001, pp. 155-162.
[23] J. S. Park, M.-S. Chen, and P. S. Yu, An effective hash-based algorithm for mining association rules vol. 24: ACM, 1995.
[24] S. Brin, R. Motwani, J. D. Ullman, and S. Tsur, "Dynamic itemset counting and implication rules for market basket data," in ACM SIGMOD Record, 1997, pp. 255-264.
[25] M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li, "New Algorithms for Fast Discovery of Association Rules," in KDD, 1997, pp. 283-286.
[26] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules," in Proc. 20th Int. Conf. Very Large Data Bases, VLDB, 1994, pp. 487-499.
[27] J. Han, J. Pei, and Y. Yin, "Mining frequent patterns without candidate generation," in ACM SIGMOD Record, 2000, pp. 1-12.
[28] F.-y. Deng and Z.-y. Liu (Dept. of Management Science, Xiamen University, Xiamen 361005, China), "An Ameliorating FP-growth Algorithm Based on Patterns-matrix [J]," Journal of Xiamen University (Natural Science), vol. 5, 2005.
[29] Z. Y. LüHongbing, "An Incremental Updating Algorithm to Mine Association Rules Based on Frequent Pattern Growth [J]," Computer Engineering and Applications, vol. 26, p. 055, 2004.
[30] K. Wang, L. Tang, J. Han, and J. Liu, Top down fp-growth for association rule mining: Springer, 2002.
[31] Y. Qiu, Y.-J. Lan, and Q.-S. Xie, "An improved algorithm of mining from FP-tree," in Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on, 2004, pp. 1665-1670.
[32] A. Pietracaprina, "Mining frequent itemsets using patricia tries," 2003.
[33] J.-W. Han, J. Pei, and X.-F. Yan, "From sequential pattern mining to structured pattern mining: a pattern-growth approach," Journal of Computer Science and Technology, vol. 19, pp. 257-279, 2004.
[34] Y.-H. Hu and Y.-L. Chen, "Mining association rules with multiple minimum supports: a new mining algorithm and a support tuning mechanism," Decision Support Systems, vol. 42, pp. 1-24, 2006.
[35] J. Pei, J. Han, H. Lu, S. Nishio, S. Tang, and D. Yang, "H-mine: Hyper-structure mining of frequent patterns in large databases," in Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, 2001, pp. 441-448.
[36] S. O. Fageeri, S. M. E. Hossain, S. Arockiasamy, and T. Y. Al-Salmi, "High-Utility Pattern Mining Using ULB-Miner," in S. Aurelia, S. S. Hiremath, K. Subramanian, and S. K. Biswas (eds), Sustainable Advanced Computing, Lecture Notes in Electrical Engineering, vol. 840, 2022.
[37] S. Fageeri, R. Ahmad, and H. Alhussian, "An Efficient Algorithm for Mining Frequent Itemsets and Association Rules," in S. Subair and C. Thron (eds), Implementations and Applications of Machine Learning, Studies in Computational Intelligence, vol. 782, Springer, Cham, 2020.