Mining Recent Maximal Frequent Itemsets Over Data Streams With Sliding Window
Abstract: The huge volume of data streams makes it impractical to mine all recent frequent itemsets. Because the maximal frequent itemsets compactly imply all the frequent itemsets and are far fewer in number, mining maximal frequent itemsets costs much less time and memory. This paper proposes an improved method called Recent Maximal Frequent Itemsets Mining (RMFIsM) to mine recent maximal frequent itemsets over data streams with a sliding window. The RMFIsM method uses two matrixes to store the information of data streams: the first matrix stores the information of each transaction and the second stores the frequent 1-itemsets. The frequent p-itemsets are mined by an "extension" process applied to the frequent 2-itemsets, and the maximal frequent itemsets are obtained by deleting the sub-itemsets of the long frequent itemsets. Finally, the performance of the RMFIsM method is evaluated in a series of experiments; the results show that the proposed RMFIsM method can mine recent maximal frequent itemsets efficiently.
Keywords: Data streams, recent maximal frequent itemsets, sliding window, matrix structure.
frequent itemsets mining method is described in section 4. The experimental analysis is presented in section 5, and the conclusion is given in section 6.

2. Related Work

At present, several methods have been proposed to mine frequent itemsets over data streams. According to the processing model of the data stream, frequent itemsets mining models can be divided into:

1. Landmark window model.
2. Damped window model.
3. Sliding window model.

For the landmark window model, researchers focus on the data of the entire data stream and obtain the global frequent itemsets through the analysis of historical data. Li et al. [8] drew on the Apriori algorithm to present a method called Data Stream Mining for Maximal Frequent Itemsets (DSM-MFI); it uses a prefix tree structure to store the information of the data stream, and the maximal frequent itemsets are then mined from the constructed prefix tree. The INSTANT method was presented by Mao et al. [10]; it defines some sub-operators of itemsets and maintains itemsets with different levels of support in memory. The advantage of the INSTANT method is that the maximal frequent itemsets can be displayed directly to the user through a series of sub-operations when a new transaction arrives.

For the damped window model, each transaction has a corresponding weight that decreases gradually over time; therefore, preserving and reducing the related information of historical data must be considered when controlling this weight. Chang and Lee [2] developed a method called estDec in 2003. It examines each transaction in turn without generating any candidates; the occurrence count of the itemsets appearing in each transaction is maintained in a prefix-tree structure, and the effect of old transactions on the current mining result is diminished by a parameter called the debilitating factor. Lin et al. [9] presented the Mining Recently Frequent Itemsets with Variable Support over Data Streams (MRVSDS) algorithm, which stores the frequent itemsets of the current window in a PFI-tree structure; itemsets are deleted from the PFI-tree when the degree of the transaction is less than min_sup. In addition, the authors designed the Decaying Synopsis Vector (DSYV) structure to store the processed transactions, and the frequent itemsets are found by re-mining the transactions from the DSYV when the current itemsets' support falls below the historical min_sup.

For the sliding window model, the focus is always on the recent transactions; therefore, the mining results are the local frequent itemsets over a certain period of time. Yang et al. [13] designed an efficient algorithm named DSM-Miner to mine maximal frequent itemsets over data streams. It uses an appropriate method to reduce the effects of old transactions, and the Sliding Window Maximum frequent pattern Tree (called SWM-Tree) was proposed to maintain the latest patterns' information. In the process of mining maximal frequent patterns, DSM-Miner uses appropriate pruning operations, a bit-group calculation pattern for items and "depth-first" search strategies; the experimental results showed that DSM-Miner performed better in time and memory usage. A new algorithm based on the prefix-tree data structure was proposed by Deypir et al. [5] to find and update the frequent itemsets of the windows. A batch of transactions is used as the unit of insertion and deletion within the window to improve performance; moreover, an effective traversal strategy for the prefix-tree and a suitable representation for each batch of transactions are used in the algorithm, the required information is stored in each node of the prefix-tree, and old batches of transactions are deleted directly.

However, the proposed methods also have some disadvantages. The drawbacks of the INSTANT algorithm [10] are that the number of arrays designed for maintaining all maximal frequent itemsets is very large and the memory cost is correspondingly expensive; moreover, no efficient superset or subset checks are applied to the newly identified maximal frequent itemsets of each array, so the number of comparisons grows very fast and the memory usage is enlarged rapidly when the average length of the transactions becomes longer.

3. Definitions And Problems Statement

In this section, we first provide some formal definitions of the important terms used in this paper and then give the problems statement.

3.1. Definitions

Let I= {i1, i2, i3, …, im} be a finite set of m distinct items. The data stream DS= [T1, T2, T3, …, Tn), where each transaction Tj∈DS is a subset of I with a unique identifier TID. If itemsets α and β satisfy α ⊆ β, α is called the sub-itemset of β and β is called the super-itemset of α. If the length of an itemset is k, it is called a k-itemset. Table 1 shows an example of a data stream, used as the running example to explain the definitions. In this example, assume that min_sup is 0.33 and the size of the sliding window is 6.
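The sub-/super-itemset relation of section 3.1 maps directly onto set inclusion; a tiny Python check (the item names are only illustrative):

```python
# Itemsets as frozensets; alpha is a sub-itemset of beta iff alpha <= beta.
alpha = frozenset({"i1", "i2"})
beta = frozenset({"i1", "i2", "i3", "i5"})

assert alpha <= beta    # alpha is a sub-itemset of beta
assert beta >= alpha    # beta is a super-itemset of alpha
assert len(beta) == 4   # beta is a 4-itemset
```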
Table 1. An example of data streams.

TID  Transaction        TID  Transaction
T1   {i1,i2,i3}         T6   {i1,i2,i3,i5,i6}
T2   {i1,i2,i4}         T7   {i1,i2,i5}
T3   {i2,i3,i5}         T8   {i1,i2,i3,i4}
T4   {i1,i2,i3,i5}      T9   {i2,i3,i5}
T5   {i1,i3,i5}         …    ……

Support: The frequency of itemset xi in DS is defined as its support, that is, support({xi}) = count(xi, DS) / |SW|, where count(xi, DS) is the number of transactions in DS containing itemset xi and |SW| is the size of the sliding window.

For example, itemset {i1} exists in T1, T2, T4, T5 and T6 in the current sliding window; therefore, support({i1}) = 5/6. Itemset {i1, i2} exists in T1, T2, T4 and T6 in the current sliding window; therefore, support({i1, i2}) = 4/6.

Frequent Itemsets (FIs): The frequent itemsets are the itemsets whose support is not less than the predefined minimal support threshold min_sup.

For example, itemset {i1, i3} exists in T1, T4, T5 and T6, and support({i1, i3}) = 4/6 > 0.33; therefore, {i1, i3} is a frequent itemset.

Infrequent Itemsets (IFIs): The infrequent itemsets are the itemsets whose support is less than the predefined minimal support threshold min_sup.

For example, itemset {i2, i4} exists only in T2, and support({i2, i4}) = 1/6 < 0.33; therefore, {i2, i4} is an infrequent itemset.

Maximal Frequent Itemsets (MFIs): The maximal frequent itemsets are the itemsets that satisfy the following two conditions:

1. They are frequent itemsets.
2. No super-itemset of them is frequent.

For example, itemset {i4} is not an MFI because support({i4}) = 1/6 < 0.33. Itemset {i1} is not an MFI even though support({i1}) = 5/6 > 0.33, because its super-itemset {i1, i2} is frequent. Itemset {i1, i2, i3, i5} is an MFI because support({i1, i2, i3, i5}) = 2/6 > 0.33 and no super-itemset of it is frequent.

Dictionary order: If itemset A appears earlier than itemset B in the dictionary, the dictionary order of itemsets A and B is recorded as: A » B. Similarly, the next itemsets can be recorded as: A » ABD » ACD » BD in dictionary order.

3.2. Problems Statement

For mining useful information over data streams, the final mining results should be sent to users immediately; that is, any useful data should be processed in an efficient way, so real-time response is very important to users. In addition, the huge nature of data streams makes it impossible to store all the data information in main memory or even in secondary storage, because data streams can easily consume all system resources and bring difficulties to the underlying mining tasks.

Specifically, the DSM-MFI method [8] uses a summary frequent itemset forest to store every sub-projection of the transactions, and its two main problems can be summarized as:

1. Large memory storage is wasted on storing the sub-projections, because a part of the sub-projections are not frequent.
2. Much time is wasted on deleting the sub-projections from the summary frequent itemset forest to achieve lower memory occupancy.

The prefix tree generated by the estDec method [2] becomes very large as the number of frequent itemsets increases; more seriously, the estDec method stops working once the prefix tree fills the memory. The drawback of the TMFI method [6] is that the infrequent 1-itemsets are also stored in the matrix structure; therefore, some meaningless "extension" operations on infrequent itemsets are also conducted to gain longer itemsets.

In general, the time cost, the memory storage and the accuracy rate of the mining process are the most important problems we should deal with.

4. Mining Recent Maximal Frequent Itemsets

In this section, we build on the TMFI method [6] to propose an improved method called RMFIsM to mine the recent MFIs over data streams. The RMFIsM method uses two matrixes (recorded as matrix A and frequent matrix B) to store the information of each item, and the infrequent itemsets are deleted from the matrix immediately, based on the downward closure property, to reduce the time cost and memory usage.

4.1. The Structure of RMFIsM Method

Matrix A is constructed to store the information of each item of the data stream, and frequent matrix B is built to record the information of the frequent 1-itemsets. The rows of matrix A stand for the transactions Ti and the columns of matrix A stand for the items {i1, i2, i3, …, im}; the size of matrix A is (n+1)*m, where row (n+1) records the support of each item. Specifically, the transactions are scanned in order while the current sliding window is not full, and Ad,k is marked as 1 if item ik appears in transaction Td; otherwise, Ad,k is marked as 0.
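The marking rule for matrix A and its support row can be sketched in Python (a minimal illustration on the window T1-T6 of Table 1; all variable names are ours, not the paper's):

```python
# Build matrix A for the current sliding window: rows 0..n-1 hold the
# transactions, row n holds the per-item support counts.
items = ["i1", "i2", "i3", "i4", "i5", "i6"]    # the m distinct items
window = [                                       # T1..T6 from Table 1
    {"i1", "i2", "i3"}, {"i1", "i2", "i4"}, {"i2", "i3", "i5"},
    {"i1", "i2", "i3", "i5"}, {"i1", "i3", "i5"}, {"i1", "i2", "i3", "i5", "i6"},
]
n, m = len(window), len(items)

A = [[0] * m for _ in range(n + 1)]
for d, T in enumerate(window):
    for k, item in enumerate(items):
        A[d][k] = 1 if item in T else 0          # A[d][k] = 1 iff ik in Td
A[n] = [sum(A[d][k] for d in range(n)) for k in range(m)]  # support row

# Circular replacement per Equation (1): with 1-based rows as in the paper,
# the new transaction T7 (d = 7, n = 6) lands in row pos = 7 % 6 = 1,
# overwriting the oldest transaction T1 (pos = 0 means row n).
d = 7
pos = d % n

print(A[n])   # per-item counts in the window: [5, 5, 5, 1, 4, 1]
```

Dividing the support row by |SW| = 6 reproduces the supports of section 3.1, e.g., support({i1}) = 5/6.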
964 The International Arab Journal of Information Technology, Vol. 16, No. 6, November 2019
In order to effectively mine the recent information of data streams, old transactions need to be replaced by new ones directly. The position of the new transaction Td is calculated by Equation (1), where n is the size of the sliding window, and the information of transaction Td is recorded in row n if the result of pos is 0.

pos = d % n    (1)

Frequent matrix B is built to store the data information of the frequent 1-itemsets in dictionary order. The original elements of matrix B are 0 and the real size of matrix B is (k-1)*(k-1), where k is the number of frequent 1-itemsets. The construction process of matrix B is as follows: for frequent 1-itemsets ip and iq with the order ip » iq, a "logic and" operation is performed on every element of columns p and q in matrix A; Bp,q is marked as 1 if the resulting count for itemset {ip, iq} is not less than min_sup, otherwise Bp,q is marked as 0.

Matrix A and frequent matrix B are the basis for mining maximal frequent itemsets; the pseudo-code for constructing matrix A and frequent matrix B is shown in Algorithm 1.

Algorithm 1: Construct matrix A and frequent matrix B

Input: Data streams, n (the maximal |SW|), m (maximal number of different items), min_sup
Output: matrix A, frequent matrix B

for (d=1 to n)            // scan the transactions of the current window
{
  for (k=1 to m)
  {
    if (ik in Td)
      Ad,k=1
    else
      Ad,k=0
  }
}
return matrix A
for (k=1 to m)
{
  if (support(ik) ≥ min_sup)
    add ik to matrix B
  else
    delete ik
}
for (k=1 to |B|)
{
  for (s=k+1 to |B|)
  {
    if (support({ik,is}) ≥ min_sup)
      Bk,s=1
    else
      Bk,s=0
  }
}
return matrix B

4.2. Downward Closure Property

The downward closure property is an important part of the RMFIsM method; it is the foundation of the pruning strategy that reduces the meaningless "extension" process to save time cost and memory usage. If itemset {ik1, ik2, …, ikp} is a frequent p-itemset, the "extension" process extends it into candidate (p+1)-itemsets.

Theorem 1. If Xk is a frequent k-itemset, then any nonempty sub-itemset Xk-1 of Xk is also frequent.

Proof. Since Xk-1 ⊆ Xk, the transactions that contain itemset Xk must contain itemset Xk-1, that is: TID(Xk) ⊆ TID(Xk-1). It follows that: support(Xk-1) ≥ support(Xk) ≥ min_sup. Hence, any nonempty sub-itemset Xk-1 of Xk is also frequent if Xk is a frequent itemset.

Theorem 2. If Xk is an infrequent k-itemset, then any super-itemset Xk+1 of Xk is also infrequent.

Proof. Since Xk ⊆ Xk+1, the transactions that contain itemset Xk+1 must contain itemset Xk, that is: TID(Xk+1) ⊆ TID(Xk). It follows that: support(Xk+1) ≤ support(Xk) < min_sup. Hence, any super-itemset Xk+1 of Xk is also infrequent if Xk is an infrequent itemset.

It can easily be seen from the downward closure property that the "extension" process of infrequent itemsets is meaningless; thus, the downward closure property should be considered in every step of maximal frequent itemsets mining. More specifically, the infrequent itemsets existing in matrix A should not be added into frequent matrix B as basic elements of the "extension" process of the RMFIsM method; that is, if the 1-itemset ip is infrequent, its super-itemsets cannot be frequent, so ip should not appear in matrix B. This reduces the time cost and memory usage both in constructing matrix B and in calculating the support of these meaningless extended itemsets.

4.3. The Main Idea of RMFIsM Method

The main idea of the RMFIsM method consists of the following three parts:

1. Extend the short frequent itemsets into long itemsets.
2. Calculate the support value of the extended long itemsets and save the frequent long itemsets into the maximal frequent itemsets library MFIs_L.
3. Check the extended frequent long itemsets and move their frequent sub-itemsets out of MFIs_L. Note that each itemset needs to be checked before the "extension" process to discard the infrequent itemsets and further improve the mining efficiency.

Once matrix A is constructed and each element is written into it, the support value of each item is calculated and written in row (n+1), and the frequent 1-itemsets are stored into MFIs_L. After matrix B is constructed and the corresponding items are written into it, the frequent 2-itemsets whose entries are marked 1 are stored into MFIs_L, and then all 1-itemsets are checked and each sub-itemset of a frequent 2-itemset is moved out of MFIs_L.
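The three steps of subsection 4.3 can be sketched end-to-end in Python (our own illustrative implementation of the extend-and-prune idea, not the authors' code; it reproduces the running example of Table 1 with min_sup = 0.33):

```python
min_sup = 0.33
# Current window T1..T6 from Table 1.
window = [
    {"i1", "i2", "i3"}, {"i1", "i2", "i4"}, {"i2", "i3", "i5"},
    {"i1", "i2", "i3", "i5"}, {"i1", "i3", "i5"}, {"i1", "i2", "i3", "i5", "i6"},
]

def support(itemset):
    """Fraction of window transactions containing the itemset."""
    return sum(itemset <= T for T in window) / len(window)

# Infrequent 1-itemsets are discarded up front (downward closure),
# so they never take part in any "extension".
items = sorted({i for T in window for i in T})
f1 = [i for i in items if support({i}) >= min_sup]

mfis = [frozenset({i}) for i in f1]   # the library MFIs_L
level = mfis
while level:
    # Step 1: extend each frequent p-itemset by one frequent item.
    extended = {X | {i} for X in level for i in f1 if i not in X}
    # Step 2: keep only the frequent extended itemsets.
    next_level = [X for X in extended if support(X) >= min_sup]
    # Step 3: move sub-itemsets of the new frequent itemsets out of MFIs_L.
    if next_level:
        mfis = [X for X in mfis if not any(X < Y for Y in next_level)]
        mfis += next_level
    level = next_level

print(sorted(sorted(X) for X in mfis))   # → [['i1', 'i2', 'i3', 'i5']]
```

As in the worked example, the only recent MFI of this window is {i1, i2, i3, i5}; the matrix-based "logic and" of the actual method replaces the `support` scan above with bitwise column operations.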
In the running example, every frequent 1-itemset is the sub-itemset of some frequent 2-itemset. Here, the itemsets in MFIs_L are {i1, i2}, {i1, i3}, {i1, i5}, {i2, i3}, {i2, i5} and {i3, i5}.

Then, the frequent 2-itemsets need to be extended into 3-itemsets. The frequent 2-itemset {i1, i2} is first selected as the conditional potential itemset; because B(i1,i3)=1 and B(i2,i3)=1, {i1, i2} can be extended into {i1, i2, i3}, which is saved into MFIs_L because support({i1, i2, i3}) = 0.5 > 0.33. The same process is repeated to gain the frequent 3-itemsets {i1, i2, i5}, {i1, i3, i5} and {i2, i3, i5}, which are saved into MFIs_L.

After gaining all frequent 3-itemsets, each frequent 2-itemset is checked and all such sub-itemsets are moved out of MFIs_L.

Next, the frequent 3-itemsets need to be extended into 4-itemsets. The frequent {i1, i2, i3} is first selected as the conditional potential itemset; because B(i1,i5)=1, B(i2,i5)=1 and B(i3,i5)=1, {i1, i2, i3} can be extended into {i1, i2, i3, i5}, which is saved into MFIs_L because support({i1, i2, i3, i5}) = 0.333 > 0.33. After gaining the frequent 4-itemsets, each frequent 3-itemset is checked and each such sub-itemset is moved out of MFIs_L.

After the above steps, MFIs_L is {i1, i2, i3, i5}.

5. Experimental Analysis

To verify the efficiency of our proposed RMFIsM method, the estDec method [2], the TMFI method [6] and the DSM-MFI method [8] are compared in our experiments. All experiments are conducted on a machine running Windows 7 with an Intel dual-core i3-2020 2.93 GHz processor; the development environment is Microsoft Visual Studio 2010. The performance of the RMFIsM method is analyzed on the synthetic sparse dataset T10.I4.D1000K and the synthetic dense dataset T30.I20.D1000K generated by the IBM data generator, where |T| means the average size of the transactions, |I| means the potential size of the frequent itemsets, |D| means the total number of transactions, and K means one thousand. Experiments are conducted to investigate the efficiency of the RMFIsM method in both time cost and memory usage with different values of min_sup, different sizes of the sliding window and different numbers of transactions; experiments are also conducted to test the accuracy rate of the RMFIsM method. Each group of experiments is repeated 50 times, and the average time and memory usage are calculated.

5.1. Time Cost for RMFIsM Method

The time cost for mining recent MFIs on the sparse dataset T10.I4.D1000K with different values of min_sup is shown in Figure 3-a. The time cost on T10.I4.D1000K with different sizes of the sliding window is shown in Figure 3-b. The time cost on T10.I4.D1000K with different numbers of transactions is shown in Figure 3-c. The time cost on the dense dataset T30.I20.D1000K is shown in Figures 4-a, 4-b and 4-c, respectively.

It can be seen from Figures 3-a and 4-a that the time cost of the RMFIsM, DSM-MFI, estDec and TMFI methods shows a decreasing trend with increasing min_sup. The time cost of our proposed RMFIsM method is the lowest of the four compared methods; the reason is that in the process of mining MFIs, the RMFIsM method just implements the "logic and" operation on the data information stored in the matrixes, which reduces the iteration, sorting and pruning operations; moreover, the infrequent itemsets are discarded directly in the RMFIsM method to avoid meaningless "extension" operations. Compared with the DSM-MFI method, the time saved by the RMFIsM algorithm is large at first and becomes smaller gradually as min_sup increases; the reason is that the total number of frequent itemsets decreases significantly for large values of min_sup. Compared with dataset T10.I4.D1000K, the time cost of the MFIs mining process on T30.I20.D1000K is much higher; the reason is that the itemsets in the dense dataset T30.I20.D1000K are more likely to be frequent because of their larger support values.

It can be seen from Figures 3-b and 4-b that with increasing sliding-window size, the time cost of the four compared methods shows an increasing trend; the reason is that the number of frequent itemsets rises rapidly as |SW| becomes larger. The time cost of our proposed RMFIsM method is the lowest of the four methods, and the time cost on T30.I20.D1000K is much larger than that on T10.I4.D1000K.

We can see from Figures 3-c and 4-c that the time cost of the four compared methods increases with the number of transactions; the reason is that the frequent itemsets increase gradually as the number of transactions rises. The time cost of the RMFIsM method is less than that of the DSM-MFI, estDec and TMFI methods, and the time cost on T30.I20.D1000K is much higher than that on T10.I4.D1000K.
[Figures: result curves for the four compared methods; each figure has three panels — a) Different min_sup (|SW|=1000, Transactions=1000K). b) Different sizes of sliding window (min_sup=0.1, Transactions=1000K). c) Different numbers of transactions (min_sup=0.1, |SW|=1000).]
5.2. Memory Usage for RMFIsM Method

Memory usage is an important factor in measuring the efficiency of our proposed RMFIsM method. The experiment to test the peak memory usage is also conducted with different values of min_sup, different sizes of the sliding window and different numbers of transactions; the parameters used in this experiment are the same as those in subsection 5.1, and the experimental results are shown in Figures 5 and 6.

We can see from Figures 5-a and 6-a that with increasing min_sup, the peak memory usage of the four compared algorithms shows a decreasing trend. This is because the number of frequent 1-itemsets decreases gradually with larger values of min_sup, so the number of intermediate itemsets generated in the MFIs mining process is also much reduced. The peak memory usage of our proposed RMFIsM method is the lowest of the four methods; the reason is that the infrequent itemsets are discarded at the beginning of the RMFIsM method, so no meaningless "extension" operations are conducted to occupy additional memory storage. Compared with the sparse dataset T10.I4.D1000K, the memory usage of the MFIs mining process on the dense dataset T30.I20.D1000K is much higher.

Figures 5-b and 6-b show that the peak memory usage of the four compared methods grows gradually with increasing size of the sliding window; the reason is that the number of frequent itemsets becomes much larger as the sliding window is extended. The peak memory usage on the sparse dataset T10.I4.D1000K is much smaller than that on the dense dataset T30.I20.D1000K.

We can see from Figures 5-c and 6-c that the peak memory usage of the RMFIsM, DSM-MFI, estDec and TMFI methods increases smoothly with the number of transactions, and the occupied peak memory is linearly related to the number of transactions. Among the four compared methods, the peak memory usage of the RMFIsM method is lower than that of the estDec, TMFI and DSM-MFI methods to a certain extent. The peak memory usage on the dense dataset T30.I20.D1000K is also much larger than that on the sparse dataset T10.I4.D1000K.
Table 2. Accuracy rate of RMFIsM method.

min_sup  T10    T30      |SW|   T10    T30      Transactions  T10    T30
0.05     87.2%  89.6%    200    91.3%  92.2%    300K          92.1%  93.4%
0.1      92.3%  93.4%    400    91.6%  92.6%    400K          92.4%  93.2%
0.15     95.2%  96.1%    600    91.9%  92.8%    500K          92.3%  93.5%
0.2      96.4%  96.9%    800    92%    93.1%    600K          92.2%  93.5%
0.25     96.8%  97.3%    1000   92.3%  93.5%    700K          92.4%  93.3%
0.3      97.1%  97.5%    1200   92.5%  93.6%    800K          92.1%  93.4%
0.35     97.2%  97.6%    1400   92.7%  93.7%    900K          92.2%  93.2%
0.4      97.3%  97.8%    1600   92.8%  93.9%    1000K         92.3%  93.5%
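The paper does not spell out how the accuracy rate is computed; one common choice, which we assume here purely for illustration, is the fraction of the exact recent MFIs that the miner recovers:

```python
# Hypothetical accuracy measure (our assumption, not the paper's formula):
# recall of the mined MFIs against the exact set of recent MFIs.
def accuracy(mined, exact):
    mined, exact = set(map(frozenset, mined)), set(map(frozenset, exact))
    return len(mined & exact) / len(exact) if exact else 1.0

exact = [{"a", "b"}, {"c"}]   # toy ground truth, unrelated to Table 1
mined = [{"a", "b"}]          # one of the two exact MFIs was found
print(accuracy(mined, exact))   # → 0.5
```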
Computing, vol. 41, pp. 214-223, 2016.
[4] Deypir M. and Sadreddini M., "A Dynamic Layout of Sliding Window for Frequent Itemset Mining Over Data Streams," Journal of Systems and Software, vol. 85, no. 3, pp. 746-759, 2012.
[5] Deypir M., Sadreddini M., and Tarahomi M., "An Efficient Sliding Window Based Algorithm for Adaptive Frequent Itemset Mining over Data Streams," Journal of Information Science and Engineering, vol. 29, no. 5, pp. 1001-1020, 2013.
[6] Guidan F. and Shaohong Y., "A Frequent Itemsets Mining Algorithm Based on Matrix in Sliding Window Over Data Streams," in Proceedings of 3rd International Conference on Intelligent System Design and Engineering Applications, Hong Kong, pp. 66-69, 2013.
[7] Han M., Ding J., and Li J., "TDMCS: An Efficient Method for Mining Closed Frequent Patterns over Data Streams Based on Time Decay Model," The International Arab Journal of Information Technology, vol. 14, no. 6, pp. 851-860, 2017.
[8] Li H., Lee S., and Shan M., "Online Mining (Recently) Maximal Frequent Itemsets Over Data Streams," in Proceedings of 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, Tokyo, pp. 11-18, 2005.
[9] Lin M., Hsueh S., and Wang C., "Interactive Mining of Frequent Patterns in A Data Stream of Time-Fading Models," in Proceedings of 8th International Conference on Intelligent Systems Design and Applications, Kaohsiung, pp. 513-518, 2008.
[10] Mao G., Wu X., Zhu X., Chen G., and Liu C., "Mining Maximal Frequent Itemsets From Data Streams," Journal of Information Science, vol. 33, no. 3, pp. 251-262, 2007.
[11] Nori F., Deypir M., and Sadreddini M., "A Sliding Window Based Algorithm For Frequent Closed Itemset Mining Over Data Streams," Journal of Systems and Software, vol. 86, no. 3, pp. 615-623, 2013.
[12] Shin S., Lee D., and Lee W., "CP-Tree: An Adaptive Synopsis Structure for Compressing Frequent Itemsets Over Online Data Streams," Information Sciences, vol. 278, pp. 559-576, 2014.
[13] Yang J., Wei Y., and Zhou F., "An Efficient Algorithm for Mining Maximal Frequent Patterns over Data Streams," in Proceedings of 7th International Conference on Intelligent Human-Machine Systems and Cybernetics, Hangzhou, pp. 444-447, 2015.

Saihua Cai is a Ph.D. student in the College of Information and Electrical Engineering, China Agricultural University, China. He received the MS degree from Jiangsu University, China, in 2016. His major research interests include uncertain data management, data mining, outlier detection and software testing.

Shangbo Hao is a Master's student in the College of Information and Electrical Engineering, China Agricultural University, China. His research interests include pattern mining and outlier detection.

Ruizhi Sun is a Full Professor in the College of Information and Electrical Engineering, China Agricultural University, China. He received his Ph.D. degree in Computer Science and Technology from Tsinghua University, Beijing, China, in 2003. His major research interests include agricultural data acquisition and processing technology, computer networks and applications, workflow management and cloud computing.

Gang Wu is an associate professor in the Computer Science Department, Tarim University, China. His research interests mainly involve agricultural information processing technology, data mining, and agricultural remote sensing applications.