An Efficient Algorithm For Enumerating Closed Patterns in Transaction Databases
1 Introduction
Frequent pattern mining is one of the fundamental problems in data mining and has
many applications such as association rule mining [1, 5, 7] and condensed representation of inductive queries [12]. To handle frequent patterns efficiently, equivalence classes induced by pattern occurrences have been considered; closed patterns are the maximal patterns of their equivalence classes.
This paper addresses the problem of enumerating all frequent closed patterns. Many algorithms have been proposed for this problem [14, 13, 15, 20, 21]. These algorithms are basically based on the enumeration of frequent patterns: they enumerate frequent patterns and output only those that are closed. The enumeration of frequent patterns has been studied well and can be done efficiently [1, 5]. Many computational experiments support that, in practice, these algorithms take very short time per pattern on average.
However, as we will show in a later section, the number of frequent patterns can be exponentially larger than the number of closed patterns, hence the computation time per closed pattern can, on average, be exponential in the size of the dataset. The existing algorithms therefore use heuristic pruning to cut off non-closed frequent patterns. However, the pruning is not complete, so they may still take exponential time per closed pattern.
Moreover, these algorithms have to store previously obtained frequent patterns in memory to avoid duplicates. Some of them further use the stored patterns for
[Footnote: Presently working for Fujitsu Laboratory Ltd., e-mail: [email protected]]
checking the “closedness” of patterns. This consumes much memory, sometimes exponential in both the size of the database and the number of closed patterns. In summary, the existing algorithms may take time and memory exponential in both the database size and the number of frequent closed patterns. This is not only a theoretical observation but is also supported by the results of the computational experiments of FIMI’03 [7]. When the number of frequent patterns is much larger than the number of frequent closed patterns, as in BMS-WebView-1 with small supports, the computation time of the existing algorithms is very large relative to the number of frequent closed patterns.
In this paper, we propose a new algorithm LCM (Linear time Closed pattern Miner)
for enumerating frequent closed patterns. Our algorithm uses a new technique called
prefix preserving extension (ppc-extension), which is an extension of a closed pattern
to another closed pattern. Since this extension generates a new frequent closed pattern from a previously obtained closed pattern, it enables us to completely prune unnecessary non-closed frequent patterns. Our algorithm always finds a new frequent closed pattern in time linear in the size of the database and never takes exponential time; in other words, it always terminates in time linear in the number of closed patterns. Since any closed pattern is generated by this extension from exactly one other closed pattern, we can enumerate frequent closed patterns in a depth-first search manner, hence we need no memory for previously obtained patterns. Thereby, the memory usage of our algorithm depends only on the size of the input database. This is not only a theoretical result; our computational experiments also support the practical efficiency of our algorithm. The techniques used in our algorithm are orthogonal to existing techniques such as FP-trees, look-ahead, and heuristic preprocessing. Moreover, we can add existing memory-saving techniques so that our algorithm can handle huge databases much larger than the memory size.
For practical computation, we propose occurrence deliver, anytime database reduction, and fast ppc-test, which accelerate the enumeration significantly. We examined the performance of our algorithm on real-world and synthetic datasets taken from the FIMI’03 repository, including the datasets used in the KDD-cup [11], and compared it with straightforward implementations. The results show that the combination of these techniques and the prefix-preserving extension performs well on many kinds of datasets.
In summary, our algorithm has the following advantages.
· Linear time enumeration of closed patterns
· No storage space for previously obtained closed patterns
· Generating any closed pattern from another unique closed pattern
· Depth-first generation of closed patterns
· Small practical computational cost of generation of a new pattern
The organization of the paper is as follows. Section 2 prepares notions and definitions. In Section 3, we give an example showing that the number of frequent patterns can be exponential in the number of closed patterns. Section 4 explains the existing schemes for closed pattern mining, then presents the prefix-preserving closure extension and our algorithm. Section 5 describes several improvements for practical use. Section 6 presents experimental results of our algorithm and its improvements on synthetic and real-world datasets. We conclude the paper in Section 7.
2 Preliminaries
We give basic definitions and results on closed pattern mining according to [1, 13, 6].
Let I = {1, . . . , n} be the set of items. A transaction database on I is a set T = {t1, . . . , tm} such that each ti is included in I. Each ti is called a transaction. We denote the total size of T by ||T||, i.e., ||T|| = Σ_{t∈T} |t|. A subset P of I is called a pattern (or itemset). For a pattern P, a transaction including P is called an occurrence of P. The denotation of P, denoted by T(P), is the set of the occurrences of P. |T(P)| is called the frequency of P and denoted by frq(P). For a given constant θ ∈ N, called a minimum support, a pattern P is frequent if frq(P) ≥ θ. For any patterns P and Q, T(P ∪ Q) = T(P) ∩ T(Q) holds, and if P ⊆ Q then T(Q) ⊆ T(P).
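These basic notions can be written down directly. Below is a minimal sketch with a toy database and hypothetical helper names (not the paper's implementation):

```python
# Toy transaction database: items are integers, a transaction is a set.
T = [{1, 2, 5}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

def occurrences(db, P):
    """The denotation T(P): all transactions that include pattern P."""
    return [t for t in db if set(P) <= t]

def frq(db, P):
    """The frequency of P: the number of its occurrences."""
    return len(occurrences(db, P))

# T(P ∪ Q) = T(P) ∩ T(Q), and P ⊆ Q implies T(Q) ⊆ T(P):
assert occurrences(T, {2} | {3}) == [t for t in occurrences(T, {2}) if {3} <= t]
assert frq(T, {2, 5}) <= frq(T, {2})
```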
Let T be a database and P be a pattern on I. For a pair of patterns P and Q, we say P and Q are equivalent to each other if T(P) = T(Q). This relationship induces equivalence classes on patterns. A maximal pattern and a minimal pattern of an equivalence class, w.r.t. set inclusion, are called a closed pattern and a key pattern, respectively. We denote by F and C the sets of all frequent patterns and of all frequent closed patterns in T, respectively.
Given a set S ⊆ T of transactions, let I(S) = ∩_{T∈S} T be the set of items common to all transactions in S. Then, we define the closure of a pattern P in T, denoted by Clo(P), by ∩_{T∈T(P)} T. For every pair of patterns P and Q, the following properties hold (Pasquier et al. [13]).
(1) If P ⊆ Q, then Clo(P ) ⊆ Clo(Q).
(2) If T (P ) = T (Q), then Clo(P ) = Clo(Q).
(3) Clo(Clo(P )) = Clo(P ).
(4) Clo(P ) is the unique smallest closed pattern including P .
(5) A pattern P is a closed pattern if and only if Clo(P ) = P .
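As a small illustration of the closure operation and of properties (3) and (5), the intersection over T(P) can be computed directly. This is a sketch over an assumed toy database, not the paper's code:

```python
from functools import reduce

# Toy database; items are integers, transactions are sets.
T = [{1, 2, 5}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

def clo(db, P):
    """Clo(P): the intersection of all occurrences of P (T(P) assumed non-empty)."""
    occ = [t for t in db if set(P) <= t]
    return reduce(lambda a, b: a & b, occ)

print(sorted(clo(T, {2})))                 # [2, 5]: every occurrence of {2} contains 5
print(clo(T, clo(T, {2})) == clo(T, {2}))  # property (3): True
print(clo(T, {2, 5}) == {2, 5})            # property (5): {2, 5} is closed, True
```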
Note that a key pattern is not necessarily the unique minimal element of an equivalence class, while the closed pattern is unique. Recall that we denote the set of frequent closed patterns by C, the set of frequent patterns by F, the set of items by I, and the size of the database by ||T||.
For a pattern P and item i ∈ I, let P(i) = P ∩ {1, . . . , i} be the subset of P consisting only of elements no greater than i, called the i-prefix of P. A pattern Q is a closure extension of a pattern P if Q = Clo(P ∪ {i}) for some i ∉ P. If Q is a closure extension of P, then Q ⊃ P and frq(Q) ≤ frq(P).
From the theorem, we can see that algorithms based on frequent pattern mining can take time exponentially larger than the number of closed patterns. Note that such patterns may appear, in part, in real-world data, because some transactions may share a common large pattern.
4 Algorithm for Enumerating Closed Patterns
We will start with the existing schemes for closed pattern enumeration.
Through these lemmas, we can see that all closed patterns can be generated by closure extensions of closed patterns. This leads to the basic version of our algorithm:

Algorithm Closure version
1. D := {⊥}
2. D′ := { Clo(P ∪ {i}) | P ∈ D, i ∈ I \ P } \ D
3. if D′ = ∅ then output D; halt
4. D := D ∪ D′; go to 2

which uses levelwise (breadth-first) search, similar to Apriori-type algorithms [1], with closure extension instead of tail extension. We describe the basic algorithm in Figure 1. Since the algorithm deals with no non-closed patterns, the computational cost depends on |C| but not on |F|. However, we still need much storage space to keep D in memory.
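For concreteness, the levelwise closure version can be sketched as follows. This is a toy transcription over the set representation used above; Clo is recomputed from the whole database each time, which is exactly what makes the basic version slow:

```python
def closed_patterns_levelwise(db, items):
    """Levelwise (breadth-first) enumeration of all closed patterns."""
    def clo(P):
        occ = [t for t in db if P <= t]
        return frozenset.intersection(*map(frozenset, occ)) if occ else frozenset(items)
    D = {clo(frozenset())}                  # the closure of the bottom pattern
    while True:
        new = {clo(P | {i}) for P in D for i in items - P} - D
        if not new:                         # no new closed pattern: halt
            return D
        D |= new

db = [{1, 2, 5}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
print(len(closed_patterns_levelwise(db, {1, 2, 3, 5})))   # 4 closed patterns
```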
A possible improvement is to use depth-first search instead of Apriori-style lev-
elwise search. For enumerating frequent patterns, Bayardo [5] proposed an algorithm
based on tail extension, which is an extension of a pattern P by an item larger than the
maximum item of P . Since any frequent pattern is a tail extension of another unique
frequent pattern, the algorithm enumerates all frequent patterns without duplications in
a depth-first manner, with no storage space for previously obtained frequent patterns.
This technique is efficient, but cannot directly be applied to closed pattern enumeration,
since a closed pattern is not always a tail-extension of another closed pattern.
We here propose the prefix-preserving closure extension, which satisfies that any closed pattern is an extension of another unique closed pattern, unifying ordinary closure extension and tail extension. This enables depth-first generation with no storage space.
We start with definitions. Let P be a closed pattern. The core index of P, denoted by core_i(P), is the minimum index i such that T(P(i)) = T(P). We let core_i(⊥) = 0.
Here we give the definition of ppc-extension. A pattern Q is called a prefix-preserving closure extension (ppc-extension) of P if
(i) Q = Clo(P ∪ {i}) for some i, that is, Q is obtained by first adding i to P and then taking its closure,
(ii) the item i satisfies i ∉ P and i > core_i(P), and
(iii) P(i − 1) = Q(i − 1), that is, the (i − 1)-prefix of P is preserved.
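A direct way to compute the core index from its definition can be sketched as follows (a naive illustration over the toy set representation; the paper's implementation computes it incrementally during the recursion):

```python
def core_index(db, P):
    """The minimum i with T(P(i)) = T(P); 0 for the empty pattern."""
    occ_P = [t for t in db if P <= t]
    for i in sorted(P):
        prefix = {j for j in P if j <= i}       # the i-prefix P(i)
        if [t for t in db if prefix <= t] == occ_P:
            return i
    return 0                                    # core_i(⊥) = 0

db = [{1, 2, 5}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
print(core_index(db, {2, 5}))      # 2: already T(P(2)) = T(P)
print(core_index(db, {2, 3, 5}))   # 3: the prefix {2} occurs more often than P
```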
Theorem 2. Let Q ≠ ⊥ be a closed pattern. Then, there is just one closed pattern P such that Q is a ppc-extension of P.
[Fig. 2 illustration omitted: the closed patterns of an example database, linked by ppc-extensions.]
Fig. 2. Example of all closed patterns and their ppc-extensions. Core indices are circled.
Proof. Since i > core_i(P), we have T(P) = T(P(i)). From Lemma 2, Clo(P(i) ∪ {i}) = Clo(P ∪ {i}) = Clo(Q), thus core_i(Q) ≤ i. Since the extension preserves the i-prefix of P, we have P(i − 1) = Q(i − 1). Thus, Clo(Q(i − 1)) = Clo(P(i − 1)) = P ≠ Q. It follows that core_i(Q) > i − 1, and we conclude core_i(Q) = i. □
Let Q be a closed pattern and P(Q) be the set of closed patterns such that Q is
their closure extension. We show that Q is a ppc-extension of a unique closed pattern
of P(Q).
Lemma 4. Let Q ≠ ⊥ be a closed pattern, i = core_i(Q), and P = Clo(Q(core_i(Q) − 1)). Then, Q is a ppc-extension of P.
Proof. Since T(P) = T(Q(core_i(Q) − 1)), we have T(P ∪ {i}) = T(Q(core_i(Q) − 1) ∪ {i}) = T(Q(core_i(Q))). This implies Q = Clo(P ∪ {i}), thus Q satisfies condition (i) of the ppc-extension. Since P = Clo(Q(core_i(Q) − 1)), core_i(P) ≤ i − 1. Thus, Q satisfies condition (ii). Since P ⊂ Q and Q(i − 1) ⊆ P, we have P(i − 1) = Q(i − 1). Thus, Q satisfies condition (iii). □
Proof of Theorem 2: From Lemma 4, there is at least one closed pattern P in P(Q) such that Q is a ppc-extension of P; let P = Clo(Q(core_i(Q) − 1)). Suppose that there is a closed pattern P′ ≠ P such that Q is a ppc-extension of P′. From Lemma 3, Q = Clo(P′ ∪ {i}). Thus, from condition (iii) of the ppc-extension, P′(i − 1) = Q(i − 1) = P(i − 1). This together with T(P) = T(P(i − 1)) implies that T(P) ⊃ T(P′). Thus, we can see T(P′(i − 1)) ≠ T(P′), and core_i(P′) ≥ i. This violates condition (ii) of the ppc-extension, and is a contradiction. □
From this theorem, we obtain our algorithm LCM, described in Figure 3, which generates the ppc-extensions of each frequent closed pattern. Since the algorithm takes O(||T(P)||) time to derive the closure of each P ∪ {i}, we obtain the following theorem.
Algorithm LCM(T: transaction database, θ: minimum support)
1. call EnumClosedPatterns(⊥);
Procedure EnumClosedPatterns(P: frequent closed pattern)
1. if P is not frequent then return;
2. output P;
3. for i := core_i(P) + 1 to |I| do
4.   Q := Clo(P ∪ {i});
5.   if P(i − 1) = Q(i − 1) then // Q is a ppc-extension of P
6.     call EnumClosedPatterns(Q);
7. end for
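The recursion can be sketched end-to-end as follows. This is a toy, unoptimized transcription under the set representation assumed above: Clo and the frequency are recomputed from the whole database at every call, whereas Section 5 replaces these with occurrence deliver and database reduction.

```python
def lcm(db, items, theta):
    """Depth-first enumeration of frequent closed patterns via ppc-extensions."""
    items = sorted(items)
    def clo(P):
        occ = [t for t in db if P <= t]
        return frozenset.intersection(*map(frozenset, occ)) if occ else frozenset(items)
    result = []
    def enum(P, core_i):
        if sum(1 for t in db if P <= t) < theta:    # prune infrequent patterns
            return
        result.append(P)
        for i in items:
            if i <= core_i or i in P:
                continue
            Q = clo(P | {i})
            if {j for j in P if j < i} == {j for j in Q if j < i}:
                enum(Q, i)                          # Q is a ppc-extension of P
    enum(clo(frozenset()), 0)                       # start from Clo(⊥)
    return result

db = [{1, 2, 5}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
print([sorted(P) for P in lcm(db, {1, 2, 3, 5}, 2)])  # [[2, 5], [1, 2, 5], [2, 3, 5]]
```

Each closed pattern is reached exactly once, so no set of previously output patterns is kept.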
Theorem 3. Given a database T , the algorithm LCM enumerates all frequent closed
patterns in O(||T (P )|| × |I|) time for each pattern P with O(||T ||) memory space.
The time and space complexities of the existing algorithms [21, 15, 13] are O(||T|| × |F|) and O(||T|| + |C| × |I|), respectively. As we saw in the example in Section 3, the difference between |C| and |F|, and the difference between |C| × |I| and ||T||, can be up to exponential. Compared with our basic algorithm, the ppc-extension-based algorithm exponentially reduces the memory complexity when |C| is exponentially larger than ||T||. In practice, such exponential differences often occur (see the results in Section 6). Thus, the performance of our algorithm is possibly exponentially better than that of the existing algorithms on some instances.
The computation time of LCM described in the previous section is linear in |C|, with a factor of O(||T(P)|| × |I|) for each closed pattern P ∈ C. However, this still takes a long time if implemented in a straightforward way. In this section, we propose some techniques for speeding up the frequency counting and the closure operation. These techniques increase the practical performance of the algorithm and are incorporated into the implementations used in the experiments in Section 6, although they are independent of our main contribution. In Figure 7, we describe the details of LCM with these practical techniques.
Occurrence deliver reduces the construction time for T(P ∪ {i}), which is used for frequency counting and the closure operation. This technique is particularly efficient for sparse datasets, in which |T(P ∪ {i})| is much smaller than |T(P)| on average. In the usual way, T(P ∪ {i}) is obtained as T(P) ∩ T({i}) in O(|T(P)| + |T({i})|) time (this is known as down-project []). Thus, generating all ppc-extensions in this way needs |I| scans of T(P).
Instead, we build the denotations T(P ∪ {i}) for all i = core_i(P) + 1, . . . , |I| simultaneously, by scanning the transactions in T(P). We initialize X[i] := ∅ for all i.
[Fig. 4 illustration: transactions A = {2, 3, 4, 5, 7}, B = {1, 3, 4, 5, 6, 8}, C = {2, 3, 5, 6, 7}; scanning T({5}) delivers each transaction into the buckets X[4], . . . , X[8].]
Fig. 4. Occurrence deliver: build up the denotations by inserting each transaction into the bucket of each of its items.
For each t ∈ T(P) and each i ∈ t with i > core_i(P), we insert t into X[i]. Then, each X[i] is equal to T(P ∪ {i}). See Fig. 4 for an illustration. This correctly computes T(P ∪ {i}) for all i in O(||T(P)||) time. Table 1 shows the results of computational experiments in which the number of item accesses was counted; these numbers are closely related to the performance of frequency counting, which depends heavily on the number of item accesses.
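In code, the delivery is a single bucket-building scan. The sketch below uses the transactions A, B, C of the Fig. 4 example and assumes, for illustration, a core index of 5:

```python
from collections import defaultdict

def occurrence_deliver(occ_P, core_i):
    """One scan of T(P) builds X[i] = T(P ∪ {i}) for every i > core_i."""
    X = defaultdict(list)
    for t in occ_P:                 # each occurrence t of P ...
        for i in t:
            if i > core_i:
                X[i].append(t)      # ... is delivered to every bucket i ∈ t
    return X

A, B, C = [2, 3, 4, 5, 7], [1, 3, 4, 5, 6, 8], [2, 3, 5, 6, 7]
X = occurrence_deliver([A, B, C], 5)
names = {id(A): "A", id(B): "B", id(C): "C"}
print({i: [names[id(t)] for t in X[i]] for i in sorted(X)})
# {6: ['B', 'C'], 7: ['A', 'C'], 8: ['B']}
```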
We also use anytime database reduction for the closure operation. Suppose that we have a closed pattern P with core index i, and let the transactions t1, . . . , tk ∈ T(P) have the same i-suffix, i.e., t1 ∩ {i, . . . , |I|} = t2 ∩ {i, . . . , |I|} = · · · = tk ∩ {i, . . . , |I|}.
Lemma 5. Let j < i be an item, and let j′ be an item such that j′ > i and j′ is included in all of t1, . . . , tk. Then, if j is not included in at least one transaction of t1, . . . , tk, j ∉ Clo(Q) holds for any pattern Q including P ∪ {i}.
Proof. T(Q) includes all of t1, . . . , tk, and j is not included in at least one of them. Hence, j is not included in Clo(Q). □
[Fig. 6 illustration: transactions A = {2, 3, 4, 5, 7}, B = {1, 3, 4, 5, 6, 8}, C = {2, 3, 5, 6, 7}, D = {1, 2, 3, 4, 5, 6, 7, 8}, E = {1, 2, 3, 5, 7, 8}, F = {2, 3, 4, 5, 6, 7, 8}, forming T({5}).]
Fig. 6. Transaction A has the minimum size. The fast ppc test accesses only the circled items, while the closure operation accesses all items.
According to this lemma, we can remove j from t1, . . . , tk if j is not included in at least one of them. By removing all such items, t1, . . . , tk all become ∩_{h=1,...,k} t_h. Thus, we can merge them into one transaction, similarly to the above reduction. This reduces the number of transactions as much as the reduced database for frequency counting does, so the computation time of the closure operation is shortened drastically. The details of anytime database reduction for the closure operation are as follows.
1. Remove the transactions not including P.
2. Remove the items i such that frq(P ∪ {i}) < θ.
3. Remove the items of P.
4. Replace the transactions T1, . . . , Tk having the same i-suffix by their intersection ∩{T1, . . . , Tk}.
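The four steps above can be sketched as one function. This is a hypothetical helper over the toy set representation; the paper applies the reduction inside the recursion rather than rebuilding the database wholesale:

```python
from collections import Counter, defaultdict

def reduce_database(db, P, i, theta):
    """Steps 1-4: shrink the database for the recursion below P with core index i."""
    occ = [frozenset(t) for t in db if P <= t]             # 1. keep only T(P)
    count = Counter(j for t in occ for j in t)
    keep = {j for j, c in count.items() if c >= theta}     # 2. drop infrequent items
    occ = [(t & keep) - P for t in occ]                    # 3. drop the items of P
    groups = defaultdict(list)
    for t in occ:
        groups[frozenset(j for j in t if j >= i)].append(t)  # group equal i-suffixes
    return [frozenset.intersection(*g) for g in groups.values()]  # 4. merge

db = [{1, 2, 5}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
print(sorted(sorted(t) for t in reduce_database(db, {2}, 3, 2)))  # [[3, 5], [5]]
```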
In fact, the test requires an adjacency matrix (sometimes called a bitmap) representing the inclusion relation between items and transactions. The adjacency matrix requires O(|T| × |I|) memory, which is quite hard to store for large instances. Hence, we keep columns of the adjacency matrix only for transactions larger than |I|/δ, where δ is a constant. In this way, we can check whether j ∈ t in constant time if t is large, and also in short time, by scanning the items of t, if t is small. The algorithm uses O(δ × ||T||) memory, which is linear in the input size.
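The hybrid membership check can be sketched as follows. This is illustrative only: a Python set stands in for a bitmap column, and δ = 2 is an arbitrary choice, not a value from the paper:

```python
def build_columns(db, n_items, delta):
    """Keep a constant-time membership column only for large transactions."""
    return {tid: set(t) for tid, t in enumerate(db) if len(t) > n_items / delta}

def contains(db, columns, tid, j):
    if tid in columns:
        return j in columns[tid]    # O(1) for transactions larger than |I|/δ
    return j in db[tid]             # short linear scan: the transaction is small

db = [[1, 2], [1, 2, 3, 4, 5, 6], [3, 5]]
cols = build_columns(db, n_items=6, delta=2)
print(sorted(cols))                                          # [1]: only the large one
print(contains(db, cols, 1, 4), contains(db, cols, 2, 4))    # True False
```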
6 Computational Experiments
This section shows the results of computational experiments evaluating the practical performance of our algorithms on real-world and synthetic datasets. Fig. 8 lists the datasets, which were taken from the FIMI'03 site (http://fimi.cs.helsinki.fi/): retail, accidents; the IBM Almaden Quest research group website: T10I4D100K; the UCI ML repository (http://www.ics.uci.edu/~mlearn/MLRepository.html): connect, pumsb; click-stream data by Ferenc Bodon: kosarak; and KDD-CUP 2000 [11] (http://www.ecn.purdue.edu/KDDCUP/): BMS-WebView-1, BMS-POS.
To evaluate the efficiency of ppc extension and practical improvements, we imple-
mented several algorithms as follows.
[Fig. 9 omitted: running time (sec) and number of solutions (in thousands) versus minimum support on the datasets BMS-WebView1, kosarak, retail, accidents, connect, and pumsb; each panel compares freqset, straight, occ, occ+dbr, occ+ftest, #freq closed, and #freqset.]
· occ+fchk: LCM with occurrence deliver, anytime database reduction for
frequency counting, and fast ppc test
The figure also displays the numbers of frequent patterns and frequent closed patterns, written as #freqset and #freq closed. The algorithms were implemented in C and compiled with gcc 3.3.1. The experiments were run on a notebook PC with a mobile Pentium III 750MHz and 256MB of memory. Fig. 9 plots the running time with varying minimum supports for the algorithms on the eight datasets.
From Fig. 9, we can observe that LCM with practical optimizations (occ, occ+dbr, occ+fchk) outperforms the frequent pattern enumeration-based algorithm (freqset). The speed-up ratio of the ppc-extension algorithm (straight) over freqset depends entirely on the ratio of #freqset to #freq closed. This ratio is quite large for several real-world and synthetic datasets with small supports, such as BMS-WebView, retail, pumsb, and connect. On such problems, the frequent pattern based algorithm sometimes takes quite a long time while LCM terminates quickly.
Occurrence deliver performs very well on any dataset, especially on sparse datasets such as BMS-WebView, retail, and the IBM datasets. In such sparse datasets, since T({i}) is usually much larger than T(P ∪ {i}), occurrence deliver is efficient.
Anytime database reduction also decreases the computation time well, especially on dense datasets or with large supports. Only when the support is very small, close to zero, is the computation time not shortened, since few items are eliminated and few transactions become identical. In such cases, the fast ppc test performs well; however, it does not accelerate the computation much on dense datasets or with large supports.
For a detailed study of the performance of our algorithm compared with other algorithms, consult the companion paper [18] and the competition report of FIMI'03 [7]. Note that the algorithms we submitted to [18, 7] were old versions, which do not include anytime database reduction; thus they are slower than the algorithm in this paper.
7 Conclusion
We addressed the problem of enumerating all frequent closed patterns in a given trans-
action database, and proposed an efficient algorithm LCM to solve this, which uses
memory linear in the input size, i.e., the algorithm does not store the previously ob-
tained patterns in memory. The main contribution of this paper is that we proposed
prefix-preserving closure extension, which combines tail-extension of [5] and closure
operation of [13] to realize direct enumeration of closed patterns.
We recently studied frequent substructure mining from ordered and unordered trees based on a deterministic tree expansion technique called the rightmost expansion [2–4]. There have also been pioneering works on closed pattern mining in sequences and graphs [17, 19]. It would be an interesting future problem to extend the framework of prefix-preserving closure extension to such tree and graph mining.
Acknowledgment
We would like to thank Professor Ken Satoh of the National Institute of Informatics and Professor Kazuhisa Makino of Osaka University for fruitful discussions and comments on this issue. This research was supported by a group research fund of the National Institute of Informatics, Japan. We are also grateful to Professor Bart Goethals, the people supporting the FIMI'03 Workshop/Repository, and the authors who kindly made their datasets available.
References
1. R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, A. I. Verkamo, Fast Discovery of Association Rules, In Advances in Knowledge Discovery and Data Mining, MIT Press, 307–328, 1996.
2. T. Asai, K. Abe, S. Kawasoe, H. Arimura, H. Sakamoto, S. Arikawa, Efficient Substructure Discovery from Large Semi-structured Data, In Proc. SDM'02, SIAM, 2002.
3. T. Asai, H. Arimura, K. Abe, S. Kawasoe, S. Arikawa, Online Algorithms for Mining Semi-structured Data Stream, In Proc. IEEE ICDM'02, 27–34, 2002.
4. T. Asai, H. Arimura, T. Uno, S. Nakano, Discovering Frequent Substructures in Large Unordered Trees, In Proc. DS'03, 47–61, LNAI 2843, 2003.
5. R. J. Bayardo Jr., Efficiently Mining Long Patterns from Databases, In Proc. SIGMOD'98, 85–93, 1998.
6. Y. Bastide, R. Taouil, N. Pasquier, G. Stumme, L. Lakhal, Mining Frequent Patterns with Counting Inference, SIGKDD Explr., 2(2), 66–75, Dec. 2000.
7. B. Goethals, the FIMI'03 Homepage, http://fimi.cs.helsinki.fi/, 2003.
8. E. Boros, V. Gurvich, L. Khachiyan, K. Makino, On the Complexity of Generating Maximal Frequent and Minimal Infrequent Sets, In Proc. STACS 2002, 133–141, 2002.
9. D. Burdick, M. Calimlim, J. Gehrke, MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases, In Proc. ICDE 2001, 443–452, 2001.
10. J. Han, J. Pei, Y. Yin, Mining Frequent Patterns without Candidate Generation, In Proc. SIGMOD'00, 1–12, 2000.
11. R. Kohavi, C. E. Brodley, B. Frasca, L. Mason, Z. Zheng, KDD-Cup 2000 Organizers' Report: Peeling the Onion, SIGKDD Explr., 2(2), 86–98, 2000.
12. H. Mannila, H. Toivonen, Multiple Uses of Frequent Sets and Condensed Representations, In Proc. KDD'96, 189–194, 1996.
13. N. Pasquier, Y. Bastide, R. Taouil, L. Lakhal, Efficient Mining of Association Rules Using Closed Itemset Lattices, Inform. Syst., 24(1), 25–46, 1999.
14. N. Pasquier, Y. Bastide, R. Taouil, L. Lakhal, Discovering Frequent Closed Itemsets for Association Rules, In Proc. ICDT'99, 398–416, 1999.
15. J. Pei, J. Han, R. Mao, CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets, In Proc. DMKD'00, 21–30, 2000.
16. R. Rymon, Search Through Systematic Set Enumeration, In Proc. KR-92, 268–275, 1992.
17. P. Tzvetkov, X. Yan, J. Han, TSP: Mining Top-K Closed Sequential Patterns, In Proc. ICDM'03, 2003.
18. T. Uno, T. Asai, Y. Uchida, H. Arimura, LCM: An Efficient Algorithm for Enumerating Frequent Closed Item Sets, In Proc. IEEE ICDM'03 Workshop FIMI'03, 2003. (Available as CEUR Workshop Proc. series, Vol. 90, http://ceur-ws.org/vol-90)
19. X. Yan, J. Han, CloseGraph: Mining Closed Frequent Graph Patterns, In Proc. KDD'03, ACM, 2003.
20. M. J. Zaki, Scalable Algorithms for Association Mining, Knowledge and Data Engineering, 12(2), 372–390, 2000.
21. M. J. Zaki, C. Hsiao, CHARM: An Efficient Algorithm for Closed Itemset Mining, In Proc. SDM'02, SIAM, 457–473, 2002.