
Multi-Dimensional Model-Based Clustering For User-Behavior Mining in Telecommunications Industry

The document presents a novel model-based clustering technique for analyzing customer churn behaviors in the telecommunications industry, particularly focusing on the challenges of long and multidimensional sequences of user data. It highlights the limitations of traditional clustering methods and proposes a first-order Markov model to effectively identify user transitional behaviors. The research aims to improve understanding of customer behavior and enhance data mining techniques in a sector facing significant churn issues.


Proceedings of the Third International Conference on Machine Learning and Cybernetics, Shanghai, 26-29 August 2004

MULTI-DIMENSIONAL MODEL-BASED CLUSTERING FOR USER-BEHAVIOR MINING IN TELECOMMUNICATIONS INDUSTRY
YI-MING YANG, HUI WANG, LEI LI, TIAN-YI LI, WEN-MIN LI, QIANG YANG, WEI LV, PING HUANG

Software Institution, Zhongshan University, Guangzhou, China.
Guangzhou E-DM Tech Co. Ltd., Guangzhou, China.
Computer Science, Hong Kong University of Science and Technology, Hong Kong, China.
Guangdong Telecom Academy of Science and Technology, Guangzhou, China.

Abstract

We develop an innovative sequential data mining system for mining customers' churning behaviors in the telecommunications industry. Recently, an increasing number of telecommunications customers are switching from one service or service provider to another. This phenomenon is called 'churn', and it is a major cause of corporations' loss of profitability. It is important for a telecommunications company to find out the transitional behavior of its customers through data mining. Our approach is to use a model-based clustering method, extended to handle multi-dimensional data, to automatically and efficiently partition the customers according to their behavior. We model this problem as a sequential clustering problem, and present an effective solution for the case in which the elements in the sequences are of a multidimensional nature. We provide theory and an algorithm for the task, and empirically demonstrate that the method is effective in mining customer data for the telecommunications industry.

Keywords:
Machine Learning for Sequential User Behavior; Sequential data mining; Model-based clustering; Telecommunications Applications.

1. Introduction

In data mining and knowledge discovery, learning about sequences has captured the special attention of many researchers and practitioners. Many techniques and applications have been explored in the past, including modeling using Finite State Automata [1],[2],[3],[4],[5],[6], Markov Model-based classification and clustering for Web mining [7],[8],[9], association-rule based methods for market-basket analysis [10],[11] and N-gram based approaches for Web click-stream prediction and Web-object prefetching [12],[13],[14]. Of this past work, user-behavior mining has been a central focus of much research and many applications. Because it is difficult to provide labeled information manually for massive databases and Web-based systems, many researchers have turned to applying unsupervised learning on sequence data, using methods such as sequence clustering.

Sequence clustering problems can be stated as follows: given a collection of sequences, each of which is of the form $\{x_1, x_2, \ldots, x_n\}$, find common user transitional behaviors in the form of clusters, such that the distance between clusters is large and the distance between sequences within a cluster is small. Even though many algorithms have been designed in the past for clustering (e.g., the K-means algorithm), in practice the problem of finding good clusters that satisfy all of these constraints has proven difficult. In our research, we study customer-churn behavior in the telecommunications industry in China. In particular, we wish to mine the transitional behavior of users from one telecommunication service to another using sequence mining techniques. We have found that the traditional clustering methods suffer from several shortcomings.

First, much previous work has assumed that the sequences are relatively short and even in length. For example, a typical Web-user session is on the order of ten or at most several dozen symbols long (with an exception in biological sequences). In our work, we have found that the user sessions are typically very long and extremely uneven: some sequences reach a maximum length of several thousand, while others are only several symbols long.

Second, while previous techniques assumed that each element of a sequence is a single or low-dimensional vector, such as a single Web-page ID, in our application we have found that the dimensionality of each element can be quite high; in fact it can be the number of attributes we

0-7803-8403-2/04/$20.00 ©2004 IEEE

accumulate for each customer in the data warehouse system. Third, while in other applications the sequences have clear boundaries between user sessions, such as idle time between Webpage clicks, in our work we do not have such clear indicators. Finally, our dataset is very large, making it important to search for efficient clustering methods. These new challenges motivate us to explore novel sequence mining techniques.

In this paper, we present a novel model-based clustering technique for high-dimensional sequences in analyzing users' churn behaviors in the telecommunications industry. We show that although such an industry typically has very large data sets in which the sequences have the features we point out above, our model-based clustering method is very effective in finding out who is more likely to churn, along with their corresponding churning behaviors.

This paper is organized as follows. In the next section, we describe the problem domain. In Section 3, we review previous work on model-based clustering, and in Sections 4 and 5 we describe the main techniques. In Section 6, we show the experimental setup and results. Finally, we conclude our contribution with a discussion of future work in Section 7.

2. Problem Domain Description

We solve a special problem from a telecom company in China, which has a data warehouse that accumulates historical customer behavior at the terabyte level every month. The data holds records that represent dozens of services, from the company itself to its competitors. In mainland China, there are at least six telecom companies providing over 20 services of the same kind for the end user, causing the so-called customer-churn problem. The telecom industry is very interested in finding out the customer transitional behavior through data mining.

In order to study user behavior, the traditional approach of the telecom company is to use SQL scripts in their analysis. For example, if they 'think' that people using services Sa and Sb form a group of customers who tend to leave from Sa to Sb or from Sb to Sa, then they would probe the data warehouse with a SQL script like this:

SELECT * FROM record-table WHERE service-id=Sa and service-id=Sb and count-Sa > 50 and count-Sb > 103 and service-time > ...;

Clearly, this approach has several shortcomings. For example, how could we know that services like Sa and Sb should be grouped together, but not Sa and Sc or others? How could we know ahead of time that the boundary for each group or cluster should be set at 50 and not at other values? What is the period of validity of those values? Without using data mining techniques, these answers are hard to come by.

Taking a closer look, telecommunications users often jump from one industry to another within a certain (unknown) period of time, following some hidden motivations. The telecom databases log the services of a user along with the time stamp and other information such as the duration, total cost, registration information, etc. for each service used. The log data can be treated as sequences of elements, where each element is a multi-dimensional data record. Hence, the telecom problem can be considered as a sequential-mining problem. Specifically, for our research, we are interested in finding user clusters from these sequences.

3. Related Work

For the sequence mining problem, researchers in the past have applied probabilistic approaches. These approaches use the most frequently occurring patterns from all the possible patterns as a basis for analyzing user behaviors. Network systems researchers have been using Markov models and N-gram models to construct sequential classifiers. For example, Su et al. [13] performed an empirical study on the trade-offs between precision and applicability of different N-gram models, and showed that longer N-gram models could make more accurate predictions than shorter ones at the expense of lower coverage. Pitkow et al. [15] suggested a way to make predictions based on Kth-order Markov models.

In clustering, Mobasher et al. [16] applied a point-based transaction clustering method to the Web personalization problem by grouping together transactions that are similar based on co-occurrence patterns of URL references. Han et al. [17],[18] developed an association-rule-based hyper-graph clustering algorithm. Nasraoui et al. [19] presented a hierarchical clustering technique based on the concept of genetic niches, called Hierarchical Unsupervised Niche Clustering (HUNC). Their work is a point-based approach rather than a path-based approach, and the dimension n is the total number of valid URLs. Some researchers take the sequential relationship in the data into account. Hay et al. [20] used an alignment method to solve the navigation clustering problem. They illustrated how to cluster navigation patterns using a Sequence Alignment Method (SAM) [21], instead of clustering users by means of a Euclidean distance measure. Web researchers such as Cadez et al. [8] grouped user behaviors by first-order Markov models. These models are trained by an EM algorithm [22]. These clustering algorithms reveal insight on users' Web-browsing behavior through visualization.

The traditional clustering algorithms such as K-means or K-Medoids suffer from the fact that they model each data item as a point; thus they cannot handle sequences too well, nor can they deal with irregular shapes and differing sizes of different clusters in the data. Consider a data set D

consisting of N sequences, $D = \{S_1, \ldots, S_i, \ldots, S_N\}$, where $S_i = \{x_1^i, \ldots, x_{L_i}^i\}$ is a sequence of length $L_i$ that is composed of potentially multivariate items $x_j^i$. The problem of clustering sequences is how to discover the natural groupings in the sequential data. This is analogous to clustering in a multivariate feature space, which is normally handled by methods such as K-means and Gaussian mixtures. However, we want to cluster the sequences S rather than the features x. Moreover, the sequences can be of different lengths. It is not clear what a meaningful distance metric is for sequence comparison.

4. Our Markov Model-Based Clustering Algorithm

4.1. Problem Statement

More formally, a sequence is defined as $S_i = P_1 P_2 \ldots P_n$, where the $P_k$ are telecom services that a user used in a certain period. If we use discrete integer numbers to denote the individual services, the sequences from the service log are exemplified in Table 1.

In our work, we use a first-order Markov model for clustering. Formally speaking, a first-order Markov model is a model that assumes the probability distribution over the next state depends only on the current state (and not on the previous ones). Let $S_t$ be the system's state at time step t; in our telecommunications dataset a state corresponds to a customer record. A first-order Markov model is a triple $(Q, A, \pi)$, where: $Q = \{q_1, q_2, \ldots, q_n\}$ is a set of states; A is the transition probability matrix, where $a_{ij} = P(S_t = q_j \mid S_{t-1} = q_i)$ is the probability of transition from a state $q_i$ to a state $q_j$, which is assumed to be stationary for all $t > 0$; and $\pi$ is the initial probability vector, where $\pi_i = P(S_0 = q_i)$ is the probability that the initial state is $q_i$.

Given a set of observed sequences, the maximum-likelihood estimate of an initial probability $\pi_i$ is the fraction of sequences that start in state $q_i$. The maximum-likelihood estimate of the transition probability $a_{ij}$ is the fraction of visits to $q_i$ that are immediately followed by a transition to $q_j$. Here, Q is the category space, the matrix A holds all the transitional probabilities among categories and $\pi$ is the initial probability vector for categories.

The advantage of using a first-order Markov model is that our clustering algorithm uses the transitional information between items to form clusters. In our application, we deal with states that are customer records, which are complex structures. However, the original model assumes that each state is a simple symbol. In the next section, we describe our state model in more detail.

4.2. The Feature Vector State Model

We introduce the concept of a session first. A session is defined as a bounded sequence taken from the whole history log of a user. It is a trace of states (called services in the telecommunications industry) with a unique index number for each user during the service period. We define the period to be one month in our application. More formally, a user session S is defined as $S = \langle P_1 P_2 \ldots P_n \rangle$, where n is an integer that stands for the number of services in session S, $P_i$ is an arbitrary service and $P_{i-1}$ is used just before $P_i$.

For ease of representation, a state $P_i$ can be converted to a feature vector $v_i$. We use a capital letter P to represent the whole state space and V to represent the whole feature-vector space. Thus, for any service sequence S, we have a corresponding feature-vector sequence v for its representation: $S = \langle P_1 P_2 \ldots P_n \rangle$, $P_i \in P$, and $v = \langle v_1 v_2 \ldots v_n \rangle$, $v_i \in V$.

Furthermore, we define a feature vector $v_i$ as a record consisting of several features. For example, a vector $v_i$ can be a triple:
<{numeric-feature}, {categorical-feature}, {range-feature}>.
The numeric-feature set holds all the features that are declared as numeric features; the categorical-feature set contains all the features that are declared as discrete features; a range-feature is a special numeric feature that has a lower and an upper bound.

As an example, we have two sessions from users u1 and u2 in Table 2. We treat the attribute 'age' as a numeric feature, 'gender', 'service', 'weekday' and 'month' as categorical features, and 'duration' as a range feature.

Table 2. Example of Telecom Record, two sessions: u1 and u2
(The 'time' is the seconds from 1970-01-01 00:00:00. We removed content to protect trade secret.)
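The maximum-likelihood estimates described in Section 4.1 amount to simple counting. The following is a minimal sketch under stated assumptions: the function name `fit_first_order_markov`, the dictionary representation of $\pi$ and A, and the three toy sessions are illustrative and do not come from the paper, and states are taken to be plain service IDs rather than feature vectors.

```python
from collections import Counter, defaultdict


def fit_first_order_markov(sequences):
    """Estimate (pi, A) of a first-order Markov model by counting.

    pi[q]    = fraction of sequences that start in state q
    A[qi][qj] = fraction of visits to qi that are immediately followed by qj
    """
    start_counts = Counter()
    transition_counts = defaultdict(Counter)
    for seq in sequences:
        if not seq:
            continue  # an empty session carries no information
        start_counts[seq[0]] += 1
        for prev, nxt in zip(seq, seq[1:]):
            transition_counts[prev][nxt] += 1

    n_started = sum(start_counts.values())
    pi = {state: count / n_started for state, count in start_counts.items()}

    A = {}
    for prev, nxt_counts in transition_counts.items():
        total = sum(nxt_counts.values())
        A[prev] = {nxt: count / total for nxt, count in nxt_counts.items()}
    return pi, A


# Hypothetical sessions over services denoted by integers 0..2.
sessions = [[0, 0, 2], [0, 1, 0], [2, 2, 0]]
pi, A = fit_first_order_markov(sessions)
print(pi)
print(A)
```

Only transitions that were actually observed receive an entry, so unseen transitions implicitly have probability zero; a practical system would probably smooth these counts, which the paper does not discuss.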

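The session and feature-vector definitions of Section 4.2 can be sketched as follows. This is an illustration only: the class `FeatureVector`, the function `build_sessions`, the record layout `(user_id, epoch_seconds, vector)`, the use of UTC month boundaries and all sample values are assumptions, not details given in the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FeatureVector:
    numeric: tuple      # e.g. (age,)
    categorical: tuple  # e.g. (gender, service)
    ranged: tuple       # e.g. (duration,), values with lower/upper bounds


def build_sessions(records):
    """Group records into one time-ordered session per user and calendar month.

    Each record is (user_id, epoch_seconds, FeatureVector).
    """
    sessions = defaultdict(list)
    # Sort by user and time only, so feature vectors are never compared.
    for user_id, ts, vec in sorted(records, key=lambda r: (r[0], r[1])):
        when = datetime.fromtimestamp(ts, tz=timezone.utc)
        sessions[(user_id, when.year, when.month)].append(vec)
    return dict(sessions)


JAN_1_2004 = 1072915200  # 2004-01-01 00:00:00 UTC in epoch seconds
records = [
    ("u1", JAN_1_2004 + 3600, FeatureVector((34,), ("F", "S2"), (300,))),
    ("u1", JAN_1_2004, FeatureVector((34,), ("F", "S0"), (60,))),
    ("u1", JAN_1_2004 + 31 * 86400 + 10, FeatureVector((34,), ("F", "S0"), (45,))),
    ("u2", JAN_1_2004 + 7200, FeatureVector((51,), ("M", "S1"), (600,))),
]
sessions = build_sessions(records)
for key, vectors in sorted(sessions.items()):
    print(key, [v.categorical[1] for v in vectors])
```

Keying sessions by user and month mirrors the one-month period used in the paper; a different period would only change how the grouping key is computed.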

For a different feature set, we use a different similarity function to measure any two feature vectors, as follows:

a) For numeric features:
$S_n = 2\sum \frac{gap_i - gap_j}{gap_i + gap_j}$, where $gap_k = value_k - Mean$.

b) For categorical features:
$S_c = \sum \delta(value_i, value_j)$, where $\delta(value_i, value_j) = 1$ if $value_i = value_j$ and $0$ if $value_i \neq value_j$.

c) For range features:
$S_r = \sum (\ldots)$, with terms built from $value - Mean$ and the range width $c = upper - lower$.

Hence, the total similarity is $S = (S_n + S_c + S_r)/(C_n + C_c + C_r)$, where $C_x$ is the size of each feature set.

5. Mixture of Markov model-based clustering algorithm

Once we have defined the whole feature vector space and extracted the user sessions, which are the feature-vector sequences, we can address the clustering issue at the feature-vector level. This is nontrivial since the data of our feature-vector approach is multidimensional while the traditional Markov model-based approach is just one-dimensional. Our clustering algorithm will cluster the user sessions in the following steps:

Step 1. Cluster the feature-vector sequences into K clusters by K-mixture Markov models using an EM algorithm on the training data.
Step 2. For each sequence S in the test data, calculate the probability values that the sequence belongs to the K clusters. Assign S to the cluster with the highest score.

A mixture model for v with K components has the form
$p(v \mid \theta) = \sum_{k=1}^{K} p(c_k \mid \theta)\, p_k(v \mid c_k, \theta)$,
where $p(c_k \mid \theta)$ is the marginal probability of the k-th cluster, $\sum_k p(c_k \mid \theta) = 1$, $p_k(v \mid c_k, \theta)$ is the statistical model describing the behavior of users in the k-th cluster, and $\theta$ denotes the parameters of the model. Here, $v = v_1 v_2 \ldots v_L$ is an arbitrarily long sequence of feature vectors. We assume that each model component is a first-order Markov model, described by $p_k(v \mid c_k, \theta)$. This model captures the order of users' requests, including their initial requests and the dependency between two consecutive requests.

Let $d_{train} = \{v^1, \ldots, v^N\}$ be a set of sequences, where each sequence $v^i$ consists of $L_i$ observed states $v_1^i, \ldots, v_{L_i}^i$. Each state $v_j^i$ takes values from a discrete alphabet, $v_j^i \in \{1, \ldots, M\}$. We write $\theta = \{\pi, \theta^I, \theta^T\}$, where:
$\pi$ is a vector of K mixture weights.
$\theta^I$ is a set of K initial state probability vectors.
$\theta^T$ is a set of K transition matrices.

In a general situation, we have K clusters. We obtain $\theta$ by three update rules for $\pi$, $\theta^I$ and $\theta^T$ that maximize the Q function. For the sake of space, we skip these three formulas.

6. Experimental Setup and Analysis

6.1. The Experimental Setup

We selected records with only four services as a test bed. The services are called S0, S1, S2 and S3. The records of these services are very unbalanced. For example, in the original records, if we just count the frequency of the services, the total size of S0 is larger than 87%, while the lowest is only about 3% (S1). The K-means clustering algorithm works poorly on such data. We extracted sessions from these records according to the services a user used within one month. The length of a session is not uniform either. We restricted the shortest sessions to have a length of three. In our data, the longest session has a length of over 1,300 services. The average length is 55. Other model-based algorithms such as the HMM or sequence alignment methods have very high computational cost on such sequences.

We chose five additional features from the original records. They are user-id, duration, service-id, weekday-id and the start time. The user-id and time are used to identify user sessions, which are the sequences for the subsequent mining process. The total number of sequences is over 2,300. The duration means how long a customer stayed with the service. The weekday-id denotes the weekday index (from 0 to 6). The time feature is the number of seconds since 1970-01-01 00:00:00. Furthermore, we treated the duration as a range feature and weekday-id as a categorical feature, and used the corresponding similarity functions on them.

6.2. Result Analysis

We show the experimental results here. Our purpose is to mine meaningful user groups either in terms of their transitional behavior or their group sizes. First, consider the result obtained by the K-Means method (shown in Figure 1: left two columns). We have four services marked in four different colors. Each of the two boxes describes a cluster, which corresponds to a customer group. Figures in the left


column are the real transitional behavior (the transition graph) calculated from the clustering result. We normalized the left ones into the right side cells. All the figures are transition graphs, which denote the probability that a user of one telecom service chooses a given next service. For example, in the first user group from Figure 1 (left two columns), we have a column labeled S0, which means that a user is using the service S0 (the red column). The lengths of each color in this column correspond to the probability that the user will jump to another service. In this case, the customer tends to choose a next service S0 with a probability over 99%.

Figure 1. Four services in four colors: S0 red; S1 blue; S2 green; and S3 yellow.
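The normalization of a raw transition graph into per-service probabilities, as used for the right-hand cells of the figure, can be sketched as below. The function name `normalize_rows` and the counts in `raw` are hypothetical and were chosen only so that S0 stays with S0 more than 99% of the time; they are not data from the paper.

```python
def normalize_rows(counts):
    """Turn raw transition counts into per-source-service probabilities."""
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        # A service with no outgoing transitions keeps all-zero probabilities.
        probs[src] = {dst: (c / total if total else 0.0) for dst, c in row.items()}
    return probs


# Hypothetical raw transition counts for one customer group.
raw = {
    "S0": {"S0": 995, "S1": 2, "S2": 2, "S3": 1},
    "S2": {"S0": 10, "S2": 30},
}
probs = normalize_rows(raw)
for src, row in probs.items():
    print(src, row)
```

Each row of the result sums to one, so the stacked colors in a column of the figure can be read directly as next-service probabilities.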
The biggest problem with K-Means clustering is the very unbalanced data, as we mentioned before. This is supported by our experimental result, where three poor customer groups are shown in Figure 1 (left two columns). Their sizes over the original data are 0.65%, 0.18% and 99.82%. The quality of the K-means clustering result also suffers severely; all the clusters are of the same type. Clearly, these K-means results are meaningless and useless.

Consider our model-based clustering (E-DM) results (shown in Figure 1: right two columns). Services are denoted in different colors: S0 in red; S1 in blue; S2 in green and S3 in yellow. The purpose is to group the users with similar transitional behavior. We expected to observe some very different groups from those obtained by K-means. The sizes from the E-DM method are 4%, 5.47% and 90.53%, which fits the real situation much better than K-Means. The third group (the last line) corresponds to the loyal customers who stay with service S0. The second cluster is made up of loyal users of service type S2. We see that S0 and S1 are slight competitors to each other. However, in the first service group, the customers of S2 and S3 may jump to the service S0.

Our results can give some guidance to the marketing departments in telecommunications companies: if the purpose is to capture customers who use the service of type S2, we should focus on the first cluster group. Furthermore, if we adopt a strategy of giving a preferential price to customers in the first group, it is much easier to move the loyal customers from service S2 to S0 than it is for group two in the E-DM result table. Hence, the E-DM results are very meaningful for the decision maker.

Left six figures: K-Means results. Each line describes a clustering result corresponding to a customer group. Figures in the left column are the real transitional behavior. We normalized the left column and put the results in the right cells.
Right six figures: E-DM results, which have the same meaning as the left six figures.

7. Conclusion

Sequence clustering has been proven useful for many applications. In this paper, we have presented a novel model-based clustering algorithm that integrates sequence clustering with multi-dimensional data mining. Our method is a generalization of the first-order Markov model-based clustering algorithm, which can deal with more natural data with additional attributes. Compared with other clustering methods such as HMM, sequence alignment methods or manually issuing SQL scripts, our algorithm generates high-quality results from huge data sets in an efficient way.

In the future, we wish to explore the integration of classification algorithms with model-based clustering for identifying the customers who are likely to churn to competitors. Such a behavior can be obtained through semi-supervised learning, in which the model clusters can be strengthened using a small labeled sample of the dataset.

Acknowledgements

The authors are grateful to Dr Zhaohui Wang for suggestions that helped finish the experiment and improved the paper.

References

[1] J. Borges and M. Levene, "Data mining of user navigation patterns," in The Workshop on Web Usage Analysis and User Profiling (WEBKDD99). San Diego, CA, August 1999, pp. 31-36.
[2] M. Levene and G. Loizou, "A probabilistic approach to navigation in hypertext," Information Sciences, vol. 114, no. 1-4, pp. 165-186, 1999.
[3] Z. Kohavi, Switching and Finite Automata Theory (Second Edition). New York: McGraw-Hill Book Company, 1978.


[4] P. Hingston, "Using finite state automata for sequence mining," in The Twenty-Fifth Australasian Conference on Computer Science, vol. 24, issue 1. Darlinghurst, Australia: Australian Computer Society, Inc., January 2002, pp. 105-110.
[5] C.-N. Hsu and M.-T. Dung, "Generating finite-state transducers for semi-structured data extraction from the web," Information Systems, vol. 23, no. 8, pp. 521-538, 1998.
[6] R. L. Rivest and R. E. Schapire, "Diversity-based inference of finite automata," in Proc. 28th Annu. IEEE Sympos. Found. Comput. Sci. IEEE Computer Society Press, Los Alamitos, CA, 1987, pp. 78-87.
[7] C. R. Anderson and P. Domingos, "Relational Markov models and their application to adaptive web navigation," in The Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2002). Edmonton, Alberta, Canada: ACM Press, July 2002, pp. 143-152.
[8] I. Cadez, D. Heckerman, C. Meek, P. Smyth, and S. White, "Visualization of navigation patterns on a web site using model-based clustering," in Knowledge Discovery and Data Mining, March 2000, pp. 280-284.
[9] P. Smyth, "Clustering sequences with hidden Markov models," in Advances in Neural Information Processing Systems, M. C. Mozer, M. I. Jordan, and T. Petsche, Eds., vol. 9. The MIT Press, 1997, p. 648.
[10] R. Agrawal and R. Srikant, "Mining sequential patterns," in Eleventh International Conference on Data Engineering, P. S. Yu and A. S. P. Chen, Eds. Taipei, Taiwan: IEEE Computer Society Press, 1995, pp. 3-14.
[11] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. Verkamo, "Fast discovery of association rules," Advances in Knowledge Discovery and Data Mining, pp. 307-328, 1996.
[12] K.-F. Lee, Automatic Speech Recognition: The Development of the SPHINX System. Boston, MA: Kluwer Academic Publishers, 1989.
[13] Z. Su, Q. Yang, Y. Lu, and H.-J. Zhang, "What next: A prediction system for web request using n-gram sequence models," in Web Information Systems Engineering, 2000, pp. 214-221.
[14] M. Perkowitz and O. Etzioni, "Towards adaptive Web sites: conceptual framework and case study," Computer Networks, vol. 31, no. 11-16, pp. 1245-1258, 1999.
[15] J. Pitkow and P. Pirolli, "Mining longest repeating subsequences to predict world wide web surfing," in Second USENIX Symposium on Internet Technologies and Systems. Boulder, CO, Oct 1999, pp. 139-150.
[16] B. Mobasher, R. Cooley, and J. Srivastava, "Automatic personalization based on Web usage mining," Communications of the ACM, vol. 43, no. 8, pp. 142-151, 2000.
[17] E.-H. Han, G. Karypis, V. Kumar, and B. Mobasher, "Hypergraph based clustering in high-dimensional data sets: A summary of results," Data Engineering Bulletin, vol. 21, no. 1, pp. 15-22, 1998.
[18] E.-H. Han, G. Karypis, V. Kumar, and B. Mobasher, "Clustering based on association rule hypergraphs," in Research Issues on Data Mining and Knowledge Discovery, 1997, pp. 9-13.
[19] O. Nasraoui and R. Krishnapuram, "Robust multi-resolution web usage data mining using hierarchical unsupervised niche clustering," in ANNIE (Artificial Neural Networks In Engineering) Conference, St. Louis, Nov 2001, pp. 369-374, won the Best Paper Award.
[20] B. Hay, G. Wets, and K. Vanhoof, "Clustering navigation patterns on a website using a sequence alignment," in Intelligent Techniques for Web Personalization: IJCAI 2001 17th International Joint Conference on Artificial Intelligence, Seattle, Wash., USA, August 2001, pp. 1-6.
[21] D. Sankoff, Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, J. Kruskal, Ed. Addison-Wesley, 1983.
[22] I. Cadez and P. Smyth, "Probabilistic clustering using hierarchical models," 1999.
