The Framework For Behavioral Pattern-Based Clustering
Consider a collection of customer transactions to be clustered, {T1, T2, …, Tn}. A clustering C is a partition {C1, C2, …, Ck} of {T1, T2, …, Tn}, and each Ci is a cluster. The goal is to maximize the difference between clusters and the similarity of transactions within clusters. In words, we cluster to maximize a quantity M, where M is defined as follows:

$$M(C_1, C_2, \ldots, C_k) = \mathrm{Difference}(C_1, C_2, \ldots, C_k) + \sum_{i=1}^{k} \mathrm{Similarity}(C_i)$$

Here we give a specific definition only for the difference between two clusters. This is sufficient, since hierarchical clustering techniques can be used to cluster the transactions repeatedly into two groups in such a way that the process results in clustering the transactions into an arbitrary number of clusters (which is generally desirable because the number of clusters does not have to be specified up front). The exact definition of difference and similarity will depend on the specific representation of behavioral patterns. Yang & Padmanabhan (2003, 2005) focus on clustering customers’ Web transactions and use itemsets as the representation of behavioral patterns. With the representation given, the difference and similarity between two clusters are defined as follows:

For each pattern Pa considered, we calculate the support of this pattern in cluster Ci and the support of the pattern in cluster Cj, then compute the relative difference between these two support values and aggregate these relative differences across all patterns. The support of a pattern in a cluster is the proportion of the transactions in the cluster that contain that pattern. The intuition behind the definition of difference is that the support of the patterns in one cluster should be different from the support of the patterns in the other cluster if the underlying behavioral patterns are different. Here we use the relative difference between two support values instead of the absolute difference. Yang & Padmanabhan (2007) prove that, under certain natural distributional assumptions, the difference metric above is maximized when the correct clusters are discovered.

Here, the goal of the similarity measure is to capture how similar transactions are within each cluster. The heuristic is that, if transactions are more similar to each other, then they can be assumed to share more patterns. Hence, one approach is to use the number of strong patterns generated as a proxy for the similarity. If itemsets are used to represent patterns, then the number of frequent itemsets in a cluster can be used as a proxy for similarity.

The Clustering Algorithm

The ideal algorithm will be one that maximizes M (defined in the previous section). However, for the objective function defined above, if there are n transactions and two clusters that we are interested in learning, the number of possible clustering schemes to examine is 2^n. Hence, a heuristic approach is called for. Yang & Padmanabhan (2003, 2005) provide two different clustering algorithms. The main heuristic used in the hierarchical algorithm presented in Yang & Padmanabhan (2005) is as follows. For each pattern, the data is divided into two parts such that all records containing that pattern are in one cluster and the remaining records are in the other cluster. The division maximizing the global objective M is chosen. Further divisions are conducted following a similar heuristic. The experiments in Yang & Padmanabhan (2003, 2005) indicate that the behavioral pattern-based customer segmentation approach is highly effective.

FUTURE TRENDS

Firms are increasingly realizing the importance of understanding and leveraging customer-level data, and critical business decision models are being built upon analyzing such data. Nowadays, massive amounts of data reflecting customers’ behavioral patterns are being collected, so the practice of analyzing such data to identify behavioral patterns and using the patterns discovered to facilitate decision making is becoming more and more popular. Utilizing behavioral patterns for segmentation, classification, customer retention, targeted marketing, etc. is on the research agenda. For different application domains, the representations of behavioral patterns can be different. Different algorithms need to be designed for different pattern representations in different domains. Also, given the representation of the behavioral patterns, similarity and difference may also need to be defined differently. These all call for more research in this field.
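
As a rough illustration of the difference and similarity measures and of the pattern-based splitting heuristic described earlier in this section, the following Python sketch evaluates one two-way split over a small set of itemset patterns. It is not the implementation of Yang & Padmanabhan. The normalization of the relative difference by the mean of the two supports, the minimum-support threshold used to count "strong" patterns, the unweighted sum of difference and similarity, the toy data, and all function names are assumptions made for illustration only.

```python
from typing import FrozenSet, List, Optional, Sequence, Tuple

Transaction = FrozenSet[str]
Pattern = FrozenSet[str]


def support(pattern: Pattern, cluster: Sequence[Transaction]) -> float:
    """Proportion of transactions in the cluster that contain the pattern."""
    if not cluster:
        return 0.0
    return sum(1 for t in cluster if pattern <= t) / len(cluster)


def difference(patterns: Sequence[Pattern],
               ci: Sequence[Transaction],
               cj: Sequence[Transaction]) -> float:
    """Sum of relative support differences across all patterns.

    Assumption: 'relative' means |si - sj| divided by the mean of si and sj.
    """
    total = 0.0
    for p in patterns:
        si, sj = support(p, ci), support(p, cj)
        mean = (si + sj) / 2
        if mean > 0:  # a pattern absent from both clusters contributes nothing
            total += abs(si - sj) / mean
    return total


def similarity(patterns: Sequence[Pattern],
               cluster: Sequence[Transaction],
               min_support: float = 0.5) -> int:
    """Proxy for similarity: number of candidate patterns frequent in the cluster.

    Assumption: 'frequent' means support >= min_support.
    """
    return sum(1 for p in patterns if support(p, cluster) >= min_support)


def best_split(patterns: Sequence[Pattern],
               transactions: Sequence[Transaction],
               min_support: float = 0.5,
               ) -> Optional[Tuple[float, Pattern, List[Transaction], List[Transaction]]]:
    """Try one split per pattern and keep the one with the largest M."""
    best = None
    for p in patterns:
        with_p = [t for t in transactions if p <= t]
        without_p = [t for t in transactions if not p <= t]
        if not with_p or not without_p:
            continue  # the pattern does not divide the data
        m = (difference(patterns, with_p, without_p)
             + similarity(patterns, with_p, min_support)
             + similarity(patterns, without_p, min_support))
        if best is None or m > best[0]:
            best = (m, p, with_p, without_p)
    return best


# Hypothetical Web transactions: each is the set of page categories visited.
transactions = [frozenset(t) for t in (
    {"news", "sports"}, {"news", "sports"}, {"news", "sports", "mail"},
    {"shop", "cart"}, {"shop", "cart", "mail"}, {"shop", "cart"},
)]
# Hypothetical candidate itemsets; in practice these would be mined from the data.
patterns = [frozenset(p) for p in (
    {"news"}, {"sports"}, {"shop"}, {"cart"}, {"mail"},
    {"news", "sports"}, {"shop", "cart"},
)]

result = best_split(patterns, transactions)
if result is not None:
    m, pattern, left, right = result
    print(sorted(pattern), m, len(left), len(right))
```

On this toy data the split on "news" separates the news and sports visitors from the shopping visitors, while a split on "mail" scores lower because mail usage is spread evenly over both groups. Applying `best_split` recursively to each resulting part would give a divisive hierarchy in the spirit of the heuristic above, although a stopping rule would also be needed and is not specified here.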