4. Frequent pattern based clustering
• Frequent pattern-based clustering methods use frequent itemsets or
patterns to identify clusters in a dataset. A frequent itemset is a set of
items that co-occur in at least a minimum number of records (the minimum
support). These methods find associations among items and use them to
group similar data points. Approaches within this category include:
1. Frequent Itemset Mining for Clustering:
• Apriori Algorithm: Originally designed for association rule mining, the Apriori
algorithm can be adapted for clustering. It identifies frequent itemsets level by
level, and records that contain the same frequent itemset can be grouped into a
cluster. Records whose items frequently co-occur are considered similar.
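The idea above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the toy transactions, the function names `apriori` and `cluster_by_itemset`, and the rule "assign each record to the largest frequent itemset it contains" are all assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

def cluster_by_itemset(transactions, frequent):
    """Assumed grouping rule: label each record with the largest frequent
    itemset it contains (ties broken by higher support)."""
    clusters = {}
    for idx, t in enumerate(transactions):
        covered = [s for s in frequent if s <= set(t)]
        label = max(covered, key=lambda s: (len(s), frequent[s], sorted(s))) if covered else None
        clusters.setdefault(label, []).append(idx)
    return clusters

# Toy data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"bread", "milk", "beer"},
]
frequent = apriori(transactions, min_support=2)
clusters = cluster_by_itemset(transactions, frequent)
```

With a minimum support of 2, the records fall into a {bread, milk} group and a {beer, chips} group. Other assignment rules (for example, allowing overlapping clusters) are equally plausible.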
2. Pattern-Growth Methods:
• FP-Growth (Frequent Pattern Growth): An alternative to the Apriori algorithm,
FP-Growth avoids generating candidate itemsets. It compresses the dataset into a
compact prefix tree called an FP-tree and mines frequent itemsets from it
recursively, using a divide-and-conquer strategy. The patterns discovered can be
used for clustering in the same way as those found by Apriori.
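To make the FP-tree idea concrete, here is a compact sketch of FP-Growth in plain Python. It is a teaching sketch under simplifying assumptions: the class `FPNode` and the functions `build_tree` and `fp_growth` are illustrative names, the transactions are toy data, and no optimizations (such as single-path shortcuts) are included.

```python
from collections import defaultdict

class FPNode:
    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, count) pairs.
    Returns the header table (item -> nodes) and the frequent item supports."""
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    keep = {i: s for i, s in support.items() if s >= min_support}
    root = FPNode(None, None)
    header = defaultdict(list)
    for items, count in weighted_transactions:
        # Order items by descending support so common prefixes share a path.
        ordered = sorted((i for i in set(items) if i in keep),
                         key=lambda i: (-keep[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, keep

def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Recursively mine frequent itemsets; returns {frozenset: support_count}."""
    header, keep = build_tree(weighted_transactions, min_support)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = keep[item]
        # Conditional pattern base: the prefix path above each node for `item`.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Divide and conquer: mine the smaller conditional dataset.
        result.update(fp_growth(conditional, min_support, itemset))
    return result

# Toy data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"bread", "milk", "beer"},
]
patterns = fp_growth([(t, 1) for t in transactions], min_support=2)
```

On this data the sketch finds the same frequent itemsets an Apriori-style search would, but it scans the compressed tree instead of re-counting candidates against every record.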
3. Frequent Pattern-based Partitional Clustering:
• K-Means with Frequent Pattern-based Initialization: Frequent patterns can
be used to initialize the centroids in the k-means algorithm. Because the
starting centroids already reflect patterns inherent in the data, this can
give better initial cluster assignments than random initialization.
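One way to picture this initialization is to encode each record as a binary vector and use the indicator vectors of chosen frequent itemsets as starting centroids. The sketch below assumes exactly that; the helper names, the toy data, and the two `seed_patterns` (standing in for the output of a mining step) are hypothetical.

```python
def encode(transaction, vocabulary):
    """Binary vector: 1.0 where the item is present, 0.0 otherwise."""
    return [1.0 if item in transaction else 0.0 for item in vocabulary]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the supplied centroids."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda j: squared_distance(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"bread", "milk", "beer"},
]
vocabulary = sorted(set().union(*transactions))
points = [encode(t, vocabulary) for t in transactions]

# Assumed to come from a frequent itemset mining step.
seed_patterns = [{"bread", "milk"}, {"beer", "chips"}]
initial = [encode(p, vocabulary) for p in seed_patterns]

labels, centroids = kmeans(points, initial)
```

Here k is fixed by the number of seed patterns. How many patterns to use, and which ones, is a design choice this sketch leaves open.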
4. Density-Based Approaches with Frequent Patterns:
• DBSCAN with Frequent Pattern-based Seed Points: Density-Based Spatial
Clustering of Applications with Noise (DBSCAN) can be adapted so that data
points containing frequent patterns serve as seed points. Clusters then grow
outward from these seeds according to the density of the surrounding data points.
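Standard DBSCAN has no seed-point input, so the following is only one possible reading of the idea: an ordinary DBSCAN over sets with Jaccard distance that visits the seed records first, so clusters are grown and numbered starting from pattern-bearing points. The function names, the `eps` and `min_pts` values, the toy data, and the choice of seeds are assumptions for illustration.

```python
def jaccard_distance(a, b):
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def seeded_dbscan(points, seeds, eps, min_pts):
    """DBSCAN over sets, visiting seed indices before all other points.
    Labels: cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        # Includes the point itself, as in the usual DBSCAN definition.
        return [j for j in range(len(points))
                if jaccard_distance(points[i], points[j]) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    order = list(seeds) + [i for i in range(len(points)) if i not in seeds]
    for i in order:
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # not a core point; may become a border point later
            continue
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reclaimed as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # only core points expand the cluster
                queue.extend(j_nbrs)
        cluster += 1
    return labels

# Toy data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"bread", "milk", "beer"},
    {"tea"},
]
# Assumed seeds: one record for each frequent pattern found by a mining step.
seeds = [0, 2]
labels = seeded_dbscan(transactions, seeds, eps=0.4, min_pts=2)
```

Visiting seeds first does not change which core points end up together; it mainly fixes the order in which clusters are discovered. Border points that lie within reach of two clusters are the exception, since they go to whichever cluster reaches them first.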
5. Graph-Based Methods:
• Graph-Based Clustering with Frequent Patterns: A graph representation of the
data is constructed in which nodes represent data points and edges represent
relationships based on shared frequent patterns. Clusters can then be
identified from the graph structure, for example as connected components.
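A small sketch of this construction: link two records whenever both contain the same frequent pattern, then read clusters off as connected components. The pattern list is assumed to come from an earlier mining step, and the function names and toy data are illustrative.

```python
from itertools import combinations

def pattern_graph(transactions, patterns):
    """Adjacency sets: two records are linked if both contain some pattern."""
    adjacency = {i: set() for i in range(len(transactions))}
    for pattern in patterns:
        holders = [i for i, t in enumerate(transactions) if pattern <= t]
        for a, b in combinations(holders, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

def connected_components(adjacency):
    """Depth-first search over the graph; each component is one cluster."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components

# Toy data, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"bread", "milk", "beer"},
    {"tea"},
]
# Assumed output of a frequent itemset mining step.
patterns = [{"bread", "milk"}, {"beer", "chips"}]

graph = pattern_graph(transactions, patterns)
components = connected_components(graph)
```

Records that contain no frequent pattern end up as singleton components, which can be treated as outliers. Denser notions of a cluster (cliques, communities) would need a different extraction step.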
6. Sequential Pattern Mining for Temporal Clustering:
• SPADE (Sequential Pattern Discovery using Equivalence classes):
Originally designed for sequential pattern mining, SPADE can be adapted for
temporal clustering by treating frequent sequential patterns as indicative of
clusters in time-ordered sequence data.
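The sketch below borrows one idea from SPADE, the vertical id-list of (sequence id, position) pairs, to mine frequent subsequences and then group sequences by the longest pattern they contain. It is deliberately simplified and is not the full algorithm: events are single items, there is no equivalence-class decomposition, and the function names, `max_length` cap, grouping rule, and toy data are assumptions.

```python
from collections import defaultdict

def frequent_sequences(sequences, min_support, max_length=3):
    """Mine frequent subsequences using vertical id-lists.
    Returns {pattern_tuple: number of sequences containing it}."""
    idlists = defaultdict(list)
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            idlists[(item,)].append((sid, pos))

    def support(idlist):
        return len({sid for sid, _ in idlist})

    frequent_items = {p: l for p, l in idlists.items() if support(l) >= min_support}
    result = {p: support(l) for p, l in frequent_items.items()}
    frontier = frequent_items
    for _ in range(max_length - 1):
        extended = {}
        for pattern, plist in frontier.items():
            for (item,), ilist in frequent_items.items():
                # Temporal join: occurrences of `item` that follow the pattern.
                joined = sorted({(sid2, pos2)
                                 for sid1, pos1 in plist
                                 for sid2, pos2 in ilist
                                 if sid1 == sid2 and pos2 > pos1})
                if support(joined) >= min_support:
                    extended[pattern + (item,)] = joined
        result.update({p: support(l) for p, l in extended.items()})
        frontier = extended
    return result

def contains(seq, pattern):
    """True if `pattern` occurs in `seq` as an ordered subsequence."""
    it = iter(seq)
    return all(item in it for item in pattern)

def cluster_by_pattern(sequences, patterns):
    """Assumed rule: label each sequence with its longest frequent pattern."""
    clusters = {}
    for idx, seq in enumerate(sequences):
        found = [p for p in patterns if contains(seq, p)]
        label = max(found, key=lambda p: (len(p), patterns[p], p)) if found else None
        clusters.setdefault(label, []).append(idx)
    return clusters

# Toy event sequences, invented for illustration.
sequences = [
    ["login", "search", "buy"],
    ["login", "search", "view", "buy"],
    ["login", "browse", "logout"],
    ["login", "browse", "help", "logout"],
]
patterns = frequent_sequences(sequences, min_support=2)
clusters = cluster_by_pattern(sequences, patterns)
```

With a minimum support of 2, the sequences split into a "login, search, buy" group and a "login, browse, logout" group, even though two of them contain extra events in between.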
7. Sequential Pattern-based Clustering:
• SPAM (Sequential PAttern Mining): Similar to SPADE, SPAM discovers
frequent sequential patterns, using a bitmap representation of the data to
count support efficiently. The patterns found can be indicative of clusters in
sequential data.
• These methods leverage the inherent relationships and co-occurrences in
the data captured by frequent patterns to form clusters. The resulting
clusters often represent subsets of data points that share common
characteristics or behaviors. Keep in mind that the effectiveness of these
methods depends on the nature of the data and the specific goals of the
clustering task.