Cluster Analysis
Unsupervised Learning
•Interpretability
The result of clustering should be usable, understandable and interpretable.
•Helps in dealing with messy data
Real-world data is usually messy and unstructured, so it cannot be analyzed quickly. That is
why clustering is so significant in data mining. Grouping gives the data some
structure by organizing it into groups of similar data objects. This makes it
easier for the data expert to process the data and to discover new things.
•High Dimensionality
Clustering can handle high-dimensional data as well as data of
small size.
Requirements of Clustering in Data Mining
•Arbitrary-shape clusters are discovered
Clustering algorithms can detect clusters of arbitrary shape.
Small clusters with a spherical shape can also be found.
•Algorithm Usability with multiple kinds of data
Clustering algorithms can be used with many different kinds of data, such as
binary, categorical, and interval-based data.
•Clustering Scalability
Databases are usually enormous, so the algorithm must be
scalable enough to handle them.
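The arbitrary-shape requirement above can be illustrated with a minimal sketch. It is not a specific algorithm named in these notes: it simply links points that lie within a distance threshold and treats each connected group as a cluster. The helper name threshold_clusters, the parameter eps, and the sample points are all assumptions made for illustration.

```python
from collections import deque
from math import dist


def threshold_clusters(points, eps):
    """Group points into connected components where neighbours lie within eps.

    Hypothetical helper for illustration: two points share a cluster if a chain
    of points links them, with each hop no longer than eps.
    """
    labels = [None] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        # Breadth-first search from an unlabelled point.
        labels[start] = current
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(len(points)):
                if labels[j] is None and dist(points[i], points[j]) <= eps:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


# An L-shaped chain of points (made-up data) plus a small, compact group far away.
l_shape = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
blob = [(10, 10), (10, 11), (11, 10)]
labels = threshold_clusters(l_shape + blob, eps=1.5)
print(labels)
```

The L-shaped chain ends up in one cluster even though it is not spherical, and the small compact group forms a second cluster. The pairwise scan is quadratic in the number of points, so this sketch does not meet the scalability requirement for large databases.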
Data Mining Clustering Methods
1. Partitioning Clustering Method
In this method, the “p” objects of the database are divided into “m”
partitions. Each partition represents a cluster, and m < p.
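As a rough sketch of how p objects might be split into m partitions, the code below uses k-means (Lloyd's iteration), one common partitioning technique. The notes do not name a specific algorithm, so the choice of k-means, the function name kmeans, the seeding rule, and the sample objects are all assumptions.

```python
from math import dist


def kmeans(objects, m, iterations=100):
    """Partition p objects into m clusters with Lloyd's k-means iteration.

    Assumption for illustration: the first m objects seed the centroids, which
    keeps the example deterministic; real code would pick seeds more carefully.
    """
    if not 0 < m < len(objects):
        raise ValueError("need 0 < m < p")
    centroids = [tuple(obj) for obj in objects[:m]]
    assignment = None
    for _ in range(iterations):
        # Assignment step: each object joins the partition of its nearest centroid.
        new_assignment = [
            min(range(m), key=lambda c: dist(obj, centroids[c])) for obj in objects
        ]
        if new_assignment == assignment:
            break  # converged: no object changed partition
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(m):
            members = [obj for obj, a in zip(objects, assignment) if a == c]
            if members:  # keep the old centroid if a partition is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids


# Made-up objects: three near (1, 1) and three near (8.5, 8.5).
objects = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
assignment, centroids = kmeans(objects, m=2)
print(assignment)
```

Here each object receives exactly one partition index, so the m partitions are disjoint and together cover all p objects.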
Often, an object can belong to more than one cluster.
These are mostly borderline objects, for which the cluster
boundaries are defined to overlap.
An example is demographic clustering, in which a person can belong
to both a student cluster and a high-spender cluster, a combination
that would be rare.
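One way to picture overlapping membership is sketched below. The rule is hypothetical and not taken from these notes: an object joins every cluster whose centroid is nearly as close as the nearest one. The function name overlapping_membership, the tolerance parameter, and the centroids are assumptions chosen only to show the mechanics.

```python
from math import dist


def overlapping_membership(obj, centroids, tolerance=1.25):
    """Return every cluster whose centroid is almost as close as the nearest one.

    Hypothetical rule for illustration: an object joins cluster c when its
    distance to c is at most `tolerance` times its distance to the nearest centroid.
    """
    distances = {name: dist(obj, centre) for name, centre in centroids.items()}
    nearest = min(distances.values())
    return sorted(name for name, d in distances.items() if d <= tolerance * nearest)


# Made-up two-feature centroids, used only to show the mechanics.
centroids = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
inside = overlapping_membership((1.0, 0.0), centroids)      # well inside A
borderline = overlapping_membership((5.2, 0.0), centroids)  # near the boundary
print(inside, borderline)
```

With tolerance set to 1.0, every object falls back to its single nearest cluster (ties aside), so the overlap is confined to borderline objects and its width is controlled by one parameter.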
6. Model-Based Clustering Methods