Clustering

Clustering is an unsupervised machine learning technique used to group unlabeled data points that are similar to each other. It divides the data points into a number of groups where data points within each group are more similar to other data points in the same group than those in other groups. Clustering is commonly used in pattern recognition, image analysis, and machine learning. It is useful for data reduction, finding natural groupings in data, and outlier detection. Common clustering models include connectivity, centroid, distribution, and density models. Clustering has various applications such as market segmentation, data analysis, social network analysis, and image segmentation.

Uploaded by

nikhil shinde

Jan 30 2023

Introduction to Clustering
• Clustering is a type of unsupervised learning method.
• An unsupervised learning method is one in which we draw inferences from datasets consisting of input data without labeled responses.
• Generally, it is used to find meaningful structure, explanatory underlying processes, generative features, and groupings inherent in a set of examples.
Overview
• Clustering is the task of dividing a population or set of data points into groups such that data points in the same group are more similar to one another than to data points in other groups.
• A cluster is thus a collection of objects grouped on the basis of their similarity to each other and their dissimilarity to objects in other clusters.
• Clustering is a main task of exploratory data analysis and a common technique for statistical data analysis, used in many fields, including:
  • pattern recognition,
  • image analysis,
  • machine learning.
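The definition above (points in a group are closer to each other than to points in other groups) can be checked numerically. The following is a minimal sketch on hypothetical toy data: the two groups, the helper names `mean_within` and `mean_between`, and the choice of Euclidean distance as the similarity measure are all assumptions made for illustration.

```python
from itertools import combinations, product
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical toy data: two hand-made groups of 2-D points.
group_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)]
group_b = [(8.0, 8.0), (9.0, 8.5), (8.5, 9.5)]

def mean_within(points):
    """Average distance over all pairs of points inside one group."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)

def mean_between(first, second):
    """Average distance over all pairs drawn from two different groups."""
    pairs = list(product(first, second))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)

within_a = mean_within(group_a)
within_b = mean_within(group_b)
between = mean_between(group_a, group_b)

print(f"within A: {within_a:.2f}  within B: {within_b:.2f}  between: {between:.2f}")
```

For a grouping that fits the definition, both within-group averages should come out clearly smaller than the between-group average.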
Why Clustering
• Clustering is important because it reveals the intrinsic grouping in unlabeled data.
• There is no single criterion for a good clustering. It depends on the user and on which criteria satisfy their need. For instance, we could be interested in finding representatives for homogeneous groups (data reduction), in finding “natural clusters” and describing their unknown properties (“natural” data types), in finding useful and suitable groupings (“useful” data classes), or in finding unusual data objects (outlier detection).
• A clustering algorithm must make assumptions about what constitutes similarity between points, and different assumptions produce different, equally valid clusterings.
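To illustrate how the similarity assumption changes the result, here is a small sketch in which the same point is assigned to different groups depending on the distance measure. The point, the two group centers, and the helper names `manhattan` and `nearest` are hypothetical; Euclidean and Manhattan distance are just two of many possible choices.

```python
from math import dist  # Euclidean distance, Python 3.8+

def manhattan(p, q):
    """Manhattan (city-block) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def nearest(point, centers, distance):
    """Return the label of the center closest to `point` under `distance`."""
    return min(centers, key=lambda label: distance(point, centers[label]))

# Hypothetical group centers and a point to assign.
centers = {"A": (3.0, 3.0), "B": (0.0, 5.0)}
point = (0.0, 0.0)

by_euclidean = nearest(point, centers, dist)       # A is about 4.24 away, B is 5.0 away
by_manhattan = nearest(point, centers, manhattan)  # A is 6.0 away, B is 5.0 away

print(by_euclidean, by_manhattan)  # prints: A B
```

Neither assignment is wrong; each is consistent with its own notion of similarity.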
Cluster Model Types
Typical cluster models include:
• Connectivity models: for example, hierarchical clustering builds models based on distance connectivity.
• Centroid models: for example, the k-means algorithm represents each cluster by a single mean vector.
• Distribution models: clusters are modeled using statistical distributions, such as the multivariate normal distributions used by the expectation-maximization algorithm.
• Density models: for example, DBSCAN and OPTICS define clusters as connected dense regions in the data space.
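As an example of a centroid model, the sketch below implements the basic k-means iteration (Lloyd's algorithm) in plain Python. It is an illustration under stated assumptions, not a production implementation: the function name `k_means`, the toy points, and the hand-picked starting centroids are hypothetical, and real implementations usually choose starting centroids randomly or with a seeding heuristic.

```python
from math import dist  # Euclidean distance, Python 3.8+

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and mean-update steps until centroids stop moving."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean vector of its members.
        updated = []
        for members, old in zip(clusters, centroids):
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, clusters

# Hypothetical toy data with two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.5)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])

for center, members in zip(centroids, clusters):
    print(center, members)
```

Each returned centroid is the single mean vector that represents its cluster, which is what distinguishes centroid models from the connectivity, distribution, and density models listed above.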
Clustering Uses
Clustering is widely used across many tasks. Some of the most common uses are:
• Market segmentation
• Statistical data analysis
• Social network analysis
• Image segmentation
• Anomaly detection, etc.
