ML Module 4 Unsupervised Learning - Updated

The presentation covers clustering, a method for grouping data points based on similarity, and outlines various types including partitioning, hierarchical, fuzzy, density-based, and model-based clustering. It details the K-means algorithm as a partitioning method, explaining its iterative process, advantages, and disadvantages. Additionally, it introduces fuzzy clustering and hierarchical clustering, along with their respective algorithms and characteristics.

Uploaded by

Sumita Gupta
Copyright
© All Rights Reserved

THIS PRESENTATION IS ABOUT

• Introduction to Clustering
• Types of Clustering
• Partitioning-based Clustering
    ◦ K-means Algorithm
• Fuzzy Clustering
    ◦ Fuzzy C-Means Algorithm
• Hierarchical Clustering
    ◦ Agglomerative Algorithm
• Density-based Clustering
    ◦ DBSCAN Algorithm
• Model-based Clustering

CLUSTERING
CLUSTERING: INTRODUCTION
Clustering is the task of dividing a population or set of data
points into groups such that points in the same group are more
similar to each other than to points in other groups.
The aim is to separate data with similar traits and assign them
to clusters.
• Unsupervised learning: requires data, but no labels.
• Detects patterns, for example:
    ◦ Groups of emails or search results
    ◦ Customer shopping patterns
    ◦ Regions of images
TYPES OF CLUSTERING
CLUSTERING: TYPES
• Partitioning methods:
A division of the set of data objects into non-overlapping
clusters such that each object is in exactly one subset.
Example: k-means

• Hierarchical clustering:
Also known as nested clustering, because it allows clusters to
exist within bigger clusters, forming a tree. Example:
agglomerative clustering

• Fuzzy clustering:
Reflects the fact that an object can simultaneously belong to
more than one group, with a degree of membership in each.
Example: fuzzy c-means

• Density-based clustering:
Searches the data space for regions of high data-point density
separated by regions of low density. Example: DBSCAN

• Model-based clustering:
Provides a framework for incorporating our knowledge about a
domain.
PARTITIONING CLUSTERING
PARTITION BASED CLUSTERING
EXAMPLE: K-MEANS
K-MEANS CLUSTERING
• An iterative clustering algorithm
• Partition-based clustering
• Each cluster is associated with a centroid
• Each point is assigned to the cluster with the closest
centroid
• The number of clusters, K, must be specified.
1. Initial centroids are often chosen randomly.
    ◦ Clusters produced vary from one run to another.
2. The centroid is (typically) the mean of the points in the
cluster.
3. "Closeness" is measured by Euclidean distance, cosine
similarity, correlation, etc.
4. K-means will converge for the common similarity measures
mentioned above.
5. Most of the convergence happens in the first few
iterations.
    ◦ Often the stopping condition is changed to "until relatively
few points change clusters".
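The points above can be sketched in code. The following is a minimal illustration, not the presentation's own implementation: it assumes Euclidean distance, picks initial centroids by random sampling from the data, and stops when no point changes cluster (a stricter rule than "relatively few points change"). The function name `kmeans`, its parameters, and the toy data are all hypothetical.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means with Euclidean distance on a list of equal-length tuples."""
    rng = random.Random(seed)
    # Step 1: pick k data points at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every point to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Stop when no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each centroid as the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Hypothetical toy data: two well-separated groups of 2-D points.
toy_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(toy_points, k=2)
print(centroids, labels)
```

Changing `seed` changes the initial centroids, which is why results can differ from one run to another on less cleanly separated data.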
K-MEANS CLUSTERING: EXAMPLE
• Given: number of clusters K = 2
• Data points:
• Solution:
Step 1: Initial centroids: K1 = (185, 72), K2 = (170, 56)
Step 2: Calculate the Euclidean distance from each centroid
to each data point.
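As a small illustration of Step 2, the sketch below computes the Euclidean distance from one point to both initial centroids and assigns the point to the nearer one. The centroids are taken from Step 1; the data point `(168, 60)` is a hypothetical value chosen only for illustration and is not taken from the presentation's data.

```python
import math

# Initial centroids from Step 1 of the example.
k1 = (185, 72)
k2 = (170, 56)

# Hypothetical data point, used only for illustration (not from the slides).
point = (168, 60)

d1 = math.dist(point, k1)  # sqrt((168-185)^2 + (60-72)^2) = sqrt(433)
d2 = math.dist(point, k2)  # sqrt((168-170)^2 + (60-56)^2) = sqrt(20)

# The point joins the cluster whose centroid is closest.
nearest = "K1" if d1 <= d2 else "K2"
print(round(d1, 2), round(d2, 2), nearest)  # 20.81 4.47 K2
```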
Step 3: Final cluster allocation
K-MEANS ADVANTAGES
• Relatively simple to implement.
• Scales to large data sets.
• Guarantees convergence (to a local optimum, which may not be
the global one).
• Can warm-start the positions of centroids.
• Easily adapts to new examples.
• Can be generalized to clusters of different shapes and
sizes, such as elliptical clusters.
K-MEANS DISADVANTAGES
• Choosing k manually.
• Being dependent on initial values.
For a low k, you can mitigate this dependence by running k-means
several times with different initial values and picking the best result.
• Clustering data of varying sizes and density.
k-means has trouble clustering data where clusters are of varying sizes
and density.
• Clustering outliers.
Centroids can be dragged by outliers, or outliers might get their own
cluster instead of being ignored. Consider removing or clipping outliers
before clustering.
• Scaling with number of dimensions.
As the number of dimensions increases, a distance-based similarity
measure converges to a constant value between any given examples,
so distances become less useful for telling clusters apart.
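The last point, about dimensionality, can be seen in a small experiment. This sketch assumes points drawn uniformly at random from the unit hypercube (an assumption made for illustration, not something stated in the presentation) and measures how much the distances from one query point to the others differ, relative to the smallest distance. The helper name `relative_contrast` is hypothetical.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (max - min) / min of distances from one query point to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


# The spread of distances shrinks relative to their size as dim grows.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

A value near zero means the nearest and farthest points are almost equally far away, so a distance-based assignment to the "closest" centroid carries little information.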
K-NN VS K-MEANS
FUZZY CLUSTERING
CLUSTERING SCHEMAS
EXAMPLE: FUZZY C-MEANS
FUZZY C-MEANS ALGORITHM
FCM ADVANTAGES & DISADVANTAGES
HIERARCHICAL CLUSTERING
EXAMPLE: AGGLOMERATIVE CLUSTERING
AGGLOMERATIVE CLUSTERING
DENSITY BASED CLUSTERING
K-MEANS VS DENSITY BASED CLUSTERING
EXAMPLE: DBSCAN
DBSCAN
DBSCAN: ADVANTAGES & DISADVANTAGES
