Module 5
Introduction to clustering
• Let's take a simple example to understand the clustering technique.

Customer Name | Region    | Monthly Income | Age | Phone Model | Laptop Model
A             | Bangalore | 50,000         | 29  | iPhone      | MacBook
B             | Bangalore | 35,000         | 34  | Motorola    | Dell
C             | Mumbai    | 80,000         | 36  | iPhone      | Dell
D             | Mumbai    | 40,000         | 26  | iPhone      | Dell
E             | Mumbai    | 55,000         | 39  | iPhone      | MacBook
• Clustering by Region: Bangalore → {A, B}; Mumbai → {C, D, E}
• Clustering by Monthly Income: greater than 50,000 → {C, E}; 50,000 or less → {A, B, D}
• Clustering by Age: greater than 35 → {C, E}; between 30 and 35 → {B}; less than 30 → {A, D}
• Clustering by Phone Model: iPhone → {A, C, D, E}; Motorola → {B}
• Clustering by Laptop Model: MacBook → {A, E}; Dell → {B, C, D}
Introduction to clustering
• Properties of a cluster
• Property 1: Cohesion (Intra-cluster
Similarity)
• All the data points in a cluster should be similar to
each other.
• This property means that all data points within the
same cluster should be as similar to each other as
possible.
• In other words, the points within a single cluster
should be close together, indicating a strong internal
similarity. The more similar the data points are
within a cluster, the better the cohesion of that
cluster.
Introduction to clustering
• Properties of a cluster
• Property 2: Separation (Inter-cluster
Dissimilarity)
• The data points from different clusters should be as
different as possible
• This property focuses on ensuring that data points
from different clusters are as different from each
other as possible.
• This means that the clusters should be well-
separated, with a clear distinction between them.
The more dissimilar the clusters are, the better the
separation between them.
Overview of distance metrics
• Distance metrics are a key part of several machine learning algorithms.
• They are used in both supervised and unsupervised learning, generally to calculate the similarity
between data points.
• An effective distance metric improves the performance of our machine learning model, whether that’s
for classification tasks or clustering.
• Distance metrics allow us to numerically quantify how similar two points are by calculating the distance between them.
• When a distance metric yields a small value, the two points are similar; when it yields a large value, they are different.
Overview of distance metrics
• Types of Distance Metrics in Machine Learning
1. Euclidean Distance
2. Manhattan Distance
3. Minkowski Distance
4. Hamming Distance
Overview of distance metrics
• Euclidean Distance
• Euclidean Distance represents the shortest
distance between two vectors.
• It is the square root of the sum of squares of
differences between corresponding elements.
• Most machine learning algorithms, including
K-Means, use this distance metric to measure
the similarity between observations.
Overview of distance metrics
• Euclidean Distance
• Let's say we have two points A(p₁, p₂) and B(q₁, q₂), as shown in the figure.
• The Euclidean distance between these two points, A and B, is
  d(A, B) = √[(q₁ − p₁)² + (q₂ − p₂)²]
• We use this formula when we are dealing with 2 dimensions. We can generalize this for an n-dimensional space as
  d(p, q) = √( Σᵢ₌₁ⁿ (pᵢ − qᵢ)² )
Where,
n = number of dimensions
pᵢ, qᵢ = the i-th coordinates of data points p and q
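To make the formula concrete, here is a minimal sketch (not from the original slides; the function name and example points are illustrative) of the n-dimensional Euclidean distance in Python:

```python
import math

def euclidean_distance(p, q):
    """Square root of the sum of squared differences between corresponding elements."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# 2-D example with two illustrative points
print(euclidean_distance((2, 10), (5, 8)))  # ~3.61
```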
Overview of distance metrics
• Euclidean distance is a useful metric in many machine learning algorithms, such as:
• K-nearest neighbors
• K-means clustering
Overview of distance metrics
• Manhattan Distance
• Manhattan Distance is the sum of absolute
differences between points across all the
dimensions.
• Also known as the City Block distance.
• This involves measuring the distance
between two points by summing the
differences in their coordinates along each
dimension.
• It is often used in cases where movement
can only occur along grid lines.
Overview of distance metrics
• Manhattan Distance
• Manhattan Distance is the sum of absolute differences between points across all the dimensions.
• We can represent Manhattan Distance as
  d(p, q) = Σᵢ₌₁ⁿ |pᵢ − qᵢ|
• So, the Manhattan distance in a 2-dimensional space is given as
  d(A, B) = |q₁ − p₁| + |q₂ − p₂|
Where,
n = number of dimensions
pᵢ, qᵢ = data points
Overview of distance metrics
• Comparison between Manhattan Distance and Euclidean distance
• While Manhattan distance measures the path along grid lines, Euclidean distance measures the straight-line distance between two points.
• 2-D example: consider two points A(1, 1) and B(4, 5):
• Manhattan distance: |x₁ − x₂| + |y₁ − y₂| = |1 − 4| + |1 − 5| = 3 + 4 = 7 units
• Euclidean distance: √[(1 − 4)² + (1 − 5)²] = √25 = 5 units
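A quick check of the comparison above (an illustrative snippet, not from the slides):

```python
# Manhattan vs. Euclidean distance for A(1, 1) and B(4, 5)
ax, ay, bx, by = 1, 1, 4, 5
manhattan = abs(ax - bx) + abs(ay - by)               # 3 + 4 = 7
euclidean = ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5  # sqrt(25) = 5
print(manhattan, euclidean)  # 7 5.0
```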
Overview of distance metrics
• Applications of Manhattan Distance:
1. Pathfinding algorithms (e.g., A* algorithm)
2. Clustering techniques (e.g., K-Means clustering)
Overview of distance metrics
• Minkowski Distance
• The Minkowski distance is a generalization of the Euclidean and Manhattan distances.
• It allows the flexibility to consider different power values h, determining whether the distance calculation is influenced more by the larger differences or the smaller ones:
  D(p, q) = ( Σᵢ₌₁ⁿ |pᵢ − qᵢ|ʰ )^(1/h)
• For h = 1, this reduces to the Manhattan distance:
  D(p, q) = Σᵢ₌₁ⁿ |pᵢ − qᵢ|
• For h = 2, this reduces to the Euclidean distance:
  D(p, q) = ( Σᵢ₌₁ⁿ |pᵢ − qᵢ|² )^(1/2)
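A small sketch (illustrative names and points) showing that the Minkowski distance with h = 1 and h = 2 reproduces the Manhattan and Euclidean distances:

```python
def minkowski_distance(p, q, h):
    """Generalized distance: (sum of |p_i - q_i|^h) raised to the power 1/h."""
    return sum(abs(pi - qi) ** h for pi, qi in zip(p, q)) ** (1 / h)

a, b = (1, 1), (4, 5)
print(minkowski_distance(a, b, 1))  # 7.0 -> Manhattan
print(minkowski_distance(a, b, 2))  # 5.0 -> Euclidean
```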
Overview of distance metrics
• Minkowski Distance is used in machine learning algorithms such as:
1. K-Nearest Neighbors (KNN)
2. Learning Vector Quantization (LVQ)
3. Self-Organizing Map (SOM)
4. K-Means Clustering
Overview of distance metrics
• Hamming Distance
• Hamming distance is used to measure the difference between two binary vectors: it counts the number of positions at which the corresponding bits are different.
• Hamming distance is all about calculating the similarity between two strings of equal length.
• It is useful when calculating the distance between observations for which we have only binary features.
• Formula: D(A, B) = Σᵢ₌₁ⁿ 𝛿(aᵢ, bᵢ), where
  • aᵢ represents the i-th symbol of string A
  • bᵢ represents the i-th symbol of string B
  • 𝛿(aᵢ, bᵢ) is an indicator function that returns 0 if aᵢ and bᵢ are the same and 1 if they are different
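A minimal sketch of the Hamming distance for equal-length strings or bit vectors (illustrative code, not from the slides):

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length strings (or bit vectors) differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length inputs")
    return sum(1 for ai, bi in zip(a, b) if ai != bi)

print(hamming_distance("10110", "11100"))  # 2
```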
Major clustering approaches
• Approaches used in clustering algorithms
1. Hard Clustering
• Definition: In hard clustering, each data point is assigned to exactly one cluster. This means that every point
definitively belongs to a single cluster without any ambiguity.
Major clustering approaches
• Approaches used in clustering algorithms
2. Soft Clustering
• Definition: In soft clustering, each data point can belong to multiple clusters, with a certain probability or
degree of membership. Instead of a hard assignment to a single cluster, data points are assigned a set of
probabilities that sum to 1, representing their likelihood of belonging to each cluster.
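As an illustration of the hard/soft distinction (a sketch assuming scikit-learn is available; the toy data points are made up), K-Means produces one label per point while a Gaussian mixture model returns per-cluster membership probabilities:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.array([[1.0, 1.2], [1.1, 0.9], [5.0, 5.1], [5.2, 4.8], [3.0, 3.0]])

hard = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
soft = GaussianMixture(n_components=2, random_state=0).fit(X).predict_proba(X)

print(hard)            # hard clustering: one cluster label per point
print(soft.round(2))   # soft clustering: membership probabilities that sum to 1
```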
Learning with Clustering
5.1
Introduction to clustering with overview of distance metrics
Major clustering approaches
5.2
Graph Based Clustering: Clustering with minimal spanning tree
Model based Clustering: Expectation Maximization Algorithm
Density Based Clustering: DBSCAN
Types of clustering
• Partition Based: K-Means, K-Medoids
• Hierarchical: BIRCH, CURE, ROCK
• Density Based: DBSCAN, DENCLUE, OPTICS
• Grid Based: STING, CLIQUE
• Graph Based: MST, CLICK
• Model Based: EM, COBWEB
Partition Based Clustering: K-Means
K-Means Clustering Algorithm:
• Clustering is dividing data points into homogeneous classes or clusters
• Points in the same group are as similar as possible
• Points in different groups are as dissimilar as possible
Partition Based Clustering: K-Means
K-Means Clustering Algorithm:
• The K-means clustering algorithm tries to group similar items in the form of clusters.
• The number of groups is represented by K. If K = 2, there will be two clusters.
• It is a centroid-based algorithm where each cluster is associated with a centroid.
• The main aim of this algorithm is to minimize the sum of the distances between the data
points and their corresponding cluster centroids.
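A minimal sketch of the K-Means loop described above (illustrative Python, not the exact implementation used in the slides; it assumes no cluster ever becomes empty):

```python
import numpy as np

def kmeans(X, k, initial_centroids, max_iter=100):
    """Assign each point to the nearest centroid, recompute centroids, repeat until stable."""
    centroids = np.array(initial_centroids, dtype=float)
    for _ in range(max_iter):
        # Euclidean distance from every point to every centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # new centroid = mean of the points assigned to each cluster
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # no change in centroids -> no further reassignment
        centroids = new_centroids
    return labels, centroids
```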
Partition Based Clustering: K-Means
Cluster the following eight points (with (x, y) representing locations) into three clusters:
A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9)
Solution:
Step 1: Select the number k of clusters: K = 3
Step 2: Select initial centroids; here A1(2, 10), A4(5, 8) and A7(1, 2) are taken as the initial centroids.
Step 3: Compute the Euclidean distance from each point to each centroid, e.g.
d(A1, A1) = √[(2 − 2)² + (10 − 10)²] = 0.00
Partition Based Clustering: K-Means
Iteration 1: distances to the initial centroids C1(2, 10), C2(5, 8), C3(1, 2)

Data Point | x | y  | Dist. to C1(2,10) | Dist. to C2(5,8) | Dist. to C3(1,2) | Cluster
A1         | 2 | 10 | 0.00              | 3.61             | 8.06             | 1
A2         | 2 | 5  | 5.00              | 4.24             | 3.16             | 3
A3         | 8 | 4  | 8.49              | 5.00             | 7.28             | 2
A4         | 5 | 8  | 3.61              | 0.00             | 7.21             | 2
A5         | 7 | 5  | 7.07              | 3.61             | 6.71             | 2
A6         | 6 | 4  | 7.21              | 4.12             | 5.39             | 2
A7         | 1 | 2  | 8.06              | 7.21             | 0.00             | 3
A8         | 4 | 9  | 2.24              | 1.41             | 7.62             | 2

Step 4: Calculate the new centroid for each cluster
Cluster 1 = {A1}: centroid stays (2, 10)
Cluster 2 = {A3, A4, A5, A6, A8}: centroid = ((8+5+7+6+4)/5, (4+8+5+4+9)/5) = (6, 6)
Cluster 3 = {A2, A7}: centroid = ((2+1)/2, (5+2)/2) = (1.5, 3.5)

Step 5: Reassign each point to the nearest new centroid.
Step 6: If any reassignment has occurred, go back to Step 4.

Partition Based Clustering: K-Means
Iteration 2: distances to the centroids C1(2, 10), C2(6, 6), C3(1.5, 3.5)

Data Point | x | y  | Dist. to C1(2,10) | Dist. to C2(6,6) | Dist. to C3(1.5,3.5) | Old Cluster | New Cluster
A1         | 2 | 10 | 0.00              | 5.66             | 6.52                 | 1           | 1
A2         | 2 | 5  | 5.00              | 4.12             | 1.58                 | 3           | 3
A3         | 8 | 4  | 8.49              | 2.83             | 6.52                 | 2           | 2
A4         | 5 | 8  | 3.61              | 2.24             | 5.70                 | 2           | 2
A5         | 7 | 5  | 7.07              | 1.41             | 5.70                 | 2           | 2
A6         | 6 | 4  | 7.21              | 2.00             | 4.53                 | 2           | 2
A7         | 1 | 2  | 8.06              | 6.40             | 1.58                 | 3           | 3
A8         | 4 | 9  | 2.24              | 3.61             | 6.04                 | 2           | 1

A8 has been reassigned, so the new centroids are computed again:
Cluster 1 = {A1, A8}: centroid = ((2+4)/2, (10+9)/2) = (3, 9.5)
Cluster 2 = {A3, A4, A5, A6}: centroid = ((8+5+7+6)/4, (4+8+5+4)/4) = (6.5, 5.25)
Cluster 3 = {A2, A7}: centroid = ((2+1)/2, (5+2)/2) = (1.5, 3.5)

Partition Based Clustering: K-Means
Iteration 3: distances to the centroids C1(3, 9.5), C2(6.5, 5.25), C3(1.5, 3.5)

Data Point | x | y  | Dist. to C1(3,9.5) | Dist. to C2(6.5,5.25) | Dist. to C3(1.5,3.5) | Old Cluster | New Cluster
A1         | 2 | 10 | 1.12               | 6.54                  | 6.52                 | 1           | 1
A2         | 2 | 5  | 4.61               | 4.51                  | 1.58                 | 3           | 3
A3         | 8 | 4  | 7.43               | 1.95                  | 6.52                 | 2           | 2
A4         | 5 | 8  | 2.50               | 3.13                  | 5.70                 | 2           | 1
A5         | 7 | 5  | 6.02               | 0.56                  | 5.70                 | 2           | 2
A6         | 6 | 4  | 6.26               | 1.35                  | 4.53                 | 2           | 2
A7         | 1 | 2  | 7.76               | 6.39                  | 1.58                 | 3           | 3
A8         | 4 | 9  | 1.12               | 4.51                  | 6.04                 | 1           | 1

A4 has been reassigned, so the new centroids are computed again:
Cluster 1 = {A1, A4, A8}: centroid = ((2+5+4)/3, (10+8+9)/3) = (3.67, 9)
Cluster 2 = {A3, A5, A6}: centroid = ((8+7+6)/3, (4+5+4)/3) = (7, 4.33)
Cluster 3 = {A2, A7}: centroid = ((2+1)/2, (5+2)/2) = (1.5, 3.5)

The process repeats until no point changes its cluster. The resulting clusters are:
Cluster 1 = {A1, A4, A8}, Cluster 2 = {A3, A5, A6}, Cluster 3 = {A2, A7},
with centroids (3.67, 9), (7, 4.33) and (1.5, 3.5).
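The worked example can be checked with a short script (a sketch assuming scikit-learn is available; the initial centroids A1, A4, A7 are passed explicitly so the run follows the same starting point as above):

```python
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[2, 10], [2, 5], [8, 4], [5, 8], [7, 5], [6, 4], [1, 2], [4, 9]])
init = np.array([[2, 10], [5, 8], [1, 2]])  # A1, A4, A7 as initial centroids

km = KMeans(n_clusters=3, init=init, n_init=1, max_iter=100).fit(points)
print(km.labels_)           # cluster index for each of A1..A8
print(km.cluster_centers_)  # final centroids, expected near (3.67, 9), (7, 4.33), (1.5, 3.5)
```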
Model based Clustering: Expectation Maximization Algorithm
Expectation Maximization Algorithm: Example
First Experiment
• We choose one of the two coins 5 times
• Each time, we toss the chosen coin 10 times
First, assume we know which coin was used for each set of tosses:

Coin | Toss sequence         | Result
B    | H T T T H H T H T H  | 5 H, 5 T
A    | H H H H T H H H H H  | 9 H, 1 T
A    | H T H H H H H T H H  | 8 H, 2 T
B    | H T H T T T H H T T  | 4 H, 6 T
A    | T H H H T H H H T H  | 7 H, 3 T

Totals: Coin A → 24 H, 6 T; Coin B → 9 H, 11 T

The maximum-likelihood estimates of the head probabilities are
𝜃₁ = (number of heads using coin A) / (total number of flips using coin A) = 24 / (24 + 6) = 0.8
𝜃₂ = (number of heads using coin B) / (total number of flips using coin B) = 9 / (9 + 11) = 0.45
Model based Clustering: Expectation Maximization Algorithm
Expectation Maximization Algorithm: Example
• Now assume a more challenging problem: we don't know the identities of the coins used for each set of tosses (we treat them as hidden variables).

Step 01: Initial Values
Consider 𝜃₁ = 0.60 (coin A) and 𝜃₂ = 0.50 (coin B).

Step 02: E-step
For each toss sequence, compute the likelihood of the sequence under each coin and normalise, e.g. for coin A: P(A) / (P(A) + P(B)). Then split the observed heads and tails between the coins in proportion to these probabilities:

Toss sequence        | Result   | P(coin A) | P(coin B) | Expected counts for A | Expected counts for B
H T T T H H T H T H  | 5 H, 5 T | 0.45      | 0.55      | 2.2 H, 2.2 T          | 2.8 H, 2.8 T
H H H H T H H H H H  | 9 H, 1 T | 0.80      | 0.20      | 7.2 H, 0.8 T          | 1.8 H, 0.2 T
H T H H H H H T H H  | 8 H, 2 T | 0.73      | 0.27      | 5.9 H, 1.5 T          | 2.1 H, 0.5 T
H T H T T T H H T T  | 4 H, 6 T | 0.35      | 0.65      | 1.4 H, 2.1 T          | 2.6 H, 3.9 T
T H H H T H H H T H  | 7 H, 3 T | 0.65      | 0.35      | 4.5 H, 1.9 T          | 2.5 H, 1.1 T

Totals: Coin A → 21.3 H, 8.6 T; Coin B → 11.7 H, 8.4 T

Step 03: M-step
Re-estimate the coin biases from the expected counts:
𝜃₁ = 21.3 / (21.3 + 8.6) ≈ 0.71
𝜃₂ = 11.7 / (11.7 + 8.4) ≈ 0.58
Repeat the E-step and M-step with the updated values until 𝜃₁ and 𝜃₂ converge.
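A compact sketch of the E-step and M-step for this coin example (illustrative Python assuming standard binomial likelihoods; the first iteration should reproduce the table above up to rounding):

```python
# Each row: (number of heads, number of tails) in one set of 10 tosses
tosses = [(5, 5), (9, 1), (8, 2), (4, 6), (7, 3)]
theta1, theta2 = 0.60, 0.50  # initial guesses for coins A and B

for _ in range(10):  # repeat E and M steps
    heads_a = tails_a = heads_b = tails_b = 0.0
    for h, t in tosses:
        # E-step: likelihood of the sequence under each coin, then normalise
        like_a = (theta1 ** h) * ((1 - theta1) ** t)
        like_b = (theta2 ** h) * ((1 - theta2) ** t)
        p_a = like_a / (like_a + like_b)
        p_b = 1 - p_a
        # expected head/tail counts attributed to each coin
        heads_a += p_a * h; tails_a += p_a * t
        heads_b += p_b * h; tails_b += p_b * t
    # M-step: re-estimate the biases from the expected counts
    theta1 = heads_a / (heads_a + tails_a)
    theta2 = heads_b / (heads_b + tails_b)

print(round(theta1, 2), round(theta2, 2))  # converged estimates
```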
Disadvantages of EM algorithm
• The convergence of the EM algorithm is very slow.
Model based Clustering: Expectation Maximization Algorithm
Applications of EM algorithm
• Data clustering,
• Natural language processing (NLP),
• Computer vision,
• Image reconstruction,
• Structural engineering
Learning with Clustering
5.1
Introduction to clustering with overview of distance metrics
Major clustering approaches
5.2
Graph Based Clustering: Clustering with minimal spanning tree
Model based Clustering: Expectation Maximization Algorithm
Density Based Clustering: DBSCAN
Density Based Clustering: DBSCAN
Density based clustering:
• Density-based clustering is a method used in data analysis to identify clusters in a dataset
based on the density of data points in a given region.
• The idea is that clusters are areas of high data point density, separated by areas of low
density.
• This approach does not require the number of clusters to be specified beforehand, unlike
other methods such as k-means.
• It's an unsupervised learning method.
Density Based Clustering: DBSCAN
DBSCAN:
• DBSCAN is the abbreviation for Density-Based Spatial Clustering of Applications with
Noise.
• It is an unsupervised clustering algorithm.
• DBSCAN clustering can work with clusters of any size from huge amounts of data and
can work with datasets containing a significant amount of noise.
• It is basically based on the criteria of a minimum number of points within a region.
• DBSCAN algorithm can cluster densely grouped points efficiently into one cluster.
• It can identify local density in the data points among large datasets. DBSCAN can very
effectively handle outliers.
• An advantage of DBSCAN over the K-means algorithm is that the number of centroids
need not be known beforehand in the case of DBSCAN.
Density Based Clustering: DBSCAN
DBSCAN:
• DBSCAN algorithm depends upon two parameters epsilon and minPoints.
• Epsilon is defined as the radius of each data point around which the density is considered.
• minPoints is the number of points required within the radius so that the data point becomes a
core point.(𝑚𝑖𝑛𝑃𝑜𝑖𝑛𝑡𝑠 = 4)
[Figure: sample data points B, C and D shown with their epsilon (e) radii]
Density Based Clustering: DBSCAN
DBSCAN:
• We can see that point B has no points inside its epsilon (e) radius.
• Hence it is a noise point.
[Figure: the epsilon (e) neighbourhood around object A]
Density Based Clustering: DBSCAN
DBSCAN:
Density reachable:
An object q is density-reachable from p w.r.t ε and
MinPts if there is a chain of objects q1, q2…, qn,
with q1=p, qn=q such that qi+1 is directly
density-reachable from qi w.r.t ε and MinPts for all
1 <= i <= n
Density Based Clustering: DBSCAN
DBSCAN: minPoints = 4
Density reachable:
• Point X is directly density-reachable from object Y.
[Figure: X lies inside the epsilon neighbourhood of object Y; points B and C are nearby]
Density Based Clustering: DBSCAN
DBSCAN:
Density connectivity:
Object q is density-connected to object p w.r.t ε
and MinPts if there is an object o such that both p
and q are density-reachable from o w.r.t ε and
MinPts.
Density Based Clustering: DBSCAN
DBSCAN Algorithm:
1. For each point, find all points within its epsilon (e) radius.
2. Mark a point as a core point if it has at least minPoints points in its neighbourhood.
3. Connect core points that lie within epsilon of each other into the same cluster.
4. Assign each non-core point within epsilon of a core point to that core point's cluster (border point).
5. Label the remaining points as noise.
Density Based Clustering: DBSCAN
DBSCAN Example:
Apply the DBSCAN algorithm to the given data points and create the clusters with
minPts = 4 and epsilon (e) = 1.9.
Use the Euclidean distance and calculate the distance between each pair of points.
P1 (3, 7)
P2 (4, 6)
P3 (5, 5)
P4 (6, 4)
P5 (7, 3)
P6 (6, 2)
P7 (7, 2)
P8 (8, 4)
P9 (3, 3)
P10 (2, 6)
P11 (3, 5)
P12 (2, 4)
Density Based Clustering: DBSCAN
DBSCAN Example: minPts = 4 and epsilon (e) = 1.9
Point | Points within epsilon (e = 1.9)
P1    | P2, P10
P2    | P1, P3, P11
P3    | P2, P4
P4    | P3, P5
P5    | P4, P6, P7, P8
P6    | P5, P7
P7    | P5, P6
P8    | P5
P9    | P12
P10   | P1, P11
P11   | P2, P10, P12
P12   | P9, P11
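This example can be run with scikit-learn's DBSCAN (a sketch under the assumption that sklearn is available; note that sklearn's min_samples counts the point itself when deciding whether it is a core point):

```python
import numpy as np
from sklearn.cluster import DBSCAN

points = np.array([[3, 7], [4, 6], [5, 5], [6, 4], [7, 3], [6, 2],
                   [7, 2], [8, 4], [3, 3], [2, 6], [3, 5], [2, 4]])

db = DBSCAN(eps=1.9, min_samples=4).fit(points)
print(db.labels_)               # cluster index per point; -1 marks noise points
print(db.core_sample_indices_)  # indices of the core points
```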
Density Based Clustering: DBSCAN
DBSCAN Advantages
• Robust to outliers : It is robust to outliers as it defines clusters based on dense regions of data, and
isolated points are treated as noise.
• No need to specify clusters : Unlike some clustering algorithms, DBSCAN does not require the user
to specify the number of clusters beforehand, making it more flexible and applicable to a variety of
datasets.
• Can find arbitrary shaped clusters : DBSCAN can identify clusters with complex shapes and is not
constrained by assumptions of cluster shapes, making it suitable for data with irregular structures.
• Only 2 hyperparameters to tune : DBSCAN has only two primary hyperparameters to tune: “eps”
(distance threshold for defining neighborhood) and “minPoints” (minimum number of points required to
form a dense region). This simplicity can make parameter tuning more straightforward.
Density Based Clustering: DBSCAN
DBSCAN Disadvantages
• Sensitivity to hyperparameters : The performance of DBSCAN can be sensitive to the choice of its
hyperparameters, especially the distance threshold (eps) and the minimum number of points
(min_samples). Suboptimal parameter selection may lead to under-segmentation or over-segmentation.
• Difficulty with varying density clusters : DBSCAN struggles with clusters of varying densities. It
may fail to connect regions with lower point density to the rest of the cluster, leading to suboptimal
cluster assignments in datasets with regions of varying densities.
• Does not predict : Unlike some clustering algorithms, DBSCAN does not predict the cluster
membership of new, unseen data points. Once the model is trained, it is applied to the existing dataset
without the ability to generalize to new observations outside the training set.
Density Based Clustering: DBSCAN
DBSCAN Applications
• Spatial Data Analysis: DBSCAN is particularly well-suited for spatial data clustering due to its ability to find
clusters of arbitrary shapes, which is common in geographic data. It’s used in applications like identifying regions of
similar land use in satellite images or grouping locations with similar activities in GIS (Geographic Information
Systems).
• Anomaly Detection: The algorithm’s effectiveness in distinguishing noise or outliers from core clusters makes it
useful in anomaly detection tasks, such as detecting fraudulent activities in banking transactions or identifying
unusual patterns in network traffic.
• Customer Segmentation: In marketing and business analytics, DBSCAN can be used for customer segmentation
by identifying clusters of customers with similar buying behaviors or preferences.
• Environmental Studies: DBSCAN can be used in environmental monitoring, for example, to cluster areas based
on pollution levels or to identify regions with similar environmental characteristics.
• Traffic Analysis: In traffic and transportation studies, DBSCAN is useful for identifying hotspots of traffic
congestion or for clustering routes with similar traffic patterns.
• Machine Learning and Data Mining: More broadly, in the fields of machine learning and data mining,
DBSCAN is employed for exploratory data analysis, helping to uncover natural structures or patterns in data that
might not be apparent otherwise.
Learning with Clustering
5.1
Introduction to clustering with overview of distance metrics
Major clustering approaches
5.2
Graph Based Clustering: Clustering with minimal spanning tree
Model based Clustering: Expectation Maximization Algorithm
Density Based Clustering: DBSCAN
Graph Based Clustering: Clustering with MST
Clustering with minimal spanning tree
• Clustering using a Minimum Spanning Tree (MST) is an approach that involves
transforming a set of data points into a graph, where the edges between the points
represent some distance or dissimilarity metric.
• The goal is to connect all points in the data set with the minimal total edge weight,
without forming any cycles, which results in the Minimum Spanning Tree.
• Construct a Graph: Create a fully connected graph where each vertex represents a data point. The edges
between vertices are weighted based on the distance (e.g., Euclidean distance) between data points.
• Compute the Minimum Spanning Tree: Use an algorithm like Kruskal’s or Prim’s to construct the MST.
The MST is a subgraph that connects all the points with the minimum total edge weight and no cycles.
• Remove Long Edges: Once the MST is constructed, the idea is to remove the longest edges (those that
span different clusters). These edges typically represent points that are far apart and hence, belong to
different clusters. By removing a certain number of edges (typically the ones with the largest weights),
you create disconnected subgraphs, which represent different clusters.
• Resulting Clusters: After removing the longest edges, the remaining connected components in the graph
represent the clusters.
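A sketch of these steps with SciPy (the helper name mst_clusters and the cut rule of dropping exactly the k−1 heaviest MST edges are illustrative choices, not from the slides):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components
from scipy.spatial.distance import pdist, squareform

def mst_clusters(X, n_clusters):
    """Build the MST of the complete distance graph, cut its k-1 heaviest edges, return labels."""
    dists = squareform(pdist(X))                  # fully connected graph of pairwise distances
    mst = minimum_spanning_tree(dists).toarray()  # MST as a weighted adjacency matrix
    rows, cols = np.nonzero(mst)
    order = np.argsort(mst[rows, cols])           # MST edges sorted by weight (ascending)
    for i in order[-(n_clusters - 1):]:           # remove the longest edges
        mst[rows[i], cols[i]] = 0
    _, labels = connected_components(mst, directed=False)
    return labels

# Points A..G from the example below
X = np.array([[0, 0], [6, 0], [0, 8], [6, 8], [3, 4], [9, 4], [3, 0]])
print(mst_clusters(X, 3))  # three connected components -> three cluster labels
```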
Graph Based Clustering: Clustering with MST
Clustering with minimal spanning tree: Algorithm
1. Make a graph of the given data points.
2. Construct the minimum spanning tree (MST) of the graph.
3. Identify the longest edges in the MST.
4. Remove those edges from the MST; the remaining connected components are the clusters.
[Figure: example point layout A–G]
Graph Based Clustering: Clustering with MST
Clustering with minimal spanning tree: Example
Create 3 clusters for the following data points:
A (0,0), B (6,0), C (0,8), D (6,8), E (3,4), F (9,4), G (3,0)
[Figure: the seven points plotted in the plane]
Graph Based Clustering: Clustering with MST
Clustering with minimal spanning tree: Example
Step 1: Construct a Graph:
1. Create a fully connected graph where each vertex represents a data point.
2. The edges between vertices are weighted based on the distance (e.g., Euclidean distance) between data points.
Step 2: Compute the Minimum Spanning Tree (e.g., using Kruskal's or Prim's algorithm).
[Figure: the fully connected weighted graph and its MST]
Graph Based Clustering: Clustering with MST
Clustering with minimal spanning tree: Example
Step 3: Remove the edges to create clusters
• Remove the longest edges (or inconsistent edges) from the MST, one at a time.
• Removing the longest edge (weight 9 in the figure) splits the tree into 2 clusters.
• Removing the next longest edge (weight 8) splits it into 3 clusters, which is the required number.
[Figure: the MST with its two longest edges removed, leaving three connected components]
Learning with Clustering
5.1
Introduction to clustering with overview of distance metrics
Major clustering approaches
5.2
Graph Based Clustering: Clustering with minimal spanning tree
Model based Clustering: Expectation Maximization Algorithm
Density Based Clustering: DBSCAN
Case Study on EM Algorithm
H T T T H H H H T H
H H H H T H H H H H
H T T T H H H T H H
H T H T T T H H T T
T H H H T T H H T H