Probabilistic Clustering
Session No.: 30
Course Name: Machine Learning
Course Code: E1UA406C
Instructor Name:
Date of Conduction of Class: 05-06-2025
Review of the key concepts of previous session
Hierarchical Clustering produces a tree-like structure of clusters (called a
dendrogram).
In single linkage, the distance between two clusters is defined as the shortest
distance between any single point in one cluster and any point in the other cluster.
In complete linkage, the distance between two clusters is defined as the maximum
distance between any single point in one cluster and any point in the other cluster.
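For a quick refresher in code, here is a minimal sketch (the toy points and the use of SciPy are my own additions, not from the session) comparing single and complete linkage:

# Illustrative only: single vs. complete linkage on a toy 1-D dataset.
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[1.0], [1.5], [5.0], [5.5], [9.0]])  # toy points (assumed)

# Single linkage: cluster distance = shortest distance between any pair of points,
# one from each cluster.
Z_single = linkage(X, method="single")

# Complete linkage: cluster distance = largest distance between any pair of points,
# one from each cluster.
Z_complete = linkage(X, method="complete")

# Each row of a linkage matrix records one merge: [cluster_i, cluster_j, distance, size].
print(Z_single)
print(Z_complete)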
At the end of this session you will be able to understand:
▪ Basic Concepts of Probabilistic Clustering
▪ Gaussian Mixture Models and the Expectation-Maximization (EM) Algorithm
▪ Summary
Probabilistic Clustering is a type of clustering technique in
machine learning where each data point is assigned to one
or more clusters based on a probability distribution.
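As a tiny illustration (the numbers below are made up), a probabilistic (soft) assignment gives each data point a vector of cluster-membership probabilities that sums to 1, in contrast to a hard 0/1 assignment:

# Illustrative only: soft vs. hard assignment of one data point to two clusters.
soft_assignment = [0.7, 0.3]   # 70% cluster 1, 30% cluster 2; probabilities sum to 1
hard_assignment = [1, 0]       # hard clustering (e.g., K-means): exactly one cluster
assert abs(sum(soft_assignment) - 1.0) < 1e-9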
Expectation-Maximization (EM) Algorithm:
Gaussian Mixture Models (GMM)
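As a rough sketch of how EM fits a two-component GMM (the toy data, initialization, and iteration count are assumptions for illustration, not the session's worked example): the E-step computes each point's responsibility for each Gaussian component, and the M-step re-estimates the mixing weights, means, and covariances from those responsibilities.

# Minimal EM sketch for a 2-component Gaussian Mixture Model (illustrative only).
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
# Toy 2-D data: two loose blobs (assumed for illustration).
X = np.vstack([rng.normal([80, 85], 5, size=(10, 2)),
               rng.normal([40, 45], 5, size=(10, 2))])

K = 2
n, d = X.shape
weights = np.full(K, 1.0 / K)                # mixing proportions
means = X[rng.choice(n, K, replace=False)]   # initial means: two random data points
covs = np.array([np.cov(X.T) + np.eye(d) for _ in range(K)])

for _ in range(50):
    # E-step: responsibilities resp[i, k] = P(component k | point i)
    dens = np.column_stack([w * multivariate_normal(m, c).pdf(X)
                            for w, m, c in zip(weights, means, covs)])
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: re-estimate weights, means, and covariances from the responsibilities
    Nk = resp.sum(axis=0)
    weights = Nk / n
    means = (resp.T @ X) / Nk[:, None]
    covs = np.array([
        (resp[:, k, None] * (X - means[k])).T @ (X - means[k]) / Nk[k] + 1e-6 * np.eye(d)
        for k in range(K)
    ])

print(np.round(resp, 3))   # soft cluster memberships after the EM iterations

In practice, a library implementation such as scikit-learn's GaussianMixture performs these updates internally.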
Example Problem:
We have a small dataset of 2D points. We want to cluster them into two groups using a
Gaussian Mixture Model (GMM).
Imagine we have exam scores of 6 students in two subjects: Math and English.
Our goal is to group these students into 2 clusters based on their scores, but with
probabilistic assignments, not hard ones.
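The individual scores are not listed here, so the values below are a hypothetical stand-in consistent with the description (A and B high, C and D low, E and F in between); the sketch fits a two-component GMM with scikit-learn and prints each student's soft cluster probabilities:

# Hypothetical student scores (Math, English), assumed for illustration.
import numpy as np
from sklearn.mixture import GaussianMixture

students = ["A", "B", "C", "D", "E", "F"]
scores = np.array([
    [85, 88],   # A: high scorer (assumed)
    [90, 84],   # B: high scorer (assumed)
    [40, 45],   # C: low scorer (assumed)
    [35, 42],   # D: low scorer (assumed)
    [60, 65],   # E: middle (assumed)
    [63, 60],   # F: middle (assumed)
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(scores)
probs = gmm.predict_proba(scores)   # soft assignments: one probability per cluster

for name, p in zip(students, probs):
    print(f"{name}: cluster 1 = {p[0]:.2f}, cluster 2 = {p[1]:.2f}")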
Step 1: Plot the data
If you plot the data, you'd see:
▪ Students A and B are high scorers
▪ C and D are low scorers
▪ E and F are in the middle
So we might expect 2 clusters, but E and F could belong partly to both
clusters.
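A minimal plotting sketch (reusing the hypothetical scores from the previous snippet) that makes the two tight groups, and the in-between position of E and F, visible:

# Scatter plot of the hypothetical Math/English scores (illustrative values).
import matplotlib.pyplot as plt

students = ["A", "B", "C", "D", "E", "F"]
math_scores =    [85, 90, 40, 35, 60, 63]
english_scores = [88, 84, 45, 42, 65, 60]

plt.scatter(math_scores, english_scores)
for name, x, y in zip(students, math_scores, english_scores):
    plt.annotate(name, (x, y), textcoords="offset points", xytext=(5, 5))
plt.xlabel("Math score")
plt.ylabel("English score")
plt.title("Two likely clusters, with E and F in between")
plt.show()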
❑ Initial cluster assignments can also be chosen heuristically (e.g., first half of the data in one cluster, second half in another):
Cluster 1 → [1, 2, 3]
Cluster 2 → [8, 9, 10]
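One way to turn this heuristic into an initial guess for EM is to seed the component means from the first and second halves of the data; the sketch below does this for the values above and then lets scikit-learn's GMM refine them (using means_init is my choice of mechanism, not something prescribed in the session):

# Heuristic initialization: first half of the data seeds cluster 1, second half cluster 2.
import numpy as np
from sklearn.mixture import GaussianMixture

data = np.array([1, 2, 3, 8, 9, 10], dtype=float).reshape(-1, 1)

half = len(data) // 2
means_init = np.array([data[:half].mean(), data[half:].mean()]).reshape(-1, 1)
print(means_init.ravel())   # [2. 9.]  (initial means from the heuristic split)

# EM refines the heuristic starting point.
gmm = GaussianMixture(n_components=2, means_init=means_init, random_state=0).fit(data)
print(gmm.means_.ravel())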
Learning Activity 1:
Quiz on LMS
Next Session
Dimensionality Reduction
Thank You