
Unit 4 – Clustering

Predictive Analytics in Healthcare

Miguel Rodrigo
Department of Electronic Engineering, School of Engineering
Universitat de València, Avgda. Universitat s/n
46100 Burjassot (Valencia)
[email protected]

1
Clustering: definition

Unsupervised learning technique that groups samples (patients) based on the similarity of their characteristics (features). The label is never used in this analysis.
Data are represented in an N-dimensional space, where N is the number of features, and clusters (groups) of patients are identified as contiguous regions of the data space with a relatively high density of points, separated from other dense regions by areas where the density of points is relatively low.
How many clusters/groups do we have? Two? Maybe four?

2
Proximity measures

Similarity measures (correlation):
• Inner product, cosine measure, Tanimoto's measure

Dissimilarity measures (distances):
• Euclidean, Mahalanobis, Bhattacharyya

Caution: before using clustering algorithms, features should be normalized. Otherwise, distances are dominated by the features with the widest ranges (e.g. age vs. number of siblings).
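As an illustration (not from the slides), a minimal Python sketch of z-score normalization before computing Euclidean distances; the toy feature names (age, siblings) and values are assumptions:

```python
# Minimal sketch: z-score normalization before computing Euclidean distances,
# so wide-range features (age) do not dominate narrow-range ones (siblings).
import numpy as np
from scipy.spatial.distance import pdist, squareform

X = np.array([[72.0, 1.0], [35.0, 3.0], [68.0, 0.0]])  # toy data: [age, siblings]
Xz = (X - X.mean(axis=0)) / X.std(axis=0)               # z-score per feature

D_raw = squareform(pdist(X, metric="euclidean"))    # dominated by the age feature
D_norm = squareform(pdist(Xz, metric="euclidean"))  # both features contribute comparably
print(D_raw)
print(D_norm)
```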

3
K-means

The algorithm tries to minimize the distance between each pattern and the cluster it belongs to. The first step is therefore to find the cluster closest to the i-th pattern (the j-th cluster), and the second step then makes the intra-cluster distances as low as possible. It is an iterative process (randomly initialized in most cases), based on the minimization of the following cost function:
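The cost function on the original slide is an image that did not survive extraction; the standard K-means objective it refers to is:

J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2

where K is the number of clusters, C_j is the j-th cluster and \mu_j its centroid.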

4
Fuzzy C-Means
Based on fuzzy logic, in which a given measurement may have a degree of membership in several categories. Applied to clustering, any pattern may belong to different clusters with a fuzzy membership, thus introducing overlapping, which is a common phenomenon, in a natural way and with a robust mathematical background.
Similar to K-means, but instead of using only the distance from a given pattern to its closest cluster, all distances are taken into account, each weighted by a fuzzy membership (with fuzzifier m > 1).
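For reference (the formula on the slide is an image), the standard Fuzzy C-means objective is:

J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \lVert x_i - c_j \rVert^2, \qquad \sum_{j=1}^{C} u_{ij} = 1, \quad m > 1

where u_{ij} is the fuzzy membership of pattern x_i in cluster j and c_j is the cluster centre.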
Figure: K-means vs. Fuzzy C-means cluster assignments.

5
K-means and Fuzzy C-Means

Despite K-means being the most widely used clustering algorithm, it still has some drawbacks:
• The number of clusters must be known in advance (partially solved by ISODATA).
• Clusters tend to have hyper-spherical shapes, thus mixing or breaking up natural clusters with non-spherical shapes.
• All clusters are formed by a similar number of patterns, again mixing or breaking up natural clusters.

6
Spectral clustering

These methods divide a set of graphs, or the nodes of a graph, into different clusters. They are based on the eigenvalues (spectrum) of the Laplacian of the proximity (dissimilarity) matrix, which contains the distance values between each pair of points.

The stages of the algorithm are:
1. Construct a nearest-neighbours graph or radius-based graph.
2. Embed the data points in a low-dimensional space (spectral embedding), in which the clusters are more obvious, using the eigenvectors of the graph Laplacian.
3. Cluster the points in this embedding, using the eigenvectors associated with the smallest eigenvalues.
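A minimal scikit-learn sketch of these three stages (the two-moons data set and all parameter values are illustrative assumptions):

```python
# Spectral clustering on a nearest-neighbours affinity graph.
from sklearn.datasets import make_moons
from sklearn.cluster import SpectralClustering

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)  # non-convex clusters
model = SpectralClustering(
    n_clusters=2,
    affinity="nearest_neighbors",  # stage 1: build the neighbours graph
    n_neighbors=10,
    assign_labels="kmeans",        # stage 3: cluster in the spectral embedding
    random_state=0,
)
labels = model.fit_predict(X)      # stage 2 (the embedding) happens inside fit_predict
```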

7
Spectral clustering

It can deal with clusters that are not compact or do not lie within convex boundaries, where most classical clustering algorithms fail.
(ISODATA, mentioned earlier, is a variant of K-means rather than of spectral clustering.)

Figure: K-means vs. spectral clustering on the same data.

8
Hierarchical clustering

Clustering method based on an iterative process, which may be agglomerative or divisive, in which the different clusters are created using a hierarchy based on proximity/dissimilarity measures. Depending on the level of the hierarchy that is selected, a different number of clusters turns up. Its main advantage and its main drawback are, though it seems paradoxical, the same: selecting the correct hierarchy level is a challenge, but the hierarchical visualization (dendrogram) is useful in itself.

Figure: dendrogram, from a single large cluster at the top down to the smallest, separated clusters.

9
Hierarchical clustering: Agglomerative approach

Figure: example points A–G merged step by step into larger clusters (agglomerative approach).
10
Hierarchical clustering: Agglomerative approach

Initial clustering: each pattern is a cluster.

For each iteration:
1. Calculate the distances d(Cr, Cs) between all pairs of clusters produced in the last iteration. The pair of clusters with the shortest distance, (Ci, Cj), is selected to be joined into a new cluster.
2. The new clustering is the same as in the previous iteration, except that clusters Ci and Cj are replaced by a single new cluster Cq, which appears in the new clustering.

The algorithm stops when there is only one cluster for the whole data set. Sometimes the whole hierarchy is not obtained, but just a selection that seems reasonable for the problem at hand.
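A minimal sketch of this loop using SciPy (an assumption, not part of the slides); `linkage` performs the iterative merging and `fcluster` cuts the hierarchy at a chosen number of clusters:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(5, 1, (20, 2))])  # two toy groups

Z = linkage(X, method="single")                   # merge the closest pair at each iteration
labels = fcluster(Z, t=2, criterion="maxclust")   # keep the 2-cluster level of the hierarchy
dendrogram(Z)                                     # full hierarchy (needs matplotlib to draw)
```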

11
Hierarchical clustering: Agglomerative approach

Not all distances must be recalculated at each iteration, only those between the new cluster Cq and the remaining clusters; algorithms based on the Lance and Williams formula (given below) are normally used:
• Single-link algorithm
• Complete-link algorithm
• Unweighted average algorithm
• Weighted average algorithm
• Unweighted centroid algorithm
• Weighted centroid algorithm
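The update formulas for these variants were shown as images on the slide; the general Lance and Williams recurrence they instantiate is, when C_i and C_j are merged into C_q:

d(C_q, C_s) = \alpha_i\, d(C_i, C_s) + \alpha_j\, d(C_j, C_s) + \beta\, d(C_i, C_j) + \gamma\, \lvert d(C_i, C_s) - d(C_j, C_s) \rvert

For example, \alpha_i = \alpha_j = 1/2, \beta = 0, \gamma = -1/2 gives the single-link rule d(C_q, C_s) = \min\{d(C_i, C_s), d(C_j, C_s)\}, and \gamma = +1/2 gives the complete-link (maximum) rule.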

12
Cluster validity
What is a suitable number of clusters for a given distribution of samples (patients)? Usually there is no unique and definite answer. Some indices that can help find the correct number of clusters are based on compactness and isolation:
• Dunn index, where M is the number of clusters (formula below).
• Silhouette coefficient: score = (b - a) / max(a, b), where a is the average intra-cluster distance of a sample and b is its average distance to the samples of the nearest neighbouring cluster.
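The Dunn index formula on the slide is an image; its standard definition is:

D = \frac{\min_{1 \le i < j \le M} d(C_i, C_j)}{\max_{1 \le k \le M} \operatorname{diam}(C_k)}

i.e. the smallest inter-cluster distance divided by the largest cluster diameter, so higher values indicate compact, well-isolated clusters.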

13
Cluster validity

• SSE (Sum of Squared Errors) or inertia plot: the average SSE between samples and their cluster centres, computed for an increasing number of clusters. The elbow point is the point after which the SSE or inertia starts decreasing in a roughly linear fashion (see the sketch after this list).
• Hopkins statistic: used to assess the clustering tendency of a data set by measuring the probability that the data set was generated by a uniform distribution. In other words, it tests the spatial randomness of the data.
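A minimal scikit-learn sketch of the inertia/elbow plot (the data set and parameter values are illustrative assumptions):

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)
ks = range(1, 11)
sse = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks]

plt.plot(ks, sse, marker="o")          # the bend ("elbow") suggests a suitable K
plt.xlabel("number of clusters K")
plt.ylabel("SSE / inertia")
plt.show()
```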

14
Self Organizing Maps (SOM)

Unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher-dimensional data set while preserving the topological structure of the data.
SOMs are built using shallow neural networks trained with competitive learning, in which several nodes compete for the right to respond to a subset of the input data.

Figure: the neurons, shown as the points of the black net, are trained to cover the data distribution (blue).
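A minimal sketch using the third-party MiniSom package (an assumption; any SOM implementation would do), training a 10x10 map on standardized toy data:

```python
import numpy as np
from minisom import MiniSom  # assumed third-party dependency

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5))                       # 200 samples, 5 features
data = (data - data.mean(axis=0)) / data.std(axis=0)   # normalize the features

som = MiniSom(10, 10, input_len=5, sigma=1.0, learning_rate=0.5, random_seed=0)
som.random_weights_init(data)
som.train_random(data, num_iteration=1000)             # competitive learning

row, col = som.winner(data[0])                         # 2D map position of the first sample
```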

15
Once trained, SOM maps display the information on a uniform 2D map. These maps show the same 2D layout for all the features used in the clustering, and each sample (patient) always occupies the same position on the 2D map.

Figure: average feature value per neuron (in colours) and number of samples (patients) per neuron.

16
Other clustering methods: Density-based

Data points lying in the low-density regions that separate two clusters are considered noise. The surroundings of a given object within a radius ε are known as the ε-neighbourhood of the object. If the ε-neighbourhood of the object contains at least a minimum number of objects, it is called a core object.

• DBSCAN (Density-Based Spatial Clustering of Applications with Noise): relies on this density-based notion of cluster and identifies clusters of arbitrary shape in a spatial database containing outliers.
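A minimal scikit-learn sketch (the eps and min_samples values are illustrative assumptions); samples labelled -1 are treated as noise:

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.08, random_state=0)
labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)        # eps-neighbourhood, core objects
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)    # -1 marks noise points
```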

17
Other clustering methods: Density-based

• OPTICS (Ordering Points To Identify the Clustering Structure): addresses the problem of varying data density, which DBSCAN cannot handle.
• DENCLUE (DENsity-based CLUstEring): uses grid cells, but only keeps information about the grid cells that actually contain data points and manages these cells in a tree-based access structure. It is good for data sets with a large amount of noise.

Figure: reachability plot (a special kind of dendrogram).

18
Other clustering methods: Manifold learning

Class of unsupervised estimators that seeks to describe data sets as low-dimensional manifolds embedded in high-dimensional spaces (e.g. a curled-up piece of paper).

• Multidimensional scaling (MDS): a set of related ordination techniques used in information visualization, in particular to display the information contained in a distance matrix. It is a form of non-linear dimensionality reduction.
• Locally linear embedding (LLE): reduces the number of dimensions while trying to preserve the geometric features of the original non-linear feature structure.
• Isometric mapping (Isomap): combines MDS with geodesic distances to reduce the dimensionality of data sampled from a smooth manifold.
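A minimal scikit-learn sketch of the three methods listed above, each reducing a toy 3D manifold to 2 dimensions (the data set and parameter values are illustrative assumptions):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import MDS, LocallyLinearEmbedding, Isomap

X, _ = make_swiss_roll(n_samples=500, random_state=0)          # 3D "rolled-up sheet"
X_mds = MDS(n_components=2, random_state=0).fit_transform(X)
X_lle = LocallyLinearEmbedding(n_components=2, n_neighbors=10).fit_transform(X)
X_iso = Isomap(n_components=2, n_neighbors=10).fit_transform(X)
```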

19
Clustering with categorical features

In many clinical problems, many of the input features for clustering are categorical variables. This limits the clustering performance, as points (patients) are not well distributed spatially.

Tips:
• Remember to apply some kind of data normalization when comparing categorical (Boolean) with continuous features (e.g. age vs. sex).
• Consider running the clustering only on the continuous features if possible.
• Some clustering methods / metrics deal better with binary features: hierarchical clustering, DBSCAN, etc.

20
Clustering methods for supervised learning

Unsupervised learning does not allow training models for classification, regression or prediction, as clustering algorithms are not designed to 'fit' a desired response.
However, clustering techniques are very useful in supervised learning problems, as they can be used for several tasks:
• Finding relevant subpopulations in our data: inherent clinical groups that are relevant for our problem, subpopulations with different responses to the label, etc. Training different classification/regression models for each subpopulation can be beneficial.
• Helping in feature selection: the features that best allow patient clustering may be better for the problem. This can be checked through clustering metrics (silhouette, etc.) or by directly comparing the cluster distribution with the label (contingency tables, etc.; see the sketch below).
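A minimal sketch of the contingency-table check (the breast-cancer data set and 2-cluster K-means are illustrative assumptions):

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(pd.crosstab(clusters, y, rownames=["cluster"], colnames=["label"]))  # cluster ID vs. label
```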

21
