Clustering Analysis
[Figure: two clusters of points p1–p5 with the question "Similarity?" and an empty proximity matrix — how should inter-cluster similarity be defined?]

! MIN
! MAX
! Group Average
! Distance Between Centroids
! Other methods driven by an objective function
– Ward's Method uses squared error
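These linkage definitions can be computed directly from a proximity matrix. A minimal Python sketch on a hypothetical 5-point distance matrix (the values and the helper names `d_min`, `d_max`, `d_avg` are illustrative, not from the slides):

```python
from itertools import product

# Hypothetical symmetric distance matrix over points p1..p5 (illustrative values).
prox = [
    [0.00, 0.24, 0.22, 0.37, 0.34],
    [0.24, 0.00, 0.15, 0.20, 0.14],
    [0.22, 0.15, 0.00, 0.15, 0.28],
    [0.37, 0.20, 0.15, 0.00, 0.29],
    [0.34, 0.14, 0.28, 0.29, 0.00],
]

def pairwise(ci, cj):
    """All cross-cluster distances between clusters ci and cj (point index lists)."""
    return [prox[a][b] for a, b in product(ci, cj)]

def d_min(ci, cj):   # single link: distance of the closest pair
    return min(pairwise(ci, cj))

def d_max(ci, cj):   # complete link: distance of the farthest pair
    return max(pairwise(ci, cj))

def d_avg(ci, cj):   # group average: mean over all cross-cluster pairs
    d = pairwise(ci, cj)
    return sum(d) / len(d)

# Example: cluster {p1, p2} vs cluster {p3, p4, p5} (0-based indices)
ci, cj = [0, 1], [2, 3, 4]
print(d_min(ci, cj), d_max(ci, cj), d_avg(ci, cj))
```

Distance between centroids would need the point coordinates themselves, which a proximity matrix alone does not provide.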
! Another way to view the processing of the hierarchical algorithm is that we create links between elements in order of increasing distance
" MIN (Single Link) merges two clusters as soon as a single pair of elements, one from each cluster, is linked
" MAX (Complete Linkage) merges two clusters only when all pairs of elements between them have been linked
Single-link (MIN) example on six points, with this distance matrix:

     1    2    3    4    5    6
1    0   .24  .22  .37  .34  .23
2   .24   0   .15  .20  .14  .25
3   .22  .15   0   .15  .28  .11
4   .37  .20  .15   0   .29  .22
5   .34  .14  .28  .29   0   .39
6   .23  .25  .11  .22  .39   0

[Figure: nested single-link clusters and the corresponding dendrogram; leaves ordered 3, 6, 4, 1, 2, 5]
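The link-by-link view above can be sketched as a naive agglomerative loop over the six-point distance matrix: passing `min` as the linkage gives single link, `max` gives complete linkage (the `agglomerate` helper is my own illustration, not from the slides):

```python
# Naive agglomerative clustering on the 6x6 distance matrix above.
# A sketch only: cubic overall, fine for tiny examples.
D = {  # distances keyed by point pair, points labeled 1..6
    (1, 2): .24, (1, 3): .22, (1, 4): .37, (1, 5): .34, (1, 6): .23,
    (2, 3): .15, (2, 4): .20, (2, 5): .14, (2, 6): .25,
    (3, 4): .15, (3, 5): .28, (3, 6): .11,
    (4, 5): .29, (4, 6): .22,
    (5, 6): .39,
}

def dist(a, b):
    return D[(a, b)] if (a, b) in D else D[(b, a)]

def agglomerate(linkage):
    """Repeatedly merge the closest pair of clusters under the chosen linkage."""
    clusters = [{i} for i in range(1, 7)]
    merges = []
    while len(clusters) > 1:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(dist(a, b)
                                   for a in clusters[ij[0]] for b in clusters[ij[1]]),
        )
        merges.append((sorted(clusters[i]), sorted(clusters[j])))
        clusters[i] |= clusters[j]
        del clusters[j]
    return merges

print(agglomerate(min)[:2])  # single link: {3,6} merge at .11, then {2,5} at .14
```

The first merge is the same for both linkages (all clusters are still singletons), but later merges diverge; note that ties (e.g. the two .15 entries) may be broken differently than in the slides' dendrogram.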
[Figure: original points and the resulting two clusters]
! Proximity of two clusters is the average of pairwise proximity between points in the two clusters:

proximity(Cluster_i, Cluster_j) = \frac{\sum_{p_i \in Cluster_i,\ p_j \in Cluster_j} proximity(p_i, p_j)}{|Cluster_i| \times |Cluster_j|}
[Figure: group-average clustering of the six points and its dendrogram; leaves ordered 3, 6, 4, 1, 2, 5]
! Compromise between Single and Complete Link
! Strengths
" Less susceptible to noise and outliers
! Limitations
" Biased towards globular clusters
! Similarity of two clusters is based on the increase in squared error (SSE) when the two clusters are merged
" Similar to group average if the distance between points is the squared distance
! Less susceptible to noise and outliers
! Biased towards globular clusters
! Hierarchical analogue of K-means
" Can be used to initialize K-means
[Figure: hierarchical clusterings of the same six points compared under MIN, MAX, Group Average, and Ward's Method]
! O(N²) space, since the algorithm stores the proximity matrix
" N is the number of points