Clustering Machine Learning Algorithms (2)
Outline
01 ML Categories 03 K-Means
(1) Supervised: Classification & Regression Problems
● Used to train machines using labeled data
● Takes labeled inputs and maps them to known outputs (you already know the target variable)
Machine Learning Categories
(2) Unsupervised: Clustering & Association Problems
● Uses unlabeled data to discover
patterns and features in the data
● Understands patterns and trends in
the data and discovers the output
Machine Learning Categories
(3) Reinforcement: Reward-Based Problems
● Uses an agent and an environment
to produce actions and rewards
● Follows a trial-and-error method to arrive at the final solution
● Agent receives a reward after finishing a task
Clustering
Clusters
01 Partitional
02 Hierarchical
Partitional Clustering
Database: ‘n’ objects of data, divided into ‘k’ partitions
Satisfying:
- Each group contains at least one object
- Each object belongs to exactly one cluster
Process:
- Create an initial partitioning
- Use an iterative relocation technique to improve the partitioning
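The two "Satisfying" conditions can be checked mechanically. The sketch below is only an illustration: the function name `is_valid_partition` and the choice to encode a partition as a list of cluster indices are assumptions, not something the slides prescribe.

```python
def is_valid_partition(labels, k):
    """Return True if `labels` describes a valid partition into k clusters.

    labels[i] is the cluster index (0..k-1) of object i. Storing one index
    per object already guarantees "each object belongs to exactly one
    cluster", so we only check that indices are in range and that no
    cluster is empty.
    """
    if any(label not in range(k) for label in labels):
        return False
    return len(set(labels)) == k


print(is_valid_partition([0, 1, 1, 2, 0], k=3))  # True
print(is_valid_partition([0, 0, 1, 1, 0], k=3))  # False: cluster 2 is empty
```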
K-Means
Stop Condition
- Define a maximum number of iterations
- Inertia doesn’t decrease or only decreases insignificantly
(Inertia is the sum of squared distances from each point to the centroid of its assigned cluster. It never increases from one iteration to the next, so a lower inertia means more compact clusters)
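Both stop conditions can be seen in a minimal K-Means sketch. This is a plain-Python illustration, not a reference implementation: the names `k_means`, `max_iter`, `tol`, and `seed`, the random choice of initial centroids, and the sample points are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100, tol=1e-6, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial partitioning: k random points
    previous_inertia = float("inf")
    for iteration in range(1, max_iter + 1):  # stop condition 1: iteration cap
        # Relocation step: assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        inertia = sum(
            squared_distance(p, centroids[label]) for p, label in zip(points, labels)
        )
        # Stop condition 2: inertia no longer decreases by more than tol.
        if previous_inertia - inertia <= tol:
            break
        previous_inertia = inertia
    return labels, centroids, inertia, iteration


# Hypothetical sample data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids, inertia, iterations = k_means(points, k=2)
print(labels, iterations)
```

On data this well separated the loop ends through the inertia check after a few iterations; `max_iter` only matters as a safety net.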
K-Means
Advantages
- Fast
- Can serve as a data reduction technique
K-Means
Disadvantages
- Tends to identify clusters of the same size and volume (spherical shapes)
- Unable to identify elongated or non-convex clusters
K-Means
Practical Use
Text Mining
Predictive Marketing
Clustering Methods
01 Partitional
02 Hierarchical
Hierarchical Clustering
Database: ‘n’ objects, organized into a dendrogram
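Hierarchical Clustering
- Top-Down (divisive): start with all objects in one cluster and split repeatedly
- Bottom-Up (agglomerative): start with each object in its own cluster and merge repeatedly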
     A    B    C    D    E
A    0
B    1    0
C    2    2    0
D    2    5    3    0
E    3    4    6    6    0
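A bottom-up pass starts by finding the closest pair in this matrix. The sketch below assumes one possible encoding (a dictionary keyed by unordered label pairs); the variable names are illustrative only.

```python
labels = ["A", "B", "C", "D", "E"]

# Lower triangle of the distance matrix above, row by row.
lower = [
    [0],
    [1, 0],
    [2, 2, 0],
    [2, 5, 3, 0],
    [3, 4, 6, 6, 0],
]

# Symmetric lookup keyed by an unordered pair of labels.
dist = {
    frozenset((labels[i], labels[j])): lower[i][j]
    for i in range(len(labels))
    for j in range(i)
}

closest = min(dist, key=dist.get)
print(sorted(closest), dist[closest])  # ['A', 'B'] 1
```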
Hierarchical Clustering
Linkage Methods
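A and B are merged first because their distance (1) is the smallest. The distances from the new cluster (A,B) to C, D, and E (marked ?) depend on the linkage method.
      A,B  C    D    E
A,B   0
C     ?    0
D     ?    3    0
E     ?    6    6    0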
Hierarchical Clustering
Linkage Methods
Single Link (minimum pairwise distance)
      A,B  C    D    E
A,B   0
C     2    0
D     2    3    0
E     3    6    6    0
Hierarchical Clustering
Linkage Methods
Complete Link (maximum pairwise distance)
      A,B  C    D    E
A,B   0
C     2    0
D     5    3    0
E     4    6    6    0
Hierarchical Clustering
Linkage Methods
Average Link (mean pairwise distance)
      A,B  C    D    E
A,B   0
C     2    0
D     3.5  3    0
E     3.5  6    6    0
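The three updated columns for cluster (A,B) can be reproduced from the original pairwise distances. This is a small sketch; the helper names `d` and `linkage_distance` are made up for the illustration.

```python
# Pairwise distances from the original A..E matrix.
D = {
    ("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 2, ("A", "D"): 2, ("B", "D"): 5,
    ("C", "D"): 3, ("A", "E"): 3, ("B", "E"): 4, ("C", "E"): 6, ("D", "E"): 6,
}


def d(x, y):
    """Distance between two single objects, in either order."""
    return D[(x, y)] if (x, y) in D else D[(y, x)]


def linkage_distance(cluster_a, cluster_b, method):
    """Distance between two clusters under the given linkage method."""
    pairwise = [d(x, y) for x in cluster_a for y in cluster_b]
    if method == "single":
        return min(pairwise)
    if method == "complete":
        return max(pairwise)
    if method == "average":
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


for method in ("single", "complete", "average"):
    row = {other: linkage_distance(("A", "B"), (other,), method) for other in "CDE"}
    print(method, row)
# single {'C': 2, 'D': 2, 'E': 3}
# complete {'C': 2, 'D': 5, 'E': 4}
# average {'C': 2.0, 'D': 3.5, 'E': 3.5}
```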
Hierarchical Clustering
Linkage Methods
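Average Link: (A,B) and C are merged next because their distance (2) is now the smallest. The distances from the new cluster to D and E (marked ?) must be recomputed.
          (A,B),C  D    E
(A,B),C   0
D         ?        0
E         ?        6    0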
Hierarchical Clustering
Linkage Methods
Average Link
          (A,B),C  D    E
(A,B),C   0
D         3.33     0
E         4.33     6    0
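The values 3.33 and 4.33 are the means of the original distances from A, B, and C to D and to E: (2 + 5 + 3) / 3 and (3 + 4 + 6) / 3. The sketch below runs the whole bottom-up procedure with this unweighted average linkage; the function names are illustrative, and the final merge height of 4.75 is computed by the code rather than taken from the slides.

```python
from itertools import combinations

# Pairwise distances from the original A..E matrix.
D = {
    ("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 2, ("A", "D"): 2, ("B", "D"): 5,
    ("C", "D"): 3, ("A", "E"): 3, ("B", "E"): 4, ("C", "E"): 6, ("D", "E"): 6,
}


def d(x, y):
    return D[(x, y)] if (x, y) in D else D[(y, x)]


def average_link(a, b):
    """Mean of all pairwise distances between clusters a and b."""
    return sum(d(x, y) for x in a for y in b) / (len(a) * len(b))


def agglomerate(items):
    """Merge the two closest clusters until one remains; return the merges."""
    clusters = [(item,) for item in items]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_link(*pair))
        merged = tuple(sorted(a + b))
        merges.append((merged, round(average_link(a, b), 2)))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return merges


for members, height in agglomerate("ABCDE"):
    print(members, height)
# ('A', 'B') 1.0
# ('A', 'B', 'C') 2.0
# ('A', 'B', 'C', 'D') 3.33
# ('A', 'B', 'C', 'D', 'E') 4.75
```

The list of merges and their heights is exactly the information a dendrogram draws.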
Hierarchical Clustering
Linkage Methods
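Centroid criterion (distance between the cluster centroids)
Ward’s criterion (merge the pair that least increases the within-cluster sum of squares)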
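Unlike the linkage methods above, these two criteria need point coordinates, not just a distance matrix. The sketch below uses hypothetical 2-D points (not from the slides) and the standard closed form for Ward's increase, |A||B| / (|A| + |B|) times the squared distance between centroids; the function names are assumptions for the illustration.

```python
import math


def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(dim) / len(cluster) for dim in zip(*cluster))


def centroid_distance(a, b):
    """Centroid criterion: Euclidean distance between the two centroids."""
    return math.dist(centroid(a), centroid(b))


def ward_increase(a, b):
    """Ward's criterion: growth in within-cluster sum of squares if a and b merge."""
    n_a, n_b = len(a), len(b)
    return n_a * n_b / (n_a + n_b) * centroid_distance(a, b) ** 2


# Hypothetical clusters chosen so the centroids are easy to read off.
left = [(0.0, 0.0), (0.0, 2.0)]  # centroid (0, 1)
right = [(4.0, 1.0)]             # centroid (4, 1)

print(centroid_distance(left, right))  # 4.0
print(ward_increase(left, right))      # about 10.67, i.e. (2 * 1 / 3) * 16
```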
Hierarchical Clustering
Advantages Disadvantages