CS583 Unsupervised Learning
Road map
Basic concepts
K-means algorithm
Representation of clusters
Hierarchical clustering
Distance functions
Data standardization
Handling mixed attributes
Which clustering algorithm to use?
Cluster evaluation
Discovering holes and data regions
Summary
Clustering: an illustration
Example: segment customers according to their similarities, in order to do targeted marketing.
Aspects of clustering
A clustering algorithm, e.g.:
  Partitional clustering
  Hierarchical clustering
K-means clustering

K-means algorithm
Given k, the k-means algorithm works as follows:
1) Randomly choose k data points (seeds) to be the initial centroids (cluster centers).
2) Assign each data point to the closest centroid.
3) Re-compute the centroids using the current cluster memberships.
4) If a convergence criterion is not met, go to step 2.
Stopping/convergence criterion
1. no (or minimum) re-assignments of data points to different clusters,
2. no (or minimum) change of centroids, or
3. minimum decrease in the sum of squared error (SSE),

    SSE = \sum_{j=1}^{k} \sum_{x \in C_j} dist(x, m_j)^2        (1)

where C_j is the j-th cluster, m_j is the centroid of cluster C_j (the mean vector of all the data points in C_j), and dist(x, m_j) is the distance between data point x and centroid m_j.
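As a concrete sketch of the four steps above with the SSE-based stopping criterion of equation (1), here is a minimal NumPy implementation; the function and parameter names (kmeans, max_iter, tol) are illustrative choices, not from the slides.

```python
import numpy as np

def kmeans(X, k, max_iter=100, tol=1e-4, seed=0):
    """Minimal k-means; X is an (n, d) array of numeric data points."""
    rng = np.random.default_rng(seed)
    # 1) randomly choose k data points (seeds) as the initial centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    prev_sse = np.inf
    for _ in range(max_iter):
        # 2) assign each data point to the closest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3) re-compute each centroid as the mean of the points assigned to it
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
        # 4) stop when the decrease in SSE (equation (1)) falls below a tolerance
        sse = ((X - centroids[labels]) ** 2).sum()
        if prev_sse - sse < tol:
            break
        prev_sse = sse
    return labels, centroids, sse
```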
An example
An example (cont.)
A disk-based version of k-means (scanning the data once per iteration) is possible, but it is not the best method for large data sets; there are other scale-up algorithms, e.g., BIRCH.
Strengths of k-means
Strengths: simple (easy to understand and to implement) and efficient (time complexity O(tkn), where n is the number of data points, k the number of clusters, and t the number of iterations; since k and t are usually small, k-means is essentially linear in n).
Weaknesses of k-means
The algorithm is sensitive to outliers. Outliers are data points that are very far away from other data points; they could be errors in the data recording or special data points with very different values.
Weaknesses of k-means: problems with outliers

K-means summary
Despite its weaknesses, k-means is still the most popular clustering algorithm owing to its simplicity and efficiency, and no other clustering algorithm is clearly better in general.
Representation of clusters
Common ways to represent the resulting clusters include using the centroid of each cluster, using a classification model that treats the clusters as classes, or using the frequent values of each cluster (for categorical data).
Hierarchical Clustering
Types of hierarchical clustering
Agglomerative (bottom-up) clustering: builds the dendrogram (tree) from the bottom level, merges the most similar (nearest) pair of clusters at each step, and stops when all the data points have been merged into a single cluster (the root).
Divisive (top-down) clustering: starts with all data points in one cluster, the root. It splits the root into a set of child clusters; each child cluster is recursively divided further, and the process stops when only singleton clusters of individual data points remain, i.e., each cluster contains only a single point.
Agglomerative clustering
It is more popular than divisive methods.
At the beginning, each data point forms a cluster (also called a node).
Merge the nodes/clusters that have the least distance between them.
Go on merging.
Eventually all nodes belong to one cluster.

Agglomerative clustering algorithm
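As a concrete illustration of the merge loop just described, here is a small plain-Python sketch using single-link distance between clusters; it is quadratic and unoptimized, and all names are illustrative.

```python
import numpy as np

def agglomerative_single_link(X, k):
    """Merge the two closest clusters (single link) until k clusters remain."""
    # at the beginning, each data point forms its own cluster (node)
    clusters = [[i] for i in range(len(X))]
    # pre-compute all pairwise point distances once
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    while len(clusters) > k:
        # find the pair of clusters with the least (single-link) distance
        best, best_d = (0, 1), np.inf
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].min()
                if d < best_d:
                    best_d, best = d, (a, b)
        a, b = best
        # merge the two closest clusters and go on merging
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```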
Measuring the distance of two clusters
Different ways of measuring the distance between two clusters give different variations of the algorithm:
Single link: the distance between the two closest data points, one from each cluster.
Complete link: the distance between the two farthest data points, one from each cluster.
Average link: the average distance of all pairs of points, one from each cluster; a compromise between the outlier sensitivity of complete link and the chaining tendency of single link.
Centroids: the distance between the centroids of the two clusters.
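If SciPy is available, these linkage variants can be compared directly with scipy.cluster.hierarchy; the snippet below is only a usage sketch on made-up data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.default_rng(1).normal(size=(30, 2))      # toy numeric data

# build the dendrogram with different cluster-distance measures
for method in ["single", "complete", "average", "centroid"]:
    Z = linkage(X, method=method)                      # merge history of the dendrogram
    labels = fcluster(Z, t=3, criterion="maxclust")    # cut the tree into 3 clusters
    print(method, np.bincount(labels)[1:])             # resulting cluster sizes
```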
The complexity
All hierarchical clustering algorithms are at least O(n^2), where n is the number of data points, because the distances between all pairs of points are needed. This makes them impractical for very large data sets; possible remedies are:
Sampling
Scale-up methods (e.g., BIRCH).
Distance functions
Distance (similarity, or dissimilarity) functions are key to clustering, and there are numerous distance functions for different types of data:
Numeric data
Nominal data

Distance functions for numeric attributes
The most commonly used functions are the Euclidean distance and the Manhattan (city block) distance, both special cases of the Minkowski distance. For two data points x_i and x_j with r numeric attributes and a positive integer h,

    dist(x_i, x_j) = ( |x_{i1} - x_{j1}|^h + |x_{i2} - x_{j2}|^h + ... + |x_{ir} - x_{jr}|^h )^{1/h}

where h = 2 gives the Euclidean distance and h = 1 gives the Manhattan distance.
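A short sketch of the Minkowski family; h = 1 and h = 2 recover the Manhattan and Euclidean distances. The function name and the sample vectors are illustrative.

```python
import numpy as np

def minkowski(xi, xj, h=2):
    """Minkowski distance between two numeric vectors; h=1 Manhattan, h=2 Euclidean."""
    return float((np.abs(np.asarray(xi) - np.asarray(xj)) ** h).sum() ** (1.0 / h))

xi, xj = [1.0, 2.0, 3.0], [4.0, 0.0, 3.0]
print(minkowski(xi, xj, h=1))   # Manhattan distance: 5.0
print(minkowski(xi, xj, h=2))   # Euclidean distance: ~3.61
```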
Distance functions for binary and nominal attributes
Binary attributes have only two states or values, e.g., true and false, or yes and no. Their distance functions are defined using a confusion matrix: for two data points x_i and x_j with only binary attributes, let
a = the number of attributes that equal 1 for both x_i and x_j,
b = the number of attributes that equal 1 for x_i and 0 for x_j,
c = the number of attributes that equal 0 for x_i and 1 for x_j,
d = the number of attributes that equal 0 for both x_i and x_j.

Symmetric binary attributes: both states are equally important. The most commonly used distance is the simple matching distance, the proportion of mismatched attributes:

    dist(x_i, x_j) = (b + c) / (a + b + c + d)
Asymmetric binary attributes: one state (by convention coded as 1) is more important, usually rarer, than the other. The most commonly used distance is the Jaccard distance, which ignores the d attributes on which both points take the unimportant value 0:

    dist(x_i, x_j) = (b + c) / (a + b + c)
Nominal attributes
Nominal attributes can take more than two states or values, e.g., color: red, green, blue. The commonly used distance is again based on simple matching: let r be the total number of attributes and q the number of attributes whose values match for x_i and x_j; then

    dist(x_i, x_j) = (r - q) / r
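The three matching-based distances above translate directly into code. This sketch assumes 0/1 vectors for the binary cases and equal-length value tuples for the nominal case; the function names are illustrative.

```python
def binary_counts(xi, xj):
    """Confusion-matrix counts a, b, c, d for two 0/1 vectors."""
    a = sum(p == 1 and q == 1 for p, q in zip(xi, xj))
    b = sum(p == 1 and q == 0 for p, q in zip(xi, xj))
    c = sum(p == 0 and q == 1 for p, q in zip(xi, xj))
    d = sum(p == 0 and q == 0 for p, q in zip(xi, xj))
    return a, b, c, d

def simple_matching_dist(xi, xj):      # symmetric binary attributes
    a, b, c, d = binary_counts(xi, xj)
    return (b + c) / (a + b + c + d)

def jaccard_dist(xi, xj):              # asymmetric binary attributes
    a, b, c, d = binary_counts(xi, xj)
    return (b + c) / (a + b + c)

def nominal_dist(xi, xj):              # nominal attributes: (r - q) / r
    r = len(xi)
    q = sum(u == v for u, v in zip(xi, xj))
    return (r - q) / r

print(simple_matching_dist([1, 1, 0, 0], [1, 0, 0, 1]))  # (1 + 1) / 4 = 0.5
print(jaccard_dist([1, 1, 0, 0], [1, 0, 0, 1]))          # (1 + 1) / 3 = 0.67
print(nominal_dist(("red", "small"), ("red", "large")))  # (2 - 1) / 2 = 0.5
```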
Data standardization
In Euclidean space, standardization of attributes is recommended so that all attributes can have equal impact on the distance computation; otherwise attributes with large value ranges dominate.
Interval-scaled attributes
These are numeric (continuous) attributes whose values follow a linear scale, e.g., age, height, weight. One standardization method is range normalization, which scales each value into [0, 1]:

    range(x_{if}) = (x_{if} - min(f)) / (max(f) - min(f))

where x_{if} is the value of attribute f for data point x_i, and min(f) and max(f) are the minimum and maximum values of attribute f in the data.
Interval-scaled attributes (cont.)
Z-score: transforms the attribute values so that they have a mean of zero and a mean absolute deviation of 1. The mean absolute deviation of attribute f is

    s_f = (1/n) ( |x_{1f} - m_f| + |x_{2f} - m_f| + ... + |x_{nf} - m_f| ),

where the mean of attribute f is

    m_f = (1/n) ( x_{1f} + x_{2f} + ... + x_{nf} ).

The z-score is then

    z(x_{if}) = (x_{if} - m_f) / s_f
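A brief sketch of the two standardizations above, range normalization and the z-score based on the mean absolute deviation. Names and sample data are illustrative.

```python
import numpy as np

def range_normalize(col):
    """Scale an interval-scaled attribute into [0, 1]."""
    col = np.asarray(col, dtype=float)
    return (col - col.min()) / (col.max() - col.min())

def z_score_mad(col):
    """Z-score using the mean absolute deviation s_f rather than the standard deviation."""
    col = np.asarray(col, dtype=float)
    m_f = col.mean()                      # mean m_f of the attribute
    s_f = np.abs(col - m_f).mean()        # mean absolute deviation s_f
    return (col - m_f) / s_f

heights = [150, 160, 170, 180, 190]
print(range_normalize(heights))   # [0.   0.25 0.5  0.75 1.  ]
print(z_score_mad(heights))       # approximately [-1.67 -0.83  0.    0.83  1.67]
```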
Ratio-scaled attributes
These are numeric attributes measured on a nonlinear (e.g., exponential) scale, such as growth quantities. A common approach is to apply a log transform,

    log(x_{if}),

and then treat the transformed values as interval-scaled.
Nominal attributes
A nominal attribute can be handled by converting it to a set of (asymmetric) binary attributes: create one new binary attribute for each distinct value, set it to 1 for the data points that take that value, and to 0 otherwise.

Nominal attributes: an example
Nominal attribute fruit has three values (say, Apple, Orange, and Pear). We create three binary attributes, Apple, Orange, and Pear; if a data instance has the value Apple for fruit, then its Apple attribute is set to 1 and its Orange and Pear attributes are set to 0.
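A minimal sketch of this nominal-to-binary conversion (essentially one-hot encoding); the function name and the fruit values simply follow the example above.

```python
def nominal_to_binary(values):
    """Convert a column of nominal values into one binary attribute per distinct value."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded

cats, rows = nominal_to_binary(["Apple", "Pear", "Apple", "Orange"])
print(cats)   # ['Apple', 'Orange', 'Pear']
print(rows)   # [[1, 0, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
```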
Ordinal attributes
An ordinal attribute is like a nominal attribute, but its values have an order, e.g., small < medium < large. A common approach is to convert the values to numeric ranks and then treat the attribute as interval-scaled.
Mixed attributes
A data set may contain attributes of all of the types discussed above:
interval-scaled,
symmetric binary,
asymmetric binary,
ratio-scaled,
ordinal, and
nominal.
Clustering such a data set requires a way to combine these different types in a single distance computation.
Combining individual distances
One approach is to compute a distance d_{ij}^f for each attribute f separately, using the method appropriate for its type, and then combine them. Let δ_{ij}^f be an indicator that is 1 if attribute f can be used to compare x_i and x_j (both values are present, and it is not a 0-0 match of an asymmetric binary attribute) and 0 otherwise. With r attributes,

    dist(x_i, x_j) = ( \sum_{f=1}^{r} δ_{ij}^f d_{ij}^f ) / ( \sum_{f=1}^{r} δ_{ij}^f )
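A hedged sketch of this combined (Gower-style) distance. It assumes each attribute is labeled 'numeric' or 'nominal', normalizes numeric differences by the attribute's value range, uses simple matching for nominal values, and treats None as a missing value (δ = 0); the attribute-type handling is a deliberate simplification, and all names are illustrative.

```python
def mixed_distance(xi, xj, types, ranges):
    """Combined distance over mixed attributes.

    types[f]  : 'numeric' or 'nominal'
    ranges[f] : max(f) - min(f), used to normalize numeric differences
    None      : marks a missing value (delta_ij^f = 0 for that attribute)
    """
    num = den = 0.0
    for f, (a, b) in enumerate(zip(xi, xj)):
        if a is None or b is None:          # attribute cannot be used: delta = 0
            continue
        if types[f] == "numeric":
            d = abs(a - b) / ranges[f]      # normalized numeric difference d_ij^f
        else:
            d = 0.0 if a == b else 1.0      # simple matching for nominal values
        num += d                            # sum of delta_ij^f * d_ij^f
        den += 1.0                          # sum of delta_ij^f
    return num / den if den else 0.0

xi = [35, "red", None]
xj = [45, "blue", "yes"]
print(mixed_distance(xi, xj, ["numeric", "nominal", "nominal"], [50, None, None]))
# (10/50 + 1) / 2 = 0.6
```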
Which clustering algorithm to use?
Choosing the best clustering algorithm is a challenge: every algorithm has its own limitations and works well only for certain kinds of data, so in practice several algorithms, distance functions, and parameter settings are often tried and their results compared.
Cluster evaluation
Cluster evaluation is a hard problem. Two common approaches are:
User inspection: a panel of experts inspects the resulting clusters and judges their quality.
Evaluation measures based on ground truth: a labeled data set is clustered, the class labels are treated as the ground-truth partition, and measures such as entropy and purity are computed for the clustering.
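As a hedged sketch, assuming the standard formulation of the entropy measure: for each cluster D_i, entropy(D_i) = -Σ_j Pr_i(c_j) log2 Pr_i(c_j), and the clusters are combined weighted by size as Σ_i (|D_i|/|D|) entropy(D_i). Function and variable names are illustrative.

```python
import math
from collections import Counter

def clustering_entropy(cluster_labels, class_labels):
    """Size-weighted entropy of a clustering w.r.t. ground-truth classes (lower is better)."""
    n = len(class_labels)
    total = 0.0
    for i in set(cluster_labels):
        members = [c for k, c in zip(cluster_labels, class_labels) if k == i]
        counts = Counter(members)
        ent = -sum((m / len(members)) * math.log2(m / len(members))
                   for m in counts.values())
        total += (len(members) / n) * ent
    return total

# toy example: two clusters over classes A and B
clusters = [0, 0, 0, 1, 1, 1]
classes  = ["A", "A", "B", "B", "B", "B"]
print(clustering_entropy(clusters, classes))  # 0.5 * 0.918 + 0.5 * 0 = 0.459
```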
A remark about ground-truth evaluation
Labeled data sets are used in this way mainly to compare different clustering algorithms; a real-life data set for clustering has no class labels.

Indirect evaluation
In some applications, clustering is not the primary task but is used to help with another task; the quality of the clustering can then be judged by how much it improves the performance of that primary task.
Discovering holes and data regions
Data regions and empty regions (holes) in the data space can both carry useful information. The approach described here turns their discovery into a supervised learning problem: regard the existing data points as one class and imagine the space also filled with a uniform distribution of non-existing points of a second class, then use decision tree induction to separate the two; the resulting tree partitions the space into data regions and empty regions.

An example

Characteristics of the approach
It provides representations of the resulting data regions and empty regions as hyper-rectangles, which can be expressed as rules.
The decision tree is built with the usual entropy (information gain) criterion. For a data set D with |C| classes,

    entropy(D) = - \sum_{j=1}^{|C|} Pr(c_j) \log_2 Pr(c_j)

where Pr(c_j) is the proportion of points in D that belong to class c_j.
Summary
Clustering has a long history and is still an active area of research, with a huge number of algorithms of which only a few main ones were introduced here. Every algorithm has its strengths and weaknesses, clustering results are hard to evaluate, and yet clustering is highly useful in practice.