9 Fuzzy Clustering
Content
● Introduction
● FCM: Input, output and restriction
● Fuzzy C-Means Algorithm
● Demonstration and Example (SKFuzzy)
@LimCK 2 / 32
Introduction
● Clustering problem: grouping a set of objects so that objects in the same group are more similar to one another than to objects in other groups.
Introduction
● Applications of clustering:
a) Data analysis b) Image segmentation, etc.
Introduction
● K-Means – hard boundaries: each object belongs to exactly one cluster.
● Fuzzy C-Means (FCM) – soft boundaries: each object belongs to every cluster with some weight (membership degree).
FCM: Input, output and restriction
● Input: an unlabeled data set of objects
X = { x1, x2, …, xj, …, xN }
where N is the number of objects and each xj is a p-dimensional real vector.
● We must also specify the number of clusters, C.
FCM: Input, output and restriction
● Output:
– A fuzzy C-partition of X, visualized as a matrix U of dimension (C × N):
U = [ uij ]
– where 1 ≤ i ≤ C indexes the clusters,
– 1 ≤ j ≤ N indexes the objects,
– and uij is the membership degree of object j in cluster i.
– Some formulations also include the vectors V = { v1, v2, …, vC } that represent the cluster centers.
FCM: Input, output and restriction
● Since uij is a membership degree, 0 ≤ uij ≤ 1.
● The membership degrees of an object across all clusters must sum to 1: Σi uij = 1 for every object j.
Example of output, U (C = 3 clusters, N = 10 objects; each column sums to 1)
0.2 0.5 0.7 0.2 0.1 0.6 0.8 0.4 0.2 0.1
0.7 0.2 0.1 0.8 0.8 0.2 0.1 0.6 0.8 0.8
0.1 0.3 0.2 0.0 0.1 0.2 0.1 0.0 0.0 0.1
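The two restrictions can be checked directly on the example matrix above; a minimal NumPy sketch:

```python
import numpy as np

# The example C-partition U from the slide: 3 clusters x 10 objects.
U = np.array([
    [0.2, 0.5, 0.7, 0.2, 0.1, 0.6, 0.8, 0.4, 0.2, 0.1],
    [0.7, 0.2, 0.1, 0.8, 0.8, 0.2, 0.1, 0.6, 0.8, 0.8],
    [0.1, 0.3, 0.2, 0.0, 0.1, 0.2, 0.1, 0.0, 0.0, 0.1],
])

# Restriction 1: every membership degree lies in [0, 1].
assert ((U >= 0) & (U <= 1)).all()

# Restriction 2: the memberships of each object (each column) sum to 1.
assert np.allclose(U.sum(axis=0), 1.0)
```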
FCM Algorithm
● Minimize the cost function (for a fuzzifier m > 1):
Jm(U, V) = Σ_{i=1..C} Σ_{j=1..N} (uij)^m · ‖xj − vi‖²
● The algorithm alternates between updating the membership degrees U and the cluster centers V until a stopping criterion is met.
● Stopping criteria:
1) the cluster centers no longer change
2) the change in the cost function is below a specified threshold
3) the absolute change in every uij is below a given threshold
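The alternating updates above can be sketched in a few lines of NumPy. This is a minimal illustration, not skfuzzy's implementation; the function name `fcm` and the defaults are ours:

```python
import numpy as np

def fcm(X, C, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Minimal FCM sketch: X is (N, p); returns centers (C, p) and U (C, N)."""
    rng = np.random.default_rng(seed)
    U = rng.random((C, X.shape[0]))
    U /= U.sum(axis=0)                                    # columns sum to 1
    for _ in range(max_iter):
        W = U ** m                                        # fuzzified weights
        centers = (W @ X) / W.sum(axis=1, keepdims=True)  # weighted means
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)                             # avoid division by zero
        U_new = d ** (-2.0 / (m - 1.0))                   # membership update
        U_new /= U_new.sum(axis=0)                        # renormalize columns
        if np.abs(U_new - U).max() < tol:                 # stopping criterion 3
            U = U_new
            break
        U = U_new
    return centers, U
```

On two well-separated 1-D blobs (e.g. points near 0 and near 5), the centers converge to roughly the blob means and each column of U sums to 1.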
Computing Membership Degrees
● The membership degrees are given by:
uij = 1 / Σ_{k=1..C} ( ‖xj − vi‖ / ‖xj − vk‖ )^( 2/(m−1) )
● The distance from an object to each cluster center is computed; closer centers receive larger membership degrees.
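The update formula above reduces to "take each distance to the power −2/(m−1), then normalize each column"; a small sketch (the function name is ours):

```python
import numpy as np

def membership_degrees(X, centers, m=2.0):
    """U for data X (N, p) and centers (C, p); each column of U sums to 1."""
    d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)  # (C, N)
    d = np.fmax(d, 1e-12)                  # guard objects that sit on a center
    inv = d ** (-2.0 / (m - 1.0))          # closer centers get larger weights
    return inv / inv.sum(axis=0)

# One object at x = 1 with centers at 0 and 3: distances are 1 and 2, so with
# m = 2 the memberships are proportional to 1/1^2 and 1/2^2, i.e. 0.8 and 0.2.
U = membership_degrees(np.array([[1.0]]), np.array([[0.0], [3.0]]))
```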
Computing the cluster centers
● For a cluster i, the corresponding cluster center vi is defined as:
vi = Σ_{j=1..N} (uij)^m xj / Σ_{j=1..N} (uij)^m
● All points are considered, and the contribution of each point to the cluster center is weighted by its membership degree (raised to the power m).
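The center update is a weighted mean over all points; a minimal sketch (function name ours):

```python
import numpy as np

def cluster_centers(X, U, m=2.0):
    """Centers from data X (N, p) and partition U (C, N); returns (C, p)."""
    W = U ** m                                    # fuzzified membership weights
    return (W @ X) / W.sum(axis=1, keepdims=True)

# Two 1-D points at 0 and 4; cluster 1 weights them 0.9 and 0.1, so its center
# (0.9^2*0 + 0.1^2*4) / (0.9^2 + 0.1^2) stays close to 0, and symmetrically
# cluster 2's center stays close to 4.
X = np.array([[0.0], [4.0]])
U = np.array([[0.9, 0.1], [0.1, 0.9]])
centers = cluster_centers(X, U)
```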
Effect of parameter m
● If m > 2, the exponent 1/(m−1) decreases the weight assigned to clusters that are close to the point.
● If m → ∞, the exponent → 0, so all weights → 1/C: every object belongs equally to every cluster.
● If m → 1, the exponent increases the membership degree of points to which the cluster is close.
As m → 1, the membership degree → 1 for the closest cluster and → 0 for all the other clusters (this corresponds to k-means).
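These three regimes can be seen numerically by varying m for a single object at fixed distances from C = 3 centers (the helper name and distances are illustrative):

```python
import numpy as np

def memberships(dists, m):
    """Memberships of one object, given its distances to each center."""
    inv = np.asarray(dists, dtype=float) ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

d = [1.0, 2.0, 4.0]             # distances from one object to C = 3 centers
u_sharp = memberships(d, 1.1)   # m near 1: winner-take-all, like k-means
u_mid   = memberships(d, 2.0)   # typical choice, e.g. [0.762, 0.190, 0.048]
u_flat  = memberships(d, 20.0)  # large m: memberships approach 1/C
```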
Fuzzy C Means - demonstration
● Simple 1-D case, with 20 data points and 3 clusters.
● Set m = 2. Stop if the difference between steps is < 0.3.
FCM with skfuzzy (Clustering)
● Syntax:
cntr, u, u0, d, jm, p, fpc = fuzz.cmeans(data, c, m, error, maxiter, init=None, seed=None)
or
result = fuzz.cmeans(data, c, m, error, maxiter, init=None, seed=None)
● Inputs:
data – 2D array with size (S, N) holding the data to be clustered. S: features, N: instances
c – integer, the desired number of clusters
m – float; controls the fuzziness of the clustering, typical value 2
error – float; stopping criterion: the algorithm stops once the change between iterations is < error
FCM with skfuzzy (Clustering)
● Inputs (continued):
maxiter – integer; maximum number of iterations allowed
init – 2D array with size (C, N); initial fuzzy c-partition matrix. If none is provided, the algorithm is initialized randomly.
seed – integer; if provided, sets the random seed for init. It has no effect if init is provided; mainly for debugging/testing purposes.
FCM with skfuzzy (Clustering)
● Outputs:
cntr – 2D array of size (C, S) holding the C cluster centers
u – 2D array of size (C, N), the final fuzzy c-partition matrix
u0 – 2D array of size (C, N), the initial guess at u
d – 2D array of size (C, N), the final Euclidean distance matrix
FCM with skfuzzy (Clustering)
● Outputs (continued):
jm – 1D array recording the history of objective-function values
p – number of iterations run
fpc – final fuzzy partition coefficient
result – the tuple of everything from cntr to fpc
FCM with skfuzzy (Prediction)
● Syntax:
u, u0, d, jm, p, fpc = fuzz.cmeans_predict(test_data, cntr_trained, m, error, maxiter, init=None, seed=None)
● test_data is the new data in (S, N) form, and cntr_trained is the array of cluster centers obtained from fuzz.cmeans.
Demo: Iris dataset with 4 features
● Iris dataset – contains 3 classes of iris plant (Setosa, Versicolour, Virginica)
● 50 instances for each class
● One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.
● 4 features are recorded for each instance: sepal length, sepal width, petal length and petal width.
● For better visualization, only 2 classes (100 instances in total) are used for the demo.
Demo: Iris dataset with 4 features
Average (cm) Sepal length Sepal width Petal length Petal width
setosa 5.006 3.418 1.464 0.244
versicolor 5.936 2.77 4.26 1.326
virginica 6.588 2.974 5.552 2.026
Demo: Iris dataset with 4 features
● Data: “iris2b.csv” with dimension 100X2
Demo: Iris dataset with 4 features
● Data: “iris2b.csv” with dimension 100X2
● Command:
cntr, u, u0, d, jm, p, fpc = fuzz.cmeans(data, c=2, m=2, error=0.00001, maxiter=100, init=None, seed=None)
● Output:
cntr
array([[5.97571191, 2.79326814, 4.30582601, 1.33911579],
       [5.00456842, 3.40232882, 1.48743359, 0.25304421]])
Demo: Iris dataset with 4 features
● Other outputs of the same command:
u.shape
(2, 100)
p
11
fpc
0.9236757239275252
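The fpc value of about 0.92 can be read with the standard (Bezdek) partition coefficient, FPC = (1/N) Σi Σj uij²: it is 1 for a crisp partition and 1/C for a completely uniform one, so values near 1 indicate well-separated clusters. A minimal sketch (the function name is ours, not skfuzzy's):

```python
import numpy as np

def fuzzy_partition_coefficient(U):
    """FPC = (1/N) * sum of squared memberships; ranges from 1/C to 1."""
    return float((U ** 2).sum() / U.shape[1])

U_crisp = np.array([[1.0, 0.0], [0.0, 1.0]])   # hard partition   -> FPC = 1
U_flat  = np.array([[0.5, 0.5], [0.5, 0.5]])   # uniform, C = 2   -> FPC = 0.5
```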
Demo: Iris dataset with 4 features
● History of the objective function for the same command:
jm
array([191.57127093, 150.30291193, 139.24608917,  75.65029622,
        41.43916205,  40.88894263,  40.88085657,  40.88063667,
        40.88063043,  40.88063026,  40.88063025])
Demo: Iris dataset with 4 features
● Data: “iris2b.csv” with dimension 100X2
● Suppose we have the cluster centers only; we can still assign the data to clusters by performing prediction.
Demo: Iris dataset with 4 features
● Command (Prediction):
new_u, new_u0, new_d, new_jm, new_p, new_fpc = fuzz.cluster.cmeans_predict(data, cntr, m=2, error=0.00001, maxiter=30)
Thank you