Fuzzy Classification
Classification Methods
Two popular methods of classification
• Classification By Equivalence
Crisp Relations
Fuzzy Relations
• Fuzzy C-means (FCM)
Fuzzy c-means (FCM) is a method of clustering which allows one piece of
data to belong to two or more clusters.
Classification By Equivalence
Crisp Relation
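• A crisp relation R is defined on the Cartesian product of two universal sets, X × Y = {(x, y) | x ∈ X, y ∈ Y}, through its characteristic function: χR(x, y) = 1 if (x, y) ∈ R, and 0 otherwise.
• Here “1” means the pair is fully in the relation and “0” means it is not in the relation at all.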
• Define the set [xi] = {xj | (xi, xj) ∈ R} as the equivalence class of xi on a universe of
data points, X, where R is a special kind of relation known as an equivalence relation (reflexive, symmetric, and transitive).
• This class is the set of all elements related to xi. Equivalence classes have the following properties:
1. xi ∈ [xi], since (xi, xi) ∈ R
2. [xi] ≠ [xj] ⇒ [xi] ∩ [xj] = Ø
3. ⋃x∈X [x] = X
• Crisp equivalence relations can therefore be used to divide the universe X into mutually exclusive classes.
• For fuzzy relations, every λ-cut of a fuzzy equivalence relation is a crisp equivalence
relation.
• Hence, to classify data points in the universe using fuzzy relations, we need to find the associated
fuzzy equivalence relation.
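
The λ-cut idea above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the matrix `R` is a made-up fuzzy equivalence relation on five data points, and the helper names `lambda_cut` and `equivalence_classes` are hypothetical. Each λ-cut yields a crisp equivalence relation, and points whose rows are identical fall into the same class.

```python
import numpy as np

def lambda_cut(R, lam):
    """Crisp relation: 1 where the fuzzy membership is >= lam, else 0."""
    return (np.asarray(R) >= lam).astype(int)

def equivalence_classes(crisp):
    """Group point indices by identical rows.

    Valid only when `crisp` is an equivalence relation, in which case
    related points share exactly the same row.
    """
    classes = {}
    for i, row in enumerate(crisp):
        classes.setdefault(tuple(row), []).append(i)
    return list(classes.values())

# Assumed example data: a reflexive, symmetric, max-min transitive relation.
R = np.array([
    [1.0, 0.8, 0.4, 0.5, 0.8],
    [0.8, 1.0, 0.4, 0.5, 0.9],
    [0.4, 0.4, 1.0, 0.4, 0.4],
    [0.5, 0.5, 0.4, 1.0, 0.5],
    [0.8, 0.9, 0.4, 0.5, 1.0],
])

for lam in (0.4, 0.5, 0.8, 0.9):
    print(lam, equivalence_classes(lambda_cut(R, lam)))
```

Raising λ makes the relation stricter, so the classes split into finer groups; lowering it merges them.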
Fuzzy C Means
• Fuzzy c-means (FCM) is a data clustering technique in which a data set is grouped
into N clusters with every data point in the dataset belonging to every cluster to a certain
degree.
• For example, a data point that lies close to the center of a cluster will have a high degree of
membership in that cluster, and another data point that lies far away from the center of a
cluster will have a low degree of membership to that cluster.
Cluster Analysis
• Cluster analysis is a statistical classification technique in which objects or points with
similar characteristics are grouped together into clusters.
• The aim of cluster analysis is to organize observed data into meaningful structures in order to
gain further insight from them.
Cluster Validity
• Cluster validation is the procedure of evaluating the quality of a clustering
algorithm's results.
• This is important to avoid finding patterns in random data, and also when
comparing two clustering algorithms.
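
One commonly used validity measure for fuzzy partitions is the partition coefficient, the mean of the squared memberships. It is not named in these slides and is shown here only as an assumed example of how "goodness" can be scored. The function name and the two membership matrices below are hypothetical.

```python
import numpy as np

def partition_coefficient(U):
    """Mean of squared memberships.

    U has shape (n_points, n_clusters) and each row sums to 1.
    The value ranges from 1/n_clusters (maximally fuzzy) to 1 (crisp).
    """
    U = np.asarray(U, dtype=float)
    return float(np.sum(U ** 2) / U.shape[0])

# Assumed example partitions of three points into two clusters.
crisp_like = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
maximally_fuzzy = np.full((3, 2), 0.5)

print(partition_coefficient(crisp_like))       # 1.0
print(partition_coefficient(maximally_fuzzy))  # 0.5
```

A higher value indicates a less ambiguous partition, which gives one way to compare the results of two runs or two algorithms on the same data.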
Algorithm
• The algorithm assigns each data point a membership in each cluster, based on the
distance between the data point and the cluster center.
• The closer a data point is to a cluster center, the higher its membership in that
cluster.
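
The distance-based membership assignment described above can be sketched with NumPy. This is a minimal sketch of the standard FCM update rules under stated assumptions: the fuzzifier `m`, the tolerance `tol`, the random initialization, and the toy data `X` are all illustrative choices not given in the slides.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Return (centers, U) where U[i, j] is the membership of point i in cluster j."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Cluster centers: membership-weighted means of the data.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Smaller distance -> larger membership.
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.max(np.abs(U_new - U)) < tol
        U = U_new
        if converged:
            break
    return centers, U

# Assumed toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [9, 9], [9, 10], [10, 9]])
centers, U = fuzzy_c_means(X, c=2)
print(np.round(U, 3))
```

Each row of `U` sums to 1, and points near a center get a membership close to 1 for that cluster. Taking `U.argmax(axis=1)` converts the fuzzy result into hard labels when needed.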
Disadvantages
1) The number of clusters must be specified a priori.
2) A lower value of β gives better results, but at the expense of a larger
number of iterations.
3) Euclidean distance measures can unequally weight underlying factors.
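
The third point can be illustrated with a small numeric sketch. The data below is hypothetical (income in dollars, age in years), and z-score standardization is shown only as one common mitigation, not something prescribed by these slides.

```python
import numpy as np

# Assumed data: column 0 is income in dollars, column 1 is age in years.
X = np.array([
    [30000.0, 25.0],
    [30500.0, 60.0],
    [45000.0, 26.0],
])

def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(a - b))

# Raw distances: the large-scale income column dominates.
raw_01 = euclidean(X[0], X[1])  # points differ mainly in age
raw_02 = euclidean(X[0], X[2])  # points differ mainly in income

# Standardize each column to zero mean and unit variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
std_01 = euclidean(Z[0], Z[1])
std_02 = euclidean(Z[0], Z[2])

print(raw_01, raw_02)
print(std_01, std_02)
```

On the raw data the 35-year age gap barely registers next to the income gap, while after standardization the two pairs are at comparable distances.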