Clustering Data Mining
Data Mining
Farhad Muhammad Riaz
[email protected]
K-Means Clustering
• Applications
– Image segmentation
– Customer segmentation
– Species clustering
– Anomaly detection
– Clustering languages
K-Means Clustering Algorithm
• Step 1
– Choose the number of clusters K.
• Step 2
– Randomly select any K data points as the initial cluster centers.
– Choose the cluster centers so that they are as far apart from each other as possible.
• Step 3
– Calculate the distance between each data point and each cluster center.
– The distance may be calculated using either a given distance function or the
Euclidean distance formula.
• Step 4
– Assign each data point to some cluster.
– Each data point is assigned to the cluster whose center is nearest to it.
• Step 5
– Recompute the centers of the newly formed clusters.
– The center of a cluster is computed by taking the mean of all the data
points contained in that cluster.
• Step 6
– Repeat Steps 3 to 5 until any of the following stopping criteria is met:
    • The centers of the newly formed clusters do not change
    • The data points remain in the same clusters
    • The maximum number of iterations is reached
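The steps above can be sketched in a few lines of Python. This is a minimal illustration and not a reference implementation: the function names (euclidean, mean_point, k_means), the max_iter default, the optional centers and seed arguments, and the rule that an empty cluster keeps its previous center are all assumptions made for this sketch.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, distance=euclidean, max_iter=100, centers=None, seed=None):
    # Steps 1-2: K is given; pick K distinct data points at random as the
    # initial centers unless the caller supplies them.
    if centers is None:
        centers = random.Random(seed).sample(points, k)
    centers = [tuple(c) for c in centers]
    labels = None

    for _ in range(max_iter):  # stopping criterion: maximum iterations
        # Steps 3-4: compute distances and assign each point to the
        # nearest center (ties go to the lowest cluster index).
        new_labels = [
            min(range(k), key=lambda j: distance(p, centers[j])) for p in points
        ]

        # Step 5: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(mean_point(members) if members else centers[j])

        # Step 6: stop when assignments or centers no longer change.
        converged = new_labels == labels or new_centers == centers
        labels, centers = new_labels, new_centers
        if converged:
            break

    return centers, labels
```

Calling k_means(points, k) returns the final centers and, for each input point, the index of the cluster it was assigned to. Passing a different distance argument swaps in another distance function without changing the loop.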
Example
• Data points: A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4),
A7(1, 2), A8(4, 9)
• Initial cluster centers are: C1(2, 10), C2(5, 8) and
C3(1, 2).
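• For two points a = (x1, y1) and b = (x2, y2), you can use either
– The given distance function ρ(a, b) = |x2 – x1| + |y2 – y1| (the Manhattan distance)
– The Euclidean distance formula d(a, b) = sqrt((x2 – x1)^2 + (y2 – y1)^2)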
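As a rough check on the hand calculation, the first iteration of this example can be reproduced with the short script below. It is only a sketch: the names manhattan and one_iteration and the dictionary layout are chosen here for illustration, and the given distance function ρ is used rather than the Euclidean formula.

```python
def manhattan(a, b):
    """Given distance function: |x2 - x1| + |y2 - y1|."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


# Data points and initial cluster centers from the example.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
centers = {"C1": (2, 10), "C2": (5, 8), "C3": (1, 2)}


def one_iteration(points, centers, distance):
    """Assign each point to its nearest center, then recompute the centers."""
    clusters = {name: [] for name in centers}
    for label, p in points.items():
        nearest = min(centers, key=lambda c: distance(p, centers[c]))
        clusters[nearest].append(label)

    new_centers = {}
    for name, members in clusters.items():
        if members:
            xs = [points[m][0] for m in members]
            ys = [points[m][1] for m in members]
            new_centers[name] = (sum(xs) / len(xs), sum(ys) / len(ys))
        else:
            # Assumption: an empty cluster keeps its previous center.
            new_centers[name] = centers[name]
    return clusters, new_centers


clusters, new_centers = one_iteration(points, centers, manhattan)
print(clusters)
print(new_centers)
```

With the given distance function, the first pass puts A1 alone in the first cluster, A3, A4, A5, A6 and A8 in the second, and A2 and A7 in the third, so the recomputed centers are C1(2, 10), C2(6, 6) and C3(1.5, 3.5). Feeding new_centers back into one_iteration continues the procedure until a stopping criterion is met.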
Association Rule Mining