
K-Means Clustering

K-Means is a partition-based clustering algorithm that divides a dataset into K clusters defined by centroids, with data points assigned to the nearest centroid. The process involves initializing cluster centers, assigning data points, updating cluster centers, and repeating until stabilization. It is scalable and easy to understand but can be sensitive to initial placements and assumes spherical clusters.

Uploaded by Rana Ben Fraj

Definition:
K-Means is a partition-based clustering algorithm that splits a dataset into
K clusters. Each cluster is defined by its centroid, and each data point is
assigned to the nearest centroid according to a distance metric (e.g., Euclidean
distance). It works best when the clusters are roughly spherical and of similar size.
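
In symbols (the notation is an assumption, it does not appear in the original notes): write the data points as x_1, ..., x_n, the centroids as mu_1, ..., mu_K, and C_k for the set of points currently assigned to cluster k. With Euclidean distance, the assignment rule, the centroid of a cluster, and the total squared distance that K-Means tries to make small can be sketched as:

```latex
c_i = \arg\min_{k \in \{1,\dots,K\}} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{|C_k|} \sum_{i \in C_k} x_i,
\qquad
J = \sum_{i=1}^{n} \lVert x_i - \mu_{c_i} \rVert^2
```

Here c_i is the index of the cluster that point x_i is assigned to, and J is small when every point sits close to its own centroid.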

Imagine placing K magnets on a table of scattered metal balls. Each ball will
“stick” to the closest magnet, and each magnet will then move to the middle of
its assigned balls.

Example:
If you’re analyzing customer spending, K-Means could group customers into
clusters like:

Cluster 1: High-spenders.

Cluster 2: Average spenders.

Cluster 3: Budget shoppers.

(Each group is like a type of shopper based on their behavior.)
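
To make the shopper example concrete, here is a minimal sketch of the "nearest centroid" idea in one dimension. The three centers, the customer IDs, and the spend figures are all made-up values for illustration, and `nearest_cluster` is a hypothetical helper name, not something from the original notes.

```python
# Hypothetical cluster centers: typical monthly spend per group (made-up numbers).
centroids = {
    "High-spenders": 500.0,
    "Average spenders": 150.0,
    "Budget shoppers": 40.0,
}


def nearest_cluster(spend, centroids):
    """Return the name of the cluster whose center is closest to `spend`."""
    return min(centroids, key=lambda name: abs(spend - centroids[name]))


# Hypothetical customers and their monthly spend.
customers = {"A": 35.0, "B": 480.0, "C": 170.0, "D": 60.0}

groups = {cid: nearest_cluster(spend, centroids) for cid, spend in customers.items()}
print(groups)
```

In a real K-Means run the centers are not given in advance; they are learned from the data.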

Steps:
1. Initialize K: Choose the number of clusters (K) for the dataset and
place K initial cluster centers randomly.
(Think of this as choosing where the groups will start forming on a map.)

2. Assignment Step: Assign each data point to the nearest cluster center
using a distance measure like Euclidean distance.

(Imagine giving every house to the closest delivery center.)

3. Update Step: Calculate the new center of each cluster by averaging the
points in it.
(The "center" moves to the middle of its assigned houses.)

4. Repeat: Keep repeating steps 2 and 3 until the centers stop moving
significantly or a set number of iterations is reached.

(This keeps adjusting until the groups settle down.)
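
The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the function name `kmeans`, its parameters (`max_iters`, `tol`, `seed`), and the choice to start from K randomly picked data points are assumptions made for this sketch. The customer data is invented, echoing the spending example.

```python
import math
import random


def kmeans(points, k, max_iters=100, tol=1e-9, seed=0):
    """Plain K-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 1: use k randomly chosen data points as the initial centers.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest center (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Step 3: move each center to the average of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # empty cluster: keep old center
        # Step 4: stop once no center has moved by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels


# Made-up customers: (monthly spend, orders per month).
customers = [
    (20.0, 1.0), (25.0, 2.0), (30.0, 1.0),
    (95.0, 4.0), (100.0, 4.0), (110.0, 5.0),
    (300.0, 10.0), (310.0, 11.0), (320.0, 12.0),
]
centers, labels = kmeans(customers, k=3)
print(centers)
print(labels)
```

Because the starting centers are random, a different `seed` can produce a different grouping.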

Advantages:
Scalable: Handles large datasets well, because each pass only compares every
point with the K centers. (It works quickly even if you have a lot of data.)

Easy to Understand: Its steps are simple to follow. (You’re just grouping
things and finding averages.)

Limitations:
Sensitive to the initial placement of cluster centers. (Bad starting points can
lead to bad groupings.)

Assumes clusters are circular or spherical. (It struggles with weirdly shaped
groups.)
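
Both limitations can be seen on one small made-up dataset: two long, thin horizontal strips of points. The sketch below runs the assign/update loop from two hand-picked sets of starting centers (hand-picked rather than random so the runs are reproducible). The names `kmeans_from`, `sq_dist`, and `sse` are hypothetical helpers written for this illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_from(points, centers, max_iters=100):
    """Run the assign/update loop from explicitly given starting centers."""
    centers = list(centers)
    labels = []
    for _ in range(max_iters):
        # Assign every point to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        # Move every center to the average of its points.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[j])
        if new_centers == centers:  # nothing moved: the groups have settled
            break
        centers = new_centers
    return centers, labels


def sse(points, centers, labels):
    """Total squared distance from each point to its assigned center."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


# Two thin strips: 11 points at y = 0 and 11 points at y = 1.
strips = [(float(x), y) for y in (0.0, 1.0) for x in range(11)]

# Start A: one center in the middle of each strip.
centers_a, labels_a = kmeans_from(strips, [(5.0, 0.0), (5.0, 1.0)])
# Start B: one center toward the left, one toward the right.
centers_b, labels_b = kmeans_from(strips, [(2.0, 0.5), (9.0, 0.5)])

print(centers_a, sse(strips, centers_a, labels_a))  # one cluster per strip, 220.0
print(centers_b, sse(strips, centers_b, labels_b))  # left/right split, 60.5
```

The two starts settle on different groupings, which is the sensitivity to initial placement. The left/right split has the smaller total squared distance even though it cuts both strips in half, so K-Means "prefers" the compact, rounder groups over the long, thin ones that a person would probably pick out.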
