Practical no_ 4

The document outlines a practical exercise on performing data clustering using the K-Means algorithm in R or Python. It explains the iterative process of K-Means, including steps for initializing clusters, updating centers, and reassigning data points until convergence. The practical uses the iris dataset to demonstrate clustering based on flower measurements, ultimately visualizing the results with plots.

Uploaded by

priya.rajak

Practical no: 4

Aim: Perform data clustering with a clustering algorithm using R or Python.

Theory: K-Means clustering partitions n objects into k clusters so that each
object belongs to the cluster with the nearest mean. The method produces exactly k
distinct clusters that are as well separated as possible. The number of clusters k
that gives the greatest separation (distance) is not known a priori and must be
estimated from the data.
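
To make "nearest mean" and "separation" concrete, here is a minimal Python sketch. It assumes hypothetical toy points (not the iris data) and uses the within-cluster sum of squares as the measure of tightness; that measure is one common choice and is not something the practical itself prescribes. The helper names mean_point, nearest_center and wcss are made up for this illustration.

```python
import math

def mean_point(group):
    """Coordinate-wise mean of a list of points."""
    n = len(group)
    return tuple(sum(coord) / n for coord in zip(*group))

def nearest_center(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def wcss(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(math.dist(p, centers[nearest_center(p, centers)]) ** 2 for p in points)

# Hypothetical toy data: two visibly separate groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]

one_center = [mean_point(points)]                               # k = 1
two_centers = [mean_point(points[:3]), mean_point(points[3:])]  # k = 2

wcss_k1 = wcss(points, one_center)
wcss_k2 = wcss(points, two_centers)
print("k=1:", wcss_k1, "k=2:", wcss_k2)
```

Going from one center to two makes the total drop sharply for this data. Comparing the total across several values of k is one way to estimate a suitable k from the data.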

Algorithm:-

K-Means is an iterative clustering procedure: it keeps iterating until the cluster
assignments stop changing. Because it starts from random centers, the result is a
good solution for the starting point used, but not necessarily the best possible one.
The following pseudo example describes the basic steps of K-Means clustering.

1. Start with the number of clusters we want, e.g. 3 in this case. The K-Means
algorithm starts with random centers in the data and then attaches each point to
its nearest center.
2. The algorithm then moves the randomly allocated centers to the means of the
created groups.
3. In the next step, data points are reassigned to the nearest of these new centers.
4. Steps 2 and 3 are repeated until no data point changes its group.
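
The four steps above can be sketched in plain Python as follows. This is an illustrative version only, not the implementation behind R's kmeans(); the function name, the seed and max_iter parameters, and the toy points are assumptions made for the example.

```python
import math
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Basic K-Means on a list of coordinate tuples; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step 1: pick k random data points as centers
    labels = None
    for _ in range(max_iter):
        # steps 1 and 3: attach every point to its nearest center
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centers[i])) for p in points
        ]
        if new_labels == labels:  # step 4: stop when no point changes its group
            break
        labels = new_labels
        # step 2: move each center to the mean of its group
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty group keeps its previous center
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Hypothetical toy data: two separate groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

The cluster numbers themselves are arbitrary: which group is called 0 and which is called 1 depends on the random starting centers.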

In this practical we use the iris data set, which gives the measurements in
centimeters of sepal length, sepal width, petal length and petal width for
50 flowers from each of 3 species of iris. The species are Iris setosa,
versicolor and virginica. iris is a data frame with 150 cases (rows) and 5
variables (columns) named Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
and Species.

Solution:-

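> newiris <- iris
> newiris$Species <- NULL            # drop the species label; clustering uses only the measurements
> (kc <- kmeans(newiris, 3))         # outer parentheses print the fitted result
> table(iris$Species, kc$cluster)    # compare the clusters with the actual species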

> plot(newiris[c("Sepal.Length", "Sepal.Width")], col = kc$cluster)    # color each flower by its cluster
> points(kc$centers[, c("Sepal.Length", "Sepal.Width")], col = 1:3, pch = 8, cex = 2)    # mark the three cluster centers
> dev.off()
null device
          1
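
The table() call above cross-tabulates the true species against the cluster numbers. As a rough Python counterpart, the sketch below counts (label, cluster) pairs with the standard library. The species and clusters lists are hypothetical values made up for illustration and are not output from this practical; cross_tab is likewise an invented helper name.

```python
from collections import Counter

def cross_tab(true_labels, cluster_ids):
    """Count how often each (label, cluster) pair occurs."""
    return Counter(zip(true_labels, cluster_ids))

# Hypothetical labels and cluster assignments (not results from the iris run).
species = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
clusters = [1, 1, 2, 3, 3, 3]

counts = cross_tab(species, clusters)
cluster_ids = sorted(set(clusters))
for name in sorted(set(species)):
    # one row per species, one column per cluster number
    print(name, [counts[(name, c)] for c in cluster_ids])
```

A row with all its count in one column means that species landed in a single cluster; a row spread over several columns means the species was split between clusters.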
