Practical No. 4
Aim: Perform data clustering using a clustering algorithm in R or Python.
Algorithm:-
K-Means is an iterative clustering procedure: it keeps refining the clusters until the
assignments stop changing. The result depends on the random starting centers, so it is a
locally good solution rather than a guaranteed best one. The following outline covers the
basic steps of K-Means clustering as it is generally used to cluster data.
1. Choose the number of clusters, e.g. 3 in this case. The K-Means algorithm starts the
process with random centers in the data and attaches each point to its nearest
center.
2. The algorithm then moves each randomly allocated center to the mean of the group
created around it.
3. In the next step, the data points are reassigned to the nearest of these updated centers.
4. Steps 2 and 3 are repeated until no point changes its associated group.
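The four steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation behind R's kmeans(): the function names, the seed and max_iter parameters, and the toy data are all assumptions made for the example, and starting centers are simply drawn from the data points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, k, seed=0, max_iter=100):
    """Return (centers, labels) for a basic K-Means run (illustrative only)."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the random starting centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Steps 1 and 3: attach every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Step 4: stop once no point changes its group.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 2: move each center to the mean of its group.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a group ends up empty
                centers[c] = mean_point(members)
    return centers, labels


# Hypothetical toy data: two well-separated groups of three points each.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, labels = kmeans(data, k=2)
print(centers)
print(labels)
```

The cluster numbers in labels are arbitrary: which group is called 0 and which is called 1 depends on the random starting centers, so only the grouping itself is meaningful.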
In this practical we use the iris data set, which gives the measurements in
centimeters of sepal length, sepal width, petal length and petal width for
50 flowers from each of 3 species of iris. The species are Iris setosa,
versicolor and virginica. iris is a data frame with 150 cases (rows) and 5
variables (columns) named Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
and Species.
Solution:-
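> # Copy iris and drop the Species label so only the four measurements are clustered
> newiris <- iris
> newiris$Species <- NULL
> (kc <- kmeans(newiris, 3))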
> table(iris$Species,kc$cluster)
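> # Colour each flower by its assigned cluster
> plot(newiris[c("Sepal.Length","Sepal.Width")],col=kc$cluster)
> # Overlay the three cluster centres as large stars
> points(kc$centers[,c("Sepal.Length","Sepal.Width")],col=1:3,pch=8,cex=2)
> dev.off()
null device
1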
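The table(iris$Species,kc$cluster) call in the solution counts how many flowers of each species fall into each cluster. For readers working in Python, the same kind of cross-tabulation can be sketched as below. The cross_tabulate helper and the toy labels are hypothetical and are not real iris results.

```python
from collections import Counter


def cross_tabulate(true_labels, cluster_ids):
    """Count how often each (true label, cluster id) pair occurs."""
    return Counter(zip(true_labels, cluster_ids))


# Hypothetical toy labels, not real iris results.
species = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
clusters = [1, 1, 2, 3, 3, 3]

counts = cross_tabulate(species, clusters)
for name in sorted(set(species)):
    # One row per species, one column per cluster number.
    row = [counts[(name, c)] for c in sorted(set(clusters))]
    print(name, row)
# setosa [2, 0, 0]
# versicolor [0, 1, 1]
# virginica [0, 0, 2]
```

A row with a single non-zero entry means that species sits entirely in one cluster. A row spread over several columns means the clustering splits that species.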