
Quiz No 01: Pattern Recognition

This document discusses K-means clustering, an unsupervised machine learning algorithm. It takes input data from the user including the dimension, number of data points, and number of clusters. Random data is generated and initial centroid locations are randomly assigned. The algorithm calculates the Euclidean distance between each data point and centroid and assigns the point to the closest centroid. It then recalculates the centroid locations as the mean of the assigned points. This process repeats for 6 iterations or until convergence, reassigning points and recalculating centroids each time. Plots show the initial random data and centroids, and the progress of clustering across iterations until a final clustered output is achieved.


QUIZ NO 01

Pattern Recognition

OCTOBER 20, 2019


SAHIB ULLAH
CMS 278075
K Means Clustering
Clustering is used to group similar data points together. It is an unsupervised technique:
only the input data is given to the algorithm, with no labels.
I plot the results with the scatter command, which I use for up to two dimensions; for more
dimensions MATLAB may need some other command. I use a total of 6 iterations. If the
clustering converges before that, the remaining iterations just repeat the same plot, and it almost always does.
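For three-dimensional data, one candidate for the "other command" is MATLAB's scatter3. The following is a minimal sketch under stated assumptions, not part of the original quiz code: the values of D, N and K are chosen only for illustration, and the labels in c are random placeholders standing in for real K-means assignments.

```matlab
% Assumed example sizes: 3-D data, 50 points, 3 clusters
D = 3; N = 50; K = 3;
xn = floor(rand(D, N) * 255);   % random integer data, same style as the quiz script
c = randi(K, 1, N);             % placeholder labels (assumption, not K-means output)
colors = 'rgb';                 % one color per cluster

figure
hold on
for j = 1:K
    idx = (c == j);             % points currently labeled as cluster j
    scatter3(xn(1, idx), xn(2, idx), xn(3, idx), 36, colors(j), 'o');
end
hold off
view(3)                         % switch the axes to a 3-D view
grid on
xlabel('x_1'); ylabel('x_2'); zlabel('x_3');
```

Data with more than three dimensions cannot be shown directly this way; plotting a chosen pair or triple of coordinates is one option.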

Code
clear all;
close all;
clc

[D,N,K] = Input_Data();
% Take the input data from the user:
% D is the dimension of the data (2-D or higher)
% N is the total number of data points
% K is the total number of clusters
% Specifying K does not make K-means a supervised method; clustering
% can be supervised only in special cases
% Randomly generate the data
xn = floor(rand(D,N)*255); % floor drops the fractional part, giving integers 0..254
u = floor(rand(D,K)*255);  % u holds the K means (centroids), placed randomly at first
c = zeros(1,N);            % c(i) is the cluster index of point i; 0 means not yet assigned

show_data(xn,K,u,c,1) % plot the initial data and the random centroids

for ite = 1:6 % a fixed total of 6 iterations; if the clustering converges
              % before the 6th, the remaining iterations just repeat the plot
    % Cluster assignment
    for i = 1:N % 1st loop, over all N data points
        dist = zeros(1,K); % distance from point i to each of the K means
        for j = 1:K % 2nd loop, over all K clusters
            % norm computes the Euclidean norm by default
            dist(j) = norm(xn(:,i) - u(:,j)); % distance between data point xn(:,i) and mean u(:,j)
        end
        [~, c(i)] = min(dist); % keep the index (argmin) of the nearest mean, not the distance value
    end
    % Once every point is assigned to a mean, recompute each mean from its
    % assigned points, and repeat until convergence
    % Compute the new cluster means
    for j = 1:K % loop over the K clusters (user input)
        c_sum = zeros(D,1); % running sum with D rows and one column
        n = 0; % number of points assigned to cluster j
        for i = 1:N % loop over all N data points
            if (c(i)==j)
                c_sum = c_sum + xn(:,i); % numerator of the mean formula
                n = n+1; % denominator of the mean formula for u_j
            end
        end
        if n > 0 % an empty cluster keeps its old mean instead of becoming NaN (0/0)
            u(:,j) = c_sum/n; % u_j
        end
    end
    show_data(xn,K,u,c,ite+1) % plot the data after this iteration (figures 2 to 7)
end

show_data(xn,K,u,c,ite+2) % plot the final clustered data

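The assignment and update steps can also be written with less looping. The sketch below shows one possible vectorized form of a single iteration. It is not the author's code: the toy data, the deliberately far-away third centroid, and the names d2, delta and members are assumptions for illustration. It compares squared distances, which have the same minimizer as the Euclidean norm, and it leaves a centroid unchanged when no point is assigned to it.

```matlab
% Toy data (assumed for illustration): two tight groups in 2-D
xn = [0 1 0 10 11 10; 0 0 1 10 10 11];   % D-by-N data matrix
u  = [0 10 100; 0 10 100];               % D-by-K means; the third is far from all data
N = size(xn, 2);
K = size(u, 2);

% Assignment step: squared distance from every point to every mean
d2 = zeros(K, N);
for j = 1:K
    delta = xn - repmat(u(:, j), 1, N);  % differences to mean j, D-by-N
    d2(j, :) = sum(delta.^2, 1);
end
[~, c] = min(d2, [], 1);                 % index of the nearest mean for each point

% Update step: each mean becomes the average of its assigned points
for j = 1:K
    members = (c == j);
    if any(members)                      % an empty cluster keeps its previous mean
        u(:, j) = mean(xn(:, members), 2);
    end
end
```

Without the any(members) test, the third centroid here would become NaN, because no point is assigned to it.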
Function
function [] = show_data(xn,K,u,c,ite)
% Plot the data points and the K means, colored by cluster (2-D data only)
N = size(xn,2); % number of data points
colors = 'rgbymc'; % red, green, blue, yellow, magenta, cyan for clusters 1 to 6
figure(ite) % one figure per iteration
for data = 1:N
    color = 'k'; % black for unassigned points (c = 0) or clusters beyond the 6th
    if (c(data) >= 1 && c(data) <= 6)
        color = colors(c(data));
    end
    scatter(xn(1,data),xn(2,data),'o',color) % the scatter command is used for 2-D data only
    hold on
end

for points = 1:K
    color = 'k';
    if (points <= 6)
        color = colors(points);
    end
    scatter(u(1,points),u(2,points),'x',color) % mark each mean with an x in its cluster color
    hold on
end
hold off

end
Function
function [D,N,K] = Input_Data()
% Take the inputs from the user
D = input('Enter Dimension of Data: ');
N = input('Enter Number of Points: ');
K = input('Enter Number of Clusters: ');

end
Plots
User Input and Random Data with random centroids

First Euclidean distance computation, assigning each point to the nearest centroid/mean


After the 4th iteration, convergence starts
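
The fixed count of 6 iterations could be replaced by a stopping test. The following is a minimal sketch, assuming that convergence means no point changes cluster between two consecutive iterations. The toy data, the starting centroids, and the names max_ite and c_old are assumptions and are not part of the original script.

```matlab
% Toy data and starting means (assumed for illustration)
xn = [0 1 0 10 11 10; 0 0 1 10 10 11];   % D-by-N data matrix
u  = [0 1; 0 0];                         % D-by-K starting means
N = size(xn, 2);
K = size(u, 2);
c = zeros(1, N);                         % 0 means not yet assigned
max_ite = 20;                            % safety cap on the number of iterations

for ite = 1:max_ite
    c_old = c;                           % remember the previous assignments
    for i = 1:N
        dist = zeros(1, K);
        for j = 1:K
            dist(j) = norm(xn(:, i) - u(:, j));
        end
        [~, c(i)] = min(dist);           % nearest mean for point i
    end
    if isequal(c, c_old)                 % nothing changed, so the means will not move
        break
    end
    for j = 1:K
        if any(c == j)                   % skip empty clusters
            u(:, j) = mean(xn(:, c == j), 2);
        end
    end
end
```

After the loop, ite holds the iteration at which the assignments stopped changing, or max_ite if they never did.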
