Assignment 4

The document is a Jupyter notebook that implements the K-means clustering algorithm using Python libraries such as pandas and numpy. It reads data from a text file, initializes random centroids, and iteratively assigns data points to clusters based on distance until convergence. Finally, it visualizes the clustered data points using matplotlib.

8/20/22, 7:33 PM 180104072_Assignment4.ipynb - Colaboratory

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

from google.colab import drive

drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive

main_directory = '/content/gdrive/MyDrive/Pattern Lab/Assignment4/data_k_mean.txt'


train = pd.read_csv(main_directory, sep=' ', header=None)
train1 = train.to_numpy()
print(train1)

[[-7.87157 -4.86573]
[-4.76661 -6.87944]
[-6.67986 -5.8308 ]
...
[ 6.91832 -0.32132]
[-8.23828 -4.00405]
[-5.75112 -5.99531]]

#plotting all data points


plt.scatter(train[0], train[1], c = 'blue', marker = 's')
plt.show()

#taking k as input
k = int(input("Enter the value of k : "))

Enter the value of k : 2
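The cell above passes the raw input straight to `int`, so a value such as 0 or a number larger than the data set would only fail later, inside the clustering loop. A minimal sketch of a range check follows; the helper name `parse_k` is hypothetical and not part of the notebook.

```python
def parse_k(text: str, n_points: int) -> int:
    """Parse the requested cluster count and check that it is usable.

    Hypothetical helper: the notebook itself calls int(input(...)) directly.
    """
    k = int(text)  # raises ValueError for non-numeric input
    if not 1 <= k <= n_points:
        raise ValueError(f"k must be between 1 and {n_points}, got {k}")
    return k
```

In the notebook this could be used as `k = parse_k(input("Enter the value of k : "), len(train1))`.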

#random centroids for 1st iteration


np.random.seed(seed=72)
random_numbers = np.random.randint(low=0, high=len(train1), size=(k))
centroids = [train1[random_numbers[i]] for i in range(k)]


print(centroids)

[array([ 6.80375, -0.13017]), array([6.68468, 0.85224])]
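One caveat with `np.random.randint` is that it samples with replacement, so for larger `k` it can return the same row index twice and produce two identical starting centroids. A possible alternative, sketched below, draws distinct indices with NumPy's `Generator.choice`. The function name `pick_initial_centroids` and the demo array are assumptions for illustration, and the generator will not reproduce the centroids printed above even with the same seed value.

```python
import numpy as np

def pick_initial_centroids(points: np.ndarray, k: int, seed: int = 72) -> np.ndarray:
    """Return k distinct rows of `points` to use as starting centroids."""
    rng = np.random.default_rng(seed)
    # replace=False guarantees that no row index is drawn twice
    indices = rng.choice(len(points), size=k, replace=False)
    return points[indices]

# Small made-up data set, only to show the call
demo_points = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
demo_centroids = pick_initial_centroids(demo_points, k=3)
```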

distance = []  # distances from the current point to each centroid

index_clusters = [-1 for i in range(len(train1))]  # cluster index assigned to each data point
count = 0      # number of the last iteration executed
clusters = {}  # cluster index -> list of data points assigned to it
for x in range(500):
    count = x
    # flag records whether any point changed cluster in this iteration
    flag = 0
    for y in range(k):
        clusters[y] = []
    # assign every data point to its nearest centroid
    for i in range(len(train1)):
        distance = []
        for j in range(k):
            # Euclidean distance; the exported line is cut off after the first term,
            # so the second-coordinate term below is an assumed reconstruction
            dist = np.sqrt(pow(abs((train1[i][0] - centroids[j][0])), 2) + pow(abs((train1[i][1] - centroids[j][1])), 2))
            distance.append(dist)
        index = distance.index(min(distance))
        # check whether the assignment of this point changed
        if index_clusters[i] != index:
            flag = 1
        index_clusters[i] = index
        clusters[index].append(train1[i])
    # stop once no point changed cluster (converged)
    if flag == 0:
        break
    # calculating new centroids
    centroids = [np.mean(np.asarray(clusters[z]), axis=0) for z in range(k)]
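The same assign-then-update loop can also be written with NumPy broadcasting instead of nested Python loops. The sketch below is one possible formulation, not the notebook's own code: the function name `kmeans` and the toy data are assumptions, and unlike the loop above it keeps the previous centroid when a cluster ends up empty rather than taking the mean of an empty array.

```python
import numpy as np

def kmeans(points, centroids, max_iter=500):
    """Assign points to the nearest centroid and update until labels stop changing."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float).copy()
    labels = np.full(len(points), -1)
    for _ in range(max_iter):
        # (n_points, k) matrix of Euclidean distances via broadcasting
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no assignment changed, so the clustering has converged
        labels = new_labels
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Toy data with two obvious groups, for illustration only
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
toy_labels, toy_centroids = kmeans(toy_points, [[0.0, 0.0], [10.0, 10.0]])
```

With the notebook's variables this would be called as `kmeans(train1, centroids)`.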

x1 = np.asarray(clusters[0])[:, 0]
y1 = np.asarray(clusters[0])[:, 1]

x2 = np.asarray(clusters[1])[:, 0]
y2 = np.asarray(clusters[1])[:, 1]

# plotting classified data points of two classes with different colored marker
plt.scatter(x1, y1, c = 'red', marker = 's', label = 'Class 1')
plt.scatter(x2, y2, c = 'green', marker = 'H', label = 'Class 2')
plt.legend(loc = 'best')
plt.show()
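The plotting cell above hard-codes clusters 0 and 1, so it only covers the case k = 2 even though `k` is read from the user. A hedged sketch of a version that works for any `k` is shown below; `split_by_label` and `plot_clusters` are hypothetical helper names, and the colors come from matplotlib's default color cycle instead of the fixed red and green used above.

```python
import numpy as np

def split_by_label(points, labels, k):
    """Group the rows of `points` by cluster label; returns a list of k arrays."""
    points = np.asarray(points)
    labels = np.asarray(labels)
    return [points[labels == j] for j in range(k)]

def plot_clusters(points, labels, k):
    """Draw one scatter series per cluster, labelled 'Class 1' to 'Class k'."""
    import matplotlib.pyplot as plt  # imported here because only this function draws
    for j, members in enumerate(split_by_label(points, labels, k)):
        plt.scatter(members[:, 0], members[:, 1], marker='s', label=f'Class {j + 1}')
    plt.legend(loc='best')
    plt.show()

# Made-up points and labels, only to show the grouping step
demo_groups = split_by_label([[0, 0], [1, 1], [5, 5]], [0, 0, 1], 2)
```

In the notebook the call would be `plot_clusters(train1, index_clusters, k)`, since `index_clusters` already holds one label per row of `train1`.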
