
DEPARTMENT OF

COMPUTER SCIENCE & ENGINEERING

Experiment: 2.3

Student Name: Rishit Thakur UID: 21BCS2468


Branch: CSE Section/Group: 611-B
Semester: 5th Date of Performance: 17/10/23
Subject Name: AIML Lab Subject Code: 21CSP-316

Aim: To determine the optimal number of clusters using the K-means algorithm and to classify data using the K-nearest neighbour algorithm.

Objective: The objective is to implement K-means clustering and the K-nearest neighbour classifier on synthetic and real datasets.


Code:

 K-means clustering

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Generate a synthetic dataset with two clusters
X, _ = make_blobs(n_samples=300, centers=2, random_state=42)

# Create a DataFrame for easier data manipulation and visualization
data = pd.DataFrame(X, columns=['Feature 1', 'Feature 2'])

# Initialize the K-means clustering algorithm with the number of clusters
n_clusters = 2
kmeans = KMeans(n_clusters=n_clusters, random_state=42)

# Fit the K-means model to the data
data['Cluster'] = kmeans.fit_predict(X)

# Get cluster centers
cluster_centers = kmeans.cluster_centers_

# Visualize the clusters
plt.figure(figsize=(8, 6))
for cluster_label in range(n_clusters):
    cluster_data = data[data['Cluster'] == cluster_label]
    plt.scatter(cluster_data['Feature 1'], cluster_data['Feature 2'], label=f'Cluster {cluster_label}')

# Plot cluster centers as black 'X'
plt.scatter(cluster_centers[:, 0], cluster_centers[:, 1], s=300, c='black', marker='X', label='Cluster Centers')
plt.title("K-means Clustering")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.legend()
plt.show()
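
The run above fixes the number of clusters by hand. To address the aim of determining the optimal number of clusters, a minimal sketch of the elbow method follows (an addition, assuming the same make_blobs data as above): the within-cluster inertia is plotted against k, and the "elbow" of the curve is read off visually.

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as above (assumption: two well-separated blobs)
X, _ = make_blobs(n_samples=300, centers=2, random_state=42)

# Fit K-means for a range of k values and record the inertia
# (sum of squared distances of samples to their nearest center)
k_values = range(1, 10)
inertias = []
for k in k_values:
    kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)
    kmeans.fit(X)
    inertias.append(kmeans.inertia_)

# The bend ("elbow") in this curve suggests the optimal number of clusters
plt.plot(k_values, inertias, marker='o')
plt.title("Elbow Method")
plt.xlabel("Number of clusters (k)")
plt.ylabel("Inertia")
plt.show()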

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Create a synthetic dataset with 3 clusters
n_samples = 300
n_clusters = 3
X, _ = make_blobs(n_samples=n_samples, centers=n_clusters, random_state=42)

# Initialize the K-means clustering algorithm with 3 clusters
kmeans = KMeans(n_clusters=n_clusters, random_state=42)

# Fit the K-means model to the data
kmeans.fit(X)

# Get cluster labels for each data point
labels = kmeans.labels_

# Get cluster centers
cluster_centers = kmeans.cluster_centers_

# Visualize the clusters
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(cluster_centers[:, 0], cluster_centers[:, 1], s=300, c='red', marker='X')
plt.title("K-means Clustering")
plt.show()
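
Eyeballing the scatter plot works for well-separated blobs, but the choice of k can also be scored numerically. A short sketch using scikit-learn's silhouette_score (an addition, not part of the original experiment) on the same three-blob data; higher scores indicate better-separated clusters.

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Same synthetic data as above (assumption: three blobs)
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Silhouette score is defined for 2 <= k < n_samples; higher is better
for k in range(2, 7):
    labels = KMeans(n_clusters=k, random_state=42, n_init=10).fit_predict(X)
    print(f"k={k}: silhouette score = {silhouette_score(X, labels):.3f}")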

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

# Load the Iris dataset
iris = load_iris()
X = iris.data

# Initialize the K-means clustering algorithm with a specified number of clusters (e.g., 3)
n_clusters = 3
kmeans = KMeans(n_clusters=n_clusters, random_state=42)

# Fit the K-means model to the data
kmeans.fit(X)

# Get cluster labels for each data point
labels = kmeans.labels_

# Get cluster centers
cluster_centers = kmeans.cluster_centers_

# Visualize the clusters
plt.figure(figsize=(8, 6))

# Scatter plot each data point coloured by cluster (first two features only)
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')

# Plot cluster centers as black 'X'
plt.scatter(cluster_centers[:, 0], cluster_centers[:, 1], s=300, c='black', marker='X', label='Cluster Centers')
plt.title("K-means Clustering")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.legend()
plt.show()
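
Note that the Iris data has four features, so the plot above shows only the first two and can misrepresent the cluster geometry. A hedged sketch (an addition, using PCA from sklearn.decomposition, not part of the original experiment) clusters in the full feature space and projects both the data and the centers onto two principal components for plotting.

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

# Cluster in the full 4-dimensional feature space
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
labels = kmeans.fit_predict(X)

# Project data and cluster centers onto the first two principal components
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
centers_2d = pca.transform(kmeans.cluster_centers_)

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=labels, cmap='viridis')
plt.scatter(centers_2d[:, 0], centers_2d[:, 1], s=300, c='black', marker='X')
plt.title("K-means Clusters of Iris (PCA projection)")
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.show()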

 K-nearest neighbour

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Create a synthetic dataset with two classes
X, y = make_classification(n_samples=200, n_features=2, n_informative=2, n_redundant=0, random_state=42)

# Split the dataset into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the KNN classifier with a specified number of neighbors (e.g., 3)
n_neighbors = 3
knn = KNeighborsClassifier(n_neighbors=n_neighbors)

# Fit the classifier to the training data
knn.fit(X_train, y_train)

# Make predictions on the test data
y_pred = knn.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy * 100:.2f}%')

# Visualize the decision boundary
plt.figure(figsize=(10, 6))
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))
Z = knn.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.RdBu)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.RdBu, edgecolor='k')
plt.title("K-nearest Neighbors Classifier")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Generate a synthetic dataset with two classes
X, y = make_classification(n_samples=300, n_features=2, n_informative=2, n_redundant=0, random_state=42)

# Split the dataset into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the KNN classifier with a specified number of neighbors (e.g., 3)
n_neighbors = 3
knn = KNeighborsClassifier(n_neighbors=n_neighbors)

# Fit the classifier to the training data
knn.fit(X_train, y_train)

# Make predictions on the test data
y_pred = knn.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy * 100:.2f}%')

# Visualize the decision boundary
plt.figure(figsize=(10, 6))
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))
Z = knn.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.RdBu)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.RdBu, edgecolor='k')
plt.title("K-nearest Neighbors Classifier")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()
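
A single 80/20 split gives a noisy accuracy estimate. As a final hedged extension (assuming scikit-learn's cross_val_score, which the original experiment does not use), 5-fold cross-validation averages the accuracy over several splits.

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Same synthetic data as above
X, y = make_classification(n_samples=300, n_features=2, n_informative=2, n_redundant=0, random_state=42)

# 5-fold cross-validation: train on 4 folds, score on the 5th, rotate
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X, y, cv=5)
print(f"Mean accuracy: {scores.mean() * 100:.2f}% (+/- {scores.std() * 100:.2f}%)")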

Learning Outcomes:
 Learn about clustering and how unlabelled data can be grouped.
 Learn about the K-means clustering algorithm and how to choose the number of clusters.
 Learn about the K-nearest neighbour classification algorithm and how to evaluate it with test accuracy.
