45B AIML Practical07 Clustering

Name of Student: Ahmed Mobin Ahmed Shaikh

Roll Number: 45    Lab Practical Number: 07

Title of Lab Assignment: Implementation and analysis of clustering algorithms like K-Means and K-Medoids.

DOP: 06/03/24    DOS: 06/03/24

CO Mapped: CO2    PO Mapped: PO3, PO5, PO6, PO7, PO11, PO12    Signature:

1. K-Means Clustering:


I. IMPORTING DATASET:

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('F1Drivers_Dataset.csv')
dataset

Output (truncated): the drivers dataframe, 868 rows × 22 columns. Visible columns include Driver, Nationality, Seasons, Championships, Race_Entries, Race_Starts and a truncated Pole_P… column; the displayed rows run from Carlo Abate (Italy) to Ricardo Zunino (Argentina).
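Before clustering, a quick sanity check on the two numeric columns used below is worthwhile (an illustrative cell, not part of the original notebook):

# Summary statistics and missing-value count for the clustering features
print(dataset[['Race_Entries', 'Race_Starts']].describe())
print("Missing values:\n", dataset[['Race_Entries', 'Race_Starts']].isna().sum())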

EXTRACTING INDEPENDENT VARIABLES:


SOURCE CODE:

x = dataset[['Race_Entries', 'Race_Starts']]

# Finding the optimal number of clusters using the elbow method
from sklearn.cluster import KMeans

wcss_list = []  # within-cluster sum of squares (WCSS) for each k
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', max_iter=300, n_init=10, random_state=0)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)

mtp.plot(range(1, 11), wcss_list)
mtp.title('The Elbow Method Graph')
mtp.xlabel('Number of clusters (k)')
mtp.ylabel('WCSS')
mtp.show()
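The elbow is normally read off the plot by eye. As a rough programmatic check (an illustrative sketch, not in the original notebook), one can pick the k where the WCSS curve bends most sharply:

# Largest second difference in WCSS as a crude curvature proxy
wcss = nm.array(wcss_list)
second_diff = wcss[:-2] - 2 * wcss[1:-1] + wcss[2:]
elbow_k = int(nm.argmax(second_diff)) + 2  # differences are centred on k = 2..9
print("Suggested elbow at k =", elbow_k)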


TRAINING K-MEANS MODEL ON A DATASET:


SOURCE CODE:

# Applying K-Means with 5 clusters
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)

# Visualizing the clusters
mtp.scatter(x[y_predict == 0]['Race_Entries'], x[y_predict == 0]['Race_Starts'], s=100, c='blue', label='Cluster 1')
mtp.scatter(x[y_predict == 1]['Race_Entries'], x[y_predict == 1]['Race_Starts'], s=100, c='green', label='Cluster 2')
mtp.scatter(x[y_predict == 2]['Race_Entries'], x[y_predict == 2]['Race_Starts'], s=100, c='red', label='Cluster 3')
mtp.scatter(x[y_predict == 3]['Race_Entries'], x[y_predict == 3]['Race_Starts'], s=100, c='cyan', label='Cluster 4')
mtp.scatter(x[y_predict == 4]['Race_Entries'], x[y_predict == 4]['Race_Starts'], s=100, c='magenta', label='Cluster 5')

# Plotting centroids
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='yellow', label='Centroid')

mtp.title('Clusters of Driver data')
mtp.xlabel('Race_Entries')
mtp.ylabel('Race_Starts')
mtp.legend()
mtp.show()

Output: a truncated FutureWarning from scikit-learn's KMeans (the default value of n_init is changing in a future release; passing n_init explicitly silences it), followed by the cluster plot.
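The five clusters can also be scored numerically. A minimal sketch (an illustrative addition, reusing x and y_predict from the cell above) applies the same silhouette score used later in the K-Medoids section:

# Illustrative: silhouette score for the 5-cluster K-Means fit
from sklearn.metrics import silhouette_score

score = silhouette_score(x, y_predict)
print("Silhouette score for k=5:", score)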


2. K-Medoids:
I. Importing Packages & Loading Dataset:

SOURCE CODE:

!pip install scikit-learn-extra

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn_extra.cluster import KMedoids
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score
from sklearn import datasets, metrics

# Load the Wine dataset
wine = datasets.load_wine()
x = wine.data    # Features
y = wine.target  # Target labels
wine

Output (truncated): pip downloads and installs scikit-learn-extra 0.3.0 (numpy, scipy, scikit-learn and their dependencies already satisfied), then the cell prints the full Wine Bunch object: a 178 × 13 data array, a target array of labels 0/1/2, target_names ['class_0', 'class_1', 'class_2'], and the DESCR text (13 numeric attributes from Alcohol to Proline; class distribution 59/71/48; creator R.A. Fisher; a copy of the UCI ML Wine recognition dataset).

II. Scaling and Fitting KMedoids:

SOURCE CODE:

# Scaling the features
scaler = StandardScaler().fit(x)
x_scaled = scaler.transform(x)

# Fitting K-Medoids with 3 clusters (one per wine class)
kMedoids = KMedoids(n_clusters=3, random_state=0)
kMedoids.fit(x_scaled)
y_kmed = kMedoids.predict(x_scaled)
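Unlike K-Means centroids, K-Medoids cluster centers are actual data points. As a small illustrative check (assuming the fitted kMedoids object above; medoid_indices_ is part of the sklearn-extra API), the chosen medoids can be inspected directly:

# The medoids are real rows of x_scaled, not averaged points
print("Medoid row indices:", kMedoids.medoid_indices_)
print("Medoid feature vectors (scaled):")
print(kMedoids.cluster_centers_)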

III. Silhouette Method to evaluate clusters:


SOURCE CODE:

silhouette_avg = silhouette_score(x_scaled, y_kmed)
print("Silhouette Score:", silhouette_avg)

Silhouette Score: 0.26597740204536796

(Silhouette values range from -1 to 1; a score near 0.27 indicates the three clusters are only weakly separated.)

IV. Silhouette Width to find the number of clusters:


SOURCE CODE:

sw = []
for i in range(2, 11):
    # Note: this loop reuses the kMedoids / y_kmed names fitted above
    kMedoids = KMedoids(n_clusters=i, random_state=0)
    kMedoids.fit(x_scaled)
    y_kmed = kMedoids.predict(x_scaled)
    silhouette_avg = silhouette_score(x_scaled, y_kmed)
    sw.append(silhouette_avg)

plt.plot(range(2, 11), sw)
plt.title('Silhouette Score')
plt.xlabel('Number of clusters')
plt.ylabel('Silhouette Width')
plt.show()
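The best k is again read off the plot; a short sketch (illustrative, assuming sw from the loop above) picks it programmatically:

# sw[0] corresponds to k=2, sw[8] to k=10
best_k = 2 + int(np.argmax(sw))
print("Best k by silhouette width:", best_k)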

V. Computing Purity:


SOURCE CODE:

def purity_score(y_true, y_pred):
    # Count, per cluster, the size of the majority true class,
    # then divide by the total number of samples
    contingency_matrix = metrics.cluster.contingency_matrix(y_true, y_pred)
    return np.sum(np.amax(contingency_matrix, axis=0)) / np.sum(contingency_matrix)

print("Purity Score:", purity_score(y, y_kmed))


Purity Score: 0.898876404494382
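As a worked toy check of the function (illustrative, not in the original notebook): if cluster 0 holds true labels [0, 0, 1] and cluster 1 holds [1, 1, 1], the per-cluster majority counts are 2 and 3, so purity = (2 + 3) / 6 ≈ 0.83.

y_true_toy = np.array([0, 0, 1, 1, 1, 1])
y_pred_toy = np.array([0, 0, 0, 1, 1, 1])  # cluster 0 holds labels 0, 0, 1
print(purity_score(y_true_toy, y_pred_toy))  # (2 + 3) / 6 = 0.8333...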

VI. How extreme values affect K-Medoids compared to K-Means:

(K-Medoids purity on the clean data was 0.899 in section V; the cell below gives the K-Means baseline on the same data.)


SOURCE CODE:

# K-Means on the same (clean) scaled data, for comparison with K-Medoids
kmeans = KMeans(n_clusters=3, init='random', max_iter=300, n_init=10, random_state=0)
y_kmeans = kmeans.fit_predict(x_scaled)
print("Purity Score for K-Means:", purity_score(y, y_kmeans))

Purity Score for K-Means: 0.9662921348314607

VII. Plotting values:


SOURCE CODE:

plt.scatter(x_scaled[y_kmeans == 0, 0], x_scaled[y_kmeans == 0, 1], s=100, c='red', label='Cluster 1')
plt.scatter(x_scaled[y_kmeans == 1, 0], x_scaled[y_kmeans == 1, 1], s=100, c='blue', label='Cluster 2')
plt.scatter(x_scaled[y_kmeans == 2, 0], x_scaled[y_kmeans == 2, 1], s=100, c='green', label='Cluster 3')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='yellow', label='Centroids')
plt.legend()
plt.show()

VIII. Adding extreme values:


SOURCE CODE:

# Add three extreme observations (outliers) to the dataset
extreme_values = np.array([[10] * 13,
                           [15] * 13,
                           [12] * 13])
m = np.append(x, extreme_values, axis=0)
y_extreme = np.append(y, [2, 2, 2])
print(y_extreme)
print("Three outlier observations have been added:", m.shape)

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]
Three outlier observations have been added: (181, 13)

SOURCE CODE:


scaler = StandardScaler().fit(m)
x_scaled_extreme = scaler.transform(m)

# Perform K-Means clustering on the outlier-augmented data
kmeans_extreme = KMeans(n_clusters=3, init='random', max_iter=300, n_init=10, random_state=0)
y_kmeans_extreme = kmeans_extreme.fit_predict(x_scaled_extreme)

# Calculate purity score
purity_extreme = purity_score(y_extreme, y_kmeans_extreme)
print(purity_extreme)

0.7016574585635359
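Section VI names a comparison, so the K-Medoids side on the same outlier-augmented data is sketched below (an illustrative cell the original notebook does not show; its result is not reproduced here). Because medoids are actual data points, a few extreme values shift them far less than they shift K-Means centroids:

# Illustrative: K-Medoids on the outlier-augmented data
kmed_extreme = KMedoids(n_clusters=3, random_state=0)
y_kmed_extreme = kmed_extreme.fit_predict(x_scaled_extreme)
print("Purity Score for K-Medoids with outliers:", purity_score(y_extreme, y_kmed_extreme))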

IX. Plot:


SOURCE CODE:

plt.scatter(x_scaled_extreme[y_kmeans_extreme == 0, 0], x_scaled_extreme[y_kmeans_extreme == 0, 1], s=100, c='red', label='C1')
plt.scatter(x_scaled_extreme[y_kmeans_extreme == 1, 0], x_scaled_extreme[y_kmeans_extreme == 1, 1], s=100, c='blue', label='C2')
plt.scatter(x_scaled_extreme[y_kmeans_extreme == 2, 0], x_scaled_extreme[y_kmeans_extreme == 2, 1], s=100, c='green', label='C3')
plt.scatter(kmeans_extreme.cluster_centers_[:, 0], kmeans_extreme.cluster_centers_[:, 1], s=100, c='yellow', label='Centroids')
plt.legend()
plt.show()

Output: scatter plot of the K-Means clusters (first two scaled features) on the outlier-augmented data.

SOURCE CODE:

# Bar chart comparing purity across methods; three of the values are
# hard-coded rather than computed in this notebook
data = [['k-Means', 0.81], ['k-Means with Outliers', purity_extreme],
        ['k-Medoid', 0.84], ['k-Medoid with Outliers', 0.86]]
df = pd.DataFrame(data, columns=['Method', 'Purity'])
df.plot.bar(x='Method', y='Purity', title='Cluster Quality')
plt.show()
