
Experiment No. 10

Objective:
Program to implement K-Medoids in machine learning.

Apparatus required:
PC with Jupyter Notebook or Google Colab.

Theory:

K-Medoids clustering is a data mining technique that groups data points into a predefined
number of clusters (k). It shares similarities with the widely used K-Means algorithm, but
with a key distinction: K-Medoids employs actual data points (medoids) as cluster centers
instead of means (centroids) calculated from data points within a cluster. This approach
makes K-Medoids particularly advantageous in scenarios where datasets exhibit:

 Non-Euclidean Distances: When data points are compared using distance metrics
other than Euclidean distance (e.g., Manhattan distance, Hamming distance), K-Medoids
is often more suitable, because it requires only pairwise dissimilarities between points
and never computes a mean, which may not even be well defined under such metrics.
 Presence of Outliers: Outliers can pull K-Means centroids far away from the bulk of a
cluster, potentially leading to suboptimal cluster formation. Because K-Medoids centers
must be actual data points, it is less susceptible to such distortions, as the sketch
below illustrates.

Key concepts:

K-Medoids clustering is grounded in the principle of minimizing dissimilarity within
clusters. Here's a breakdown of the core concepts:

 Distance Metric: A distance metric, denoted by d(x, y), quantifies the dissimilarity
between two data points x and y. Common choices include Euclidean distance (||x -
y||), Manhattan distance (Σ |x_i - y_i|), and Hamming distance (number of differing
elements).
 Medoid: A medoid is a data point within a cluster that is centrally located relative to
other points in that cluster. It serves as the representative point for the cluster,
minimizing the sum of distances between the medoid and all other points within the
cluster.
 Cost Function: The cost function, denoted by J(C), measures the total dissimilarity
within all clusters. In K-Medoids it is the sum of distances between each data point
and the medoid of its assigned cluster: J(C) = Σ over clusters C_i of Σ over points
x in C_i of d(x, m_i), where m_i is the medoid of cluster C_i. A small worked sketch
follows this list.
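To make these definitions concrete, here is a minimal sketch in plain NumPy (function names are illustrative, not from any library) that computes the three distance metrics above and evaluates the cost function J(C) for a given assignment of points to medoids:

import numpy as np

def euclidean(x, y):
    return np.linalg.norm(x - y)    # ||x - y||

def manhattan(x, y):
    return np.sum(np.abs(x - y))    # Σ |x_i - y_i|

def hamming(x, y):
    return np.sum(x != y)           # number of differing elements

def kmedoids_cost(X, medoid_indices, labels, d=euclidean):
    """J(C): sum of distances from each point to its cluster's medoid."""
    return sum(d(x, X[medoid_indices[labels[i]]]) for i, x in enumerate(X))

# Tiny example: 4 points, 2 clusters whose medoids are points 0 and 3
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 4.0]])
labels = np.array([0, 0, 1, 1])          # cluster assignment per point
print(kmedoids_cost(X, [0, 3], labels))  # -> 2.0 (0 + 1 + 1 + 0)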

K-Medoids Algorithm

The K-Medoids algorithm follows a step-by-step process to partition data points into k
clusters:

1. Initialization:
o Define the number of clusters (k).
o Select k data points randomly (or using an initialization strategy) as initial
medoids.
2. Assignment Step:
o For each data point x:
 Calculate the distance between x and each medoid using the chosen
distance metric.
 Assign x to the cluster that has the medoid closest to x.
3. Swapping Step (Optimization):
o For each cluster c:
 For each non-medoid data point x in c:
 Temporarily swap x with the current medoid of c.
 Recompute the cost function J(C) after the swap.
 If the swap reduces the cost function, make the swap permanent (update
the medoid of c).
4. Termination:
o Repeat steps 2 and 3 until no swap results in a lower cost function (convergence is
achieved). A compact from-scratch sketch of this loop is given below.
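The following is a minimal from-scratch sketch of the loop just described (pure NumPy; function and variable names are illustrative, not from any library). It uses Euclidean distance and the within-cluster swap rule from step 3:

import numpy as np

def simple_kmedoids(X, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    # Precompute the full pairwise Euclidean distance matrix
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    medoids = rng.choice(n, size=k, replace=False)   # step 1: random initialization

    for _ in range(max_iter):
        labels = np.argmin(D[:, medoids], axis=1)    # step 2: assignment
        improved = False
        for c in range(k):                           # step 3: swapping
            for x in np.where(labels == c)[0]:
                if x in medoids:
                    continue                         # only non-medoid points
                candidate = medoids.copy()
                candidate[c] = x                     # tentative swap
                # Cost J(C): each point's distance to its nearest medoid
                if D[:, candidate].min(axis=1).sum() < D[:, medoids].min(axis=1).sum():
                    medoids = candidate              # swap reduces cost: keep it
                    improved = True
        if not improved:                             # step 4: convergence
            break
    return medoids, np.argmin(D[:, medoids], axis=1)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc, 0.3, size=(20, 2)) for loc in ([0, 0], [3, 3], [0, 3])])
medoids, labels = simple_kmedoids(X, k=3)
print("Medoid indices:", medoids)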

Key Considerations and Advantages

 Choice of Distance Metric: Selecting an appropriate distance metric is crucial for
effective clustering. Consider the nature of your data and the relationships between
data points when making this decision.

 Initialization Strategies: While random initialization is a common starting point,
strategies that select medoids likely to be well positioned within clusters (e.g.,
k-medoids++-style seeding) can improve the algorithm's efficiency and lead to better
clusterings; see the sketch after this list.

 Time Complexity: The swap step makes K-Medoids computationally expensive: the classic
PAM algorithm costs on the order of O(k(n - k)^2) distance evaluations per iteration,
where n is the number of data points and k is the number of clusters, repeated for T
iterations until convergence. This makes K-Medoids considerably more expensive than
K-Means, especially for large datasets.

Applications of K-Medoids

K-Medoids clustering finds applications in various domains, including:

 Customer Segmentation: Grouping customers based on purchase history, demographics,
or behavior to personalize marketing campaigns.

 Image Segmentation: Identifying and grouping regions within an image that share
similar characteristics (e.g., color, texture) for object recognition.

 Document Clustering: Grouping documents based on content similarity for information
retrieval or topic modeling.

 Gene Expression Analysis: Identifying patterns in gene expression data to understand
biological processes or disease mechanisms.

Implementation Using Code:

!pip install https://github.com/scikit-learn-contrib/scikit-learn-extra/archive/master.zip

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn_extra.cluster import KMedoids

# Load the Iris dataset
iris = load_iris()
data = iris.data
target = iris.target  # Actual species labels (for evaluation)

# Define the number of clusters (k)
k = 3

# Initialize K-Medoids model
kmedoids = KMedoids(n_clusters=k, metric='euclidean', random_state=0)

# Fit the model to the data
kmedoids.fit(data)

# Get the cluster labels for each data point
predicted_cluster = kmedoids.labels_

# Print some results
print("Predicted cluster labels:", predicted_cluster)

# (Optional) Evaluate clustering performance (e.g., silhouette score)
from sklearn.metrics import silhouette_score

silhouette_coeff = silhouette_score(data, predicted_cluster)
print("Silhouette Coefficient:", silhouette_coeff)

# (Optional) Compare predicted clusters with actual species labels.
# Note: cluster IDs are arbitrary, so rows and columns may be permuted
# relative to the species labels.
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(target, predicted_cluster)
print("Confusion Matrix:\n", cm)

# Visualize the clustered data
plt.figure(figsize=(10, 5))

# Plot the original data (first two features: sepal length and width)
plt.subplot(1, 2, 1)
plt.scatter(data[:, 0], data[:, 1], c=target, cmap='viridis', edgecolor='k')
plt.title('Original Data')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')

# Plot the clustered data
plt.subplot(1, 2, 2)
plt.scatter(data[:, 0], data[:, 1], c=predicted_cluster, cmap='viridis', edgecolor='k')
plt.title('Clustered Data (K-Medoids)')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')

plt.tight_layout()
plt.show()

Result:
The program to implement K-Medoids clustering has been implemented and executed successfully.
