CLARA CLARANS Example

A presentation on CLARA and CLARANS

Uploaded by

tripathbikram

CLARA and CLARANS in Data Mining

Problem Setup:
We have a dataset of 10 points in 2D space, and we need to partition them into 2 clusters.

Here is the dataset:


| Point | X | Y |
|-------|----|----|
| P1 | 2 | 10 |
| P2 | 2 | 5 |
| P3 | 8 | 4 |
| P4 | 5 | 8 |
| P5 | 7 | 5 |
| P6 | 6 | 4 |
| P7 | 1 | 2 |
| P8 | 4 | 9 |
| P9 | 6 | 2 |
| P10 | 3 | 6 |
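The dataset and the Manhattan distance used throughout this example can be set up in a few lines of Python (a sketch; the names `points` and `manhattan` are our own, chosen to match the table above):

```python
# The 10-point dataset from the table above.
points = {
    "P1": (2, 10), "P2": (2, 5), "P3": (8, 4), "P4": (5, 8), "P5": (7, 5),
    "P6": (6, 4),  "P7": (1, 2), "P8": (4, 9), "P9": (6, 2), "P10": (3, 6),
}

def manhattan(a, b):
    """Manhattan (L1) distance between two 2D points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

print(manhattan(points["P1"], points["P4"]))  # 5
```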

CLARA (Clustering Large Applications) Example

1. Step 1: Subset Sampling


CLARA works by drawing multiple random samples (subsets) from the dataset, each of
size s, and then applying PAM (Partitioning Around Medoids) to each sample.

For simplicity, we take a single sample of s = 5 points:


- P1 (2, 10)
- P4 (5, 8)
- P6 (6, 4)
- P7 (1, 2)
- P9 (6, 2)
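Step 1's random sampling can be sketched with Python's `random.sample` (the seed is an arbitrary assumption, used only for reproducibility; the hand-picked subset above is just one such draw):

```python
import random

# All 10 point labels from the dataset table.
labels = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8", "P9", "P10"]

random.seed(42)                    # arbitrary seed (assumption)
subset = random.sample(labels, 5)  # one CLARA sample of size s = 5
print(subset)
```

A real CLARA run would repeat this draw several times, as Step 3 describes.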

2. Step 2: Apply PAM to Subset


We calculate the distance matrix between the points using the Manhattan distance:

|        | P1  | P4  | P6  | P7  | P9  |
|--------|-----|-----|-----|-----|-----|
| **P1** | 0   | 5   | 10  | 9   | 12  |
| **P4** | 5   | 0   | 5   | 10  | 7   |
| **P6** | 10  | 5   | 0   | 7   | 2   |
| **P7** | 9   | 10  | 7   | 0   | 5   |
| **P9** | 12  | 7   | 2   | 5   | 0   |
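Hand-computed distance tables are easy to get wrong, so it is worth recomputing the matrix programmatically (a sketch; `points` and `manhattan` are our own names for the subset and metric):

```python
# The 5-point subset chosen in Step 1.
points = {"P1": (2, 10), "P4": (5, 8), "P6": (6, 4), "P7": (1, 2), "P9": (6, 2)}

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

ids = list(points)
# Full pairwise Manhattan distance matrix, keyed by (row, column).
matrix = {(i, j): manhattan(points[i], points[j]) for i in ids for j in ids}

print(matrix[("P1", "P9")])  # 12
print(matrix[("P6", "P7")])  # 7
```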

Using PAM on the subset, we identify the medoids. Suppose we pick P4 and P9 as the
initial medoids and assign each remaining point to its closest medoid:
- P1 → P4
- P6 → P9
- P7 → P9

The clusters are:


- Cluster 1: P1, P4
- Cluster 2: P6, P7, P9
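The assignment step above can be sketched as follows (a sketch of PAM's assignment phase only, not its full swap search; the medoids are fixed to P4 and P9 as in the example):

```python
# The 5-point subset and the medoids chosen in the example.
points = {"P1": (2, 10), "P4": (5, 8), "P6": (6, 4), "P7": (1, 2), "P9": (6, 2)}
medoids = ["P4", "P9"]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

clusters = {m: [m] for m in medoids}  # each medoid belongs to its own cluster
cost = 0
for p, xy in points.items():
    if p in medoids:
        continue
    nearest = min(medoids, key=lambda m: manhattan(xy, points[m]))
    clusters[nearest].append(p)
    cost += manhattan(xy, points[nearest])

print(clusters)  # {'P4': ['P4', 'P1'], 'P9': ['P9', 'P6', 'P7']}
print(cost)      # 12  (= 5 + 2 + 5)
```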

3. Step 3: Repeat with Multiple Subsets


CLARA repeats the sampling and clustering several times. The final clustering uses the
medoid set with the lowest overall cost (the sum of distances from each point to its
nearest medoid), evaluated on the full dataset rather than only on the sample.
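Steps 1–3 together can be sketched as a small end-to-end CLARA loop (a sketch: the function names, seed, and parameters are our own assumptions, and exhaustive search over a 5-point sample stands in for PAM's swap heuristic):

```python
import random
from itertools import combinations

points = {
    "P1": (2, 10), "P2": (2, 5), "P3": (8, 4), "P4": (5, 8), "P5": (7, 5),
    "P6": (6, 4),  "P7": (1, 2), "P8": (4, 9), "P9": (6, 2), "P10": (3, 6),
}

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def cost(medoids, ids):
    # Sum of distances from each point in `ids` to its nearest medoid.
    return sum(min(manhattan(points[p], points[m]) for m in medoids) for p in ids)

def pam(ids, k):
    # Exhaustive best k-medoid set; cheap on a 5-point sample, and it stands
    # in for PAM's iterative swap heuristic (an assumption of this sketch).
    return min(combinations(ids, k), key=lambda ms: cost(ms, ids))

def clara(k=2, sample_size=5, n_samples=5, seed=0):
    random.seed(seed)  # arbitrary seed (assumption)
    best, best_cost = None, float("inf")
    for _ in range(n_samples):
        sample = random.sample(list(points), sample_size)
        medoids = pam(sample, k)
        c = cost(medoids, points)  # score each candidate on the FULL dataset
        if c < best_cost:
            best, best_cost = medoids, c
    return best, best_cost
```

Scoring each candidate medoid set on the full dataset, not just its sample, is what lets CLARA compare draws fairly.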

CLARANS (Clustering Large Applications based on Randomized Search) Example

1. Step 1: Initial Medoids


CLARANS starts with k randomly chosen medoids (here k = 2). Suppose we choose:
- Medoid 1: P1 (2, 10)
- Medoid 2: P6 (6, 4)

2. Step 2: Assign Points to Clusters


Assign each point to the closest medoid using Manhattan distance. Note that P2, P4,
and P10 are each equidistant from the two medoids, so their ties are broken arbitrarily; here:
- P1 → P1 (Medoid 1)
- P2 → P6 (Medoid 2)
- P3 → P6 (Medoid 2)
- P4 → P1 (Medoid 1)
- P5 → P6 (Medoid 2)
- P7 → P6 (Medoid 2)
- P8 → P1 (Medoid 1)
- P9 → P6 (Medoid 2)
- P10 → P1 (Medoid 1)
Clusters are:
- Cluster 1: P1, P4, P8, P10
- Cluster 2: P2, P3, P5, P6, P7, P9
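Step 2's assignment over the full dataset can be sketched in the same way as before (a sketch; note that `min()` breaks ties toward the first medoid in the list, so the tied points P2, P4, and P10 all land with P1 here, whereas the worked example sends P2 to P6 — both are valid tie-breaks):

```python
points = {
    "P1": (2, 10), "P2": (2, 5), "P3": (8, 4), "P4": (5, 8), "P5": (7, 5),
    "P6": (6, 4),  "P7": (1, 2), "P8": (4, 9), "P9": (6, 2), "P10": (3, 6),
}
medoids = ["P1", "P6"]  # the randomly chosen initial medoids from Step 1

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

clusters = {m: [] for m in medoids}
for p, xy in points.items():
    # min() keeps the first medoid on ties (P2, P4, P10 are equidistant).
    nearest = min(medoids, key=lambda m: manhattan(xy, points[m]))
    clusters[nearest].append(p)

print(clusters["P1"])  # ['P1', 'P2', 'P4', 'P8', 'P10']
print(clusters["P6"])  # ['P3', 'P5', 'P6', 'P7', 'P9']
```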

3. Step 3: Random Search for Better Medoids


CLARANS randomly selects a point that is not a medoid and considers swapping it with
one of the current medoids, then checks whether the overall cost (sum of distances)
decreases. If it does, the swap is kept and the search continues from the new medoid
set; otherwise another random swap is tried, up to a fixed maximum number of neighbors.

4. Step 4: Final Clustering


After several iterations, CLARANS finalizes the clustering when no further improvements
are found. The resulting clusters will be based on the medoids that minimize the clustering
cost.

Conclusion:

- CLARA optimizes by sampling and using PAM, but it can miss the global optimum because
it only evaluates a small subset of data.
- CLARANS uses a randomized search over medoid swaps on the full dataset, allowing it
to explore more candidate medoid sets and typically find a better clustering.
