Partitioning Methods

Partitioning methods divide data into k partitions or clusters, where each object belongs to exactly one cluster. Typical partitioning methods include k-means and k-medoids, which group objects based on their distance to cluster centers or medoids. When the data set is large, CLARA and CLARANS can be used; they apply partitioning to samples of the data rather than the entire data set.

Partitioning methods

Partitioning methods: Given a database of n objects or data tuples, a partitioning
method constructs k partitions of the data, where each partition represents a cluster
and k <= n.
 Requirements:
 Each group must contain at least one object
 Each object must belong to exactly one group
 Typical methods: k-means, k-medoids
1. Hierarchical approach:
 Create a hierarchical decomposition of the set of data (or objects) using
some criterion (agglomerative or divisive)
 Typical methods: DIANA, AGNES, BIRCH, CHAMELEON
 A clustering is a set of clusters
Important distinction between hierarchical and partitional sets of clusters:
Partitioning clustering
 A division of data objects into non-overlapping subsets (clusters) such
that each data object is in exactly one subset
Hierarchical clustering
 A set of nested clusters organized as a hierarchical tree
2. Density-based methods:
 Developed based on the notion of density
 The general idea is to continue growing the given cluster as long as the density
(number of objects or data points) in the neighborhood exceeds some threshold
3. Grid-based methods:
 Quantize the object space into a finite number of cells that form a grid structure
 The advantage of this approach is its fast processing time, e.g., STING

Center-based
 A cluster is a set of objects such that an object in a cluster is closer (more
similar) to the “center” of its cluster than to the center of any other cluster.
 The center of a cluster is often a centroid, the average of all the points in the
cluster, or a medoid, the most “representative” point of a cluster.
1. Partitioning methods:
1. k-means: The k-means algorithm partitions the data so that each cluster’s center is
represented by the mean value of the objects in the cluster (a minimal sketch follows
the properties list below).
K-means properties

Advantages
 K-means is relatively scalable and efficient in processing large data sets.

Disadvantages
 Can be applied only when the mean of a cluster is defined
 Users need to specify k
 K-means is not suitable for discovering clusters with non-convex shapes or
clusters of very different sizes
 It is sensitive to noise and outlier data points (which can distort the mean value)
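
To make the iteration concrete, here is a minimal k-means sketch in Python with
NumPy. It only illustrates the assign-to-nearest-center / recompute-means loop
described above and is not a reference implementation; the function and parameter
names (kmeans, n_iter, seed) are my own.

import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct objects chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each object to its nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of the objects assigned to it.
        new_centers = np.array([X[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            break  # converged: the means no longer move
        centers = new_centers
    return centers, labels

# Tiny usage example: two well-separated blobs, k = 2.
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centers, labels = kmeans(X, k=2)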

2. K-medoids method: Minimize the sensitivity of k-means to outliers

• Pick actual objects to represent clusters instead of mean values

• Each remaining object is assigned to the representative object (medoid) to
which it is the most similar

• The algorithm minimizes the sum of the dissimilarities between each object and
its corresponding reference point

• E: the sum of absolute error for all objects in the data set

• p: the data point in the space representing an object

• oi: the representative object of cluster Ci
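
Putting these symbols together (the formula itself does not survive in these notes,
so this is the standard formulation implied by the definitions above):

E = Σ (i = 1..k) Σ (p in Ci) |p − oi|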

K-medoids method: the idea

• Initial representatives are chosen randomly.

• The iterative process of replacing representative objects by non-representative
objects continues as long as the quality of the clustering is improved:

• For each representative object O and each non-representative object R, swap
O and R.

• Choose the configuration with the lowest cost.

• The cost function is the difference in absolute-error value if a current
representative object is replaced by a non-representative object
(a sketch of this swap loop follows below).
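
A compact Python sketch of this swap procedure (essentially the PAM algorithm),
in the same style as the k-means example above. The helper names absolute_error
and pam are my own, and the brute-force scan over every (medoid, non-medoid)
swap is a simplification, not an optimized cost update.

import numpy as np

def absolute_error(X, medoids):
    # E = sum over all objects of the distance to their nearest medoid.
    dists = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2)
    return dists.min(axis=1).sum()

def pam(X, k, seed=0):
    rng = np.random.default_rng(seed)
    medoids = list(rng.choice(len(X), size=k, replace=False))
    improved = True
    while improved:
        improved = False
        best_E = absolute_error(X, medoids)
        # Try swapping each representative O with each non-representative R.
        for i in range(k):
            for r in range(len(X)):
                if r in medoids:
                    continue
                candidate = medoids.copy()
                candidate[i] = r
                E = absolute_error(X, candidate)
                if E < best_E:  # keep the configuration with the lowest cost
                    medoids, best_E = candidate, E
                    improved = True
    return medoids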

2. K-medoids properties (k-medoids vs. k-means)

Advantages

 The k-medoids method is more robust than k-means in the presence of noise and
outliers.

Disadvantages

• K-medoids is more costly than the k-means method

• Like k-means, k-medoids requires the user to specify k

• It does not scale well for large data sets

• For large values of n and k, the computation becomes very costly

Partitioning methods for large databases

3. CLARA

The k-medoids partitioning algorithm works effectively for small data sets but does
not scale well to large data sets. To deal with large data sets, CLARA (Clustering
LARge Applications) can be used.

CLARA (Kaufman and Rousseeuw, 1990)

Draws multiple samples of the data set, applies PAM on each sample, and returns
the best clustering (see the sketch below)
Performs better than PAM on larger data sets
Efficiency depends on the sample size
Strength: deals with larger data sets than PAM
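
A rough illustration of the CLARA idea, reusing the pam and absolute_error helpers
from the sketch above: run PAM on several small random samples and keep the medoid
set that scores best on the whole data set. The sample count and sample size here
are arbitrary illustrative defaults, not values from the original paper.

import numpy as np

def clara(X, k, n_samples=5, sample_size=40, seed=0):
    rng = np.random.default_rng(seed)
    best_medoids, best_E = None, np.inf
    for _ in range(n_samples):
        # Apply PAM to a small random sample of the data only.
        idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
        sample_medoids = pam(X[idx], k)
        medoids = idx[sample_medoids]  # map sample indices back to the full data
        # Score the sample's medoids on the ENTIRE data set; keep the best.
        E = absolute_error(X, medoids)
        if E < best_E:
            best_medoids, best_E = medoids, E
    return best_medoids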
4. CLARANS (Clustering Large Applications based upon RANdomized Search)

 Combines the sampling technique with PAM.
 CLARANS does not confine itself to any one sample: whereas CLARA uses a fixed
sample at each stage of the search, CLARANS draws a sample with some randomness
in each step of the search.
 The clustering process can be viewed as a search through a graph.
 Each node is assigned a cost.
 PAM examines all of the neighbors of the current node in its search for a
minimum-cost solution.
 CLARANS dynamically draws a random sample of neighbors in each step of the search.
 If a better neighbor is found, CLARANS moves to that neighbor's node and the
process starts again (a sketch follows below).
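
An illustrative CLARANS-style search loop, again reusing the absolute_error helper
from the PAM sketch. The parameter names numlocal and maxneighbor follow the
original Ng and Han formulation, but the body is my simplified reading of the
randomized neighbor search, not the paper's exact algorithm.

import numpy as np

def clarans(X, k, numlocal=2, maxneighbor=50, seed=0):
    rng = np.random.default_rng(seed)
    best, best_E = None, np.inf
    for _ in range(numlocal):  # restart the search from several random nodes
        current = list(rng.choice(len(X), size=k, replace=False))
        current_E = absolute_error(X, current)
        checked = 0
        while checked < maxneighbor:
            # A random neighbor: swap one random medoid for a random non-medoid.
            i = int(rng.integers(k))
            r = int(rng.integers(len(X)))
            if r in current:
                continue
            neighbor = current.copy()
            neighbor[i] = r
            E = absolute_error(X, neighbor)
            if E < current_E:  # better neighbor found: move to it and restart
                current, current_E = neighbor, E
                checked = 0
            else:
                checked += 1
        if current_E < best_E:  # keep the best local minimum seen so far
            best, best_E = current, current_E
    return best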

Advantages

• Experiments show that CLARANS is more effective than both PAM and CLARA.
• Handles outliers.

Disadvantages

• The clustering quality depends on the sampling method.
