Clustering

This document summarizes and compares different clustering methods for uncertain data using R. It begins by defining clustering and uncertain data. It then discusses clustering algorithms like partitioning, hierarchical, density-based, and grid-based methods. The document focuses on hierarchical clustering in R, providing code to generate dendrograms and comparing single, complete, average, and centroid linkage methods using sample European country data. It aims to evaluate these techniques for clustering uncertain data.

Uploaded by

Nakib Aman Turzo

Relative Comparison of Different Clustering Methods for Uncertain Data Using R

Presented By
Nakib Aman
Roll no: 110131
Computer Science and Engineering
Department, PUST.

Clustering
Clustering is the process of grouping a
set of data objects into multiple groups,
or clusters, so that objects within a cluster
are highly similar to one another but very
dissimilar to objects in other clusters.
Much of the history of cluster analysis is
concerned with developing algorithms
that were not too computationally intensive,
since early computers were not nearly as
powerful as they are today.

Uncertain Data
Data that carries some degree of
uncertainty is called uncertain data.
Uncertainty in data arises naturally from
a variety of real-world phenomena, such as
implicit randomness in the process of data
generation or acquisition.

Uncertain Database
An uncertain database DB is
defined by a set of uncertain
objects DB = {O_1, ..., O_|DB|}
spanning a (potentially
infinite) set of possible worlds W
and a constructive generation
rule G to draw possible worlds
from W in an unbiased way.
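
As a rough illustration of this definition, the sketch below models each uncertain object as a small set of alternative values with probabilities, and draws possible worlds by sampling one value per object. The object values, the probabilities, the helper name draw_world, and the assumption that objects are independent are all hypothetical; none of them come from the presentation.

```r
# Hypothetical uncertain database: each object has alternative values
# and a probability for each alternative (assumption for illustration).
db <- list(
  O1 = list(values = c(1.0, 1.2),      probs = c(0.5, 0.5)),
  O2 = list(values = c(5.0, 5.5, 6.0), probs = c(0.2, 0.5, 0.3)),
  O3 = list(values = c(9.0),           probs = c(1.0))
)

# Generation rule G (assumed: objects are independent): draw one
# alternative per object according to its probabilities.
draw_world <- function(db) {
  vapply(db, function(o) {
    idx <- sample.int(length(o$values), size = 1, prob = o$probs)
    o$values[idx]
  }, numeric(1))
}

set.seed(1)
worlds <- t(replicate(4, draw_world(db)))  # one row per sampled possible world
worlds
```

Each row of worlds is one fully certain instance of the database, so an ordinary clustering algorithm could be run on every row separately.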

R Programming Language
R is highly extensible through the use
of user-submitted packages for specific
functions or specific areas of study.
Due to its S heritage, R has stronger
object-oriented-programming facilities
than most statistical computing
languages.
* R version 3.1.3 (Smooth Sidewalk) was
released on 2015-03-09.

RStudio
RStudio is an integrated development
environment (IDE) for R. It includes a
console, a syntax-highlighting editor that
supports direct code execution, and
tools for plotting, history, debugging
and workspace management.
Version 0.98.1102 will be used for
demonstration.

R Cluster library
The R cluster package provides a
modern alternative to k-means
clustering known as PAM, an
acronym for Partitioning Around
Medoids. The cluster package was
last updated to version 2.0.1 on
February 19, 2015.
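
A minimal sketch of PAM using the pam() function from the cluster package is shown below. The toy data (two well-separated groups of random 2-D points) is an assumption made for illustration and is not the data set used in the presentation.

```r
library(cluster)  # provides pam()

# Toy data (assumption): two well-separated groups of 2-D points.
set.seed(42)
x <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2)
)

fit <- pam(x, k = 2)   # partition the rows of x around 2 medoids
fit$medoids            # each medoid is an actual row of x
fit$clustering         # cluster label for each row
```

Unlike k-means, whose centers are averages, each PAM cluster is represented by one of the observed points, which tends to make the result less sensitive to outliers.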

Overview of Clustering Methods

Method: General Characteristics

Partitioning methods
-Find mutually exclusive clusters of spherical shape.
-Distance based.
-Effective for small-to-medium size data sets.

Hierarchical methods
-Clustering is a hierarchical decomposition.
-May incorporate other techniques like microclustering.

Density based methods
-Can find arbitrarily shaped clusters.
-May filter out outliers.
-Clusters are dense regions of objects in space that are separated by low-density regions.

Grid based methods
-Use a multiresolution grid data structure.

Hierarchical Methods of Data Clustering
A hierarchical method creates a
hierarchical decomposition of the
given set of data objects.
Hierarchical methods can be distance
based, or density and continuity
based. Various extensions of
hierarchical methods consider
clustering in subspaces as well.
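
To make the idea of a hierarchical decomposition concrete, the sketch below clusters five toy points with hclust() and cuts the resulting tree at two levels. The points and their labels are assumptions chosen so that no merge distances tie.

```r
# Toy data (assumption): five points on a line.
pts <- matrix(c(1, 2, 4, 10, 12), ncol = 1,
              dimnames = list(c("a", "b", "c", "d", "e"), NULL))

hc <- hclust(dist(pts))   # agglomerative clustering, complete linkage by default
hc$merge                  # merge history: one row per agglomeration step
hc$height                 # distance at which each merge happened

# Cutting the same tree at different levels gives nested partitions.
two   <- cutree(hc, k = 2)
three <- cutree(hc, k = 3)
two
three
```

Here the 3-cluster cut only splits one of the groups from the 2-cluster cut, which is what distinguishes a hierarchy from independent partitionings.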

Cluster Dendrogram

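R code for generating Cluster Dendrogram
# Read the country data set.
europe <- read.csv("G:/Thesis/R codes/europe.csv")
europe
# Drop the first column (used as labels below) and cluster on
# Euclidean distances; hclust() defaults to complete linkage.
euroclust <- hclust(dist(europe[-1]))
plot(euroclust, labels = europe$Country)
# Outline 5 clusters on the dendrogram.
rect.hclust(euroclust, 5)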

Sample Data used for Clustering

Europe.csv
Data file used for clustering

The data is taken from the CIA World Factbook and
gives some information about 28 European
countries.

Comparison of Different
Hierarchical Techniques

Single
hclust(dist(europe[-1]), method="single")

Complete
hclust(dist(europe[-1]), method="complete")

Average
hclust(dist(europe[-1]), method="average")

Centroid
hclust(dist(europe[-1]), method="centroid")
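
One possible way to compare the four linkage methods numerically is the cophenetic correlation, which measures how well each tree preserves the original pairwise distances. The sketch below is an assumption about how such a comparison could be done, not the presentation's own evaluation. It uses the built-in USArrests data set as a stand-in because europe.csv is not available here, and it follows the note in the hclust() documentation that centroid linkage is meant to be used with squared Euclidean distances.

```r
# Stand-in data (assumption): built-in USArrests replaces europe.csv.
# Columns are scaled so that no single variable dominates the distances.
d <- dist(scale(USArrests))

methods <- c("single", "complete", "average", "centroid")
fits <- lapply(methods, function(m) {
  # Squared distances for centroid linkage, as suggested in ?hclust.
  hclust(if (m == "centroid") d^2 else d, method = m)
})
names(fits) <- methods

# Cophenetic correlation: agreement between tree distances and original distances.
coph <- sapply(fits, function(h) cor(d, cophenetic(h)))
round(coph, 3)

# Cluster sizes when each tree is cut into 4 groups.
sizes <- lapply(fits, function(h) table(cutree(h, k = 4)))
sizes
```

A higher cophenetic correlation suggests a more faithful tree, while the cluster sizes show how balanced each method's groups are. Neither figure alone says which clustering is the most useful.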
