
ISSN 2320-2602

Volume 4 No.4, April 2015

Viet-Vu Vu, International Journal of Advances in Computer Science and Technology, 4(4), April 2015, 59 - 62

International Journal of Advances in Computer Science and Technology


Available Online at http://www.warse.org/ijacst/static/pdf/file/ijacst04442015.pdf

An efficient method for active semi-supervised density-based clustering


Viet-Vu Vu
Electronics Faculty, Thai Nguyen University of Technology, Thai Nguyen city, Viet Nam, [email protected]


ABSTRACT
Semi-supervised clustering algorithms rely on side information, either labeled data (seeds) or pairwise constraints (must-link or cannot-link) between data objects, to improve the quality of clustering. This paper proposes to extend an existing seed-based clustering algorithm with an active learning mechanism to collect pairwise constraints. My new semi-supervised algorithm can deal with both seeds and constraints. Experimental results on real data sets show the efficiency of my algorithm when compared to the initial seed-based clustering algorithm.
Key words: semi-supervised clustering, active learning,
seed, constraint.
1. INTRODUCTION

Figure 1: Spectrum of supervised (left) and partially labeled (right) learning

Clustering is an important task in the process of knowledge discovery in data mining. In the past ten years, the problem of clustering with side information (known as semi-supervised clustering) has become an active research direction to improve the quality of the results by integrating knowledge into the unsupervised algorithms [2].
The works on semi-supervised clustering can be divided into two main families depending on the type of side information provided to the algorithm. On the one hand, seed-based clustering [3, 4, 6, 12] relies on a small set of labeled data, while on the other hand, constraint-based clustering relies on a small set of pairwise constraints (must-link, ML, or cannot-link, CL) between data objects [2].
Each of these methods has advantages and drawbacks: seeds are useful for the initialization of clusters but can be more difficult to set, while constraints are better adapted to delimit the frontier between clusters but need clusters to already exist to be efficient. In both cases, the difficulty of semi-supervised methods, as in supervised learning, is to initiate the algorithms with labeled data or pairwise constraints that are likely to be beneficial for the clustering algorithm. This problem has been tackled in [5, 8, 9, 10], where the authors propose an active learning algorithm to: (1) select the best constraints/seeds based on a nearest-neighbors density criterion and (2) propagate the constraints selected by the expert to infer new constraints automatically, thus minimizing the number of expert solicitations.

Figures 1 and 2 illustrate the different types of prior knowledge that can be included in the process of classifying data: dots correspond to points without any labels, while points with labels are denoted by circles, asterisks and crosses. In Figure 2 (left), the must-link and cannot-link constraints are denoted by solid and dashed lines, respectively [1].

Figure 2: Spectrum of constrained (left) and unsupervised (right) learning
In this paper, I extend the seed-based DBSCAN algorithm (SSDBSCAN) [3] and propose the ActSSDBSCAN algorithm, which integrates an active learning strategy to collect ML and CL constraints. Thus, the proposed algorithm is probably, to the best of my knowledge, the first method that includes both seeds and constraints at the same time. Preliminary experiments conducted on some real datasets show that, using my new active algorithm, the performance of SSDBSCAN can be improved after only a few expert solicitations.
This paper is organized as follows: Section 2 presents the main principles of the seed-based DBSCAN on which my new Active SSDBSCAN algorithm, described in Section 3, relies.

Then, Section 4 presents the experimental protocol and the preliminary results. Finally, Section 5 concludes and outlines some perspectives of this research.

2. SEED-BASED CLUSTERING ALGORITHMS

This section introduces some existing work on seed-based clustering, namely Seed-based DBSCAN and Seed-based K-Means.

2.1. Seed-based DBSCAN
The seed-based DBSCAN extends the original DBSCAN algorithm [14] with a small set of labeled data to enable the discovery of clusters with distinct densities. Indeed, following [15], in the algorithm DBSCAN the notion of density is formalized according to two parameters: MinPts specifies a minimum number of objects, and ε is the radius of a hypersphere in the space of the objects. However, as these parameters are set once for all clusters, DBSCAN can only detect clusters with the same density.
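To fix notation, here is a minimal Python sketch of the DBSCAN core-point test over a precomputed distance matrix; the matrix-based formulation, the function name and the convention of excluding the point itself from its neighborhood count are my own assumptions, not details from the paper.

import numpy as np

def is_core_point(D, o, eps, min_pts):
    # D is an n x n distance matrix; o is a point index.
    # o is a core point if at least min_pts other objects lie within
    # radius eps of it (self-exclusion is one possible convention).
    return np.count_nonzero(np.delete(D[o], o) <= eps) >= min_pts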

The objective of SSDBSCAN is to overcome this limit by using seeds to compute an adapted radius for each cluster. Thus, SSDBSCAN has only one parameter, MinPts, the ε parameter being deduced from the set of provided seeds. Another difference is that, contrary to DBSCAN, in SSDBSCAN the clustering is seen more like a graph partitioning problem.
To this aim, the data set is represented as a weighted undirected graph where each vertex corresponds to a unique data object and each edge between objects p and q has a weight determined by the rDist() measure described hereafter.

The rDist(p, q) measure indicates the smallest radius value for which p and q are core points and directly density-connected with respect to MinPts. Thus, rDist() can be formalized as follows:

∀p, q ∈ X, rDist(p, q) = max(cDist(p), cDist(q), d(p, q))    (1)

where X denotes the data set, d() is the metric used in the clustering and, for each o ∈ X, cDist(o) is the minimal radius such that o is a core point and has MinPts nearest neighbors.
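As an illustration, the following Python sketch shows one way cDist() and rDist() could be computed from a distance matrix; the brute-force formulation and the function names are my own choices for readability, not the paper's implementation.

import numpy as np

def c_dist(D, o, min_pts):
    # Minimal radius such that o has min_pts nearest neighbors:
    # the distance from o to its min_pts-th nearest neighbor.
    return np.sort(np.delete(D[o], o))[min_pts - 1]

def r_dist(D, p, q, min_pts):
    # Smallest radius for which p and q are core points and directly
    # density-connected w.r.t. min_pts, as in equation (1).
    return max(c_dist(D, p, min_pts), c_dist(D, q, min_pts), D[p, q])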

Then, given a set of seeds DL, the SSDBSCAN algorithm proceeds as follows. Using the previous distance rDist(), it is possible to construct a density-based cluster C that contains the first seed point p, by first adding p to C and then iteratively adding the next closest point in terms of rDist() to C. The process continues until a point q is reached that has a different label from p. At that time, the algorithm backtracks to the point o with the largest rDist() value encountered before adding q. The current expansion stops and includes all points up to but excluding o, yielding a cluster C containing p. Conceptually, this is the same as constructing a minimum spanning tree (MST) for a complete graph where the set of vertices equals X and the edge weights are given by rDist().
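To make the graph view concrete, here is a rough Python sketch of this expansion as a lazy Prim-style MST construction over rDist() weights; it reuses the r_dist() helper above, and modeling the seeds as a dict from point index to class label is my own assumption.

import heapq

def expand_cluster(D, seed, labels, min_pts):
    # Grow a cluster from `seed` by repeatedly adding the point that is
    # closest in rDist terms (Prim-style MST expansion). `labels` maps
    # the few seed indices to their class labels.
    n = D.shape[0]
    in_tree = {seed}
    order = [seed]        # points in the order they were added
    edge_w = [0.0]        # rDist of the edge that added each point
    heap = [(r_dist(D, seed, q, min_pts), q) for q in range(n) if q != seed]
    heapq.heapify(heap)
    while heap:
        w, q = heapq.heappop(heap)
        if q in in_tree:
            continue      # stale entry from lazy insertion
        if q in labels and labels[q] != labels[seed]:
            if len(order) == 1:
                return {seed}  # nothing grown yet
            # Backtrack: cut at the largest rDist edge seen before q,
            # keeping only the points added strictly before that edge.
            cut = max(range(1, len(edge_w)), key=edge_w.__getitem__)
            return set(order[:cut])
        in_tree.add(q)
        order.append(q)
        edge_w.append(w)
        for r in range(n):
            if r not in in_tree:
                heapq.heappush(heap, (r_dist(D, q, r, min_pts), r))
    return set(order)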

2.2. Seed-based K-Means

The seed-based K-Means algorithm has been proposed by Basu et al. [4]. This method uses a small set of labeled data, the seeds, to help the clustering of the unlabeled data. Two variants of semi-supervised K-Means clustering are introduced: Seed K-Means and Constraint K-Means. In both methods, the seeds are supposed to be representative of all the clusters. In Seed K-Means, the labeled data are used to compute an initial center for each cluster; then a traditional K-Means is applied on the dataset without any further use of the labeled data. In Constraint K-Means, the seed information is used as constraints, so that the labeled data cannot be removed from the cluster to which they have been assigned by the user. The seed-based K-Means is presented in Algorithm 1.

Algorithm 1: Seed-based K-Means
Input: Set of data points X = {x1, x2, ..., xn}, xi ∈ Rd; number of clusters K; set {S1, S2, ..., SK} of initial seeds
Output: Disjoint K-partitioning {X1, X2, ..., XK} of X such that the K-Means objective function is optimized.
Method:
Step 1: gh(0) = (Σx∈Sh x) / |Sh|, for h = 1, ..., K; t = 0
Step 2: Repeat until convergence
- Assign_cluster: assign each data point x to the cluster h*, where h* = argmin_h ||x - gh(t)||²
- Estimate_mean: gh(t+1) = (Σx∈Xh(t+1) x) / |Xh(t+1)|
- t = t + 1
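A compact Python rendering of Algorithm 1 might look as follows; the fixed iteration cap and the handling of empty clusters are simplifications of my own.

import numpy as np

def seed_kmeans(X, seeds, max_iter=100):
    # `seeds` is a list of K index arrays, one per cluster, giving the
    # labeled points used for initialization (Step 1).
    K = len(seeds)
    centers = np.array([X[s].mean(axis=0) for s in seeds])
    assign = None
    for _ in range(max_iter):                    # Step 2
        # Assign_cluster: nearest center in squared Euclidean distance.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        new_assign = d.argmin(axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break                                # converged
        assign = new_assign
        # Estimate_mean: recompute each center from its current members.
        for h in range(K):
            members = X[assign == h]
            if len(members) > 0:
                centers[h] = members.mean(axis=0)
    return assign, centers

For the Constraint K-Means variant described above, the assignment step would additionally pin each labeled point to its seed cluster instead of letting it move.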

3. ACTIVE LEARNING SEED-BASED DBSCAN

In the context of density-based clustering, distant points are not necessarily in different clusters, and choosing the largest edge in the set of all density-connection paths to decide the separation between two clusters may not be the best solution.
My proposal is to use the expert knowledge to define the appropriate separation distance between clusters. To this aim, my algorithm integrates an active learning mechanism to gather constraints and can be summarized as follows:

Algorithm 2: ActSSDBSCAN
Input: Set of data points X = {x1, x2, ..., xn}, xi ∈ Rd; set of seeds {S1, S2, ..., SK}
Output: Disjoint K-partitioning {X1, X2, ..., XK}
Repeat
Step 1: Build a cluster as in SSDBSCAN; if the stop condition is true, go to Step 3
Step 2: For each sorted edge, ask the expert whether the relation between the vertices is a must-link (ML) or cannot-link (CL) constraint
Step 3: While the expert answer is ML, go to Step 2
Step 4: If the expert answer is CL, then choose the edge's rDist value as a separation distance and obtain a cluster
Until the set of seeds is empty

This querying strategy should help minimize expert solicitations during the active learning step.
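The query loop of Algorithm 2 could be sketched in Python as follows; modeling the expert as an oracle over ground-truth labels is a standard simulation device and my own assumption here, not part of the paper.

def choose_separation(sorted_edges, oracle):
    # `sorted_edges` is a list of (rdist, p, q) tuples in increasing
    # rDist order for the current expansion; `oracle(p, q)` answers
    # 'ML' or 'CL' in place of the human expert.
    for w, p, q in sorted_edges:
        if oracle(p, q) == 'ML':
            continue          # Step 3: keep asking while the answer is ML
        return w              # Step 4: a CL edge fixes the separation distance
    return None               # no CL answer: the whole path is one cluster

def make_oracle(y):
    # Simulated expert derived from ground-truth labels y.
    return lambda p, q: 'ML' if y[p] == y[q] else 'CL'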

4. EXPERIMENT RESULTS
I use five real datasets from the UCI Machine Learning Repository [13], named Protein, Iris, Glass, Thyroid, and LetterIJL, to evaluate my algorithm. The details of the datasets are shown in Table 1.

Table 1: Data sets used for testing (N: number of objects, M: number of attributes)

ID  Name       N    M
1   Protein    115  20
2   Iris       150  4
3   Glass      214  9
4   Thyroid    215  5
5   LetterIJL  227  16

I use the Rand Index (RI) measure [13], as it is widely used in the evaluation of clustering results.
The RI measure computes the agreement between the theoretical partition of each dataset and the output partition of the evaluated algorithms.
This measure is based on n(n-1)/2 pairwise comparisons between the n points of a data set X. For each pair of points xi and xj in X, a partition assigns them either to the same cluster or to different clusters.
Let us consider two partitions P1 and P2, and let a be the number of pairs for which xi is in the same cluster as xj in both P1 and P2. Let b be the number of pairs for which the two points are placed in different clusters in both partitions. The total agreement can then be calculated as shown in equation (2).

RI(P1, P2) = 2(a + b) / (n(n - 1))    (2)

RI takes values between 0 and 1; RI = 1 when the result is the same as the ground truth. The larger the RI, the better the result.
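For completeness, here is a direct pair-counting implementation of the Rand Index as defined in equation (2); the function name is mine.

from itertools import combinations

def rand_index(p1, p2):
    # p1 and p2 are cluster-label sequences over the same n points.
    # a counts pairs grouped together in both partitions,
    # b counts pairs separated in both partitions.
    n = len(p1)
    a = b = 0
    for i, j in combinations(range(n), 2):
        same1 = p1[i] == p1[j]
        same2 = p2[i] == p2[j]
        if same1 and same2:
            a += 1
        elif not same1 and not same2:
            b += 1
    return 2 * (a + b) / (n * (n - 1))

For example, rand_index([0, 0, 1, 1], [1, 1, 0, 0]) returns 1.0, since the two partitions group the points identically up to label names.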
It can be seen from Figure 3 that ActSSDBSCAN outperforms SSDBSCAN for each of the benchmark data sets. These experiments show the benefit of using both seeds and constraints to build the clusters and validate my hypothesis that the longest edge may not be the best criterion in the case of density-based clustering algorithms.

Figure 3: Experiment results on the Protein, Iris, Thyroid, LetterIJL, and Soybean data sets

5. CONCLUSION
This paper presents a new active learning density-based clustering algorithm named ActSSDBSCAN. To the best of my knowledge, this is the first semi-supervised algorithm to use both seeds and constraints as side information. Preliminary results on real data sets show the benefit of my approach when compared to SSDBSCAN. Future research


REFERENCES
1. T. Lange, M. H. Law, A. K. Jain, and J. M. Buhmann. Learning with constrained and unlabeled data. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pp. 730-735, 2005.
2. S. Basu, I. Davidson, and K. L. Wagstaff. Constrained Clustering: Advances in Algorithms, Theory, and Applications. Chapman and Hall/CRC Data Mining and Knowledge Discovery Series, 1st ed., 2008.
3. L. Lelis and J. Sander. Semi-supervised Density-Based Clustering. In Proc. IEEE International Conference on Data Mining (ICDM), pp. 842-847, 2009.
4. S. Basu, A. Banerjee, and R. J. Mooney. Semi-supervised Clustering by Seeding. In Proc. International Conference on Machine Learning (ICML), pp. 27-34, 2002.
5. V.-V. Vu, N. Labroche, and B. Bouchon-Meunier. Improving Constrained Clustering with Active Query Selection. Pattern Recognition 45(4): 1749-1758, 2012.
6. C. Ruiz, M. Spiliopoulou, and E. Menasalvas Ruiz. Density-based semi-supervised clustering. Data Mining and Knowledge Discovery 21(3): 345-370, 2010.
7. A. K. Jain. Data clustering: 50 years beyond K-means. Pattern Recognition Letters 31(8): 651-666, 2010.
8. V.-V. Vu, N. Labroche, and B. Bouchon-Meunier. Active Learning for Semi-Supervised K-Means Clustering. In Proc. 22nd IEEE International Conference on Tools with Artificial Intelligence (ICTAI), Arras, France, 2010.
9. V.-V. Vu, N. Labroche, and B. Bouchon-Meunier. Boosting Clustering by Active Constraint Selection. In Proc. 19th European Conference on Artificial Intelligence (ECAI), Lisbon, Portugal, 2010.
10. V.-V. Vu, N. Labroche, and B. Bouchon-Meunier. An Efficient Active Constraint Selection Algorithm for Clustering. In Proc. 20th International Conference on Pattern Recognition (ICPR), Istanbul, Turkey, 2010.
11. K. Wagstaff, C. Cardie, S. Rogers, and S. Schrödl. Constrained K-means Clustering with Background Knowledge. In Proc. International Conference on Machine Learning (ICML), pp. 577-584, 2001.
12. A. Bensaid, L. O. Hall, J. C. Bezdek, and L. P. Clarke. Partially supervised clustering for image segmentation. Pattern Recognition 29(5): 859-871, 1996.
13. UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/
14. M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proc. KDD, pp. 226-231, 1996.
15. C. Böhm and C. Plant. HISSCLU: a hierarchical density-based method for semi-supervised clustering. In Proc. EDBT, pp. 440-451, 2008.
