Hierarchical Clustering Supported by Reciprocal Nearest Neighbors

Xie, Wen-Bo; Lee, Yan-Li; Wang, Cong; Chen, Duan-Bing; Zhou, Tao

doi:10.1016/j.ins.2020.04.016

arXiv:1907.04915 (cs)

[Submitted on 9 Jul 2019]

Abstract: Clustering is a fundamental analysis tool that classifies data points into groups based on their similarity or distance. It has found successful applications across the natural and social sciences, including biology, physics, economics, chemistry, astronomy, and psychology. Among the numerous existing algorithms, hierarchical clustering algorithms have a particular advantage: they provide results at different resolutions without any predetermined number of clusters, and they reveal the organization of the resulting clusters. At the same time, they suffer from a variety of drawbacks and are thus either time-consuming or inaccurate. We propose a novel hierarchical clustering approach based on a simple hypothesis: two reciprocal nearest data points should be grouped in one cluster. Extensive tests on data sets across multiple domains show that our method is much faster and more accurate than the state-of-the-art benchmarks. We further extend our method to the community detection problem in real networks, achieving remarkably better results than the well-known Girvan-Newman algorithm.
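
The hypothesis in the abstract rests on the notion of reciprocal nearest neighbors: two points that are each other's nearest neighbor. The following is a minimal brute-force sketch of that notion only, not the authors' clustering algorithm. The function name `reciprocal_nearest_pairs`, the choice of Euclidean distance, and the sample points are all assumptions made for illustration.

```python
import math


def reciprocal_nearest_pairs(points):
    """Return index pairs (i, j), i < j, that are each other's nearest neighbor.

    Illustrative only: brute-force O(n^2) search with Euclidean distance.
    """
    n = len(points)
    if n < 2:
        return []  # no pairs can exist with fewer than two points

    nearest = []
    for i in range(n):
        # Nearest neighbor of point i among all other points
        # (ties are broken by the lowest index).
        j = min(
            (k for k in range(n) if k != i),
            key=lambda k: math.dist(points[i], points[k]),
        )
        nearest.append(j)

    # Keep a pair only when the nearest-neighbor relation holds in both directions.
    return [(i, j) for i, j in enumerate(nearest) if i < j and nearest[j] == i]


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (2.0, 2.0)]
pairs = reciprocal_nearest_pairs(points)
print(pairs)
```

In this sample, the point at index 4 has a nearest neighbor (index 1), but that neighbor's own nearest neighbor is index 0, so index 4 belongs to no reciprocal pair. Under the stated hypothesis, only the mutually nearest pairs would be candidates for grouping into one cluster.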

Comments:	13 pages, 5 figures, 5 supplementary figures, 2 tables
Subjects:	Information Retrieval (cs.IR); Social and Information Networks (cs.SI); Data Analysis, Statistics and Probability (physics.data-an)
Cite as:	arXiv:1907.04915 [cs.IR]
	(or arXiv:1907.04915v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.1907.04915
Journal reference:	Information Sciences 527 (2020) 279-292
Related DOI:	https://doi.org/10.1016/j.ins.2020.04.016

Submission history

From: Tao Zhou
[v1] Tue, 9 Jul 2019 04:34:28 UTC (1,079 KB)
