Data clustering based on maximization of outlier factor

V Saltenis - Journal of Global Optimization, 2006 - Springer

Abstract

Many data clustering algorithms exist, but they cannot adequately handle the number of clusters or the cluster shapes, and their performance depends mainly on the choice of algorithm parameters. Our clustering approach and algorithm do not require this parameter choice; the method can be treated as a natural adaptation to the existing structure of distances between data points. The outlier factor introduced by the author specifies, for each data point, the degree to which it is an outlier. The notion of the outlier factor is based on the difference between the frequency distribution of interpoint distances in a given dataset and the corresponding distribution for uniformly distributed points. Data clusters can then be determined by maximizing the outlier factor function: the data points in the dataset are divided into clusters according to the attractor regions of its local optima. An experimental evaluation shows that the proposed method can identify complex cluster shapes. The key advantages of the approach are good clustering properties for datasets with a comparatively large amount of noise (additional data points) and the absence of critical parameters whose choice determines the quality of the results.
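The abstract does not give the formula for the outlier factor, so the following Python sketch illustrates only the stated ingredient: comparing the distribution of interpoint distances seen from each data point with the distribution for uniformly distributed points. Everything specific here is an assumption made for illustration and is not the paper's definition: the function names `pairwise_distances` and `outlier_scores`, the bounding-box reference sample, the parameters `n_bins`, `n_reference`, and `seed`, and the score itself (the mean gap between the two cumulative distributions).

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def outlier_scores(points, n_bins=20, n_reference=500, seed=0):
    """Hypothetical per-point score (NOT the paper's outlier factor).

    For each point, the cumulative distribution of its distances to the
    other points is compared with a reference distribution obtained from
    points drawn uniformly in the bounding box of the data. A point with
    fewer short distances than the uniform reference gets a higher score.
    """
    points = np.asarray(points, dtype=float)
    n, dim = points.shape
    rng = np.random.default_rng(seed)

    # Assumption: the uniform reference is sampled in the data's bounding box.
    lo, hi = points.min(axis=0), points.max(axis=0)
    reference = rng.uniform(lo, hi, size=(n_reference, dim))

    # Shared bins from 0 to the box diagonal (tiny slack against rounding).
    d_max = np.linalg.norm(hi - lo) * (1 + 1e-9)
    edges = np.linspace(0.0, d_max, n_bins + 1)

    # Reference CDF of interpoint distances among the uniform points.
    ref_d = pairwise_distances(reference, reference)
    ref_d = ref_d[np.triu_indices(n_reference, k=1)]
    ref_cdf = np.cumsum(np.histogram(ref_d, bins=edges)[0]) / ref_d.size

    data_d = pairwise_distances(points, points)
    scores = np.empty(n)
    for i in range(n):
        d = np.delete(data_d[i], i)  # distances from point i to the others
        cdf = np.cumsum(np.histogram(d, bins=edges)[0]) / d.size
        scores[i] = np.mean(ref_cdf - cdf)  # assumed score: mean CDF gap
    return scores
```

On two tight groups of points plus one distant point, this stand-in score ranks the distant point highest. The clustering step described in the abstract (assigning points to the attractor regions of local optima of the outlier factor function) is not sketched here, because the abstract does not specify it in enough detail.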
