Data clustering based on maximization of outlier factor

V Saltenis - Journal of Global Optimization, 2006 - Springer
Abstract
Many data clustering algorithms exist, but they cannot adequately handle the number of clusters or the cluster shapes, and their performance depends mainly on the choice of algorithm parameters. Our approach to data clustering, and the resulting algorithm, does not require such a parameter choice; it can be treated as a natural adaptation to the existing structure of distances between data points. The outlier factor introduced by the author specifies, for each data point, its degree of being an outlier. The notion of the outlier factor is based on the difference between the frequency distribution of interpoint distances in a given dataset and the corresponding distribution for uniformly distributed points. Data clusters can then be determined by maximizing the outlier factor function: the data points in the dataset are divided into clusters according to the attractor regions of the local optima. An experimental evaluation shows that the proposed method can identify complex cluster shapes. Key advantages of the approach are good clustering properties for datasets with a comparatively large amount of noise (additional data points) and the absence of critical parameters whose choice determines the quality of the results.
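The idea of dividing points "according to the attractor regions of local optima" can be illustrated with a minimal Python sketch. It is not the author's algorithm: the abstract does not give the outlier factor formula, so `stand_in_score` below is a hypothetical placeholder (negative mean distance to the nearest neighbours), and the neighbourhood size `k` is an assumption of this sketch that the paper's method is stated not to need. Only the last step, hill-climbing each point to a local maximum and grouping points that reach the same maximum, mirrors what the abstract describes.

```python
import math

def knn(points, k):
    """Indices of the k nearest neighbours of every point (brute force)."""
    out = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        out.append(order[:k])
    return out

def stand_in_score(points, neighbours):
    """Hypothetical placeholder score, NOT the paper's outlier factor:
    negative mean distance from each point to its k nearest neighbours."""
    return [-sum(math.dist(points[i], points[j]) for j in nb) / len(nb)
            for i, nb in enumerate(neighbours)]

def attractor_labels(scores, neighbours):
    """Label each point with the index of the local maximum it climbs to."""
    def climb(i):
        while True:
            # Move to the best-scoring candidate among the point and its
            # neighbours; ties are broken by lower index, so every move
            # strictly improves the key and the loop terminates.
            best = max(neighbours[i] + [i], key=lambda j: (scores[j], -j))
            if best == i:
                return i
            i = best
    return [climb(i) for i in range(len(scores))]

# Two well-separated toy groups, each a unit square plus its centre.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
neighbours = knn(points, k=3)
scores = stand_in_score(points, neighbours)
labels = attractor_labels(scores, neighbours)
print(labels)
```

In this toy dataset every point climbs to the centre of its own square, so the two groups end up with two distinct labels. Replacing `stand_in_score` with a real outlier factor would change which points act as attractors, while the grouping step would stay the same.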