Reducing over-clustering via the powered Chinese restaurant process

Lu, Jun; Li, Meng; Dunson, David

arXiv:1802.05392 (cs)

[Submitted on 15 Feb 2018]

Abstract: Dirichlet process mixture (DPM) models tend to produce many small clusters regardless of whether they are needed to accurately characterize the data; this is particularly true for large data sets. However, interpretability, parsimony, and data storage and communication costs all suffer when there are too many clusters. We propose a powered Chinese restaurant process to limit this problem and penalize over-clustering. The method is illustrated using simulation examples and data sets with large and small sample sizes, including MNIST and the Old Faithful Geyser data.
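The over-clustering behavior described in the abstract can be illustrated with a small simulation of sequential seating. The abstract does not give the exact form of the powered process, so the sketch below rests on an assumption: each existing table is weighted by its size raised to a power r, and a new table by the concentration parameter alpha. With r = 1 this is the standard Chinese restaurant process; r > 1 is one plausible way to penalize small clusters and should not be read as the authors' definition. The function name `sample_table_sizes` and the parameter names are hypothetical.

```python
import random


def sample_table_sizes(n, alpha, r=1.0, seed=None):
    """Seat n customers one at a time and return the list of table sizes.

    Assumption (not taken from the abstract): an existing table of size s
    gets weight s**r and a new table gets weight alpha. r=1 recovers the
    standard Chinese restaurant process.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n):
        # Last weight corresponds to opening a new table.
        weights = [s ** r for s in sizes] + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if choice == len(sizes):
            sizes.append(1)      # open a new table
        else:
            sizes[choice] += 1   # join an existing table
    return sizes


if __name__ == "__main__":
    # Compare the average number of tables under r=1 and an assumed r=1.5.
    for r in (1.0, 1.5):
        counts = [len(sample_table_sizes(1000, alpha=1.0, r=r, seed=s))
                  for s in range(20)]
        print(f"r={r}: mean number of tables = {sum(counts) / len(counts):.1f}")
```

Running the script prints the mean number of occupied tables for each value of r, which gives a rough sense of how strongly the assumed power term suppresses new clusters as n grows.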

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1802.05392 [cs.LG]
	(or arXiv:1802.05392v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1802.05392

Submission history

From: Jun Lu
[v1] Thu, 15 Feb 2018 02:53:30 UTC (6,691 KB)
