Constrained Clustering

Constrained clustering is a semi-supervised learning technique that incorporates must-link and cannot-link constraints to guide standard clustering algorithms. Must-link constraints specify data points that should be in the same cluster, while cannot-link constraints specify points that should not be in the same cluster. Constrained clustering algorithms aim to find clusters, or "chunklets", that satisfy all the constraints. Examples of constrained clustering algorithms include COP K-means, PCKmeans, and CMWK-Means.

Constrained clustering

In computer science, constrained clustering is a class of semi-supervised learning algorithms. Typically,
constrained clustering incorporates a set of must-link constraints, a set of cannot-link constraints, or both
into a data clustering algorithm. A cluster whose members conform to all must-link and cannot-link
constraints is called a chunklet.

Types of constraints
Both a must-link and a cannot-link constraint define a relationship between two data instances. Together,
the sets of these constraints guide a constrained clustering algorithm as it attempts to find
chunklets (clusters in the dataset that satisfy the specified constraints).

A must-link constraint is used to specify that the two instances in the must-link relation
should be associated with the same cluster.
A cannot-link constraint is used to specify that the two instances in the cannot-link relation
should not be associated with the same cluster.
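
The two constraint types can be illustrated with a minimal Python sketch. It assumes that instances are identified by integer indices, that a clustering is a list of cluster labels, and that each constraint is a pair of indices; the function name violated_constraints is hypothetical and does not come from any library.

```python
def violated_constraints(labels, must_link, cannot_link):
    """Return the must-link and cannot-link pairs that `labels` violates.

    labels[i] is the cluster assigned to instance i; each constraint is a
    pair (i, j) of instance indices (an assumed representation).
    """
    # A must-link pair is violated when its two instances are in different clusters.
    broken_ml = [(i, j) for i, j in must_link if labels[i] != labels[j]]
    # A cannot-link pair is violated when its two instances share a cluster.
    broken_cl = [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return broken_ml, broken_cl


labels = [0, 0, 1, 1]
must_link = [(0, 1)]
cannot_link = [(1, 2), (2, 3)]

broken_ml, broken_cl = violated_constraints(labels, must_link, cannot_link)
print(broken_ml)  # [] because instances 0 and 1 share cluster 0
print(broken_cl)  # [(2, 3)] because instances 2 and 3 share cluster 1
```

A clustering satisfies the constraints exactly when both returned lists are empty.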

Some constrained clustering algorithms abort if no clustering exists that satisfies the specified
constraints. Others try to minimize the amount of constraint violation when it is impossible to find a
clustering that satisfies the constraints. Constraints can also be used to guide the selection of a
clustering model among several possible solutions.[1]
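
As a rough illustration of these behaviours, the Python sketch below scores several candidate clusterings by the number of constraints they violate and returns the one with the fewest violations. With strict=True it instead raises an error when even the best candidate violates a constraint. This is a simplified assumption for illustration, not the model selection procedure of reference [1]; the names count_violations, select_clustering and InfeasibleClustering are hypothetical.

```python
class InfeasibleClustering(Exception):
    """Raised in strict mode when no candidate satisfies every constraint."""


def count_violations(labels, must_link, cannot_link):
    # Must-link pairs split across clusters plus cannot-link pairs sharing one.
    split = sum(labels[i] != labels[j] for i, j in must_link)
    merged = sum(labels[i] == labels[j] for i, j in cannot_link)
    return split + merged


def select_clustering(candidates, must_link, cannot_link, strict=False):
    """Pick the candidate labelling with the fewest constraint violations.

    `candidates` maps a model name to its list of cluster labels.
    """
    scores = {
        name: count_violations(labels, must_link, cannot_link)
        for name, labels in candidates.items()
    }
    best = min(scores, key=scores.get)
    if strict and scores[best] > 0:
        # "Abort" behaviour: refuse to return a clustering that breaks a constraint.
        raise InfeasibleClustering(f"best candidate still violates {scores[best]} constraint(s)")
    return best, scores[best]


candidates = {
    "k=2": [0, 0, 1, 1],
    "k=3": [0, 0, 1, 2],
}
must_link = [(0, 1)]
cannot_link = [(2, 3)]

print(select_clustering(candidates, must_link, cannot_link))  # ('k=3', 0)
```

Counting every violation equally is the simplest choice; an algorithm could also weight individual constraints differently.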

Examples
Examples of constrained clustering algorithms include:

COP K-means [2]
PCKmeans (Pairwise Constrained K-means) [3]
CMWK-Means (Constrained Minkowski Weighted K-Means) [4]
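
To give a feel for how such an algorithm can use constraints, here is a simplified Python sketch in the spirit of COP K-means: each instance is assigned to the nearest cluster center that does not violate a constraint with an already assigned instance, and the run aborts when no such cluster exists. It is an illustration under stated assumptions, not the reference implementation from [2]: it skips the transitive closure of the constraints, uses squared Euclidean distance, and returns None on failure. The helper names are hypothetical.

```python
import random


def sq_dist(p, q):
    # Squared Euclidean distance between two points given as sequences of floats.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def violates(i, cluster, labels, ml, cl):
    # Must-link partner already placed in a different cluster?
    if any(labels[j] is not None and labels[j] != cluster for j in ml[i]):
        return True
    # Cannot-link partner already placed in this cluster?
    return any(labels[j] == cluster for j in cl[i])


def cop_kmeans_sketch(points, k, must_link, cannot_link, n_iter=20, seed=0):
    """Return a list of cluster labels, or None if no feasible assignment is found."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    n = len(points)
    ml = {i: set() for i in range(n)}
    cl = {i: set() for i in range(n)}
    for i, j in must_link:
        ml[i].add(j)
        ml[j].add(i)
    for i, j in cannot_link:
        cl[i].add(j)
        cl[j].add(i)

    labels = None
    for _ in range(n_iter):
        new_labels = [None] * n
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest, taking the first feasible one.
            for c in sorted(range(k), key=lambda c: sq_dist(p, centers[c])):
                if not violates(i, c, new_labels, ml, cl):
                    new_labels[i] = c
                    break
            else:
                return None  # no cluster satisfies the constraints: abort
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [points[i] for i in range(n) if labels[i] == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return labels


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = cop_kmeans_sketch(points, k=2, must_link=[(0, 1)], cannot_link=[(1, 2)])
print(labels)
```

Because instances are processed in a fixed order, whether the sketch finds a feasible clustering can depend on that order and on the initial centers.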

References
1. Pourrajabi, M.; Moulavi, D.; Campello, R. J. G. B.; Zimek, A.; Sander, J.; Goebel, R. (2014).
"Model Selection for Semi-Supervised Clustering". Proceedings of the 17th International
Conference on Extending Database Technology (EDBT). pp. 331–342.
doi:10.5441/002/edbt.2014.31 (https://doi.org/10.5441%2F002%2Fedbt.2014.31).
2. Wagstaff, K.; Cardie, C.; Rogers, S.; Schrödl, S. (2001). "Constrained K-means Clustering
with Background Knowledge". Proceedings of the Eighteenth International Conference on
Machine Learning. pp. 577–584.
3. http://www.cs.utexas.edu/~ml/papers/semi-sdm-04.pdf
4. de Amorim, R. C. (2012). "Constrained Clustering with Minkowski Weighted K-Means".
Proceedings of the 13th IEEE International Symposium on Computational Intelligence and
Informatics. pp. 13–17. doi:10.1109/CINTI.2012.6496753 (https://doi.org/10.1109%2FCINTI.2012.6496753).
