Conceptual Clustering
Conceptual clustering is a machine learning paradigm for unsupervised classification that was
defined by Ryszard S. Michalski in 1980 (Fisher 1987, Michalski 1980) and developed mainly during the
1980s. It is distinguished from ordinary data clustering by generating a concept description for each
generated class. Most conceptual clustering methods are capable of generating hierarchical category
structures; see Categorization for more information on hierarchy. Conceptual clustering is closely related to
formal concept analysis, decision tree learning, and mixture model learning.
More general discussions and reviews of conceptual clustering can be found in the following publications:
Michalski (1980)
Gennari, Langley, & Fisher (1989)
Fisher & Pazzani (1991)
Fisher & Langley (1986)
Stepp & Michalski (1986)
Knowledge representation
The COBWEB data structure is a hierarchy (tree) wherein each node represents a given concept. Each
concept represents a set (actually, a multiset or bag) of objects, each object being represented as a binary-
valued property list. The data associated with each tree node (i.e., concept) are the integer property counts
for the objects in that concept. For example (see figure), let a concept contain the following four
objects (repeated objects being permitted):
1. [1 0 1]
2. [0 1 1]
3. [0 1 0]
4. [0 1 1]
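The counts that would be stored for this concept follow directly from the four objects. The Python sketch below is illustrative only: the function names `property_counts` and `property_probabilities` are hypothetical and are not taken from Fisher's paper.

```python
from typing import List, Sequence


def property_counts(objects: Sequence[Sequence[int]]) -> List[int]:
    """Sum each binary property over a bag of objects (duplicates allowed)."""
    return [sum(column) for column in zip(*objects)]


def property_probabilities(counts: Sequence[int], n_objects: int) -> List[float]:
    """Derive P(property = 1 | concept) on demand from the stored counts."""
    return [c / n_objects for c in counts]


# The four objects of the example concept; objects 2 and 4 are identical.
concept_objects = [
    [1, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
]

counts = property_counts(concept_objects)
print(counts)                                                # [1, 3, 3]
print(property_probabilities(counts, len(concept_objects)))  # [0.25, 0.75, 0.75]
```

Only the count vector and the number of objects need to be kept at the node; the object list itself can be discarded.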
The figure shows a concept tree with six concepts. The root concept contains all ten objects in the
data set. The root has two child concepts, one containing four objects and the other containing six
objects. The six-object concept is in turn the parent of three further concepts, which contain three,
two, and one object, respectively. Note that each parent node (relative superordinate concept)
contains all the objects contained by its child nodes (relative subordinate concepts). In Fisher's (1987)
description of COBWEB, he indicates that only the total attribute counts (not conditional probabilities, and
not object lists) are stored at the nodes. Any probabilities are computed from the attribute counts as needed.
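A minimal sketch of such a tree, assuming a hypothetical `ConceptNode` class, is given below. The objects are invented for illustration (only the four-object concept reuses the example above), and each object is placed in a hand-picked node rather than by COBWEB's own placement procedure. The point is only that each node keeps counts, that a parent's counts cover those of its children, and that probabilities are derived when asked for.

```python
from typing import List, Optional


class ConceptNode:
    """A concept that stores only an object count and per-property counts."""

    def __init__(self, n_properties: int, parent: Optional["ConceptNode"] = None):
        self.counts: List[int] = [0] * n_properties
        self.n_objects = 0
        self.parent = parent
        self.children: List["ConceptNode"] = []
        if parent is not None:
            parent.children.append(self)

    def add(self, obj: List[int]) -> None:
        """Record an object here and in every ancestor up to the root."""
        node: Optional[ConceptNode] = self
        while node is not None:
            node.n_objects += 1
            node.counts = [c + v for c, v in zip(node.counts, obj)]
            node = node.parent

    def probability(self, i: int) -> float:
        """P(property i = 1 | concept), computed from counts as needed."""
        return self.counts[i] / self.n_objects


# Tree shape from the text: root -> (4 objects, 6 objects); the latter -> (3, 2, 1).
root = ConceptNode(3)
four = ConceptNode(3, parent=root)
six = ConceptNode(3, parent=root)
leaves = [ConceptNode(3, parent=six) for _ in range(3)]

for obj in ([1, 0, 1], [0, 1, 1], [0, 1, 0], [0, 1, 1]):
    four.add(obj)
# Hypothetical objects for the three leaf concepts.
for obj in ([1, 1, 0], [1, 1, 0], [1, 0, 0]):
    leaves[0].add(obj)
for obj in ([0, 0, 1], [0, 0, 1]):
    leaves[1].add(obj)
leaves[2].add([1, 1, 1])

print(root.n_objects, root.counts)  # 10 [5, 6, 6]
print(six.n_objects, six.counts)    # 6 [4, 3, 3]
print(four.probability(1))          # 0.75
```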
The description language of COBWEB is a "language" only in a loose sense, because, being fully
probabilistic, it is capable of describing any concept. However, if constraints are placed on the probability
ranges which concepts may represent, then a stronger language is obtained. For example, we might permit
only concepts wherein at least one probability differs from 0.5 by more than some fixed threshold. Under
this constraint, with a threshold of at least 0.2 but below 0.4, a concept such as [.6 .5 .7] could not be
constructed by the learner, because none of its probabilities differs from 0.5 by more than 0.2; however, a
concept such as [.6 .5 .9] would be accessible because one probability differs from 0.5 by 0.4, which
exceeds the threshold. Thus, under constraints such as these, we obtain something like a traditional concept
language. In the limiting case where the constraint is applied to every feature with a threshold approaching
0.5, so that every probability in a concept must be 0 or 1, the result is a feature language based on
conjunction; that is, every concept that can be represented can then be described as a conjunction of
features (and their negations), and concepts that cannot be described in this way cannot be represented.
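The constraint can be sketched as a simple predicate. In the Python example below, the names `is_admissible` and `as_conjunction`, the feature labels `f1`, `f2`, ..., and the threshold value 0.3 are all assumptions made for illustration; 0.3 is chosen only because it makes the two example concepts behave as described.

```python
from typing import Sequence


def is_admissible(probabilities: Sequence[float], threshold: float) -> bool:
    """True if at least one probability differs from 0.5 by more than threshold."""
    return any(abs(p - 0.5) > threshold for p in probabilities)


def as_conjunction(probabilities: Sequence[int]) -> str:
    """Describe a concept whose probabilities are all 0 or 1 as a conjunction."""
    if any(p not in (0, 1) for p in probabilities):
        raise ValueError("only 0/1 concepts can be written as a conjunction")
    terms = [f"f{i}" if p == 1 else f"NOT f{i}"
             for i, p in enumerate(probabilities, start=1)]
    return " AND ".join(terms)


THRESHOLD = 0.3  # assumed value, not given in the text

print(is_admissible([0.6, 0.5, 0.7], THRESHOLD))  # False
print(is_admissible([0.6, 0.5, 0.9], THRESHOLD))  # True
print(as_conjunction([1, 0, 1]))                  # f1 AND NOT f2 AND f3
```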
Evaluation criterion
In Fisher's (1987) description of COBWEB, the measure he uses to evaluate the quality of the hierarchy is
Gluck and Corter's (1985) category utility (CU) measure, which he re-derives in his paper. The motivation
for the measure is highly similar to the "information gain" measure introduced by Quinlan for decision tree
learning. It has previously been shown that the CU for feature-based classification is the same as the mutual
information between the feature variables and the class variable (Gluck & Corter, 1985; Corter & Gluck,
1992), and since this measure is much better known, we proceed here with mutual information as the
measure of category "goodness".
What we wish to evaluate is the overall utility of grouping the objects into a particular hierarchical
categorization structure. Given a set of possible classification structures, we need to determine whether one
is better than another.
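As a rough illustration of comparing two candidate groupings, the Python sketch below scores a flat partition by the mutual information between each binary feature and the class variable, computed from the stored counts alone. Summing the per-feature values into a single score is an assumption made here for simplicity, not a formula given in the text, and the names `feature_mutual_information` and `partition_score` are hypothetical.

```python
from math import log2
from typing import Sequence, Tuple

# A concept is summarized as (number of objects, per-property counts).
Concept = Tuple[int, Sequence[int]]


def feature_mutual_information(concepts: Sequence[Concept], i: int) -> float:
    """I(F_i; C) in bits for binary property i across sibling concepts."""
    total = sum(n for n, _ in concepts)
    ones = sum(counts[i] for _, counts in concepts)
    p_value = {1: ones / total, 0: (total - ones) / total}
    mi = 0.0
    for n, counts in concepts:
        p_class = n / total
        for value, joint_count in ((1, counts[i]), (0, n - counts[i])):
            if joint_count == 0:
                continue  # a zero-probability cell contributes nothing
            p_joint = joint_count / total
            mi += p_joint * log2(p_joint / (p_class * p_value[value]))
    return mi


def partition_score(concepts: Sequence[Concept]) -> float:
    """Assumed score: sum of per-feature mutual information with the class."""
    n_properties = len(concepts[0][1])
    return sum(feature_mutual_information(concepts, i) for i in range(n_properties))


# Two ways of splitting the objects [1 0 1], [0 1 1], [0 1 0], [0 1 1].
partition_a = [(1, [1, 0, 1]), (3, [0, 3, 2])]  # {1} versus {2, 3, 4}
partition_b = [(2, [1, 1, 2]), (2, [0, 2, 1])]  # {1, 2} versus {3, 4}

# Partition A separates the first two properties perfectly, so it scores higher.
print(partition_score(partition_a) > partition_score(partition_b))  # True
```

A partition consisting of a single class scores zero under this measure, since knowing the class then says nothing about any feature.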
References
Biswas, G.; Weinberg, J. B.; Fisher, Douglas H. (1998). "Iterate: A conceptual clustering algorithm for data mining". IEEE Transactions on Systems, Man, and Cybernetics - Part C: Applications and Reviews. 28 (2): 100–111. doi:10.1109/5326.669556 (https://doi.org/10.1109%2F5326.669556).
Carpineto, C.; Romano, G. (2014) [1993]. "Galois: An order-theoretic approach to conceptual clustering" (https://books.google.com/books?id=TrqjBQAAQBAJ&pg=PA33). Proceedings of 10th International Conference on Machine Learning, Amherst. pp. 33–40. ISBN 978-1-4832-9862-7.
Fisher, Douglas H. (1987). "Knowledge acquisition via incremental conceptual clustering" (https://link.springer.com/content/pdf/10.1007/BF00114265.pdf) (PDF). Machine Learning. 2 (2): 139–172. doi:10.1007/BF00114265 (https://doi.org/10.1007%2FBF00114265).
Fisher, Douglas H. (1996). "Iterative optimization and simplification of hierarchical clusterings". Journal of Artificial Intelligence Research. 4: 147–178. arXiv:cs/9604103 (https://arxiv.org/abs/cs/9604103). Bibcode:1996cs........4103F (https://ui.adsabs.harvard.edu/abs/1996cs........4103F). doi:10.1613/jair.276 (https://doi.org/10.1613%2Fjair.276). S2CID 9841360 (https://api.semanticscholar.org/CorpusID:9841360).
Fisher, Douglas H.; Langley, Patrick W. (1986). "Conceptual clustering and its relation to numerical taxonomy" (https://inis.iaea.org/search/search.aspx?orig_q=RN:18080906). In Gale, W. A. (ed.). Artificial Intelligence and Statistics. Reading, MA: Addison-Wesley. pp. 77–116. ISBN 978-0-201-11569-7. OCLC 12973461 (https://www.worldcat.org/oclc/12973461).
Fisher, Douglas H.; Pazzani, Michael J. (2014) [1991]. "Computational models of concept learning" (https://www.sciencedirect.com/science/article/pii/B9781483207735500079). In Fisher, D. H.; Pazzani, M. J.; Langley, P. (eds.). Concept Formation: Knowledge and Experience in Unsupervised Learning. San Mateo, CA: Morgan Kaufmann. pp. 3–43. doi:10.1016/B978-1-4832-0773-5.50007-9 (https://doi.org/10.1016%2FB978-1-4832-0773-5.50007-9). ISBN 978-1-4832-2116-8.
Gennari, John H.; Langley, Patrick W.; Fisher, Douglas H. (1989). "Models of incremental concept formation" (https://escholarship.org/uc/item/5r51t42n). Artificial Intelligence. 40 (1–3): 11–61. doi:10.1016/0004-3702(89)90046-5 (https://doi.org/10.1016%2F0004-3702%2889%2990046-5).
Hanson, S. J.; Bauer, M. (1989). "Conceptual clustering, categorization, and polymorphy" (https://doi.org/10.1007%2FBF00116838). Machine Learning. 3 (4): 343–372. doi:10.1007/BF00116838 (https://doi.org/10.1007%2FBF00116838).
Jonyer, I.; Cook, D. J.; Holder, L. B. (2001). "Graph-based hierarchical conceptual clustering". Journal of Machine Learning Research. 2: 19–43. doi:10.1162/153244302760185234 (https://doi.org/10.1162%2F153244302760185234).
Lebowitz, M. (1987). "Experiments with incremental concept formation" (https://doi.org/10.1007%2FBF00114264). Machine Learning. 2 (2): 103–138. doi:10.1007/BF00114264 (https://doi.org/10.1007%2FBF00114264).
Michalski, R. S. (1980). "Knowledge acquisition through conceptual clustering: A theoretical framework and an algorithm for partitioning data into conjunctive concepts" (http://www.mli.gmu.edu/papers/79-80/80-12.pdf) (PDF). International Journal of Policy Analysis and Information Systems. 4: 219–244.
Michalski, R. S.; Stepp, R. E. (1983). "Learning from observation: Conceptual clustering" (http://ebot.gmu.edu/bitstream/handle/1920/1568/83-01.pdf?sequence=1&isAllowed=y) (PDF). In Michalski, R. S.; Carbonell, J. G.; Mitchell, T. M. (eds.). Machine Learning: An Artificial Intelligence Approach. Palo Alto, CA: Tioga. pp. 331–363. ISBN 978-0-935382-05-1. OCLC 455234543 (https://www.worldcat.org/oclc/455234543).
Stepp, R. E.; Michalski, R. S. (1986). "Conceptual clustering: Inventing goal-oriented classifications of structured objects" (http://jbox.gmu.edu/bitstream/handle/1920/1613/86-30.pdf?sequence=1&isAllowed=y) (PDF). In Michalski, R. S.; Carbonell, J. G.; Mitchell, T. M. (eds.). Machine Learning: An Artificial Intelligence Approach. Los Altos, CA: Morgan Kaufmann. pp. 471–498. ISBN 0-934613-00-1.
Talavera, L.; Béjar, J. (2001). "Generality-based conceptual clustering with probabilistic concepts". IEEE Transactions on Pattern Analysis and Machine Intelligence. 23 (2): 196–206. doi:10.1109/34.908969 (https://doi.org/10.1109%2F34.908969).
External links
Bibliography of conceptual clustering (https://web.archive.org/web/20110409095215/http://www.lsi.upc.es/~talavera/conceptual-clustering.html)
Working Python implementation of COBWEB (https://github.com/cmaclell/concept_formation)