Cla 10
Barış Sertkaya
1 Introduction
Formal Concept Analysis (FCA) [28] is a field of applied mathematics that aims
to formalize the notions of a concept and a conceptual hierarchy by means of
mathematical tools. Description Logics (DLs) [3], on the other hand, are a class of
logic-based knowledge representation formalisms that are used to represent the
conceptual knowledge of an application domain in a structured way. Although
the notion of a concept as a collection of objects sharing certain properties, and
the notion of a conceptual hierarchy are fundamental to both FCA and DLs,
the ways concepts are described and obtained differ significantly between these
two research areas. In DLs, the relevant concepts of the application domain are
formalized by so-called concept descriptions, which are expressions built from
unary predicates (that are called atomic concepts), and binary predicates (that
are called atomic roles) with the help of the concept constructors provided by
the DL language. Then in a second step, these concept descriptions are used
to describe properties of individuals occurring in the domain, and the roles are
used to describe relations between these individuals. On the other hand, in FCA,
one starts with a so-called formal context, which in its simplest form is a way of
specifying which attributes are satisfied by which objects. A formal concept of
such a context is a pair consisting of a set of objects called extent, and a set of
attributes called intent such that the intent consists of exactly those attributes
that the objects in the extent have in common, and the extent consists of exactly
those objects that share all attributes in the intent.
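The two derivation operators behind this definition can be sketched in a few lines of Python. The context below (its objects and attributes) is a hypothetical toy example that does not come from the literature cited here, and the brute-force enumeration is chosen for readability, not efficiency.

```python
from itertools import combinations

# Hypothetical toy formal context: object -> set of attributes it satisfies.
CONTEXT = {
    "Austria": {"Country", "Landlocked"},
    "Portugal": {"Country", "Coastal"},
    "Switzerland": {"Country", "Landlocked"},
}
ATTRIBUTES = {"Country", "Landlocked", "Coastal"}


def common_attributes(objects, context=CONTEXT, attributes=ATTRIBUTES):
    """Attributes shared by all given objects (all attributes for no objects)."""
    result = set(attributes)
    for g in objects:
        result &= context[g]
    return result


def common_objects(attrs, context=CONTEXT):
    """Objects that have every attribute in attrs."""
    return {g for g, has in context.items() if set(attrs) <= has}


def formal_concepts(context=CONTEXT, attributes=ATTRIBUTES):
    """All (extent, intent) pairs, found by closing every attribute subset.

    Exponential in the number of attributes; only meant for toy contexts.
    """
    concepts = set()
    ordered = sorted(attributes)
    for k in range(len(ordered) + 1):
        for subset in combinations(ordered, k):
            extent = common_objects(subset, context)
            intent = common_attributes(extent, context, attributes)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts
```

In this toy context, the pair ({Austria, Switzerland}, {Country, Landlocked}) is a formal concept: the two objects share exactly these attributes, and no other object has both of them.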
There are several differences between these approaches. First, in FCA one
starts with a purely extensional description of the application domain, and then
derives the formal concepts of this specific domain, which provide a useful struc-
turing. In a way, in FCA the intensional knowledge is obtained from the exten-
sional part of the knowledge. On the other hand, in DLs the intensional definition
of a concept is given independently of a specific domain (interpretation), and the
description of the individuals is only partial. Second, in FCA the properties are
atomic, and the intensional description of a formal concept (by its intent) is
just a conjunction of such properties. DLs usually provide a richer language for
the intensional definition of concepts, which can be seen as an expressive, yet
decidable sublanguage of first-order predicate logic.
Despite these differences, there have been several attempts to bridge the gap
between these two formalisms, and attempts to apply methods from one field to
the other. For example, there have been efforts to enrich FCA with more complex
properties similar to concept constructors in DLs [60, 45, 44, 23, 46]. On the other
hand, DL research has benefited from FCA methods to solve some problems
encountered in knowledge representation using DLs [1, 55, 10, 12, 48, 14, 49, 52,
16, 7, 53, 50, 4, 5, 13]. The present work aims to give an overview of the work
done to bridge the gap between the two formalisms. In Section 2 we give a
short introduction to DLs without going into technical details. We assume that
the reader is familiar with FCA and therefore do not introduce it here; we refer
the reader to [28] for details. In Section 3 we summarize the existing work done by other
researchers in the field. In Section 4 we summarize our own contributions to the
field, and conclude with Section 5.
2 Description Logics
The following ABox states the facts about the individuals Portugal, Austria,
and Atlantic Ocean.
A := {LandlockedCountry(Austria), Country(Portugal), Ocean(Atlantic Ocean),
hasBorderTo(Portugal, Atlantic Ocean)}
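As a data structure, such an ABox is simply a finite set of concept assertions and role assertions. The following minimal sketch stores the four assertions above and looks them up; the helper names are our own illustrative choices, and the lookups perform no reasoning at all.

```python
# Concept assertions C(a) are stored as (concept, individual) pairs,
# role assertions r(a, b) as (role, subject, object) triples.
concept_assertions = {
    ("LandlockedCountry", "Austria"),
    ("Country", "Portugal"),
    ("Ocean", "Atlantic Ocean"),
}
role_assertions = {
    ("hasBorderTo", "Portugal", "Atlantic Ocean"),
}


def asserted_instances(concept):
    """Individuals explicitly asserted to belong to the concept (no reasoning)."""
    return sorted(a for (c, a) in concept_assertions if c == concept)


def asserted_successors(role, individual):
    """Individuals b such that role(individual, b) is explicitly asserted."""
    return sorted(b for (r, a, b) in role_assertions if r == role and a == individual)
```

Note that asserted_instances("Country") yields only Portugal. Whether Austria is also an instance of Country is not stated in the ABox; it could only follow from a terminology relating LandlockedCountry to Country, which reflects the fact that the description of the individuals is only partial.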
The existing work done by other researchers towards bridging the gap between
FCA and DLs, and attempts to apply methods from one field to the other can
roughly be collected under two categories:
Theory-driven logical scaling In [45], Prediger and Stumme have used DLs in
Conceptual Information Systems, which are data analysis tools based on FCA.
They can be used to extract data from a relational database and to store it
in a formal context by using so-called conceptual scales. Prediger and Stumme
have combined DLs with attribute exploration in order to define a new kind of
conceptual scale. In this approach, DLs provide a rich language to specify which
FCA attributes cannot occur together, and a DL reasoner is used during the
attribute exploration process as an expert to answer the implication questions,
and to provide a counterexample whenever the implication does not hold.
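The question-and-answer protocol between the exploration procedure and the expert can be sketched as follows. In [45] the expert is a DL reasoner; in this sketch a hypothetical finite table of known objects stands in for it, so the code only illustrates the shape of the interaction (accept the implication, or reject it with a counterexample) and not how a reasoner arrives at its answer.

```python
# Hypothetical stand-in for the expert: objects whose attributes are fully known.
KNOWN_OBJECTS = {
    "Austria": {"Country", "Landlocked"},
    "Portugal": {"Country", "Coastal"},
}


def answer_implication(premise, conclusion, known_objects=KNOWN_OBJECTS):
    """Answer the question "does premise -> conclusion hold?".

    Returns (True, None) if every known object that has all premise
    attributes also has all conclusion attributes; otherwise returns
    (False, name) where name is a counterexample.
    """
    for name in sorted(known_objects):
        attrs = known_objects[name]
        if premise <= attrs and not conclusion <= attrs:
            return False, name
    return True, None
```

For instance, the question {Country} → {Landlocked} is rejected with the counterexample Portugal, whereas {Landlocked} → {Country} is accepted.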
Terminological attribute logic In [44], Prediger has worked on introducing
logical constructors into FCA. She has enriched FCA with relations, existential
and universal quantifiers, and negation, obtaining a language like the DL
ALC, which she has called terminologische Merkmalslogik (terminological
attribute logic). In the same work she has also presented applications of her
approach in enriching formal contexts with new knowledge, applications in
many-valued formal contexts, and applications for so-called scales, which are formal
contexts that are used to obtain a standard formal context from a many-valued
formal context.
Relational exploration In his Ph.D. thesis [49], Rudolph has combined DLs
and FCA for acquiring complete relational knowledge about an application domain.
In his approach, which he calls relational exploration, he uses DLs for
defining FCA attributes, and FCA for refining DL knowledge bases. More precisely,
DLs make use of the interactive knowledge acquisition method of FCA,
and FCA benefits from DLs in terms of expressing relational knowledge.
In [48, 49], Rudolph uses the DL FLE for this purpose, which is the DL
that allows for the constructors conjunction, existential restriction, and value
restriction. In his previous work [47], he uses the DL EL, which allows for the
constructors conjunction and existential restriction. In both cases, he defines
the semantics by means of a special pair of formal contexts called binary power
context family, which are used for expressing relations in FCA. Binary power
context families have also been used for giving semantics to conceptual graphs.
In order to collect information about the formulae expressible in FLE, in [48,
49] he defines a formal context called FLE-context. The attributes of this formal
context are FLE-concept descriptions, and the objects are the elements of the
domain over which these concept descriptions are interpreted. In this context,
an object g is in relation with an attribute m if and only if g is in the interpre-
tation of m. Thus, an implication holds in this formal context if and only if in
the given model the concept description resulting from the conjunction of the
attributes in the premise of the implication is subsumed by the concept descrip-
tion formed from the conclusion. This is how implications in FLE-contexts give
rise to subsumption relationships between FLE concept descriptions.
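The correspondence between implications in an FLE-context and subsumption in the given model can be made concrete with a small sketch. The tuple encoding of concept descriptions and the finite model below are hypothetical choices made for illustration only; they are not the representation used in [48, 49].

```python
# Concept descriptions as nested tuples (hypothetical encoding):
#   ("atom", A), ("and", C, D), ("exists", r, C), ("forall", r, C)
MODEL = {
    "domain": {"anna", "ben", "carl"},
    "concepts": {"Doctor": {"anna", "ben"}, "Female": {"anna"}},
    "roles": {"has-child": {("carl", "anna"), ("carl", "ben")}},
}


def extension(concept, model=MODEL):
    """Domain elements that belong to the concept description in the model."""
    kind = concept[0]
    if kind == "atom":
        return set(model["concepts"].get(concept[1], set()))
    if kind == "and":
        return extension(concept[1], model) & extension(concept[2], model)
    if kind in ("exists", "forall"):
        pairs = model["roles"].get(concept[1], set())
        filler = extension(concept[2], model)
        result = set()
        for d in model["domain"]:
            successors = {e for (x, e) in pairs if x == d}
            if kind == "exists" and successors & filler:
                result.add(d)
            if kind == "forall" and successors <= filler:
                result.add(d)
        return result
    raise ValueError(f"unknown constructor: {kind}")


def implication_holds(premise, conclusion, model=MODEL):
    """premise -> conclusion holds in the induced context iff every element
    in all premise attributes is also in all conclusion attributes."""
    def shared_extension(attributes):
        result = set(model["domain"])
        for m in attributes:
            result &= extension(m, model)
        return result
    return shared_extension(premise) <= shared_extension(conclusion)


SOME_DOCTOR = ("exists", "has-child", ("atom", "Doctor"))
ONLY_DOCTORS = ("forall", "has-child", ("atom", "Doctor"))
```

In this model, {∃has-child.Doctor} → {∀has-child.Doctor} holds, since carl is the only element with a child who is a doctor and all of his children are doctors. The converse implication fails, because anna has no children at all and therefore satisfies the value restriction vacuously.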
In order to obtain complete knowledge about the subsumption relationships
in the given model between arbitrary FLE concepts, Rudolph gives a multi-step
exploration algorithm. In the first step of the algorithm, he starts with an FLE-
context whose attributes are the atomic concepts occurring in a knowledge base.
In exploration step i + 1, he defines the set of attributes as the union of the set
of attributes from the first step and the set of concept descriptions formed by
universally quantifying all attributes of the context at step i w.r.t. all atomic
roles, and the set of concept descriptions formed by existentially quantifying all
concept intents of the context at step i w.r.t. all atomic roles. Rudolph points out
that, at an exploration step, there can be some concept descriptions in the at-
tribute set that are equivalent, i.e., attributes that can be reduced. To this aim,
he introduces a method that he calls empiric attribute reduction. In principle,
it is possible to carry out infinitely many exploration steps, which means that
the algorithm will not terminate. In order to guarantee termination, Rudolph re-
stricts the number of exploration steps. After carrying out i steps of exploration,
it is then possible to decide subsumption (w.r.t. the given model) between any
FLE concept descriptions up to role depth i just by using the implication bases
obtained as a result of the exploration steps. In addition, he also characterizes
the cases where finitely many steps are sufficient to acquire complete information
for deciding subsumption between FLE concept descriptions with arbitrary role
depth. Rudolph argues that his method can be used to support the knowledge
engineers in designing, building and refining DL ontologies. This method has
been implemented in the tool Relexo.
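The construction of the attribute set for step i + 1 described above can be sketched as a single function. The tuple encoding and the function name are hypothetical, the concept intents of step i are assumed to be given as input, and empiric attribute reduction is not included.

```python
def next_attributes(base_attributes, step_attributes, step_intents, roles):
    """Attribute set for exploration step i + 1 (illustrative sketch).

    base_attributes: attributes of the first step (atomic concepts)
    step_attributes: attributes of the context at step i
    step_intents:    concept intents of the context at step i (sets of attributes)
    roles:           atomic roles
    """
    # Universally quantify every attribute of step i over every atomic role.
    universal = {("forall", r, m) for r in roles for m in step_attributes}
    # Existentially quantify every concept intent of step i over every atomic
    # role; a frozenset stands for the conjunction of its members.
    existential = {("exists", r, frozenset(b)) for r in roles for b in step_intents}
    return set(base_attributes) | universal | existential
```

With base attributes {A, B}, the same set as step-i attributes, the intents {A} and {A, B}, and a single role r, the function yields six attributes: A, B, ∀r.A, ∀r.B, ∃r.A and ∃r.(A ⊓ B). Since the number of attributes can grow with every step, bounding the number of steps is what keeps the procedure finite.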
Exploring Finite Models in the DL ELgfp In [4] Baader and Distel have
extended classical FCA in order to provide support for analyzing relational
structures by using efficient FCA algorithms. In this approach the atomic at-
tributes are replaced by complex formulae in some logical language, and data
is represented using relational structures rather than just formal contexts. This
extension is later instantiated with attributes defined in the DL EL, and with
relational structures defined over a signature of unary and binary predicates, i.e.,
models for EL. In this setting an implication corresponds to a GCI in EL. This
approach at first sight seems to be very close to the approach introduced
in [48, 49]. One of the main differences between these approaches is that in [4]
the authors use one context with infinitely many complex attributes, whereas
in [49] Rudolph uses an infinite family of contexts, each having finitely many
attributes that are obtained by restricting the role depth of concepts. In [4] the
authors additionally show that for the DLs EL and ELgfp, which extends EL
with cyclic concept definitions interpreted with greatest fixpoint semantics, the
set of GCIs holding in a finite model always has a finite basis. That is, there is
always a finite subset of the infinitely many GCIs from which the rest follows.
Later in [5] the authors have shown how to compute this basis efficiently by
using methods from FCA. In a follow-up paper [22], Distel has described how
this method can be modified to allow ABox individuals as counterexamples to
GCIs.
Unfortunately, the proof of this result does not yield a practical algorithm. Due
to this, in [14, 16, 53] we have developed a more practical approach. Assume
that L1 is a DL for which least common subsumers (without background TBox)
always exist. Given L1(T)-concept descriptions C1, ..., Cn, one can compute a
common subsumer w.r.t. T by just ignoring T, i.e., by treating the defined names
in C1, ..., Cn as primitive and computing the lcs of C1, ..., Cn in L1. However,
the common subsumer obtained this way will usually be too general. In [14, 16,
53], we presented a method for computing “good” common subsumers (gcs) w.r.t.
background TBoxes, which may not be the least common subsumers, but which
are better than the common subsumers computed by ignoring the TBox. In the
present work we do not give the gcs algorithm in detail. We only demonstrate it
on an example. The algorithm is described in detail in [16].
NoSon ≡ ∀has-child.Female,
NoDaughter ≡ ∀has-child.¬Female,
SonRichDoctor ≡ ∀has-child.(Female ⊔ (Doctor ⊓ Rich)),
DaughterHappyDoctor ≡ ∀has-child.(¬Female ⊔ (Doctor ⊓ Happy)),
ChildrenDoctor ≡ ∀has-child.Doctor,
C := ∃has-child.(NoSon ⊓ DaughterHappyDoctor),
D := ∃has-child.(NoDaughter ⊓ SonRichDoctor).
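To illustrate why ignoring the TBox tends to give a result that is too general, the following sketch computes a least common subsumer of C and D while treating all defined names as primitive. It is restricted to conjunction and existential restriction, which suffices for C and D as written above, and uses the standard product construction on description trees. It is a simplified illustration under these assumptions, not the gcs algorithm of [16], and the encoding of concept descriptions is our own.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Concept(NamedTuple):
    """A conjunction of concept names and existential restrictions."""
    names: FrozenSet[str]
    existentials: FrozenSet[Tuple[str, "Concept"]]


def concept(names=(), existentials=()):
    """Convenience constructor; concept() with no arguments is the top concept."""
    return Concept(frozenset(names), frozenset(existentials))


TOP = concept()


def lcs(c, d):
    """Least common subsumer of two such concepts, ignoring any TBox.

    Keeps the concept names common to both conjunctions and, for every
    pair of existential restrictions on the same role, recurses on the
    fillers. The result may contain redundant conjuncts.
    """
    shared_names = c.names & d.names
    shared_existentials = frozenset(
        (r1, lcs(c1, d1))
        for (r1, c1) in c.existentials
        for (r2, d1) in d.existentials
        if r1 == r2
    )
    return Concept(shared_names, shared_existentials)


# C and D from the example, with all defined names treated as primitive.
C = concept(existentials=[("has-child", concept(names=["NoSon", "DaughterHappyDoctor"]))])
D = concept(existentials=[("has-child", concept(names=["NoDaughter", "SonRichDoctor"]))])
```

Here lcs(C, D) is ∃has-child.⊤: the fillers share no concept name once the definitions are ignored, so all information about the children is lost. A method that takes the TBox into account has the chance to retain more of it.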
Example 2. Let our TBox Tcountries contain the following concept definitions:
Moreover, let our ABox Acountries contain the individuals Syria, Turkey, France,
Germany, Switzerland, USA and assume we are interested in the subsumption
relationships between the concept names AsianCountry, EUmember, European-
Country, G8member and MediterreneanCountry. Table 1 shows the partial context
induced by Acountries , and Table 2 shows the questions asked by the completion
algorithm and the answers given to these questions. In order to save space, the
names of the concepts are shortened in both tables. The questions with positive
answers result in extension of the TBox with the following GCIs:
knowledge base (T′countries, A′countries) is complete w.r.t. the initially selected
concept names.
5 Conclusion
We have summarized the work done in combining DLs and FCA. The research
done in this field mainly falls under two categories: 1) efforts to enrich the lan-
guage of FCA by borrowing constructors from DL languages, and 2) efforts to
employ FCA methods in the solution of problems encountered in knowledge rep-
resentation with DLs. For each of these categories we have given pointers and
briefly described the relevant work in the literature. We have also described our
own contributions, which are mainly under the second category.
Recent developments in information technologies like social networks, Web
2.0 applications and semantic web applications are bringing up new challenges
for representing vast amounts of knowledge and analyzing the huge amounts of data
rapidly generated by these applications. The two research areas we have discussed
here, namely DLs and FCA, lie at the core of representing knowledge
and analyzing data, respectively. We are confident that these new challenges
will give rise to fruitful new collaborations between these two research fields.
References