Introduction To Semantic Web Ontology Languages: 1 Organisation of This Chapter
Ontology Languages
In section 2 we discuss general issues and requirements for Web ontology
languages, including semantic issues. We then briefly describe the most
important ontology languages in the design of the Semantic Web, namely RDF Schema
in section 3 and OWL in section 4. Section 5 contains a brief comparison with
other ontology languages. A brief introduction to description logics and their
relation to the OWL family of web ontology languages is included. The chapter
is concluded by a discussion on the importance of having correct and complete
inference engines for web ontology languages.
Even though ontologies have a long history in Artificial Intelligence (AI), the
meaning of this concept still generates a lot of controversy in discussions, both
within and outside of AI. We follow the classical AI definition: an ontology is
a formal specification of a conceptualisation, that is, an abstract and simplified
view of the world that we wish to represent, described in a language that is
equipped with a formal semantics. In knowledge representation, an ontology is a
description of the concepts and relationships in an application domain. Depend-
ing on the users of this ontology, such a description must be understandable by
humans and/or by software agents. In many other fields – such as information
systems and databases, and software engineering – an ontology would
be called a conceptual schema. An ontology is formal, since its understanding
should be unambiguous, both from the syntactic and from the semantic point of
view.
Researchers in AI were the first to develop ontologies with the purpose of
facilitating automated knowledge sharing. Since the beginning of the 1990s,
ontologies have become a popular research topic, and several AI research communities,
including knowledge engineering, knowledge acquisition, natural language
processing, and knowledge representation, have investigated them. More recently,
the notion of an ontology has become widespread in fields such as intelligent
information integration, cooperative information systems, information retrieval,
digital libraries, e-commerce, and knowledge management. Ontologies are widely
regarded as one of the foundational technologies for the Semantic Web: when
annotating web documents with machine-interpretable information concerning
their content, the meaning of the terms used in such an annotation should be
fixed in a (shared) ontology. Research in the Semantic Web has led to the stan-
dardisation of specific web ontology languages.
An ontology language is a means to specify at an abstract level – that is,
at a conceptual level – what is necessarily true in the domain of interest. More
precisely, we can say that an ontology language should be able to express
constraints, which declare what should necessarily hold in any possible concrete
instantiation of the domain. In the following, we will introduce various ways
to impose constraints over domains, by means of statements expressed in some
suitable ontology language.
Class hierarchies Once we have classes we would also like to establish rela-
tionships between them. For example, suppose that we have classes for
– staff members
– academic staff members
– professors
– associate professors
– assistant professors
– administrative staff members
– technical support staff members.
These classes are not unrelated to each other. For example, every professor is
an academic staff member. We say that professor is a subclass of academic staff
member, or equivalently, that academic staff member is a superclass of professor.
The subclass relationship is also called subsumption.
The subclass relationship defines a hierarchy of classes. In general, A is a
subclass of B if every instance of A is also an instance of B.
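The subclass definition can be illustrated with a minimal Python sketch in which each class is modelled as the set of its instances. The instance names other than Michael Maher are hypothetical. The check covers only one concrete instantiation of the domain, whereas a subclass statement in an ontology constrains every possible instantiation.

```python
# Hypothetical class extensions: each class is modelled as the set of its instances.
staff = {"michael_maher", "alice", "bob", "carol"}
academic_staff = {"michael_maher", "alice", "bob"}
professor = {"michael_maher", "alice"}


def is_subclass(a, b):
    """Return True if every instance of a is also an instance of b."""
    return a <= b  # set inclusion


# professor is a subclass of academic staff member in this instantiation
print(is_subclass(professor, academic_staff))
# the converse fails, because "bob" is academic staff but not a professor
print(is_subclass(academic_staff, professor))
# subclass relationships compose along the hierarchy
print(is_subclass(professor, staff))
```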
A hierarchical organisation of classes has a very important practical signifi-
cance, which we outline now. Consider the range restriction
1. a well-defined syntax
2. a well-defined semantics
3. efficient reasoning support
4. sufficient expressive power
5. convenience of expression.
The importance of a well-defined syntax is clear, and known from the area
of programming languages; it is a necessary condition for machine-processing of
information. Web ontology languages have a syntax based on XML, though they
may also have other kinds of syntaxes.
Of course, it is questionable whether the XML-based syntax is very user-friendly;
there are alternatives better suited to humans. However, this drawback
is not very significant, because ultimately users will be developing their ontologies
using authoring tools, or more generally ontology development tools, instead of
writing them directly in the Web ontology language.
Formal semantics describes precisely the meaning of knowledge. “Precisely”
here means that the semantics does not refer to subjective intuitions, nor is
it open to different interpretations by different persons (or machines). The im-
portance of formal semantics is well-established in the domain of mathematical
logic. In the context of ontology languages, the semantics enforces the meaning
of the expressed knowledge as a set of constraints over the domain. Any pos-
sible instantiation of the domain should necessarily conform to the constraints
expressed by the ontology.
Given a statement in an ontology, the role of the semantics is to determine
precisely which are the models of the statement, i.e., all the possible instantiations
of the domain that are compatible with the statement. We say that a statement
is true in an instantiation of the domain if this instantiation is compatible with
the statement; an instantiation of the domain in which a statement is true is of
course a model of the statement, and vice versa. So, an ontology itself determines
a set of models, which is the intersection of the sets of models of the statements
in the ontology. The models of an ontology represent the only possible realisable
situations.
For example, if an ontology states that professor is a subclass of academic
staff member (i.e., in any possible situation, each professor is also an academic
staff member), and if it is known that Michael Maher is a professor (i.e., Michael
Maher is an instance of the professor class), then in any possible situation it
is necessarily true that Michael Maher is an academic staff member, since the
situation in which he would not be an academic staff member is incompatible
with the constraints expressed in the ontology.
If we understand that an ontology language talks basically about classes,
properties and objects of a domain, then a model (i.e., a specific instantiation
of the domain) is nothing other than the precise characterisation, for each object,
of the classes it is an instance of and of the properties it participates in. So, in
the above example, in any model of the ontology Michael Maher should be an
instance of the academic staff member class.
2.3 Reasoning
The fact that the formal semantics associates a set of models with an ontology
allows us to define the notion of deduction. Given an ontology, we say that an
additional statement can be deduced from the ontology if it is true in all the
models of the ontology. This definition of deduction comes from logic and it is
very general but also very strict: if a statement is not true in all the models of an
ontology, then it is not a valid deduction from it. The process of deriving valid
deductions from an ontology is called reasoning.
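The definitions of model and deduction can be made concrete with a small Python sketch. It assumes a toy setting that is not part of the chapter: a fixed finite vocabulary of objects and classes, with an instantiation represented as a set of (object, class) memberships. All function names are hypothetical. Enumerating every instantiation only works for tiny vocabularies; real inference engines do not work this way.

```python
from itertools import product


def instantiations(objects, classes):
    """Yield every instantiation as a frozenset of (object, class) memberships."""
    pairs = [(o, c) for o in objects for c in classes]
    for bits in product([False, True], repeat=len(pairs)):
        yield frozenset(p for p, keep in zip(pairs, bits) if keep)


def subclass_of(sub, sup):
    """Statement: every instance of sub is also an instance of sup."""
    return lambda inst: all((o, sup) in inst for (o, c) in inst if c == sub)


def instance_of(obj, cls):
    """Statement: obj is an instance of cls."""
    return lambda inst: (obj, cls) in inst


def models(ontology, objects, classes):
    """The models of an ontology: instantiations in which every statement is true."""
    return [inst for inst in instantiations(objects, classes)
            if all(statement(inst) for statement in ontology)]


def entails(ontology, statement, objects, classes):
    """A statement can be deduced if it is true in all models of the ontology."""
    return all(statement(m) for m in models(ontology, objects, classes))


OBJECTS = ["michael_maher"]
CLASSES = ["Professor", "AcademicStaff"]
ontology = [subclass_of("Professor", "AcademicStaff"),
            instance_of("michael_maher", "Professor")]
query = instance_of("michael_maher", "AcademicStaff")

print(len(models(ontology, OBJECTS, CLASSES)))
print(entails(ontology, query, OBJECTS, CLASSES))
print(entails(ontology[:1], query, OBJECTS, CLASSES))
```

With both statements the query holds in every model, so it is a valid deduction. With the subclass statement alone, the empty instantiation is a model in which the query is false, so the query cannot be deduced.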
If we consider the typical statements of web ontology languages, the following
deductions (“inferences”) can be introduced:
We now turn to a discussion of specific ontology languages that are based on the
abstract view from the previous section: RDF Schema and OWL. Quite a few
other sources already exist that give general introductions to these languages.
Some parts of the RDF and OWL specifications are intended as such introduc-
tions (in particular [13], [9] and [10]), and also didactic material such as [12] and
[11].
Our presentation is structured along the so-called layering of OWL: OWL
Lite, OWL DL and OWL Full. This layering is motivated by different require-
ments that different users have for a Web ontology language:
Before discussing the language primitives of OWL Lite, we first discuss language
elements from RDF and RDF Schema (RDF(S) for short). Solely to simplify the
presentation in this tutorial by obtaining a strict layering between
RDF(S) and OWL Lite, we will restrict our discussion of RDF(S) to the case
where the vocabulary is strictly partitioned and the meta-modelling and reification
facilities are forbidden, as described in [12]; this is also called “type separation” in [9]:
“Any resource is allowed to be only a class, a data type, a data type
property, an object property, an individual, a data value, or part of the
built-in vocabulary, and not more than one of these. This means that,
for example, a class cannot at the same time be an individual, [...]”
Under this restriction, we have the following strict language inclusion relation-
ship:
The most elementary building block of RDF(S) is a class, which defines a group
of individuals that belong together because they share some properties. The
following states that an instance e belongs to a class c:
Individual(e type(c)) (“e is of type c”).
4
Note that the semantics of the same constructs in RDF(S) and OWL can differ.
The second elementary statement of RDF(S) is the subsumption relation be-
tween classes: subClassOf:
subClassOf(ci cj )
In RDF, instances are related to other instances through properties:
Individual(ei value(p ej ))
Properties are characterised by their domain and range:
SubPropertyOf(o1 : pi o2 : pj )
RDF and RDFS allow the representation of some ontological knowledge. The main
modelling primitives of RDF/RDFS concern the organisation of vocabularies in typed
hierarchies: subclass and subproperty relationships, domain and range restrictions, and
instances of classes. However a number of other features are missing. Here we list a
few:
SameIndividual(ei ej )
Besides equality between instances, OWL Lite also introduces constructions to state
equality between classes and between properties. Although such equalities could already
be expressed in an indirect way in RDF(S) (e.g., through a pair of mutual SubClassOf
or SubPropertyOf statements), this can be done directly in OWL Lite:
EquivalentClasses(ci cj )
EquivalentProperties(pi pj )
DifferentIndividuals(ei ej )
Because inequality between individuals is an often occurring and important statement
(in many ontologies, all differently named individuals are assumed to be different, i.e.
they embrace the unique name assumption), OWL Lite provides an abbreviated form:
DifferentIndividuals(e1 ... e4 )
abbreviates the six DifferentIndividuals statements that would have been required
for this.
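The abbreviation can be checked with a short Python sketch that expands an n-ary DifferentIndividuals statement into its pairwise form. The function name is hypothetical, and the output strings merely imitate the abstract syntax used in this chapter.

```python
from itertools import combinations


def expand_different_individuals(individuals):
    """Expand an n-ary DifferentIndividuals statement into pairwise statements."""
    return [f"DifferentIndividuals({a} {b})"
            for a, b in combinations(individuals, 2)]


statements = expand_different_individuals(["e1", "e2", "e3", "e4"])
for s in statements:
    print(s)
print(len(statements))  # number of unordered pairs among four individuals
```

For n individuals the expansion contains n(n-1)/2 statements, which gives the six statements mentioned above for n = 4.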
Whereas the above constructions are aimed at instances and classes, OWL Lite
also has constructs specifically aimed at properties. An often occurring phenomenon is
that a property can be modelled in two directions. Examples are ownerOf vs. ownedBy,
contains vs. isContainedIn, childOf vs. parentOf and countless others. The relationship
between such pairs of properties is established by stating
ObjectProperty(pi inverseOf(pj ))
Other vocabulary in OWL Lite (TransitiveProperty and SymmetricProperty)
modifies a single property, rather than establishing a relation between two properties:
ObjectProperty(o1 : pi Transitive)
ObjectProperty(o1 : pi Symmetric)
5
but motivated by a deliberate design decision concerning the computational and
conceptual complexity of the language
The main limitation of RDF(S) in representing knowledge in terms of concepts and their
properties is its inability to use properties in the local context of a class. As we have
already noted, a property has a unique definition for its domain and for its range,
and moreover the participation constraints of the instances of the domain and range
classes to the property are not specifiable in RDF(S). So, in RDF(S) it is impossible to
state whether a property is optional or required for the instances of the class (in other
words: should it have at least one value or not), and whether it is single- or multi-valued
(in other words: is it allowed to have more than one value or not). Technically, these
restrictions constitute 0/1-cardinality constraints on the property. The case where a
property is allowed to have at most one value for a given instance (i.e. a max-cardinality
of 1) has a special name: FunctionalProperty. The case where the value of a property
uniquely identifies the instance of which it is a value (i.e. the inverse property has a
max-cardinality of 1) is called InverseFunctionalProperty. These two constructions
allow for some interesting derivations under the OWL semantics: If an ontology models
that any object can only have a single “age”:
DifferentIndividuals(ei ej )
(if two objects ei and ej have a different age, they must be different objects). Similarly,
if an ontology states that social security numbers uniquely identify individuals, i.e.
ObjectProperty(hasSSN InverseFunctional)
then the two facts
SameIndividual(ei ej )
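The two kinds of derivation described above can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: facts are (subject, property, value) triples, the individuals e1 to e4 and the values are invented, and data values that look different are taken to be different. The function name is hypothetical.

```python
from itertools import combinations


def infer_identity(facts, functional=(), inverse_functional=()):
    """Derive same/different pairs of individuals from (subject, property, value) facts."""
    same, different = set(), set()
    for (s1, p1, v1), (s2, p2, v2) in combinations(facts, 2):
        if p1 != p2 or s1 == s2:
            continue
        # Inverse functional: a shared value identifies one and the same individual.
        if p1 in inverse_functional and v1 == v2:
            same.add(frozenset((s1, s2)))
        # Functional: two different values cannot belong to the same individual.
        if p1 in functional and v1 != v2:
            different.add(frozenset((s1, s2)))
    return same, different


# Hypothetical facts, invented for this sketch.
facts = [
    ("e1", "hasSSN", "SSN-001"),
    ("e2", "hasSSN", "SSN-001"),
    ("e3", "age", 30),
    ("e4", "age", 40),
]
same, different = infer_identity(
    facts, functional={"age"}, inverse_functional={"hasSSN"})
print(same)       # pairs that would be stated with SameIndividual
print(different)  # pairs that would be stated with DifferentIndividuals
```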
Although RDF(S) already allows us to state domain and range restrictions, these are very
limited. OWL Lite allows more refined versions of these, local to the definition of a class:
ObjectProperty(p range(cj ))
which says that all pi -values must be members of cj , irrespective of whether they are
members of ci or not. This allows us to use the same property-name pi with different
range restrictions cj depending on the class ci to which pi is applied. For example, take
for pi the property Parent. Then Parents of cats are cats, while Parents of dogs are
dogs. An RDF(S) range restriction would not be able to capture this.
Similarly, although in RDF(S) we can define the range of a property, we cannot
enforce that properties actually do have a value: we can state that authors write books:
4.2 OWL DL
With the step from OWL Lite to OWL DL, we obtain a number of additional language
constructs, which simplify the writing of an ontology, even if most of them could be
written anyway in OWL Lite as macros. It is often useful to say that two classes are
disjoint (which is much stronger than saying they are merely not equal):
DisjointClasses(ci cj )
OWL DL allows arbitrary Boolean algebraic expressions on either side of an equality
or subsumption relation. For example
SubClassOf(ci unionOf(cj ck ))
In other words: ci is not necessarily subsumed by either cj or ck , but is subsumed by their union.
Similarly
EquivalentClasses(ci intersectionOf(cj ck ))
in other words: although ci is subsumed by cj and ck (a statement already expressible
in RDF(S)), stating that ci is equivalent to their intersection is much stronger. An
obvious example to think of here is “old men”: “old men” are not just both old and
men, but they are exactly the intersection of these two properties.
Of course, the unionOf and intersectionOf may be taken over more than two
classes, and may occur in arbitrary Boolean combinations.
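The difference between being subsumed by two classes and being equivalent to their intersection can be seen in a small Python sketch over one hypothetical instantiation, with classes modelled as sets of invented individuals.

```python
# Hypothetical extensions of the two classes in one instantiation.
old = {"al", "bo", "cy"}
men = {"bo", "cy", "di"}


def subsumed_by_both(c):
    """c is a subclass of both old and men."""
    return c <= old and c <= men


def equivalent_to_intersection(c):
    """c contains exactly the individuals that are both old and men."""
    return c == (old & men)


candidate = {"bo"}       # subsumed by both classes, but misses "cy"
old_men = {"bo", "cy"}   # exactly the intersection

print(subsumed_by_both(candidate), equivalent_to_intersection(candidate))
print(subsumed_by_both(old_men), equivalent_to_intersection(old_men))
```

The class `candidate` satisfies the two subsumption statements without being the class of old men, which is why the equivalence statement is the stronger one.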
Besides disjunction (unionOf) and conjunction (intersectionOf), OWL DL com-
pletes the Boolean algebra by providing a construct for negation: complementOf:
complementOf(ci cj )
In fact, arbitrary class expressions can be used on either side of subsumption or equiv-
alence axioms.
Note that all the additional OWL DL constructs introduced so far, are also in-
directly expressible already in OWL Lite. For example, the disjointness between two
classes ci and cj can be expressed by means of the following two statements in OWL
Lite, for some fresh new property p:
Table 1. Comparison of web ontology languages with respect to concepts and tax-
onomies (taken from [16])
past. A comparison of these older languages is reported in [16]. We will now briefly
review the results of this comparison and discuss implications for our work.
Besides RDF Schema and OWL (see footnote 6), which have been introduced above, the
comparison reported in [16] includes the following languages, which have been selected on
the basis of their aim of supporting knowledge representation on the Web and their
compatibility with the Web standards XML or RDF.
– XOL (XML-based ontology language). XOL [4] has been proposed as a language
for exchanging formal knowledge models in the domain of bio-informatics. The
development of XOL has been guided by the representational needs of the domain
and by existing frame-based knowledge representation languages.
– SHOE (simple HTML ontology extension). SHOE [6] was created as an extension
of HTML for the purpose of defining machine-readable semantic knowledge. The
aim of SHOE is to enable intelligent Web agents to retrieve and gather knowledge
more precisely than is possible in the presence of plain HTML documents.
– OML (ontology markup language). OML [5] is an ontology language that was
initially developed as an XML serialisation of SHOE. Meanwhile, the language
consists of different layers with increasing expressiveness. The semantics, especially
of the higher levels, is largely based on the notion of conceptual graphs. In the
comparison, however, only a less expressive subset of OML (simple OML) is
considered.
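– OIL (ontology inference layer). OIL [3] is an attempt to develop an ontology
language for the Web that has a well-defined semantics and sophisticated reasoning
support for ontology development and use. The language is constructed in a
layered way, starting with core-OIL, which provides a formal semantics for RDF Schema,
followed by standard-OIL, which is equivalent to an expressive description logic with
reasoning support, and Instance OIL, which adds the possibility of defining instances.
6
Actually, [16] discusses DAML+OIL instead of OWL. DAML+OIL [8] is the direct
precursor of OWL, and all of the conclusions from [16] about DAML+OIL are also
valid for OWL.
Figure 1. Syntax rules for ALCQI concepts and roles.
\[
\begin{array}{rcl@{\quad}l@{\quad}l}
C, D & \rightarrow & A \mid & \texttt{A} & \text{(primitive concept)} \\
 & & \top \mid & \texttt{top} & \text{(top)} \\
 & & \bot \mid & \texttt{bottom} & \text{(bottom)} \\
 & & \neg C \mid & \texttt{(not C)} & \text{(complement)} \\
 & & C \sqcap D \mid & \texttt{(and C D ...)} & \text{(conjunction)} \\
 & & C \sqcup D \mid & \texttt{(or C D ...)} & \text{(disjunction)} \\
 & & \forall R.\,C \mid & \texttt{(all R C)} & \text{(universal quantifier)} \\
 & & \exists R.\,C \mid & \texttt{(some R C)} & \text{(existential quantifier)} \\
 & & f{\uparrow} \mid & \texttt{(undefined f)} & \text{(undefinedness)} \\
 & & f : C \mid & \texttt{(in f C)} & \text{(selection)} \\
 & & {\geq} n\, R.\,C \mid & \texttt{(atleast n R C)} & \text{(min cardinality)} \\
 & & {\leq} n\, R.\,C & \texttt{(atmost n R C)} & \text{(max cardinality)} \\
R & \rightarrow & P \mid & \texttt{P} & \text{(primitive role)} \\
 & & f \mid & \texttt{f} & \text{(primitive feature)} \\
 & & R^{-1} & \texttt{(inverse R)} & \text{(inverse role)}
\end{array}
\]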
We have to mention that there is a strong relationship between the OIL language and
RDF Schema as well as DAML+OIL. OIL extends RDF Schema and has been the
main influence in the development of DAML+OIL. The main difference between OIL
and DAML+OIL is an extended expressiveness of DAML+OIL in terms of complex
definitions of individuals and data types. DAML+OIL in turn has been the basis for
the development of OWL, which carries the stamp of an official W3C recommendation.
All observations on DAML+OIL in this comparison also apply to OWL.
6 Description Logics
We now briefly introduce description logics, the logic-based formalism behind
the OWL family of web ontology languages. The parallel with the OWL family of web
ontology languages will become clear from this brief section. An extensive
treatment of description logics, from friendly introductory chapters, to the theoretical
results, up to the description of applications and systems, can be found in the Handbook
of Description Logics [1]. Consistently with the informal notion of semantics introduced
above for the web ontology languages, description logics are considered as a structured
fragment of predicate logic. ALC is the minimal description language including full
negation and disjunction, i.e., propositional calculus.
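Figure 2. Extensional semantics of ALCQI.
\[
\begin{array}{rcl}
\top^{\mathcal{I}} & = & \Delta^{\mathcal{I}} \\
\bot^{\mathcal{I}} & = & \emptyset \\
(\neg C)^{\mathcal{I}} & = & \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}} \\
(C \sqcap D)^{\mathcal{I}} & = & C^{\mathcal{I}} \cap D^{\mathcal{I}} \\
(C \sqcup D)^{\mathcal{I}} & = & C^{\mathcal{I}} \cup D^{\mathcal{I}} \\
(\forall R.\,C)^{\mathcal{I}} & = & \{ i \in \Delta^{\mathcal{I}} \mid \forall j.\ R^{\mathcal{I}}(i,j) \Rightarrow C^{\mathcal{I}}(j) \} \\
(\exists R.\,C)^{\mathcal{I}} & = & \{ i \in \Delta^{\mathcal{I}} \mid \exists j.\ R^{\mathcal{I}}(i,j) \wedge C^{\mathcal{I}}(j) \} \\
(f{\uparrow})^{\mathcal{I}} & = & \Delta^{\mathcal{I}} \setminus \mathrm{dom}\, f^{\mathcal{I}} \\
(f : C)^{\mathcal{I}} & = & \{ i \in \mathrm{dom}\, f^{\mathcal{I}} \mid C^{\mathcal{I}}(f^{\mathcal{I}}(i)) \} \\
({\geq} n\, R.\,C)^{\mathcal{I}} & = & \{ i \in \Delta^{\mathcal{I}} \mid \sharp\{ j \in \Delta^{\mathcal{I}} \mid R^{\mathcal{I}}(i,j) \wedge C^{\mathcal{I}}(j) \} \geq n \} \\
({\leq} n\, R.\,C)^{\mathcal{I}} & = & \{ i \in \Delta^{\mathcal{I}} \mid \sharp\{ j \in \Delta^{\mathcal{I}} \mid R^{\mathcal{I}}(i,j) \wedge C^{\mathcal{I}}(j) \} \leq n \} \\
(R^{-1})^{\mathcal{I}} & = & \{ (i,j) \in \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \mid R^{\mathcal{I}}(j,i) \}
\end{array}
\]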
The basic types of a DL language are concepts, roles, and features. A concept is a
description gathering the common properties among a collection of individuals; from a
logical point of view it is a unary predicate ranging over the domain of individuals. A
concept corresponds to a class in the web ontology languages. Inter-relationships be-
tween these individuals are represented either by means of roles (which are interpreted
as binary relations over the domain of individuals) or by means of features (which are
interpreted as partial functions over the domain of individuals). Roles correspond to
properties of RDF and OWL, while features correspond to functional properties. In this
Section, we will consider the Description Logic ALCQI, extending ALC with qualified
cardinality restrictions and inverse roles.
According to the syntax rules of Figure 1, ALCQI concepts (denoted by the letters
C and D) are built out of primitive concepts (denoted by the letter A), roles (denoted
by the letter R), and primitive features (denoted by the letter f ); roles are built out
of primitive roles (denoted by the letter P ) and primitive features. The top part of
Figure 1 defines the ALC sublanguage. Please also note that features are introduced
as shortcuts; in fact, they can be expressed by means of axioms using cardinality
restrictions, as we already noticed for OWL DL.
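Let us now consider the formal semantics of ALCQI. We define the meaning of
concepts as sets of individuals, as for unary predicates, and the meaning of roles as
sets of pairs of individuals, as for binary predicates. This is the formalised notion of
instantiation of the domain we introduced at the beginning of this chapter. Formally, an
interpretation is a pair $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consisting of a set $\Delta^{\mathcal{I}}$ of individuals (the domain
of $\mathcal{I}$) and a function $\cdot^{\mathcal{I}}$ (the interpretation function of $\mathcal{I}$) mapping every concept to a
subset of $\Delta^{\mathcal{I}}$, every role to a subset of $\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$, and every feature to a partial function
from $\Delta^{\mathcal{I}}$ to $\Delta^{\mathcal{I}}$, such that the equations in Figure 2 are satisfied. The semantics of the
language can also be given by stating equivalences among expressions of the language
and First Order Logic formulae. An atomic concept $A$, an atomic role $P$, and an atomic
feature $f$ are mapped respectively to the open formulae $A(\gamma)$, $P(\alpha, \beta)$, and $f(\alpha, \beta)$ –
with $f$ a functional relation, also written $f(\alpha) = \beta$. Figure 3 gives the transformational
semantics of ALCQI expressions in terms of equivalent FOL well-formed formulae. A
concept $C$ and a role $R$ correspond to the FOL open formulae $F_C(\gamma)$ and $F_R(\alpha, \beta)$
respectively. It is worth noting that, using the standard model-theoretic semantics, the
extensional semantics of Figure 2 can be derived from the transformational semantics
of Figure 3.
Figure 3. Transformational semantics of ALCQI in terms of FOL formulae.
\[
\begin{array}{rcl}
\top^{\mathcal{I}} & \sim & \mathit{true} \\
\bot^{\mathcal{I}} & \sim & \mathit{false} \\
(\neg C)^{\mathcal{I}} & \sim & \neg F_C(\gamma) \\
(C \sqcap D)^{\mathcal{I}} & \sim & F_C(\gamma) \wedge F_D(\gamma) \\
(C \sqcup D)^{\mathcal{I}} & \sim & F_C(\gamma) \vee F_D(\gamma) \\
(\exists R.\,C)^{\mathcal{I}} & \sim & \exists x.\ F_R(\gamma, x) \wedge F_C(x) \\
(\forall R.\,C)^{\mathcal{I}} & \sim & \forall x.\ F_R(\gamma, x) \Rightarrow F_C(x) \\
(f{\uparrow})^{\mathcal{I}} & \sim & \neg \exists x.\ f(\gamma, x) \\
(f : C)^{\mathcal{I}} & \sim & \exists x.\ f(\gamma, x) \wedge F_C(x) \\
({\geq} n\, R.\,C)^{\mathcal{I}} & \sim & \exists^{\geq n} x.\ F_R(\gamma, x) \wedge F_C(x) \\
({\leq} n\, R.\,C)^{\mathcal{I}} & \sim & \exists^{\leq n} x.\ F_R(\gamma, x) \wedge F_C(x) \\
(R^{-1})^{\mathcal{I}} & \sim & F_R(\beta, \alpha)
\end{array}
\]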
For example, we can consider the concept of happy fathers, defined using the
primitive concepts Man, Doctor, Rich, Famous and the roles CHILD, FRIEND. The
concept happy fathers can be expressed in ALCQI as
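\[
\mathsf{Man} \sqcap (\exists \mathsf{CHILD}.\,\top) \sqcap \forall \mathsf{CHILD}.\,(\mathsf{Doctor} \sqcap \exists \mathsf{FRIEND}.\,(\mathsf{Rich} \sqcup \mathsf{Famous})),
\]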
i.e., those men having some child and all of whose children are doctors having some
friend who is rich or famous.
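To see how the extensional semantics of Figure 2 determines the instances of such a concept, the following Python sketch evaluates concept expressions over one small finite interpretation. It is a sketch under stated assumptions: concepts are encoded as nested tuples modelled on the concrete syntax of Figure 1, features are left out, and the individuals and the function names `ext` and `role_ext` are invented for illustration.

```python
def role_ext(role, I):
    """Extension of a role: a primitive role name or ("inverse", R)."""
    if isinstance(role, tuple) and role[0] == "inverse":
        return {(j, i) for (i, j) in role_ext(role[1], I)}
    return I["roles"].get(role, set())


def ext(concept, I):
    """Extension of a concept in the finite interpretation I (features omitted)."""
    domain = I["domain"]
    if concept == "top":
        return set(domain)
    if concept == "bottom":
        return set()
    if isinstance(concept, str):  # primitive concept
        return set(I["concepts"].get(concept, set()))
    op = concept[0]
    if op == "not":
        return domain - ext(concept[1], I)
    if op == "and":
        result = set(domain)
        for c in concept[1:]:
            result &= ext(c, I)
        return result
    if op == "or":
        result = set()
        for c in concept[1:]:
            result |= ext(c, I)
        return result
    if op == "some":
        R, C = role_ext(concept[1], I), ext(concept[2], I)
        return {i for i in domain if any(a == i and b in C for (a, b) in R)}
    if op == "all":
        R, C = role_ext(concept[1], I), ext(concept[2], I)
        return {i for i in domain if all(b in C for (a, b) in R if a == i)}
    if op in ("atleast", "atmost"):
        n, R, C = concept[1], role_ext(concept[2], I), ext(concept[3], I)
        count = lambda i: sum(1 for (a, b) in R if a == i and b in C)
        if op == "atleast":
            return {i for i in domain if count(i) >= n}
        return {i for i in domain if count(i) <= n}
    raise ValueError(f"unknown constructor: {op!r}")


# A hypothetical interpretation with five invented individuals.
I = {
    "domain": {"tom", "joe", "ann", "bob", "sue"},
    "concepts": {"Man": {"tom", "joe"}, "Doctor": {"ann", "bob"},
                 "Rich": {"sue"}, "Famous": set()},
    "roles": {"CHILD": {("tom", "ann"), ("tom", "bob"), ("joe", "sue")},
              "FRIEND": {("ann", "sue"), ("bob", "sue")}},
}

happy_father = ("and", "Man", ("some", "CHILD", "top"),
                ("all", "CHILD",
                 ("and", "Doctor", ("some", "FRIEND", ("or", "Rich", "Famous")))))

print(ext(happy_father, I))
```

In this interpretation only `tom` is a happy father: `joe` is a man with a child, but his child `sue` is not a doctor.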
An ontology is called in DL a knowledge base, and formally it is a finite set $\Sigma$
of terminological axioms – these are the ontology statements; it can also be called a
terminology or TBox. For a concept name $A$, and (possibly complex) concepts $C$, $D$,
terminological axioms are of the form $A \doteq C$ (concept definition), $A \sqsubseteq C$ (primitive
concept definition), $C \sqsubseteq D$ (general inclusion statement). An interpretation $\mathcal{I}$ satisfies
$C \sqsubseteq D$ if and only if the interpretation of $C$ is included in the interpretation of $D$, i.e.,
$C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$. It is clear that the last kind of axiom is a generalisation of the first two:
concept definitions of the type $A \doteq C$ – where $A$ is an atomic concept – can be reduced
to the pair of axioms ($A \sqsubseteq C$) and ($C \sqsubseteq A$). Another class of terminological axioms –
pertaining to roles $R$, $S$ – are of the form $R \sqsubseteq S$. Again, an interpretation $\mathcal{I}$ satisfies
$R \sqsubseteq S$ if and only if the interpretation of $R$ – which is now a set of pairs of individuals
– is included in the interpretation of $S$, i.e., $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$. An interpretation $\mathcal{I}$ is a model
of a knowledge base $\Sigma$ iff every terminological axiom of $\Sigma$ is satisfied by $\mathcal{I}$. If $\Sigma$ has
a model, then it is satisfiable; thus, checking for KB satisfiability is deciding whether
there is at least one model for the knowledge base. $\Sigma$ logically implies an axiom $\alpha$
(written $\Sigma \models \alpha$) if $\alpha$ is satisfied by every model of $\Sigma$. We say that a concept $C$ is
subsumed by a concept $D$ in a knowledge base $\Sigma$ (written $\Sigma \models C \sqsubseteq D$) if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$
for every model $\mathcal{I}$ of $\Sigma$. For example, the concept
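\[
\mathsf{Person} \sqcap (\exists \mathsf{CHILD}.\,\mathsf{Person})
\]
denoting the class of parents, i.e., the persons having at least one child who is a
person, subsumes the concept
\[
\mathsf{Man} \sqcap (\exists \mathsf{CHILD}.\,\top) \sqcap \forall \mathsf{CHILD}.\,(\mathsf{Doctor} \sqcap \exists \mathsf{FRIEND}.\,(\mathsf{Rich} \sqcup \mathsf{Famous}))
\]
denoting the class of happy fathers – with respect to the following knowledge base
$\Sigma$:
\[
\begin{array}{rcl}
\mathsf{Doctor} & \doteq & \mathsf{Person} \sqcap \exists \mathsf{DEGREE}.\,\mathsf{Phd}, \\
\mathsf{Man} & \doteq & \mathsf{Person} \sqcap \mathsf{sex} : \mathsf{Male},
\end{array}
\]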
i.e., every happy father is also a person having at least one child, given the background
knowledge that men are male persons, and that doctors are persons.
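The reasoning behind this subsumption can be spelled out step by step. In the sketch below, $\mathsf{HappyFather}$ is a hypothetical abbreviation for the happy fathers concept; each step uses either a definition from $\Sigma$ or a consequence of the semantics of the quantifiers.

```latex
\begin{align*}
\mathsf{HappyFather} &\sqsubseteq \mathsf{Man} \sqsubseteq \mathsf{Person}
  && \text{(definition of $\mathsf{Man}$ in $\Sigma$)} \\
\mathsf{HappyFather} &\sqsubseteq (\exists \mathsf{CHILD}.\,\top) \sqcap \forall \mathsf{CHILD}.\,\mathsf{Doctor}
  && \text{(dropping conjuncts)} \\
 &\sqsubseteq \exists \mathsf{CHILD}.\,\mathsf{Doctor}
  && \text{(since $(\exists R.\,\top) \sqcap \forall R.\,C \sqsubseteq \exists R.\,C$)} \\
 &\sqsubseteq \exists \mathsf{CHILD}.\,\mathsf{Person}
  && \text{(definition of $\mathsf{Doctor}$ in $\Sigma$)} \\
\mathsf{HappyFather} &\sqsubseteq \mathsf{Person} \sqcap (\exists \mathsf{CHILD}.\,\mathsf{Person})
  && \text{(combining the first and fourth lines)}
\end{align*}
```

Both definitions in $\Sigma$ are needed: without them, nothing would relate $\mathsf{Man}$ or $\mathsf{Doctor}$ to $\mathsf{Person}$.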
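A concept $C$ is satisfiable, given a knowledge base $\Sigma$, if there is at least one model
$\mathcal{I}$ of $\Sigma$ such that $C^{\mathcal{I}} \neq \emptyset$, i.e. $\Sigma \not\models C \equiv \bot$. For example, the concept
\[
(\exists \mathsf{CHILD}.\,\mathsf{Man}) \sqcap (\forall \mathsf{CHILD}.\,(\mathsf{sex} : \neg \mathsf{Male}))
\]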
is unsatisfiable with respect to the above knowledge base Σ. In fact, an individual
whose children are not male cannot have a child being a man.
References