Verifying CIM Models of Apache Web-Server Configurations
1. Introduction
For today's complex software systems, development does not end at the manufacturer's side
but extends to the client's side, because standard software products must frequently be
configured after delivery to meet particular needs or policies. Hence, software verification
and validation are no longer confined to the traditional software development process but
must extend to the configuration phase at the client site. For the particular example treated
in this paper, a formal verification of a Web-server cannot be considered complete if it does
not extend to particular installations, i.e. configurations, of the product. A flawed
Web-server configuration may cause malfunctions just as if the software itself were flawed,
and may constitute a serious security breach, even if the product as such were perfect.
Thus, the problem of verifying and validating particular system configurations is at least as
important as that of traditional software verification. It is even more important in the
sense that, in general, product (re-)configuration by the customer occurs much more frequently
than product completion by the manufacturer, and because traditional software quality
assurance methods hardly extend to the configuration phase at the client's site. For the
Apache Web-server, the Netcraft survey (http://www.netcraft.com/survey) has determined over
13,000,000 installations as of July 2003, all of which have been configured individually to
meet each site's demands.
Viktor Mihajlovski
Linux Technology Center (LTC)
IBM Lab. Böblingen, Germany
In this paper we attempt a first step towards an ideal scenario where a configurable software
product is delivered together with an expert system containing configuration constraints,
such that, based on formal methods, the client can
- check a concrete configuration against the manufacturer's set of universal formal
  configuration constraints;
- set up site-specific local configuration constraints that can be checked automatically.
Moreover, the expert system should
- suggest possible actions of repair to bring a flawed installation back into compliance with
  the established constraints;
- contain a constraint editor which helps both manufacturer and client in setting up
  non-contradictory constraints which are actually satisfiable.
... otherwise the theorems which are proved hold only of the model
and not of the real application.
In our previous work, we could successfully bridge the formalization gap in two applications.
One is an industrial information system used for the configuration of motor cars [10]. There,
propositional logic formulae are used in rules that control order modification and checking.
The other was the verification of an expert system that is part of a larger system management
application [15, 16]. This expert system contains situation-action rules which we modeled
using PDL (Propositional Dynamic Logic, [8]).
For our present work, an important part of our success is due to the use of a semi-formal
intermediate model of the Apache Web-server formulated in terms of the Common Information
Model (CIM) standard [3]. Starting from the CIM model of Apache, it was feasible to build a
faithful formal model of constraints and to feed real system values via CIM-based software
into the variables of our constraint formulae. From a practical point of view, the existence
of CIM and its usefulness outside of formal verification is extremely important, because
industry hardly ever builds abstract models for formal verification purposes only. The CIM
standard has been developed and is being used to provide abstract system management
interfaces to complex systems. A CIM interface presents an object-oriented view of the
underlying system and provides abstract interfaces to retrieve and manipulate configuration
data. Hence, the system manufacturer maintains a faithful CIM model of the implemented system
independent of any formal verification, because it greatly aids in investigating and
manipulating the configuration of a system without resorting to implementation details.
It is the core of our approach to hook into the CIM model and start the formal modeling from
there. Using CIM software, we can feed real system values into our abstract constraint sets
and we can even attempt to repair a configuration under the control of our expert system.
Moreover, since our methodology is built on CIM, our work is not particular to the Apache
Web-server but can be applied in principle to other systems for which a CIM model exists.
system, regardless of manufacturer, architecture, or operating system. As an information model, CIM focuses on
standardizing data-semantics and uniform interfaces, and is
independent of any encoding and protocol considerations.
The Web-Based Enterprise Management (WBEM) Initiative is defining additional standards for
CIM implementation interoperability (like operational semantics and communication protocols).
Any hardware or software system component is called a managed element and is represented as
an instance of a CIM class. Instances contain properties (name/value pairs) describing units
of data. All properties are accessible through uniform getter/setter operations. Some of the
properties may be declared as key properties, with the intended meaning that each CIM object
is uniquely identifiable among its other class members by these properties.
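The notions of properties, uniform getter/setter access and key properties can be illustrated with a minimal Python sketch. The class `CimInstance`, its method names and the sample values are hypothetical and only mimic the concepts described above; they are not part of any CIM or WBEM API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class CimInstance:
    """Hypothetical stand-in for a CIM instance: a class name plus name/value properties."""
    class_name: str
    properties: Dict[str, Any] = field(default_factory=dict)
    key_names: Tuple[str, ...] = ()

    def get_property(self, name: str) -> Any:
        # Uniform getter; an unset property yields None (standing in for NULL).
        return self.properties.get(name)

    def set_property(self, name: str, value: Any) -> None:
        # Uniform setter, usable for every property.
        self.properties[name] = value

    def key(self) -> Tuple[Any, ...]:
        # The values of the key properties identify the instance within its class.
        return tuple(self.properties.get(k) for k in self.key_names)


# Sample values are invented for illustration.
server = CimInstance(
    "Apache_HttpServerProperties",
    {"ConfigName": "main", "MaxClients": 150},
    key_names=("ConfigName",),
)
server.set_property("MinSpareServers", 5)
```

In this sketch, `server.key()` plays the role of the key that distinguishes one instance from the other members of its class.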
A collection of class definitions that describe managed elements (in a particular
environment) is called a Schema. Schemas have a framework character and are designed to be
extensible. CIM Schemas are represented by UML (Unified Modeling Language) diagrams or MOF
files. MOF (Managed Object Format) is a declarative language similar to CORBA's IDL (Interface
Definition Language). All CIM Schemas have to satisfy special restrictions given by the CIM
Meta-Schema. The most significant restriction is the use of association classes to model
relationships between objects. Thereby, composition or any kind of reference within a class
is strictly avoided, thus preventing anomalies caused by complex class relations.
Aggregations are just special association classes. Most of the association classes of the
CIM core are abstract and thus have to be refined.
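The idea of modeling relationships through association classes, rather than through references stored inside the related classes, can be sketched in Python as follows. The types `Element` and `Association` and the helper `parts_of` are hypothetical; only the association name ConfigurationComponent and the configuration class names are taken from the CIM model discussed in this paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    """Hypothetical managed element; it holds no references to other elements."""
    class_name: str
    name: str


@dataclass(frozen=True)
class Association:
    """A relationship stored as a separate object linking two elements."""
    class_name: str
    group: Element
    part: Element


def parts_of(links: List[Association], group: Element, assoc_class: str) -> List[Element]:
    """Follow all associations of the given class that start at `group`."""
    return [link.part for link in links
            if link.class_name == assoc_class and link.group == group]


server = Element("ServerConfiguration", "main")
host = Element("HostConfiguration", "www.example.org")
directory = Element("DirectoryConfiguration", "/var/www")

# The aggregation structure lives entirely in the association objects.
LINKS = [
    Association("ConfigurationComponent", server, host),
    Association("ConfigurationComponent", host, directory),
]
```

Because the elements themselves carry no references, relationships can be added or refined without changing the element classes.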
CIM Schemas belong to one of the levels Core Model, Common Model, and Extension, depending on
their level of specificity: the Core Model contains only a small number of very general
classes as an abstract description of components and relationships that are found in most
environments. These classes are inherited by the more specific classes in the Common Model,
which includes a series of domain-specific, but platform-independent classes like System and
Network. The most specific classes are gathered in Extension Schemas derived from the Common
Model. Sophisticated design patterns like the Composite Pattern [7] are used in all layers. A
main aspect in designing CIM Schemas is their real-world usability, i.e. the completeness
with respect to the use case scenarios.
The concrete information of the application is gathered by a set of provider applications
and delivered to a CIM Object Manager (CIMOM), where it can then be accessed and modified by
client applications communicating with the CIMOM using standardized XML mappings [4] and
utilizing HTTP as transport protocol.
The CIM classes relevant for our work are the Configuration and the Setting classes together
with their associations, as defined by the CIM Core Schema. While there are already some
small-scale examples in the literature of device configuration management using CIM, there
has not been any real software example yet. The reason for this might be that real software
configurations possess a considerably larger number of options and parameters, and therefore
give rise to a much more complex CIM modeling effort (see the next section for an example).
The work presented in this paper results from a collaboration of our research group at the
University of Tübingen with the IBM Linux Technology Center (LTC) Systems Management Group
located at the IBM laboratory at Böblingen, Germany, which has developed and is maintaining
the CIM model for (part of) the Apache Web-server configuration. In the Open Source Project
SBLIM, the IBM LTC provides CIM models and instrumentation for Linux Systems Management [9].
    // Excerpt of the MOF class definition for the Apache server properties
    class Apache_HttpServerProperties : Apache_HttpServerSetting
    {
        [Key] string ConfigName;
        string BindAddress;
        string CoreDumpDirectory;
        uint16 MaxClients;
        uint32 MaxRequestsPerChild;
        sint16 MaxSpareServers;
        sint16 MinSpareServers;
        ...
    };
[Figure: UML class diagram of the Apache configuration model. CIM-Core classes: ManagedElement, ManagedSystemElement, LogicalElement, Service, Configuration, Setting. Apache classes: ServerConfiguration, HostConfiguration, DirectoryConfiguration, HTTPService, HTTPServerSetting, HTTPListenSetting, HTTPServerProperties, HTTPServerModule, HTTPHostSetting, HTTPHostProperties, OverridePolicy, HTTPDirectorySetting, HTTPDirectoryCoreSetting.]
[Figure: example configuration tree with host configurations H1 to H3 and directory configurations D1 to D6, together with the attached settings S1, S2, S3a to S3c (P), S4, S5a to S5f (Q), S6a and S6b.]
is supposed to be relevant for all descendant nodes of element E, too. Thus, in our example,
the directives of setting S3a are also valid for directory configuration D1, whereas they are
irrelevant for host configurations H2 and H3, or directory configurations D2 through D6. The
semantic structure can be reproduced by traversing the part/whole relations
(ConfigurationComponent and SettingContext association classes in CIM) between these nodes.
Each configuration node can be considered as generating a causally closed semantic context
for constraint evaluation. In specifying and verifying constraints describing
interdependencies between different settings, it is crucial to consider the relevant
settings only.
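The inheritance of settings along the part/whole relation can be sketched in Python as follows. The tree fragment and the placement of the settings are assumptions made for illustration (only the facts that S3a applies to D1 but not to H2 or D2 are taken from the text); the names `PARENT`, `ATTACHED`, `ancestors_or_self` and `relevant_settings` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical fragment of a configuration tree: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "Server": None,
    "H1": "Server",
    "H2": "Server",
    "D1": "H1",
    "D2": "H2",
}

# Settings attached directly to a configuration node (assumed placement).
ATTACHED: Dict[str, List[str]] = {
    "Server": ["S1", "S2"],
    "H1": ["S3a"],
    "D1": ["S5a"],
    "D2": ["S5b"],
}


def ancestors_or_self(node: str) -> List[str]:
    """Walk the part/whole relation upwards, starting at the node itself."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def relevant_settings(node: str) -> List[str]:
    """Settings attached to the node itself or inherited from any of its ancestors."""
    result: List[str] = []
    for n in ancestors_or_self(node):
        result.extend(ATTACHED.get(n, []))
    return result
```

Evaluating a constraint for a node then only needs to look at `relevant_settings(node)`, which corresponds to the closed context generated by that configuration node.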
Some constraints also require some kind of horizontal navigation in this tree, allowing
selection of all present instances of a particular class. This can be seen as a form of
quantification over configuration instances. Using the WBEM API, quantification over objects
can be accomplished using the class name, whereas addressing a special object requires
additional knowledge of its key values.
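The difference between quantifying over all instances of a class and addressing one particular object can be sketched as follows. This is an in-memory illustration with invented data, not a call into a real WBEM client; the helper names `instances_of` and `instance_by_key` are hypothetical, and using Name as key for DirectoryConfiguration is an assumption.

```python
from typing import Any, Dict, List, Optional

Instance = Dict[str, Any]

# Invented sample data standing in for the instances delivered by a CIMOM.
INSTANCES: List[Instance] = [
    {"class": "HostConfiguration", "Name": "www.example.org"},
    {"class": "HostConfiguration", "Name": "intranet.example.org"},
    {"class": "DirectoryConfiguration", "Name": "/var/www"},
]


def instances_of(class_name: str) -> List[Instance]:
    """Horizontal navigation: select every instance of a class by class name only."""
    return [i for i in INSTANCES if i["class"] == class_name]


def instance_by_key(class_name: str, name: str) -> Optional[Instance]:
    """Address one particular object; this requires knowing its key value."""
    for i in instances_of(class_name):
        if i["Name"] == name:
            return i
    return None
```

A constraint that ranges over all virtual hosts would iterate over `instances_of("HostConfiguration")`, whereas a constraint about one particular host needs its key value.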
5.1. Syntax
We will now present our CIM constraint language, which is partly influenced by Description
Logic [2] and partly resembles variable-free predicate logic. The language consists of three
kinds of expressions: v-expressions, a-expressions and f-expressions. V-expressions represent
arbitrary finite sets of property values (numbers, strings, ...), a-expressions are the
atomic propositions of our language, and f-expressions constitute formulae. These expressions
are recursively defined as follows:
v-expressions (denoted by s, t, ...):
- a property selection, where $C$ is a class name and $p$ a property name;
- a tuple selection, where $C$ is a class name and $p_1, \dots, p_k$ are property names;
- a constant $c$, where $c$ is an arbitrary property value constant (string, number, etc.);
- a function application $f(t_1, \dots, t_k)$, where $f$ is a $k$-ary (interpreted) function
  and $t_1, \dots, t_k$ are v-expressions.
a-expressions:
5.2. Semantics
All expressions of our language are interpreted with respect to a set of instances of CIM
classes, where each instance has properties according to its class definition, and each
property has a value matching the property's type (always including the value NULL, denoting
an undefined value; see also [3], Sec. 4.11.6).
Intuitively, the v-expression formed from a class name $C$ and a property name $p$ denotes
the set of values occurring under property $p$ of any instance of class $C$; the variant with
property names $p_1, \dots, p_k$ selects all tuples that occur under these properties of an
instance of class $C$. For an instance $i$, the set of parent nodes of $i$ is a singleton
set if $i$ is not the root node, and the empty set otherwise. The interpretation of formulae
is defined recursively, starting from the atomic case. The modification of the set of
considered instances in the definition of the context operator is defined over the
aggregation tree structure using two auxiliary functions.
Therefore, the resulting set contains (among others) all instances of setting classes
relevant for node H1, which justifies our definition of the context operator via the
corresponding auxiliary function.
Now, as usual, a formula is said to be satisfied by a set of instances if it evaluates to
true with respect to that set.
Note that all non-logical functions and predicates exclusively take set-valued arguments.
This allows a wide variety of functions to be defined uniformly and naturally. A comparison
operator ranging over sets of natural numbers, for example, can be defined in this style,
and set containment, as another example, can be defined directly. There are further
possibilities, e.g. in defining variadic operators. That way, sum or minimum/maximum
operators can be defined, and statements over such aggregated values can be expressed. This
allows the formulation of complex, but common dependencies.
5.3. Examples
Turning back to Apache configuration, we now give some examples of how to use the constraint
specification language. In Figure 3 we give formal variants of part of the specification
stated in natural language in Section 4 above, as well as some examples taken from the Apache
documentation [1]. These are to be understood as follows:
1. Property ServerRoot is defined exactly once.
2. For each server configuration, the MinSpareServers and MaxSpareServers properties are set
   as mentioned in Section 4. Here we also used the comparison operator and its converse as
   defined above.
3. Each virtual host has its own unique server name. Here we used an additional unary
   function computing the cardinality of its argument set, and the key property Name of
   class HostConfiguration.
4. The error log should not be stored in directory DocumentRoot or a subdirectory thereof.
   Here we used a binary predicate on sets of strings (isPrefixOf) returning true if all
   elements of the first set are prefixes of all elements of the second set. The context
   operator ensures that these sets are singletons.
5. The address/port pair of each virtual host must be an address/port the Web-server is
   listening to (see [1]).
6. A configuration name and PID file must be specified for the Web-server.
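To make the informal constraints of this section more concrete, the following Python sketch evaluates a few of them over a small, invented set of instances. It is not the constraint evaluator described in this paper. Several points are assumptions: the all-pairs reading of the set-valued comparison (chosen in analogy to the isPrefixOf predicate), the reading of the spare-server constraint as MinSpareServers <= MaxSpareServers, the grouping of ServerRoot, DocumentRoot and ErrorLog under one class, and all helper names.

```python
from typing import Any, Dict, List, Set

Instance = Dict[str, Any]


def values(instances: List[Instance], class_name: str, prop: str) -> Set[Any]:
    """Set of non-NULL values found under `prop` in instances of `class_name`."""
    return {i[prop] for i in instances
            if i["class"] == class_name and i.get(prop) is not None}


def leq(left: Set[int], right: Set[int]) -> bool:
    """Assumed set-valued comparison: every element of `left` <= every element of `right`."""
    return all(a <= b for a in left for b in right)


def card(s: Set[Any]) -> int:
    """Cardinality of the argument set."""
    return len(s)


def is_prefix_of(left: Set[str], right: Set[str]) -> bool:
    """True if all elements of `left` are string prefixes of all elements of `right`."""
    return all(b.startswith(a) for a in left for b in right)


def check(instances: List[Instance]) -> Dict[str, bool]:
    """Evaluate four of the example constraints; True means the constraint holds."""
    props = "HTTPServerProperties"
    hosts = [i for i in instances if i["class"] == "HostConfiguration"]
    return {
        "server_root_defined_once": card(values(instances, props, "ServerRoot")) == 1,
        "spare_servers_ordered": leq(values(instances, props, "MinSpareServers"),
                                     values(instances, props, "MaxSpareServers")),
        "unique_host_names": card(values(instances, "HostConfiguration", "Name")) == len(hosts),
        # Plain string prefixes only approximate the "subdirectory of" relation.
        "error_log_outside_docroot": not is_prefix_of(values(instances, props, "DocumentRoot"),
                                                      values(instances, props, "ErrorLog")),
    }


# Invented sample configuration.
INSTANCES: List[Instance] = [
    {"class": "HTTPServerProperties", "ServerRoot": "/etc/httpd",
     "MinSpareServers": 5, "MaxSpareServers": 10,
     "DocumentRoot": "/var/www/html", "ErrorLog": "/var/log/httpd/error_log"},
    {"class": "HostConfiguration", "Name": "www.example.org"},
    {"class": "HostConfiguration", "Name": "intranet.example.org"},
]
```

Because every helper takes and returns sets, the same functions work whether a property occurs once, several times or not at all, which mirrors the uniform set-valued treatment described in Section 5.2.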
where
denotes the total number
object-oriented semi-formal CIM model of the configuration data and a specialized constraint
specification language. Our extensible specification language reflects typical constructs
found in CIM and currently allows formulation of constraints containing predicates and
functions over numbers, sets and strings. We also implemented a prototypical constraint
evaluator based on Java Reflection and the WBEM infrastructure. This implementation
facilitates error recovery by computing weights for probably faulty configuration settings.
Representative of other work on formal verification of (semi-formal) UML diagrams, we want
to mention Dupuy-Chessa and du Bousquet's validation of UML models [6] and Meyer and
Souquières' formalization based on the specification language B [12]. In contrast to their
work, we do not use a powerful specification language using full predicate logic, but
restrict our specification language to a variable-free logic that resembles Description
Logic [2], which potentially offers advantages for automated theorem proving tasks. Dong et
al. present an approach to specify Semantic Web Services using Z in order to find errors in
the ontology [5]. Work on validation and integrity checking of XML data can also be found in
the literature [13].
Future work may include the application of automatic theorem proving methods to CIM
verification using our specification language. This would offer additional possibilities,
e.g. in checking the consistency of the constraints themselves, in automatic completion of
partially specified configurations, and in automatic error correction. Moreover, a
complexity-theoretic analysis of our specification language and a comparison with current
description logics could be of interest.
References
[1] The Apache Software Foundation. Apache HTTP Server Version 1.3 Documentation, 2002.
http://httpd.apache.org/docs.
[2] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider, editors. The
Description Logic Handbook. Cambridge University Press, 2003.
[3] Distributed Management Task Force, Inc. Common Information Model Specification, 1999.
http://www.dmtf.org/standards/documents/CIM/DSP0004.pdf.
[4] Distributed Management Task Force, Inc. CIM Operations over HTTP, 2002.
http://www.dmtf.org/standards/documents/WBEM/DSP200.html.
[5] J. Dong, J. Sun, and H. Wang. Z Approach to Semantic Web. In International Conference on
Formal Engineering Methods (ICFEM'02), pages 156-167. Springer-Verlag, 2002.
[6] S. Dupuy-Chessa and L. du Bousquet. Validation of UML models thanks to Z and Lustre. In
Proc. of the Intl. Symp. on Formal Methods Europe (FME 2001), pages 254-258, Berlin,
Germany, 2001. Springer-Verlag.
[7] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design
Patterns. Addison Wesley, 1995.
[8] D. Harel, D. Kozen, and J. Tiuryn. Dynamic Logic. MIT
Press, 2000.
[9] IBM Linux Technology Center. Standards Based Linux Instrumentation for Manageability,
2000. http://oss.software.ibm.com/sblim.
[10] W. Küchlin and C. Sinz. Proving consistency assertions for automotive product data
management. J. Automated Reasoning, 24(1-2):145-163, Feb. 2000.
[11] B. Laurie and P. Laurie. Apache: The Definitive Guide (3rd Edition). O'Reilly &
Associates, 2002.
[12] E. Meyer and J. Souquières. A systematic approach to transform OMT diagrams to a B
specification. In Proc. of the World Congress on Formal Methods in the Development of
Computing Systems (FM'99), pages 875-896, Toulouse, France, 1999. Springer-Verlag.
[13] C. Nentwich, W. Emmerich, and A. Finkelstein. Static consistency checking for
distributed specifications. In Proc. of the 16th IEEE Intl. Conf. on Automated Software
Engineering (ASE'01), pages 115-125, Coronado Bay, CA, 2001. IEEE Computer Society.
[14] Web Ontology Language (OWL) Reference Version 1.0, 2002. W3C Working Draft 12 November
2002. Latest version available at http://www.w3.org/TR/owl-ref.
[15] C. Sinz, T. Lumpp, and W. Küchlin. Towards a verification of the rule-based expert
system of the IBM SA for OS/390 automation manager. In Proceedings of the 2nd Asia-Pacific
Conference on Quality Software (APAQS 2001), pages 367-374, Hong Kong, Dec. 2001. IEEE
Computer Society.
[16] C. Sinz, T. Lumpp, J. Schneider, and W. Küchlin. Detection of dynamic execution errors
in IBM System Automation's rule-based expert system. Information and Software Technology,
44(14):857-873, Nov. 2002.
[17] R. Waldinger and M. Stickel. Proving properties of rule-based systems. Intl. J. Software
Engineering and Knowledge Engineering, 2(1):121-144, 1992.