Scientific Engineering
for Distributed
Java Applications
International Workshop, FIDJI 2002
Luxembourg-Kirchberg, Luxembourg, November 28-29, 2002
Revised Papers
Volume Editors
Nicolas Guelfi
Institut Supérieur de Technologie
Département d’Informatique Appliquée
6, rue Richard Coudenhove-Kalergi, 1359 Luxembourg, Luxembourg
E-mail: nicolas.guelfi@ist.lu
Egidio Astesiano
Gianna Reggio
Università di Genova, Italy
Dipartimento di Informatica e Scienze dell’Informazione (DISI)
Via Dodecaneso 35, 16146 Genova, Italy
E-mail: {astes/reggio}@disi.unige.it
A catalog record for this book is available from the Library of Congress.
ISSN 0302-9743
ISBN 3-540-00679-6 Springer-Verlag Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law.
Springer-Verlag Berlin Heidelberg New York
a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de
© Springer-Verlag Berlin Heidelberg 2003
Printed in Germany
Preface
FIDJI 2002 was an international forum for researchers and practitioners interested in the advances in, and applications of, software engineering for distributed application development. Concerning technologies, the workshop focused on Java-related ones. It was an opportunity to present and observe the latest research, results, and ideas in these areas.
All papers submitted to this workshop were reviewed by at least two members of the International Program Committee. Acceptance was based primarily on originality and contribution. For these postworkshop proceedings we selected 16 of the 33 submitted papers, together with two tutorials and two keynotes.
FIDJI 2002 aimed at promoting a scientific approach to software engineering. The scope of the workshop included the following topics:
Program Chairs
Guelfi, Nicolas IST, Luxembourg
Astesiano, Egidio DISI Genoa, Italy
Reggio, Gianna DISI Genoa, Italy
Organizing Committee
Amza, Catalin IST/DISI, Genoa, Italy
De Colnet, Olivier SITEC, Luxembourg
Dahaoui, Mustapha IST, Luxembourg
Guelfi, Nicolas IST, Luxembourg
Kies, Mireille IST, Luxembourg
Lambeau, Bernard SITEC, Luxembourg
Perrouin, Gilles IST, Luxembourg
Reggio, Gianna DISI, Genoa, Italy
Ries, Benoît IST, Luxembourg
Sterges, Paul IST, Luxembourg
Additional Referees
Brimont, Pierre
Eshuis, Rik
Foukia, Noria
Gautheron, Laurent
Khadraoui, Djamel
Oriol, Manuel
Perrouin, Gilles
Periorellis, Panayiotis
Razafimahefa, Chrislain
Ries, Benoît
Sterges, Paul
Keynote Talks
Designing Fault-Tolerant Mobile Systems
Giovanna Di Marzo Serugendo, Alexander Romanovsky
Tutorials
Requirements Elicitation with Use Cases
Shane Sendall
Java Threads Can Be Very Useful Building Blocks
Claude Petitpierre
A Java Coordination Tool for Web-Service Architectures

1 Introduction
Nowadays nobody doubts that the Internet has become the most important global network infrastructure. In search of promising opportunities, many companies are rapidly exposing their traditional computing tasks as software services, or introducing new tasks, in order to connect them to the Internet. However, this growth of services over the network has been faster than the formal efforts to agree on service-oriented architectures [6] and to identify the support necessary for enabling these distributed services to work together harmoniously [16].
The first formal steps have addressed interoperability among systems (any Web service can interact with any other Web service [18]). Thus, SOAP (http://www.w3.org/TR/SOAP/) has become a de facto standard for Web-service messaging and invocation, and for solving the problems of converting data between traditional distributed platforms such as CORBA, DCOM or EJB [19]. Additionally, many standardization initiatives have arisen in specific
research areas, defining open, ubiquitous and interoperable service interfaces; one such area is that of Location-Based Services (LBS).
LBS extend the spatial processing capabilities of Geographic Information Services (GIS), integrating wireless communications, location data and Internet technology [25,12]. In this context, two well-positioned organizations have emerged as the drivers of LBS interoperability: LIF (Location Interoperability Forum, http://www.locationforum.org/) and OGC (Open GIS Consortium, http://www.opengis.org/, with its Open Location Service Initiative (OpenLS), http://www.openls.org/). Both are promoting and defining standard interfaces for a collection of wireless, location and GIS services that provide the required LBS functionality [2,26,29]. These public interfaces ease the integration of these distributed services over the Internet into end applications as individual computing entities; in isolation, however, the services have very limited functional value. The LBS context may therefore be considered a prototypical technological setting in which to evaluate the impact of industrial web-centric standards on the development of distributed applications.
Once services and their interfaces have been described, the need arises to establish an organization supporting their use, interaction and automated discovery. This organization must define an architecture that provides a framework easing collaborative work among the services and access to them [14]. Past architectural experiences such as [24,31] can help in defining future ones: problems in the use of object technology in large-scale applications when objects must be combined and recombined [23], and in the building of component-based frameworks [21]. Moreover, in order to integrate services implemented with different computational models, services must cooperate and be assembled over this architectural vision orthogonally to their computing tasks [8], allowing the true value of services to be exploited beyond independent computing entities.
This work presents a coordination Web service for distributed architectures over the Internet, which has been used within the LBS context as a prototypical domain. This support service has been implemented in Java so that it can be used independently of the hardware platform and the operating system on which it executes. The Internet is a distributed environment where many hardware and software configurations can be found (http://leb.net/hzo/ioscount/data/r.9904.txt). Java, as the programming language of Web services, is therefore key to achieving the required platform portability and independence. Moreover, it provides a number of built-in networking capabilities that make it easy to develop Internet-based and Web-based applications [10]. A more detailed description of our technological evolution towards this approach may be found in [4].
The paper is structured as follows. Section 2 describes the services provided by the standards that constitute the LBS framework, succinctly presenting the underlying conceptual model, the hierarchical levels of functionality, and the problems found in orchestrating these services. Section 3 justifies the adopted coordination approach, based on JavaSpaces technology and on standards for developing web-centric solutions with XML and HTTP. Section 4 shows design and implementation details. Section 5 reviews the benefits of the proposal. Finally, future work and conclusions are presented.
The presented architectural model has been the conceptual basis for the development of an LBS framework whose functionality may be integrated into end applications through the Internet (see fig. 1), such as ERP or CRM systems [4]. The required services are organized according to the proposed functional levels and built according to the Web-service philosophy: their operations are provided through a standard, published interface to ensure interoperability, and are accessible via ubiquitous Internet protocols and data formats, such as HTTP and XML.
The Data Management level is the base of the proposed architecture. Its services must be able to provide the necessary support for the storage and retrieval of geodata. LBS frameworks require a wide variety of geodata: georeferenced maps; location descriptors such as street addresses, roads, place names or telephone numbers; sensor data such as immediate locations of mobile devices; or more specific data of the LBS context, such as traffic conditions or road repairs. A collection of services has been implemented to fulfil these requirements. For example, basic GIS services: the Web Map Server (WMS), for
visualizing digital maps on the Internet as rendered raster data [3,13]; the Web Feature Server (WFS), for storing, querying (both spatially and non-spatially) and discovering geographic features, such as the previously presented location descriptors [1]; and the Web Traffic Server (WTS) [26], for providing traffic conditions of a specific region of interest. The interfaces of these services have been developed following the specifications proposed by the Open GIS Consortium (OGC).
Besides these GIS services, Location Services are required to communicate with remote sensors; an example is services that acquire location data from mobile devices through wireless media, sending and receiving location requests and responses. These services define their interface according to the LIF (Location Interoperability Forum) specification, which describes a Mobile Location Protocol (MLP) that an Internet application can use to request location information from a Location Server. As part of these Location Services, it is possible to integrate the Mobile Positioning Servers provided by a telecommunication operator through its Location Service Platform.
Geodata provided by these Data Management services are not usually used in isolation; instead, they are consumed by the Data Processing services to generate more complex and elaborate data. It is useful to have services for combining different kinds of geodata, such as maps and location descriptors; for linking several location descriptors; or for calculating structured geodata, such as ideal routes from a set of mobile-device locations.
To achieve this functionality, geospatial services have been implemented for geodata presentation (such as the Style Layer Descriptor Server), utility (Gazetteers and Geocoders), and determination (Route Server and Tracking Server). Details
about their specifications can be found in [26]. A Style Layer Descriptor Server (SLD) visualizes the result of a WFS query over a digital map returned by the WMS, applying a visualization style specified by the service client to the displayed geoinformation. On the other hand, the implementation of the WFS has been used as the baseline for developing other geoservices [11,4]: Gazetteers, used to link text-based location descriptions to geographic locations; Geocoders, used to transform a textual term or code, such as an address, place name, or telephone number, into a geographic location; the Route Server, used to calculate ideal routes along a set of locations; and the Tracking Server, used to store, query and retrieve the latest known geographic location of a mobile device. The Tracking Server is not proposed in the OpenGIS specification; however, we propose it as a natural way to provide mobile data.
Finally, at the highest level of the architecture, the Data Analysis level is composed of specific application services, such as the Mobile-Resource Tracking Service (MRTS), built on the lower-level services to integrate their functionality into end applications through the Internet. This service supports tracking of mobile resources fitted with a location device (such as vehicles, or employees carrying a mobile phone with an integrated GPS receiver), visualizing their real-time positions, planning a route and tracking it, and generating operational reports. To provide these operations through its interface, the data and geoprocessing services of the lower levels must collaborate appropriately, as a single global system. In practice, however, this collaboration is subject to the following restrictions:
1. Location Services have no persistence for storing locations received from mobile devices. If a service requests the geographic location of a device in order to process it, a Location Service must communicate with the remote mobile device to acquire the location and provide it to the requesting service.
2. Multiple location-data consumers are not supported simultaneously. Because a Location Service has no persistence and its interface operations are invoked over the HTTP Internet protocol, only the invoking service receives the requested location. If several services require the location data of the same device, each one must make an independent request.
3. Asynchronous responses to the interface operations would be more adequate. A service can request an immediate location of a device or a periodic collection of them, but a Location Service does not know in advance how long acquiring and receiving the requested location will take, owing to the delay introduced by communication networks. An asynchronous model for receiving location responses is therefore more adequate; however, HTTP provides a synchronous request/response model.
4. A Location Service provides operations whose initiative comes from the remote mobile devices rather than from the service client, for example the generation of emergency alarms, or of location events when a device enters or leaves a specific geographical region. Service clients must be able to subscribe to be notified when those alarms or events happen, instead of continuously polling the Location Service through its HTTP interface to check for their occurrence.
These restrictions show the need for a more complex communication mechanism among Web services than the one provided by the HTTP protocol: 1) one able to store exchanged data and to support many consumers, and 2) an asynchronous and reactive communication model. Moreover, these communication requirements also involve service synchronization: services may be waiting to receive new location data, or to be notified of alarms or events, before doing their task, such as updating a vehicle location, showing a new alarm on a digital map, tracking a predefined route or recalculating a tracking report. The particular target is therefore to provide a high-level tool for coordinating (communicating and synchronizing) Location Services with GIS services. However, this target will be dealt with from a broader perspective, trying to provide a flexible tool to coordinate any Web service over the Internet.
A Web service must be accessible through ubiquitous Internet standards such as HTTP and XML; its open interface must hide the implementation details of the service, such as the hardware or software platform on which it is implemented or the programming language in which it is written; and it must encourage a flexible and loosely coupled interaction among Web services.
An event model is incorporated into JavaSpaces for firing events when entries that match registered templates are written into a space. It allows services to react to the arrival of entries as they are placed in a space.
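As a concrete illustration, the following minimal sketch registers interest in XML-message entries through the standard JavaSpaces notify operation. The XMLMessage entry type and the listener class are hypothetical; JavaSpace.notify, the Jini event types and the wildcard-matching semantics are part of the actual JavaSpaces API. In a real deployment the listener would additionally have to be exported as an RMI remote object.

import java.rmi.RemoteException;
import net.jini.core.entry.Entry;
import net.jini.core.event.RemoteEvent;
import net.jini.core.event.RemoteEventListener;
import net.jini.core.event.UnknownEventException;
import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

// Hypothetical entry type: public fields are the matching fields,
// and a null field acts as a wildcard in a template.
class XMLMessage implements Entry {
    public String xml;
    public XMLMessage() {}
}

class ArrivalWatcher implements RemoteEventListener {
    // Called back by the space when an entry matching the registered
    // template is written.
    public void notify(RemoteEvent ev) throws UnknownEventException, RemoteException {
        System.out.println("matching entry arrived, sequence " + ev.getSequenceNumber());
    }

    static void register(JavaSpace space, RemoteEventListener listener) throws Exception {
        XMLMessage template = new XMLMessage(); // all fields null: match any XML message
        space.notify(template, null, listener, Lease.FOREVER, null);
    }
}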
The built coordination Web service encapsulates one or more spaces implemented with JavaSpaces, and provides through its interface the operations proposed by the extended Generative Communication model. In order to coordinate Web services through the Internet, these operations are accessible through the HTTP protocol.
Distributed services cooperate by inserting messages into, and retrieving them from, the encapsulated space using the HTTP operations provided by the coordination service interface. Exchanged messages are encoded in XML, and producing and consuming services must understand their content. Standards that define how to express the exchanged data (such as the MLP proposed by LIF, which defines XML Schemas for location, alarm and location-event data) are used to achieve this syntactic interoperability.
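To make this interaction style concrete, the sketch below shows how a producing service might write an XML message into the coordination space over plain HTTP. This is a hedged illustration: the endpoint URL, the op parameter and the payload are hypothetical, not the actual interface of the implemented service.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class CoordinationWriteExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical MLP-style location message and endpoint.
        String xml = "<msg><mobile id=\"device-42\"><x>672531</x><y>4612384</y></mobile></msg>";
        URL url = new URL("http://coordinator.example.org/space?op=write");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/xml");
        try (OutputStream out = con.getOutputStream()) {
            out.write(xml.getBytes("UTF-8"));
        }
        System.out.println("HTTP status: " + con.getResponseCode());
    }
}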
In accordance with the ideas proposed above, a coordination Web service has been implemented and tested in the LBS context. Its kernel consists of three software components (see fig. 2):
The first is a space component based on JavaSpaces, with an extension to the matching rules to support more complex XML queries; XQL is a path-expression-based query language proposed to the W3C query workshop (http://www.w3.org/TandS/QL/QL98/pp/xql.html).
The second component comprises the proxies, which act as producers and consumers exchanging messages with other proxies, and as remote listeners that can be called by spaces when a match occurs.
Proxies must be able to report their internal state and to push data to their respective external processes. This connection between the two must be established when the proxy is created and maintained until it is destroyed. The technique used to connect them depends on the environment in which the Java Coordination Component is being used. For example, if a collection of Java processes cooperate through the developed coordination component and run inside the same Java Virtual Machine (JVM), processes and their respective proxies can be connected through message passing. However, in a distributed environment such as the Internet, the HTTP streaming technique can be used to connect them. It consists of keeping an HTTP connection open in order to push fresh data from the proxy to the remote process (see fig. 3). These processes may even be simple HTML pages able to receive JavaScript events (for more details, see http://www.pushlets.com/). In this case, the exchanged data are the XML messages that result from the coordination task.
The third component provides through its HTTP interface the same collection of operations. This interface allows web applications to cooperate independently of the hardware and software platform on which they run, and of the programming language in which they are written.
The core of this component has been implemented as a Java Servlet, a Java program that resides and executes on a Web server, in this case on the Apache Server (http://www.apache.org/) using Tomcat as the servlet container (http://jakarta.apache.org/tomcat/).
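A minimal sketch of how such a servlet core might dispatch the space operations over HTTP follows. The operation names and the MessageSpace wrapper are hypothetical placeholders, not the actual implementation; the servlet API calls are standard.

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical wrapper around the encapsulated JavaSpaces space.
class MessageSpace {
    void write(java.io.Reader xml) { /* store the XML message as an entry */ }
    String read(String template) { return "<msg/>"; /* match and return an entry */ }
}

public class CoordinationServlet extends HttpServlet {
    private final MessageSpace space = new MessageSpace();

    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String op = req.getParameter("op"); // e.g. "write", "read", "subscribe"
        if ("write".equals(op)) {
            space.write(req.getReader()); // store the XML message in the space
            resp.setStatus(HttpServletResponse.SC_OK);
        } else if ("read".equals(op)) {
            resp.setContentType("text/xml");
            resp.getWriter().write(space.read(req.getParameter("template")));
        } else {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "unknown operation: " + op);
        }
    }
}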
This section presents how the proposed coordination Web service solves the identified communication problems among Location Services and other distributed GIS services:
1. Spaces encapsulated in the coordination service are persistent, and messages may be stored in them indefinitely. Location Services can therefore use the coordination service as a persistent repository of location data, writing into it the locations received from mobile devices.
2. Many distributed services can simultaneously access the coordination service to read a stored message, such as a location datum. Thus, a location that has been requested and stored in the coordination service may be shared by many consuming services.
3. The publishing and subscribing operations provided by the coordination service's interface support an asynchronous interaction model, allowing time-uncoupled interactions between producing and consuming processes.
4. Distributed services can subscribe to retrieve messages instead of performing constant reads on the coordination service, with the server taking the notification-based initiative.
This work has presented an architectural model for organizing Web services, and an implementation of it based on standards in the context of LBS. Although the model poses no problems from a conceptual point of view, real restrictions arise when distributed services must work together harmoniously. To resolve them, a coordination Web service implemented using Java and Internet technologies has been proposed. The coordination functionality provided by the service is orthogonal to the computing functionality offered by the coordinated entities, which keeps the computing and coordination models independent.
Open research issues are (1) to discover the real potential of the XML language for expressing synchronization restrictions and work flows among Web services, and (2) to add a new component that integrates thesauri and ontologies for supporting semantic interoperability among Web services. The underlying idea is to extend the concept of matching rules so that different values and XML representations will match if they correspond to the same concept [22,30].
Finally, it is important to have a formal instrument for reasoning about the behaviour of the coordinated distributed services and the coordination service. From a formal point of view, Petri nets are the most attractive formalism for modelling concurrent systems, as they allow formal analysis, graphic representation and the execution/simulation of system models. In this sense, advanced software development approaches for modelling, implementing and reasoning about open parallel and distributed systems are based on the principles presented in this paper, that is, concurrent object-orientation, generative communication, and Petri nets. [20] presents a Petri net formalism that provides semantics for the Objective Linda language and models the internal behaviour of concurrent objects; [5] presents transition merging as the main mechanism for representing the interaction between concurrent objects, providing a symmetric form of communication very close to generative communication that allows the cooperation of an arbitrary number of entities, with no direction of communication.
Acknowledgment. The basic technology of this work has been partially supported by the Spanish Ministry of Science and Technology through projects TIC2000-1568-C03-01, TIC2000-0048-P4-02 and FIT-0700002000B270827 from the National Plan for Scientific Research, Development and Technology Innovation, co-supported by FEDER resources.
References
27. D.S. Rosenblum and A. Wolf, A design framework for Internet-scale event observation and notification, Proceedings of the Sixth European Software Engineering Conference (Zurich, Switzerland) (M. Jazayeri and H. Schauer, eds.), Springer-Verlag, September 1997, pp. 344–360.
28. J.C. Thill, Geographic information systems for transportation in perspective, Transportation Research Part C: Emerging Technologies 8 (2000), no. 1–6, 3–12.
29. J. VanderMeer, Ubiquitous wireless location interoperability, available at http://www.directionsmag.com/, July 2002.
30. U. Visser and H. Stuckenschmidt, Interoperability in GIS. Enabling technologies, Proceedings of the 5th AGILE Conference on Geographic Information Science (Palma de Mallorca, Spain), April 2002, pp. 291–297.
31. F.J. Zarazaga, P. Álvarez, J.A. Bañares, J. Nogueras, J. Valiño, and P.R. Muro-Medrano, Examples of vehicle location systems using CORBA-based distributed real-time GPS data and services, Computers, Environment and Urban Systems 25 (2001), 293–305.
32. A.K. Ziliaskopoulos and S. Travis Waller, An Internet-based geographic information system that integrates data, models and users for transportation application, Transportation Research Part C: Emerging Technologies 8 (2000), no. 1–6, 427–444.
Critical Evaluation of the EJB Transaction Model
Raul Silaghi and Alfred Strohmeier
Software Engineering Laboratory
Swiss Federal Institute of Technology in Lausanne
CH-1015 Lausanne EPFL, Switzerland
{Raul.Silaghi,Alfred.Strohmeier}@epfl.ch
1 Introduction
For three decades, transaction processing has been a cornerstone of modern information technology: it is an indispensable asset in banking, stock trading, airline reservation systems, travel agencies, and so on. With the new millennium's proliferation of e-Commerce applications, business-to-business workflows, and broad forms of Web-based e-Services, transactional information systems are becoming even more important.
Transactions are a classic software structure for managing concurrent accesses to global data and for maintaining data consistency in the presence of failures. The notion of transaction was first introduced in database systems in order to correctly handle concurrent updates of data and to provide fault tolerance with respect to hardware failures [1]. A transaction groups an arbitrary number of operations on data objects (also referred to as transactional objects), making the operations as a whole appear indivisible to the application and with respect to other concurrent transactions. The classic transaction scheme relies on three standard operations: begin, commit, and abort, which mark the boundaries of a transaction. The properties of transactions are referred to as the ACID properties: Atomicity, Consistency, Isolation, and Durability [1].
Support for transactions is an essential component of the Enterprise JavaBeans architecture. The Enterprise JavaBeans architecture supports only flat transactions, despite the fact that the classic transaction model was extended long ago to support nested transactions [2], which provide more flexible support for concurrency and recovery.
While the second column in Table 1 indicates whether or not the bean method is invoked from within a client transaction context, the third column indicates the exact transaction context in which the bean method will be executed, e.g., the client transaction context, a new transaction context, or no transaction context.
Another transaction attribute that is not presented in Table 1 is TX_BEAN_MANAGED; it notifies the Container that the bean class directly handles transaction demarcation by using the javax.transaction.UserTransaction interface. This attribute can only be set for session and message-driven beans, not for entity beans, because entity beans must always be designed with container-managed transaction demarcation.
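For illustration, a session bean deployed with TX_BEAN_MANAGED would demarcate its own transactions roughly as in the following minimal sketch; the bean and method names are hypothetical, while UserTransaction and EJBContext.getUserTransaction are the standard API mentioned above.

import javax.ejb.SessionContext;
import javax.transaction.UserTransaction;

public class TransferBean {
    private SessionContext ctx; // supplied by the Container via setSessionContext

    public void transfer() throws Exception {
        // Only legal for beans with bean-managed transaction demarcation.
        UserTransaction ut = ctx.getUserTransaction();
        ut.begin();
        try {
            // ... work on transactional objects ...
            ut.commit();
        } catch (Exception e) {
            ut.rollback();
            throw e;
        }
    }
}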
2.1.2 Setting the Transaction Isolation Level Attribute
The transaction isolation level determines how strongly one transaction is isolated from another. Within a transactional context, the isolation level associated with the first method invocation becomes the required isolation level for all other methods invoked within that transaction. If a method is invoked with a different isolation level from that of the first method, an exception is thrown. This constraint is mainly imposed by the underlying databases, because most resource managers interpret a change in the isolation level in the middle of a transaction as an implicit sync point, committing the changes done so far (even if the transaction has not committed yet).
The possible values that can be set for the isolation level attribute (from strongest to weakest) are: TRANSACTION_SERIALIZABLE, TRANSACTION_REPEATABLE_READ, TRANSACTION_READ_COMMITTED, and TRANSACTION_READ_UNCOMMITTED. None of these values permits two transactions to update the same data concurrently; one transaction must end before another one can update the same data. The values determine only how locks are managed for reading data. However, risks to consistency can arise from read operations when a transaction does further work based on the values read. For example, if one transaction is updating a piece of data and a second transaction is permitted to read that data after it has been changed but before the updating transaction ends, the reading transaction can make a decision based on a change that is eventually rolled back. Thus, the second transaction risks making a decision on transient data.
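The four isolation-level constants above are defined on java.sql.Connection; outside the Container, the same levels can be requested directly through JDBC, as in this minimal sketch (the connection URL is a placeholder):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class IsolationExample {
    public static void main(String[] args) throws SQLException {
        Connection conn = DriverManager.getConnection("jdbc:example://localhost/bank"); // placeholder URL
        conn.setAutoCommit(false); // start an explicit transaction
        // Set the isolation level before doing any work in the transaction;
        // as noted above, changing it mid-transaction may act as an implicit sync point.
        conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        // ... reads and updates ...
        conn.commit();
        conn.close();
    }
}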
2.2 Issues in Using the EJB Transaction Model
Sequential Access within the Same Transaction Context. An entity bean object may be accessed by multiple clients in the same transaction. A program A may start a transaction, and then call program B and program C in the same transaction context. If the programs B and C access the same entity bean object, the topology of the transaction creates a diamond. In this scenario, the programs B and C will access the entity object sequentially. Concurrent access to an entity object in the same transaction context would be considered an application programming error, and it would be handled in a Container-specific way.
The EJB specification requires that the Container provide support for local diamonds. In a local diamond, all components (here A, B, C, and the entity bean) are deployed in the same EJB Container. Distributed diamonds are not required to be supported by an EJB Container. However, if the EJB Container Provider chooses to support distributed diamonds, then the specification requires that it provide a consistent view of the entity bean's state within a transaction. Two ways in which this can be achieved are proposed in the specification.
Concurrent Access from Multiple Transactions. For concurrent access from multiple transactions, the EJB specification mentions two different strategies that the Container typically uses to achieve proper synchronization. In the first one, the Container acquires exclusive access to the entity object's state in the database. It activates a single instance of the entity bean and serializes the access from multiple transactions to this instance, as shown in Fig. 1.
Fig. 1. Multiple clients can access the same entity object using a single instance (Client 1 in TX1 and Client 2 in TX2 both reach entity object Account 100 through a single enterprise bean instance; the Container blocks Client 2 until Client 1 finishes)
In the second one, the Container activates multiple instances of the entity bean, one for each transaction in which the entity object is being accessed, and relies on the underlying database to perform the transaction synchronization during the accessor method calls performed by the business methods, and by the ejbLoad, ejbCreate, ejbStore, and ejbRemove methods.
While the second strategy just passes the problem to the next in line, the first one might lead to deadlock, as presented in Section 4.1.
Transaction Isolation Level Attribute Issues. The choice of the transaction isolation level attribute depends on several factors, including the acceptable level of risk to data consistency, the acceptable levels of concurrency and performance, and the isolation levels supported by the underlying database. The first two factors are related: decreasing the risk to consistency requires decreasing concurrency, because reducing the risk to consistency requires holding locks for longer periods. The longer a lock is held on a piece of data, the longer concurrently running transactions must wait to access that data. The TRANSACTION_SERIALIZABLE value protects data by eliminating concurrent access to it. Conversely, the TRANSACTION_READ_UNCOMMITTED value allows the highest degree of concurrency but entails the greatest risk to consistency. These two factors need to be balanced appropriately depending on the application.
The third factor means that although the EJB specification allows one to request one of the four levels of transaction isolation, it is possible that the database being used in the application does not support all of the levels. Also, vendors of database products implement isolation levels differently, so the precise behavior of an application can vary from database to database.
Transaction Attribute Issues. Attention must be paid to the possible values of a transaction attribute. In particular, the TX_REQUIRES_NEW value, as shown in Table 1, directs the container to always invoke a bean method within a new transaction context, regardless of whether the client invokes the method within or outside of a transaction context. Note that by "new transaction" it is meant that a new, top-level transaction is started; no nesting or overlapping is implied.
The scenario presented in Fig. 2 illustrates how misusing the TX_REQUIRES_NEW value for the transaction attribute can lead to a violation of the all-or-nothing property of transactions. We consider a bean method m1 with the transaction attribute set to TX_REQUIRED, and another bean method m2 with the transaction attribute set to TX_REQUIRES_NEW. To make the example more realistic, we suppose that the two methods belong to two different entity beans, deployed in different Containers on different Application Servers.
Fig. 2. Changing Transaction Contexts according to the Transaction Attribute (the client's transaction T1 covers m1, marked TX_REQUIRED, on Entity Bean 1 in Container C1; the nested call to m2, marked TX_REQUIRES_NEW, on Entity Bean 2 in Container C2 runs in a new transaction T2, after which T1 resumes)
Imagine now that a client calls the bean method m1 from within a transaction context T1. Receiving the call to m1 together with the transaction context T1 that comes with the call, the container C1 follows the exact instructions found in the deployment descriptor for the invoked method, i.e., it will execute method m1 within the client transaction context T1 (see Table 1). When method m2 is called from within m1, the container C1 passes the transaction context with the invocation. Receiving the call to m2 together with the transaction context T1, it is the turn of the container C2 to follow the instructions found in the deployment descriptor for the invoked method, i.e., it will execute method m2 within a new transaction context. First, however, container C2 suspends the association of the transaction context T1 with the current thread; only then does it start a new top-level transaction T2 and invoke the business method m2. The container resumes the suspended transaction association after the business method m2 and the new transaction T2 have completed.
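In EJB 2.0 deployment-descriptor syntax, the two attributes of this scenario would be declared roughly as follows; the bean names are taken from the example, and the TX_REQUIRED/TX_REQUIRES_NEW names used in this paper correspond to the Required/RequiresNew spellings of the descriptor.

<assembly-descriptor>
  <container-transaction>
    <method>
      <ejb-name>EntityBean1</ejb-name>
      <method-name>m1</method-name>
    </method>
    <trans-attribute>Required</trans-attribute>
  </container-transaction>
  <container-transaction>
    <method>
      <ejb-name>EntityBean2</ejb-name>
      <method-name>m2</method-name>
    </method>
    <trans-attribute>RequiresNew</trans-attribute>
  </container-transaction>
</assembly-descriptor>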
In our example, performing certain operations in the new top-level transaction T2 will not guarantee the all-or-nothing property for T1. Why? Because once transaction T2 commits, there is no way to roll it back if transaction T1 later aborts. The changes made on behalf of T2 will persist even if T1 rolls back. In conclusion, all the operations that must be performed in a transaction context should not cross the boundaries of other transactions, not even if those transactions were created from within the main transaction. Other operations that do not interfere with the main transaction, however, can safely be executed in separate, new top-level transactions.
In most cases, deposit and withdraw operations can very well be executed in two different transactions. A member may deposit some money in his or her account, and the system will perform this operation within a transaction. Later on, s/he may withdraw some money from the account, and the system will perform this new operation within a new transaction. In this case, a new transaction is appropriate because the two operations are not related at all, and there is no reason to undo the deposit operation if the withdraw operation fails.
However, when several operations are inter-connected to achieve a certain goal, they can no longer be executed in separate transactions. They have to be executed in the same transaction for the sake of preserving the ACID properties. This is the case for the placeBid method, which encapsulates small operations that must all be executed in the same transaction context to preserve a consistent state of the auctions. When a participant places a bid in an auction, we have to withdraw the corresponding amount of money from his or her account, protecting the system from participants who would overdraw their accounts by placing bids in several auctions without actually having all that money. However, when s/he is overbid in the same auction, we are required to give the money back and make a new withdrawal from the new bidder's account corresponding to the amount of the new bid. One possible solution to achieve this behavior is sketched in Fig. 4.
// Fig. 4 (excerpt): AuctionBean.placeBid, in Java syntax
void placeBid(String bidder, float bid) {
    ...
    // Withdraw the bid amount from the bidder's account
    getAccount(bidder).withdraw(bid);
Since placeBid is configured with TX_REQUIRED, its invocation executes within a transaction (the client's, or a newly created transaction context), and that context will be passed around to all the other method invocations that are made within placeBid. If one of the invoked methods in turn calls other methods of other enterprise beans, then those methods should also be configured with the transaction attribute set to TX_REQUIRED in the deployment descriptors of those beans.
Based on these considerations, the English Auction, as implemented on top of EJBs, is presented graphically in Fig. 5. We identified three main operations that must be executed in a transactional way: the creation of a new auction, placing a bid in an auction, and ending an auction.
By simply filling in an item form, Member 1 will create a new AuctionBean object within a transaction T1, and, automatically, a new row will be added to the table of all auctions. In a few seconds the displays of all logged-in members will be refreshed, and thus they will see the newly proposed auction. Member 2 decides to participate and places his bid. Once the placeBid method has been invoked on the AuctionBean object, four operations will be executed within the same transaction context (T2) on different beans. First we will withdraw the new bid from the account of Member 2. Then some money will be returned to the previous highest bidder (this is not the case here, since Member 2 is the first bidder). Finally, the information concerning the current highest bidder and current highest bid will be updated in the AuctionBean object. Later on, Member 3 decides to overbid Member 2 in the same auction, so the placeBid method will be invoked on the same AuctionBean object. Within the same transaction context (T3) we will: withdraw the new bid from Member 3's account, give back the money that Member 2 previously paid, and update the information in the AuctionBean object. In our example we considered that no other member overbids Member 3. Once the auction closes, the endAuction method is invoked on the AuctionBean object. Here we considered that the member who created the auction closes it by invoking endAuction. If the auction terminates due to a time limit, then a separate auction system thread will call the endAuction method.
At least two operations must be executed within the same transaction context (here T4) when closing
an auction: deposit a certain percentage of the amount of the final bid in the system account as a commission, and deposit the rest of the amount of the final bid in the seller's account (here Member 1's account). Another operation that might also be performed when an auction finishes is to mark it as closed, so that no further bids can be made.
4.1 A Possible Deadlock in the EJB Solution
Due to all the withdraw-deposit operations that have to be performed on several accounts, a deadlock situation can appear in the EJB Solution.
Consider for instance two auctions and two participants in both auctions. Suppose now that participant A is the current highest bidder in Auction 2, that participant B is the current highest bidder in Auction 1, and that both overbid each other, i.e., participant A overbids participant B in Auction 1, and participant B overbids participant A in Auction 2. As already presented in the previous section, the placeBid method, together with all four operations that are chained inside it, will be executed within the same transaction context. Fig. 6 presents the scenario where the placeBid method invoked by participant A is executed in the transaction context T1, and the placeBid method invoked by participant B is executed in the transaction context T2. We represented the last two operations inside the placeBid method, i.e., setHighestBidder and setHighestBid, under the name update.
Due to the isolation between transactions (see Section 2.2), when the withdraw operation is performed on account A (Fig. 6) from within transaction T1, access to this account is locked until the transaction T1 finishes. The same happens with the withdraw operation on account B (Fig. 6), which locks access to account B until the transaction T2 finishes. This situation ends in a deadlock, since the two deposit operations (Fig. 6) will wait for their target accounts to be unlocked, which will never happen since neither of the two transactions can finish.
4.2 Proposed Work-Arounds for Avoiding the Deadlock Situation
One solution to avoid the deadlock situation would be to have a certain random timeout after which we abort a transaction. In our example, once transaction T1 aborts, transaction T2 can continue and commit. In this case, participant A will have to re-issue a call to the placeBid method and hope that this time it will work.
Another solution is to have ordered access to the involved accounts. The order is dictated by the account numbers that are involved in the same placeBid method. We introduce a new operation, called dummy, that will be the first operation executed inside a placeBid method. The dummy operation targets the account with the smallest number, with the only purpose of getting its lock. If, for example, we have to withdraw from Account 2 and deposit in Account 1, then a dummy operation will be performed first on the account that has the smallest number (see Fig. 7).
In this way, if two placeBid methods are dealing with the same two accounts, as in the deadlock situation, they will both try to perform the dummy operation first on the same account, i.e., the one with the smallest number, trying to get its lock until the end of the enclosing transaction. One of the transactions will get the lock first. The other one will have to wait, thus not performing other operations on other accounts and not blocking other transactional objects. The battle is fought once, at the beginning, and after that everything should go on smoothly without further blocking.
The order in which the operations inside the two placeBid methods will be executed changes due to the anticipated blocking behaviour introduced by the dummy operation. In Fig. 7 we used encircled numbers to show the exact order in which those operations are going to be executed.
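The essence of this ordered-access work-around can be sketched in plain Java. This is a minimal analogue under stated assumptions: the Account class and its explicit locks are hypothetical stand-ins for the entity-bean locks that the Container holds until transaction end.

import java.util.concurrent.locks.ReentrantLock;

class Account {
    final int number;
    final ReentrantLock lock = new ReentrantLock();
    float balance;
    Account(int number, float balance) { this.number = number; this.balance = balance; }
    void withdraw(float amount) { balance -= amount; }
    void deposit(float amount) { balance += amount; }
}

class OrderedTransfer {
    // Locks the accounts in ascending account-number order; locking the
    // smaller-numbered account first plays the role of the dummy operation,
    // so two concurrent transfers over the same pair of accounts serialize
    // at that first lock and can never deadlock on each other.
    static void transfer(Account from, Account to, float amount) {
        Account first = from.number < to.number ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock(); // the "dummy" step: grab the smallest-numbered lock
        try {
            second.lock.lock();
            try {
                from.withdraw(amount);
                to.deposit(amount);
            } finally { second.lock.unlock(); }
        } finally { first.lock.unlock(); }
    }
}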
5 Discussion
In this section, we discuss the drawbacks of the EJB Solution for the auction system, highlighting those limitations of the EJB Transaction Model that led to these drawbacks. We also point out certain features that would enhance the current EJB Transaction Model.
First of all, the proposed EJB Solution for the auction system does not provide the desired all-or-nothing semantics of transactions. In an ideal case, we can imagine having one long-living transaction for each auction, which can be rolled back at any time while the auction is still open, returning the system to the previous consistent state. Such a solution is proposed by [7], where the same auction system is modeled on top of open multithreaded transactions (OMTTs) [8] in a very natural way. In the EJB Solution, we can roll back only small increments. For example, once a new AuctionBean object is created, there is no possibility to undo it other than by explicitly removing it from the table in the database. The same thing happens with the placeBid method: we can roll it back if something happens while inside, but once it finishes, there is no way to come back to the previous state. With this approach, the system is always in a consistent intermediate state, and that state will even persist across system crashes, which is not the case for the OMTT Solution, where everything is lost and has to be restarted from the beginning. In the EJB Solution, after a system crash, all the information about the created auctions, about the current highest bidders and bids in all auctions, and about the balances of all accounts will simply be restored from the corresponding tables in the database(s). In this way, all bids made by a member are remembered, and s/he can continue exactly from the same state in which the system crashed. So, we could say that fault tolerance is provided by the persistency of the underlying database, while in the OMTT Solution, fault tolerance is provided automatically by the underlying transaction support.
Due to the isolation between transactions, bean objects are locked until the transaction that has locked them commits or aborts. From this perspective, we can understand why the EJB Transaction Model does not encourage the usage of long-living transactions. A lot of bean objects can be involved in a long-living transaction, which can reduce system efficiency and throughput, as there is no support for partial rollbacks, early-release locks, savepoints, or compensating actions, as in Sagas [9]. It should be possible to release bean objects during a long-lasting transaction execution. Alternatively, a method-commutativity table could be created for each bean, marking some methods as non-conflicting. This would increase the Container's knowledge about the bean and, consequently, increase the potential for sharing a particular bean object with other transactions.
Another feature offered by OMTTs is nested transactions, which give the developer the possibility to make partial undos by rolling back a subtransaction and all its children, without causing abortion of the whole open multithreaded transaction. In the auction system, such partial undos are related to returning the money to a bidder once s/he is overbid. In the EJB Solution, we handle this by having compensation operations in the upcoming transaction. In the OMTT Solution, it is achieved using nested transactions: when a user places a bid, the money is withdrawn from his or her account inside a nested transaction; later on, if someone places a higher bid, the money is returned to the account by simply aborting the nested transaction.
One step towards providing nested transactions in the EJB Transaction Model would be to support abort-dependent and commit-dependent transactions. In this way, transactions would be able to change their behavior based on the state of another transaction. Once a transaction aborts, the corresponding abort-dependent transactions also abort. A transaction commits only if all its corresponding commit-dependent transactions have already committed. If applications are mostly based on transactions, it is desirable to express bindings and dependencies between them [10], [11].
The EJB specification is not clear regarding multithreaded transactions, in contrast to the full support of multithreading in OMTTs. In the Java Transaction API (JTA) [12], however, it is mentioned that each thread has an associated transaction context, which is either null or refers to a specific global transaction. The transaction-to-thread association is managed transparently by the Transaction Manager. Multiple threads may concurrently be associated with the same global transaction. This can be achieved by creating or spawning threads from within an already existing transaction context. However, it is not clear how a newly created thread can be associated with a previously started transaction, i.e., how threads can join already existing transactions.
Some other limitations of the EJB Transaction Model that we will not detail in this paper are: transactions cannot manage their locks according to the application's requirements, the set of values for the transaction attribute is very limiting, bean methods cannot be associated with several transaction attributes, bean methods cannot be dynamically associated with a particular transaction attribute, there is no support for asynchronous operations, and there is no requirement for distributed diamond support.
6 Conclusions
Even though the EJB specification does not require the Bean Provider to have any programming knowledge of concurrency, transactions, and other services, s/he must first carry out a detailed analysis of all the enterprise beans' methods before starting the configuration of all deployment descriptors. Any misuse of the values of the transaction and isolation level attributes can lead to incorrect applications. Changing the values defined by the Bean Provider for these two attributes is highly error-prone; only the implementor of the bean knows the exact semantics of the methods and is qualified to select the appropriate policies.
By implementing the auction system on top of EJBs, a certain rigidity of the EJB Transaction Model became apparent. We discovered several limitations of the EJB Transaction Model and proposed work-arounds where possible. A deadlock situation was identified in the EJB implementation, and some solutions to avoid it were proposed. We also presented certain features that are missing from the EJB Transaction Model and that, we believe, would enhance the current model and make it more open and flexible.
References
[1] Gray, J.; Reuter, A.: Transaction Processing: Concepts and Techniques. Morgan Kaufmann Publishers, San Mateo, California, 1993.
[2] Moss, J. E. B.: Nested Transactions, An Approach to Reliable Computing. Ph.D. Thesis, MIT, Cambridge, April 1981.
[3] Sun Microsystems: Enterprise JavaBeans™ Specification, v2.0, August 2001.
[4] Sun Microsystems: Java™ 2 Platform, Enterprise Edition Specification, v1.4, Proposed Final Draft, August 2002.
[5] IBM: WebSphere® Application Server. http://www.ibm.com/websphere/
[6] Vachon, J.: COALA: A Design Language for Reliable Distributed Systems. Ph.D. Thesis #2302, Swiss Federal Institute of Technology, Lausanne, Switzerland, December 2000.
[7] Kienzle, J.; Romanovsky, A.; Strohmeier, A.: Auction System Design Using Open Multithreaded Transactions. Proceedings of the 7th International Workshop on Object-Oriented Real-Time Dependable Systems, San Diego, California, USA, January 2002. IEEE Computer Society Press, Los Alamitos, CA, 2002, pp. 95–104.
[8] Kienzle, J.: Open Multithreaded Transactions: A Transaction Model for Concurrent Object-Oriented Programming. Ph.D. Thesis #2393, Swiss Federal Institute of Technology, Lausanne, Switzerland, April 2001.
[9] Garcia-Molina, H.; Salem, K.: Sagas. Proceedings of the SIGMOD Annual Conference, San Francisco, California, USA, May 1987. ACM Press, pp. 249–259.
[10] Elmagarmid, A. K.: Database Transaction Models for Advanced Applications. Morgan Kaufmann Publishers, 1992.
[11] Jajodia, S.; Kerschberg, L.: Advanced Transaction Models and Architectures. Kluwer Academic Publishers, 1997.
[12] Sun Microsystems: Java™ Transaction API (JTA) Specification, v1.0.1, April 1999.
[13] Sun Microsystems: Java™ Transaction Service (JTS) Specification, v1.0, December 1999.
[14] Software Engineering Laboratory: Open Multithreaded Transactions – The Auction System Case Study. http://lglwww.epfl.ch/research/omtt/auction.html
[15] Weikum, G.; Vossen, G.: Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery. Morgan Kaufmann Publishers, 2002.
Automated Prototyping of CORBA-Based Distributed Object-Oriented Systems
Stephan Philippi
1 Introduction
Due to the advent of the internet and affordable network technologies, the development of distributed systems is an important area of software engineering today. Distributed systems (potentially) have many advantages in comparison to local ones, like the sharing of resources over networks, fault tolerance, increased performance, scaling of applications, and others. Unfortunately, these advantages do not come for free, since the development of distributed systems is very demanding. Problems in this context arise from the heterogeneity of hardware platforms, operating systems, and programming languages, as well as the use of unreliable networks for reliable communication between components, real concurrency, security issues with respect to the (mis-)use of services, and the need for location transparency. In order to support the development of distributed systems in the light of these problems, the CORBA (Common Object Request Broker Architecture) international standard is developed by the OMG (Object Management Group) [OMG02]. The idea behind this standard is to specify a middleware which decouples distribution-related aspects from application development. In order to support arbitrary combinations of operating systems, programming languages and underlying hardware, the CORBA standard is specified in a platform-independent manner. Today, CORBA is the most widely used middleware
with implementations for almost every environment. Even if the wide availability of CORBA-compliant implementations clearly indicates that the services provided by this middleware are in principle useful for the development of distributed systems, there are also some problems. In detail, the still-growing complexity of the standard leads to a situation in which a straightforward understanding of the underlying ideas and their details is difficult, especially for software developers with little or no experience in distributed systems. In addition, today's modeling tools and integrated development environments offer only basic CORBA support. As a result, the development of distributed systems is still a demanding task which needs the experience of specialists to handle the additional complexity introduced by the CORBA standard.
(Figure: the CORBA-based development process, showing IDL definitions and the Naming Service.)
The IDL compiler of the ORB used by the client is called to generate so-called stubs according to a specified target programming language mapping. At the run-time of a distributed system, stubs serve as proxies for calls to remote services. The client-side stubs are therefore the counterparts of the server-side skeletons, i.e. the stub code is responsible for the transparent call of the remote service in question. In addition to the application logic, the client also needs to implement administrative functionality for creating and initializing an ORB object and for finding CORBA objects and services. The last step of the client development is that all classes have to be compiled and linked together with the libraries needed. If a naming service which provides initial access to remote object references is available to client and server, both are then ready to communicate in a distributed environment.
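In the standard Java language mapping, the administrative client-side code just mentioned amounts to a few stereotypical steps, shown in the sketch below for a hypothetical Hello service. The Hello and HelloHelper types stand for classes an IDL compiler would generate; the ORB and naming-service calls are the standard org.omg APIs.

import org.omg.CORBA.ORB;
import org.omg.CosNaming.NamingContextExt;
import org.omg.CosNaming.NamingContextExtHelper;

public class HelloClient {
    public static void main(String[] args) throws Exception {
        // 1. Create and initialize the ORB.
        ORB orb = ORB.init(args, null);
        // 2. Obtain the root naming context from the naming service.
        org.omg.CORBA.Object objRef = orb.resolve_initial_references("NameService");
        NamingContextExt ncRef = NamingContextExtHelper.narrow(objRef);
        // 3. Resolve the remote object by name and narrow it to the stub type
        //    generated by the IDL compiler (Hello/HelloHelper are illustrative).
        Hello hello = HelloHelper.narrow(ncRef.resolve_str("Hello"));
        // 4. Invoke the remote service transparently through the stub.
        System.out.println(hello.sayHello());
    }
}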
Even if the use of a CORBA-compliant middleware as outlined above is undoubtedly a great help for the development of distributed systems in general, there are still drawbacks. The most severe problem is that the modeler/programmer cannot concentrate on his main interest, which is the development of application logic. In addition, further non-trivial tasks have to be fulfilled in order to integrate a CORBA-compliant distribution middleware into a system. Especially for designers and programmers with little or no background in CORBA technology, or distributed systems in general, this is a difficult task due to the complexity of the standard and the implementing products. In order to make the basic CORBA functionality more accessible even for non-experts in this area, the next section introduces concepts for the automation of CORBA-based distributed systems development.
3 Automation Concepts
For the purposes of easier access to the basic CORBA functionality for the
prototyping of distributed systems, we propose to introduce automation techniques
into the development process described above. Ideally, such techniques
fully encapsulate distribution-related aspects and allow developers to focus on
the application logic. As a prerequisite, we assume that the application development
is model-driven (see [OMG01a]) and that UML class diagrams [OMG01b] are
used for the specification of distributed system components. Starting from this
assumption, the following three areas of CORBA-based systems development
can be identified as benefiting from the use of automation concepts:
IDL support: The first step of the server development is the specification
of its remotely accessible interfaces by means of IDL descriptions. With a model-driven
development process for the architectural specification of a server, interfaces
are described with constructs of the underlying modeling language. Obviously,
publicly available CORBA interfaces and internally used ones have to be
distinguished, e.g. by making use of 'CORBA Interface' and 'CORBA Struct'
stereotypes in UML class diagrams. With the help of a predefined IDL mapping
of specific elements of UML class diagrams, this approach enables the automatic
derivation of IDL descriptions from architectural models. In addition, the compiler
which translates the IDL descriptions of the server into target programming
language skeletons is ideally invoked transparently.
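A minimal sketch of such a derivation is given below; the simplified model API (the ModelClass placeholder and its fields) is an assumption for illustration, not the actual interface of any particular CASE tool.

    import java.util.List;

    public class IdlGenerator {

        // Hypothetical stand-in for a UML class diagram element.
        static class ModelClass {
            String name;
            String stereotype;
            List<String> operationSignatures;  // e.g. "string get_name()"
            ModelClass(String n, String s, List<String> ops) {
                name = n; stereotype = s; operationSignatures = ops;
            }
        }

        String generate(List<ModelClass> model) {
            StringBuilder idl = new StringBuilder();
            for (ModelClass c : model) {
                // Only elements stereotyped 'CORBA Interface' become public IDL.
                if ("CORBA Interface".equals(c.stereotype)) {
                    idl.append("interface ").append(c.name).append(" {\n");
                    for (String op : c.operationSignatures)
                        idl.append("  ").append(op).append(";\n");
                    idl.append("};\n");
                }
            }
            return idl.toString();
        }
    }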
During the development of a client component, the remote interfaces specified by
the server have to be known, i.e. the automatically generated IDL descriptions of
the server have to be imported into the architectural model of the client. Ideally,
this import is fully automated after the selection of the IDL descriptions needed,
i.e. the interfaces should be visualized automatically within the UML class diagram
of the client. Analogously to the server development, the IDL compiler of the
client, which translates the interface descriptions into target programming
language stubs, should be invoked transparently.
Implementation and run-time support: As described with the CORBA
development process in the last section, the integration of a CORBA-compliant
middleware into a distributed component demands the manual development
of non-trivial server- and client-side administrative parts. Automation support
in this area potentially has the greatest impact in helping the development of
distributed systems. The main idea here is to provide a dedicated API which is
automatically extended with application-specific code in order to encapsulate the
CORBA interface from the application.
Server objects are usually created and destroyed by means of so-called homes,
which implement the factory method design pattern [Gamma95] and which are
CORBA objects themselves. These homes are initially created in the server and
their references are published within a naming service for client access. In addition,
so-called portable object adapters (POAs) are needed by the ORB for the
run-time administration of servants. Ideally, the factory classes and the POAs
are generated automatically together with their implementations. Moreover, a
server framework should be provided in order to transparently assemble the
automatically generated parts of the server with the manually developed application
logic. At run-time this framework is responsible, e.g., for ORB initializations and
for publishing home references within a naming service.
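A hedged sketch of the server scaffolding that would ideally be generated is shown below, using the standard Java POA and naming APIs. Address_DB_HomeImpl stands in for a generated factory servant (the name follows the paper's address database example) and must extend the IDL-generated skeleton, so the fragment compiles only together with that generated code.

    import org.omg.CORBA.ORB;
    import org.omg.CosNaming.NamingContextExt;
    import org.omg.CosNaming.NamingContextExtHelper;
    import org.omg.PortableServer.POA;
    import org.omg.PortableServer.POAHelper;

    public class Server {
        public static void main(String[] args) throws Exception {
            ORB orb = ORB.init(args, null);
            // Obtain and activate the root POA for servant administration.
            POA rootPoa = POAHelper.narrow(
                orb.resolve_initial_references("RootPOA"));
            rootPoa.the_POAManager().activate();
            // Address_DB_HomeImpl is a hypothetical generated factory servant.
            Address_DB_HomeImpl home = new Address_DB_HomeImpl();
            org.omg.CORBA.Object ref = rootPoa.servant_to_reference(home);
            // Publish the home reference in the naming service for clients.
            NamingContextExt naming = NamingContextExtHelper.narrow(
                orb.resolve_initial_references("NameService"));
            naming.rebind(naming.to_name("Address_DB_Home"), ref);
            orb.run();  // hand control to the ORB event loop
        }
    }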
A client needs to know the references of the server's factory objects published in the
naming service in order to be able to create remote objects which then serve client
requests. The communication of the client with the naming service to retrieve these
references is ideally encapsulated by automatically generated code which, upon
request, returns the reference of the factory object needed. Analogously to the server,
the described means to support the client development are ideally completed by a
client framework which transparently handles ORB initializations and then passes
control to the client application logic.
Build support: As the compilation and linking of components in a distributed
system is not always straightforward due to the complexity of configuration
options, this task should be automated, too. Especially the differences in application
building which have their origin in ORB implementations from different vendors
with differing configuration parameters should be handled transparently by the
supporting development environment.
(Fig. 3: the simplified, automated development process. The developer models CORBA interfaces; IDL generation, compilation, and the naming service interaction are handled transparently.)
With the above-described automation concepts, the original development process
given in figure 2 is considerably simplified and results in the process shown in
figure 3. The introduced approach for the automation of CORBA-based systems
development almost completely frees the developer from CORBA-specific tasks;
only the modeling of CORBA interfaces remains. Consequently, developers can
concentrate on the application logic as their main interest. Furthermore, non-CORBA
experts are also able to develop distributed systems with the simplified development
process, given that the described concepts are properly supported by an integrated
development environment. The next section therefore introduces a prototypical
implementation of the introduced automation concepts targeted towards Java as an
example programming language.
Fig. 4. Screenshot with the Architectural Model of the Address Database Server.
of the CASE tool. The IDL descriptions are then automatically transformed
into representatives in the client's UML class diagram. Figure 5 gives the result
of this process for the example. In detail, the CORBA structure 'Address' as
well as the CORBA interfaces 'Address_DB' and 'Address_DB_Home' are
automatically generated from the IDL import. Moreover, the additional proxy class
'Address_DB_proxy' is generated automatically; it encapsulates the communication
with the naming service needed in order to get the initial reference to the servant
home. After the application logic of the client has been provided by the developer in
the class 'Address_DB_Client', the client component may be generated. While the
build support is the same as for the server, the generation transparently provides
implementations for the proxy class and a client framework which performs ORB
initializations and then passes control to the application logic. Assuming that a
naming service is running on a default port, server and client can be started with
automatically provided scripts and then communicate in a distributed environment.
A more detailed description of the implementation of the automation concepts
in the experimental CASE tool platform NEPTUN [Phil02] is given in
[Brands02].
The presented approach offers a high degree of automation, as only the modeling of
CORBA interfaces and the application logic development remain as manual tasks to
fulfill. However, this high degree of automation also has some limitations. Due to the
fact that almost all decisions on how to statically and dynamically bind an ORB to
the application logic are hidden from the developer, the developer also loses the
flexibility of a non-automated development process. This is not necessarily a
drawback, as the area of use determines whether highly efficient development or
high flexibility is more important. From our point of view, especially the rapid
prototyping of distributed systems benefits from the introduced approach, as in most
cases a prototype is developed as a 'proof of concept', and hence efficiency of
development is more important than bells and whistles. In addition, the described
approach offers benefits for developers who are not familiar with distributed systems
development in general and the CORBA standard specifically. In such a case the
developer loses some flexibility, but also the accompanying complexity of choice.
While there is a plethora of work on distributed systems modeling, automation
concepts for this area are only rarely discussed in the literature. A proposal similar
in focus to the approach presented in this article is introduced in [AleKor02].
However, the idea described by the authors fundamentally differs from our concept,
as they propose to enrich IDL specifications during their manual development with
additional semantic information as a prerequisite to automatic Java code generation.
Another area of related work is the automation concepts offered by existing tools for
(Java) systems modeling, round-trip engineering, and more traditional integrated
development environments. As will be pointed out in the following, all these tools
offer only limited support for the automation of distributed systems development.
Considering, for example, 'Rational Rose' [Rati02] and 'Together' [Toge02] as
popular tools for model-driven software development, we find that they only offer
automatic generation of IDL descriptions from UML class diagrams and some
application building support. Integrated development environments like, e.g.,
'JBuilder' [Borl02], 'Visual Age for Java' [IBM02], and 'NetBeans' [Net02] are
programming-language-centered and thus do not make use of modeling languages
for systems development, i.e. interface descriptions have to be written manually.
Often, such environments have an ORB integrated out of the box, which
(potentially) simplifies application building. Interestingly, the open-source project
'NetBeans' offers the most sophisticated support for the development of CORBA-based
distributed systems today. A wizard-driven approach to distributed application
development allows for the automatic generation of CORBA-specific client- and
server-side code fragments, which have to be completed manually by the developer.
In summary, the existing tools either offer automatic IDL generation from class
diagrams or they are programming-language-centered and thus mainly concentrate
on application building. Therefore, none of the tools available today offers seamless
automation support ranging from high-level systems modeling to low-level
application building as proposed in this article.
Future work on the presented approach for the automated prototyping of
CORBA-based distributed systems will enable the developer to individually
choose the appropriate level of tool support, i.e. whether flexibility or efficiency
of development is more important in a particular stage of a project. Also, the
introduced automation concepts will be further refined in order to provide support
for more complex application scenarios. Furthermore, the transparent use of
CORBA or RMI for distributed Java applications is also a topic of interest.
Acknowledgements. The author would like to thank the NEPTUN team, and
especially Holger Brands, for their contributions to the project, as well as Jacob
Köhler and Dion Whitehead for commenting on an earlier version of this paper.
References
[AleKor02] M. Aleksy and A. Korthaus. ‘Automatic Java Code Generation Based on
CORBA IDL plus Semantic Comments’. ‘Proceedings of the International
Conference on Computer Science, Information Technology, e-Business,
and Applications (CSITeA ’02)’. Foz do Iguazu, Brasil, 2002.
[Borl02] Borland. ‘JBuilder Documentation’. ‘www.borland.com’, 2002.
[Brands02] H. Brands. ‘Modellierung und Generierung von CORBA-Anwendungen’.
Diploma Thesis, Koblenz University, 2002.
[Gamma95] Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides. Design
Patterns. Addison Wesley, Reading, MA, 1995.
[HenVin99] M. Henning and S. Vinoski. ‘Advanced CORBA Programming with C++’.
Addison-Wesley, Reading, 1999.
[IBM02] IBM. ‘Visual Age for Java Documentation’. ‘www.ibm.com’, 2002.
[Net02] NetBeans. ‘NetBeans Documentation’. ‘www.netbeans.org’, 2002.
[OMG99] OMG. ‘CORBA Components Vol. I, Joint Revised Submission’.
‘www.omg.org/corba’, 1999.
[OMG01a] OMG. ‘Model Driven Architecture (MDA)’. ‘www.omg.org/mda’, 2001.
[OMG01b] OMG. ‘UML 1.4’. ‘www.omg.org/uml’, 2001.
[OMG02] OMG. ‘CORBA Specification 3.0’. ‘www.omg.org/corba’, 2002.
[Phil02] S. Philippi. ‘A CASE-Tool for the Development of Concurrent Object-
Oriented Systems based on Petri-Nets’. Petri-Net Newsletter 62, 2002.
[Rati02] Rational. ‘Rational Rose Documentation’. ‘www.rational.com’, 2002.
[Toge02] Togethersoft. ‘Together Documentation’. ‘www.togethersoft.com’, 2002.
Jawa: A Java Tool-Kit for Mobile Objects Applications
1 Introduction
Users always demand the fastest and most accurate results from software
applications. Faster computation may be achieved by dividing an application into
several small units and then distributing the units to several computers. However,
distribution introduces complexity for the application itself as well as for the
development cycle, and thus threatens the correctness of the application [19].
The standard Java SDK provides distributed services based on Remote Method
Invocation (RMI). Despite being powerful, RMI is only suitable for static distributed
client-server architectures, and the RMI development cycle is somewhat tedious.
Therefore, a tool-kit called Jawa was designed with the following features:
Simple development: The application's classes are compiled and deployed as a
single Java archive (jar) for easy and statically correct development. The archive
nevertheless contains several mobile objects, representing distributable
computational units.
Auto dynamic distribution: The application is initiated on a local computer; it
then automatically discovers the remote computers and distributes the mobile
objects.
Clean termination: The application terminates completely on all computers; it is
therefore safe for repetitive executions and iterative development.
Robust sharing: Several Jawa-based applications may run simultaneously in the
network, alongside other applications, in well-separated processes.
This paper begins by reviewing the background theories and the standard Java
distributed services. The next section presents the Jawa architecture, followed by
descriptions of the Jawa packages. Finally, the paper discusses comparisons and
related work, and then concludes with the main characteristics and benefits of Jawa.
2 Background Theories
The main inspirations for Jawa are parallel computing [7] and scheduling theory [4].
In parallel computing, several processors work cooperatively to execute application
software. The fundamental concepts of parallel computing are:
Scalability: the management of all possible computers as a set of machines.
Modularity: the decomposition of the application into a set of small jobs.
Locality: the grouping of related jobs to minimize communications.
Concurrency: the simultaneous executions of many jobs in the machines.
To be efficient, parallelism requires good scheduling, i.e. assigning the right
processor to the right job. Scheduling consists of three main concepts:
Measurement: the determination of the machines' resource capabilities and the
jobs' resource requirements.
Optimization: the assignment of a set of machines to complete a set of jobs under
particular objectives and constraints.
Dispatching: the loading and unloading of the jobs onto the machines (depending
on the assignment).
Distributed object technology recognizes several basic architectures, such as code-on-demand,
remote evaluation, client-server, mobile objects/agents [11], and
replicated objects [3]. So far, the best architecture supported by standard Java from
Sun Microsystems is the client-server architecture with automatic discovery. It uses
the Jini + RMI activation framework [8][17], which works as follows (see Figure 1):
a. The environment consists of one local and several remote computers in a wide area
network (Internet). Every remote computer runs an RMI activation daemon
(RMID), and there must be at least one Jini lookup service (Reggie).
b. The application is developed as separate server and client archives (collections of
classes) that share common interface definitions (IDL). The programmers must
make sure that the clients match the servers.
c. The server archives have to be installed and started on as many remote computers
as necessary. To enable access by any client, the servers register themselves with
the RMI daemon and the Jini lookup service. After that, the client archive can be
started on the local computer.
d. In operation, the client objects discover the server objects from the Jini lookup
service. Client objects receive the server stubs and can use them to access the
server objects.
e. On termination, the clients close the connections and shut themselves down.
However, the server objects are kept running on the remote computers.
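For illustration, a minimal client-side Jini lookup along the lines of step (d) might look like the sketch below; the Compute interface and the lookup host are hypothetical placeholders.

    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import net.jini.core.discovery.LookupLocator;
    import net.jini.core.lookup.ServiceRegistrar;
    import net.jini.core.lookup.ServiceTemplate;

    // Hypothetical service interface for this sketch.
    interface Compute extends Remote {
        double work(double input) throws RemoteException;
    }

    public class JiniClient {
        public static void main(String[] args) throws Exception {
            // Unicast discovery of a known Jini lookup service (Reggie).
            LookupLocator locator = new LookupLocator("jini://lookup-host");
            ServiceRegistrar registrar = locator.getRegistrar();
            // Match any registered service implementing Compute.
            ServiceTemplate tmpl =
                new ServiceTemplate(null, new Class[] { Compute.class }, null);
            Compute server = (Compute) registrar.lookup(tmpl);
            System.out.println(server.work(42.0));  // remote call via RMI stub
        }
    }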
Additional tool-kits are required to support mobile objects and replicated objects, for
example Aglets [12], Voyager [21], JavaThread [7], Brake [18], JGroup [2], and
FilterFresh [3].
Fig. 1. Java Jini + RMI activation framework (panels (b)-(e): development, starting, operation, and termination; components: client and server archives sharing an IDL, RMID, and the Jini lookup service Reggie).
4 Jawa Architecture
d. In operation, the root controls the distribution of the Migrants to the remote
computers by the following procedure:
It automatically discovers all available Islands on remote computers.
It selects the suitable Islands and then books Colonies on those Islands.
Each Colony uses a separate JVM to accommodate the incoming Migrants.
It dynamically measures the Colonies' capabilities and the Migrants' resource
requirements, and then assigns and dispatches the Migrants into the Colonies.
e. On termination, the root closes the Colonies and then shuts itself down. Therefore,
the remote computers are left clean.
Fig. 2. Jawa architecture (panels: starting, operation, termination; components: application archive with Root, Residents, and Migrants on the local computer, and Islands with Colonies on the remote computers).
5 Jawa Packages
Jawa is built on top of the standard Java using the layering as shown in Figure 3. The
tool-kit consists of seven packages (Island, Application, Scheduling, Discovery,
Measurement, Mobile and Local) and one executable (Island daemon). The Island
daemon is installed on each remote computer. The programmer builds custom
applications by importing the packages. The Jawa services are comparable to grid
computing [10].
(Fig. 3: the Jawa layering. Jawa-based applications and the Island daemon sit on top of the Jawa packages, such as Application, Island, and Scheduling, which in turn build on standard Java.)
The application package provides the framework for building a parallel application
that can be distributed into the Colonies efficiently. For job partitioning, it uses the
idea of hierarchical mobile agents [15]. The characteristics of the package are as
follows:
The mobile objects are organized in a tree structure. Each mobile object can nest
other mobile objects within itself and can only be nested by one other mobile
object.
The resource requirement of a parent object includes all of its child objects.
Each mobile object can migrate individually, or as a whole including all its child
objects.
However, the application also requires several non-mobile objects and ordinary
objects. To support these different types of objects, Jawa defines three basic classes:
1. Jawa.migrant is the base class that must be extended by the custom mobile objects.
This base class implements basic services for mobility and resource requirement
measurement. The Migrant is used for the job that may move between Colonies. A
Migrant may contain other Migrants or serializable objects.
2. Jawa.resident is the base class for non-mobile objects. The base class implements
the services for remote access and resource requirement measurement. The Resident
is used to wrap a job that must stay on one computer but remain accessible from
other computers, for example a database or an input/output device. A Resident may
contain some Migrants or serializable objects, as well as Residents or non-serializable
objects.
3. Jawa.root is the base class for the main object of the application. The programmer
must extend this class and add the necessary code to initialize the application, such
as the first level Migrants and Residents.
Under that framework, the programmer must carefully design the job partitioning of
the application by exploiting parallelism. The programmer should arrange a dividable
job as a Migrant, and group highly related Migrants under a parent Migrant. For a
simple example, Figure 4 shows a possible job partitioning of an application to
calculate the following formula:
A = \sum_{i=1}^{n} X_i + \sum_{j=1}^{m} Y_j
(Fig. 4: job partitioning. The application's Root holds a User Interface Resident and an 'Add A' Migrant, which groups the 'Sum X' and 'Sum Y' Migrants.)
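To make the tree-structured partitioning concrete, the sketch below mirrors Fig. 4. Since the paper does not show Jawa's API, the Migrant base class and all signatures here are illustrative assumptions.

    import java.io.Serializable;
    import java.util.ArrayList;
    import java.util.List;

    // Illustrative stand-in for Jawa.migrant: serializable, tree-structured.
    abstract class Migrant implements Serializable {
        private final List<Migrant> children = new ArrayList<>();
        void addChild(Migrant m) { children.add(m); }  // nesting in the tree
        abstract double run();                          // the unit of work
    }

    class Sum extends Migrant {
        private final double[] values;
        Sum(double[] values) { this.values = values; }
        public double run() {
            double s = 0;
            for (double v : values) s += v;
            return s;
        }
    }

    // Parent Migrant grouping the two highly related sums of Fig. 4; the
    // whole subtree can migrate together, or each Sum individually.
    class Add extends Migrant {
        private final Sum sumX, sumY;
        Add(Sum x, Sum y) { sumX = x; sumY = y; addChild(x); addChild(y); }
        public double run() { return sumX.run() + sumY.run(); }
    }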
The scheduling package defines the interfaces and implements some base classes
for scheduling. Scheduling itself is a broad topic, thus this package is a flexible
framework that provides the skeletons for:
The selection mechanism for the machines and the jobs. Jawa defines a machine
capabilities matrix that includes platform information and benchmarking results (see
section 5.4). Programmers may define a custom job resource requirements matrix
and the selection policy, such as random, least loaded, etc.
The dispatching mechanism. Jawa supports initial placement (by dynamic class
loading) and preemptive migration (by weak mobility; strong mobility is still under
development).
The scheduling algorithm. The scheduling algorithms may be static or dynamic,
with any search mechanism (enumeration or heuristic). Jawa provides generic
static-table scheduling and dynamic-list scheduling.
The discovery package provides the services to discover remote computers (Islands)
in the network automatically. A standard package for this function is Jini [8].
However, Jini does not support inequality comparisons in search criteria. An
application can search, for example, for all computers that have memory equal to
128 MB; however, it cannot ask for all computers that have memory greater than
128 MB. Therefore, applications would have to search all possible combinations, or
seek all remote computers and then test them one by one.
As an alternative, Jawa uses the lightweight RMI-discovery protocol [5]. The
application sends the discovery request directly to the Islands using multicast
communication. Each Island evaluates the request and, on acceptance, answers
directly to the application using unicast communication. This technique greatly
simplifies and speeds up the discovery process.
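The following sketch illustrates the multicast-request / unicast-reply pattern with plain java.net datagrams; the group address, port, and payload are made up for illustration and are not the protocol of [5].

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;

    public class DiscoverIslands {
        public static void main(String[] args) throws Exception {
            InetAddress group = InetAddress.getByName("230.0.0.1");
            byte[] request = "JAWA-DISCOVER".getBytes();
            try (DatagramSocket socket = new DatagramSocket()) {
                // Multicast the discovery request to all listening Islands.
                socket.send(new DatagramPacket(request, request.length,
                                               group, 4446));
                // Each accepting Island answers with a unicast datagram.
                byte[] buf = new byte[256];
                DatagramPacket reply = new DatagramPacket(buf, buf.length);
                socket.setSoTimeout(2000);   // stop after a quiet period
                while (true) {
                    socket.receive(reply);   // throws on timeout
                    System.out.println("Island at " + reply.getAddress());
                }
            } catch (java.net.SocketTimeoutException done) {
                // no more replies; discovery finished
            }
        }
    }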
The mobile package provides the services for the movement of computation units
between computers. As noted by Fuggetta [11], complete mobility should carry data,
code, execution state, and references. Standard Java, however, does not provide
execution state or reference mobility. The Jawa mobile package adds the necessary
mobility services on top of standard Java as follows:
The data and code mobility uses standard Java RMI. Each application's Root
includes a custom class server. At the other end, the Colonies are configured to use
the codebase from that class server. More efficient data and code migration is
discussed in [14].
The execution state mobility uses several approaches. As a baseline, Jawa provides
weak mobility with a state machine, like Aglets [12] or Voyager [21]. The support
for strong mobility is not yet stable. The first option is portable thread migration
(Brakes) [18], but the application must be recompiled for some byte-code
modification. The other option is a thread serialization tool-kit (JavaThread) [7];
however, it requires a non-standard Java virtual machine.
The reference mobility uses a smart proxy that wraps the standard RMI stub and
adds the functionality to reconnect itself when the referenced Migrant moves. The
basic idea of the smart proxy is taken from [13][22]. Another example of mobile
references is MobileRMI [1].
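A minimal sketch of the reconnecting smart-proxy idea is shown below; the MigrantService interface and the relocation callback are assumptions for illustration, not Jawa's actual classes.

    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.util.function.Supplier;

    // Hypothetical remote interface for a Migrant's services.
    interface MigrantService extends Remote {
        String status() throws RemoteException;
    }

    // Illustrative smart proxy: wraps an RMI stub and re-resolves it when
    // the referenced Migrant has moved. The relocate() source is assumed
    // to query the Island or Root for the Migrant's new location.
    class MigrantProxy implements MigrantService {
        private MigrantService stub;
        private final Supplier<MigrantService> relocate;

        MigrantProxy(MigrantService stub, Supplier<MigrantService> relocate) {
            this.stub = stub;
            this.relocate = relocate;
        }

        public String status() throws RemoteException {
            try {
                return stub.status();
            } catch (RemoteException moved) {
                stub = relocate.get();  // fetch the stub at the new location
                return stub.status();   // retry once, transparently
            }
        }
    }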
The local package provides the necessary services to access local resources on the
remote computers, such as files or input/output devices. Usually these are platform
dependent and may involve some native code. The local package classes are installed
on the remote machines and managed by the Islands. If a Migrant needs to access
any local resource, it may get the necessary classes from the Island and instantiate
them within the same JVM (the Colony). Therefore, the Migrants can access the
resources directly, without the overhead of inter-process communication.
Jawa has several advantages over standard Java RMI + Jini, as shown in Table 1. In
short, Java RMI + Jini supports the basic client-server distributed architecture and
automatic server object management. As a more advanced tool-kit, Jawa supports
the mobile objects architecture, complete with automatic remote machine
management, hierarchical job structuring, and autonomic scheduling.
Jawa is similar to many other projects. The international project for Java high
performance computing is Java Grande (www.javagrande.org). From another
direction, a new paradigm for automatic environments is emerging as grid computing
(www.gridcomputing.org).
7 Conclusion
Jawa provides a tool-kit that simplifies distributed application development. The
programmers can compile and deploy the application as a single archive. The
application is then started on a local computer, but is automatically distributed onto
the available remote computers in the network. After completing an execution, the
programmer may modify the program and repeat the execution without class
versioning problems. Several Jawa-based applications may run on the same network
with minimal interference with each other, except for the shared resources.
Jawa-based applications that are carefully designed for parallelism may gain
performance improvements. Jawa also provides load balancing for the overall
system. However, the costs of communication, migration, and scheduling are
relatively high. Therefore, Jawa is more appropriate for large applications.
For future work, Jawa should consider other aspects such as distributed scheduling
and fault tolerance. Distributed scheduling is necessary to ensure that the overall
system converges when several Jawa-based applications run simultaneously. The
fault tolerance feature is an assurance against possible partial failures in the network
or the remote computers.
References
1. Avvenuti et al., MobileRMI: a Toolkit to Enhance Java RMI with Mobility, 6th ECOOP
Workshop on Mobile Object Systems: Operating System Support, Security and
Programming Languages, June 13, Sophia Antipolis, France, 2000.
2. Ban, B., JavaGroups-Group Communication Patterns in Java,
http://www.javagroups.com/javagroupsnew/docs/papers/Patterns.ps.gz, November 2002
1 Introduction
There is an emerging demand for Java-oriented reliable software for distributed
applications, due to Java platform independence, the dynamic plug-in of Java
components (e.g. downloading Java applets), and the simplicity of application
programming, just to name some of the Java features. One well-known way to add
reliability to a distributed application is through software replication. However,
there are only a few systems that implement replication in Java.
* This work has been partially supported by grants from the EU IST projects “EU-PUBLI.COM” (#IST-2001-35217) and “MIDAS” (#IST-2001-37610), and by grants from MIUR in the context of the projects “DAQUINCIS” and “MAIS”.
Software replication is a well-known technique for increasing the availability of a
service by exploiting specialized software running on cheap COTS (Commercial-Off-The-Shelf)
hardware. The basic idea underlying software replication is to replicate
the server of a given service on different hosts connected by a communication
network, so that the service's clients can connect to different server replicas,
increasing the probability of getting replies to their requests. When dealing with a
replicated service, the problem arises of guaranteeing consistency among the local
states of the replicas despite crashes. Active [1] and passive [2] replication are
well-known approaches to increasing the availability of a stateful service. These
approaches employ group communication primitives and services such as total order
multicast, view-synchronous multicast, group membership, etc. Implementations of
such primitives are provided by group communication toolkits, e.g., ISIS [3],
TOTEM [4], and Maestro/Ensemble [5]. Previous group toolkits have been
implemented in C and/or C++. Only recently have Java group toolkits emerged, e.g.,
JavaGroups [6], Spread [7], and Appia [8]. However, to the best of our
knowledge, there does not exist a performance comparison among these toolkits, and
the evaluation of Java implementations is a critical point, as distinct Java designs of
the same application can lead to huge differences in performance due to the
interaction between the virtual machine, the underlying operating system, and the
network. Furthermore, a performance comparison among such group toolkits cannot
be separated from a specific application; otherwise the comparison risks being useless.
In this paper we present a performance evaluation of Java group toolkits in the
context of a specific architecture for software replication, namely three-tier (3T)
software replication based on a group toolkit. More specifically, in a three-tier
architecture for software replication, clients (the client-tier) interact with a middle
tier (the middle-tier) that forwards client requests to replicas (the end-tier) while
maintaining consistency. To achieve this, the middle-tier embeds two basic
components, namely the sequencer and the active replication handler. The first
component assigns, in a persistent way, consecutive and unique sequence numbers to
client requests; this is based on the classical "one-shot" total order multicast
provided by group toolkits¹, while the second masters the client/server interaction,
enforcing atomicity on the end-tier.
To evaluate Java group toolkits, we therefore realized an implementation of the
sequencer component, one for each group toolkit, and measured their performance in
terms of client latency and sequencer component latency. We show how the
performance of a given Java group toolkit is heavily influenced by the protocol
implementing the total-order primitive. To evaluate the gap between a C++ group
toolkit and the Java ones, we also show the performance results of the same 3T
architecture whose sequencer is based on Maestro/Ensemble. This performance
comparison points out that, once the "one-shot" total order protocol well tailored to
the underlying application has been selected, the gap between a Java group toolkit
and Maestro/Ensemble is acceptable.
The rest of the paper is structured as follows: Section 2 introduces the main features
of Java group communication toolkits. Section 3 presents the application scenario,
namely software replication based on a three-tier architecture, in which the group
toolkits are evaluated, and finally Section 4 shows the performance results.
Group Communication Toolkits (GCTs) have proven to be an effective tool for
building reliable software in partially synchronous distributed systems². They
provide a rich set of services and primitives (e.g., total order multicast, group
membership, etc.) which help in the development of reliable distributed applications.
¹ "One-shot" means that there is no relation between any two consecutive runs of a total order
protocol.
² A partially synchronous distributed system alternates between stable and unstable intervals.
During a stable interval there is a bound (known or unknown) on (i) communication transfer
delays and (ii) the time a process takes to execute a step. During unstable intervals the system
becomes purely asynchronous. If the duration of stable intervals tends to zero, the group
toolkit no longer guarantees its services. Conditions verifying partial synchrony can easily be
ensured on a LAN.
Concerning these two blocks, we should note that if they reside in different
processes or on different hosts, there are two additional hops in the communication
between any two processes of a group, thus yielding an additional overhead that
results in performance degradation.
Another abstraction level regards the core system itself; in particular, we consider
the protocols implemented within the core and the way in which they interact. From
this point of view, GCTs can be classified into two distinct types: one is
characterized by a fixed set of protocols which always interact in the same way; the
other is based on the concept of a composable protocol stack (see Figure 1(b)),
where each specific functionality, e.g. total ordering of messages, is confined to a
micro-protocol, which can be placed above another to form a stack. The developer is
also free to add new semantics by encapsulating them in a new micro-protocol,
which can be used to compose another protocol stack. Moreover, this composition
can be done dynamically, giving the possibility to build adaptive systems. As an
example, an application sending totally ordered messages using the stack depicted in
Figure 1(b) can decide, at some point in time, to add an encryption layer, defined by
the developer, to obtain confidentiality. This behavior is not possible if the core
system is built as a fixed predefined set of protocols, because such a core allows the
developer to choose only among a static set of GCT services, whereas every
additional functionality must be added at the application level, which is not easy and
sometimes even impossible. In contrast, from a performance viewpoint, a
composable protocol stack can incur penalties due, for example, to arbitrary delays
introduced by micro-protocols and to the growth of message size due to layering.
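As a rough illustration of such composition, the sketch below builds a channel from a stack of micro-protocols using the JavaGroups channel API. The property string and the package naming follow later JGroups releases; the exact 2002-era configuration may differ, so the whole stack specification here is illustrative only.

    import org.jgroups.JChannel;
    import org.jgroups.Message;

    public class StackDemo {
        public static void main(String[] args) throws Exception {
            // The protocol stack is composed from micro-protocols,
            // bottom-up; this property string is illustrative only.
            String props = "UDP(mcast_port=45566):PING:FD:" +
                           "pbcast.NAKACK:UNICAST:pbcast.STABLE:" +
                           "pbcast.GMS:TOTAL";  // total order layer on top
            JChannel channel = new JChannel(props);
            channel.connect("demo-group");
            channel.send(new Message(null, null, "totally ordered hello"));
            channel.close();
        }
    }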
We now give a brief description of the GCTs we used in the experiments, focusing
on aspects that allow us to classify them on the basis of the previous discussion.
Maestro/Ensemble [5]. Ensemble is a flexible and efficient toolkit written in the
Objective Caml language. It is based on the concept of a composable protocol stack,
but also implements some optimizations, trying to overcome the performance
degradation that results from a layered architecture. Maestro is an interface to
Ensemble written in C++, and can thus be considered the API of the toolkit with
respect to our framework defined above. Maestro starts Ensemble in a separate
thread, therefore the API and the core reside in the same process, sharing the same
address space.
Spread [7]. The Spread toolkit is intended to be used in wide area networks, but it
also performs excellently in LANs. It is based on a client-daemon architecture,
where the daemon represents the core system. The client connects to the daemon by
means of a predefined API and sends messages to other members of a group through
the daemon itself. Therefore the API and the core reside in different processes,
which can be collocated on the same host or on distinct ones. The daemon employs a
fixed set of protocols; therefore, even though it offers a certain amount of flexibility
in its services, the developer is forced to add every additional functionality at the
application level. The daemon is written in ANSI C, whereas the API is available in
C++, Java, and other languages.
Appia [8]. As with Ensemble, the Appia toolkit is based on a protocol stack, but it is
more flexible than Ensemble, as it provides the possibility to extend not only the
protocols that compose the stack but also the events used for intra-stack
communication. In Appia there is no clear separation between the API and the core
system, because an application makes use of the toolkit by providing its own
micro-protocol and collocating it on top of the stack. It follows that the application
resides in the same process as the core system. The entire toolkit is written in Java.
JavaGroups [6]. JavaGroups is also entirely written in Java, with the aim of
exploiting as much as possible all the benefits of this language. It can be considered
an API in its own right, because it can easily be extended to integrate with any group
communication toolkit. As an example, it provides interfaces to Ensemble as well as
to its native core system. It is thus based on a protocol stack. In this configuration,
the application and the core system run in the same process. An important remark on
JavaGroups is that it is heavily based on patterns.
Toolkit           | API language | Core language | Collocation of the JVM | Core design
Maestro/Ensemble  | C++          | OCaml         | -                      | Composable
Spread            | Java         | ANSI C        | Under the API          | Fixed
Appia             | Java         | Java          | Under the core         | Composable
JavaGroups        | Java         | Java          | Under the core         | Composable
The idea behind three-tier (3T) software replication is to free clients and replicas
from participating in protocols that guarantee linearizability. This is achieved by
embedding the replication logic (handling atomicity and ordering) within a software
middle-tier physically detached from both clients and replicas. In other words, the
middle-tier encapsulates all the synchrony necessary to obtain linearizable executions
of a stateful replicated service. In this architecture, a client sends its request to the
middle-tier, which forwards it to the replicas according to the replication logic
implemented by the middle-tier. Then some replica processes the request and returns
a result to the middle-tier, which finally forwards the result to the client. The
middle-tier has to be fault-tolerant to ensure the termination of a client/server
interaction in the presence of failures. In particular, if a middle-tier entity that was
carrying out a client/server interaction crashes, another middle-tier entity has to
conclude the job in order to enforce end-tier consistency despite failures. Interested
readers can find additional details on 3T replication in [10,11,12].
Figure 2 shows the components of the three-tier architecture for active replication
(i.e., each replica executes the same set of client requests in the same order). In the
remainder of this section we give a brief functional description of each component.
Retransmission/Redirection (RR). To cope with ARH replica failures and with the
asynchrony of communication channels, each client process c1, ..., cl embeds an
RR message handler. Clients invoke operations through RR, which issues uniquely
identified request messages to ARH. After a timeout set upon the request
sending has elapsed, RR retransmits the request until a result is eventually received.
The Active Replication Handler (ARH). The ARH component is the core of the
replication logic: by exploiting the sequencer service, it orders all incoming client
requests and ensures that at least one copy of each ordered client request is
eventually delivered to every available end-tier replica. Requests are sent to end-tier
replicas along with the sequence numbers provided by the sequencer. Replicas
execute requests according to these sequence numbers. Once replicas return results,
ARH forwards them to the clients.
Sequencer Service. The sequencer service is available to each ARH replica. In
particular, each ARH replica has access to the distributed sequencer service (DSS)
class, which is a distributed and fault-tolerant implementation of the persistent
sequencer service. This service returns a unique and consecutive sequence number
for each distinct client request through the invocation of the GetSeq() method. As
shown in Figure 2, each DSS component uses a GCT module as a communication
bus: in particular, our implementation is based on the usage of a "one-shot" total
order multicast primitive. DSS uses this primitive to assign a sequence number to a
client request. In other words, the DSS component manages a persistent state
composed of ⟨client request, sequence number⟩ pairs, which allows, for example,
upon the failure of some component of the 3T architecture, a client request to be
retrieved given a sequence number, or vice versa [13].
Filtering and Ordering (FO). FO is a message handler placed in front of each end-tier
replica (i) to ensure ordered execution of client requests according to the number
assigned by DSS to each request, and (ii) to avoid repeated execution of the same
client request (possibly sent twice by ARH).
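The middle-tier flow just described can be summarized in a small sketch; the interfaces and method names below are illustrative assumptions, not the actual IRL API.

    import java.io.Serializable;

    interface SequencerService {
        int getSeq(Serializable request);   // "one-shot" total order inside
    }
    interface ReplicaChannel {
        Serializable sendAndAwaitFirstReply(int seq, Serializable request);
    }

    class ActiveReplicationHandler {
        private final SequencerService dss;
        private final ReplicaChannel endTier;

        ActiveReplicationHandler(SequencerService dss, ReplicaChannel endTier) {
            this.dss = dss;
            this.endTier = endTier;
        }

        Serializable serve(Serializable clientRequest) {
            // 1. Obtain a persistent, unique, consecutive sequence number.
            int seq = dss.getSeq(clientRequest);
            // 2. Forward <seq, request> to all end-tier replicas; the FO
            //    handlers enforce ordered, at-most-once execution. The
            //    first reply received is returned to the client.
            return endTier.sendAndAwaitFirstReply(seq, clientRequest);
        }
    }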
Figure 3 illustrates a simple run of the protocol in which no process crashes. In this
scenario, client c1 invokes the method op1. Upon receiving req1, h1 invokes the DSS
component (GetSeq() method) to assign it a unique sequence number (1 in the
example). This method embeds an invocation of the total order primitive of the group
toolkit. Then h1 sends a message containing the pair ⟨1, op1⟩ to all end-tier replicas
and starts waiting for the first result from an end-tier replica. When this reply arrives,
h1 sends, in its turn, a reply message back to c1. h1 discards further replies to
operation op1 produced by end-tier replicas (for simplicity these messages are not
shown in Figure 3). Concurrently, h2 serves req2 sent by c2. A prototype of the 3T
architecture for software replication, namely the Interoperable Replication Logic
(IRL) [14], based on a CORBA infrastructure, has been developed in our department.
This prototype has been used as a testbed for the performance comparison among the
group toolkits presented in the next section.
(Fig. 3: a failure-free run. Clients c1 and c2 issue op1 and op2; ARH replicas h1 and h2 obtain GetSeq(op1)=1 and GetSeq(op2)=2 through the GCT, forward ⟨1,op1⟩ and ⟨2,op2⟩ to the end-tier replicas r1-r3, which compute both operations and return ⟨1,res1⟩ and ⟨2,res2⟩. Points A-B delimit the client latency and points C-D the DSS latency.)
4 Performance Results
In this section we describe the performance analysis we carried out on the current
IRL prototype. In particular, we first introduce the testbed platform and then explain
the set of experiments performed.
Testbed Platform. The testbed platform is composed of eight Intel Pentium II
600 MHz workstations running Windows 2000 Professional as the operating system.
Each workstation is equipped with Java 2 Standard Edition version 1.3.0_01 [15] and
IONA's ORBacus 4.1 for Java [16]; each PC is also equipped with Appia version 1.5
[8], Spread version 3.16.2 [7], and JavaGroups 2.0.2 [6]; the workstations are
interconnected by a 100 Mbit switched Ethernet LAN. The replicated CORBA object
used for the experiments is a simple hello-server that accepts requests and
immediately replies. We measured the client latency and the DSS latency varying
the number of replicas (#R, in {2,4,6,8}), the number of clients invoking the
replicated server (#C, in {2,4,6,8}), and the number of ARH components (#ARH).
As depicted in Figure 3, the client latency is the time elapsed between points A and
B, whereas the DSS latency is the time elapsed between points C and D.
Table 2 shows the group toolkit configurations used during the experiments, i.e., the
relative deployment of core and API, the total order protocol used, and its type³.
The particular GCT used plays an important role in the client latency. In fact, the
latency introduced by the DSS accounts for between 30% (Spread and
Maestro/Ensemble) and 95% (JavaGroups (TB)) of that experienced by the client.
(Fig. 4: client latency and DSS latency, in msec on a log scale, as a function of #R for Spread, Appia (AB), Appia (SB), JavaGroups (TB), JavaGroups (SB), and Maestro/Ensemble.)
From Figure 4 it can also be seen that different
implementations of the same protocol (namely the total order primitive) can yield
huge differences in the performance of a toolkit. More specifically, JavaGroups (TB)
achieves total ordering through the circulation of a token (as in TOTEM [4]) among
the members of the group. This of course introduces an additional latency, not
present in JavaGroups (SB), which is based on the notion of a sequencer, i.e., each
invocation of a total order primitive is redirected to a coordinator of the group that
orders them. Appia also shows different performance when using Appia (AB) or
Appia (SB) to obtain total order⁵. Finally, the results also confirm that toolkits with a
non-Java core have better performance. In particular, Spread gives better
performance than the two other Java toolkits, despite the collocation of its API and
core in distinct processes.
(Fig. 5: client latency and DSS latency, in msec on a log scale, as a function of #ARH for the same six toolkit configurations.)
Experiment 3. In this experiment we evaluated the client and DSS latency as a
function of the number of concurrent clients. We therefore set #R=2 (minimum fault
tolerance), #ARH=4, and varied #C in {1, 2, 4, 8}.
Figure 6 shows the experimental results. The client latency grows almost as the
DSS latency does until the number of clients reaches 4; then it roughly doubles,
while the DSS latency continues to increase smoothly. This is due to the additional
synchronization required within each ARH replica to serve multiple concurrent
clients. With 8 clients, indeed, each ARH replica receives requests from two clients,
and the access to the DSS component is serialized within each ARH. As a
consequence, the GCT works under the same conditions as if there were only 4
clients. The small growth of the DSS latency is due to the small growth of network
traffic caused by the additional client requests, and it is almost independent of the
toolkit, as confirmed in Figure 6(b).
(Fig. 6: client latency (a) and DSS latency (b), in msec on a log scale, as a function of #C for the same six toolkit configurations.)
In contrast, the synchronization within each ARH causes the doubling of the client
latency with respect to that observed with 4 clients. Therefore, overall performance
is mainly influenced by the GCT when the number of clients is less than or equal to
the number of ARH replicas. Figure 6(b) also confirms the gap between JavaGroups
(SB) and the other toolkits with a non-Java core. Let us finally remark that in all the
experiments Maestro/Ensemble outperforms the Java implementations, which lie on
a virtual machine. However, pure Java GCTs like Appia (SB) and JavaGroups (SB)
can be configured to perform close to Spread, which is comparable with
Maestro/Ensemble in terms of efficiency.
5 Conclusions
In this paper we have first identified some architectural properties that influence the
performance of a GCT, and we have then compared the performance of three Java
GCTs and a C++/OCaml toolkit in the context of three-tier software replication. The
results are promising, as the expected performance degradation due to Java can be
considered acceptable with respect to all the advantages of using this language.
However, a gap with C++ still exists, and optimizations are required in order to
reduce it.
References
1. Schneider, F.B.: Replication Management Using the State Machine Approach. In Mullender,
S., ed.: Distributed Systems. ACM Press - Addison Wesley (1993)
2. Budhiraja, N., Schneider, F., Toueg, S., Marzullo, K.: The Primary-Backup Approach. In
Mullender, S., ed.: Distributed Systems. ACM Press - Addison Wesley (1993) 199–216
3. Birman, K., van Renesse, R.: Reliable Distributed Computing With The ISIS Toolkit. IEEE
Computer Society Press, Los Alamitos (1993)
4. Moser, L.E., Melliar-Smith, P.M., Agarwal, D.A., Budhia, R.K., Lingley-Papadopoulos, C.A.,
Archambault, T.P.: The Totem System. In: Proc. of the 25th Annual International Symposium
on Fault-Tolerant Computing, Pasadena, CA (1995) 61–66
5. Vaysburd, A., Birman, K.P.: The Maestro Approach to Building Reliable Interoperable Dis-
tributed Applications with Multiple Execution Styles. Theory and Practice of Object Systems
4 (1998) 73–80
6. Ban, B.: Design and Implementation of a Reliable Group Communication Toolkit for Java.
Cornell University (1998)
7. Amir,Y., Stanton, J.: The Spread Wide Area Group Communication System. Technical Report
CNDS-98-4, Center for Networking and Distributed Systems, Computer Science Department,
Johns Hopkins University (1998)
8. Miranda, H., Pinto, A., Rodrigues, L.: Appia, a flexible protocol kernel supporting multiple
coordinated channels. In: Proceedings of The 21st International Conference on Distributed
Computing Systems (ICDCS-21), Phoenix, USA, IEEE Computer Society (2001) 707–710
9. Herlihy, M., Wing, J.: Linearizability: a Correctness Condition for Concurrent Objects.
ACM Transactions on Programming Languages and Systems 12 (1990) 463–492
10. Marchetti, C.: A Three-tier Architecture for Active Software Replication. PhD thesis, Dipar-
timento di Informatica e Sistemistica, Università degli Studi di Roma “La Sapienza” (2002)
11. Baldoni, R., Marchetti, C., Tucci-Piergiovanni, S.: Asynchronous Active Replication in Three-
tier Distributed Systems. In: Proc. of the IEEE Pacific Rim Symposium on Dependable
Computing (PRDC02), Tsukuba, Japan (2002) to appear, also available as Technical Report
at http://www.dis.uniroma1.it/∼irl.
12. Baldoni, R., Marchetti, C., Termini, A.: Active Software Replication through a Three-tier
Approach. In: Proc. of the 22nd IEEE International Symposium on Reliable Distributed
Systems (SRDS02), Osaka, Japan (2002), pp. 109–118
13. Baldoni, R., Marchetti, C., Tucci-Piergiovanni, S.: Fault-tolerant Sequencer: Specification and
an Implementation. In Ezhilchelvan, P., Romanovsky, A., eds.: Concurrency in Dependable
Computing. Kluwer Academic Press (2002)
14. IRL Project Web Site: (http://www.dis.uniroma1.it/∼irl)
15. Java Sun website: (http://java.sun.com)
16. IONA Web Site: (http://www.iona.com)
17. Birman, K., Joseph, T.: Reliable Communication in the Presence of Failures. ACM Transac-
tions on Computer Systems 5 (1987) 47–76
18. Dèfago, X.: Agreement-Related Problems: From Semi Passive Replication to Totally Ordered
Broadcast. PhD thesis, École Polytechnique Fédérale de Lausanne, Switzerland (2000)
A Java-Based, Distributed Process Management
System for Collaborative Design and Manufacturing
Moon Jung Chung, Sangchul Kim, Hyun Kim, and Ho Sang Ham
Yongin, Korea
[email protected]
Concurrent Engineering Team, ETRI, 161 Kajong-Dong, Taejon, Korea
[email protected]
Mobile Distributed Processing Team, ETRI, 161 Kajong-Dong, Taejon, Korea
[email protected]
1 Introduction
There has been little follow-up effort in the commercial market, but our system
basically follows the architecture suggested by the WfMC.
Many distributed process (or workflow) management systems have been proposed
which integrate the work of different organizations [5, 11]. Most of the systems [9,
17] use CORBA to provide interoperability between remote applications. A
CORBA-based architecture requires an ORB to be installed at every site running
CORBA-based applications. This requirement is a great obstacle to the development
of a portable client in a client-server style system, where the client is usually
dedicated to the user interface while the server (or servers) provides the application
logic. This is the reason that MIDAS uses RMI (Remote Method Invocation).
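A pure-Java RMI client needs nothing beyond the JDK's RMI runtime, which is what makes a portable, applet-style client feasible. In the sketch below, the host name and the ProcessServer interface are hypothetical placeholders, not MIDAS's actual API.

    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.rmi.registry.LocateRegistry;
    import java.rmi.registry.Registry;

    // Hypothetical remote interface for this sketch.
    interface ProcessServer extends Remote {
        String listProcesses() throws RemoteException;
    }

    public class CockpitClient {
        public static void main(String[] args) throws Exception {
            // No ORB installation is needed: RMI ships with the JDK.
            Registry registry = LocateRegistry.getRegistry("midas-server");
            ProcessServer server =
                (ProcessServer) registry.lookup("ProcessServer");
            // The applet/client then drives its UI against this remote object.
            System.out.println(server.listProcesses());
        }
    }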
The core of MIDAS is a process grammar, which provides the theoretical foundation
to represent, manipulate, and execute design and manufacturing processes [1]. In
the MIDAS system, design and manufacturing processes are modeled as a collection
of tasks. Each task can be decomposed into a set of subtasks. Our process flow graph
depicts tasks, data, and the relationships among them, describing the sequence of
tasks for a larger D&M activity. Four basic symbols are used to represent the process
flow: Logical Tasks (oval nodes), Atomic Tasks (two concentric oval nodes),
Specifications (rectangular nodes), and Selectors (diamond nodes). A logical task can
be decomposed into subtasks, while an atomic task cannot. A selector selects a
specification or a parameter for a task. These elements can be combined into a
process flow graph using directed arcs to indicate the specifications used and
produced by each task.
Using these graphical elements, a process flow graph can be created in a top-down
fashion. Fig. 1 shows a simple outsourcing arrangement, where company A produces
a telephone by manufacturing all items except the frame and assembling all items,
and company B manufactures the frame only. Process flow graphs can describe
processes in varying degrees of detail. The overall process can be seen because the
details are hidden within the tasks. A graph containing many logical nodes describes
what should be done without describing how it should be done (for example,
specifying which tools to use). Conversely, a graph in which all task nodes are either
atomic or selectors fully describes a complete methodology for a design.
The process grammar provides a mechanism for transforming a high-level process
flow graph into progressively more detailed process flow graphs. The grammar
consists of a set of production rules. A production is a substitution that permits the
replacement of a logical task node with a flow graph that represents one way of
performing the task. If there are several production rules for a logical task, this
implies that there are alternatives for the task. This capability is critical to
maintaining the usability and effectiveness of the overall framework. Fig. 2 shows
one production for the "Item Production" task. The process flow graph of Fig. 1 has
been obtained from a process flow graph by replacing the "Item Production" task
with the production of Fig. 2.
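As a rough illustration of how a production rule rewrites a flow graph, consider the sketch below; MIDAS's actual data structures are not given in the paper, so FlowGraph, ProcessGrammar, and every method here are illustrative assumptions.

    import java.util.List;
    import java.util.Map;

    class FlowGraph { /* tasks, specifications, selectors, and arcs */ }

    class ProcessGrammar {
        // Several productions per logical task encode design alternatives.
        private final Map<String, List<FlowGraph>> productions;

        ProcessGrammar(Map<String, List<FlowGraph>> productions) {
            this.productions = productions;
        }

        // Replace a logical task node by one of its productions, yielding
        // a more detailed process flow graph.
        FlowGraph elaborate(FlowGraph graph, String logicalTask, int choice) {
            FlowGraph subgraph = productions.get(logicalTask).get(choice);
            return substitute(graph, logicalTask, subgraph);
        }

        private FlowGraph substitute(FlowGraph g, String node, FlowGraph sub) {
            // Splice the subgraph in place of the node, reconnecting the
            // specification arcs (details elided in this sketch).
            return g;
        }
    }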
Fig. 3 illustrates the architecture of MIDAS. The system consists of various
components and links external tools to design and manufacturing processes. The
system is partitioned into a client side and a server side. The database, the connection
server, the user information server, and the site proxy server are on the server side;
the cockpit is on the client side. The cockpit is a Java applet, which increases
portability, and all other components are Java applications. All components
communicate with each other using RMI connections.
Cockpit: The cockpit controls the interactions between the user and the other
components. It allows multiple users to collaboratively edit and execute a process
flow by sharing the execution status and visualization data. The collaborative process
execution is carried out in such a way that each of the collaborating cockpits executes
some part of the whole process. The task manager is the part of the cockpit that
schedules a sequence of tasks and invokes each task. It calls pre-evaluation functions
to estimate the possibility of success for each alternative and post-evaluation
functions to determine whether a task's outputs meet the specified requirements.
XML is becoming the standard format for structured documents and data. MIDAS
utilizes XML to represent various data such as knowledge about process and task, the
run-time status of a process in execution, and the graphical information of the flow
graph. The first type of data is stored in the process database, but the second and third
types of data are stored in the workspace.
The process database keeps three types of XMLs (called PROCESS XML, TASK
XML, and PRODUCTION XML, respectively) that encode the knowledge about
processes and tasks. The PROCESS XML encodes the top-level description of a
manufacturing or design process. The PRODUCTION XML encodes a production of
a logical task. The TASK XML is a repository of information about a task, including
constraints, the set of production names for the task if it is a logical one, and the
external tool for accomplishing the task if it is an atomic one. Using XML enables
collaborating companies to share this knowledge more easily, since XML provides a
common and simple format. XML also makes it easy to integrate our system with
other XML-enabled manufacturing tools, whose number is expected to grow rapidly.
Moreover, the simple syntax of XML increases system scalability.
Enterprise systems are constantly evolving: new processes and products are added to
the system, and new technology makes old constraints obsolete. In a truly scalable
enterprise system, processes, tools, and machines should be added in a modular way. New
information is simply added by introducing a new XML tag or a new attribute of an
existing XML tag.
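As a rough illustration, the following Java fragment reads a TASK XML of this kind
with the standard DOM API; the tag and attribute names are invented for illustration
and do not reproduce the actual MIDAS schemas.

  // A minimal sketch, assuming a hypothetical TASK XML of the form:
  //   <task name="Item Production" kind="logical">
  //     <production name="make-in-house"/>
  //     <production name="outsource"/>
  //   </task>
  import javax.xml.parsers.DocumentBuilderFactory;
  import org.w3c.dom.Document;
  import org.w3c.dom.Element;
  import org.w3c.dom.NodeList;

  public class TaskXmlReader {
      public static void main(String[] args) throws Exception {
          Document doc = DocumentBuilderFactory.newInstance()
                  .newDocumentBuilder().parse("task.xml");
          Element task = doc.getDocumentElement();
          System.out.println("task: " + task.getAttribute("name")
                  + " (" + task.getAttribute("kind") + ")");
          // a logical task lists its alternative productions
          NodeList prods = task.getElementsByTagName("production");
          for (int i = 0; i < prods.getLength(); i++) {
              Element p = (Element) prods.item(i);
              System.out.println("  alternative: " + p.getAttribute("name"));
          }
      }
  }

Adding a new kind of information then amounts to introducing a new tag or attribute,
which existing readers can simply ignore.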
We use a three-step approach to generating the XML schemas of the three XMLs
from the process and task specifications. The first step is to capture and represent
those specifications in the form of conceptual UML diagrams, which correspond to
ER diagrams in database design. The second step is to convert the conceptual
diagrams to logical UML diagrams, and the last step is to convert the logical
diagrams to XML schemas. The motivation for the three-step approach is that,
instead of devising XML schemas directly from the specifications, conceptual
modeling is an easier way to capture the complete knowledge about the process and
to verify the correctness of the captured knowledge.
The second and third steps can be carried out algorithmically. The conversion in the
second step can be done with an extended version of the method suggested by
Routledge et al. [14]. The extension is needed because their method does not support
the generalization relationship, which is necessary for process grammar-based
specifications. The extension is simple: a super-class is not mapped to a complex-type
object, but its attributes are reflected in the complex-type objects mapped to its
sub-classes. The conversion in the third step can be done with the method suggested
by Booch et al. [2]. For more details of our UML/XML modeling, refer to [15].
The EX XML keeps track of the run-time status of a process from the start of
execution. The information kept in the EX XML includes the list of productions used
so far for the elaboration of logical tasks, the readiness of inputs and outputs, whether
a task is being executed or not, etc. When the process flow graph is elaborated with a
production, the elaboration is recorded in the EX XML. Thus, the EX XML keeps the
history of process execution, which is needed to find an unexplored alternative when
a task fails. The GRAPH XML keeps only the current status of a process in execution
and the geometry of the process flow graph. The process flow graph is visualized
using an XSL style sheet; each cockpit may apply its own XSL style sheet for a
customized view.
The reasons for using XML to keep the execution status and visualization data of a
process are manifold:
– The simplicity of XML facilitates the development of external applications for
additional services, such as the analysis or monitoring of the results of process
execution.
– The run-time status of a process is inherently hierarchical, since the elaboration of
a process flow graph is performed hierarchically: after a task of a process flow graph
is elaborated with a production, the tasks of the production are elaborated in turn.
The hierarchical structure of XML is well suited to representing this hierarchical
information.
– XML facilitates integrating MIDAS with other process management systems: it is
easy to develop tools that convert the execution status of a process into formats
understandable by those systems.
In MIDAS, there is a separate EX XML for each individual task. The EX XML for
the top-level description of a process has almost the same content as the one for a
logical task. In each EX XML, an attribute called status records the execution status
of every constituent task. For a logical task that has already been elaborated, the
productions that have been explored are also recorded. If a logical task fails (i.e. its
selected production fails), the failure and the newly selected production, if any, are
also recorded.
There is one GRAPH XML for each cockpit in a collaboration. The GRAPH XML
serves as a mechanism for sharing the current status of a process in execution. After
an EX XML is updated by an operation (Apply or Rollback, described later) at a
cockpit, the GRAPH XML of that cockpit is updated accordingly and then sent,
through the "communication servers", to the other collaborating cockpits.
The GRAPH XML must be constructed so as to protect the proprietary information
of companies from unauthorized use, by hiding detailed information. MIDAS
supports two types of permission on tasks: read and execute. A user with "execute"
permission on a task can execute the task. A user with "read" permission on a task
can view all the information on the task, including how the task, if it is a logical one,
has been elaborated, and can access the outputs of the task. Before the
communication server forwards a GRAPH XML to a cockpit, it modifies the GRAPH
XML so that the information on a task is masked out if the user of the cockpit has no
"read" permission on the task. For such a logical task, for example, the elaboration of
the task is masked out so that the task appears as a single node in the process flow
graph; the user cannot see how the task has been performed.
The cockpit finishes the execution of a process when the initial process flow graph
has been fully elaborated. Applications bound to atomic tasks and selector tasks are
executed during execution. The task scheduling of the thread for a logical task, or for
the top-level representation of a process, is performed basically in a data-flow
fashion: as soon as all inputs of a constituent task are available, the task is scheduled
as ready for execution. When the execution of a task completes successfully, all
outputs of the task become available to other tasks. The data-flow style allows the
concurrent execution of multiple tasks.
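The following is a minimal sketch of this data-flow policy, with a hypothetical Task
type and invented data names standing in for the MIDAS task manager; each ready
task is handed to its own thread, and its outputs are released on completion.

  import java.util.*;
  import java.util.concurrent.*;

  public class DataFlowSketch {
      record Task(String name, Set<String> inputs, Set<String> outputs) {}

      public static void main(String[] args) throws Exception {
          Set<String> available = ConcurrentHashMap.newKeySet();
          available.add("specification");
          List<Task> pending = new LinkedList<>(List.of(
                  new Task("design", Set.of("specification"), Set.of("drawing")),
                  new Task("machining", Set.of("drawing"), Set.of("part"))));
          ExecutorService pool = Executors.newCachedThreadPool();
          while (!pending.isEmpty()) {
              for (Iterator<Task> it = pending.iterator(); it.hasNext(); ) {
                  Task t = it.next();
                  if (available.containsAll(t.inputs())) {   // all inputs ready
                      it.remove();
                      pool.submit(() -> {                    // one thread per ready task
                          System.out.println("executing " + t.name());
                          available.addAll(t.outputs());     // release outputs on success
                      });
                  }
              }
              Thread.sleep(50);  // wait for running tasks to release new outputs
          }
          pool.shutdown();
      }
  }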
Two modes of execution, manual and auto, are supported; they differ in task
scheduling and production selection. In manual mode, the user determines which
task to execute next and selects a production; in auto mode, these decisions are made
automatically by MIDAS.
Two styles of collaboration among cockpits are supported: screen-sharing style and
client-server style. These two styles reflect the two ways in which companies
collaborate. The screen-sharing style supports CSCW (Computer Supported
Collaborative Work) style collaboration: each user can view how the other users
perform their own work. In the client-server style, a client cockpit sends a server
cockpit a request for the execution of a task. The server cockpit then performs the
task and notifies the client cockpit of the termination of the task. The client cockpit
cannot see how the task has been done, and the server likewise cannot see the
execution of any tasks other than its own.
Since there can be more than one ready task (a task all of whose inputs are available),
multi-threaded process execution is conceptually natural: a thread takes care of one
ready task. Sometimes backtracking, called "rollback", to a previous point of
execution occurs. In case of the failure of a task, rollback occurs to either its
predecessor or its parent task. The thread of the task to which rollback occurs is
asked to find another alternative, in order to explore a different way of execution. A
task fails when all the alternative ways of executing it have been explored without
success. For example, when all the productions of a logical task have been explored
but none has led to a fully elaborated flow graph, the task is said to fail. When a task
fails, the thread for the task dies without heavy interaction with other threads. When
the task is considered for execution again, a new thread is created for it.
A thread does not die when it has succeeded, because a rollback may happen to it
afterwards. In case of the failure of a thread, it is difficult to determine to which
predecessor rollback should occur if the thread has multiple predecessor threads.
Rollbacks cannot occur to those predecessor threads concurrently, since the threads
do not share the internal details of their work. For example, when a task with two
predecessor tasks fails, we need to ask the threads of the two predecessors to find an
alternative one after the other; otherwise, it is very hard to enumerate all the
combinations of the alternatives available to the two threads.
Fig. 4 illustrates the classes that implement the Cockpit. The key classes are the
following (a sketch of the Apply behavior follows this list):
Apply, Rollback: An Apply operation executes a task. Applied to a logical task, it
first calls the pre-evaluation function of the logical task to select the best production
of the task; once a production is chosen, the task is elaborated with the flow graph of
the production. Applied to an atomic task, it performs the designated work on the
task, such as an external tool invocation. A Rollback operation is the reverse of the
Apply operation: when it is applied to a task, the task is forced to fail. There are two
choices of rollback point: predecessors and parents. In manual mode, the user selects
one of the two choices; in auto mode, the second choice is made only if the task has
no predecessor. Apply and Rollback operations modify the EX XML and the
GRAPH XML.
Shrink, Expand: These two operations, provided by the cockpit, enable the user to
change the presentation of the process flow graph. Shrink applies only to a logical
task in the flow graph: it updates the flow graph so that the detailed process flow of
the logical task is hidden and a single task node is shown instead. Expand is the
reverse of Shrink: it replaces a shrunken node by its underlying flow graph. Both
operations are recorded in the GRAPH XML.
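The sketch below, using hypothetical TaskNode and Production types rather than the
actual MIDAS classes of Fig. 4, illustrates how Apply dispatches on logical versus
atomic tasks and how a task fails once rollback has exhausted its alternatives.

  import java.util.ArrayDeque;
  import java.util.Deque;
  import java.util.List;

  public class ApplySketch {
      record Production(String name) {}

      static class TaskNode {
          final boolean logical;
          final Deque<Production> unexplored = new ArrayDeque<>();
          TaskNode(boolean logical, List<Production> alternatives) {
              this.logical = logical;
              unexplored.addAll(alternatives);
          }
          void apply() {
              if (logical) {
                  // pre-evaluation would rank the alternatives; here we take the next one
                  Production p = unexplored.poll();
                  if (p == null) throw new IllegalStateException("task fails: no alternatives left");
                  System.out.println("elaborate with production " + p.name());
                  // in MIDAS the elaboration is recorded in the EX XML and GRAPH XML
              } else {
                  System.out.println("invoke the external tool bound to this atomic task");
              }
          }
          void rollback() {
              // the task is forced to fail; the next apply() explores another alternative
              System.out.println("rollback: forcing failure");
          }
      }

      public static void main(String[] args) {
          TaskNode item = new TaskNode(true,
                  List.of(new Production("make-in-house"), new Production("outsource")));
          item.apply();     // elaborates with "make-in-house"
          item.rollback();  // the chosen production failed
          item.apply();     // elaborates with "outsource"
      }
  }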
VisualCafe and the IBM XML parser were used to implement MIDAS. MIDAS was
applied to a gearbox manufacturing process and a pulley (a piece of mechanical
equipment) manufacturing process. The process grammar was shown to capture the
process knowledge naturally, since it facilitates the succinct representation of
alternatives and abstraction. Support for manual mode alongside auto mode let
engineers explore the D&M space more efficiently. The cockpit, being a Java applet,
enabled engineers to keep working regardless of their physical location while
visiting other companies for collaboration, speeding up the manufacturing process.
MIDAS showed reasonable performance even with many clients, thanks to its
distributed architecture.
7 Conclusions
Recently, system support for collaboration has become an interesting research field
in business, design, and manufacturing. As more manufacturing devices become
computer-controlled and network technologies advance, the demand for a system
that facilitates collaborative design and manufacturing is increasing rapidly.
In this paper, we presented a process management system for collaborative design
and manufacturing, called MIDAS. The core of MIDAS is the process grammar,
which provides the theoretical foundation to represent, manipulate and execute D&M
processes. The contribution of this paper is to propose a methodology for developing
the process management system that efficiently supports the process grammar. Core
technologies employed by the methodology are Java, a distributed architecture, and
XML.
The advantages of our system over previous ones are summarized as follows:
1. New process knowledge, external applications, and process engines (servers)
can be easily incorporated into the system, independently of their location. Since
the cockpit, the client-side component, is implemented as a Java applet, the
cockpit can be started from any location without the pre-installation of any
communication module.
2. MIDAS, with the process grammar in its engine, supports the iterative nature of
engineering processes and provides a layered approach to information modeling.
3. MIDAS supports a flexible work process. Tasks can be performed automati-
cally or manually, and alternative processes can be generated in mid-stream.
By supporting the search for alternative processes, rather than simply searching
for alternative design parameters, the system encourages a truly flexible
approach to design process management.
4. The use of XML facilitates the sharing of process knowledge by collaborative
organizations, system scalability, and the protection of proprietary process in-
formation from unauthorized access.
5. Our distributed task scheduling makes the architecture of our system more
modular. Little previous work has been published on distributed scheduling
in which rollback is used to explore the search space.
Our system can be viewed as an extension of the agent-based approach. Each task
behaves as an agent that carries out a specified task by selecting an appropriate tool
based on constraints, input data, and assigned resources. If the result of the task does
not meet the requirements, the agent must select another alternative or different
input data.
References
4. Dornfeld D., Wright P. K., Wang F-C., Sheng P., Stori J., Sundararajan V., Krishnan N.,
and Chu C-H, "Multi-Agent Process Planning for a Networked Machining Service,"
North American Manufacturing Research Conference, 1999.
5. Georgakopoulos, Dimitrios and Hornick, Mark, "An Overview of Workflow Manage-
ment: From Process Modeling to Workflow Automation Infrastructure”, Distributed and
Parallel Databases, 3, 119–153, 1995.
6. Ripeanu, M., Peer-to-Peer Architecture Case Study: Gnutella Network,
http://www.cs.uchicago.edu/files/tr_authentic/TR-2001-26.pdf
7. Shiau, J. Y., Ratchev, S., and Valtchanov, G., "Distributed Collaborative Design and
Manufacturability Assessment for Extended Enterprise in an XML-based System",
School of Mechanical, Materials, Manufacturing Engineering and Management, Univ.
of Nottingham, UK, Technical Report, 2002.
8. Lavana, H., Khetawat, A., Brglez, F., and Kozminski, K., "Executable Workflows: A
Paradigm for Collaborative Design on the Internet", Proceedings of the 34th ACM/IEEE
Design Automation Conference, June 1997.
9. Miller, J.A., Sheth, A.P., Kochut, K.J. and Wang, X., CORBA-Based Run-Time Archi-
tectures for Workflow Management Systems, Technical Report, Large Scale Distributed
Information Systems Lab, Dept of Computer Science, Univ. of Georgia.
10. Sadeh, N., Hildum, D., Laliberty, T., McANulty, J., Kjenstad, J., and Tseng, A., “A
Blackboard Architecture for Integrating Process Planning and Production Scheduling”,
Concurrent Engineering: Research and Applications, Vol 6, No 2, June 1998.
11. Papazoglou, M. P., Jeusfeld, M. A., Weigand, H., and Jarke, M., "Distributed, Interoper-
able Workflow Support for Electronic Commerce," http://infolabwww.kub.nl/infolab, 2002.
12. Riemer, K., ebXML Business Process,
http://www.gca.org/papers/xmleurope2001/papers/s18-1.html
13. RosettaNet, http://www.rosettanet.org
14. Routledge, N., Bird, L., and Goodchild, A., “UML and XML Schema”, Thirteenth Aus-
tralasian Database Conference (ADC2002), Melbourne, Australia.
15. Quin, Y., Process Modeling and Presentation for Collaborative Design and Manufacturing,
master thesis (draft), Department of Computer Science and Engineering, Michigan State
University, 2002.
16. Workflow Management Coalition, http://www.wfmc.org.
17. Whiteside, R. A., Pancerella, C. M., Klevgard, P. A., “A CORBA-Based Manufacturing
Environment”, Hawaii International Conf on System Sciences, Maui Hawaii, January 7-10,
1997.
Structured Handling of Online Interface Upgrades in
Integrating Dependable Systems of Systems
1 Introduction
A “System of Systems” (SoS) is built by interfacing to systems which might be under
the control of organisations totally separate from that commissioning the overall SoS.
(We will refer to the existing (separate) systems as "components", although this must
not obscure the question of their separate ownership). In this situation, it is unrealistic
to assume that all changes to the interfaces of such components will be notified. In
fact, in many interesting cases, the organisation responsible for the components may
not be aware of (all of) the systems using its component. One of the most challenging
problems faced by researchers and developers constructing dependable systems of
systems (DSoSs) is, therefore, dealing with on-line (or unanticipated) upgrades of
component systems in a way which does not interrupt the availability of the overall
SoS.
It is useful to contrast evolutionary (unanticipated) upgrades with the case where
changes are programmed (anticipated). In the spirit of other work on dependable
systems, the approach taken here is to catch as many changes as possible with
exception handling mechanisms.
Dependable systems of systems are made up of loosely coupled, autonomous
component systems whose owners may not be aware of the fact that their system is
involved in a bigger system. The components can change without giving any warning
(in some application areas, e.g. web services, this is a normal situation). The drivers
for on-line software upgrading are well known: correcting bugs, improving functional
and non-functional characteristics (e.g. improving performance by replacing an
algorithm with a faster one), adding new features, and reacting to changes in the
environment.
This paper focuses on evolutionary changes that are typical in complex web
applications which are built out of existing web services; we aim to propose a
generally applicable approach. As a concrete example, we consider an Internet Travel
Agency (TA) [Periorellis & Dobson 2001] case study (see Figure 1). The goal of the
case study is to build a travel service that allows a client to book whole journeys
without having to use multiple web services each of which only allows the client to
book some component of a trip (e.g. a hotel room, a car, a flight). To achieve this we
are developing fault tolerance techniques that can be used to build such emergent
services, which provide a service that none of their component systems is capable of
delivering individually. Of course, the multiplicity of airlines, hotel chains, etc.
provides redundancy, which makes it possible for a well-designed error-recovery
mechanism to survive temporary or permanent interruptions of connection, but the
interest here is in surviving unanticipated interface changes. As not all the systems in
our system of systems are owned by the same organisation, it is inevitable that they
will change during the lifetime of the system and there is no guarantee that existing
clients of those systems will be notified of the change.
When a component is upgraded without correct reconfiguration or upgrading of the
enclosing system, problems similar to those caused by faults occur, for example: loss
of money, TA service failures, deterioration of the quality of the TA service, and
misuse of component systems. Changes to components can occur at both the
structural and the semantic level. For example, changes to a component system can
result in a revision of the units in which parameters are measured (e.g. from Francs
to Euros), of the number of parameters expected by an operation (e.g. when an
airline introduces a new type of service), or of the sequence of information to be
exchanged between the TA and a component system (e.g. after an upgrade, a hotel
booking server requires a credit card number before the booking starts). In the
extreme, components might cease to exist and new components must be
accommodated.
Although some on-line upgrading schemes assume that interfaces of components
always stay unchanged (e.g. [Tai et al. 2002]), we believe that in many application
areas it is very likely that component interfaces will change and that this will happen
without information being sent to all the users/clients. This is the nature of the Internet
as well as the nature of many complex systems of systems in which components have
different owners and belong to different organizations as shown in Figure 1. In some
cases of course, there might be an internal notification of system changes but the
semantics of the notification system might not be externally understood.
Although there are several existing partial approaches to these problems, they are not
generally applicable in our context. For example, some solutions deal only with
programmed change where all possible ways of upgrading are hard-wired into the
design and information about upgrading is always passed between components. This
does not work in our context in which we deal with pre-existing component systems
but still want to be able to deal with interface upgrading in a safe and reasonable
fashion. Other approaches that attempt to deal with unanticipated or evolutionary
changes are similarly limited, as discussed in the related work section.
Fig. 1. UML Component diagram showing the component systems that make up the Internet
Travel Agency (TA). The grey areas indicate the fact that the component systems are under the
control of different organisations. A user is shown interacting with the Travel Agency Client
Side component that validates client side inputs and passes requests to the Travel Agency
Server Side component. The Travel Agent Server Side component handles each request by
invoking multiple Intermediate Interfacing Subsystems (IIS). Each IIS provides an abstract
service interface for a particular service type, for example the Flight Systems IIS provides an
abstract service interface for booking flights with systems such as AirNZ and KLM even
though each of these systems has different webservice interfaces.
Our overall aim is to propose structured multi-level mechanisms that assist developers
in protecting the integrated DSoSs from interface changes and, if possible, in letting
these DSoSs continue providing the required services.
2 System Model
Integrators compose a DSoS from existing components (systems) that are connected
by interfaces, glue code and additional (newly-developed) components where
necessary. An interface is a set of named operations that can be invoked by clients
[Szyperski 1997]. We assume that the integrators know the component interfaces.
Knowledge of the interfaces can be derived from several sources: interfaces can be
published or discovered (a number of new techniques are emerging in this area),
documented in programmers' guides, or obtained directly where interfaces are
first-class entities, as in interpreters, component technologies (CORBA, EJB), and
languages (Java).
Besides integrators there are other roles played by humans involved in the composed
system at runtime, for example: clients of the composed system, other clients of the
components, etc.
We assume that component upgrade is out of our control: components are upgraded
somehow (e.g. off-line) and if necessary their states are consistently transferred from
the old version to the new version.
3 The Framework
The framework is based on multilevel exception handling: during DSoS design or
integration, the developer identifies errors that can be detected
at each level and develops handlers for them; if handling is not possible at this level,
an exception is propagated to the higher level and responsibility for recovery is passed
to this level. In addition to this general scheme, the study of some examples suggests
classifications of changes which can be used as checklists.
In nearly all cases, there is a need for meta-information to detect interface changes.
Such meta-information is a non-functional description of the interfaces (and possibly
of their upgrades), which may capture both structural and semantic information. Some
languages and most middleware maintain structural meta-information, for example
Java allows structural introspection and CORBA supports interface discovery via
specialised repositories. However, at present there is little work on handling changes
to semantic meta-information.
Meta-information for a component includes descriptions of:
– call points (interfaces), including input parameters (types, allowable defaults),
output parameters (types, allowable defaults), pre- and post-conditions, and the
exceptions to be propagated;
– protocols: the sequences of calls to be executed to perform specific activities (e.g.
cancel a flight, rent a car). A high-level scripting language can be used for this.
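One possible shape for such structural meta-information is sketched below; the type
and field names are invented for illustration, and semantic information would be
attached as further descriptions.

  import java.util.List;

  public class MetaInformation {
      // a parameter with its type and an allowable default (null if none)
      record Parameter(String name, String type, String defaultValue) {}

      // a call point: a named operation, its parameters, conditions, and exceptions
      record CallPoint(String operation,
                       List<Parameter> inputs,
                       List<Parameter> outputs,
                       String precondition,
                       String postcondition,
                       List<String> propagatedExceptions) {}

      // the component description: call points plus protocols, i.e. the
      // ordered call sequences needed to perform specific activities
      record ComponentDescription(String component,
                                  List<CallPoint> callPoints,
                                  List<List<String>> protocols) {}
  }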
Interface changes can be detected either by comparing the meta-descriptions of the
old and new interfaces or, if a component supports one, through a mechanism that
notifies clients of changes. Another, less general and as such less reliable, way of
detecting such changes is to use general error detection features (reasonable run-time
type checking; pre- and post-conditions or other assertions checking parameters at
the call points; protective component wrappers, etc.).
The intention should be to associate a rich set of exceptions with structural and
semantic interface changes (changing the type of a parameter, new parameters,
additional call points, changing call points, changing protocols, etc.); this would
allow the system developers to handle them effectively. Further kinds of change
include:
– changes of the protocols: re-ordering, splitting, joining, adding, renaming, and the
removal of protocol events;
– change of the meta-description language itself (if components provide us with such
a meta-description of their interfaces);
– raising of new exceptions: if the protocol changes, then new exceptions may also
be raised during its execution.
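A minimal sketch of such an exception hierarchy, with hypothetical class names,
might look as follows; a real system would refine these further.

  // One exception class per kind of interface change, so that handlers can
  // be attached at the right level of granularity.
  public class InterfaceChangeException extends Exception {
      public InterfaceChangeException(String detail) { super(detail); }
  }

  class ParameterTypeChangedException extends InterfaceChangeException {
      ParameterTypeChangedException(String d) { super(d); }
  }
  class CallPointAddedException extends InterfaceChangeException {
      CallPointAddedException(String d) { super(d); }
  }
  class ProtocolChangedException extends InterfaceChangeException {
      ProtocolChangedException(String d) { super(d); }
  }
  class MetaLanguageChangedException extends InterfaceChangeException {
      MetaLanguageChangedException(String d) { super(d); }
  }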
To provide some motivational examples, consider the Travel Agent case study.
– A very simple interface change is one where the currency in which prices are
quoted changes. In this case, simple type information could show, for example, that
the TA system requires a price in Pounds Sterling while the car rental is being
quoted in Norwegian Crowns. An exception handler can ask for a price in Euros,
which might be countered with an offer to quote in Dollars. Note that this process is
not the same as reducing everything to a common unit (dollars?); finding agreement
early can result in real savings in conversions.
– A previously functioning communication from the TA system to a hotel
reservation system might raise an exception if a previously unencountered query
comes back asking whether the client wants a non-smoking room. Either of two
general strategies might help here: the query could come marked with a default
which will be applied if no response is given (an exception handler could accept this
option), or the coded value might (on request from the exception handler) be
translated into an ASCII string which can be passed to the client for interpretation.
– Some of the most interesting changes and incompatibilities are likely to be
protocol changes. An airline system might suddenly start putting its special offers
before any information dialogue can be performed; the order in which information is
exchanged between the TA and its suppliers of cars, flights, etc. might change.
Given enough meta-information it is, in principle, possible to resolve such changes,
but this is far more complex than laying out the order of fields in a record: it is the
actual order of query and response which can evolve.
– In the extreme, the chosen meta-language might change. Even here, a higher-level
exception handler might be able to recover if the meta-language is from a known
repertoire.
– When an airline ceases to respond (exist?), the TA system must cope with the
exception by offering a reduced service from the remaining airlines. Communication
with new systems might be established if there is some agreement on meta-languages
which can be handled.
In all of the above cases, the attempt is to use exception handling to keep the TA
system running. Of course, notification of such changes might well be sent to
developers; but the continuing function of the TA should not await their availability.
3.3.2 Multilevel Handling
Exceptions are propagated to a higher level if an exception is not explicitly handled or
an attempt to handle the exception fails. This leads to a recursive system structuring
with handlers being associated with different levels of a system. Possible handling
strategies are:
4 Representing Meaning
To interpret an invocation, a component must be able to recognize a request for a
particular method (e.g. booking a flight) and, furthermore, infer what parameters
accompany this method and what their meaning is.
This process requires a well-defined, globally shared description of the domain in
which the SoS operates. Such a definition is usually called an ontological definition,
and the process is referred to as ontological modeling. Current developments in web
architectures and distributed systems are working towards communicating meta-data
across component systems. XML, for example, allows us to define our own tags in
order to structure web pages, and it is also widely used for structuring SOAP
[W3C-SOAP] messages sent to component systems (web services). XML effectively
allows users to define their own tags, which are shared via a common document type
definition that enables both clients and servers to interpret them. To comprehend the
semantics behind them, however, we need human intervention. To put it simply: a
share price component that accepts a string and returns the price of the share
represented by the string may define a tag of type stockprice, so that within the tags
<stockprice> and </stockprice> the price of the share would be returned. Via XML
we can communicate between a provider and a consumer that a certain tag is of type
String or Integer, but that does not encapsulate its semantic information. This is
where the Resource Description Framework (RDF) [W3C-RDF 2000] may help, as
it provides a technique for describing resources on the web: it is a framework for
specifying metadata using XML syntax.
In conjunction with RDF, interface descriptions could also carry and communicate
their semantic information. All these technologies and disciplines, however, have not
yet been put together for any meaningful application; they exist separately, much as
hypertext and TCP/IP existed before the Internet. Using the current APIs, our
choices are limited. The next section presents a general framework for dealing with
interface upgrading and discusses the range of possible implementation technologies.
5 Interface Upgrading
In this section we provide a framework for implementing interface upgrading. The
problem addressed by this framework is how to detect if the interface offered by a
server has changed and how to dynamically load a new interface and continue
interacting with the server.
We assume that in our distributed system there are a client, a server, and a registry.
The registry maintains a reference to the server and a description of its interface. Note that
the description of the interface may contain both structural and semantic information
and we do not specify how this is represented.
When a client is implemented, we assume that a copy of the server's interface is
stored with it. This seems reasonable, as the client must make use of the interface in
order to implement its program. We also assume that someone ensures that the
reference to the server and the description of the interface held at the registry are up
to date. This may be the responsibility of the server provider, the clients themselves,
or a third party. We also assume that the registry may be a separate entity or even
collocated with the server; what is important is its function rather than its location.
When a client interacts with a server, we assume that it goes through the following
steps. The client contacts the registry and uses a symbolic name to look up both a
reference to the server and a description of its interface. The client compares the
description it retrieved from the registry to the cached description; if there is a
difference, then an exception is raised locally and handled locally. Assuming that
there is no difference, the client invokes the operation on the server using the
interface and the server's location. We assume that there may be a series of
interactions, and that this series of interactions may represent a long-lived task. The
long-lived nature of the task raises the possibility that the server may change its
interface while the task is active. In this case, we assume that a not-understood
exception is raised at the server, due to it no longer being able to understand the
client's invocation, or a not-understood exception is raised at the client, due to the
client no longer being able to understand the server's reply. This exception is
handled locally at the client.
When the client or the server raises a not-understood exception, the following steps
take place. The client contacts the registry again and looks up the server's interface.
The client then compares the (hopefully updated) interface with its cached interface
and raises an interface change exception.
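In code, the steps above might look roughly as follows; the Registry and ServerRef
types, the service name, and the description format are hypothetical placeholders for
whatever middleware is actually used, re-declared locally for self-containment.

  import java.util.Objects;

  public class ClientSteps {
      interface Registry {
          ServerRef lookup(String name);
          String describeInterface(String name);  // structural + semantic description
      }
      interface ServerRef {
          Object invoke(String operation, Object... args) throws NotUnderstoodException;
      }
      static class NotUnderstoodException extends Exception {}
      static class InterfaceChangeException extends Exception {}

      private String cachedDescription;
      ClientSteps(String cachedDescription) { this.cachedDescription = cachedDescription; }

      Object bookFlight(Registry registry) throws Exception {
          ServerRef server = registry.lookup("FlightService");
          String current = registry.describeInterface("FlightService");
          if (!Objects.equals(cachedDescription, current)) {
              throw new InterfaceChangeException();  // change detected before invoking
          }
          try {
              return server.invoke("bookFlight", "LUX", "AMS");
          } catch (NotUnderstoodException e) {
              // the interface changed during a long-lived task: re-fetch and raise
              cachedDescription = registry.describeInterface("FlightService");
              throw new InterfaceChangeException();
          }
      }
  }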
Associated with each interface change exception there is an appropriate handler, as
described in Section 3.3.2. We assume that some handlers will wish to keep
interacting with the server, but using the new interface, and therefore we require
some means to take the new interface and dynamically invoke operations using it.
The degree to which this is possible is directly related to the way that client-server
interaction is implemented in the target middleware.
The majority of approaches to implementing client-server interactions can be
classified along two dimensions: whether the client-server protocol sent across the
wire is fixed or not, and whether the middleware tries to make distribution
transparent or not. The following are examples classified along these two
dimensions:
– Fixed protocol for communication between clients and servers; the middleware
provides an interface for dynamically composing invocations using the protocol (for
example, CORBA's Dynamic Invocation Interface, which allows the client to
dynamically create invocations using the fixed protocol).
– Potentially server-specific protocols for communication between clients and
servers, but a fixed protocol description language for describing them; the
middleware provides an interface for discovering the server-specific protocol and
dynamically composing invocations using it (for example, SOAP [W3C-SOAP],
which allows the client to dynamically create invocations using server-specific
protocols expressed in a fixed protocol description language).
– Fixed protocol for communication between clients and servers; the middleware
hides the protocol and allows communication with servers as if they were local
objects (for example, Java RMI [JAVA RMI], which provides a local proxy that
implements the fixed client-server protocol).
– Potentially server-specific protocols for communication between clients and
servers; the middleware hides the protocol and allows communication with the
servers as if they were local objects (for example, Jini [Waldo 2001] with its
"code-by-wire" approach, where proxy objects that encapsulate the server-specific
protocol are downloaded dynamically from a type of registry).
Note that some middleware uses a combination of approaches. For example, the
Java webservices toolkit [JAVA-WEBSERVICES] allows the complexity of SOAP
to be hidden using proxies, but also allows dynamic invocation if required. There are
also extensions which complicate the picture. For example, Java RMI now provides
the ability to download proxies as required; these can be invoked dynamically using
the standard Java reflection features, since they are just local objects. Although this
makes Java RMI appear like Jini, it still makes use of a fixed protocol for
communication between clients and servers.
Implementing our framework using middleware that provides transparent
distribution can only be done if the middleware exposes aspects of how the
transparent distribution is achieved. Usually, transparent distribution is implemented
through the use of a local stub object that the client uses to communicate with a
remote server as if it were a local object. Should the server's interface change, there
must be a facility for updating the local stub object used by the client. This requires
that the local stub object can be reloaded dynamically. In addition, since the stub
object has changed, there must be some way to dynamically invoke its operations, as
there may be new operations, or the signatures of the operations may have changed
from what existed when the client program was compiled. If any of these features is
not provided by the middleware, then our framework cannot be fully implemented.
Our framework is most easily implemented using middleware that supports dynamic
invocation of server operations. Such middleware is harder to use, because each
invocation has to be explicitly constructed, but there are none of the problems of
updating and reloading proxies, nor any need for a feature allowing dynamic
invocation of local objects.
In the next section we discuss our experiences with the Java RMI implementation,
which highlight some of the problems faced when using a middleware that only
supports transparent distribution.
This section discusses how some parts of the general framework presented above are
being applied within our ongoing experimental work on implementing the Internet
TA [Periorellis & Dobson 2001]. Current APIs allow us to carry out some work
towards dealing with online dynamic upgrades, although significant work remains,
not just in programming terms at the application level, but in terms of providing an
adequate API that would allow us to overcome certain technical difficulties.
Java RMI does not offer a full API for dynamic interfaces. However, it does support
dynamic invocation when used in conjunction with the standard Java reflection API.
The client does not need to maintain a local copy of a stub for a remote service, and
neither does it need to maintain a local copy of the interface for the remote service.
This is because Java RMI supports the automatic downloading of RMI stubs on
demand, and once the stub has been downloaded, the standard Java reflection API
can be used to discover and invoke the methods supported by the stub, and therefore
by the remote service. The limitation of this approach is that if the stub changes
during the lifetime of the client, a replacement stub cannot be downloaded. This is
due to caching at the client side: as the replacement stub has the same name as the
original stub, the cached copy is used instead of downloading the stub again.
The TA prototype uses Java RMI and the standard Java reflection API to
dynamically compose the emerging service out of participating components. As the
stubs can be downloaded and the interfaces of the stubs discovered at runtime, the
SoS can determine the composition of the emerging service at runtime. In order to
implement such a structure we need four machines: one to act as an RMI server that
accepts requests for component systems (e.g. playing the role of a KLM server), a
client (an IIS in our case, see Figure 1), a stub repository that makes the stubs
available via the network (this could be a web server or an anonymous FTP server),
and a machine that hosts the RMI registry. In our implementation we maintain the
stubs at the web server, while the RMI server holds the actual implementations of
the component systems, the supporting classes, and the interface description.
Each IIS only holds the names of the SoS component systems that it wraps. Each
name is a human-readable, implementation-independent reference that is registered
with the RMI registry. This allows the location of SoS components to change without
forcing changes to the implementation of the IIS. When the IIS invokes a SoS
component service, it queries the RMI registry for the stub that represents the SoS
component service. The stub is transparently downloaded from the stub repository by
the RMI infrastructure, as the stub does not exist locally. The IIS then uses the Java
reflection API to discover and invoke methods on the stub and, via RMI, on the SoS
component system.
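The lookup-and-reflect pattern used by the IIS might look roughly as follows; the
service name KLMService and the method bookFlight are hypothetical, and the
codebase and security-manager configuration needed for stub downloading are
omitted.

  import java.lang.reflect.Method;
  import java.rmi.Naming;

  public class IisInvocationSketch {
      public static void main(String[] args) throws Exception {
          // the stub class is fetched transparently from the stub repository
          Object stub = Naming.lookup("rmi://registry-host/KLMService");
          // discover the operations offered by the stub at runtime
          for (Method m : stub.getClass().getMethods()) {
              System.out.println("offers: " + m.getName());
          }
          // invoke one of them without compile-time knowledge of the stub class
          Method book = stub.getClass().getMethod("bookFlight", String.class, String.class);
          Object result = book.invoke(stub, "LUX", "AMS");
          System.out.println("result: " + result);
      }
  }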
As each IIS provides a fixed abstract interface to the TA SoS, any changes to SoS
component systems are localised to the IIS. The TA SoS and, via the TA SoS, any
clients may be informed of unexpected changes to the component systems if extra
information that is not captured by the abstract interface is required in order to
complete a request. We foresee this being handled via our distributed exception
handling scheme.
We already have an initial prototype, which does not yet deal with server upgrading;
it can be accessed at http://ouston.ncl.ac.uk/main.htm. There are several avenues we
are exploring right now that would allow some handling of online dynamic upgrades
to SoS component systems. Although changes to SoS component system interfaces
that take place during the lifetime of an IIS are not visible via changes to the stub,
we can detect that some change has occurred by catching marshalling/unmarshalling
or connection-refused exceptions that will be caused by an upgrade. At present the
best course of action that we can suggest is to restart the IIS and thereby force the
local copy of the stub for the SoS component system to be refreshed. Once it has
been refreshed, we can compare the interface of the new stub with a cached
description of the old stub; this allows the exact nature of the change to be detected
and the appropriate handlers to be invoked. In this approach the actual stubs
represent the meta-information used for handling interface upgrades. Assuming that
we can find some technical solution to the caching problem, it would be possible to
avoid restarting the IIS, and the handling of the effect of the upgrade would
therefore be more transparent.
Under some assumptions (e.g. the registry is updated before the server has been
replaced with a new one), several scenarios are possible with respect to handling
interface changes. For example:
– if a marshalling/unmarshalling exception is raised while accessing a KLM server,
we force the refresh of the local stub for the KLM server and compare its interface
with a cached description of the KLM server in order to discover what has changed;
– if a connection-refused exception is raised, we can find out whether we are trying
to access the server in the middle of an upgrade by going to the registry. This case
clearly needs additional features, because there is no guarantee that KLM updates
the registry and the server atomically.
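On the detection side, these scenarios reduce to catching the relevant RMI
exceptions; in the sketch below the handler bodies are placeholders for the
refresh-and-compare steps just described.

  import java.rmi.ConnectException;
  import java.rmi.MarshalException;
  import java.rmi.UnmarshalException;

  public class UpgradeDetection {
      interface RemoteCall { Object run() throws Exception; }

      static Object call(RemoteCall invocation) throws Exception {
          try {
              return invocation.run();  // e.g. a reflective RMI invocation
          } catch (MarshalException | UnmarshalException e) {
              // likely an interface change: refresh the stub and compare its
              // interface with the cached description of the old one
              throw e;
          } catch (ConnectException e) {
              // possibly a server mid-upgrade: consult the registry before retrying
              throw e;
          }
      }
  }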
Our experience shows that Java, and the RMI architecture in particular, are not the
most appropriate technologies for evaluating and implementing dynamic interface
updates, even though additional features such as the Java reflection API can be used
to implement a limited form of dynamic interface discovery and remote invocation.
In particular, they do not allow us to call an updated service as a means of handling
the upgrade, because of the local caching of the stubs. By catching some RMI
service exceptions we can infer that a service upgrade has occurred, and this can
drive a manual clearing of the cache via a restart of the RMI client. Alternatively,
such a situation can be handled by excluding the upgraded service from subsequent
execution until the client logs off.
There are two directions in which we can progress from here. The first is to see if
we can modify the Java/RMI infrastructure to force a local refresh of the stub cache.
The second is to use modern Web technologies, which offer much more flexible
features for dealing with interface descriptions on-line and provide dynamic
discovery and invocation as first-class features.
7 Related Work
The distributed computing community has considered the problems of maintaining
meta-information for service discovery within the context of loosely coupled
distributed systems such as DSoSs. Most middleware systems implement some form
of object trading service, for example CORBA has an Object Trader Service, Jini has
a Lookup Service, and .NET uses services provided by the Universal Description,
Discovery and Integration (UDDI) project. Furthermore, recent developments
supported by the World Wide Web Consortium (W3C)2 include a number of
XML-based languages complementing UDDI and allowing Web service interfaces
[W3C-WSDL 2001] and business-level conversations supported by such services
(e.g. [W3C-WSCL 2002]) to be described. Object traders enable providers to
advertise their services and clients to discover them at runtime.
2 http://www.w3.org/
8 Concluding Remarks
This paper has not proposed a totally general or efficient solution; our interest is in
providing a pragmatic approach that explicitly uses a fault tolerance framework. Our
work is motivated by real problems encountered when considering a case study where
mismatches due to evolution must be dealt with at runtime. Although there are some
existing approaches to this problem, we do not try to hide evolution from the
application developer, but rather provide a framework for dealing with it
dynamically.
References
[Amann et al. 2000] B. Amann, I. Fundulaki, M. Scholl. Integrating ontologies and thesauri for
RDF schema creation and metadata querying. International Journal of Digital Libraries, 3, 3,
pp. 221–236, 2000.
[Cristian 1995] F. Cristian. Exception Handling and Tolerance of Software Faults. In Lyu,
M.R. (ed.): Software Fault Tolerance. Wiley, pp. 81–107, 1995.
[Hruska & Hashimoto 2000] T. Hruska and H. Hashimoto (eds), Knowledge Based Software
Engineering, IOS Press, June 2000.
[JAVA RMI] Java™ Remote Method Invocation (RMI), technical documentation, Sun
Microsystems, http://java.sun.com/j2se/1.4/docs/guide/rmi/, last accessed October 2003.
[JAVA-WEBSERVICES] Sun Microsystems, "Web Services Made Easier", technical
white paper, June 2002 revision 3, http://java.sun.com/xml/webservices.pdf
[Laprie 1995] J.-C. Laprie. Dependable Computing: Concepts, Limits, Challenges. Proc. of the
25th Int. Symposium On Fault-Tolerant Computing. IEEE CS. Pasadena, CA. pp. 42–54.
1995.
[Periorellis & Dobson 2001] P. Periorellis, J.E. Dobson. Case Study Problem Analysis. The
Travel Agency Problem. Technical Deliverable. Dependable Systems of Systems Project
(IST-1999-11585). University of Newcastle upon Tyne. UK. 37 p. 2001.
[Romanovsky & Smith 2002] A. Romanovsky, I. Smith. Dependable On-line Upgrading of
Distributed Systems. In Proc. of COMPSAC 2002. 26-29 August 2002, Oxford, UK. IEEE
CS Press. pp. 975–976. 2002.
[Szyperski 1997] C. Szyperski. Component Software. ACM Press. 1997.
[Tai et al. 2002] A.T. Tai, K.S. Tso, L. Alkalai, S.N. Chau, W.H. Sanders. Low-Cost Error
Containment and Recovery for Onboard Guarded Software Upgrading and Beyond. IEEE
TC-51, 2, pp. 121–137. 2002.
[Vanderperren 2002] W. Vanderperren. A Pattern Based Approach to Separate Tangled
Concerns in Component Based Development. Proc. of the 1st AOSD Workshop on Aspects,
Components, and Patterns for Infrastructure Software, held in conjunction with the 1st
International Conference on Aspect-Oriented Software Development. pp. 71–75. 2002.
[Waldo 2001] J. Waldo. The End of Protocols. Last accessed October 2003.
http://developer.java.sun.com/developer/technicalArticles/jini/protocols.html
[Welch 2002] I. Welch. A Reflective Security Architecture for Applications. PhD Thesis.
Department of Computing, University of Newcastle upon Tyne (in preparation).
[W3C-RDF 2000] W3C. Resource Description Framework (RDF). RDF Specification
Development. 2000. http://www.w3.org/RDF/
[W3C-SOAP] Simple Object Access Protocol (SOAP) 1.1. W3C Note May 2000.
http://www.w3.org/TR/SOAP/.
[W3C-WSCL 2002] W3C. Web services conversation language (WSCL), version 1.0. The
World Wide Web Consortium, 2002. http://www.w3.org/TR/wscl10/
[W3C-WSDL 2001] W3C. Web services description language (WSDL), version 1.1. The
World Wide Web Consortium, 2001. http://www.w3.org/TR/wsdl
An Experience in Architectural Extensions:
Active Objects in J2EE
1 Introduction
The whole formalization and validation was carried out by using the Charmy
[4] environment. In Charmy the architectural formalization makes use of dia-
grammatic notations, Scenarios and State Diagrams, and SPIN [14] is the target
model checking engine. As a result of this work we realized that the approach
we followed could actually be generalized and used to support evolutions (ex-
tensions/refinements) in software architecture. We have also applied the whole
approach to EJB2.0 by proposing a new extension which uses its new features.
This represented a way to validate the approach and a way to compare the two
extensions. Summarizing, the contributions of the paper are the proposal of a
J2EE architectural extension for Active Objects and an approach (together with
a framework) to rigorously support architectural extensions. Our aim is to show
that practical architectural extension problems can be rigorously treated with
an acceptable degree of formality and state-of-the-art tools. We agree that formal
models may be successfully used for analysis and model checking. However, our
approach takes a more liberal view, in order to extend the support for rigorous
reasoning beyond the classical bounds of strict formal reasoning on a model or on
pairs of models. First, we require neither the use of a formal architectural language
nor completely specified architectural descriptions. We try to exploit as much as
possible the architectural artifacts, both static and dynamic, that exist in a standard
development process. Second, we want to address evolution, that is, we want to
support extensions, and possibly refinements, and thus consistency among different
model abstractions. Third, we want to show that rigorous reasoning can be
supported on suitable portions of the specifications.
We show how it is possible, in practice, to focus only on the properties and the
specification portions relevant to the problem under consideration.
The paper is structured as follows. The next section outlines the approach
to architectural extensions we followed. In Section 3 the relevant features of the
Charmy environment are briefly summarized. Section 4 introduces J2EE, EJB1.1
and AO, and their formalization. Section 5 presents the architectural extensions
to obtain AO in EJB1.1 and discusses their validation. In Section 6 EJB2.0 is
considered and a different architectural extension is described and validated.
Conclusions, related and future work are reported in Section 7.
The problem we address may be outlined as follows (Figure 1). Given a system, we
describe its relevant properties using natural language, static diagrams (for static
properties), and scenarios (for dynamic properties, as in other approaches [12,9]).
Only relevant properties are considered. This corresponds to level 1 in Figure 1.
Starting from this high-level specification we can then build a possible SA model
encompassing both static (components and connectors) and dynamic (state-based
models) information. This is represented at the second level of Figure 1. The arrows
linking the second level back to the first represent conformance checking between
the SA specification and its reference architecture. We will use our framework to
validate the SA conformance.
(Fig. 1: the two-level scheme. At level 1, the EJB and AO specifications with their
properties formalization; at level 2, the SA models they drive; conformance checking
links each SA model back to its specification.)
(Fig. 2: state-based formalizations of the EJB properties, parts a) and b), with events
such as invoke, return, avail, and exception.)
(Fig. 3: the EJB1.1 architecture. A Client interacting via RMI with the Remote
Object (RO) and Home Object (HO) of the EJB Container, which manages the Bean
Class and EJB Instance; the state-based behavior of the components is also shown.)
Active Objects are commonly defined as objects having their own "thread of
control". This definition is too abstract and vague for our purposes. Thus we refined
it in order to identify a well-defined class of objects characterized by the abstract
property of having a thread of control. Our analysis resulted in the following
informal description of an AO; it is the starting point for identifying and formalizing
the high-level specification corresponding to the first level in Figure 1. In particular,
an AO must be able to run concurrently with the other entities running in the system
and, above all, with the client which requested its services; it must have a
well-defined interface that is exposed to other components and shows the
functionalities or services implementing the AO behavior; and it may have a
per-client state that is held during the computations.
From these properties we can argue that, in order to be an AO, a component must
support asynchronous method invocation. Moreover, since Active Objects are a
special class of components, we should consider that exported methods may take
parameters as input values and may return results. The second aspect is especially
relevant, since it introduces the issue of result handling. Typically, a client
asynchronously invokes a method, receives the control back, and then runs in
parallel with the called object. When the results have been computed by the called
object, the client must be able to retrieve them. This means that an AO model must
provide a way to manage this situation.
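In Java terms, this behavior can be pictured with the sketch below, where
java.util.concurrent stands in for the AO machinery formalized in the remainder of
the section; the invoke, checkres, and avail events of the following properties map
onto submit, isDone, and get.

  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;
  import java.util.concurrent.Future;

  public class ActiveObjectSketch {
      private final ExecutorService scheduler = Executors.newSingleThreadExecutor();

      // the exposed service: the client gets control back immediately
      public Future<Integer> service(int input) {
          return scheduler.submit(() -> input * 2);  // runs concurrently with the client
      }

      public static void main(String[] args) throws Exception {
          ActiveObjectSketch ao = new ActiveObjectSketch();
          Future<Integer> result = ao.service(21);   // asynchronous invocation
          // the client keeps working, polling for availability (checkres)
          while (!result.isDone()) {
              Thread.onSpinWait();                   // notavail: keep computing
          }
          System.out.println("avail: " + result.get());
          ao.scheduler.shutdown();
      }
  }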
AO1. An AO must be able to run concurrently with the client which requested its
services. This property is needed to achieve asynchronous method invocation. The
scenario pictured in Figure 4.a) shows the parallel computation between a Client
and an Active Object.
(Fig. 4: scenarios between a Client and an Active Object. After invoke and return,
the Client issues checkres and receives notavail in a) and avail in b).)
(Fig. 5: state diagrams, with states S0 to S6, formalizing the behavior in terms of the
invoke, return, checkres, avail, and notavail events.)
(Fig. 6: the AO architecture and the state-based behavior of its components: a)
Proxy, b) Queue, c) Scheduler, d) Future, together with the Serializer and the
Passive Object (PO).)
The Scheduler dequeues the invocations, executes them, and stores the results into
the "Future". This component can be queried for results availability. This model has
been checked, proving its conformance to the identified high-level properties. This
makes us confident that this architecture is a valid implementation of the AO
higher-level architecture defined in Section 4.2.
We start from the AO architecture (Figure 6). In this architecture there is a Passive
Object component. Since EJB1.1 components expose the same properties as a
Passive Object [11], we replace the PO component in the AO architecture with the
EJB1.1 component. Figure 7 shows the integrated architecture topology, which will
be denoted as EJB1.1+AO. From the dynamic point of view, its components behave
similarly to the components in the EJB1.1 (Figure 3) and AO (Figure 6)
architectures. The new model differs only in that the Client sends the invocation to
the Proxy (instead of the Remote Object) and the EJB component is used directly by
the Scheduler, instead of by the Client. Moreover, new behaviors may appear: the
Scheduler in this new architecture also has to manage additional behaviors due to
the EJB integration.
(Fig. 7: the EJB1.1+AO integrated architecture. The Client invokes the AO
extension (Proxy, Queue, Scheduler, Future), which drives the EJB component.)
Running SPIN, the four scenarios formalized in Section 4 were proved to be
correctly implemented by our EJB1.1+AO model. For implementation reasons, we
preferred to check the negation of some scenarios. In particular, we asked SPIN to
check whether scenario AO1 is never verified. The output is "claim violated", which
means that scenario AO1 is possible. In the next section, we reconsider the
extension problem in the EJB2.0 context.
The current EJB specification (EJB2.0) [1] has introduced many new features, including a way to perform asynchronous method invocation. To accomplish this task a new component type has been added: the Message-Driven Bean (MDB). MDBs act as message queue listeners implemented through the Java Message Service (JMS). MDBs have neither a Remote Object nor a Home Object. The only way for a client to interact with an MDB is to send a message to the JMS (Figure 8). MDBs behave like Stateless Session Beans. The integration between the container and the JMS enables the notification to the MDB of the arrival of a message on the queue the MDB is listening on. A special method "onMessage" is called, and the message sent by the client is passed as a parameter. This solution enables the asynchronous invocation of a service implemented using an MDB. Once a client has sent a message, it can continue its activity independently of the notification process and of the execution of the service implemented in the "onMessage" method.
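For concreteness, a Message-Driven Bean under the EJB2.0 and JMS APIs might be sketched as follows; the bean name ServiceMDB and the use of an ObjectMessage payload are our assumptions, not details taken from the paper.

import javax.ejb.EJBException;
import javax.ejb.MessageDrivenBean;
import javax.ejb.MessageDrivenContext;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.ObjectMessage;

public class ServiceMDB implements MessageDrivenBean, MessageListener {

    private MessageDrivenContext context;

    public void setMessageDrivenContext(MessageDrivenContext ctx) {
        this.context = ctx;
    }

    public void ejbCreate() { }

    public void ejbRemove() { }

    // Called by the container when a message arrives on the queue the MDB
    // is listening on; at this point the client has already regained control.
    public void onMessage(Message message) {
        try {
            Object parameters = ((ObjectMessage) message).getObject();
            // ... decode the parameters and execute the service ...
        } catch (Exception e) {
            // results and exceptions cannot flow back to the caller directly
            throw new EJBException(e);
        }
    }
}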
Intuitively, we could argue that MDBs may provide a feasible solution to the asynchronous method invocation problem. However, MDB components present some limitations which violate the Active Object properties: statelessness (components do not have a per-client state), lack of transparency (MDB services are invoked in a different way with respect to EJB components), and communication problems (i.e. there is no clear way to handle parameter passing and results recovery). Due to the enhanced EJB2.0 features, we will try to directly map the AO architecture components to the EJB2.0 architecture components: the JMS behaves as a Queue; the Scheduler functionalities are shared between the container-JMS integration (message retrieval) and the MDB component (actual service invocation); the Passive Object will be an EJB component; the Proxy is implemented by stateless EJB components which expose the same interface as the Passive Object and take care of encoding the invocation into a message which is delivered to the JMS; the Future is replaced by entity bean components which store the results for later retrieval. The
EJB2.0+AO architecture is shown in Figure 8. This extended architecture meets the Active Object properties; however, checking its correctness with respect to the
EJB properties, SPIN found an error trail: when a client makes two different
asynchronous invocations (to the same component) and the second one is run
before the ending of the first one, an exception is raised. The SPIN error trail makes evident that this unexpected behavior is due to the Container, which runs MDBs in parallel, leading to concurrent component invocations that violate property EJB1 in Section 4.1 (Figure 2.a). This result suggests that even in the enhanced EJB2.0 context the first solution we proposed still remains preferable.
[Figure 8. Clients interacting with the EJB Container in the EJB2.0+AO architecture]
7 Conclusions
We have presented the result of an experiment on architectural extension. Two results have been achieved: the description of an extension which is a correct realization of its specification, and the definition of an approach, with a supporting framework, for carrying out architectural extensions in a correct way. The approach rigorously defines what we mean by architectural specifications, at different levels of abstraction, and what we mean by correct realization. Moreover, it shows that the approach can be supported by tools. We have experimented with our ideas in the context of the Charmy environment, but the approach we propose is clearly independent of the specification formalisms used and of the checking techniques. Future work will go in two directions. We are trying to experiment with different kinds of high-level properties or requirements; typically we would like to address non-functional requirements such as security. On the framework side, we want to make the whole framework more easily integrable with standard development environments, notably UML-based ones. Since the paper addresses several topics, much work may be related to it: architectural refinement and architecture evolution on the SA side; model checking and view consistency on the conformance-checking side. Other papers have been proposed to formally specify and analyze the EJB architecture. Since we do not have space here to discuss these related works, we direct interested readers to [3].
References
1. L. G. DeMichiel, L. Ümit Yalçinalp, and S. Krishna. Enterprise JavaBeans™ Specification, Version 2.0. On-line at <http://java.sun.com/products/ejb/docs.html>, 2001.
2. G. J. Holzmann. Design and Validation of Computer Protocols. Prentice Hall, 1991.
3. P. Inverardi, F. Mancinelli, H. Muccini, and P. Pelliccione. An Experience in Architectural Extensions: Active Objects in J2EE. Technical Report, University of L'Aquila, 2002. On-line at <http://www.henrymuccini.com/publications.htm>.
4. P. Inverardi, H. Muccini, and P. Pelliccione. Automated Check of Architectural Models Consistency Using SPIN. In Proceedings of the Automated Software Engineering Conference (ASE 2001), San Diego, California, November 2001.
5. P. Inverardi, H. Muccini, and P. Pelliccione. Checking Consistency Between Architectural Models Using SPIN. TR 02/01, University of L'Aquila, 2001. On-line at <http://www.henrymuccini.com/publications.htm>.
6. P. Inverardi, H. Muccini, and P. Pelliccione. Checking Consistency Between Architectural Models Using SPIN. In Proceedings of the First Int. Workshop From Software Requirements to Architectures (STRAW'01), 2001.
7. T. Jenkinson. The Active Object Design Paradigm. On-line at <http://www.cs.ncl.ac.uk/people/t.j.jenkinson/home.formal/activeobjects.html>.
8. N. Kassem and the Enterprise Team. Designing Enterprise Applications with the Java™ 2 Platform, Enterprise Edition. On-line at <http://java.sun.com/j2ee/download.html>, 2000.
9. R. Kazman, L. Bass, G. Abowd, and M. Webb. SAAM: A Method for Analyzing the Properties of Software Architectures. In Proceedings of ICSE 16, Sorrento, Italy, May 1994, 81-90.
10. R. G. Lavender and D. C. Schmidt. Active Object - An Object Behavioral Pattern for Concurrent Programming. In Proceedings of the Second Pattern Languages of Programming Conference, Monticello, Illinois, 1995.
11. V. Matena and M. Hapner. Enterprise JavaBeans™ Specification, v1.1. On-line at <http://java.sun.com/products/ejb/docs.html>, 1999.
12. J. Ryser and M. Glinz. A Practical Approach to Validating and Testing Software Systems Using Scenarios. In QWE'99: Third International Software Quality Week Europe, Brussels, November 1999.
13. B. Shannon. Java 2 Platform Enterprise Edition Specification, v1.3. On-line at <http://java.sun.com/j2ee/>, 2001.
14. SPIN. Home page on-line at <http://cm.bell-labs.com/cm/cs/what/spin/index.html>.
Generating Pattern-Based Web Tutorials for Java
Frameworks
1 Introduction
2 Patterns as Interfaces
well understood, but we argue that the conventional service-oriented interface concept
can support this only in a limited way.
UML has introduced the notion of a collaboration as a "society of roles and other
elements that work together to provide some co-operative behavior" [2]. We make use
of a similar collaborative interface concept that we call a pattern interface, or simply a
pattern. A pattern is a collection of roles for various language structures (classes,
methods, attributes etc.), and a set of constraints on the roles. Each role is a slot that
can be bound to an actual instance of a program element in a system: a class role is
bound to a particular class, a method role is bound to a particular method, an attribute
role is bound to a particular attribute etc. The constraints can specify the cardinality of
the source structures bound to a role, or the relationships between the source structures
bound to different roles. Similar notions of a pattern have been used and formalized
elsewhere, too (e.g., see [15]). The structural relationships of classical design patterns
[5] can be presented using such a pattern concept, but note that our pattern concept
does not imply any particular run-time behavior for the source structures.
As an example of a pattern, consider a framework containing a base class that is as-
sumed to be subclassed by the application. In the subclass, a particular method of the
base class is assumed to be overridden. Further, the subclass is assumed to be instanti-
ated in the "main" method of the root class of the application, taking care of the ini-
tialization of the application. These are closely related tasks that have to be reflected
in the specification of the specialization interface of the framework. We can specify
this part of the interface as a pattern consisting of the following roles: base class,
method (in base class), derived class, method (in derived class), root class, method (in
root class), and creation code (in method of root class). These roles have several con-
straints concerning the actual source elements bound to the roles. For example, certain
source elements are required to be located within other elements, the derived class is
required to inherit the base class, the method in the derived class is required to over-
ride the method in the base class etc.
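Purely as an illustration, the following sketch shows program elements that could play these roles; all names are invented for the example and are not tied to any particular framework.

// Framework side: the base class whose method is meant to be overridden.
abstract class FrameworkBase {                   // role: base class
    public abstract void handleEvent();          // role: method in base class
}

// Application side: the derived class overriding the base-class method.
class AppHandler extends FrameworkBase {         // role: derived class
    public void handleEvent() {                  // role: method in derived class
        System.out.println("application-specific behavior");
    }
}

// Application side: the root class whose "main" instantiates the subclass.
public class AppRoot {                           // role: root class
    public static void main(String[] args) {     // role: method in root class
        FrameworkBase handler = new AppHandler(); // role: creation code
        handler.handleEvent();
    }
}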
In the context of frameworks, the strength of the pattern concept is that it allows
various levels of binding. At the most abstract level, none of the roles in the pattern is
bound to a concrete program element. This kind of pattern represents an abstract ar-
chitectural solution without references to actual systems. For example, an abstract
pattern might specify the idea of extending a framework by overriding a method in
an application-specific subclass, without referring to any particular framework. At the
next level, a certain subset of the roles can be bound to concrete program elements.
Such a pattern (instance) represents an interface a software system offers for other
systems to hook up with it. In the case of a framework, a partially bound pattern speci-
fies part of the specialization interface of the framework, fixing the roles that the ele-
ments of the framework play in the specialization but leaving the application-specific
roles open. Finally, in some pattern (instance) all the roles of the pattern may be bound
to actual program elements. In this case the pattern specifies either a fully imple-
mented specialization, or an architectural solution occurring completely within the
framework. For a more detailed description of a pattern, see [7].
We have investigated several real frameworks when extracting their specialization
interface as patterns [6, 9, 17]. Even in middle-sized frameworks consisting of several hundred classes, the number of patterns appears to be relatively small, typically 10-
20. This is due to the fact that usually a pattern represents a major feature required for
the application, and typically a framework is prepared for a fairly limited number of
such features: it is the variance within the features that gives the frameworks their
expressive power rather than the number of features. On the other hand, those patterns
are fairly sizable, consisting of tens of roles.
3 Fred Environment
Fred (FRamework EDitor for Java) [6, 7] is a tool for generating an architecture-
driven software development environment for a given framework, based on a specifi-
cation of the framework’s specialization interface as patterns. Fred supports the
framework specialization process by guiding the application developer through a task
list based on these pattern definitions. Fred keeps track of the progress of the tasks,
verifying that the requirements of the framework architecture are followed as required
in the patterns. Roughly speaking, Fred generates a task for any role binding that can
be created at that point, given the bindings made so far.
Fred is unique in that the accompanying textual documentation is automatically
adapted to the application context (e.g. application-specific names and terms), and that
the integrated Java editor is "architecture-sensitive": it immediately notifies the user of
an architectural conflict. In a sense, Fred can be compared to language-sensitive edi-
tors, but in Fred the rules to be checked are those of the architecture, not those of the
underlying programming language.
In Fred, the pattern concept is called a specialization pattern. This is essentially a
concrete realization and extension of the pattern concept discussed in the previous
section. As a pattern, a specialization pattern is given in terms of roles, to be played by
(or bound to) structural elements of a program, such as classes or methods. The same
role can be played by a varying number of program elements. This is indicated by the
multiplicity of the role; it defines the minimum and maximum number of bindings that
may be created for the role. A single program element can participate in multiple
patterns.
A role is always played by a particular kind of a program element. Consequently,
we can speak of class roles, method roles, field roles etc. For each kind of a role, there
is a set of properties that can be associated with the role. For instance, for a class role
there is a property "inheritance" specifying the required inheritance relationship of
each class associated with that role. Properties like this, specifying requirements for
the concrete program elements playing the role are called constraints. For example, a
simple inheritance pattern might consist of roles Base and Derived, with a constraint
stating that the class bound to Derived must inherit the class bound to Base. Another
constraint might state that the program element bound to a particular role must contain
the element bound to another role; we call this a containment constraint. It is the duty
of the tool to keep track of broken constraints and instruct the user to correct the situa-
tion. Other properties affect code generation or user instructions; for instance, most
role kinds support a property "default name" for specifying the (default) name of the
program element used when the tool generates a default implementation for the ele-
ment.
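To make the notions of role kind, multiplicity, and constraint more concrete, here is a purely illustrative sketch of how such a pattern might be represented as data; this is not Fred's actual API.

import java.util.Vector;

class Role {
    String name;      // e.g. "Base" or "Derived"
    String kind;      // "class", "method", "field", ...
    int min, max;     // multiplicity: allowed number of bindings

    Role(String name, String kind, int min, int max) {
        this.name = name; this.kind = kind; this.min = min; this.max = max;
    }
}

public class SpecializationPatternDemo {
    public static void main(String[] args) {
        Vector roles = new Vector();
        roles.addElement(new Role("Base", "class", 1, 1));
        roles.addElement(new Role("Derived", "class", 1, Integer.MAX_VALUE));

        // a constraint the tool would check against the bound elements:
        // the class bound to Derived must inherit the class bound to Base
        String constraint = "inherits(Derived, Base)";
        System.out.println(roles.size() + " roles, constraint: " + constraint);
    }
}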
The central part of the user interface of the Fred environment shows the current
bindings of the roles for a selected pattern, structured according to the containment
relationship of the roles. Since this relationship corresponds to the containment rela-
tionship of the program elements playing the roles, the given view looks very much
like a conventional structural tree-view of a program. In this view, a red spot marks
undone mandatory tasks, optional tasks are marked with a white spot. The actual to-do
tasks are shown with respect to this view: for each bound role selected from the view,
a separate task pane shows the tasks for binding the child roles, according to the con-
tainment relationship of the roles. The user interface of Fred is shown in Figure 1.
[Figure 1. The Fred user interface: an integrated Java editor, pattern instances, the task list, role bindings, and instructions for the selected task]
The application is built following the tasks generated by the tool. The tasks can be
carried out by indicating the existing program element that plays the role, by asking
the system to generate a default form for the bound element as specified by the pat-
tern, or simply by typing the element using the Java editor and then binding it to the
role. The system checks the bound element against the constraints of the pattern and
generates remedial tasks if necessary. The task list evolves dynamically as new tasks
become possible after completing others. The application programmer immediately
sees the effect of the actions in the source code. Each individual task can be cancelled
and redone. Hence the development process becomes highly interactive and incre-
mental, giving the application programmer full control over the process.
An important feature of Fred is its support for adaptive user guidance: the task
prompts and instructions are dynamically customized for the particular application at
hand. This is achieved by giving generic task title and instruction templates in the
pattern specifications, with parameters that are bound at pattern instantiation time. The
actual parameters can be taken, for instance, from the names of the concrete program
elements bound so far to the roles of the pattern.
This form of framework documentation resembles the way tutorials are conventionally written: new things are introduced gently, one at a time, in a logical order. This includes any exemplar code, which is typically provided in fragments instead of being revealed all at once. Hence, using Fred to record the documentation process could be characterized as automatic generation of a tutorial for the framework. Using Fred for several specializations would result in tutorials for different kinds of example applications.
To support the adoption of open-source frameworks, the documentation needs to be accessible on the web even prior to downloading the framework itself. This can be implemented using standard Java technologies: e.g., the documentation can be accessed by a Java applet contacting a Java servlet. The applet thus provides the user interface for the documentation player application, whereas the servlet has access to the specialization data exported from Fred. We follow this scheme in the current implementation.
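As a rough sketch of this scheme, the applet side might look like the following; the servlet path "fredData" and its step parameter are our assumptions, not the actual implementation.

import java.applet.Applet;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class PlayerApplet extends Applet {
    public void init() {
        try {
            // contact the servlet on the host the applet was loaded from
            URL url = new URL(getCodeBase(), "fredData?step=0");
            URLConnection conn = url.openConnection();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()));
            String line;
            while ((line = in.readLine()) != null) {
                // ... feed each line of specialization data to the player UI ...
            }
            in.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}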
The user interface of the player application is presented in Figure 2. It imitates the
essential parts of the original Fred user interface, but allows only play and playback of
the development process. The Forward button brings the user to the next specialization
step. The selection is changed in the binding view to the binding that has been created
in the step, or to the manual change that took place. Similarly, the description pane on
the right is updated to show the documentation for the step, and the source code view
highlights the piece of code that was generated or modified. Hence, it is possible to
observe the specialization of an example application step by step. With each button
click, the documentation grows incrementally, allowing the user to inspect all the
previous steps. Similarly, the user is able to browse all the source code that has been
produced at that point of specialization.
The steps are displayed as a tree that promotes the understanding of the recurring form that specialization takes for the particular framework. Our intention is that the user understands the framework specialization as an instantiation of a well-defined pattern, instead of as an unrelated set of service method invocations. The tool does not explicitly teach the abstract form that appears at the framework-application boundary, but builds the big picture by using examples. This makes use of the developer's inherent ability to abstract, which has proven to be the easiest way to adopt complex structures.
Framelets are small frameworks used as customisable building blocks for applications
rather than application skeletons [14]. In contrast to application frameworks, framelets
have very focussed functionality. The Red framework is a simple framelet used to
demonstrate Fred [7]. Here we will use this framework to illustrate our ideas. Al-
though this framework is so simple that it hardly needs extensive tutorials, it serves the
purpose of making the discussion more concrete.
Red provides user interface facilities to maintain a list of Record-objects and to edit
their fields. Typically, the Red framelet is used by deriving a new Record subclass
with some application-specific fields. Once the application developer has created this
new record type, the framelet provides facilities to automatically generate dialogs to
set the values of the instantiated Record-objects.
The central concept of the Red framework is a record. A record is an arbitrary class
with some member variables exposed to the framework as editable fields. In the ex-
ample application, new Host records (we assume the application developer wants to
create a list-box for ftp hosts) are created by the HostManager class that implements
the RecordFactory interface. Each record object must implement the fields() method
to create an adapter object for each field that it wants to expose to the framework.
Although not assumed by the framework, the adapters are most conveniently imple-
mented as anonymous inner classes, declared directly in the fields() method.
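For illustration, a Host record might be written as sketched below; the Record and Field type names and their methods are assumptions drawn from the description above, not the framelet's published API.

// Hypothetical application-specific record type for the Red framelet.
public class Host extends Record {
    private String address;   // application-specific member variable

    // expose member variables to the framework as editable fields
    public Field[] fields() {
        // adapter implemented as an anonymous inner class, declared
        // directly in the fields() method as suggested above
        Field addressField = new Field() {
            public String getName() { return "address"; }
            public String getValue() { return address; }
            public void setValue(String v) { address = v; }
        };
        return new Field[] { addressField };
    }
}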
From the application developer's standpoint, one of the specialization problems is
how to create a new record type that complies with the framework conventions. The
framework expert, in turn, has identified this request as a specialization goal pursued
by the application developer. We assume that the framework expert has used the Fred
environment to create a specialization pattern called RecordTypes to guide the appli-
cation developer through the specialization process for this goal.
Assume further that the ftp application has been chosen as the basis of the tutorial.
The framework expert can then produce the ftp application using Fred in the recording
mode. As a result, Fred generates a full trace of the specialization process. This trace
can then be made available for future users of the Red framelet through the player
application.
In Figure 2, a screenshot was presented for a situation where the application devel-
oper has reached a point where an address field has been added for a Host. The next
step is to create an adapter object for the address field. When the developer clicks the
forward button, the screen in Figure 3 is displayed. On the left, a small arrow indicates
the associated Java element for each step while the role-specific documentation and
source code are shown on the right. The documentation can contain links to more
detailed Java documentation. It uses application-specific terms, like “address”, which
were obtained when the pattern was used and recorded. Also, the associated source
code was given or generated during the original pattern usage.
6 Related Work
framework as well, this kind of facility can significantly support the distribution and
acceptance of open source systems.
Thomas Vestdam suggests a system using elucidative programming, a variation of literate programming, to provide program tutorials [16]. He describes a set of tools for creating and maintaining these tutorials. The created program tutorials can be viewed in a web browser, and they provide hyperlink navigation between the documentation, source code extracts, and the actual program code. A special-purpose language called the Source Code Extraction Language is used to describe the source code fragments that should be included in the documentation. Whereas our approach provides dynamic visualization of actions recorded during the framework specialization process, Vestdam's system presents a technique to produce and maintain hyperlinked, static documentation.
7 Discussion
References
[1] Bosch J.: Design and Use of Software Architectures - Adopting and Evolving a Product-Line Approach. Addison-Wesley, 2000.
[2] Booch G., Rumbaugh J., Jacobson I.: The Unified Modeling Language User Guide. Addison-Wesley, 1999.
[3] Clements P., Northrop L.: Software Product Lines - Practices and Patterns. Addison-Wesley, 2002.
[4] Fayad M.E., Schmidt D.C., Johnson R.E.: Building Application Frameworks - Object-Oriented Foundations of Framework Design. John Wiley & Sons, 2000.
[5] Gamma E., Helm R., Johnson R., Vlissides J.: Design Patterns - Elements of Reusable Object-Oriented Software. Addison-Wesley, 1994.
[6] Hakala M., Hautamäki J., Koskimies K., Paakki J., Viljamaa A., Viljamaa J.: Annotating Reusable Software Architectures with Specialization Patterns. In: Proceedings of the Working IEEE/IFIP Conference on Software Architecture (WICSA'01), Amsterdam, August 2001, 171-180.
[7] Hakala M., Hautamäki J., Koskimies K., Paakki J., Viljamaa A., Viljamaa J.: Generating Application Development Environments for Java Frameworks. In: Proceedings of the 3rd International Conference on Generative and Component-Based Software Engineering (GCSE'01), Erfurt, Germany, September 2001, Springer, LNCS 2186, 163-176.
[8] Hammouda I., Koskimies K.: Generating a Pattern-Based Application Development Environment for Enterprise JavaBeans. In: Proc. COMPSAC 2002, Oxford, England, August 2002.
[9] Hautamäki J.: Task-Driven Framework Specialization - A Goal-Oriented Approach. Licentiate thesis, Report A-2002-9, Department of Computer and Information Sciences, University of Tampere, 2002.
[10] Johnson R.: Documenting Frameworks Using Patterns. In: Proc. of OOPSLA'92, Vancouver, Canada, October 1992, 63-76.
[11] Meusel M., Czarnecki K., Köpf W.: A Model for Structuring User Documentation of Object-Oriented Frameworks Using Patterns and Hypertext. In: Proc. of ECOOP'97, LNCS 1241, 496-510.
[12] Meyer B.: Object-Oriented Software Construction. Prentice-Hall, 1988.
[13] Østerbye K., Madsen O.L., Sandvad E., Bjerring C., Kanmeyer O., Skov S.H., Hansen F.O.: Documentation of Object-Oriented Systems and Frameworks. COT/2-42-V2.4, Centre for Object Technology, Denmark, 2000.
[14] Pree W., Koskimies K.: Framelets - Small is Beautiful. In: Fayad M., Schmidt D., Johnson R. (eds.): Building Application Frameworks - Object-Oriented Foundations of Framework Design. Wiley, 1999, 411-414.
[15] Riehle D.: Framework Design - A Role Modeling Approach. Ph.D. thesis, ETH Zürich, Institute of Computer Systems, February 2000.
[16] Vestdam T.: Generating Consistent Program Tutorials. NWPER'02, Copenhagen, August 2002.
[17] Viljamaa A.: Pattern-Based Framework Annotation and Adaptation - A Systematic Approach. Licentiate thesis, Report C-2001-52, Department of Computer Science, University of Helsinki, 2001.
Massively Distributed Virtual Worlds:
A Framework Approach
MaDViWorld: A Java Software Framework for Massively
Distributed Virtual Worlds

P. Fuhrer and J. Pasquier-Rocha
University of Fribourg
Department of Informatics
Rue P.-A. de Faucigny 2
CH-1700 Fribourg
Switzerland
[email protected]
http://diuf.unifr.ch/~fuhrer/
Abstract. The aim of this paper is to briefly present the general con-
cept of virtual worlds and then to focus on distributed and decentralized
ones. MaDViWorld is a Java framework for massively distributed vir-
tual worlds. We first present its software architecture and then discuss
some of its specialized features, namely: the object structure and the
distributed event model. These are the main aspects that evolved since
the first version of the framework. To conclude, some example objects
and further potentialities of the framework are discussed.
1 Introduction
1.1 The Virtual World Paradigm
The document paradigm is well-known in today’s Internet technology: docu-
ments are made available on one or several servers and client applications (e.g.
Web browsers) are used in order to interact with them. The underlying metaphor
is the one of a huge cross-referenced book where each user browses through the
pages totally unaware of other users performing the same task at the same mo-
ment. All actions are asynchronous and, thus, there is no need for a central server
to coordinate user interactions with the pages of the book or to take care of an
event redistribution mechanism.
Within the virtual world paradigm, multiple users and active objects interact
in the same space. Therefore they have a direct impact on each other. Within
such systems, if a user interacts with an object, the other connected users can
see her and start a dialog with her. Moreover, it is possible for a user to modify
some properties of the world and all the other users present in the same subspace
(e.g. the same room) must immediately be made aware of it. Examples of the
virtual world paradigm range from simple graphical chat to sophisticated 3D
virtual worlds used for military simulations.
For a good comprehension of the present paper, the following four terms need to be briefly explained (for more details the interested reader is referred to [6]):
[Figure 1. Conceptual model of a world: avatars Sylvia, Hans, and James, a TicTacToe object, rooms R1 and R2, and corridors]
First of all, it is important to notice that it is beyond the scope of this paper to present the framework in all its details. This paper explains the concept of using a framework approach at a rather high level of abstraction and then concentrates on some critical points that have been solved.
More detailed information is available in [6] and in [7], which describe the
first version of the framework. Furthermore, an Object Programmer’s Guide is
reachable from the official MaDViWorld web site [3]. The guide includes a tuto-
rial for developing new objects. The working framework can also be downloaded,
with an installation guide and the Javadoc of the framework.
Fig. 2. One possible physical model for the conceptual model of Figure 1: room servers on physical machines at IP addresses 134.21.9.252 and 134.21.9.255 host the rooms R1, R2, and corridors, identified by URLs such as "134.21.9.252:1099/R1"; each room keeps the objects it contains, a list of rooms sharing doors with it, and a list of connected avatars (Sylvia, Hans, James), with all machines connected to the LAN/WAN
There are a lot of available definitions for what a framework is. A complete and
precise one is given in [12]. But let us take the following more concise one that
can be found in [2]:
Keeping these definitions in mind, this section describes the MaDViWorld appli-
cation framework. The different places where adaptations for specific functional-
ity should be made will be identified and explained. These flexible points of the
framework are commonly called hot spots and are opposed to the frozen spots
(see [15]). An overview of the whole framework is shown in Figure 3. The root
package for all the other MaDViWorld classes is ch.unifr.diuf.madviworld,
and in the rest of the paper this will be omitted for evident convenience reasons.
First, let us consider a horizontal decomposition of the building blocks of Figure 3:
Second, let us decompose the blocks of Figure 3 vertically. From left to right
one finds respectively all the packages and classes relative to the client appli-
cations (i.e. avatars), then those relative to the server applications (i.e. rooms)
and those relative to the objects populating the rooms. Finally, there are two utility packages: one containing packages and classes used by the framework and, rightmost, the event package. Obviously, the implementation or extension of the Avatar, Room and Object packages constitutes the hot spots of the MaDViWorld framework. There are two types of activities for a framework user:
– The first one consists simply of enriching the world with new types of active objects by providing the appropriate implementations of the wobject package. For this activity, users do not need to modify the standard avatar and room server applications, and they can resort to a little wizard application in order to install their new objects dynamically into a running (potentially remote) room.
– The second one consists of providing richer or better avatar and/or room server applications, either by extending the default ones, by both extending and implementing the appropriate framework extendable classes, or by fully implementing the appropriate framework interfaces from scratch.
This section focuses on the objects and particularly on the distributed event
model of the framework, which is one of the aspects that was not yet present in
the first version of MaDViWorld as presented in [6] and which plays a central
role for objects.
The purpose of this section is to better understand the process of objects creation
and use. Figure 4 illustrates how the wobject package has to be implemented
116 P. Fuhrer and J. Pasquier-Rocha
[Figure 3. Overview of the framework: the core, avatar, room, wobject, util, and event packages, together with the default avatar and default room implementations]
in order to develop new objects. For further comments about this package, the reader is invited to consult the Object Programmer's Guide on [3].
In order to add a new object, the framework user has to create the corresponding wobjects.newobj package, which must contain two subpackages: one for the object's implementation part and one for its graphical user interface (GUI) part. This clean separation between the user interface and the object logic does not by itself provide a two-way communication channel between these two parts. The client relationship between the MyObjGUIImpl class and the MyObj interface provides a one-way communication channel (from the GUI to the implementation), but the implementation part cannot send information back to the GUI. The distributed event model designed to address this issue is presented below.
[Figure 4. Class diagram for a new object: the framework interfaces WObject and WObjectGUI (both extending java.rmi.Remote) and the classes WObjectImpl and WObjectGUIImpl are specialized in the ch.unifr.diuf.madviworld.wobjects.myobj packages by MyObj, MyObjImpl, MyObjGUIImpl, MyObjPanel, and MyObjRemoteEventListener, whose notify(RemoteEvent) method receives events]
and is implemented in the core.DUID class. Each room, object or avatar has an
associated DUID that is generated by the framework, so that it can be identified
without ambiguity. The use of such a DUID was inspired by [8].
It is now time to take a closer look at the content of the event package (see Figure 5) and how it solves the mentioned problems:
– The RemoteEvent class defines remote events passed from an event generator
to the event notifiers, which forward them to the interested remote event
listeners. A remote event contains information about the kind of event that
occurred, a reference to the object which fired the event and arbitrarily many
attributes.
– The RemoteEventProducerImpl class implements the RemoteEventProducer
interface.
– The RemoteEventNotifier helper class notifies, in its own execution thread, a given event listener on behalf of a RemoteEventProducerImpl.
Figure 5 shows the design pattern used throughout the whole framework for the collaboration between the three different parts of MaDViWorld (i.e. avatars, rooms and objects) and the utility event package. Note that all three of them both implement the RemoteEventProducer interface and are clients of its implementation, RemoteEventProducerImpl. The operations defined by the interface are simply forwarded to the utility class. With this pattern we obtain the desired inheritance relation (a WObject 'is a' RemoteEventProducer) without duplicating the common code. Many similarities with the Proxy pattern defined in [9] can be found.
To sum up the whole event mechanism, the UML sequence diagram of Figure 6 details all the operations, from the registration phase to the firing and notification of an event. First (a), the event consumer registers a RemoteEventListener with a room, avatar or object whose events it is interested in. Second (b), due to a state change, an event is fired and all interested listeners are notified, each by a RemoteEventNotifier. The informed listener can then do the appropriate work with regard to the type of the event. In Figure 6, one can also see the different methods invoked remotely across the LAN. This pattern presents some similarities with the Jini distributed event programming model, which is specified in [1] and thoroughly explored in [14].
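A minimal sketch of the consumer side, using the class names of Figures 4 and 5; the refresh() helper and the exact wiring are our own assumptions.

import ch.unifr.diuf.madviworld.event.RemoteEvent;
import ch.unifr.diuf.madviworld.event.RemoteEventListener;

public class MyObjRemoteEventListener implements RemoteEventListener {

    private final MyObjPanel panel; // GUI part to be updated (see Figure 4)

    public MyObjRemoteEventListener(MyObjPanel panel) {
        this.panel = panel;
    }

    // invoked by a RemoteEventNotifier thread on behalf of the producer
    public void notify(RemoteEvent theEvent) {
        // inspect the kind of event and its attributes, then update the GUI;
        // refresh() is an invented helper on the panel
        panel.refresh(theEvent);
    }
}

The GUI part would register such a listener with the implementation part (phase (a) of Figure 6) through whatever registration operation the RemoteEventProducer interface exposes.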
The MaDViWorld framework has been intensively used and tested in various student projects, so there is already a small collection of interesting objects available. Here is a non-exhaustive list of them:
Fig. 5. Pattern used for integrating the event model in the framework: the event package classes RemoteEventProducerImpl, RemoteEventNotifier, and UnknownEventException, shown in relation to java.util.EventListener, java.lang.Runnable, java.rmi.Remote, and the Avatar/Room/WObject interfaces
[Figure 6. Sequence diagram of the event mechanism: (a) registration; (b) an event is fired, notifyAllListeners(anEvent) runs, and each listener receives notify(anEvent) from its RemoteEventNotifier]
4 Conclusion
4.1 Achievements
The current version of the MaDViWorld framework has reached the following achievements:
1. Security. On the one hand a security model inside the world (access to rooms,
permissions,...) should be clearly defined and developed. On the other hand,
security issues at the lower level of the framework should be resolved (down-
loading of classes, codebase verification, trusting of lookup services, etc.).
An overview of existing solutions can be found in [13].
2. Scripting. The end-user should be able to change the behavior of objects at
runtime.
3. Static description. Rooms and avatars with all their attributes should be described statically in a form of structured data such as XML and created "on the fly" by a parser tool. Persistence and state recovery mechanisms could then take advantage of this feature.
4. Multimedia and 3D. More complex spatial aspects of virtual worlds (rooms, avatars and objects) should be integrated.
5. Real world example. Developing concrete practical applications of the framework, such as educational worlds serving as a virtual campus, could be a possibility. Such a world is sketched in [6].
References
1. Arnold, K. et al.: The Jini Specification. Addison-Wesley: Reading, MA, 1999.
2. Buschmann, F. et al.: Pattern-Oriented Software Architecture - A System of Patterns. John Wiley and Sons: Chichester, 1996.
3. Department of Informatics, University of Fribourg (CH), Software Engineering Group: MaDViWorld: a Software Framework for Massively Distributed Virtual Worlds. http://diuf.unifr.ch/softeng/projects/madviworld/ [10 September 2002].
4. Fabre, Y. et al.: A framework to dynamically manage distributed virtual environments. In: Virtual Worlds, Proceedings of the Second International Conference, VW 2000, Paris, France, 5-7 July. Springer: Berlin, 2000; 54-64.
5. Frécon, E., Stenius, M.: DIVE: A scaleable network architecture for distributed virtual environments (special issue on Distributed Virtual Environments). Distributed Systems Engineering Journal 1998; 5(3):91-100.
6. Fuhrer, P. et al.: MaDViWorld: a Software Framework for Massively Distributed Virtual Worlds. Software Practice and Experience, 2002, 32:645-668.
7. Fuhrer, P. et al.: The MaDViWorld Software Framework for Massively Distributed Virtual Worlds: Concepts, Examples and Implementation Solutions. Department of Informatics Internal Working Paper no 01-23, University of Fribourg (CH), Switzerland, July 2001.
8. Gachet, A.: A Software Framework for Developing Distributed Cooperative Decision Support Systems - Construction Phase. Department of Informatics Internal Working Paper no 02-02, University of Fribourg (CH), Switzerland, March 2002.
9. Gamma, E. et al.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Professional Computing Series: Reading, MA, 1995.
10. Greenhalgh, C., Benford, S.: MASSIVE: A distributed virtual reality system incorporating spatial trading. In: Proceedings of the 15th International Conference on Distributed Computing Systems. IEEE Computer Society Press: Vancouver, Canada, 1995, 27-34.
11. Johnson, R. E., Foote, B.: Designing Reusable Classes. Journal of Object-Oriented Programming 1(2), June/July 1988, 22-35.
12. Larman, C.: Applying UML and Patterns. Prentice Hall PTR: Upper Saddle River, NJ, 2002.
13. Kouadri Mostéfaoui, G. et al.: Security Models for the Jini Networking Technology: A Case Study. Department of Informatics Internal Working Paper no 02-07, University of Fribourg (CH), Switzerland, May 2002.
14. Li, S.: Professional Jini. Wrox Press Ltd.: Birmingham, 2000.
15. Pree, W.: Design Patterns for Object-Oriented Software Development. Addison-Wesley: Reading, MA, 1995.
Distributed Java Platform with Programmable MIMD
Capabilities
Abstract. A distributed Java platform has been designed and built for the
simplified implementation of distributed Java applications. Its programmable
nature means that code as well as data is distributed over a network. The
platform is largely based on the Java Distributed Computation Library of
Fritsche, Power, and Waldron. The generality of our system is demonstrated
through the emulation of a MIMD (multiple instruction, multiple data)
architecture. This is achieved by augmenting the server with a virtual pipeline
processor. We explain the design of the system, its deployment over a
university network, and its evaluation through a sample application.
1 Introduction
The server is capable of both detecting client failure and redistributing the
computational load.
Other enhancements (not aspirations of the original JDCL developers)
include adding security to the clients, and expanding the range of applications that the
JDCL can support. A security manager has been developed that limits the downloaded
task’s interaction with the client software and donor machine. The other major
enhancement is the system's emulation of a MIMD (multiple instruction, multiple
data) architecture. This is explained in Sect. 2. The design of the system is explained
in Sect. 3. Section 4 gives a brief overview of how the system is programmed and in
Sect. 5 the system is evaluated with an application from the field of bioinformatics.
Java proved to be an ideal language for the development of this system. It
was possible to design a straightforward interface to the system: users are required to
extend only two classes to completely reconfigure a distributed computation.
Furthermore, identical clients (and identical downloaded tasks) could be run on a
variety of platforms. Existing programmable distributed environments or libraries
range from MPI [4] and PVM [5] to JavaSpaces [6] and the Java OO Neural Engine
(Joone) [7].
$$ r_j = \begin{cases} c_0(X) & \text{if } j = 0;\\ c_j(r_{j-1}) & \text{if } j > 0, \end{cases} \qquad (3) $$
where $r = r_{m-1}$ can be regarded as the seed to the recursion and defines the final result. The first clause in Eq. (3) is the terminating condition (passing the input to the first transformation) and the second clause describes how the result of any one transformation depends on the preceding transformation. We use the following compact notation to represent the recursive definition of Eq. (3):
$$ r = C(X) = \prod_{j=0}^{m-1} c_j(X), \qquad (4) $$
where $\circ$ denotes the operation to appropriately pass the results of one transformation to another. Equation (4) describes passing the complete input $X$ to transformation $c_0$, the result being passed to $c_1$, and so on. Staying within the pipeline processing paradigm, we could further partition the input into $n$ segments, as described in Eq. (1), and pass each segment in turn through the complete sequence of $m$ transformations. Appropriately combining the partial results at the end of the final transformation, as in Eq. (2), would allow us to write Eq. (4) as
$$ r = C(X) = \bigcup_{i=0}^{n-1}\left(\prod_{j=0}^{m-1} c_j(x_i)\right). \qquad (5) $$
The advantages of the representation in Eq. (5) include the ability to arbitrarily
change the granularity of the data throughput (some transformations may have
restrictions on the size or format of their arguments) and to permit parallelisation of
the computation. Pipeline computations could possibly be regarded as MISD
(multiple instruction, single data).
It is possible to combine both the client-server (SIMD) and pipeline (MISD)
models. This is important if we want to allow clients to effect arbitrary transforms
rather than each one performing the same $c_j$. In this case, the server divides the computation as well as the data. It distributes to the clients a description of a transformation $c_j$ as well as a data segment $x_i$. Since the partitioning shown in Eq. (1) is possible, there will not be any interdependencies between different parts of the data
stream. Equations (4) and (2) could therefore be combined as
$$ r = C(X) = \prod_{j=0}^{m-1}\bigcup_{i=0}^{n-1} c_j(x_i), \qquad (6) $$
which describes transforming all of the data segments with $c_j$ before applying $c_{j+1}$, and so on. Since Eqs. (5) and (6) describe the same computation, this shows that the order in which each $c_j(x_i)$ is effected is unimportant, as long as one finds the appropriate $(\bigcup, \circ)$ pair. An out-of-order implementation of Eq. (6) is a MIMD computation. Consequently, a MIMD emulator is the by-product of a loosely coupled client-server simulation of a highly structured pipeline processor. This computational theory tells us nothing about how to find an appropriate $(\bigcup, \circ)$ pair, or how efficient the resulting MIMD emulation might be. Sanders [9] has proposed an efficient algorithm to emulate MIMD computations on a synchronous SIMD system. Our asynchronous
system should admit emulation algorithms that are even more efficient because it
completely avoids what Sanders calls SIMD overhead [9] (where the globally issued
instruction is not required locally). Our system is still susceptible to load imbalance
overhead but this problem-dependent issue is inherent to all parallel computing,
including MIMD parallelism. Figure 1 shows an abstract model of the system.
[Figure 1. Abstract model of the system, with the user at the top]
The design mirrors that of Sanders [9], with a number of enhancements inspired by our computational model. The user partitions the MIMD algorithm into multiple independent sequential stages, if possible. Each stage corresponds to a node in a theoretical 'pipeline.' The code corresponding to all stages (the Task) is sent to clients as a compiled Java class. Execution of each of the (one or more) stages then proceeds as a SIMD computation as in [9]. All stages of the pipeline could be 'processing' at the same time if the particular problem allowed it. Our system is therefore most efficient at emulating MIMD computations that can be naturally expressed as a pipeline of SIMD computations. The overall system can be subdivided into three main sections: common modules, server, and client.
We found that as with many distributed systems, there is a lot of overlap in terms of
functionality between client and server. Each system (client and server) can keep two
distinct logs: system logs and error logs. The system logs record system events as they
happen. In the event of some catastrophic event (e.g. power loss), it may be possible
to use these logs at the server to restart a particular problem at the point where it was
halted. These logs are an optional feature on the client and are mainly used for
debugging purposes.
Communications are performed in our system using Java sockets. We decided to produce one single module, used on both the server and the client, to perform all socket communications. Its main functions are to open and close sockets, send and receive messages, and terminate communications. The other shared communications module is the basic unit used for communication in the system, called the Message class. It is extendable so that a message can contain items such as data, algorithms, information on client status, and so on.
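A hypothetical extension of the Message class might look like the following; the field names and accessors are our own illustration, since the actual class layout is not given here.

import java.io.Serializable;

// Invented subclass of the system's Message class, carrying one data unit.
public class DataUnitMessage extends Message implements Serializable {

    private final int stage;      // pipeline stage this unit belongs to
    private final Object[] data;  // the pre-processed parameter set

    public DataUnitMessage(int stage, Object[] data) {
        this.stage = stage;
        this.data = data;
    }

    public int getStage() { return stage; }

    public Object[] getData() { return data; }
}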
Each data unit that is sent out for processing by our server has a user-defined time limit associated with it. If the results for that unit have not returned to the server within the specified time, it is assumed that the unit has failed. This would normally happen through client failure, e.g. a donor machine being switched off. Additionally, if the client has not finished processing the unit when the time limit expires, it will contact the server and request a time extension. For these purposes, the server and client have been provided with a common timing module.
At the heart of our distributed system is the piece of compiled Java code that
is downloaded from the server to each client – the Task. This Java class contains the
algorithm that is executed over all of the subsequent data units that are received by the
client. The user of the system is required to extend this common class. All tasks
conforming to this interface will be accepted by the system. Any Java exceptions that
occur in the task as it is executing at the client are fed back to the server via the
communications protocol.
3.2 Server
The server can be divided into three main sections (see Fig. 2). The ServerEngine is responsible for initialising the server at start-up. It reads in the user-defined initialisation parameters via an external text file. After reading all of the
initialisation options, it creates the log files, loads and checks the user defined classes
(DataHandler and Task) and then creates the ConnectionManager. The
ServerEngine also acts as the interface between the communications modules and
the current running problem on the system. It also manages the lists of data units
(pending and expired) that are currently out being processed by clients.
The ConnectionManager is responsible for listening on the server
socket for new client connections and creating a new ServerMessageHandler
thread to handle each connection. Each time a new client connection is received, a
new Java thread is created that handles the communication on a separate port to the
main port used by the server to listen for new connections.
Fig. 2. Server design: the user extends Task (which is sent to the client) and DataHandler [the diagram also shows the Scheduler, TaskList, SchedulerThread, ServerMessageHandler, and MessageHandler components]
3.3 Client
The client software can be divided up into two main sections (see Fig. 3). The
ClientEngine is responsible for initialising the client software. Its first task is to
read in the server details that are contained in the external text file that is included in
the client installation. Once these details have been parsed correctly, the
ClientEngine starts the security manager that remains in operation for the
lifetime of the client. This strictly limits the client’s interactions with the donor
machine’s resources. The final task of the ClientEngine is to start the
communications section of the client.
The ClientMessageHandler is at the centre of the communications
section of the system. It manages the communications protocol, receives and
initialises the downloaded task and manages the execution of each data unit. Each
downloaded algorithm is received by the client as a compiled Java class. By using
Java’s class loading facilities, the algorithm is subsequently dynamically loaded into
the executing client. The ClientMessageHandler uses the shared timing module
to monitor the time being taken to process each data unit. If a data unit is taking
longer than its allotted time to process its data, the client can request a time extension.
The client continues in an infinite loop requesting and processing data units from the
server. When the server sends a new algorithm to the client, this new algorithm is
loaded dynamically and all subsequent data units received are processed using the
new algorithm. There are extensive exception handling mechanisms coded into the
client so that the client should run continuously until explicitly shut down by the
donor of the machine. Full details on the design of the JDCL and its extensions can be
found in [3,9].
[Figure 3. Client design, including the ClientSecurityManager and log files]
To program the system with a given problem, there are two Java classes that must be extended using the standard Java inheritance mechanisms. These are the DataHandler class and the Task class. The subclass of DataHandler specifies how to manage the
data for the distributed computation. The subclass of Task is where the distributed
algorithm is programmed. In order to program a MIMD computation, the user makes
use of the Bucket class when designing their DataHandler.
4.1 DataHandler
The main purpose of the extended DataHandler is to manage all of the data relating to the current problem. The first section of the DataHandler is the init() method. The purpose of this method is to initialise whatever data structures are necessary for the overall computation. This can involve setting up file readers, initialising arrays, assigning values to variables, etc. The next section is the getNextParameterSet() method. This is where the pre-processed data units are
generated to be sent to the clients. This method is called every time a client requests a
data unit to process. The return type is an Object array and since all Java classes
inherit from the Object class, this method can return any data type supported by Java.
The task running at the client receives this Object array as its data. Therefore, it is
usual for the elements of this Object array to be explicitly typecasted at the client. The
final section of the DataHandler is the resultsHandler() method.
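A sketch of such a subclass is given below, under the assumption that init(), getNextParameterSet(), and resultsHandler() have the shapes described above; the base-class details, the class name, and the readGenome() helper are invented for the example.

public class SubstringDataHandler extends DataHandler {

    private static final int SEGMENT = 100000;
    private String genome;     // complete input sequence
    private int nextOffset;    // start of the next segment to hand out

    // initialise the data structures for the overall computation
    public void init() {
        genome = readGenome("sequence.txt");
        nextOffset = 0;
    }

    // called each time a client requests a data unit to process
    public Object[] getNextParameterSet() {
        if (nextOffset >= genome.length()) {
            return null; // no units left (behaviour assumed)
        }
        int end = Math.min(nextOffset + SEGMENT, genome.length());
        Object[] unit = new Object[] { genome.substring(nextOffset, end),
                                       new Integer(nextOffset) };
        nextOffset = end;
        return unit;
    }

    // combine the partial results returned by the clients
    public void resultsHandler(Object[] results) {
        // ... merge the partial matches into the overall answer ...
    }

    private String readGenome(String file) {
        // ... read the input sequence from the file (omitted) ...
        return "";
    }
}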
4.2 Task
The subclass of the Task class describes how the data received by the clients is to be
processed. There is only one method that must be extended in the task - the run()
method. The pre-processed data that is sent by the server in each data unit is available
through the parameterList variable. The general format of the run() method is
described below in Fig. 4.
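A hypothetical run() method consistent with this description is sketched here; the parameterList variable comes from the text, while the class name and the result-reporting mechanism are assumptions.

public class SearchTask extends Task {

    public void run() {
        // elements of the Object array are explicitly typecast at the client
        String segment = (String) parameterList[0];
        int offset = ((Integer) parameterList[1]).intValue();

        // ... execute the distributed algorithm over this data unit ...

        // the client engine then returns the results (and any exceptions)
        // to the server via the communications protocol
    }
}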
The main purpose of this class is to effect the theoretical pipeline for MIMD
computations. This allows the developer to set up the server to act like a pipeline
processor (Fig. 5) with several different intermediary stages in the distributed
computation.
Fig. 5. Pipelined processes: input data flows through the processes P0 through P4 to produce output data
The pipeline is simulated by using the Bucket class to represent the storage required at each stage of the pipeline. The information is stored in each bucket using Java Vectors, thus maintaining the type independence of the system. In a MIMD computation, the complete algorithm (including the code for all stages of the pipeline) is sent to each client, with a flag being sent with each data unit to indicate which stage the data unit belongs to.
5 Application of System
could get very close to the theoretical speedup maximum. The theoretical speedup maximum was calculated from $S(p) = w / \lceil w/p \rceil$, where $w$ denotes the number of individual work units and where $S$, the speedup, is parameterised by the number of processors $p$.
Table 1. Speedup achieved for each of the three repeated substring search strategies

  Search strategy         Single processor   40-90 processors   Speedup
  (i)   Exact matching    130 hours          28 hours            4.6
  (ii)  Insertions        1790 hours         31 hours           57.7
  (iii) Deletions         1670 hours         35 hours           47.7
[Figure: speedup achieved versus the theoretical maximum, plotted against the number of processors (0-90)]
6 Conclusion
We have refined the JDCL in terms of efficiency and functionality, including the
successful extension of the system to emulate a MIMD architecture. This has allowed
us to implement a large-scale bioinformatics application. The system is completely
generalisable, and because it is written in Java, the developer interface is simplified to
the extension of two classes. Work is ongoing on the next generation of this system.
Several new features are to be incorporated into the new system, including a multi-problem scheduler; compression, encryption, and authentication of all communications; a remote server interface; and the migration of all communications to Java RMI. Future work includes performing this type of DNA analysis on other
similar size genomes with a view to eventually performing this type of analysis on the
human genome.
The continuation of this research has recently been funded by the Irish Research
Council for Science, Engineering and Technology: funded by the National
Development Plan.
References
JGrid: Exploiting Jini for the Development of Grid Applications

Alastair Hampshire¹ and Gordon S. Blair²

¹ School of Computer Science and Information Technology, University of Nottingham,
Jubilee Campus, Wollaton Road, Nottingham, NG8 1BB
[email protected]
² Department of Computing, Lancaster University, Lancaster, LA1 4YR
[email protected]
1 Introduction
2 Background
A number of middleware platforms have been developed for the GRID. As mentioned
above, most of these platforms are quite low level with their roots in work on parallel
programming environments, the most notable example being Globus. A number of
more high-level platforms have also been developed. We briefly examine a selection
of low level and higher-level platforms below.
The two low level platforms we examine are Globus and Gridware. Globus
[Globus01] is currently the most widely used approach for the construction of Grid
environments. It provides a set of services, the Globus toolkit, which implement
features such as resource discovery, resource management, communication, security
etc. These services are distinct, which allows them to be incorporated into
applications incrementally, letting the application programmer decide which grid
services to provide.
Gridware [Gridware01] is a grid computing engine designed by Sun Microsystems
to locate and harness idle resources on a network. A central holding area queues all
jobs for execution along with the requirements of the job, which may consist of
available memory, execution speed and available software licences. Gridware
matches the available resources to the job requirements. When a resource becomes
available the Sun Grid Engine software will dispatch a suitable job to that resource.
Both of the above are interesting platforms; however, they are low-level in nature
and thus make writing GRID applications a difficult and error-prone task.
The two high level platforms we examine are Legion and GriT. Legion [Legion01]
is an object oriented grid environment in which everything, whether hardware or
2.2 Jini
Jini [Jini01] was developed by Sun Microsystems to allow the creation of dynamic
distributed environments (i.e. ones in which services can be created and removed at
any time), in which clients and services communicate with each other whilst
possessing only minimal knowledge of each other. Any service wanting to participate
in a Jini organisation must find and join a Jini lookup service. For redundancy, more
than one lookup service may be available; in this case the service may choose to join
some or all of the available lookup services. A client wishing to make use of a service
finds all lookup services and retrieves a list of available services. The client can then
choose and make use of whichever service it wants.
Jini has been chosen because it provides several features that may be particularly
useful for the development of grid environments, such as automatic discovery of new
resources and eventual clean-up of crashed resources (see Section 1 for more details).
In addition, because Jini is based on Java, it makes use of Java's cross-platform
compatibility, which is particularly useful for grid environments.
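As a concrete illustration of this discovery/lookup scheme, the following sketch uses the standard Jini client API; the ProcessorService interface is an assumed JGrid type, not part of Jini itself.

    import net.jini.core.lookup.ServiceRegistrar;
    import net.jini.core.lookup.ServiceTemplate;
    import net.jini.discovery.DiscoveryEvent;
    import net.jini.discovery.DiscoveryListener;
    import net.jini.discovery.LookupDiscovery;

    // assumed JGrid service interface (not part of Jini)
    interface ProcessorService { }

    public class ClientDiscovery {
        public static void main(String[] args) throws Exception {
            // discover all Jini lookup services on the local network
            LookupDiscovery discovery =
                    new LookupDiscovery(LookupDiscovery.ALL_GROUPS);
            discovery.addDiscoveryListener(new DiscoveryListener() {
                public void discovered(DiscoveryEvent ev) {
                    try {
                        ServiceRegistrar registrar = ev.getRegistrars()[0];
                        // ask for any service implementing ProcessorService
                        ServiceTemplate tmpl = new ServiceTemplate(null,
                                new Class[] { ProcessorService.class }, null);
                        Object service = registrar.lookup(tmpl);
                        // ... the client may now use the downloaded stub ...
                    } catch (java.rmi.RemoteException e) {
                        // lookup service unreachable; wait for another event
                    }
                }
                public void discarded(DiscoveryEvent ev) { }
            });
        }
    }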
3 Design
This section outlines the design of an object oriented grid environment, JGrid,
implemented using Jini & Java. JGrid allows a client machine to make seamless and
transparent use of multiple distributed computational resources, each referred to as a
processor service (PS). The following design principles were defined:
– The system should be as dynamic as possible, allowing it to cope with PSs being
created and destroyed sporadically. Further, the system should be able to recover,
with no loss of data, should a PS crash.
– The system must be reusable. The user must be able to solve any problem they
want by simply "plugging in" a new algorithm at the client. This simplifies the
task of writing applications that can make use of grid resources.
Fig. 1 shows how the various components of the system interact. For a machine to
make its processor available within a grid environment it must start a PS, which
locates and joins a Jini lookup service. The Jini lookup service is used to store a list
of references to available services and their associated attributes.
The PS is the software used to represent and control a processor. For a computer to
be included in the Grid it must install and run this software which, when executed,
registers itself with any Jini lookup services on the network. This allows clients to
discover and pass objects to the PS, which will process the object and return it.
The PS requires the following attributes: Name (String) – the name "processor
service" identifies the PS; CPU_Speed (constant Integer); Spare_capacity (Integer);
Over_use_flag (Boolean); Minimum_spare_capacity (constant Integer).
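In Jini, such attributes would typically be published as an attribute entry with public fields; the sketch below makes that assumption, adapting the field names from the list above.

    import net.jini.core.entry.Entry;

    // PS attributes as a Jini attribute entry; Jini requires entry fields to
    // be public, serializable objects rather than primitives.
    public class ProcessorAttributes implements Entry {
        public String name = "processor service";
        public Integer cpuSpeed;                 // constant
        public Integer spareCapacity;
        public Boolean overUseFlag;
        public Integer minimumSpareCapacity;     // constant
    }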
3.2.3 Leasing
To make use of a PS, a client must first request a lease for that service. To keep hold
of the PS, the client must periodically renew its lease. When the client passes an
object to the PS for processing, the PS first checks to see if the client has a valid lease
before proceeding. If the client does not hold a valid lease then an exception is
thrown. When a client has finished using a service, it must return the lease. If a lease
times out, the PS assumes the client has crashed or ungracefully terminated its
connection. This ensures that a client cannot indefinitely hold a service without
regular renewal messages and thus helps to provide a more robust architecture.
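Sketched with the standard Jini lease interface, a client's use of a PS might look as follows; requestLease(), hasWorkLeft() and doSomeWork() are hypothetical names, and whether JGrid hands out net.jini.core.lease.Lease objects is an assumption.

    import net.jini.core.lease.Lease;

    public class LeasedComputation {
        void compute(ProcessorService ps) throws Exception {
            Lease lease = ps.requestLease(Lease.ANY);  // hypothetical method
            try {
                while (hasWorkLeft()) {
                    lease.renew(60 * 1000);   // keep hold of the PS
                    doSomeWork();
                    // if the client crashes here, renewals stop and the
                    // lease simply times out at the PS
                }
            } finally {
                lease.cancel();               // return the lease when done
            }
        }
        private boolean hasWorkLeft() { return false; }
        private void doSomeWork() { }
    }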
The service discovery and selection mechanism is a reusable component that keeps
track of available services and handles requests from a client for a service. However,
the client does not use this mechanism to communicate with the service, instead the
client downloads a stub of the service from the computer that is running the service.
This stub then allows the client to directly communicate with the service.
This key component will be initialised with the name “processor service” and the
attributes of the service required. The class will then retrieve information from a Jini
lookup service about available services that match the specified criteria. The
information retrieved will be stored within the class. When a client makes a request,
the PS with the highest spare_capacity is selected. Spare_capacity represents the
amount of spare processing power within the processor and is calculated as follows:
spare_capacity = CPU_Speed / (Num_of_clients + 1)
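The selection rule can be rendered directly in Java; the ProcessorInfo class below is a hypothetical stand-in for the attribute data retrieved from the lookup service.

    class ProcessorInfo {
        int cpuSpeed;
        int numOfClients;
        int spareCapacity() {
            // spare_capacity = CPU_Speed / (Num_of_clients + 1)
            return cpuSpeed / (numOfClients + 1);
        }
    }

    class ServiceSelector {
        ProcessorInfo select(ProcessorInfo[] available) {
            ProcessorInfo best = null;
            for (int i = 0; i < available.length; i++)
                if (best == null
                        || available[i].spareCapacity() > best.spareCapacity())
                    best = available[i];
            return best;   // the PS with the highest spare capacity
        }
    }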
Data can be passed to the PS by storing it inside the algorithm object. However, this
is not efficient, especially when transferring large amounts of data. Therefore, the
following data passing mechanisms have been defined.
The data service is designed to improve the performance of the system by
storing the data at the client, allowing the service to access the data on demand. The
data service contains a hashtable into which data can be placed alongside a tag that
will identify it. Various methods have been implemented to allow different types of
data to be entered into and removed from the data service. This allows a single
hashtable to store multiple sets of data of differing types. The client can enter all the
data required into the data service and then pass an interface of the data service to the
PS alongside the executable object. Because the user who programmed the
executable is aware of how the data were placed into the data service, the algorithm is
able to retrieve those data from the data service on demand.
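A minimal data service along these lines could look as follows; the put/get method names are illustrative assumptions, not the JGrid API.

    import java.util.Hashtable;

    // A single Hashtable stores data sets of differing types under
    // client-chosen tags.
    public class DataService {
        private Hashtable data = new Hashtable();

        public void putDoubleArray(String tag, double[] values) {
            data.put(tag, values);
        }

        public double[] getDoubleArray(String tag) {
            return (double[]) data.get(tag);
        }
        // ... analogous put/get pairs for other data types ...
    }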
In some situations, e.g. with large data sets, it would be advantageous to fetch
the data from the data service just before it is required, so that the processor does not
have to wait for the data to be copied across the network. The data streamer achieves
this by streaming the data across the network into a buffer on the PS machine. The
data can then be extracted from the buffer when they are needed, instead of being
dragged across the network. The data streamer is designed to be as flexible as
possible, allowing the user to specify the streaming parameters: how often data are
copied across the network as well as the number of data items in each copy.
The client makes use of the service discovery and selection mechanism by initialising
it with the attributes of the required service and then requesting as many services as it
needs. It is the job of the client to make the best use of available resources by
splitting the problem into separate sections. Each section can then make use of a
different computational resource thus improving the performance of the system. The
sections must be carefully selected, because they must be sufficiently heavyweight to
outweigh the costs of shipping code around the network, but there must be enough
sections to make use of several processors. The environment can be used by simply
‘plugging in’ the algorithm for a problem, which greatly simplifies the task of writing
applications that make use of grid resources.
3.6 Summary
The most important objective of JGrid is to be as dynamic and robust as possible, i.e.
it must be able to cope with continual changes in the grid environment. To achieve
this, the service discovery and selection mechanism has been carefully designed to
keep track of the changing grid environment.
The most important aspect of the design is the method by which objects are passed
to and processed by remote services, because it is this object passing and processing
that is fundamental to the functioning of the system. Other important aspects include
the use of Leasing to improve the robustness of the architecture, the use of processor
attributes in service selection and the advanced data passing mechanisms such as the
data service and data streamer. Finally, the system is designed to be reusable, by
allowing users to ‘plug in’ new algorithms to solve new problems.
4 Evaluation
Typical problems that could be solved using a grid environment such as JGrid, must
be sufficiently complex to require large computational resources. In addition the
problem must be separable into many coarse grained and loosely coupled subtasks,
which can be processed separately and then amalgamated to produce a final solution.
Several different algorithms were used to test the functionality and performance of
the system in operation. The most important algorithm used was the flow prediction
problem [Young94], which involves running several simulations of the same
algorithm with varied parameters. Each simulation predicts a river flow data set
based on a collected rainfall data set. Each calculated flow data set is then compared
to a collected flow data set, to see which statistically fits the best. The parameters of
the best fitting simulation can then be used in the future to predict river flows from
rainfall data. This problem has formed the basis of both the quantitative and
qualitative evaluation.
The test, as outlined above, was conducted by varying both the number of processors
used in the grid and the workload placed upon the grid. The results were then
compared to similar tests carried out on a standalone machine. The results
demonstrated that for a light workload, a standalone environment is more efficient.
This is because the time taken to discover available services and ship objects around
the network outweighs the performance increase of using multiple processors.
However, once the workload is sufficiently heavy, the costs of the initial service
discovery and object shipping become negligible compared to the overall processing
time, and the grid environment is able to offer significantly better performance than a
standalone machine.
Further results were also taken that demonstrated a significant performance
increase when using a grid environment containing more available PSs. This shows
that multiple resources can be effectively combined to reduce the overall computation
time for a given algorithm.
4.2.1 Java & Jini’s Suitability for High Performance Grid Environments
Grid computing environments will generally consist of a heterogeneous collection of
resources. Because of Java's platform independence, Jini is clearly well suited to a
heterogeneous environment, being able to ship code to any machine that supports the
Java virtual machine without recompilation.
Java provides a serialisation mechanism that allows objects to be converted into a
flat byte array, for transporting over a network. When the serialised object reaches its
destination it can be de-serialised into its previous form. This clearly makes it easy to
ship code around a network, something that is very important for the development of
Grid environments.
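For example, standard Java serialisation flattens and restores an object in a few lines; the name 'algorithm' below stands for any Serializable object to be shipped.

    import java.io.*;

    public class ObjectShipper {
        public static byte[] flatten(Object algorithm) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(algorithm);     // object must be Serializable
            out.close();
            return bytes.toByteArray();     // ship these bytes anywhere
        }

        public static Object restore(byte[] flat)
                throws IOException, ClassNotFoundException {
            // de-serialise the object back into its previous form
            return new ObjectInputStream(
                    new ByteArrayInputStream(flat)).readObject();
        }
    }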
Jini’s resource discovery techniques are extremely easy to use; all a service must
do is discover a lookup service and join it. Clients can then simply perform a lookup
to discover available services. Jini also responds dynamically to changes in available
services, with remote events being used to notify interested parties of changes to the
available resources. Jini makes use of leasing to provide an architecture that is robust
to unexpected changes in available resources. Clearly these techniques are ideally
suited to the development of grid computing environments.
However, there are reservations about the use of Java and Jini for the development
of grid computing environments. Jini has primarily been designed for use within
small communities, and it is debatable whether the lookup service will scale well in
large systems. In addition, Java is not traditionally thought of as the natural choice
for high performance computing. However, some current research suggests Java's
performance characteristics may be significantly better than previously assumed
[Moreira01, Pancake01, Kielmann01].
In summary, Java and Jini provide exactly the sort of platform-independent,
dynamic and robust resource discovery protocol that is required for a grid computing
environment. However, it is debatable whether Jini scales well to large grids and
whether Java is fast enough to support high performance computing.
removal of services, whilst leasing is used to ensure a robust architecture that can
cope with unexpected or unintended changes to the grid.
Results in Globus are returned as textual outputs. JGrid provides more powerful
features for returning results, because of its object-oriented architecture. The client
may define a specific ‘results’ object or set of ‘results’ objects to store the output and
include methods to allow the results to be easily retrieved.
Globus is essentially a set of tools that can be utilised from C/C++, which are widely
accepted to be efficient and fast programming languages. JGrid, by contrast, uses
Java. Whether Java is suitable for high performance computing applications is open
to debate; however, it is widely accepted that C and C++ offer better performance
than Java.
In summary, it is clear that JGrid provides or has the potential to provide several
features that are not supported by Globus, most notably platform independence.
Globus however has a clear performance advantage over JGrid. JGrid provides the
user with a higher level interface to the grid and therefore simplifies the task of
creating applications that can make use of the grid.
5 Future Work
There is a great deal of potential future work that could be carried out in this area,
however due to space requirements, only the major issues are mentioned here. One
possible extension to this project is that of dynamic reconfiguration. If, while a
remote object is being processed by a PS, another more powerful/ suitable PS
becomes available, dynamic reconfiguration would allow the object to be migrated to
the new processor. This is a desirable characteristic of a Grid environment, as it
would allow a client to make better use of available resources.
More advanced fault tolerance techniques could be included that periodically save
the state of the computation. Should a PS crash, the computation would not need to
be restarted from the beginning.
A more complex service selection mechanism could be used, for example an
attribute specifying available memory might be useful for data intensive
computations. Also the system could take into account the fact that some processor
architectures are more effective at processing some tasks than others.
As mentioned earlier, inter-object communication allows two objects being
remotely processed on separate machines to communicate with each other. It would
be worthwhile investigating how effectively Jini could support this functionality,
using features such as the Java Message Service and JavaSpaces. In a similar area,
further work could be done to exploit the protocol independence inherent in Jini to
interface to other, more lightweight protocols such as MPI.
6 Related Work
There are numerous other projects similar to the one described in this document, some
of which are outlined below. Many other projects can be found in the grid forum at
the following address: http://www-unix.mcs.anl.gov/gridforum/jini/
Clifford [Clifford00] has developed JiniGrid, which uses Jini to create a grid
environment that shares computational servers using the task farm paradigm. A
task farm consists of a number of worker processes, each typically running on its
own processor and all co-ordinated by a master process. The master module, known
as the TaskFarmService, must join the Jini lookup service to allow it to be located by
a client. Any problem to be solved is broken down into several component problems
(tasks) to be processed separately by different worker processes. The results are all
returned to the master process, which combines them to produce the overall result.
ICENI [Furmento02] exploits Java and Jini in a similar fashion to JGrid by
providing a grid middleware that builds on top of current grid services to provide a
more high level interface for applications. The ‘added value’ allows for the matching
of applications with underlying resources and services, to maximise the usage of
available resources. ICENI uses the notion of a Computational Community, to which
a group of individuals or organisations with a common goal can contribute their
resources. A resource manager manages resources within a private domain and a
policy manager controls access to these resources by users in the public
computational community. A domain manager is used to publish available resources
within one or more public computational community. Users interact with the public
computational community using tools such as a resource browser.
The main contribution of the JGrid work is to present both the design of a Jini
based Grid computing environment and a detailed evaluation based on environmental
sciences modelling algorithms used to test the system.
7 Conclusion
This paper has described the design and implementation of JGrid, a high-level
middleware for GRID computing built on Jini and Java. The work has been
evaluated both quantitatively and qualitatively, based on experiences of developing an
environmental modelling simulation for river flow forecasting.
The major results from this work are:
– The integration of Jini's discovery/lookup/join, leasing and the event service
helps considerably in the development of GRID applications by managing the
dynamics of the underlying GRID environment;
– The JGrid platform, through its use of object-oriented techniques, maintains a
clean separation of concerns between the application and the underlying
middleware infrastructure in terms of data sharing and resource discovery
services.
As well as the above, JGrid benefits from the cross platform compatibility inherent
in Java. The approach also lends itself strongly to dynamic re-configuration (although
at the time of writing this has not yet been implemented). Other areas of future work
include more advanced fault tolerance, enhanced service selection and inter-object
communication. Finally, the simulation work demonstrates that realistic GRID
applications can be constructed using the Java language and that through the use of
Jini, a reusable architecture can be created which simplifies the development of
applications that make use of a GRID environment.
References
Managing Dependencies in Component-Based Distributed Applications

S. Alda, M. Won, and A.B. Cremers

1 Introduction
other services) depending on such services must be notified accordingly, so that they
can adapt to the new situation. Dependency management can also facilitate other
management activities such as fault management [8], accounting management [6], or
tailorability management [1].
In this paper, we consider services that are supplied by software components
[18]. The use of software components makes it possible to build highly adaptable
applications, since components can be deployed during the runtime of an application.
In the context of our research work, we analyze to what extent dependencies between
distributed components influence the tailorability of component-based applications. We
examine strategies for enhancing the decision as to whether or not it is feasible to
adapt a component on which other components depend. Particularly in the area of
distributed applications whose constituting components interact in a peer-to-peer
[14] fashion, it is expensive to detect all direct and transitive dependencies
between components.
The paper is organized as follows: section 2 elaborates on different kinds of
dependencies, especially in the field of component technology. Section 3 presents our
component architecture FREEVOLVE. This architecture allows the deployment of
distributed component-based Java applications, which can be structured either in a
client-server fashion or according to the peer-to-peer paradigm. Furthermore, the
architecture supports tailorability as well as the management of dependencies between
components. Section 4 concludes this paper.
2 Dependencies
Dependencies between different components within a system matter in the sense that
the whole application becomes unstable if those dependencies are broken or violated.
Normally, there is no reason why this should happen during work with an application
(whether it is designed component-based or not). However, it can easily occur
when (distributed) applications change during runtime. Such changes can
appear in many ways. In the following we concentrate on two fields of application
where such changes (and therefore possibly violated dependencies) are not only
normal but even intended.
Tailorability of Components. Most of the software sold nowadays is off-the-shelf
products designed to meet the requirements of very different types of users. One way
to meet these requirements is to design software that is flexible enough to be
used in very different contexts [7, 12]. This flexibility can be achieved by
tailorable design architectures. The idea behind this concept is that every user can tailor
the software so that it best meets his personal working context. In this way,
parts of the specification and design can be done during use. According to the
ideas of participatory design [7], users then have more possibilities to shape the
software in the way they want.
Component-based architectures were originally developed with the idea of higher
reusability of parts of software. Components can be used to build such highly flexible
software. The same operations that are used when composing software during
development (i.e. choosing components, parameterizing them, binding them together) are
then the basis for a tailoring language. Especially if the components have a GUI
representation, it is quite easy to design appropriate tailoring languages including a visual
representation. Although both – the idea of adapting an application seen as a
composition of components, and the use of a visual tailoring language which supports only
very few tailoring mechanisms – are well understood by users, there is a need for
further support at tailoring time.
Errors made during tailoring can violate dependencies between components
that work together. Frese et al. [5] discuss the amount of errors users make during
their work (errors take about 12% of working time). They present empirical studies
showing that not only beginners but also experienced users make errors (as part of their
"working style"). Furthermore, they distinguish between different types of errors, i.e.
those which can be corrected easily, and those that cannot even be located directly.
What can be learned from this study is that making errors during use, and especially
during tailoring of software, is normal or even intended usage behavior. Therefore,
supporting mechanisms which help in finding failures (broken dependencies),
understanding the composition, or testing the current composition in an exploratory
environment seem to be very useful.
Failures in Peer-to-Peer Applications. It is inherent in the dynamic nature of
peer-to-peer applications that one peer cannot know all the other peers. In particular,
the information it gets concerning the current network composition is time-shifted. Thus
violated dependencies (seen from the perspective of one particular peer) stem from
broken connections to other peers whose services are needed, or from inconsistencies
between several services that work together. Apparently, observing dependencies
between services cannot solve the time-shift problem. However, the fact that mostly
independent peers cooperate, and have to be adapted so that they can do so, can
actually be described by dependencies. Peer-to-peer networks are highly dynamic;
therefore a single peer has to continuously adapt itself to a changing
environment. These adaptations can either be performed manually (tailorability) or by
means of automatisms through which the peer changes its configuration stand-alone
(adaptivity). In the second case there have to be descriptions of how a peer should be
connected to other peers, which services it needs, and how parameters can be changed
to allow a better fit into the environment.
Event Flow. The event flow between two components is one of the most common
dependencies and is integrated in nearly all prevalent component models. An event
flow typically consists of an event source and an event sink. An event source produces
an event (e.g. if a component changes its state), which is eventually passed to one or
more event sinks. In order to be notified, an event sink must subscribe to the source
(observer pattern). An event flow is uni-directional, that is, an event sink does not send
back a (confirmation) event to the source. In addition to these properties, various
others are conceivable: the FLEXIBEAN model [15], for example, provides a remote
event flow across machine boundaries through the incorporation of the RMI
technology. In the CORBA COMPONENT MODEL (CCM) 3.0 [11], event notification
is asynchronous, i.e. time-shifted. In both models, providing two components with
an event flow is performed by using ports, the connecting points of components. In
doing so, the port of the sink must be compatible with the port of the source.
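The source/sink scheme can be pictured in plain Java as follows; this is a generic observer-pattern sketch with invented names, not the actual FLEXIBEAN API.

    import java.util.Vector;

    interface EventSink {
        void eventOccurred(Object event);   // no confirmation is sent back
    }

    class EventSource {
        private Vector sinks = new Vector();

        void subscribe(EventSink sink) {    // observer pattern
            sinks.addElement(sink);
        }

        void fire(Object event) {           // e.g. on a state change
            for (int i = 0; i < sinks.size(); i++)
                ((EventSink) sinks.elementAt(i)).eventOccurred(event);
        }
    }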
Data Flow. The data flow represents the flow of information between two components.
In contrast to an event flow, where only little information about an event is
transmitted, a data flow is rather a continuous flow between two components.
This flow of data is typically not initiated by a state change but requested and
started through an interaction with a user. Furthermore, data flow is mostly
synchronous (not time-shifted) and bi-directional, that is, the direction of the flow between
the two components can alternate. In the FLEXIBEAN component model a
bi-directional channel is modeled by means of a shared object. Here, two components
have equal access to a shared object, which can be used to exchange arbitrary objects.
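A shared object of this kind can be pictured as below; again a generic sketch rather than the FLEXIBEAN implementation.

    // Both components hold a reference and may read and write alternately,
    // so access is synchronized.
    public class SharedObject {
        private Object contents;

        public synchronized void put(Object value) { contents = value; }
        public synchronized Object get() { return contents; }
    }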
Functional Dependency. By a functionally dependent component we understand a
component which depends on certain methods or interfaces provided by another
component. This connection between two components is obviously synchronous and
uni-directional. In the CORBA COMPONENT MODEL 3.0, a subset of the overall set of
interfaces a component provides is termed a facet. A facet provides a special view one
component can have on another component. In other words, a facet represents a role
a component can adopt during its life-cycle. Additionally, a component can navigate
between the facets of another component and is thus able to change the view during
runtime. A violation occurs if some facets are removed or become unavailable after
the adaptation of the component.
Implicit Dependencies. In contrast to the first dependency types, implicit
dependencies [9] involve no direct communication (e.g. method calls or information flow)
between components. These dependencies exist between services supplied by
components and services stemming from basic underlying services such as memory or
scheduling, which may have an indirect influence on the performance of a component.
Semantic Constraints and Integrity Conditions. Normally, state-of-the-art
development tools and platforms perform syntactic checks on the code or – in the case of
component-based architectures – on the composition. Nonetheless, there are cases in which
syntactically correct compositions do not work properly, or work properly but not
in the intended way (seen from the perspective of a composer or a tailoring user). In
many cases "integrity information" added to a component by its developer might help
avoid failures. Another idea is to design "application templates" in which conditions
are defined for a group of applications, such as the group of word processors or a group of
different database query front ends. Both approaches are based on the assumption that
special meta-information about the components and about domains or groups of
applications can be given or added to the component set. The approach described in the
following is based on the tailoring techniques mentioned above, but the concept can be
used to support adaptive strategies within a peer-to-peer network as well. In both cases
it seems useful to try to match the resulting integrity conditions to the basic
design or tailoring operations, that is:
Parameterization of components: There can be mutual dependencies between
different parameters within one component. They can be checked within the component's
functionality. The idea is very similar to constraints or triggers in the world of
databases, where dependencies between data records can be described, and updating,
deleting, or adding actions are checked against those conditions. In the case of
component-based software, explicit descriptions of dependencies between parameters not only
ensure correct parameterization but also explain parts of the functionality. This
helps in understanding the component's semantics. External dependencies between
parameters of different components also have to be described explicitly.
Changing the connections: As shown above, in component-based architectures
syntactical dependencies such as event flows can be checked; compilers usually master
these checks. Furthermore, additional semantic information can be given about
what a component needs from its environment or which services it provides. This idea
is similar to Parameterized Contracts [13]. The slightly different approach which
we call Event Flow Integrity (EFI) (for more details see [20, 21]) takes into account
where data are produced, transformed into other data, and consumed. Our approach
here builds on the idea that not only two interacting components are to be investigated
(with respect to the integrity of an application) but that the whole composition can be
seen as a network. Thus, like workflows (in WFMSs), compositions can be seen as directed
graphs and therefore analyzed in a similar way (cf. the next section about dependencies in
peer-to-peer applications). These graphs can be analyzed to check that every
essentially needed event is produced and that the producer is transitively connected to the
consumer. Events that have to be consumed can be treated in the same way.
Adding/removing components: The third basic operation set – adding and removing
components – also has an impact on the integrity conditions described above. Event Flow
Integrity can be violated if components are removed from a composition. On the other
hand, if one adds a component to an existing application, this new component may
have certain conditions that describe which of its provided data have to be consumed
or which input is necessary. Furthermore, we could define conditions that describe a
set of applications. Thus, similar to the conditions on parameters, we could demand
that a component of type X has to be part of the composition. More subtle rules can be
designed by using propositional logic terms (if component a is part of application X
then there must be a component b or a component c). Birngruber [3] describes
application templates in CoPL (Component Plan Language). Such a plan describes a
composition and parameterization possibilities; using the plan, an interactive agent builds up a
composition. This approach is very helpful for designing new applications but does not
support any concepts of flexible or tailorable applications.
The sum of the integrity concepts described above then allows for easier tailoring
(fewer failures), a better understanding of the application if detailed descriptions are
added to the integrity conditions, and better adaptivity strategies for dynamic
peer-to-peer applications if integrity conditions are taken into account. Obviously,
the more semantic information about dependencies is available, the higher the
degree of adaptivity of a peer application.
In recent years, component-based architectures [18] have become quite fashionable in
the field of software engineering. Very important properties of components are their
reusability and their independent development. Components can thus be
seen as small programs which can exist and run alone but may also be combined with
other components, so that an application normally consists of several components that
are connected with each other.
Tailorable software can be designed if an underlying platform provides tailoring
operations, so that users can change the composition easily. These tailoring operations
are mainly the same as those used in the composition tools employed by developers,
except that they are much easier to handle and hide most of the complexity [15]. This
idea was implemented in the FREEVOLVE (formerly EVOLVE) platform [15][16], which is
described in the following. Additionally, we describe our current efforts to incorporate an
adequate dependency management in the platform, which serves as a necessary
prerequisite for the tailorability of components belonging to distributed applications.
The FREEVOLVE platform is based on the FLEXIBEAN component model, which has
already been depicted in section 2.1. FREEVOLVE is a client-server architecture where
all components initially reside on the server. Compositions of these components
(constituting concrete applications) are described in a language called CAT. There are
CAT files for both server-side and client-side compositions, and another CAT
file (DCAT) describes the remote interaction between client and server components
(see figure 1). When a FREEVOLVE client is started (currently realized as an applet and
as a stand-alone implementation), the descriptions of the client's compositions are
transferred to the respective client. After an application is chosen, the components needed to
build it up are also transferred and instantiated. These applications can
then be tailored by the users according to their likes and dislikes. At runtime,
applications are executed according to the client-server paradigm.
(Fig. 1: the FREEVOLVE architecture – a DCAT file describes the entire distributed application; FREEVOLVE clients on the Web interact with the server via remote interaction (RMI).)
components which are smaller. Evaluating tests have shown that this design seems
very promising. A second step to ease learning is therefore to allow for layered
architectures [19], which means that several components can be stuck together and
saved as one larger component.
One of the remaining problems is that it is not clear to users in which situation
which component should be chosen, or how the components have to be bound together.
Seen from the perspective of the developers, the semantics of the components as well as
their parameters and interfaces have to become more transparent. There are some
other concepts [10] for creating a better understanding of tailoring languages as well as
of the resulting applications. In the following (see 3.4) we describe in more detail the
possibilities that can be provided by an integrity checking system that is not intended
to auto-correct tailoring errors but to show and explain them. These hints are given
visually or in help texts which allow for more detailed explanations.
In the new version of FREEVOLVE we are also working to integrate concepts for the
management of dependencies between distributed components belonging to
peer-to-peer applications. We propose a pre-analysis of the given peer network, which reveals
all direct and transitive dependencies between components [1]. The resulting
dependency graph supports the decision process as to whether or not it is feasible or even
economical to execute the tailoring process for a certain component. Besides, an
overview is provided of services or peers that have become unavailable or unreliable.
Due to the nature of our FLEXIBEAN component model we are currently able to handle
two different kinds of dependencies: event flow and data flow. These dependencies are
defined in the corresponding CAT files of a composition. All dependencies starting
from a single component, that is, all inner dependencies among provided and
consumed services and all remote dependencies to other components, can be explored
through the so-called DEPENDENCY API of a peer. To compute the dependency graph,
the necessary data of all affected peers is collected by an agent which
successively migrates to all peers. Dependencies themselves are described in an XML-based
notation. Each dependency is annotated with additional parameters. These parameters
are based on the classification catalogue introduced in [8]. From this collection we
have basically adopted the parameters DEPENDENCY STRENGTH (denoting how strongly
the dependent component depends on the respective components) and DEPENDENCY
CRITICALITY (indicating how a certain dependency must be satisfied in terms of the
resource this component depends on). By means of these additional parameters we
obtain a weighted graph. All parameters are currently edited manually, but we are
working on mechanisms to set them automatically.
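A hypothetical fragment of such an annotated dependency description might look as follows. All element and attribute names are invented for illustration; only the two annotation parameters come from the text.

    <dependency kind="eventflow">
      <source peer="peerA" component="SearchTool" port="resultOut"/>
      <target peer="peerB" component="Viewer" port="resultIn"/>
      <!-- weights for the dependency graph -->
      <strength value="high"/>
      <criticality value="mandatory"/>
    </dependency>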
Both the standard FREEVOLVE platform and the extended peer-to-peer version
have to be supported by an integrity checking module. So far this has only been done
for the classical tailoring environment. For the standard FREEVOLVE client there
is now an additional client, called the Tailoring Client. This special client was
developed for experienced users who want to change a composition. The look and feel
was taken from integrated development environments (see figure 2). In the upper left
side (1) one can see the complete component net of an application. All components
and their provided and used ports are displayed. Small green icons indicate whether a
port is an input or an output. Additionally, one can see whether a port is optional or has
to be used (according to the idea of event flow integrity). The upper right window shows
the same component net in a tree view. Here the details of a component are listed and
parameters can be set. The third window shows messages that are generated by the
integrity check. Furthermore, icons (circled) mark components, parameters or ports of
components which may be the source of a failure according to the integrity strategy.
The integrity check is integrated loosely according to the strategy pattern.
Therefore, FREEVOLVE allows for extensions or refinements of the integrity strategy. The
integrity conditions or dependencies are explicitly saved in XML. They are loaded by
the strategy when a distinct composition is analyzed. The explicit description of
integrity conditions that are not coded into the components themselves allows for more
flexibility, in the sense that they can be changed or extended for a special
domain. However, only very experienced users should do this.
4 Conclusions
In this paper we have presented a component architecture which allows one to deploy
distributed client-server as well as peer-to-peer applications. We provide mechanisms
for tailorability and have explained fundamental ideas for the adaptivity of applications.
The latter mechanism is essential in a peer-to-peer environment, where
services are not known a priori or become unavailable after a time. Further, we
explained our approach to handling dependencies between components, that is, the
constituting building blocks of an application. Besides a graph-based approach to mastering
dependencies, especially in peer-to-peer applications, we have incorporated the
possibility to define semantic conditions between components.
References
13. Reussner, R. H., ”The use of parameterised contracts for architecting systems with software
components”, in Proc. of 6th Int’l Workshop on Component-Oriented Programming
(WCOP'01), 2001.
14. Shirky, C., What is P2P… and what isn't. The O'Reilly Network, 2000. http://openP2P.com
15. Stiemerling, O., Hinken, R., Cremers, A. B., “The EVOLVE Tailoring Platform: Support-
ing the Evolution of Component-Based Groupware”, in: Proc. of EDOC'99, IEEE Press,
Mannheim, 1999, pp. 106–115.
16. Stiemerling, O., Cremers, A.B., “The EVOLVE Project: Component-Based Tailorability for
CSCW Applications”, AI & Society, 14, Springer-Verlag London, 2000, pp. 120–141.
17. Sun Microsystems, JXTA v1.0 Protocols Specification. http://spec.jxta.org/v1.0/, 2001
18. Szyperski, C., “Component Software – Beyond object-oriented programming”, Addison-
Wesley, 1997.
19. Won, M., “Komponentenbasierte Anpassbarkeit – Anwendung auf ein Suchtool für Group-
ware“, Master Thesis, Institute for Computer Science III, University of Bonn, 1998.
20. Won, M., “Checking integrity of component-based architectures", in Proc. of CSCW 2000,
Workshop on Component-Based Groupware, Philadelphia, 2000.
21. Won, M., Cremers, A. B., “Supporting End-User Tailoring of Component-Based Software –
Checking integrity of compositions”, in Proc. of CoLogNet 2002 Workshop affiliated with
LOPSTR 2002, Madrid, Spain.
A Notation for Component-Based Design of Java
Applications*

C. Amza and G. Reggio
1 Introduction
In this paper we present JTN2 [1] (Java Targeted Notation 2), a notation for
component-based design of Java applications. JTN2 focuses on two major issues of
component-based development at the design level: component definition and
component interconnection.
In JTN2 we define a component model based on fundamental object-oriented
principles: abstraction, encapsulation, modularization and typing [2]. JTN2 is an
extension of JTN [4], an object-oriented, formal, visual notation, developed by our
group in Genova for designing concurrent Java applications. JTN supports the core
object oriented concepts: class, object, information hiding and specialization. JTN
differentiates between active and passive objects and between objects and the
interaction between them in order to support concurrency. The interactions between
objects are modeled by means of connectors. JTN2 adds support for components that
can be described, understood and analyzed independently from other components. A
component has to come with a clear specification of the services it provides [11].
* This work is supported by the Luxembourg Ministry of Higher Education and Research
under project n° BFR00/040 and by the Italian National Project SAHARA
(Software Architecture for Heterogeneous Access Networks infrastructure) funded by MIUR.
UML (the Unified Modeling Language) [12] is a general-purpose modeling language that
has emerged as a widely used, standardized notation for describing visual,
object-oriented models. JTN2 aims to be UML-compatible where appropriate, which enhances
its acceptance, but introduces new concepts when required.
However, UML does not provide constructs related to component-based design.
For instance, "UML components" are used to describe the physical, deployable
pieces of the system, unlike the more abstract notion of components needed by
component-oriented design.
In [6] UML is extended to include additional concepts for component-based design
by defining a profile that supports components (called capsules), connectors and
ports. UML 2.0 (available by 2004) is intended to be better suited for component-based
development than the previous versions, but its approach will remain
object-oriented. We think that object-oriented concepts are necessary but not sufficient for
building component software. Indeed, in JTN and JTN2 we introduce concepts that
are not necessarily object-oriented: component, connector and service.
In addition to limited expressiveness, the object-oriented notations based on UML lack
formal semantics. This makes the formal verification of designs and
automatic code generation more complex.
Moreover, UML is not concerned (nor are most of the current notations) with the
programming language that will be used for coding the application. Instead, every
concept supported by the JTN2 notation is mapped to Java. This helps the
developer to take into account the features and the limitations of the Java programming
language during the design phase, and it eases the automatic generation of Java
code.
In this paper we focus on component notation. The definition of the formal
semantics and the mapping to Java associated with the JTN2 notation are part of our
future work. In section 2 we present the main concepts supported by JTN2: object,
class, relationship, connector and component. In section 3 we describe the visual
notation using a case study. Finally, in section 4 we sketch some conclusions and
future work.
Object-Oriented Concepts
In [2] an object is defined as an "entity with a well-defined boundary and identity that
encapsulates state and behavior". Data values are static data used by other entities. In
contrast with objects, data values are "atemporal, unchanged and non-instantiated"
[2]. Furthermore, data values have no identity, so it is not possible to differentiate
between two instances of the same data value. An active object is one that owns a
thread and can initiate an independent activity. A passive object, on the other hand, is
one that cannot initiate an independent activity and, in consequence, can change its
state only when explicitly acted upon. JTN2 differentiates between values, passive
objects and active objects.
An object consists of a collection of attributes representing its state and a
collection of services representing means of interaction with other objects. There are
two kinds of services: methods, provided by the passive objects, and communication
channels provided by the active objects.
An object can interact with a passive object by calling a method provided by the
passive object. If multiple active objects can simultaneously call a method provided by
a passive object, then access to the passive object is synchronized. In this
situation, synchronization is necessary to ensure that the access of active
objects to passive objects never creates data races. Active objects do not provide
methods to the outside; instead they interact with other active objects or with external
entities using I/O operations through communication channels. A communication
channel is an abstraction for a unidirectional point-to-point communication between
two entities with I/O capabilities.
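In plain Java the distinction can be pictured as follows; the class names are illustrative only.

    // A passive object: it changes state only when acted upon, and its
    // methods are synchronized so simultaneous calls cannot create data races.
    class Counter {
        private int value;
        synchronized void increment() { value++; }
        synchronized int get() { return value; }
    }

    // An active object: it owns a thread and initiates independent activity.
    class Worker extends Thread {
        private final Counter counter;
        Worker(Counter counter) { this.counter = counter; }
        public void run() {
            for (int i = 0; i < 1000; i++) counter.increment();
        }
    }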
A class is a description of a set of entities with the same properties. In JTN2 there
are three kinds of classes: datatype, passive and active class whose instances are
values, passive and active objects respectively. A class definition is split in two parts:
an interface and a body. The class interface defines the services provided by the
instances of the class, whereas the class body provides an implementation for them.
The class body is separated from the class interface and completely encapsulated.
JTN2 requires a type to be assigned to each entity used in the model. The most common
data types, either primitive (int, char) or structural (array, list), are predefined. In
addition, each user-defined class introduces a new type.
From an OO perspective, stand-alone classes are not meaningful; most of them are
related in order to accomplish more complex functionality. Thus, we complete the class
descriptions with the relevant relationships between them: dependency and
generalization/specialization. A dependency relationship denotes a client/server
relationship between two classes: the client uses the functionality provided by the
server class.
A generalization/specialization relationship denotes a relationship that supports:
– Subclassing, that is, inheritance of the class body
– Subtyping, that is, inheritance of the class interface
– Substitutability, that is, objects of the specialized element (the child) are
substitutable for objects of the generalized element (the parent).
Conventional design modeling notations, such as UML, focus on the static
structure of the system, leaving the description of the architecture and of the
interaction among objects implicit, distributed and difficult to identify. In JTN2 we
explicitly model the run-time architecture of the system by describing the objects and
the interactions among them in the architecture diagram. The architecture diagram is
essentially a graph whose nodes correspond to the entities building the application
and whose arcs, named connectors, correspond to the interactions among them. There
are two kinds of connectors:
– Method call connectors. A method call connector is an architectural abstraction
corresponding to a standard method call in the Java programming language.
– Channel connectors. A channel connector is an architectural abstraction
corresponding to a message exchange mechanism through abstract communication
channels (streams in Java); see the sketch below.
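The mapping of channel connectors to Java streams can be illustrated with piped streams; this illustrates the idea and is not code generated by JTN2.

    import java.io.*;

    public class ChannelDemo {
        public static void main(String[] args) throws IOException {
            // a unidirectional point-to-point channel between two entities
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out); // connect ends

            DataOutputStream sender = new DataOutputStream(out);
            DataInputStream receiver = new DataInputStream(in);

            sender.writeUTF("hello");                // writer end
            System.out.println(receiver.readUTF());  // reader end
        }
    }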
Component
3 JTN2 Notation
In this section we use a simplified version of the LuxDeal [5] case study to exemplify
the JTN2 notation. LuxDeal is a free online barter service that brings together people
that want to exchange goods and services. The simplified application provides only
the following functionality: when a new user connects to the application, she/he will
have the possibility to create a new account by providing some useful information.
The application is decomposed as follows:
– Account component. This component manages the user accounts. The Account
component allows the user to create a new account and access account information.
The account information is saved to a database, so it can be retrieved later.
– DBHelper component. This component provides database access services.
The two components are connected together to provide the overall functionality of
the application. Here, for simplicity, the concurrent and distributed aspects of the
application are ignored. We focus on component definition and integration, without
concern about the internal realization of the services.
Component Definition
(Figure: the Account component interface – provided services ! Access (getLogin():String, getPassword():String, getName():String) and ! Account (register(String, String, String): boolean); required service ? DBAccess (getConnection():Connection); the types Connection (prepareStatement(String): PreparedStatement) and PreparedStatement (setString(int, String), executeUpdate(), close()) are used.)
Figures 2 and 3 give the body of the Account component. The Account component
body is defined using three classes (figure 2): AccountModel, which encapsulates the
component state and the access services; AccountController, which implements the
registration service; and AccountDAO, which implements the persistence functionality
and requires a database access service.
(Fig. 2: the class diagram of the Account component body – AccountModel (getLogin, getPassword, getName, setLogin, setPassword, setName), AccountController (register(String, String, String): boolean) and AccountDAO (store(AccountModel)). Fig. 3: the architecture diagram of the component body – register and store calls connect the controller, model and DAO objects; the :AccountDAO object uses the DBAccess service (getConnection).)
The body diagrams associated with all the classes can be found in [1].
Figure 3 shows that an object of type AccountController interacts with an object
of type AccountModel to modify the state of the account, and with an object of type
AccountDAO to store the state of the account. The object of type AccountModel
implements the service Access. The object of type AccountDAO is a helper object
that contributes to the implementation of the service Account by accessing the
component state and storing it in a database. It requires a service that provides access
to the database.
(Figure: the DBHelper component interface – provided service ! LuxDealDB (getConnection():Connection); the type Connection (prepareStatement(String): PreparedStatement) is used.)
Component Integration
(Figure: the Message component interface – provided service ! MessageService (send(Message)); required service ? DBAccess (getConnection():Connection); the types Message (setSender(String), setReceiver(String), setText(String)), PreparedStatement and Connection are used. Figure: the Account component extended with a required service ? Message (sendMessage(String, String, String)).)
Because the service Message required by the component Account is not compatible
with the service MessageService provided by the component Message, we need to add
a glue object. In our example, the role of the glue object is to convert the service
required by the component Account to the service provided by the component Message,
so that they match. In JTN2 we allow an object to be connected to a service (provided
or required) through a method call connector. Figure 8 shows the application class
diagram and figure 9 shows the new application architecture.
(Figure: the MessageAdaptor class offers sendMessage(String, String, String) and uses message.Message (setSender, setReceiver, setText).)
Fig. 8. Adapting incompatible components – the class diagram
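Rendered in Java, the glue object of Fig. 8 could look as follows. The constructor and the MessageService interface shown here are assumptions based on the figures, and Message is the case-study class with the setter methods listed above.

    // assumed service interface of the Message component
    interface MessageService {
        void send(Message message);
    }

    public class MessageAdaptor {
        private final MessageService service;

        public MessageAdaptor(MessageService service) {
            this.service = service;
        }

        // the service required by the Account component ...
        public void sendMessage(String sender, String receiver, String text) {
            // ... converted into the service provided by the Message component
            Message m = new Message();
            m.setSender(sender);
            m.setReceiver(receiver);
            m.setText(text);
            service.send(m);
        }
    }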
(Fig. 9: the new application architecture – :Account, :DBhelper, :MessageAdaptor and :Message; the :MessageAdaptor glue object converts sendMessage calls from the Account component into send calls on the MessageService provided by :Message.)
4 Conclusions and Future Work

In this paper, we have proposed a component model, at the basis of a visual design
notation JTN2, that allows complex systems to be built from independently
understandable, reusable building blocks. Our model is based on the principles of the
object-oriented modeling: abstraction, encapsulation, modularity, typing and
hierarchy. In addition to well-known object-oriented concepts it supports the concepts
of connector and component as first-class model abstractions.
JTN2 aims to answer to two fundamental issues of component-oriented modeling:
component definition and component composition. A component can be described,
understand and analyzed independently. By defining the services required by a
component to function correctly, we allow a component to be compiled and deployed
independently. Then, more complex components and applications can be built by
wiring pre-designed components. If two components are incompatible, they can be
adapted by adding supplementary glue components.
In the future, we intend to provide JTN2 with a well-defined syntax and formal semantics, so that we can check the models for inconsistencies. By mapping the concepts
supported by JTN2 to Java, we aim to generate complete Java code starting from the
visual component descriptions. We are currently investigating the Java-related
technologies that support component-based development: JavaBeans, EJB and JINI.
For a component description in JTN2 we should be able to generate different
implementations, depending on the target technology.
The notation is currently restricted to Java applications consisting of a static number of components that run in a non-distributed environment. We intend to extend the notation to support the development of distributed and mobile systems.
We have also developed a visual editor for constructing object-oriented JTN2 models. We want to extend it to support the component notation and to perform type checking and code generation.
References
1. Amza, C., Reggio, G.: A Notation For Component-Based Design of Java Applications.
Long Version. Technical Report DISI-TR-2002-20, Dipartimento di Informatica e Scienze dell'Informazione, Università di Genova. Available online at:
ftp://ftp.disi.unige.it/person/AmzaC/papers/jtn2_02.pdf (2002)
2. Booch, G.: Object-Oriented Analysis and Design with Applications. Second Edition. The
Benjamin/Cummings Publishing Company (1994)
3. Booch, G.: The Unified Modeling Language User Guide. Addison-Wesley (2000)
4. Coscia, E., Reggio, G.: JTN: A Java-Targeted Graphical Formal Notation for Reactive and Concurrent Systems. In: FASE'99 – Fundamental Approaches to Software Engineering. Lecture Notes in Computer Science, Vol. 1577. Springer-Verlag, Berlin (1999)
5. Ries, B., Sterges, P.: Requirements for the LuxDeal Prototype. Luxembourg University of Applied Sciences (2002)
6. Selic, B., Rumbaugh, J.: Using UML for Modeling Complex Real-Time Systems.
Rational Software. Available online at http://www.rational.com/products/whitepapers/
UML-rt.pdf (1999)
7. Sun Microsystems: JavaBeans Specification. Version 1.01. Available online at
http://java.sun.com/products/javabeans/docs/spec.html (1997)
8. Sun Microsystems: Enterprise JavaBeans Specification. Version 2.1. Available online at
http://java.sun.com/products/ejb/docs.html (2002)
9. Sun Microsystems: JINI Architecture Specification. Available online at
http://sun.com/jini/jini1.1html/jini-title.html (2000)
10. Sun Microsystems: JINI Technology Core Platform Specification. Version 1.1. Available
online at http://sun.com/jini/specs/jini1.1html/core-tilte.html (2000)
11. Szyperski, C.: Component Software – Beyond Object-Oriented Programming. Addison-
Wesley and ACM Press (1998)
12. UML Revision Task Force: OMG UML v.1.3 specification. Available at
http://www.rational.com/media/uml/post.pdf (1999)
13. Xiong, Y.: An Extensible Type System for Component-Based Design. Ph.D. Thesis.
University of California at Berkeley (2002)
WCT: A Wrapper for Component Testing
1 Introduction
Component-based (CB) methodologies are widely used today in all the classical engineering branches. Their adoption is mainly motivated by the need for more predictable timing and costs in the development phase. Although the introduction of a CB paradigm into the software engineering branch has been advocated for a long time [1], only in recent years have we observed significant advances towards the real applicability of this methodology to software production. Proof of this progress is the advent of the first successful component models, such as COM+/.Net, EJB, JavaBeans and CCM. However, in spite of these advances, we can certainly say that CB production is not yet the state of practice in software engineering. What is still lacking for the real take-up of the CB paradigm is a major revision of the software process, to address the peculiarities of CB production. In [2] a list of important challenges in CBSE is discussed.
A first and basic difference between the traditional production methodology and a CB one lies in the non-deterministic distribution, in time and in space, of the CB development process. In fact, in CB production the "pieces" that will constitute the final assembled system can be acquired from many other organizations, which do not necessarily communicate or synchronize with each other. Moreover, the acquired elements are in general not developed as a consequence of a specific requirements specification; instead, they are retrieved from the market as pre-built elements.
* Andrea Polini’s PhD grant is supported by Ericsson Lab Italy in the framework of the
Pisatel initiative (http://www.iei.pi.cnr.it/ERI).
In this scenario we can distinguish at least two different stakeholders. The first is the component developer, who is engaged in the construction of the components that will be released to third parties. The second is the system constructor: himself/herself a software developer, who builds a system by assembling components, either internally developed or externally acquired. Certainly the CB paradigm mainly affects the process of the system constructor. In [3] we have outlined a possible iterative process that tries to address the new requirements of CB production; another process is shown in [4]. Within the process of the system constructor, we can note the presence of two new related phases, referred to as the searching phase and the selection phase.
The aim of the searching phase is to find one or more components that can be assembled into the system to address some specific functional requirements. This search is not an easy job and requires specific tools; in particular, one active research direction tries to identify what kind of information can be attached to the component by the component developer, and how, so as to automate the task as much as possible.
The aim of the selection phase is to choose the component, among those identified by the previous phase, that is the most suitable for the system under construction. Obviously, in the general case the searching phase cannot return a single component whose full conformity with the searched one is guaranteed; therefore a validation phase is necessary to evaluate the components and to select the most promising one.
In practice these two phases are particularly hard, since components are generally provided without source code (black-box); moreover, the attached documentation is often incomplete or limited to an explanation of how to use the provided API.
A direct consequence of this lack of information is what is generally referred to as the "component trust problem": the system constructor, who acquires a component from the component developer, needs some means to get a better understanding of what the component does and how it behaves. In this context, our perspective is to study how testing techniques can be used for the validation and selection of components, thus contributing to mitigating the component trust problem. In particular, we are studying how testing activities and tools must be revised and augmented to take into account the peculiarities of CB development.
So far we have used the term "component" in a completely general way. A reference definition for the concept of component is still debated, and there is not yet universal agreement in the literature. An often reported definition is the one provided in [5]: "A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties". With respect to this definition we take a more simplified view and, as in [6], we identify a component with a system or subsystem developed by one organization, deployed by one or more different organizations, and possibly provided without the source code. According to this definition, we also consider a class or a set of classes as a particular example of a component.
Regarding the technical aspects of the methodology, we require that the component model used foresee basic introspection mechanisms, because we need to retrieve at run-time information mostly referring to the component interfaces. As a consequence, we have adopted the Java language.
2 Related Work
As recognized in [7], several techniques can be usefully combined to reduce the component trust problem (formal specification, Design by Contract, testing and others), and further research is still necessary. Different approaches are under study to address the problem; in particular, some authors suggest adding supplementary information to the component, in the form of metadata, with the objective of increasing the analysis capability of the system constructor [6][8]. Another approach proposes to establish independent agencies that act as software certification laboratories. The main duty of these agencies would be the certification of components (as done in many other engineering disciplines) so as to increase the guarantee of adequate behavior [9].
Regarding testing, to our knowledge there is not much work addressing the problem; we list three different proposals. A first approach proposes to embed test cases in the component itself (Built-In Tests) in the form of externally visible methods [10]. A disadvantage of this approach is the size growth of components, which have to contain code specific to the testing phase. To overcome this problem, another approach introduced the concept of a testable architecture. This architecture foresees that components implement a particular interface for testing purposes, which permits the execution of pre-built tests without the need to include them in the component code [11]. Finally, in another approach [12], the authors propose to provide the component user with the test cases that the component has undergone, in the form of an XML file. In this case the authors also provide, attached to the component, a tool for the re-execution of the test cases.
It is worth noting that, differently from the listed approaches, we do not impose that a component implement any particular interface, so as to make the task less costly and more generally applicable. Moreover, our aim is to provide the component user with tools that simplify the execution of test cases developed on the basis of the specification of the searched components, while also permitting the re-execution of the test cases provided by the component developer.
The test phase performed by the component user is composed of two sub-phases: in the first phase a selected component is evaluated by directly invoking the provided API; in the second phase, the component is inserted into a subsystem and then integration-tested as an element of the subsystem. The last stage, system test, does not show notable differences with respect to the traditional one.
From the above discussion, the system constructor's process must involve at least two kinds of independent teams. The first is the searching team, which, on the basis of precise specifications, looks for components that, in its understanding, correspond to the searched ones. The second team, instead, on the basis of the specification, develops test cases that can be used to ascertain the conformance of the found components to the searched ones.
Probably the major costs induced in CB development can be ascribed to these two phases, and so any little improvement, thanks to suitable techniques and tools, can bring great benefits. In particular, it is worth noting that much of the work done by the two kinds of teams can in principle be carried out in parallel, and therefore it is important to adopt a methodology that permits the effective exploitation of this possibility.
The WCT wrapper has a fixed structure that is independent of the connected components. It can be removed, e.g., for performance purposes, when a final configuration is identified, and substituted with a static wrapper: clearly, the test wrapper can be very useful in developing the permanent wrapper, because it can be taken as a reference model, like a prototype.
Fig. 1. Schema of the WCT illustrating the constituent elements and the various connections.
The approach foresees connecting all the required interfaces of a candidate component to a "standard" component named the Redirector. As we will see, this consists of two parts: one depending on the candidate component, and a second that is fixed for each instance of the Redirector. As illustrated in Figure 1, the composition of a candidate component and of the Redirector forms a new component, named WCT, that provides to the external world the same services as the candidate component but does not statically require any particular service. Only at run-time, to realize the subsystem under test, is the binding between the various components established, as the result of the appropriate interpretation of an XML document. Therefore a subsystem under test, at each time, results from the composition of several WCTs. The configuration of the subsystem can also be modified at run-time, e.g., by inserting a candidate component in the place of a stub or by substituting a component with a more promising one.
To clarify the approach and how it works, it is useful to describe in detail the structure of the WCT and in particular of the Redirector. The duty of the Redirector is exactly to redirect (hence its name), at run-time, the invocations made by the candidate component towards other candidate components (or otherwise stubs) that opportunely provide the required services. To put the redirection in place, it uses three main elements (see also Fig. 1):
1. the Listener class
2. the Interpreter class
3. an XML file named "XMLAdapter"
In the following, we describe these elements in detail. We use associate component to indicate the component that is contained in the same WCT as the Redirector, and attached components to indicate the components that provide the required services to the associate component.
The Listener Class. The main purpose of the Listener class is to isolate the associate component from the attached components. In other words, this class acts as a proxy, postponing to a subsequent moment the instantiation of one or more real attached components, which will be able to manage the invocations made by the associate component. To do this, the Listener class has to implement all the interfaces that the associate component requires, but instead of giving a real implementation for each method, it delegates this task to the Interpreter class (as we will see, the latter, in turn, does not implement the method either, but "knows who" can opportunely serve the invocation). The implementation of the Listener class depends on the specific candidate component, since it has to implement the specific interfaces, but the implementation can be totally automated with the support of suitable tools1. In fact, the scheme of the methods is completely fixed, and for each method the only duty is to redirect the invocation to the Interpreter class, packaging the parameters in a vector of objects. In the following piece of Java code, we show a scheme for a Listener class that implements two interfaces A and B, each of which requires the implementation of one method, respectively named "a" and "b".
public class Listener implements A, B {
    private Interpreter interpreter;
    public Listener(Interpreter interpreter) {
        this.interpreter = interpreter;
    }
    // Each method only packages its parameters into an Object array and
    // delegates the invocation to the Interpreter (basic-type parameters
    // are boxed in wrapper objects, see footnote 2).
    public t1 a(tA1 p1, …, tAn pn) {
        Object[] parameters = new Object[] {p1, …, pn};
        return (t1) interpreter.execMethod("a", parameters);
    }
    public t2 b(tB1 p1, …, tBm pm) {
        Object[] parameters = new Object[] {p1, …, pm};
        return (t2) interpreter.execMethod("b", parameters);
    }
}
The Interpreter Class. If the duty of the Listener classes is to "deceive" the candidate component by simulating the presence of the attached components, the main duty of the Interpreter class is to redirect, at run-time, the invocations towards an implementation that can really give suitable answers. The redirection is based on information retrieved from the XMLAdapter, which contains the rules governing the redirection (in the next section we explain in greater detail how to draw up this file).
To identify the method, or methods, that must be invoked as a consequence of a request made by the associate component, the instance of the Interpreter class uses the introspection mechanisms provided by the component model, trying to retrieve information from the attached components. In particular, to apply the model we exploit the Java introspection mechanism, which permits retrieving information on all the public methods and invoking, on the basis of this information, a selected method. The use of reflection permits the easy reconfiguration of the system, allowing for the introduction of new candidate components taking the place of a stub, or the substitution of a candidate component with a more promising one.
The Interpreter class presents two main public methods. The first method is invoked by the controller of the testing process, which provides the name of the XMLAdapter. Once the name is obtained, the method reacts by parsing the file (to do this the Interpreter contains an instance of a suitable XML parser) and storing the retrieved information in appropriate data structures. Through the invocation of this method, the controller of the testing process can reconfigure the subsystem under test.
The second method, instead, is invoked by the associated Listener and is responsible for redirecting the invocations made by the associate component to the opportune method (or methods) of the attached components. To perform this task, the method uses the information stored in the data structures by the XML parser, and applies the reflection mechanisms to the attached components.
1 It is worth noting that, in the specific case of a clash in one or more of the method names enclosed in the required interfaces, it will be necessary to use more than one Listener class.
2 In Java we also need to use a wrapper type when the parameter is of a basic type.
We think that this method, which has control over the method invocations of the associate component, can be augmented to perform other useful tasks. A first task that comes to mind is the recording of method invocations, to keep track of each test case execution. This tracing facility may be particularly useful when we consider the replacement of a stub with a real component, or the substitution of a candidate component with a more promising one. In fact, having recorded the test cases that stimulate a particular method, in the case of a replacement we can re-execute only the test sequences exercising the methods affected by the substitution.
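As an illustration only, the redirection core of such an Interpreter could be sketched with Java reflection as follows; the Redirection rule class, the rules map filled by the XML parser, and the trace list are hypothetical names, since the paper does not show this code.

import java.lang.reflect.Method;
import java.util.*;

// Hypothetical sketch of the Interpreter's redirection method.
public class Interpreter {
    // Filled by the XML parser from the XMLAdapter: client method name -> rule.
    private final Map<String, Redirection> rules = new HashMap<String, Redirection>();
    // Optional tracing of invocations, usable for regression test selection.
    private final List<String> trace = new ArrayList<String>();

    static class Redirection {
        Object target;     // attached component (or stub) instance
        String methodName; // method to invoke on the target
    }

    public Object execMethod(String name, Object[] parameters) throws Exception {
        trace.add(name); // record the invocation for this test case
        Redirection r = rules.get(name);
        // Reflection: look up a public method with the mapped name and arity,
        // then invoke it on the attached component with the packaged parameters.
        for (Method m : r.target.getClass().getMethods()) {
            if (m.getName().equals(r.methodName)
                    && m.getParameterTypes().length == parameters.length) {
                return m.invoke(r.target, parameters);
            }
        }
        throw new NoSuchMethodException(name);
    }
}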
The XMLAdapter file. The aim of the XMLAdapter is to provide a means by which the searching teams can explicitly formulate the correspondences between a client component (the component that needs a service) and a server component (the component that provides the service). Several levels of mismatch can exist between a searched component and a found one. These are immediate consequences of the selection process. In fact, we suppose that the selection of a candidate component has to follow some "semantic" principles, in the sense that the choice is mainly based on an understanding of what a component does, an understanding that the searching teams must derive from the documentation associated with the component. This "choice" obviously implies the necessity of suitable wrappers to actually permit the "syntactic" interaction among components. Our approach is to codify the rules that establish the correspondence in the XMLAdapter, an XML file with definite tags, that can be parsed by the Interpreter to redirect the invocations made by the associated component. We have identified several levels of mismatch between the client and the server components that can be overcome with the use of the XMLAdapter:
1. differences in the method names and signatures:
a. the methods have different names
b. the methods have the same number and types of parameters, but they are declared in a different order
c. the parameters have different types, but we can make them compatible through suitable transformations; it can also be necessary to set some default parameters.
2. one method in the client component corresponds to the execution of more than
one method, in one or more server components.
Regarding the structure of the XMLAdapter, it can be divided into two parts. The first part specifies the component instances (which can also be remote) that must be used to manage the invocations of the associated component. In the second part, for each invocation of the associated component, the corresponding sequence of methods and transformations that must be invoked on the attached components is specified.
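The paper does not show the concrete syntax of the XMLAdapter, but given the two-part structure just described, a file of this kind could plausibly look as follows; every tag and attribute name here is invented for illustration.

<!-- Hypothetical XMLAdapter; tag names are illustrative, not the authors' DTD. -->
<xmladapter>
  <!-- Part 1: the component instances (possibly remote) serving the invocations. -->
  <instances>
    <instance id="srv" class="FoundComponentImpl" location="local"/>
  </instances>
  <!-- Part 2: for each client invocation, the target methods and transformations. -->
  <redirections>
    <redirect client-method="storeItem">
      <invoke instance="srv" method="insert">
        <param position="2"/>                      <!-- parameters in a different order -->
        <param position="1" transform="toString"/> <!-- type made compatible -->
        <param default="1"/>                       <!-- default parameter -->
      </invoke>
    </redirect>
  </redirections>
</xmladapter>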
Fig. 2. Collaboration diagram illustrating the interaction between the WCT elements and two attached components. In particular, to provide service f the associated component requires two services, s1 and s2. To provide these services the Interpreter instance opportunely transforms the invocations of s1 and s2 into sequences of invocations of the services provided by the two attached components Comp1 and Comp2.
In this section we explain how the WCT component can be used in CB development. In particular, we outline the scheme of a general platform for component deployment testing, within which the WCT component can be employed for assembling and testing the components.
The general structure of the platform is shown in Figure 3. The main goal of this platform is to provide the capability to exploit the potential parallelism in the activities of the searching and testing teams. In fact, following the approach proposed in [14], we suppose that the test cases are defined by the system constructor, before the component is acquired, on the basis of a virtual component specification. The structure of this virtual component can be derived from the system specifications, which are also used as a reference by the searching teams.
For the purposes of presentation, we have spoken so far of the acquisition of a single component. Indeed, when we have to assemble a "piece of software" within a CB system, in our approach there is no difference between a "monolithic" component and a "composite" component, made of several opportunely connected components. Since a subsystem can be externally viewed as a single component, the testing teams can use the same platform to codify the test cases for exercising either a component or a composite subsystem (such as the gray box in Figure 3). To do that, the subsystem under test is itself viewed as a component with a specified virtual interface that expresses the functionality that a real instance of the subsystem has to provide. On the basis of the virtual interface, the test cases are codified and stored in a suitable repository, to be used later when the subsystem is instantiated. It follows that the derivation of the test cases and the searching activity can largely proceed in parallel, since the codification of the test cases in this way does not depend on any particular implementation.
As we can see, the platform shown in Figure 3 is composed of three main elements. The first is the Tester Component, widely described in [14], which permits the application of the test cases established by the testing teams to the subsystem. The second element of the approach is the Controller, a distributed interface that permits control of the testing process, in particular giving the testing teams a means to add developed test cases, and the searching teams a means to modify the subsystem structure. The use of the WCT components as constituent elements of a subsystem under test (which constitute the third element in the figure) is particularly useful at this stage. In fact, each time a searching team identifies a potential candidate component, to be able to insert the latter into the subsystem it is only necessary to modify the XMLAdapter associated with those components that need to invoke the newly inserted one. However, when the introduction/removal/substitution of a component also has an effect on the external interface of the subsystem, it is necessary that the set of test cases used by the Tester Component be modified accordingly.
The WCT can also be fruitfully employed to reduce the number of test cases to be re-executed when a new component is introduced. To do that, the Interpreter in the WCT can keep track of the methods that are invoked by the associated component during the execution of a test case, and communicate them to the Controller. This information can be opportunely stored by the Controller and then used to establish the set of regression test cases when a new component is inserted and an invocation is redirected.
Concluding, in this paper we have briefly revisited the testing process in CB development and we have highlighted how the testing activity can be particularly useful in the component selection phase. We have then presented the notion of a test wrapper that can be usefully employed by a system constructor to test sets of integrated components (subsystems) within his/her environment.
Fig. 3. The overall logical structure of a possible subsystem under test composed of three components. The figure shows the Tester Component and the Controller, which constitute the interface towards the testing and searching teams.
Favorable features of the proposed approach are that: it does not require that the component implement any specific interface for testing purposes; it permits an easy and fast reconfiguration of a subsystem after the introduction/removal/substitution of a component within the subsystem; and finally, it is possible to introduce into the WCT useful features for regression testing purposes: in particular, we have outlined how it is possible to reduce the number of test cases to re-execute at each reconfiguration.
The WCT wrapper is part of a long-term research project addressing CB testing. Our aim is to employ the WCT component within a more general platform for CB testing currently under development.
In the near future we will work on the implementation of the mentioned testing platform. In particular, we intend to formalize as much as possible the drawing up of the XMLAdapter, by means of suitable graphical interfaces that partially automate the process. It is also our objective to reuse as much as possible existing tools, for instance JUnit [15], a framework developed for the early testing of OO code that can be partially revisited in the CB testing field. Lastly, we also plan to validate the approach within industrial CB production using real case studies.
References
1. McIlroy, D.: Mass Produced Software Components. In P. Naur and B. Randell, Eds., Software Eng.: Report on a Conf. by the NATO Science Committee, pp. 138–155, Brussels, 1968.
2. Crnkovic, I.: Component-based Software Engineering – New Challenges in Software Development. Software Focus, John Wiley & Sons Eds, December 2001
3. Bertolino, A., Polini, A.: Re-thinking the Development Process of Component-Based
Software. ECBS02 Workshop on CBSE, Composing Systems From Components, April
10–11, 2002, Lund, Sweden.
4. Crnkovic, I.: Component-based Software Engineering – New Paradigm of Software De-
velopment. Invited talk & invited report, Proc. MIPRO 2001, Opatija, Croatia, May 2001.
5. Szyperski, C.: Component Software: Beyond Object-Oriented Programming, Addison-
Wesley, 1998.
6. Orso, A., Harrold, M.J., Rosenblum, D.: Component Metadata for Software Engineering
Tasks. In W. Emmerich and S. Tai Eds. EDO2000, LNCS 1999, pp.129–144.
7. The Trusted Component Initiative: http://trusted-components.org, Access date 2002-09-21
8. Stafford, J.A., Wolf, A.L.: Annotating Components to Support Component-Based Static
Analyses of Software Systems. In Proceedings of the Grace Hopper Celeb. of Women in
Computing 2001.
9. Voas, J.: Developing a Usage-Based Software Certification Process. IEEE Computer,
August 2000, pp. 32–37.
10. Wang, Y., King, G., Wickburg, H.: A Method for Built-in Tests in Component-based
Software Maintenance. In Proceedings of the 3rd ECSMR, 1999.
11. Gao, J., Gupta, K., Gupta, S., Shim, S.: On Building Testable Software Components. In J.
Dean, and A. Gravel Eds, Proceedings of ICCBSS 2002, LNCS 2255, pp.108–121.
12. Morris, J., Lee, G., Parker, K., Bundell, G.A., Lam, C.P.: Software Component Certifica-
tion. IEEE Computer, September 2001, pp.30–36.
13. Weyuker, E.: Testing Component-Based Software: A Cautionary Tale. IEEE Software,
Sept./Oct. 1998, pp. 54–59.
14. Bertolino, A., Polini, A.: A Framework for Component Deployment Testing. Proceedings
of the ACM/IEEE International Conference on Software Engineering ICSE 2003 (to ap-
pear), Portland, USA, May 3–10, 2003
15. JUnit: http://www.junit.org.
A Java Middleware for Guaranteeing Privacy of
Distributed Tuple Spaces
Abstract. The tuple space communication model, such as the one used
in Linda, provides great flexibility for modeling concurrent, distributed
and mobile processes. In a distributed setting with mobile agents, par-
ticular attention is needed for protecting sites and information. We have
designed and developed a Java middleware, Klava, for implementing
distributed tuple spaces and operations to support agent interaction and
mobility. In this paper, we extend the Klava middleware with crypto-
graphic primitives that enable encryption and decryption of tuple fields.
We describe the actual implementation of the new primitives and provide
a few examples. The proposed extension is general enough to be applied
to similar Java frameworks using multiple distributed tuple spaces, possibly dealing with mobility.
1 Introduction
A successful approach to concurrent programming is the one relying on the Linda coordination model [11]. Processes communicate by reading and writing tuples in a shared memory called the tuple space. Access control is guaranteed by requiring that tuple selection be associative, by means of pattern matching. The communication model is asynchronous, anonymous, and generative, i.e., a tuple's lifetime is independent of its producer's lifetime.
The Linda model has been adopted in many communication frameworks such
as, e.g., JavaSpaces [1] and T Spaces [10], and for adding the tuple space commu-
nication model to existing programming languages. More recently, distributed
variants of tuple spaces have been proposed to exploit the Linda model for
programming distributed applications over wide area networks [7,2], possibly
exploiting code mobility [8,12]. As shown in [9], where several messaging models
for mobile agents are examined, the blackboard approach, of which the tuple
space model is a variant, is one of the most favorable and flexible.
Sharing data over a wide area network such as the Internet calls for very strong security mechanisms. Computers and data are exposed to eavesdropping and manipulation. Dealing with these issues is even more important in the context of code mobility, where code or agents can be moved over the different sites of a net. Malicious agents could seriously damage hosts and compromise their integrity, and may tamper with and brainwash other agents. On the other hand, malicious hosts may extract sensitive data from agents, change their execution or modify their text [17,13].
This work has been partially supported by the EU within the FET – Global Computing initiative project MIKADO IST-2001-32222, by MIUR project NAPOLI and by Microsoft project NAPI. The funding bodies are not responsible for any use that might be made of the results presented here.
The flexibility of the shared tuple space model opens possible security holes; the model basically provides no access protection to the shared data. Indeed, there is no way to determine the issuer of an operation on the tuple space and there is no way to protect data: a process may (even unintentionally) retrieve/erase data that do not belong to it, and shared data can be easily modified and corrupted. In spite of this, within Linda-based approaches very little attention has been devoted to protection and access control.
In this paper we present a Java middleware for building distributed and mobile code applications interacting through tuple spaces, protected by means of cryptography. In this middleware, the classical Linda operations are extended for handling encrypted data. Primitives are also supplied for encrypting and decrypting tuple contents. This finer granularity allows mobile agents (which are not supposed to carry private keys with them when migrating) to collect encrypted data while executing on remote sites, and to decrypt them safely once back at the home site. The proposed extension, while targeted to Klava [4], our middleware for mobile agents interacting through distributed tuple spaces, is still general enough to be applied to similar Java frameworks using multiple distributed tuple spaces, possibly dealing with mobility, such as, e.g., [12,1,7]. Indeed, this extension represents a compromise between the flexibility and open nature of Linda and of mobile code, and the privacy of data in a distributed context.
The Linda communication model [11] is based on the notion of tuple space, which is a multiset of tuples. These are just sequences of items, called fields, which are of two kinds: actual fields, i.e., values and identifiers, and formal fields, i.e., variables. Syntactically, a formal field is denoted with !ide, where ide is an identifier. Tuples can be inserted into a tuple space with the operation out and retrieved from a tuple space with the operations in and read (read does not withdraw the tuple from the tuple space). If no matching tuple is found, both in and read block the process that executes them until a matching tuple becomes available. Pattern matching is used to select tuples from the tuple space; two tuples match if they have the same number of fields and the corresponding fields match: a formal field matches any value of the same type, and two actual fields match only if they are identical (but two formals never match). For instance, if Val is an integer variable, then the tuples ("foo", "bar", !Val) and ("foo", "bar", 300) match. After matching, the variable of a formal field gets the value of the matched field; in the previous example, after matching, Val will contain the integer value 300.
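As a side illustration of these matching rules (and not part of any of the systems discussed here), the core of Linda matching can be sketched in Java as follows, with invented Field, Formal and Actual types.

// Toy sketch of Linda pattern matching; all type names are invented.
abstract class Field { }
class Formal extends Field { Class<?> type; Formal(Class<?> t) { type = t; } }
class Actual extends Field { Object value; Actual(Object v) { value = v; } }

class Matcher {
    // Two tuples match if they have the same number of fields
    // and every pair of corresponding fields matches.
    static boolean match(Field[] template, Field[] tuple) {
        if (template.length != tuple.length) return false;
        for (int i = 0; i < template.length; i++)
            if (!fieldsMatch(template[i], tuple[i])) return false;
        return true;
    }
    static boolean fieldsMatch(Field a, Field b) {
        if (a instanceof Formal && b instanceof Actual)   // a formal matches any
            return ((Formal) a).type.isInstance(((Actual) b).value); // value of its type
        if (a instanceof Actual && b instanceof Actual)   // actuals must be identical
            return ((Actual) a).value.equals(((Actual) b).value);
        return false;                                     // two formals never match
    }
}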
Thus the programmer is relieved of the burden of executing all these internal tasks, and when a readk or an ink operation succeeds it is guaranteed that the retrieved tuple has been correctly decrypted. Basically, the original Linda pattern matching mechanism is not modified: encrypted fields are seen as ordinary fields of type KCipher (as shown in Section 3). The mechanism can be seen as an extended pattern matching that, after the structural matching, also attempts to decrypt the encrypted fields.
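In pseudo-code, this two-stage mechanism could be rendered as follows; structurallyMatches, decode, and the exception-based failure handling are our assumptions, not the actual Klava internals.

// Hypothetical sketch of the extended matching behind readk/ink:
// structural Linda matching first, then decryption of KCipher fields.
boolean matchAndDecrypt(Tuplex template, Tuplex candidate, java.security.Key key) {
    if (!template.structurallyMatches(candidate)) // KCipher treated as an ordinary field type
        return false;
    try {
        candidate.decode(key); // attempt to decrypt every KCipher field
        return true;           // the correctly decrypted tuple can be returned
    } catch (Exception wrongKeyOrCorruptData) {
        return false;          // decryption failed: treat as a non-matching tuple
    }
}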
When mobile code is used, the above approach may be unsafe. Indeed, symmetric and asymmetric key encryption techniques rely on the secrecy of the key (in asymmetric encryption the private key must be kept secret). Thus, a fundamental requirement is that mobile code and mobile agents must not carry private keys when migrating to a remote site ("Software agents have no hopes of keeping cryptographic keys secret in a realistic, efficient setting" [17]). This implies that the operations ink and readk introduced above cannot be used by a mobile agent executing on a remote site, because they would require carrying a key for decryption.
For mobile agents it is then necessary to supply a finer-grain retrieval mechanism. For this reason we also introduced operations for the explicit decryption of tuples: a tuple containing encrypted fields will be retrieved by a mobile agent by means of the standard in and read operations, and no automatic decryption will be attempted. The actual decryption of the retrieved tuples can take place when the agent is executing at the home site, where the key for decryption is available and can be safely used. Typically, a mobile agent system consists of stationary agents, which do not migrate, and mobile agents that visit other sites in the network and, upon arrival at the home site, can communicate with the stationary agents.
Thus the basic idea is that mobile agents collect encrypted data at remote sites and communicate these data to the stationary agents, which can safely decrypt their contents. Obviously, if some data are retrieved by mistake, it is up to the agents to put them back on the site from which they were withdrawn. This restriction of the protocol for fetching tuples is necessary if one wants to avoid running the risk of leaking private keys. On the contrary, public keys can be safely transported and communicated. By using public keys, mobile agents are able to encrypt the data collected along their itinerary.
Notice that there is no guarantee that a "wrong" tuple is put back: our framework addresses privacy, not security, i.e., even if data can be stolen, they still cannot be read. Should this not be acceptable, one should resort to a secure channel-based communication model, and give up the Linda shared tuple space model. Indeed, the functionality of our framework is similar to that provided, e.g., by PGP [18], which does not prevent e-mails from being eavesdropped and stolen, but their contents remain private since they are unreadable for those who do not own the right decryption key.
An alternative approach could be that of physically removing an encrypted tuple, retrieved with an in, only when the home site of the agent that performed the in notifies that the decryption has taken place successfully. Such a tuple would be restored if the decryption is acknowledged to have failed, or after a specific timeout expires. However, this approach makes a tuple's lifetime dependent on that of a mobile agent, which, by its own nature, is independent and autonomous: agents would be expected to accomplish their task within a specific amount of time. Moreover, inconsistencies could arise in case successful decryption acknowledgments arrive after the timeout has expired.
3 Implementation
Klava [4] is deployed as an extensible Java package, Klava, that defines the classes and the run-time system for developing distributed and mobile code applications according to the programming model of Klaim. In Klava, processes are instances of subclasses of the class KlavaProcess and can use methods for accessing the tuple space of a node: out(t,l), for inserting the tuple t into the tuple space of the node at locality l; read(t,l) and in(t,l), for, respectively, reading and withdrawing a tuple matching t from the tuple space of the node at locality l. Moreover, the method eval(P,l) can be used for spawning a KlavaProcess P for remote execution on site l. Some wrapper classes are supplied for tuple fields, such as KString, KInteger, etc.
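For readers unfamiliar with Klava, a trivial process using these operations might look like the following sketch; the execute entry point, the Tuple.add method and the PhysicalLocality construction are assumptions made by analogy with the code fragments shown later, not verified Klava API.

// Illustrative sketch only; method names partly assumed.
public class HelloProcess extends KlavaProcess {
    public void execute() {
        Locality l = new PhysicalLocality();  // hypothetical target node
        Tuple t = new Tuple();
        t.add(new KString("greeting"));
        t.add(new KString("hello"));
        out(t, l);                            // insert the tuple at locality l
        KString msg = new KString();          // formal field (default constructor)
        Tuple template = new Tuple();
        template.add(new KString("greeting"));
        template.add(msg);
        in(template, l);                      // blocks until a matching tuple is withdrawn
    }
}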
The extension of this package, CryptoKlava, provides the cryptographic features described in the previous section. We have used the Java Cryptography Extension (JCE) [14], a set of packages that provide a framework and implementations for encryption, key generation and key agreement, and Message Authentication Code (MAC) algorithms. JCE defines a set of standard APIs, so that different cryptography algorithms can be plugged into a system or an application without modifying the existing code. Keys and certificates can be safely stored in a keystore, an encrypted archive.
CryptoKlava is implemented as a subpackage of the package Klava, namely Klava.crypto, so that it is self-contained and does not affect the main package. In the rest of this section we describe the main classes of the package Klava.crypto, implementing the cryptographic features.
The class KCipher is introduced in order to handle formal and actual fields containing encrypted data (following the Klava convention that wrapper classes for tuple items start with a K). Basically, it can be seen as a wrapper for standard Klava tuple fields. This class includes the following fields:
protected byte[] encItem; // encrypted data
protected Object ref;     // reference to the real tuple item
protected String alg;     // encryption/decryption algorithm type
The reference ref will be null when the field is a formal field, or when the field has not yet been decrypted. After retrieving a matching tuple, encItem will contain the encrypted data (which are always stored and manipulated as an array of bytes). After the decryption, ref will refer to the decrypted data. Conversely, upon creation of an actual field, ref will contain the data to be encrypted; after encryption, encItem will contain the encrypted data, while ref will be set to null (so that the garbage collector can eventually erase the clear data from memory as well). alg stores information about the algorithm used for encryption and decryption.
An actual encrypted tuple field can be created by first creating a standard Klava tuple field (in the example a string) and then passing that field to an instance of class KCipher:
KString s = new KString("foo");
KCipher ks = new KCipher(s);
Similarly, the following code creates an encrypted string formal tuple field (in Klava a formal field is created by instantiating an object of a Klava class for tuple fields – such as KString, KInteger, etc. – through the default constructor):
KString s = new KString();
KCipher ks = new KCipher(s);
KCipher supplies the methods enc and dec for, respectively, encrypting and decrypting the data represented by the tuple field. These methods receive as parameter the Key that has to be used for encryption or decryption, and enc also accepts the specification of the algorithm. These methods can be invoked only by the classes of the package.
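Based on this description, enc and dec might plausibly be built on the JCE Cipher API as in the following skeleton; the visibility, serialization helpers and exact signatures are our guesses, not the CryptoKlava sources.

import java.io.*;
import java.security.Key;
import javax.crypto.Cipher;

// Hypothetical skeleton of KCipher's encryption/decryption.
public class KCipher {
    protected byte[] encItem; // encrypted data
    protected Object ref;     // reference to the real tuple item
    protected String alg;     // encryption/decryption algorithm type

    void enc(Key key, String algorithm) throws Exception {
        Cipher cipher = Cipher.getInstance(algorithm);
        cipher.init(Cipher.ENCRYPT_MODE, key);
        encItem = cipher.doFinal(serialize(ref)); // encrypt the serialized item
        alg = algorithm;
        ref = null; // drop the clear data so the garbage collector can erase it
    }

    void dec(Key key) throws Exception {
        Cipher cipher = Cipher.getInstance(alg);
        cipher.init(Cipher.DECRYPT_MODE, key);
        ref = deserialize(cipher.doFinal(encItem)); // restore the real tuple item
    }

    private byte[] serialize(Object o) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(o);
        return bos.toByteArray();
    }
    private Object deserialize(byte[] b) throws Exception {
        return new ObjectInputStream(new ByteArrayInputStream(b)).readObject();
    }
}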
The class Tuplex extends the standard Klava class Tuple in order to contain fields of class KCipher besides standard tuple fields; apart from providing methods for the cryptographic primitives, it also serves as a first filter during matching: it prevents ordinary tuples (containing only clear-text data) from being matched with encrypted tuples. Once tuple fields are inserted into a Tuplex object, the KCipher fields can be encrypted by means of the method encode. For instance, the following code
KString ps = new KString("clear");
KCipher ks = new KCipher(new KString("secret"));
Tuplex t = new Tuplex();
t.add(ps); t.add(ks);
t.encode();
creates a tuple where the first field is a clear-text string and the second is a field to be encrypted, and then actually encrypts the KCipher field by calling encode. encode can also receive parameters specifying the key and the algorithm for the encryption; otherwise the default values are used. encode basically calls the previously described method enc on every KCipher tuple field, thus ensuring that all encrypted fields within a tuple rely on the same key and algorithm.
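A plausible rendering of encode, consistent with this description but not taken from the CryptoKlava sources, simply iterates over the tuple's fields; the length and getItem accessors are assumed.

// Hypothetical sketch of Tuplex.encode.
public void encode(java.security.Key key, String algorithm) throws Exception {
    for (int i = 0; i < length(); i++) {       // assumed tuple accessors
        Object field = getItem(i);
        if (field instanceof KCipher)
            ((KCipher) field).enc(key, algorithm); // same key/algorithm for every field
    }
}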
As for the retrieval operation, this can be performed either with the newly introduced operations ink and readk, if they are executed on the local site
Notice that clients that want to communicate privately must have agreed on the specific key to be used during the private message exchange; this is definitely the case with symmetric keys. As for public and private key encryption, the receiver can simply use its private key to decrypt a message encrypted with its own public key.
A private message is represented by a tuple of the shape ("PERSONAL", <body>, <recipient>, <sender>), where <recipient> and <sender> are, respectively, the locality of the client the message is destined to and the locality of the issuer of the message. Basically, when a client wants to send a message with an encrypted body, it has to perform the following steps:
Tuplex t = new Tuplex();
KCipher cryptMessage = new KCipher(message);
t.add(new KString("PERSONAL"));
t.add(cryptMessage);
t.add(selectedUser);
t.add(self);
t.encode();
out(t, server);
where message is the actual message body.
The server handles encrypted messages by retrieving them through the following actions (it will deliver the tuple without the field <recipient>, which is useless at this point):
KString message = new KString();
KCipher cryptMessage = new KCipher(message);
Locality to = new PhysicalLocality();
Locality from = new PhysicalLocality();
On the other hand, the receiver, which is always waiting for incoming messages, will read and decrypt a message (in one atomic step) by means of the operation ink:
KString message = new KString();
KCipher cryptMessage = new KCipher(message);
KString from = new KString();
Tuplex t = new Tuplex();
t.add(new KString("PERSONAL"));
t.add(cryptMessage);
t.add(from);
ink(t, self);
Print("Received message: " + message);
Both the server and the clients execute these operations within the loop for
handling incoming messages.
Since tuple space operations can be used both by local processes and by mobile agents, the extended operations presented in this paper address the privacy of both hosts and mobile agents. We did not deal explicitly with key distribution, which can be seen as an orthogonal problem. Digital signatures can be smoothly integrated into our framework and the pattern matching extended accordingly.
The work that is closest to ours is [5], which introduces the Secure Object Space (SecOS) model. This model is intended to extend Linda with fine-grained access control semantics. In SecOS all tuple fields are locked with a key, and each field must be locked with a different key. The basic idea is that a process, upon retrieving a tuple, can see only the fields for which it owns the corresponding key. The structure of a tuple does not influence pattern matching: due to an introduced subsumption rule, a template can also match a bigger tuple, and fields can be reordered during the matching. [6] proposes a similar, but richer, framework, SecSpaces, where resource access control and tuple space partitioning facilities are also provided (orthogonal and complementary to our approach).
All these features tend to alter the original Linda model, while our principal aim is to provide an extension of the Linda communication model that can be smoothly integrated into the existing features, without significantly changing the original model. Moreover, neither SecOS nor SecSpaces handles code mobility, which is one of our main concerns.
Mobility imposes additional restrictions on the underlying model, e.g., requiring that agents do not carry private keys during migrations, and calls for alternatives such as explicit encryption and decryption mechanisms and a two-stage pattern matching. Indeed, the problem of protecting an agent against a malicious host is even more complicated than that of protecting a host from a malicious agent (we refer to the papers in [15,16]).
References
1. K. Arnold, E. Freeman, and S. Hupfer. JavaSpaces Principles, Patterns and Prac-
tice. Addison-Wesley, 1999.
2. K. Arnold, B. O’Sullivan, R. Scheifler, J. Waldo, and A. Wollrath. The Jini Spec-
ification. Addison-Wesley, 1999.
3. L. Bettini. Linguistic Constructs for Object-Oriented Mobile Code Programming &
their Implementations. PhD thesis, Dip. di Matematica, Università di Siena, 2003.
forthcoming.
4. L. Bettini, R. De Nicola, and R. Pugliese. Klava: a Java package for distributed
and mobile applications. Software – Practice and Experience, 32(14):1365–1394,
2002.
5. C. Bryce, M. Oriol, and J. Vitek. A Coordination Model for Agents Based on Secure
Spaces. In P. Ciancarini and A. Wolf, editors, Proc. 3rd Int. Conf. on Coordination
Models and Languages, number 1594 in LNCS, pages 4–20. Springer-Verlag, 1999.
6. N. Busi, R. Gorrieri, R. Lucchi, and G. Zavattaro. SecSpaces: a Data-driven
Coordination Model for Environments Open to Untrusted Agents. In Proc. of
FOCLASA’02, ENTCS. Elsevier, 2002.
7. P. Ciancarini and D. Rossi. Jada - Coordination and Communication for Java
Agents. In J. Vitek and C. Tschudin, editors, Mobile Object Systems - Towards
the Programmable Internet, number 1222 in LNCS, pages 213–228. Springer, 1997.
8. R. De Nicola, G. Ferrari, and R. Pugliese. Klaim: a Kernel Language for Agents
Interaction and Mobility. IEEE Transactions on Software Engineering, 24(5):315–
330, 1998.
9. D. Deugo. Choosing a Mobile Agent Messaging Model. In Proc. of ISADS 2001,
pages 278–286. IEEE, 2001.
10. D. Ford, T. Lehman, S. McLaughry, and P. Wyckoff. T Spaces. IBM Systems
Journal, pages 454–474, August 1998.
11. D. Gelernter. Generative Communication in Linda. ACM Transactions on Pro-
gramming Languages and Systems, 7(1):80–112, 1985.
12. G. Picco, A. Murphy, and G.-C. Roman. Lime: Linda Meets Mobility. In D. Garlan,
editor, Proc. ICSE’99, pages 368–377. ACM Press, 1999.
13. T. Sander and C. Tschudin. Protecting Mobile Agents Against Malicious Hosts.
In Vigna [15].
14. Sun Microsystems. Java Cryptography Extension (JCE), Reference Guide, 2001.
15. G. Vigna, editor. Mobile Agents and Security. Number 1419 in LNCS. Springer,
1998.
16. J. Vitek and C. Jensen, editors. Secure Internet Programming: Security Issues for
Mobile and Distributed Objects, number 1603 in LNCS. Springer-Verlag, 1999.
17. B. Yee. A Sanctuary For Mobile Agents. In Vitek and Jensen [16], pages 261–273.
18. P. Zimmermann. The Official PGP User’s Guide. MIT Press, 1995.
Designing Fault-Tolerant Mobile Systems
1 Introduction
Since the early 90s the community working on mobile agent systems has been looking for a killer application to prove the usefulness of the concepts. Today, due to the fast development of the Internet, Grid computing and e-commerce, and due to the tremendous growth of the size of distributed systems and their proliferation into many new application areas, mobility is becoming a practical issue. Many companies are now building complex applications with some elements of mobility.
In this paper we will focus on code mobility [7], which is mainly used for
providing a wide range of non-functional benefits, including on-line system cus-
tomisation and upgrading, improvement of system performance and dynamic
There is a need for much more flexible and dynamic fault-tolerance techniques that are light in both code and communication exchanges. One of the important issues here is to involve the application programmers who develop such mobile applications in providing fault tolerance. We believe that it is inefficient to focus only on fault tolerance that the underlying middleware can provide transparently to the application. Application-specific fault tolerance incorporated at the application level is clearly a much more powerful approach to dealing with faults and abnormal situations of all possible types. In this paper we propose different techniques, not all necessarily meant for fault tolerance, and show how they could be applied at this level to improve the overall dependability of mobile applications.
1 Section 2.2 outlines major application domains in which mobile code is used.
Network Management. The use of customised mobile agents for network management increases decentralisation and flexibility. Indeed, agents pro-actively carry out administration tasks, and can be dynamically replaced, allowing, for instance, dynamic updates of network policies. In addition, since mobile agents are located in network devices, they help reduce traffic around the management station, and make it possible to distribute the processing load [1].
Remote Device Control and Configuration. Mobile agents are useful for performing monitoring tasks, such as information filtering, control functions and intrusion detection. The advantage of using mobile agents in this domain is that the policies and itineraries of mobile agents can be modified dynamically [23] and can respond to an intrusion in real time [6].
Wireless Applications. Wireless networks suffer from low bandwidth and disconnection errors. Mobile agents help to overcome these limitations, since a mobile agent roaming the connected network can still work on behalf of a mobile user, even if that user is disconnected. Dispatching a mobile agent close to a server reduces data transfer between information servers and the wireless device [14].
Active Documents. Moving data with mobile code leads to the notion of active documents. Early forms consisted of interactive and animated web pages provided by applets running on the client side. Recent work enables complex documents, such as meeting schedules containing mobile code, to mediate communication among the participants, including notification of participants and observation of workflow rules [8].
Grid Computing and Global Computing. Mobile agents are particularly useful for efficiently monitoring distributed computational resources in Grid systems, since they help in coping with the Grid's large-scale size and in discovering data [22]. Mobile code has proved to be very useful for exploiting idle computing resources in a Grid system, for distributing data and computational tasks, for collecting computed results, as well as for charging and billing participants [2].
There is a wide range of abnormal situations which mobile applications can face, including: migration request failures, unavailability of resources at the new location, security restrictions, communication delays, failures of components in the local (host) environment, users' mistakes, agent programmers' mistakes, failures of the subsystems of a complex mobile application to provide the required services, node crashes, network partitioning, and differences between the host environments on which mobile code is executed. These abnormal situations are mainly caused by:
These points show that traditional techniques for fault tolerance in distributed systems are not directly applicable for handling abnormal situations of different types at the agent level. First of all, traditional techniques are mainly oriented towards tolerating hardware faults (e.g. by using replication, group communication, transactions), but mobile systems typically have to face faults of a much wider range. Secondly, recovery based on rollback or abort cannot always be applied as the main recovery technique, because there are many application-specific modifications that cannot simply be 'erased'; this is why many abnormal situations have to be dealt with at the application level, using forward error recovery [12]. One more crucial factor for choosing forward error recovery is its comparatively low cost. Thirdly, mobile systems need flexible recovery (exception handling) solutions; in particular, it should be possible to dynamically change the system configuration, its structure, the fault tolerance means to be applied, and the agent connections. One more reason is that, as we have explained above, there is a
huge variety of abnormal situations of different types that can occur. The last
reason is that mobile environments have computational models and communica-
tion techniques that are very different from conventional ones (for example,
asynchronous communication and dynamic binding).
The fault tolerance techniques described below either build fault tolerance
in (Coordinated Atomic actions, proof carrying code), separate it from the
functionality (meta-agent), or let programmers flexibly choose between the two
(asynchronous resolution, self-repair).
Error Processing and Error Confinement. The essence of any fault toler-
ance technique lies in its error processing (error detection and error recovery)
and error confinement capabilities. This includes the way errors are detected
when a particular technique is used, and the information that can be used for
this detection (for example, internal agent states, input/output parameters of
the calls, histories of events).
Error confinement defines the damage domain to be recovered after an error
has been detected: this can be an agent, a part of an agent (e.g., one of its
methods) or a group of agents. This criterion is closely related to the structuring
view discussed before.
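To make these confinement levels concrete, the following purely illustrative Java sketch (all class and method names are ours, not the paper's) expresses method-level and group-level damage domains as exception handling scopes:

```java
// Illustrative sketch only: method-, agent- and group-level confinement
// expressed as nested exception handling scopes (all names are ours).
class MethodError extends Exception {}   // damage confined to one method
class AgentError extends Exception {}    // damage confined to one agent

class Agent {
    // method-level confinement: damage is repaired inside the method itself
    void negotiate() throws AgentError {
        try {
            exchangeOffers();
        } catch (MethodError e) {
            // restore this method's state; the caller never sees the error
        }
    }
    void exchangeOffers() throws MethodError { /* may fail locally */ }
}

class AgentGroup {
    // group-level confinement: an error escaping one agent is recovered by
    // the group of agents whose state may have been damaged
    void run(Agent[] members) {
        try {
            for (Agent a : members) a.negotiate();
        } catch (AgentError e) {
            // coordinate recovery across all members
        }
    }
}
```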
In this subsection we introduce a small banking system that operates with mo-
bile agents and requires fault tolerance. This example is used throughout the
discussion of the fault tolerance techniques that could be applied to mobile
systems.
Designing Fault-Tolerant Mobile Systems 193
A user, wishing to acquire some product, launches a buyer agent that will
roam the Internet searching for a seller agent offering the requested object. The
buyer and seller agents will meet in an electronic market place, where they will
exchange information regarding products that their respective users desire to buy
or sell. Each agent is equipped with an e-purse holding some amount of electronic
money. If the two agents have matching requests, they reach an agreement (the
product is booked, and a contract is concluded), and the payment is then realised
by transferring some money from the e-purse of the buyer agent directly to the
e-purse of the seller agent. In this case the e-purse acts like cash: there is
no need to ask for a bank certification. If the payment fails, the agreement
has to be cancelled, and the seller agent releases the product.
In this scenario, the buyer agent can be either alone or composed of several
agents roaming the Internet simultaneously. It may be difficult or even impossible
for the user to contact the buyer agent, or for the buyer agent to contact its
agents distributed world-wide. Indeed, a mobile IP scheme that would allow
mobile agents to be contacted irrespective of their position is difficult to envisage
when agents neither know each other nor participate in the same application.
We will consider the following errors in this scenario (modelled as a small
exception hierarchy in the sketch after the list):
– there is no offer matching the request;
– there is a bug in the seller agent code: it is impossible to reach an agreement
or to pay money from its e-purse;
– the buyer agent's e-purse does not contain sufficient money;
– the buyer agent's e-purse does not have sufficient privileges, e.g., a config-
uration error does not authorise money to be withdrawn from the e-purse.
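For concreteness, these four errors can be modelled as a small Java exception hierarchy; the class names below are ours and serve only to anchor the discussion of the techniques that follow:

```java
// The four abnormal situations above, modelled as a small Java exception
// hierarchy. Purely illustrative; the paper does not define these classes.
class MarketPlaceException extends Exception {}
class NoMatchingOffer extends MarketPlaceException {}      // no offer matches the request
class SellerAgentDefect extends MarketPlaceException {}    // bug in the seller agent code
class NotEnoughMoney extends MarketPlaceException {}       // e-purse balance too low
class NotEnoughPrivileges extends MarketPlaceException {}  // withdrawal not authorised
```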
3.3 Meta-agent
In the meta-agent scheme, the normal functionality is contained in the agent,
while fault tolerance and resource control are left to the meta-agent [25].
In our electronic market place, each agent has an additional meta-agent, as
shown in figure 1. We now consider the errors listed above. If the buyer
agent does not find any corresponding seller, it informs its meta-agent, which
takes the appropriate decision: stop the buyer agent, let it move to another place,
inform the user so that the request can be changed, and so on.
A bug in the seller code results in an internal error for the buyer agent, which
aborts its current transaction and tries to find another partner. Alternatively,
it may inform the meta-agent about the problem.
In case (a), when the buyer does not have enough money, the buyer agent raises
an exception to its meta-agent (1), which is responsible for downloading money
to the buyer agent's e-purse. The meta-agent first contacts the user (2) for his
agreement (3), then the bank (4) for actually loading the money (5), and finally
uploads the electronic money to the e-purse (6). The payment can then occur
(7).
For insufficient privileges (b), the meta-agent acts in a similar way: it simply
asks the user for more privileges and changes the code of the agent accord-
ingly (4).
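The following minimal Java sketch shows how case (a) could look; the User, Bank and BuyerAgent interfaces are hypothetical (the paper prescribes the protocol of figure 1, not an API), and NotEnoughMoney is the exception class sketched earlier:

```java
// A minimal sketch of case (a) of figure 1; all interfaces are assumptions.
interface User { boolean agreesToLoad(int amount); }  // steps (2)/(3)
interface Bank { int load(int amount); }               // steps (4)/(5)
interface BuyerAgent {
    void uploadToEPurse(int electronicMoney);          // step (6)
    void retryPayment();                               // step (7)
    void abortTransaction();
}

class MetaAgent {
    private final User user;
    private final Bank bank;
    MetaAgent(User user, Bank bank) { this.user = user; this.bank = bank; }

    // step (1): the buyer agent raises NotEnoughMoney to its meta-agent
    void onException(NotEnoughMoney e, BuyerAgent buyer, int missing) {
        if (user.agreesToLoad(missing)) {
            buyer.uploadToEPurse(bank.load(missing));  // load and upload money
            buyer.retryPayment();                      // payment can now occur
        } else {
            buyer.abortTransaction();
        }
    }
}
```

Note that the recovery logic lives entirely in MetaAgent, so it can be replaced or upgraded at run-time without touching the buyer agent itself, which is exactly the flexibility argued for below.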
Discussion. Meta-agents offer the advantage that the error handling code
(contained in the meta-agent) can be downloaded or changed at run-time, thus
favouring flexibility and adaptability with respect to errors. More than one
meta-agent can be attached to an agent, enabling several exception handling
schemes. Meta-agents may accompany the agent, be requested only when prob-
lems arise, or remain stationary at some well-known place. There is no over-
head at the level of the agent itself, although at the level of the application
there is a certain amount of code dedicated to the meta-agent, and some message
transfer is necessary between the agent and its meta-agent.
Meta-agents are useful for asynchronous recovery from local errors
(i.e., the meta-agent can deal with the error and then inform the agent), when
there is no need for a cooperative resolution scheme at the level of the agents, and
when exceptions are raised in isolated agents. This may be the case for wireless
applications and active networks.
[Figure: two message sequence diagrams between the Bank, the Meta-Agent, the User and the agent's e-purse. (a) Insufficient money: 1. NotEnoughMoney, 2. NeedMoreMoney, 3. OK, 4. RequestForMoney, 5./6. 40 Euros loaded and uploaded to the e-purse, 7. Payment. (b) Insufficient privileges: 1. NotEnoughPrivileges, 2. NeedMorePrivileges, 3./4. AuthorisedToPay, 5. Payment.]
Fig. 1. Meta-agent
3.4 Coordinated Atomic Actions
Whenever the payment fails, either (a) the buyer agent enters a recovery phase,
in which it asks for more money and the payment finally succeeds; or (b) the
buyer cannot recover, leading to the failure of CAPayment, which in turn causes
CAAgreement to abort. Note that, before aborting, CAAgreement releases the
booking of the object made previously.
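As a rough illustration of this nesting, here is a hypothetical Java sketch; the CAAction base class is ours, and real CA action frameworks (such as those of [26] and [27]) differ considerably in detail:

```java
// Hypothetical sketch of nested CA actions; not an actual CA action framework.
class CAActionAborted extends Exception {}

abstract class CAAction {
    // run the coordinated activity; on failure, undo and propagate the abort
    void execute() throws CAActionAborted {
        try {
            body();
        } catch (Exception e) {
            undo();                        // backward recovery inside the action
            throw new CAActionAborted();   // abort the enclosing action too
        }
    }
    abstract void body() throws Exception;
    abstract void undo();
}

class CAPayment extends CAAction {
    void body() throws Exception { /* transfer money; may raise PaymentFailed */ }
    void undo() { /* nothing to compensate: no money was transferred */ }
}

class CAAgreement extends CAAction {
    void body() throws Exception {
        bookObject();                  // the seller books the object
        new CAPayment().execute();     // nested action: its abort aborts us too
    }
    void undo() { releaseObject(); }   // release the booking before aborting
    void bookObject() { }
    void releaseObject() { }
}
```

The essential point is that the abort of the nested CAPayment propagates to CAAgreement, whose undo releases the booking, exactly as in case (b) of figure 2.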
[Figure: two timelines of the nested CA actions CAAgreement and CAPayment involving the User, BuyerAgent and SellerAgent. (a) Recovery: after WantToBuyObject and BookObject, the first Payment fails (PaymentFailed); AskForMoney obtains 40 Euros from the User and the Payment is retried successfully. (b) Abort: the Payment fails (PaymentFailed), CAPayment and CAAgreement abort, and the SellerAgent releases the object (ReleaseObject).]
Fig. 2. CA actions: (a) recovery; (b) abort
Discussion. CA actions offer the most general way of providing fault tol-
erance in concurrent systems: they clearly define the damage area to be recov-
ered (the CA action scope) and are intended for recursive system structuring.
The hidden overheads are caused by the additional synchronisation of the agents
participating in an action at action entry and exit.
CA actions are especially well suited to complex applications involving co-
operating agents; to large-scale applications in which agents are aware of other
agents, e.g., active documents; to systems that need cooperative handling of
exceptions, e.g., workflow management applications; and to cases where complex
mobile applications must be structured recursively for fault tolerance. However,
additional work has to be carried out to make the concept more readily applicable
to mobile applications; in particular, this includes a better understanding of the
underlying middleware support required.
3.5 Asynchronous Resolution
In some cases, several roaming agents act collaboratively to resolve a given
task. If the task has to be aborted, all the agents have to be informed so that
they stop their work. However, those agents cannot be reached by asynchronous
exceptions, since we do not know where they are.
Consider, for instance, the scenario in which we want to cancel the purchase
of a series of stamps, where each stamp in the series is bought by an individual
agent, possibly located in a different market place. This problem can be solved
using the asynchronous resolution scheme depicted in figure 3: individual agents,
from time to time, consult some agreed place (AgentPlatform 3) where
messages for the agents are stored. Either this place no longer contains money
that the agents can use, in which case they will no longer be able to buy stamps; or
it contains directives to stop buying stamps and, if possible, to re-sell those that
have already been purchased.
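A minimal Java sketch of this polling scheme follows; the MessagePlace interface and the directive names are assumptions, not part of the paper:

```java
// Sketch of the asynchronous resolution scheme of figure 3: each roaming
// agent periodically polls an agreed platform for directives.
enum Directive { KEEP_BUYING, STOP_AND_RESELL }

interface MessagePlace {                 // e.g. AgentPlatform_3
    Directive currentDirective();
    int availableMoney();
}

class StampBuyerAgent {
    private final MessagePlace place;
    StampBuyerAgent(MessagePlace place) { this.place = place; }

    // called from time to time, between moves to other market places
    void consultAgreedPlace() {
        if (place.currentDirective() == Directive.STOP_AND_RESELL) {
            resellPurchasedStamps();     // task aborted: undo what we can
        } else if (place.availableMoney() == 0) {
            stopBuying();                // no funds left for further purchases
        }
    }
    void resellPurchasedStamps() { }
    void stopBuying() { }
}
```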
This collaborative asynchronous resolution scheme may also help the buyer
agents to recover from money/privileges problems, since one of them may ask
the others for more money or for additional privileges.
[Figure: BuyerAgent 1 roams between AgentPlatform_1 and AgentPlatform_2, while BuyerAgents 2 and 3 sit on AgentPlatform_4 and AgentPlatform_5, where SellStamp offers are available; all of them consult AgentPlatform_3, which stores Money and Directives.]
Fig. 3. Asynchronous resolution
3.6 Self-Repair
In this scheme, the agent does not consult its user (neither through a meta-agent
nor directly). It tries instead to solve the problem autonomously, by asking other
agents, not necessarily involved in the same task, for help.
For instance, in the case of the electronic market place, if the buyer does not
find any matching offer, it may ask other agents if they know about possible
sellers.
As shown in figure 4, if the buyer agent has insufficient money (1) and,
provided it is authorised to spend more than its e-purse holds, it can
ask all the other agents present in the market place to lend it some money (2). If
an agent agrees to lend money to the buyer agent (3), they conclude a contract
(specifying interest and delays). The buyer agent then receives the money,
which it uses to buy the initial object (4). The lending agent is refunded
later, once the buyer agent has downloaded money directly from its bank.
This scheme can be very convenient when urgent decisions are needed: borrowing
money from another agent present in the market place may be quicker than
entering into a remote communication with the bank.
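A minimal Java sketch of this borrowing protocol, assuming hypothetical PeerAgent and LoanContract types, might look as follows:

```java
// Sketch of the borrowing protocol of figure 4 (numbers refer to the steps
// in the text); all types are illustrative assumptions.
import java.util.List;
import java.util.Optional;

class LoanContract {
    final int amount;
    final double interest;               // terms fixed by the lending agent
    LoanContract(int amount, double interest) {
        this.amount = amount;
        this.interest = interest;
    }
}

interface PeerAgent {
    // (2)/(3): returns a contract if this peer agrees to lend the amount
    Optional<LoanContract> askForMoney(int amount);
}

class SelfRepairingBuyer {
    // ask every agent on the market place until one agrees
    Optional<LoanContract> borrow(List<PeerAgent> peers, int missing) {
        for (PeerAgent peer : peers) {
            Optional<LoanContract> offer = peer.askForMoney(missing);
            if (offer.isPresent()) {
                return offer;            // (4): payment can now proceed
            }
        }
        return Optional.empty();         // no lender: fall back to the bank
    }
}
```

The first peer that agrees determines the contract; if none agrees, the buyer falls back to contacting its bank remotely.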
Discussion. It is worth noting that this kind of resolution can be undertaken
with meta-agents as well as with CA actions. However, it is better suited to
situations where the agent is given full autonomy or crucially needs to recover,
even partially, from abnormal situations, e.g., wireless applications; where control
of the application is decentralised for both the functionality and the fault tol-
erance requirements; and to situations requiring a quick resolution, e.g., remote
device control or active networks.
This technique scales to large systems, since the error is confined to the
agent itself, which autonomously tries to solve the problem. Obviously, there is
an overhead in the code needed to carry the resolution schemes and in the
execution time needed to overcome problems.
[Figure: on an AgentPlatform, the buyer broadcasts 2. AskForMoney to Agent1, Agent2 and Agent3; once one of them agrees to lend the money, 4. Payment is made to the SellerAgent.]
Fig. 4. Self-Repair

3.7 Proof Carrying Code
Rather than trying to reach an agreement and only then realising that the buyer
does not have enough money, the proof carrying code alternative enables both
parties to expose some of their internal information beforehand, which proves
whether or not they are able to enter the transaction. In the example of figure 5,
the buyer and seller agents exchange information regarding the amount of money
contained in their e-purses, the privileges they have (buy or sell), and the
minimum and maximum prices requested or allowed for the object. Given these
values, we see that the buyer does not have sufficient money (only 4 Euros), even
though a price between 5 and 7 Euros would be agreeable to both parties.
[Figure: on the AgentPlatform, the SellerAgent exposes Money = 40 Euros, Privilege = Sell, Price = 10 Euros, Min = 5 Euros; the BuyerAgent exposes Money = 4 Euros, Privilege = Buy, Price = 4 Euros, Maximum = 7 Euros; the transaction is marked Impossible.]
Fig. 5. Proof carrying code
In the case of bugs in the seller agents, an additional proof carried by the
seller agent code may allow them to be detected.
This is a simplified version of the original proof carrying code [16]; we can
easily imagine replacing the specification values with actual code.
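As a sketch of this simplified scheme (plain values rather than a machine-checkable proof, and all names are ours), the compatibility check of figure 5 might be written as:

```java
// Illustrative only: each party exposes its constraints as plain values and
// both check compatibility before entering the transaction. Real
// proof-carrying code [16] ships a machine-checkable proof instead.
class TradingProof {
    final int money, minPrice, maxPrice;
    final boolean canBuy, canSell;
    TradingProof(int money, int minPrice, int maxPrice,
                 boolean canBuy, boolean canSell) {
        this.money = money; this.minPrice = minPrice; this.maxPrice = maxPrice;
        this.canBuy = canBuy; this.canSell = canSell;
    }
}

class ProofChecker {
    // an agreement is possible only if the privileges match, the acceptable
    // price ranges overlap, and the buyer can pay some price in the overlap
    static boolean compatible(TradingProof buyer, TradingProof seller) {
        if (!buyer.canBuy || !seller.canSell) return false;
        int low = Math.max(buyer.minPrice, seller.minPrice);
        int high = Math.min(buyer.maxPrice, seller.maxPrice);
        return low <= high && buyer.money >= low;
    }
}
```

With the values of figure 5 (buyer: 4 Euros available, price range 4 to 7; seller: price range 5 to 10), the ranges overlap on [5, 7] but the buyer holds only 4 Euros, so compatible returns false, matching the 'Impossible' outcome.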
Discussion. The flexibility of this approach is limited, since the proof, used for
error detection, is part of the agent and cannot be customised at run-time. There
is a clear overhead in code size, due to the proof that has to be carried by the
agent, and in the execution time required to evaluate it.
This scheme is particularly well adapted to mobile agents that enter into inter-
actions with unknown agents, i.e., agents that have been designed independently
and may use different standards, and to discovering and composing unknown
services proposed by potential partners.
4 Related Work
The main body of research in fault tolerance of mobile systems focuses on soft-
ware techniques oriented towards tolerating hardware faults. For example, an
approach to developing consensus protocols to be used for achieving an agree-
ment when agents crash is presented in [18]. A novel algorithm for ensuring
that a message is always delivered to a mobile agent is proposed in [15]. The
particularity of this algorithm is that it does not enforce continuous connectiv-
ity with the message source. Our intention is to build on such techniques, where
possible, to tolerate a wider range of faults. The techniques we are considering
usually rely on services provided by the underlying middleware (which promotes
separation of concerns). Moreover, when such services are not capable of deliver-
ing what is required, are unavailable, or are too expensive to use, the agents can
deal with the problems in an application-specific fashion using the techniques
discussed in Section 3.
There are two other techniques closely related to this work. Paper [9] introduces
an agent service (called an electronic exception handling institution) dedicated
to detecting and resolving exceptional conditions in software agent market places.
This service collects information about typical abnormal situations and uses AI
techniques to find the best way of handling them. Interestingly, the authors
recognise the need for involving several agents in the cooperative handling
of complex exceptions (in a way similar to CA actions). In the scheme pro-
posed in [24], each mobile agent has a guard agent, comparable to our meta-agents,
that handles all exceptions propagated from it. This scheme is very important,
as it is the first to deal with the specific characteristics of the mobile
environment.
5 Conclusion
Mobile agent systems have specific characteristics (they are highly decentralised,
dynamic and made of roaming entities) and encounter specific abnormal situa-
tions (migration errors, resource and security problems, partners' failures). These
characteristics call for fault tolerance techniques adapted to the mobile setting,
such as the schemes discussed in this paper.
References
1. M. Baldi, S. Gai, and G. P. Picco. Exploiting code mobility in decentralized and
flexible network management. In K. Rothermel and R. Popescu-Zeletin, editors,
Proceedings of the 1st International Workshop on Mobile Agents (MA'97), vol-
ume 1219 of LNCS, pages 13–26. Springer-Verlag, 1997.
2. W. Binder, G. Di Marzo Serugendo, and J. Hulaas. Towards a Secure and Efficient
Model for Grid Computing using Mobile Code. In 8th ECOOP Workshop on Mobile
Object Systems: Agent Applications and New Frontiers, June 2002.
3. B. Brewington, R. Gray, K. Moizumi, D. Kotz, G. Cybenko, and D. Rus. Mobile
Agents for Distributed Information Retrieval. In M. Klusch, editor, Intelligent
Information Agents, chapter 15, pages 355–395. Springer-Verlag, 1999.
4. C. Bryce, C. Razafimahefa, and M. Pawlak. Lana: An Approach to Programming
Autonomous Systems. In 16th European Conference on Object-Oriented Program-
ming, ECOOP’02, 2002.
5. P. Dasgupta, N. Narasimhan, L. E. Moser, and P. M. Melliar-Smith. MAgNET:
Mobile Agents for Networked Electronic Trading. IEEE Transactions on Knowl-
edge and Data Engineering, Special Issue on Web Applications, 11(4):509–525,
July-August 1999.
6. N. Foukia, S. Hassas, S. Fenet, and J. Hulaas. An Intrusion Response Scheme:
Tracking the Source using the Stigmergy Paradigm. In Proceedings of Security of
Mobile Multiagent Systems Workshop (SEMAS-2002), 2002.
7. A. Fuggetta, G. P. Picco, and G. Vigna. Understanding Code Mobility. IEEE
Transactions on Software Engineering, 24(5):342–361, 1998.
8. F. Kilander, P. Werle, and K. Hansson. Jima - A Jini-based Infrastructure for
Active Documents and Mobile Agents. In Proceedings of the Personal Computing
and Communication (PCC) Workshop, November 1999.
9. M. Klein and C. Dellarocas. Exception handling in agent systems. In O. Etzioni,
J. P. Müller, and J. M. Bradshaw, editors, Proceedings of the Third International
Conference on Autonomous Agents (Agents’99), pages 62–68. ACM Press, 1999.
10. D. Kotz, R. Gray, and D. Rus. Future Directions for Mobile Agent Research. IEEE
Distributed Systems Online, 3(8), 2002.
11. D. B. Lange and M. Oshima. Seven good reasons for mobile agents. Communica-
tions of the ACM, 42(3):88–89, 1999.
12. P. A. Lee and T. Anderson. Fault Tolerance: Principles and Practice. Dependable
computing and fault-tolerant systems. Springer-Verlag, 1990.
13. S. W. Loke and A. B. Zaslavsky. Towards distributed workflow enactment with
itineraries and mobile agent management. In J. Liu and Y. Ye, editors, E-
Commerce Agents, Marketplace Solutions, Security Issues, and Supply and De-
mand, volume 2033 of Lecture Notes in Computer Science, pages 283–294. Springer-
Verlag, 2001.
14. Q. H. Mahmoud. MobiAgent: A Mobile Agent-based Approach to Wireless Infor-
mation Systems. In Proceedings of the 3rd International Bi-Conference Workshop
on Agent-Oriented Information Systems, held with the 5th International Confer-
ence on Autonomous Agents 2001, 2001.
15. A. L. Murphy and G. P. Picco. Reliable communication for highly mobile agents.
Journal of Autonomous Agents and Multi-Agent Systems, 5(1):81–100, March 2002.
16. G. C. Necula. Proof-carrying code. In The 24th ACM SIGPLAN-SIGACT Sym-
posium on Principles of Programming Languages (POPL’97), 1997.
17. G. P. Picco. Mobile agents: An introduction. Journal of Microprocessors and
Microsystems, 25(2):65–74, April 2001.
18. S. Pleisch and A. Schiper. Modeling fault-tolerant mobile agent execution as a
sequence of agreement problems. In 19th IEEE Symposium on Reliable Distributed
Systems (SRDS’00), pages 11–20. IEEE Computer Society Press, 2000.
19. B. Randell. Recursively structured distributed computing systems. In Proceedings
of Third Symposium on Reliability in Distributed Software and Database Systems,
pages 3–11. IEEE Computer Society Press, 1983.
20. D. L. Tennenhouse. Active networks. In Proceedings of the Second Symposium on
Operating Systems Design and Implementation (OSDI ’96), pages 89–90, Berkeley,
CA, USA, 1996. USENIX Association.
21. D. L. Tennenhouse, J. M. Smith, W. D. Sincoskie, D. J. Wetherall, and G. J.
Minden. A Survey of Active Network Research. IEEE Communications Magazine,
1997.
22. O. Tomarchio, L. Vita, and A. Puliafito. Active monitoring in GRID environments
using mobile agent technology. In 2nd Workshop on Active Middleware Services
(AMS'00) in HPDC-9, August 2000.
23. A. Tripathi, T. Ahmed, S. Pathak, A. Pathak, M. Carney, M. Koka, and P. Dokas.
Active Monitoring of Network Systems using Mobile Agents. In Proceedings of
Networks 2002, a joint conference of ICWLHN 2002 and ICN 2002, 2002.
24. A. Tripathi and R. Miller. Exception handling in agent-oriented systems. In
A. Romanovsky, C. Dony, J. Lindskov Knudsen, and A. Tripathi, editors, Advances
in Exception Handling Techniques, volume 2022 of LNCS, pages 128–146. Springer-
Verlag, 2001.
25. A. Villazon and W. Binder. Portable Resource Reification in Java-based Mobile
Agent Systems. In Proceedings of the Fifth IEEE International Conference on
Mobile Agents (MA’01), December 2001.
26. J. Xu, B. Randell, A. Romanovsky, C. Rubira, R. Stroud, and Z. Wu. Fault toler-
ance in concurrent object-oriented software through coordinated error recovery. In
25th International Symposium on Fault-Tolerant Computing Systems (FTCS-25),
pages 499–509. IEEE Computer Society Press, 1995.
27. A. F. Zorzo and R. J. Stroud. A Distributed Object-Oriented Framework for De-
pendable Multiparty Interactions. ACM Sigplan Notices, 34(10):435–446, October
1999.
The Role of OCL in the Model Driven Architecture
Jos Warmer
Abstract. Within the Model Driven Architecture (MDA), models and model
transformations play a central role. At the model level, OCL adds the precision
that is lacking in plain UML; a model that combines the use of OCL and UML
is well suited to MDA transformations. At the meta-model level, OCL is
used to define validation rules for models, which can be executed against a
model to check automatically whether the model conforms to them. The third
use of OCL is as part of the transformation language in which MDA model
transformations will be defined; the OMG has issued an RFP for such a
language.
Shane Sendall
Claude Petitpierre